Meta Platforms showcased new AI features for Facebook, Instagram, and WhatsApp at its annual Meta Connect conference. However, the most significant announcement came in a research paper Meta researchers published on arXiv.org. The paper introduced Llama 2 Long, an AI model that outperforms competitors when responding to long user prompts. Llama 2 Long was built on Meta’s open-source Llama 2 model through continual pretraining on longer training sequences, with long texts upsampled in the training data. This result underscores the effectiveness of Meta’s open-source approach and demonstrates that open-source AI can compete with the closed-source models offered by well-funded startups.
To create Llama 2 Long, Meta researchers enhanced the original Llama 2 model by including more long-text data sources and by modifying the positional encoding. Specifically, they decreased the rotation angle of the Rotary Positional Embedding (RoPE) encoding, enabling the model to attend to more distant tokens within a longer context window. Using reinforcement learning from human feedback and synthetic data generated by Llama 2 chat itself, the researchers improved Llama 2 Long’s performance across a range of tasks, including coding, math, language understanding, common-sense reasoning, and answering user prompts.
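The RoPE adjustment can be illustrated with a small sketch. In RoPE, each pair of embedding dimensions is rotated by an angle that depends on the token position and a base frequency; raising the base shrinks the rotation angle per position, which is the effect the paper describes. The `rope_angles` helper and the specific base values below are illustrative assumptions, not code from the paper:

```python
def rope_angles(position, dim, base=10000.0):
    # Rotation angle for each dimension pair at a given token position.
    # A larger `base` yields smaller angles (slower rotation), so attention
    # scores between distant tokens decay more gently over long contexts.
    return [position / (base ** (2 * i / dim)) for i in range(dim // 2)]

# Compare angles at position 4096 for a 128-dim head, using a standard
# base and a larger one (values chosen here purely for illustration):
standard = rope_angles(4096, 128, base=10_000.0)
adjusted = rope_angles(4096, 128, base=500_000.0)

# With the larger base, every dimension pair rotates by a smaller
# (or equal, for the first pair) angle at the same position.
assert all(a <= s for a, s in zip(adjusted, standard))
```

Smaller rotation angles mean two tokens far apart in the sequence end up with less relative rotation between them, so their attention interaction is attenuated less, which is what lets the model make use of more distant context.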
The release of Llama 2 Long has drawn admiration and excitement within the open-source AI community, where it is seen as validation of Meta’s open-source strategy. The development also highlights the strength of Meta’s AI technology and its ability to outperform leading competitors, and the paper’s publication further solidifies Meta’s position as an innovative and influential player in the AI field.