In the era of artificial intelligence, giant tech firms including OpenAI, Google, and Meta are in constant rivalry, launching new and improved versions of their AI models one after another. It is like a race: each company unveils its version and claims it is better than the rest.
In the same AI race, Meta unveiled Llama 3 on Thursday, marking the next generation of its open-source large language model (LLM).
Meta claims the new Llama models are the most capable openly available LLMs to date. It said, “We believe these are the best open source models of their class, period.”
Llama 3 comes in two versions:
The first version is Llama 3 8B, with 8 billion parameters, and the second is Llama 3 70B, with 70 billion parameters.
Parameters are the values a model learns during training; they encode what the model has learned and shape its abilities. Generally, the higher the parameter count, the better the LLM is at tasks like text analysis and generation.
Meta said in a blog post that its newest models saw "substantially reduced false refusal rates, improved alignment, and increased diversity in model responses," as well as progress in reasoning, code generation, and instruction following.
Meta added in its blog post, "This next generation of Llama demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning."
High-Quality Training
Meta emphasises that it used high-quality data to train its latest models. In a blog post, the firm said, “Llama 3 is pretrained on over 15T tokens that were all collected from publicly available sources.”
It added, “Our training dataset is seven times larger than that used for Llama 2, and it includes four times more code.” The company noted that over 5% of the Llama 3 pre-training data set consisted of high-quality non-English data from over 30 languages.
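To put the quoted dataset figures in perspective, the short Python sketch below works through the arithmetic they imply. The constant and function names are hypothetical, chosen only for illustration. Because the article gives lower bounds ("over 15T", "over 5%"), the results are rough estimates rather than figures reported by Meta.

```python
# Figures quoted in the article; both are stated as lower bounds.
LLAMA3_TOKENS = 15e12        # "over 15T tokens"
SIZE_RATIO_VS_LLAMA2 = 7     # "seven times larger than that used for Llama 2"
NON_ENGLISH_SHARE = 0.05     # "over 5%" high-quality non-English data


def implied_llama2_tokens(llama3_tokens: float, ratio: float) -> float:
    """Estimate the Llama 2 dataset size implied by the quoted ratio."""
    return llama3_tokens / ratio


def non_english_tokens(total_tokens: float, share: float) -> float:
    """Estimate how many tokens the quoted non-English share represents."""
    return total_tokens * share


def in_trillions(tokens: float) -> str:
    """Format a token count in trillions, e.g. 2.14T."""
    return f"{tokens / 1e12:.2f}T"


llama2_estimate = implied_llama2_tokens(LLAMA3_TOKENS, SIZE_RATIO_VS_LLAMA2)
non_english_estimate = non_english_tokens(LLAMA3_TOKENS, NON_ENGLISH_SHARE)

print("Implied Llama 2 dataset:", in_trillions(llama2_estimate))        # 2.14T
print("Non-English tokens (at least):", in_trillions(non_english_estimate))  # 0.75T
```

On these assumptions, the quoted ratio implies a Llama 2 dataset of a little over 2 trillion tokens, and the non-English portion of the Llama 3 data comes to at least about 750 billion tokens.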
Meta also used synthetic data, a somewhat controversial method, to train on longer documents. It maintains that all of Llama 3's training data came from publicly available sources.
Furthermore, Meta is training models with over 400 billion parameters. Future versions are expected to handle more than just text, support multiple languages, and offer enhanced coding and reasoning capabilities.
Integration in Meta AI
Both models have been integrated into Meta AI, the company's AI assistant.
"Thanks to our latest advances with Llama 3, Meta AI is smarter, faster, and more fun than ever before," the company said in a blog post.
“With this new model, we believe Meta AI is now the most intelligent AI assistant that you can freely use,” said Mark Zuckerberg in an Instagram post on Thursday, adding that Meta AI would also be able to animate photos and speed up image generation.
Improvements over Predecessors
According to Meta, both models significantly outperform their predecessor, Llama 2, and address the shortcomings of earlier models. It said, “Thanks to improvements in pretraining and post-training, our pre-trained and instruction-fine-tuned models are the best models existing today at the 8B and 70B parameter scale.”
Availability
According to Meta's blog post, a range of cloud computing platforms, including AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, and Snowflake, will soon offer the new Llama 3 models. Hardware platforms from AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm will also support the models.
More Work Needs to Be Done
Meta acknowledged that there is still work to be done and asked for user feedback. It plans to improve reasoning and coding performance soon and to add multilingual and multimodal capabilities.
As of right now, the 8 billion and 70 billion-parameter models can only produce text and have limited language capabilities outside of English. Meta says it is working to address these limitations with more sophisticated versions.
“We are embracing the open source ethos of releasing early and often to enable the community to get access to these models while they are still in development,” the company mentioned.
It added, “Our goal in the near future is to make Llama 3 multilingual and multimodal…and continue to improve overall performance across core LLM capabilities such as reasoning and coding.”
Meta says that more models are slated to be released soon.
©️ Copyright 2024. All Rights Reserved Powered by Vygr Media.