Despite its late start in the AI space, Apple has gone all in on AI since its Worldwide Developers Conference. Apple Intelligence will offer AI solutions for nearly all of Apple's offerings, and the company is not stopping there: it is now moving further into AI language models.
Last Thursday, Apple released DCLM-Baseline-7B, a 7 billion parameter language model, on Hugging Face. The model is part of the DataComp for Language Models (DCLM) benchmark, an initiative to improve the quality of training datasets for language models.
At 7 billion parameters, this model is comparable in size to popular models such as Llama 2 and Gemma. When tested on the Massive Multitask Language Understanding (MMLU) benchmark against popular models of around the same size, DCLM-Baseline-7B performed competitively, even outperforming Mistral 7B.
[Chart: MMLU benchmark comparison. Credit: Apple/Hugging Face]
Beyond its impressive performance, one of DCLM-Baseline-7B's biggest standouts is that the model is truly open source, with "open data, open weight models, open training code," as highlighted by Vaishaal Shankar, a research scientist at Apple.
"We have released our DCLM models on huggingface! To our knowledge these are by far the best performing truly open-source models (open data, open weight models, open training code) 1/5"
Vaishaal Shankar (@Vaishaal), July 18, 2024
Many are commending Apple for this approach, as it allows other researchers and developers to build on the models and advance the field. The model was trained on the DCLM-BASELINE data, combined with StarCoder and ProofPile2 data, to reach proficiency in other tasks such as coding and math.
Alongside DCLM-Baseline-7B and its model weights, training code, and dataset, Apple also included a smaller 1.4 billion parameter version in the package.
This isn't Apple's first go-around with AI models; the company has previously released others such as Ferret-UI, a multimodal large language model (MLLM), and Reference Resolution As Language Modeling (ReALM), a conversational AI system. In the fall, when iOS 18 and Apple Intelligence become available, we'll be able to see Apple compete in the AI space and better gauge the potential success of its AI efforts.