Stability AI, the company funding the development of open-source generative AI models like Stable Diffusion and Dance Diffusion, today announced the launch of its StableLM suite of large language models (LLMs), comparable to ChatGPT. After developing models for multiple domains, including image, audio, video, 3D and biology, this is the first time the developer is entering the language model arena currently dominated by tech heavyweights such as OpenAI, Meta and Stanford.
The suite’s first offering, the StableLM open-source language model, is now available in alpha, featuring 3 billion and 7 billion parameters, both trained on 800 billion tokens of data, with larger 15-billion to 65-billion parameter models to follow.
“Language models will form the backbone of our digital economy, and we want everyone to have a voice in their design,” the Stability AI team wrote in a blog post on the company’s site.
In 2022, Stability AI introduced Stable Diffusion, a groundbreaking open-source image model that offers a transparent and scalable alternative to proprietary AI. With the release of the StableLM suite, the company aims to demonstrate how small, efficient models can provide high performance with the appropriate training.
StableLM is an extension of the company’s foundational AI technology, which promotes transparency, accessibility and support in AI design. Stability AI believes that the release represents another significant step towards making foundational AI technology accessible to all, with numerous applications, including generating text and code.
Open source – the way forward
The StableLM suite builds on Stability AI’s prior work, including the groundbreaking Stable Diffusion image model, which offered an open-source alternative to proprietary generative AI image models such as DALL-E. In addition, the StableLM models can generate text and code, making them suitable for a variety of downstream applications.
Despite its small size, the model is surprisingly effective in conversational and coding tasks (similar to OpenAI’s ChatGPT) thanks to its training on an experimental dataset. Stability AI has a track record with earlier open-source language models such as GPT-J, GPT-NeoX and the Pythia suite, developed with EleutherAI and trained on The Pile open-source dataset.
The StableLM-Alpha models are trained on a new experimental dataset that builds on The Pile. This new dataset contains 1.5 trillion tokens, roughly three times the size of The Pile, and the StableLM models have a context length of 4,096 tokens.
AI ethics
Stability AI has claimed that models like StableLM demonstrate its commitment to AI technology that is transparent, accessible, and supportive:
- Transparent. We open-source our models to promote transparency and foster trust. Researchers can “look under the hood” to verify performance, work on interpretability techniques, identify potential risks, and help develop safeguards. Organizations across the public and private sectors can adapt (“fine-tune”) these open-source models for their own applications without sharing their sensitive data or giving up control of their AI capabilities.
- Accessible. We design for the edge so that everyday users can run our models on local devices. Using these models, developers can build independent applications compatible with widely-available hardware instead of relying on proprietary services from one or two companies. In this way, the economic benefits of AI are shared by a broad community of users and developers. Open, fine-grained access to our models allows the broad research and academic community to develop interpretability and safety techniques beyond what is possible with closed models.
- Supportive. We build models to support our users, not replace them. We are focused on efficient, specialized, and practical AI performance – not a quest for god-like intelligence. We develop tools that help everyday people and everyday firms use AI to unlock creativity, boost their productivity, and open up new economic opportunities.
In a post, the company announced that the StableLM suite also includes a set of research models that are instruction fine-tuned. As a proof of concept, the company fine-tuned the StableLM models using Stanford Alpaca’s procedure on a combination of five recent open-source datasets for conversational agents: Stanford’s Alpaca, Nomic-AI’s gpt4all, RyokoAI’s ShareGPT52K, Databricks’ Dolly and Anthropic’s HH. These models will be released as StableLM-Tuned-Alpha.
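For context, the tuned alpha models expect conversations to be wrapped in special role tokens before generation. A minimal sketch of that prompt format follows; the exact system-prompt wording below is an assumption paraphrased from Stability AI’s published example, not a verbatim specification:

```python
# Sketch of the prompt format the StableLM-Tuned-Alpha models expect.
# The role tokens (<|SYSTEM|>, <|USER|>, <|ASSISTANT|>) follow Stability AI's
# release example; the system-prompt text here is an illustrative assumption.

SYSTEM_PROMPT = (
    "<|SYSTEM|># StableLM Tuned (Alpha version)\n"
    "- StableLM is a helpful and harmless open-source AI language model.\n"
)

def build_prompt(user_message: str) -> str:
    """Wrap a user message in the role tokens used by the tuned models."""
    return f"{SYSTEM_PROMPT}<|USER|>{user_message}<|ASSISTANT|>"

prompt = build_prompt("Write a haiku about open-source AI.")
```

The resulting string would then be passed to the model’s tokenizer; the model generates text after the trailing `<|ASSISTANT|>` token.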
The LLM race just got bigger
The 800 billion tokens used to train the alpha models are notable next to Meta’s LLaMA, whose 7-billion-parameter model was trained on 1 trillion tokens.
Recently, Menlo Park-based firm Together announced the launch of RedPajama, an open-source project developed in collaboration with several AI institutions including Ontocord.ai, ETH DS3Lab, Stanford CRFM, Hazy Research and MILA Québec AI Institute.
That project takes an approach similar to Stability AI’s, aiming to create large language models (LLMs) that are fully open-source and lead the industry in performance. The initial RedPajama dataset contains 1.2 trillion tokens and follows the recipe described in Meta’s LLaMA paper. The dataset is publicly available on Hugging Face, and Apache 2.0-licensed scripts on GitHub can be used to reproduce it.
In addition to its work on the StableLM suite, Stability AI is kicking off its crowd-sourced RLHF program and working with community efforts such as Open Assistant, an initiative to create an open-source dataset for AI assistants.
The company plans to release more models soon and says it is excited to collaborate with developers and researchers to roll out the StableLM suite. “We will be releasing more models soon and are growing our team,” the company wrote, adding a call for applicants “passionate about democratizing access to this technology and experienced in LLMs.”
In any case, like OpenAI, Stability AI has historically not shied away from controversy.
The company is in the crosshairs of legal cases that allege that it infringed on the rights of millions of artists by developing AI art tools using web-scraped, copyrighted images. And a few communities around the web have tapped Stability’s tools to generate pornographic celebrity deepfakes and graphic depictions of violence.
Moreover, despite the philanthropic tone of its blog post, Stability AI is also under pressure to monetize its sprawling efforts — which run the gamut from art and animation to biomed and generative audio. Stability AI CEO Emad Mostaque has hinted at plans to IPO, but Semafor recently reported that Stability AI — which raised over $100 million in venture capital last October at a reported valuation of more than $1 billion — “is burning through cash and has been slow to generate revenue.”