Introducing Stable LM 3B: Bringing Sustainable, High-Performance Language Models to Smart Devices

Today, we proudly launch an experimental version of Stable LM 3B, the latest in our suite of high-performance generative AI solutions. At 3 billion parameters (vs. the 7 to 70 billion parameters typically used by the industry), Stable LM 3B is a compact language model designed to operate on portable digital devices like handhelds and laptops, and we’re excited about its capabilities and portability.
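
To give a rough sense of why parameter count matters on portable devices, the sketch below estimates the memory needed just to hold a model's weights. It is a back-of-the-envelope illustration, not a measurement of Stable LM 3B: the precisions listed (fp32, fp16, int8, int4) are assumptions that do not come from this announcement, and the estimate leaves out activations, the attention cache, and runtime overhead.

```python
# Back-of-the-envelope estimate: weight memory = parameters x bytes per parameter.
# The precisions below are illustrative assumptions, not formats confirmed
# for Stable LM 3B in the announcement.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate storage for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # Parameter counts mentioned in the text: 3B versus the 7B to 70B range.
    for label, params in [("3B", 3e9), ("7B", 7e9), ("70B", 70e9)]:
        row = ", ".join(
            f"{p}: {weight_memory_gb(params, p):.1f} GB" for p in BYTES_PER_PARAM
        )
        print(f"{label:>3} -> {row}")
```

Under these assumptions, a 3-billion-parameter model stored at 16-bit precision needs about 6 GB for its weights, while a 70-billion-parameter model needs about 140 GB, which is well beyond typical laptop memory.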

As with our previous Stable LM release, one of the key advantages of Stable LM 3B is its small size and efficiency. Compared with larger models, it requires fewer resources and has lower operating costs, which makes it accessible to most users. This makes it more affordable and also more environmentally friendly, since it consumes far less power. But do not let its size fool you: Stable LM 3B is highly competitive. It outperforms the previous state-of-the-art 3B-parameter language models and even some of the best open-source language models at the 7B-parameter scale.

The development of Stable LM 3B broadens the range of applications that are viable on the edge or on home PCs. This means that individuals and companies can now develop cutting-edge technologies with strong conversational capabilities – like creative writing assistance – while keeping costs low and performance high. 

Compared to our previous Stable LM release, this version is significantly better at producing text while maintaining its fast execution speed. It has improved downstream performance on common natural language processing benchmarks, including common-sense reasoning and general knowledge tests. To achieve this performance, Stable LM 3B has undergone extensive training. It was trained for multiple epochs on high-quality data, resulting in a language model that surpasses the performance of its predecessors at similar sizes.

Stable LM 3B is also versatile. While it is a general language model, it can be fine-tuned for alternative uses, such as programming assistance. This could enable companies to cost-effectively customize the model on their own data, for example as a customer support assistant or a coding assistant for a specialized programming language.

Developers should be mindful that Stable LM 3B is a base model. That means it needs to be adjusted for safe performance in specific applications, such as a chat interface. Depending on their use case, developers must evaluate and fine-tune the model before deployment. Our instruction fine-tuned model is currently undergoing safety testing, and we’re planning to release it soon.

We firmly believe that smaller, customizable models like Stable LM 3B will play an increasing role in practical use cases for generative AI and that open models will become the standard for auditable, trusted AI. This is an intermediate release ahead of our full release, and we encourage the community to try the model by downloading the weights on the Hugging Face platform. The current model is released under the open-source CC BY-SA 4.0 license.

For further information on this release or to provide feedback, please email us at research@stability.ai

Published by
AI Generated Robotic Content