
Stability AI Releases New Japanese Language Model, Japanese StableLM, Marking Entry into International Language Model Market

Today, Stability AI released its first Japanese language model (LM), Japanese StableLM Alpha, the best-performing openly available LM created for Japanese speakers.

Japanese StableLM is a 7 billion-parameter general-purpose language model. It stands as the top-performing publicly available Japanese language model, according to a benchmark suite comparing it against four other Japanese LMs.

Japanese StableLM Base Alpha 7B will be released under the commercially usable Apache License 2.0. Japanese StableLM Instruct Alpha 7B was created for research purposes and is released exclusively for research use. For details, please refer to the Hugging Face Hub pages.

“We are proud of our first big step towards contributing to the Japanese generative AI ecosystem,” said Meng Lee, Project Lead of Japanese StableLM. “We look forward to continuing to create models across several modalities, built specifically to reflect Japanese culture, language and aesthetics.”

Japanese StableLM Base Alpha 7B

Japanese StableLM Base Alpha 7B is trained for text generation on large-scale data sourced mainly from the Web. The training data is predominantly Japanese and English text, with the remaining 2 percent consisting of source code.

In addition to open datasets, the training data includes datasets created by Stability AI Japan and datasets created with the cooperation of the Japanese team of the EleutherAI Polyglot project, along with members of Stability AI Japan’s community.

For training, we used software that extends EleutherAI's GPT-NeoX. The model architecture incorporates recent techniques such as SwiGLU activations and xPos positional embeddings. In total, 750 billion tokens were processed across the training epochs.
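To give a sense of one of the techniques mentioned above: SwiGLU replaces the standard feed-forward activation in a transformer block with a gated unit, where one linear branch is passed through the swish (SiLU) function and multiplied elementwise with a second linear branch before the down-projection. The sketch below is a minimal NumPy illustration of that structure, not Stability AI's actual implementation; all weight names and dimensions are hypothetical.

```python
import numpy as np

def swish(x, beta=1.0):
    # Swish / SiLU activation: x * sigmoid(beta * x).
    return x / (1.0 + np.exp(-beta * x))

def swiglu_ffn(x, W_gate, W_up, W_down):
    # SwiGLU feed-forward block: the elementwise product of a
    # swish-gated branch and a plain linear branch, then a
    # down-projection back to the model dimension.
    return (swish(x @ W_gate) * (x @ W_up)) @ W_down

# Hypothetical small dimensions for illustration.
rng = np.random.default_rng(0)
d_model, d_ff = 8, 16
x = rng.standard_normal((2, d_model))       # batch of 2 token vectors
W_gate = rng.standard_normal((d_model, d_ff))
W_up = rng.standard_normal((d_model, d_ff))
W_down = rng.standard_normal((d_ff, d_model))

y = swiglu_ffn(x, W_gate, W_up, W_down)
print(y.shape)  # → (2, 8)
```

The gating lets the network learn where to suppress or pass information in the feed-forward path, which is why SwiGLU variants have become common in recent large language models.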

Japanese StableLM Instruct Alpha 7B

The Japanese StableLM Instruct Alpha 7B model is a language model that is additionally tuned to follow user instructions. 

Supervised fine-tuning (SFT) was employed for the additional training, using multiple open datasets. As discussed below, SFT also significantly improves the model's score on the lm-evaluation-harness benchmark.
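Conceptually, SFT continues training the base model with the usual next-token cross-entropy loss, but on curated (instruction, response) sequences instead of raw Web text. The toy loop below illustrates that mechanic on a deliberately tiny stand-in model with random token data; it is not Stability AI's training setup, and the vocabulary size, model, and data are all hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-in "language model": an embedding followed by a linear
# head over the vocabulary. Real SFT would start from a pretrained LM.
vocab, d = 32, 16
model = nn.Sequential(nn.Embedding(vocab, d), nn.Linear(d, vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Hypothetical tokenized instruction-response pairs (random stand-ins
# for real SFT data): 8 sequences of 10 tokens each.
data = torch.randint(0, vocab, (8, 10))
inputs, targets = data[:, :-1], data[:, 1:]  # shift by one position

losses = []
for _ in range(50):
    logits = model(inputs)                   # (batch, seq, vocab)
    loss = loss_fn(logits.reshape(-1, vocab), targets.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(loss.item())
```

The shift between `inputs` and `targets` is the standard next-token objective; in practice SFT pipelines also mask the loss on the instruction tokens so only the response is learned, which is omitted here for brevity.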

Performance Evaluation

To evaluate performance, we tested the model on tasks that include sentence classification, sentence pair classification, question answering, and sentence summarization. We measured performance using EleutherAI's lm-evaluation-harness benchmark.

Following the conventions of the Open LLM Leaderboard, the average of the scores on the eight tasks is used as the overall evaluation of each model. Japanese StableLM Instruct Alpha 7B scored 54.71, placing it far ahead of other Japanese models. Stability AI Japan is also working to improve the evaluation methodology for testing these models.
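The overall figure is a simple unweighted mean of the per-task scores. The snippet below sketches that calculation with hypothetical per-task values; the post reports only the 54.71 average, not the individual eight task scores.

```python
# Hypothetical per-task scores for illustration only; the actual
# eight task scores are not given in this post.
task_scores = [62.0, 48.5, 55.0, 51.0, 58.5, 49.0, 60.0, 53.6]
assert len(task_scores) == 8

# Overall score: the unweighted mean, following the Open LLM
# Leaderboard convention.
overall = sum(task_scores) / len(task_scores)
print(round(overall, 2))  # → 54.7
```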

Terms of Use

The models are available on the Hugging Face Hub and can be used for inference and additional training. For more information, please visit the Hugging Face Hub pages.

About Stability AI

Stability AI is an open access generative AI company working with partners to deliver next-generation infrastructure globally. Headquartered in London with developers around the world, Stability AI’s open philosophy provides new avenues for cutting-edge research in imaging, language, code, audio, video, 3D content, design, biotechnology, and other scientific research. For more information, visit https://stability.ai/.
