Announcing Stable Audio, a product for music & sound generation

Stability AI, the world’s leading open generative AI company, today announced the launch of Stable Audio, the company’s first AI product for music and sound generation.

Stable Audio is a first-of-its-kind product that uses the latest generative AI techniques to deliver faster, higher-quality music and sound effects via an easy-to-use web interface. Stability AI offers a basic free version of Stable Audio, which can be used to generate and download tracks of up to 20 seconds, and a ‘Pro’ subscription, which delivers 90-second tracks that are downloadable for commercial projects.

“As the only independent, open and multimodal generative AI company, we are thrilled to use our expertise to develop a product in support of music creators,” said Emad Mostaque, CEO of Stability AI. “Our hope is that Stable Audio will empower music enthusiasts and creative professionals to generate new content with the help of AI, and we look forward to the endless innovations it will inspire.”

Stable Audio is ideal for musicians seeking to create samples to use in their music, but the opportunities for creators are limitless. Audio tracks are generated in response to descriptive text prompts supplied by the user, along with a desired length of audio. For instance, a user can enter the prompt “Post-Rock, Guitars, Drum Kit, Bass, Strings, Euphoric, Up-Lifting, Moody, Flowing, Raw, Epic, Sentimental, 125 BPM” with a request for a 95-second track, and Stable Audio will generate a track matching that description.

The underlying model was trained using music and metadata from AudioSparx, a leading music library, in a partnership between the companies that will generate both economic and creative value for all parties.

Stable Audio is the first music generation product enabling the creation of high-quality, 44.1 kHz music for commercial use via latent diffusion. The latent diffusion model is conditioned on text metadata as well as audio file duration and start time, which gives control over both the content and the length of the generated audio. You can read more about the research behind the model here. For further information or to provide feedback on the release, we welcome you to contact us at research@stability.ai.
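To make the conditioning inputs concrete, here is a minimal Python sketch of the three values the announcement names: a text prompt, a start time, and a total duration. The `GenerationRequest` class, its field names, and the `num_samples` helper are hypothetical and exist only for illustration; they are not Stable Audio's API. The validation rule that the start time must fall inside the track is likewise an assumption. The only figures taken from the announcement are the 44.1 kHz sample rate and the 95-second example.

```python
from dataclasses import dataclass

SAMPLE_RATE_HZ = 44_100  # sample rate stated in the announcement


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical container for the conditioning inputs (not a real API)."""

    prompt: str           # descriptive text prompt
    seconds_start: float  # start time, in seconds
    seconds_total: float  # requested track length, in seconds

    def __post_init__(self) -> None:
        # Assumed sanity checks; the announcement does not specify any.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.seconds_total <= 0:
            raise ValueError("seconds_total must be positive")
        if not 0 <= self.seconds_start < self.seconds_total:
            raise ValueError("seconds_start must lie within the track")


def num_samples(request: GenerationRequest) -> int:
    """Samples per channel needed to hold the requested duration."""
    return round(request.seconds_total * SAMPLE_RATE_HZ)


request = GenerationRequest(
    prompt="Post-Rock, Guitars, Drum Kit, Bass, Strings, Euphoric, 125 BPM",
    seconds_start=0.0,
    seconds_total=95.0,
)
print(num_samples(request))  # 4189500
```

At 44.1 kHz, a 95-second track works out to 4,189,500 samples per channel, which gives a sense of the output size the model has to cover.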

You can try Stable Audio at www.stableaudio.com.
