Stability AI, the world’s leading open generative AI company, today announced the launch of Stable Audio, the company’s first AI product for music and sound generation.
Stable Audio is a first-of-its-kind product that uses the latest generative AI techniques to deliver faster, higher-quality music and sound effects via an easy-to-use web interface. Stability AI offers a basic free version of Stable Audio, which can be used to generate and download tracks of up to 20 seconds, and a ‘Pro’ subscription, which delivers 90-second tracks that are downloadable for commercial projects.
“As the only independent, open and multimodal generative AI company, we are thrilled to use our expertise to develop a product in support of music creators,” said Emad Mostaque, CEO of Stability AI. “Our hope is that Stable Audio will empower music enthusiasts and creative professionals to generate new content with the help of AI, and we look forward to the endless innovations it will inspire.”
Stable Audio is ideal for musicians seeking to create samples to use in their music, but the opportunities for creators are limitless. Audio tracks are generated in response to descriptive text prompts supplied by the user, along with a desired length of audio. For instance, “Post-Rock, Guitars, Drum Kit, Bass, Strings, Euphoric, Up-Lifting, Moody, Flowing, Raw, Epic, Sentimental, 125 BPM” can be entered with a request for a 95-second track, and Stable Audio will generate a track to match.
The underlying model was trained using music and metadata from AudioSparx, a leading music library, in a partnership between the companies that will generate both economic and creative value for all parties.
Stable Audio is the first music generation product enabling the creation of high-quality, 44.1 kHz music for commercial use via latent diffusion. The latent diffusion model is conditioned on text metadata as well as on audio file duration and start time, which allows control over both the content and the length of the generated audio. You can read more about the research behind the model here. For further information or to provide feedback on the release, we welcome you to contact us at research@stability.ai.
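To make the conditioning described above more concrete, here is a minimal Python sketch of how the text and timing signals might be assembled. It is an illustration under stated assumptions, not Stability AI's implementation: the `Conditioning` class, the field names `seconds_start` and `seconds_total`, and all three helper functions are hypothetical. Only the 44.1 kHz sample rate, the use of text metadata, duration and start time, and the example prompt length come from the announcement.

```python
from dataclasses import dataclass

SAMPLE_RATE_HZ = 44_100  # 44.1 kHz, as stated in the announcement


@dataclass(frozen=True)
class Conditioning:
    """Hypothetical container for the signals the model is conditioned on."""
    prompt: str           # descriptive text metadata
    seconds_start: float  # assumed name: where the audio chunk begins in its file
    seconds_total: float  # assumed name: full duration of the audio file


def training_conditioning(prompt: str, chunk_start_s: float,
                          file_duration_s: float) -> Conditioning:
    """Describe a training chunk by its text, start time, and file duration."""
    if not 0 <= chunk_start_s <= file_duration_s:
        raise ValueError("chunk start must lie within the file")
    return Conditioning(prompt, float(chunk_start_s), float(file_duration_s))


def generation_conditioning(prompt: str, desired_length_s: float) -> Conditioning:
    """Request a track that starts at 0 s and lasts the desired length."""
    if desired_length_s <= 0:
        raise ValueError("desired length must be positive")
    return Conditioning(prompt, 0.0, float(desired_length_s))


def samples_per_channel(seconds: float) -> int:
    """Number of audio samples per channel for a given length at 44.1 kHz."""
    return round(seconds * SAMPLE_RATE_HZ)


request = generation_conditioning("Post-Rock, Guitars, Drum Kit, 125 BPM", 95)
print(request, samples_per_channel(request.seconds_total))
```

In this sketch, a request for a 95-second track corresponds to 4,189,500 samples per channel at 44.1 kHz. Treating the start time and total duration as explicit inputs is what would let a user choose the output length at generation time.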
You can try Stable Audio at www.stableaudio.com.