Stability.ai has introduced Stable Audio, an AI model that composes songs with a distinct beginning and ending. The model is fast: running on a single NVIDIA A100 GPU, it can generate a 95-second song in about 1 second.
The architecture of Stable Audio takes timing information about the audio as an input, which gives control over the overall length of the composition. The core diffusion model has 907 million parameters and was trained on a dataset of 800,000 audio files, totaling more than 19,500 hours.
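To put these figures in perspective, the short Python sketch below derives two numbers from them: how much faster than real time the model runs, and the average length of a training file. The function and constant names are illustrative, not taken from Stability.ai, and the result for the dataset is a lower bound because the total is quoted as "over" 19,500 hours.

```python
# Figures quoted in the article; all names here are illustrative only.
GENERATED_SECONDS = 95      # length of the generated song
COMPUTE_SECONDS = 1         # time taken on a single A100
DATASET_FILES = 800_000     # audio files in the training set
DATASET_HOURS = 19_500      # quoted as "over 19,500 hours"


def real_time_factor(audio_seconds, compute_seconds):
    """Seconds of audio produced per second of computation."""
    return audio_seconds / compute_seconds


def average_clip_seconds(total_hours, file_count):
    """Mean length of one training file, in seconds."""
    return total_hours * 3600 / file_count


if __name__ == "__main__":
    rtf = real_time_factor(GENERATED_SECONDS, COMPUTE_SECONDS)
    avg = average_clip_seconds(DATASET_HOURS, DATASET_FILES)
    print(f"Generation speed: {rtf:.0f}x real time")
    print(f"Average training file: at least {avg:.2f} seconds")
```

By this arithmetic, generation runs at roughly 95 times real time, and the average training file is a little under a minute and a half long.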
So far, Stability.ai has released only sample audio files created with the model. However, the company has announced plans to release the model and its source code as open source in the near future. This follows Meta’s recent release of MusicGen, a model also designed for song creation.
Image created using Ideogram AI.
TLDR: Stability.ai introduces Stable Audio, an AI model that creates songs with distinct intros and outros, generating 95 seconds of music in about 1 second on a single NVIDIA A100. Timing information built into the architecture gives control over song length. The core diffusion model has 907 million parameters and was trained on 800,000 audio files (over 19,500 hours). Only sample audio files have been released so far, with plans to open-source the model and source code in the future, following Meta’s MusicGen release.