Stability.ai has introduced Stable Audio, an AI model that generates songs with a distinct beginning and ending. The model is also fast: on a single NVIDIA A100 GPU, it can generate a 95-second song in about 1 second.

Stable Audio's architecture takes timing information about the audio as an input, which gives control over the total length of the generated piece. The core diffusion model has 907 million parameters and was trained on a dataset of 800,000 audio files totaling more than 19,500 hours.
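To put the quoted figures in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers reported above; the function names are illustrative, and the "over 19,500 hours" figure is treated as exactly 19,500 hours, so the average clip length is a rough lower bound rather than an official statistic.

```python
# Figures as quoted in the article (not independently measured).
GENERATED_SECONDS = 95      # length of the generated song
WALL_CLOCK_SECONDS = 1      # generation time on one A100, per the article
DATASET_FILES = 800_000     # number of training audio files
DATASET_HOURS = 19_500      # "over 19,500 hours", assumed exact here


def real_time_factor(audio_seconds: float, compute_seconds: float) -> float:
    """Seconds of audio produced per second of compute."""
    return audio_seconds / compute_seconds


def average_clip_seconds(total_hours: float, file_count: int) -> float:
    """Mean file duration in seconds, given total hours and file count."""
    return total_hours * 3600 / file_count


speedup = real_time_factor(GENERATED_SECONDS, WALL_CLOCK_SECONDS)
avg_clip = average_clip_seconds(DATASET_HOURS, DATASET_FILES)

print(f"Generation speed: about {speedup:.0f}x faster than real time")
print(f"Average training file length: about {avg_clip:.2f} seconds")
```

Under these assumptions, generation runs roughly 95 times faster than playback, and the average training file is a little under a minute and a half long.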

So far, Stability.ai has released only sample audio files generated with the model, but it has announced plans to release an open-source model and the source code in the near future. The announcement follows Meta’s recent release of MusicGen, a model that is also designed for song creation.

