ElevenLabs and Stability AI drop new AI music models. Can they catch up with Suno?



In brief

  • ElevenLabs has launched Music v2, which can switch genres mid-track, build songs section by section, and inpaint specific parts.
  • Stability AI has released Stable Audio 3.0, a family of four models, three of them with open weights, trained on licensed data and generating tracks up to six minutes and twenty seconds long.
  • Both releases lean heavily on licensed training data, but Suno, valued at $2.45 billion with nearly 100 million users, is still the platform most people reach for first.

Two significant AI music updates were released this week, neither of which came from Suno.

ElevenLabs, the Polish-founded voice AI company valued at $11 billion following a $500 million Series D in February, has launched Music v2. Stability AI, the Stable Diffusion people, dropped Stable Audio 3.0, a family of four models with open weights and tracks running longer than six minutes.

The backdrop is the Recording Industry Association of America's 2024 copyright lawsuits against Suno and Udio, which made "trained on licensed data" the most important phrase in any AI music announcement. Both ElevenLabs and Stability lean heavily on it, with the promise that you won't run into problems with the output you create.

Music v2: One track, opera to heavy metal, uninterrupted

Music v2 is ElevenLabs' second music model, arriving about 10 months after the first. The key pitch is consistency under pressure. According to ElevenLabs, a single track can go from opera to heavy metal and back, held together by fast rapping, and include non-musical sound effects, all without the composition falling apart.

Generative audio tends to fall apart when prompts get complex, so this is worth watching, especially on longer pieces of music.

Inpainting is now actually useful: select a section, regenerate it, and leave everything else untouched. Users can also build songs section by section (intro, verse, chorus), maintaining continuity throughout rather than treating each section as a standalone generation. Multilingual support has also improved, although ElevenLabs has not released specific details.
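The contract described above (regenerate one selection, leave the rest alone) can be illustrated with a small sketch. This is not ElevenLabs' implementation: the `inpaint` function, the `regenerate` callback standing in for the model, and the linear crossfade are all assumptions made for illustration. A real model would also condition the new audio on what surrounds it.

```python
from typing import Callable, List


def inpaint(
    track: List[float],
    start: int,
    end: int,
    regenerate: Callable[[int], List[float]],
    fade: int = 0,
) -> List[float]:
    """Return a copy of `track` with samples [start, end) replaced.

    `regenerate` is a hypothetical stand-in for the model: it receives the
    number of samples needed and returns that many new samples. `fade` is
    the number of samples blended at each edge of the selection.
    """
    if not 0 <= start < end <= len(track):
        raise ValueError("selection must lie inside the track")
    length = end - start
    if 2 * fade > length:
        raise ValueError("fade is too long for the selection")

    patch = regenerate(length)
    if len(patch) != length:
        raise ValueError("regenerated patch must match the selection length")

    out = list(track)  # samples outside the selection are copied unchanged
    for i in range(length):
        # weight 1.0 means "fully new audio"; it ramps up and down at the edges
        if i < fade:
            weight = (i + 1) / (fade + 1)
        elif i >= length - fade:
            weight = (length - i) / (fade + 1)
        else:
            weight = 1.0
        out[start + i] = (1 - weight) * track[start + i] + weight * patch[i]
    return out
```

Calling `inpaint(track, 3, 7, model_fn, fade=1)` touches only indices 3 through 6; the track length and every sample outside the selection stay exactly as they were.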

The model powers three platforms: ElevenMusic for creators, ElevenAPI for developers, and ElevenCreative for brands. It's live on ElevenMusic and ElevenCreative now; API access is in early access through the sales team.

ElevenLabs is also discounting Music v1 and v2 prices by up to 50% on ElevenAPI and up to 40% for self-serve on ElevenCreative. The company hit $500 million in annual recurring revenue in April 2026. Music is still a small part of that, but ElevenMusic, which launched as a consumer app in April, is a direct shot at Suno's user base.

Stable Audio 3.0: Open weights, on-device, actually longer

Stable Audio 2.0 topped out at three minutes and was already behind Suno when it launched in 2024. Stable Audio 3.0 offers four models: Small SFX (sound effects on device), Small (full music composition on device), Medium (up to 6:20, more powerful devices), and Large (API only). Three of the four have open weights on Hugging Face.

The Small models have 459 million parameters each and run without a GPU. (Parameters are, roughly, a measure of an AI model's capacity.) Medium has 1.4 billion parameters and generates its 6:20 output in about 1.31 seconds on an H200 GPU. Large, at 2.7 billion parameters, is API-only, for organizations with more than $1 million in revenue. Length control is precise: you get exactly the track length you requested, not an approximation.
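To put those numbers in perspective, here is a quick back-of-the-envelope calculation. The track length, generation time, and parameter counts come from the figures above; the 2 bytes per parameter (16-bit weights) is an assumption for illustration, not something Stability has stated here, and the function names are made up for this sketch.

```python
# Figures quoted in the article.
TRACK_SECONDS = 6 * 60 + 20   # 6:20, the maximum length for Medium
GENERATION_SECONDS = 1.31     # reported generation time on an H200 GPU
PARAMS = {"small": 459e6, "medium": 1.4e9, "large": 2.7e9}

# Assumption (not from the article): weights stored as 16-bit floats.
BYTES_PER_PARAM = 2


def realtime_factor(track_seconds: float, generation_seconds: float) -> float:
    """Seconds of audio produced per second of compute."""
    return track_seconds / generation_seconds


def weight_size_gb(param_count: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Approximate size of the raw weights in gigabytes (10**9 bytes)."""
    return param_count * bytes_per_param / 1e9


if __name__ == "__main__":
    factor = realtime_factor(TRACK_SECONDS, GENERATION_SECONDS)
    print(f"real-time factor: about {factor:.0f}x")
    for name, count in PARAMS.items():
        print(f"{name}: roughly {weight_size_gb(count):.1f} GB of weights")
```

By that math, Medium produces audio roughly 290 times faster than it plays back, and under the 16-bit assumption the Small models would need a bit under 1 GB for their weights, which is consistent with the on-device pitch.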

It is also supported in ComfyUI for local setups.

The architecture is new: an audio-semantic autoencoder that Stability calls SAME, designed to maintain melodic coherence over longer outputs. LoRA fine-tuning is supported, so artists can adapt the models to their own catalogs. There is also inpainting: single-segment, multi-segment, and causal continuation to extend a track beyond its original endpoint.

For context, a LoRA (Low-Rank Adaptation) is like a mini-model that steers how the full model generates its output. If you train a LoRA on blues music, the model will produce blues music, and if you train a LoRA on B.B. King's blues, the model will produce songs that sound like B.B. King. Inpainting means the model can fix small errors in its own creation. So, for example, if the model hallucinates something at the 2:30 mark, you can select a few seconds of the song, ask the model to change them to whatever you want, and the model will generate a clip that fits that time frame and blends in with the rest of the song.
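For readers who want to see why a LoRA is so much smaller than the model it steers, here is a minimal pure-Python sketch of the general technique. It is not Stable Audio's code: the function names and the toy matrices are made up for illustration. The idea is that the frozen weight matrix W is left alone, and the adapter adds a low-rank update B·A scaled by alpha / r, where r is the rank.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (m x k) and b (k x n)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply_lora(w: Matrix, b: Matrix, a: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / r) * B @ A without modifying W.

    w is d_out x d_in (the frozen base weights),
    b is d_out x r and a is r x d_in (the trained adapter).
    """
    rank = len(a)
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


def full_param_count(d_out: int, d_in: int) -> int:
    """Number of values in the full weight matrix."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Number of values in the two adapter matrices B and A."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]
    adapted = apply_lora(base, b=[[1.0], [2.0]], a=[[3.0, 4.0]], alpha=1.0)
    print(adapted)
    print(full_param_count(1024, 1024), lora_param_count(1024, 1024, 8))
```

For a single 1024 by 1024 layer, a rank-8 adapter trains 16,384 values instead of 1,048,576, which is why an artist can plausibly fine-tune on a personal catalog without retraining the whole model.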

Stability has been technically credible in AI music for years without a commercial breakthrough. The open-weight play is the Stable Diffusion strategy applied to audio: grow the developer community and see what gets built. The licensing is cleaner than anything Stable Audio has shipped before, with partnerships with Universal Music Group and Warner Music Group.

The target: Suno, the king of AI music

If ChatGPT is the king of AI text, then Suno is the king of AI music. The company behind the model was valued at $2.45 billion in November 2025, has annual recurring revenue exceeding $300 million, and is used by nearly 100 million people.

It produces about 7 million songs daily. Warner Music settled its suit against Suno in November 2025; Sony and UMG remain in federal court.

To avoid these copyright wars, ElevenLabs has licensing deals with Believe, Kobalt, and Merlin. Stability has Warner and Universal. Udio has settled with all three majors and is now a walled garden: nothing you create can leave the platform.

Stable Audio 3.0 Small and Medium are available on Hugging Face now. Large is accessed via the Stability AI API. Music v2 is free for ElevenMusic users, with commercial tiers through ElevenCreative and ElevenAPI.
