Ezra Sandzer-bell’s Post

Fractional CMO for B2B and B2C Music Tech Startups | Founder at AudioCipher Technologies | Marketing at Audio Design Desk

Last week Suno teased a new audio-to-audio feature, but they're not alone in this space. 👀

You probably saw the demo - it shows a person tapping on a watering can and turning it into a heavy psych rock track. (It's on their Twitter account here if you missed it: https://lnkd.in/gYMcJfYZ)

Some people mocked the demo, pointing out that it's just BPM-matching. I have a feeling that the actual feature will go deeper, based on what other companies like SoundGen, CassetteAI and Stability AI previously brought to market.

Before I dive into competitors and alternatives, let me say this. When I saw Suno's announcement, I nearly fell out of my chair. If they've got a multimodal system that combines audio input with text prompts, it's probably going to blow everyone else out of the water. 🤺

But they have at least one serious competitor to look out for. Back in November 2023, Google DeepMind shared a screenshot of an interface for their unreleased Lyria model. Lyria's design piggybacked on claims from the MusicLM team (January 2023), saying they could combine melodic conditions with text to generate song arrangements.

That never materialized. When they rebranded as MusicFX, audio-to-audio was still missing. Even this month, with the big AI music reveal at Google's I/O event, we saw combinatory music seeds but no audio-to-audio. 🙄

So what the heck, Google? You going to release this thing or not?

Anyway, pulling back to look at the big picture, there are several subcategories under the umbrella of "audio to audio" and only a few of them turn melodies into songs. In this new AudioCipher Technologies article, I've mapped out some subcategories and identified the big players.

Here's a high level summary:

🎶 Melodic conditioning: Humming, whistling, or performing a solo melody on an instrument and turning it into a complete arrangement

💿 Music samples into songs: Turning multi-instrumental music arrangements into new music clips and extending them to generate new song sections.

🎙 Voice cloning for singers: Transferring an audio recording of a singing voice into another vocalist's style and timbre.

🎸 👉 🎷 Tone transfer: Using AI models that were trained on an instrument to turn user input, like a guitar performance, into new instruments like a violin or saxophone.

😚 👉 🎸🥁 Style transfer: Changing the style of audio and music inputs. It combines melody conditioning, tone transfer, and remixing in a single function.

🎹 Audio-to-midi-to-audio: A more precise approach to tone transfer. ML is used in the first step, but standard virtual instruments are used during the MIDI-to-audio step.

Check out the article below for a complete overview, and some video demos of how these tools work. I'll follow up with another report when Suno and Google roll out their models to the public.

#ai #aimusic #suno #audio #music

https://lnkd.in/gfnPiC5G
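For readers who want to see the audio-to-midi-to-audio idea in miniature, here is a rough Python sketch. It is not how any of the tools named above work: real products use ML for the audio-to-MIDI step, while this toy swaps in a simple zero-crossing pitch estimator that only copes with a clean, single-note input. The function names, the 8 kHz sample rate, and the square-wave "instrument" are all assumptions made for illustration.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed value, kept low so the example runs quickly


def sine_wave(freq_hz, seconds, sample_rate=SAMPLE_RATE):
    """Generate a pure tone as a stand-in for a recorded performance."""
    n = int(seconds * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def estimate_frequency(samples, sample_rate=SAMPLE_RATE):
    """Step 1 (audio to pitch): count upward zero crossings.

    A crude substitute for the ML transcription step; it only works on
    clean monophonic audio.
    """
    crossings = [
        i for i in range(1, len(samples))
        if samples[i - 1] < 0 <= samples[i]
    ]
    if len(crossings) < 2:
        return 0.0
    periods = len(crossings) - 1
    span_in_samples = crossings[-1] - crossings[0]
    return periods * sample_rate / span_in_samples


def frequency_to_midi(freq_hz):
    """Step 2 (pitch to MIDI): nearest MIDI note, with A4 = 440 Hz = note 69."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


def midi_to_frequency(note):
    """Inverse mapping used by the toy instrument below."""
    return 440.0 * 2 ** ((note - 69) / 12)


def render_square(note, seconds, sample_rate=SAMPLE_RATE):
    """Step 3 (MIDI to audio): play the note on a square-wave 'instrument'."""
    freq = midi_to_frequency(note)
    n = int(seconds * sample_rate)
    return [
        1.0 if math.sin(2 * math.pi * freq * i / sample_rate) >= 0 else -1.0
        for i in range(n)
    ]


recording = sine_wave(261.63, 0.5)           # stand-in for a recorded middle C
note = frequency_to_midi(estimate_frequency(recording))
new_audio = render_square(note, 0.5)         # same note, different timbre
```

The point of the sketch is the shape of the pipeline: once the performance is reduced to note numbers, any virtual instrument can play them back, which is why this route is more predictable than end-to-end tone transfer.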

Audio to Audio AI: Melody-to-Song, Style Transfer & More

audiocipher.com

All the same things you can do with images and video you can do with audio. But many of these features would trigger an immediate boss battle with the music industry, which is why nobody wants to be first. Not a technical hurdle at all.

👏 Once again, another fine breakdown. "Audio-to-midi-to-audio: A more precise approach to tone transfer. ML is used in the first step, but standard virtual instruments are used during the MIDI-to-audio step." Thanks so much for calling this out. I have to say that some of the "AI" demos of "Tone transfer" really feel like the kind of demos we did at Opcode in the '90s with Studio Vision audio-to-midi, and the Steinberg Media Technologies audio-to-midi demos from a few years ago (2007 or ?). Funnily enough, both did a sax to a MIDI instrument. People are amazed. I know you wouldn't be 😀

Diamond Duggal

Music Producer | AI Music Consultant | Founder at DesiRock | Founder at SUPRODA

2mo

Great article! Audio to audio is definitely the missing link in original AI music creation for professionals, especially on Suno and Udio. Trying to force the melody is currently a frustrating process of creating many versions and multiple edits...

Drew Thurlow

Entertainment Executive | Music Tech & AI | Streaming & DSPs | Artist & Label Relations | Recorded Music & Publishing

2mo

Interestingly, Mikey at Suno told me their actual musical training data is multimodal

Christopher Wieduwilt (The AI Musicpreneur)

Helping music creators grow with AI (aimusicpreneur.com)

2mo

Great insights as always Ezra! Here's me waiting patiently for that Suno release:

  • [Image, no alt text available]
Bence Csernak

AI Builder and Engineer • AI Adoption and Integration Specialist • Strategic Product Designer • User Experience

2mo