Adobe Firefly can now generate AI sound effects for videos – and I’m seriously impressed
Adobe / Elyse Betters Picaro / ZDNET

Just a year and a half ago, the latest and greatest of Adobe’s Firefly generative AI offerings involved producing high-quality images from text, with customization options such as reference images. Since then, Adobe has pivoted into text-to-video generation and is now adding a slew of features to make it even more competitive.

Also: Forget Sora: Adobe launches ‘commercially safe’ AI video generator. How to try it

On Thursday, Adobe released a series of upgrades to its video capabilities that give users more control over the final generation, more options for creating the video, and even new modalities to create in. As impressive as realistic AI-generated video has become, one crucial aspect of video generation has been missing: sound. Adobe’s new release seeks to give creative professionals the ability to use AI to create audio, too.

Generate sound effects

The new Generate Sound Effects (beta) lets users create custom sounds by entering a text description of what they’d like generated. For even more control over what is generated, they can also use their voice to demonstrate the cadence, timing, and intensity they’d like the generated sound to follow.

For example, if you want to generate a lion’s roar that matches the moments when the subject of your video opens and closes its mouth, you can watch the video, record a clip of yourself making the noise in time with the character’s movement, and then accompany it with a text prompt describing the sound you’d like created. You’ll then be given multiple options to choose from and can pick the one that best matches the vibe you’re going for.

Also: Adobe Firefly now generates AI images with OpenAI, Google, and Flux models – how to access them

While other video-generating models like Veo 3 can generate video with audio from text, what really stands out about this feature is the amount of control users have when inputting their own audio. Before launch, I had the opportunity to watch a live demo of the feature in action. It was truly impressive to see how well the generated audio matched the input audio’s flow while also incorporating the text prompt to create a sound that actually resembled the intended output (no shade to the lovely demoer, who did his best to sound like a lion roaring into the mic).

Generate visual avatars

Another feature launching in beta is Text to Avatar, which, as the name implies, turns scripts into avatar-led videos, or videos that look like a live person reading the script. When picking an avatar, you can browse the library of avatars, pick a custom background and accents, and then Firefly creates the final output.