
AI In Sound Design: We're Not There Yet

AI is a useful creative tool... except when it isn’t.

Everyone is talking about AI right now, and of course everyone is trying to sell you AI right now. We are all being told how wonderful it is and how much more wonderful it will be as it evolves, except, strangely:

Whenever people use AI in the specific field in which they work, they report back on its limitations, its errors, and how it will even outright lie.

It is also worth keeping in mind that many of the folk who are trying to sell you AI solutions are the exact same folk who were trying to sell you NFTs, and before that crypto, and before that...

So let’s explore a single potential use case for AI that is already being marketed as a time- and cost-saving solution for the entertainment industry: sound design.

AI in Sound Design - The Hidden Complexities


Arguably, you could train AI on millions of existing sound effects and get it to a point where it could generate convincing sounds from “thin air”. Bang, solution achieved, sack all the sound designers, save millions worldwide, celebrate the win. Then the next time you create a film, TV show or video game, you could just ask the AI to... to what, exactly?

“Dear AI, please make me a sound for waves on a beach?”

Good luck with that!

Do I think AI couldn’t make a sound of waves on a beach? No, I am 100% sure you could train an AI to do that. So why do I think this is not a viable answer for media production? That question is the very core of the issue and the very core of this article.

I have been a sound designer for over 25 years, and part of that role involves collecting massive libraries of sounds. I have tens of thousands of sounds I have recorded myself, and on top of that I have purchased or been given access to hundreds of thousands of other sound effects. Almost every day I access those libraries and search through them for the sounds I need.

Sometimes I find a single sound file that is perfect for the needs of a project, and other times I will layer and combine dozens of sounds to create new ones. The process of providing sounds for media is extremely complex and multi-faceted.
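
As a rough illustration of what layering means at the signal level, the Python sketch below sums a few lists of samples with a gain per layer and scales the result down if it would clip. The function name mix_layers, the gain values and the sample data are all assumptions made for illustration. Real layering happens in an audio workstation and involves far more than addition, which is rather the point.

```python
def mix_layers(layers, gains):
    """Sum several mono layers (lists of floats in -1.0..1.0) into one.

    Hypothetical helper for illustration only: each layer is scaled by
    its gain, shorter layers are treated as silent once they end, and
    the mix is scaled down only if its peak would exceed full scale.
    """
    if len(layers) != len(gains):
        raise ValueError("need exactly one gain per layer")
    if not layers:
        return []

    length = max(len(layer) for layer in layers)
    mixed = [0.0] * length

    # Add each layer into the mix at its own gain.
    for layer, gain in zip(layers, gains):
        for i, sample in enumerate(layer):
            mixed[i] += sample * gain

    # Avoid clipping: scale the whole mix down if the peak is above 1.0.
    peak = max((abs(s) for s in mixed), default=0.0)
    if peak > 1.0:
        mixed = [s / peak for s in mixed]
    return mixed


# Made-up sample data: a "waves" layer and a quieter, shorter "wind" layer.
waves = [0.5, 0.8, -0.6, 0.2]
wind = [0.4, 0.4, -0.4]
beach = mix_layers([waves, wind], gains=[1.0, 0.5])
```

The arithmetic is the easy part. Choosing which layers to use, and how loud each one should be for a given scene, is the judgment the rest of this article is about.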

It is not just “get sound of beach”, it is a process of investigation, consultation and examination.
I need to understand what the client wants, what the project needs and what the narrative of the project is trying to convey. At the most basic level I need to understand whether those waves are large or small, calm or violent, loud or soft. Is the beach sand or stone? Is the weather hot or cold?

All of the above would define what kind of wave sounds might be appropriate. But once I have a short list of potential sounds, I then need to determine what the scene is depicting.

Sounds Are Not Just Sounds - They Tell A Story


Humans tell stories. That is what we do. Most stories are an emotional journey, and sound’s principal role is to support that narrative. So, if the scene with the beach is a memory of times past with sad undertones as our hero remembers the tragic loss of her parents, then I am going to provide a very different sound than if we are depicting the Normandy landings.

I have literally spent hours going through beach sounds, trying to find the exact best sound to support the emotional framework of a project narrative, only to realize that nothing in the hundreds of beach sounds I have is suitable.

The solution then was to drive 100 kilometers and spend a day on the coast capturing more suitable content.

This is one example where the effort required to get the perfect sound was considerable. If we push that task onto the AI, how do we even convey to it the feelings we need to achieve, and how does the AI know what to create?

The Human Element Defies Annotation


We can and do categorize sound effects into many different piles. We have an excellent Universal Category System that helps designers refine their searches to find the sounds they want.
But those categories are for places and species and things.

I know in an instant if the cat meow I am hearing is neutral or a plaintive cry for help, but often very little within the sound metadata indicates that, so how would AI ever know one meow from another? Sure, we can work on providing more data, and provide better training to the AI, but then there is the human element.
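
To make that metadata gap concrete, here is a minimal Python sketch of a category search. The SoundFile fields, the category names and the file names are hypothetical. They are loosely modeled on a category and subcategory scheme and are not taken from the actual Universal Category System.

```python
from dataclasses import dataclass


@dataclass
class SoundFile:
    # Hypothetical metadata fields, for illustration only.
    filename: str
    category: str
    subcategory: str
    description: str


# A made-up mini library. The two cat recordings could be emotionally
# very different, but nothing in their metadata says so.
LIBRARY = [
    SoundFile("cat_meow_01.wav", "ANIMALS", "CAT", "Cat meow, single, close"),
    SoundFile("cat_meow_02.wav", "ANIMALS", "CAT", "Cat meow, single, close"),
    SoundFile("dog_bark_01.wav", "ANIMALS", "DOG", "Dog bark, medium, exterior"),
]


def search(library, category, subcategory):
    """Return every file whose category and subcategory both match."""
    return [
        sound
        for sound in library
        if sound.category == category and sound.subcategory == subcategory
    ]


results = search(LIBRARY, "ANIMALS", "CAT")
for sound in results:
    print(sound.filename, "|", sound.description)
```

Both cat files come back with identical descriptions. The search has done its job, and it still cannot say which meow is neutral and which is a cry for help. That call is made by ear.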

One of the greatest challenges for any creative professional is getting the exact information they need from the client to be able to create the best content. Your average client knows basically nothing about sound effects and sound design. It’s not their job; it is mine.

We will ask them questions, request context, read the script, look at the visuals and then we will attempt to create something that is suitable for the project.

Often, we then get asked for revisions or told “this is not quite right” or, worse still, we get asked “can you make this sound more blue?” Many creative professionals struggle to get useful and actionable feedback from clients, and now we have people suggesting you can just talk directly to the AI and it will give you the sounds you need.

I await the results of those AI requests. I am sincerely curious to see what AI can come up with.

Understanding What A Sound Designer Does In the First Place


As I mentioned, I have access to hundreds of thousands of sound files. It could be argued that my job should be a simple case of just doing a search for sound X and then dropping it into any project. And I could do that, if I wanted every single project to basically sound the same.

Sure, a door sounds like a door, and so in some cases I can use a sound directly from a library. But for any of the sounds that actually matter, the very identity of the project relies on creating original sound content.

Don’t believe me? How do you think the average audience would react if every single sci-fi film since Star Wars just made all their space planes sound like TIE Fighters, or if every monster on screen sounded exactly like Godzilla?

We specifically and purposefully craft unique sounds for each project to allow them to have a unique identity of their own. By its very design AI can only copy what has come before. So very quickly everything made by AI is going to start to sound the same or very similar. In fact, the TIE Fighter sound is a perfect example.

We would never have had the incredible and groundbreaking sounds of Star Wars if AI had been around at the time. It would have generated sounds based on everything that had come before.
All the space planes would have just gone “whoosh”.

So if we already have a situation where human clients struggle to communicate their needs to human sound designers and openly admit they do not have the words and phrases to explain exactly what it is they want, and where the most valuable sound designers in the industry are those with years of experience interpreting the needs of the client, then I am unsure where AI slots into this equation.

The Current Limitations Of AI In Sound Design


In the short term, an AI might be able to help me quickly create an interesting layer within the many layers I combine to make a final sound. But even as an experienced sound designer, I am not confident that it could provide me with consistently useful results.

We already use “AI” in various forms in the many plugins and scripts that automate aspects of our work. AI is not new and AI will continue to evolve and contribute to the creative process, but beware of anyone trying to sell you a bottle of SnakeAI.

I can tell you with absolute confidence that if you think an AI program is going to provide you with all the sounds you need without the input of a creative team, then you are sadly being taken advantage of by the same set of people who last week were trying to sell you some other “tech advancement”.
AI may become part of the solution, but we are a very long way from it being THE solution.

Trailer Sound Designer, Location Recordist, Sound Librarian
Game Audio Outsourcing & Custom Music

Li & Ortega Pte Ltd is a Singapore-incorporated company that operates out of:

Boston, MA | Austin, TX | Milwaukee, WI | Seattle, WA | Singapore | Melbourne, Australia

© 2025 Li & Ortega Pte Ltd

7030 Ang Mo Kio Avenue 5 #09-46, Northstar @ AMK Singapore 569880
