Why don’t they just enhance forensic audio to make it intelligible?

Image from Pixabay

When we listen to poor quality audio we often have the sense of being nearly able to hear the words.

That makes us susceptible to being misled by claims that speech has been ‘enhanced’ to make it ‘clearer’. We feel so sure we’d be able to hear the words if only the background noise could be removed, or the sounds could be fixed up a bit, that we are willing to trust those who say they have techniques to do this.

In fact, the possibilities for making unintelligible audio intelligible are extremely limited. Unfortunately, widespread false beliefs about speech make impossible feats seem plausible.

But we see it on television all the time!

The media presents many miraculous examples of totally unintelligible audio being made perfectly clear through the twiddling of a few knobs.

The classic 1974 Francis Ford Coppola movie The Conversation, starring Gene Hackman, gives an early example. The movie below should start at 5min 30sec, and you’ll need to watch about 2 minutes from there to get the key effect – but feel free to wind back and enjoy the whole scene if you have time. It is still a suspenseful movie, even after all these years!

Impressive, isn’t it?

By twiddling the knobs and listening to the effect, he is able to turn something completely unintelligible into a clear, and menacing, sentence. We see similar, though less cinematographic, scenes in many crime-scene-investigation-type shows. And yes, they are impressive.

But like many things in the movies, it is a trick. Any scene which purports to show unintelligible audio being made intelligible is produced by first creating the intelligible audio, then adding the effect that makes it unintelligible.

That way, the effect can be reversed at the right moment, with a dramatic hey presto flourish.

It is a bit like taking a nice clear photo, adding a blurry mask to it, and then removing the blurry mask to reveal the original clear picture. The difference is, everyone knows the possibilities of photoshopping, so there is no hey presto effect.

With audio, most people have no idea what is technically possible and what is not. Scenes like the one above only work because our society believes that such effects are possible.

But there must be something in it!

We expect a certain amount of poetic licence for the sake of entertainment.

Let’s face it, we see a lot of things on the telly, and they’re not all true

The problem is, our media-fed familiarity with this type of thing gives the impression that similar, if not quite so dramatically successful, effects must be possible in real life.

In fact, nothing remotely like this kind of effect is possible.

Unfortunately, however, our false beliefs about speech make such effects seem far more plausible than they should.

And it has to be said, there are some audio engineers who advertise their forensic capabilities by creating bogus materials in exactly the way we have just described.

All that creates a serious problem for justice in our legal system. To begin with the end in mind, we can say that the most dramatic effect of ‘enhancing’ indistinct audio is to enhance the believability of inaccurate transcripts.

This module gives you a look at how that works. Ready to dive in? Just a suggestion – quite a few things in this module are more believable once you have done Rethink Speech 101: Unlearning. It’s fine to do this Enhancing module first if you are interested in the topic – but do be aware there is important background at Rethink Speech 101: Unlearning.