I wonder if this technique could be used in computerized language translation and other voice recognition areas. Your voice becomes like a fingerprint in that a recording of it could be compared to a wiretap, etc.

Computer, Name That Tune!

A British company called Shazam is making big bucks off the mathematical analysis of music. Avery Wang, one of the company’s founders, recently met with me in a coffee shop in California to explain how his company is doing it.

Wang wasted no time before beginning a demonstration. He put down his coffee cup and held his cell phone in the air, far from his ear.
“Let’s see what’s playing,” he said. The light jazz playing in the coffee shop was almost drowned out by the forced hiss of the espresso machine and the caffeine-charged chatter around us. A bar on the cell phone counted down 10 seconds, and then some text appeared on the tiny screen.

“Esbjorn Svensson Trio, Strange Place for Snow,” he said, holding the cell phone close enough to read the song title on the display.

Wang analyzed the music’s spectrogram, a graph that shows the intensity of the sound at each frequency over time. He found that the energy peaks, also called spectral peaks, showed up in both the original song and the version he received over the cell phone.
To speed up the program and increase its robustness, he used a standard technique from computer science known as “hashing,” in which he compared pairs of spectral peaks from each version. In many of the pairs of peaks taken from the signal sent by cell phone, at least one peak came from noise. But he found that he could identify the piece even if only 5% of the pairs came from the original music.
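
To make the peak-pair idea a little more concrete, here is a rough Python sketch. It is not Shazam's actual code: the function names, the `fan_out` and `max_dt` parameters, and the representation of a peak as a `(time_frame, frequency_bin)` tuple are all assumptions made for illustration. Each peak is paired with a few later peaks, each pair becomes a hash key, and a clip is matched by counting how many of its keys line up with a stored song at one consistent time offset.

```python
from collections import Counter, defaultdict


def pair_hashes(peaks, fan_out=3, max_dt=50):
    """Yield ((f1, f2, dt), t1) for each anchor peak paired with a few later peaks.

    `peaks` is a list of (time_frame, frequency_bin) tuples (assumed format).
    `fan_out` and `max_dt` are hypothetical tuning knobs.
    """
    peaks = sorted(peaks)  # order by time, then frequency
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # later peaks are too far away to pair with this anchor
            if dt == 0:
                continue  # skip peaks in the same time frame
            yield (f1, f2, dt), t1
            paired += 1
            if paired == fan_out:
                break


def build_index(songs):
    """Map each pair hash to the (song name, anchor time) places it occurs."""
    index = defaultdict(list)
    for name, peaks in songs.items():
        for key, t in pair_hashes(peaks):
            index[key].append((name, t))
    return index


def best_match(index, query_peaks):
    """Vote for (song, time offset); return the winning song and its vote count."""
    votes = Counter()
    for key, t_query in pair_hashes(query_peaks):
        for name, t_song in index.get(key, ()):
            votes[(name, t_song - t_query)] += 1
    if not votes:
        return None, 0
    (name, _offset), count = votes.most_common(1)[0]
    return name, count
```

Calling `build_index` on a dictionary of song names and peak lists, then `best_match(index, clip_peaks)` on the peaks from a noisy clip, returns the best candidate and its vote count. Pairs that include a noise peak produce keys that mostly miss the index or land on scattered offsets, so in this sketch a modest number of surviving pairs is enough to produce a clear winner, which fits the observation above that only a small fraction of pairs needs to come from the original music.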

  1. hhopper says:

    Wow! Audio DNA! This could have a potentially huge impact in the courts.

  2. Lauren the Ghoti says:

    #0 – Uncle Dave

    “Your voice becomes like a fingerprint in that a recording of it could be compared to a wiretap, etc.”

    Voice spectral analysis came out of Bell Labs over 60 years ago; but it’s not as infallible as many supposed in the ’60s, when it was commonly and erroneously believed that voices are unique, as fingerprints are.

  3. andron says:

    http://www.411song.com/ has been doing this for over a year… what makes these guys different?

  4. KVolk says:

    I guess that’s cool. I won’t have to rack my brain anymore over some song I hear in public and can’t place, the kind that drives you nuts all day.

  5. Jonas Åström says:

    This feature is also present in many SonyEricsson cell phones. My W850i has it (TrackID), for instance. It’s a really cool feature 🙂


