Fooling YouTube Audio Fingerprints: Speed Works, Volume Doesn’t, and All That Matters Is the First 30 Seconds
YouTube and the music industry have a complicated relationship. Warner Music Group yanked its artists from the site, and user uploads that contain tracks from the company’s albums are automatically muted. But rival Universal Music Group recently announced a planned joint music video venture with the site called Vevo.
The main sticking point in all these deals is (of course!) money, but a big part of figuring out who that money goes to hinges on YouTube’s content identification program. For music, YouTube uses (at least in part) Audible Magic’s audio fingerprinting services.
How good is this audio fingerprinting tool, and how does it work? That’s a secret kept in a black box. But enterprising YouTube user retnirpregnif donated a bunch of his or her time to the greater good to try to figure out what’s going on. The experimenter uploaded 82 modified versions, mostly of the same repetitive song, I Know What Boys Like by The Waitresses (which I think would be on Universal Music Group, though that’s not specified).
Here are some of the most interesting results:
- The account was not taken down, despite 35 of the 82 test videos triggering Content ID notices.
- The best way to keep a track up without altering it too much is to increase or decrease the speed of playback by about 5 percent (a slowed-down example that didn’t get filtered is embedded above). Pitch alterations of more than 6 percent will also work.
- The fingerprinter may only look at the first 30 seconds of a song. It did not catch an upload of the song when the first 0:30 was muted.
- The fingerprinter is really good at recognizing a song despite changes in amplification, even when those changes caused a lot of distortion.
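To put the 5 percent speed figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the speed change is done by plain resampling, so pitch and tempo move together; the experimenter's actual method isn't specified. The function name and the 180-second track length are made up for illustration.

```python
import math

def speed_change_effects(speed_factor, duration_s):
    """Estimate what plain resampling does to a track.

    Assumes pitch and tempo change together (no time-stretching).
    Returns (pitch shift in semitones, new duration in seconds).
    """
    # One octave (a doubling of speed) is 12 semitones.
    semitones = 12 * math.log2(speed_factor)
    # Playing faster shortens the track; playing slower lengthens it.
    new_duration = duration_s / speed_factor
    return semitones, new_duration

# Hypothetical 180-second track, slowed down and sped up by 5 percent.
for factor in (0.95, 1.05):
    semis, dur = speed_change_effects(factor, 180.0)
    print(f"{factor:.2f}x speed: {semis:+.2f} semitones, track lasts {dur:.1f} s")
```

Under that assumption, a 5 percent change moves the pitch by a bit less than one semitone in either direction, which is audible next to the original but not drastic.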
Further reading: Way back in the day, we tested audio fingerprinting ourselves, but on TV shows instead of music tracks. It didn’t do well, to say the least.
© 2009 The GigaOM Network. Marketing consulting by ACS.


It’s a good idea. But if a song is resampled 5% down, won’t it sound odd or won’t the listener notice that?
Your example sounds so deep. Or is the original the same?
I don’t believe there’s any way with today’s technology to have such a system actually working accurately. It will likely cause tons of mismatches for the next 20 years, whenever anything that sounds like a song is identified. Another step back for YouTube and more issues for those who use it.
And even if it worked correctly, such a system is something shameful, and it only encourages more online oppression because some content owners have the false hope it would help them in some way. It’s a shame to watch everything regress when it could get better…