On Collaborative Filtering




A recent set of technologies has been devised to help websites learn

about their users and take intelligent actions accordingly. These

technologies, called “recommendation engines” or “collaborative

filtering,” examine a user’s past viewing habits and compare them with

those of other users who have similar interests. If your interests were found

to parallel another group of users, then the system could start making

suggestions: suppose you normally never listened to any country music,

but you liked bands X, Y, and Z a whole lot. Now if a whole bunch of

other people who don’t normally like country and also like X, Y, and Z

suddenly all are listening to (and loving) this one country band, the

system might suggest it to you and be relatively confident that you’ll

like it.
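
The matching step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not how Amazon.com, Firefly, or any other real engine works: the Jaccard overlap used to measure how similar two listeners are is an assumed choice, and the function names, the user names, and the "Country Band" entry are all hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Overlap between two sets of liked bands, from 0.0 (none) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(likes, user, top_n=3):
    """Rank bands the user has not heard by the summed similarity of the users who like them."""
    mine = likes[user]
    scores = defaultdict(float)
    for other, theirs in likes.items():
        if other == user:
            continue
        sim = jaccard(mine, theirs)
        if sim == 0.0:
            continue  # no shared taste, so this listener's likes carry no weight
        for band in theirs - mine:
            scores[band] += sim
    # Highest score first; ties broken alphabetically so the order is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [band for band, _ in ranked[:top_n]]


# Hypothetical listening data: "you" like X, Y, and Z but have never tried country.
likes = {
    "you": {"X", "Y", "Z"},
    "ann": {"X", "Y", "Z", "Country Band"},
    "bob": {"X", "Y", "Z", "Country Band"},
    "cat": {"X", "Z", "Country Band"},
    "dan": {"Q", "R"},
}

print(recommend(likes, "you"))
```

Because the three listeners whose tastes overlap with yours all like the country band, it is the one suggestion the sketch produces, while the listener with no bands in common contributes nothing.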

This technology is neither brand new nor obscure: Amazon.com uses it

extensively on their website to recommend books to buyers. Indeed,

Firefly applied recommendation engine technology to CD purchases on

the web many years ago. Unfortunately for them, they took too long to

license out their technology (they wanted to be the only people on the

‘Net with that technology) and were subsequently steamrolled as

companies like Net Perceptions came to market swiftly with

sophisticated engines. Microsoft quietly picked Firefly off with what

was rumored to be a humiliatingly cheap acquisition.

But the fact that no online digital music provider has yet openly

embraced this technology seems surprising to me: this technology is

absolutely key to the success of online audio. Why? Because new

publishing and distribution infrastructures will make it very easy for

artists to publish profusely on the web. As with 500-channel

television, the diversity of content is appreciated, but the sheer

quantity of music on the Net could prove so overwhelming as to

discourage listeners (and potential buyers) from seeking out the music

they would enjoy. Techie geeks refer to this as “the signal-to-noise

ratio problem”: if you only hear one band you like (signal) for every

twenty you don’t (noise), you won’t want to spend your time poking

around for that one band.

The record industry had a fairly effective technique for increasing

the signal-to-noise ratio for music: the original point of those

practicing A&R, or “Artists and Repertoire,” at record labels was to

seek out the good bands that the majority of the population would

enjoy. But the Internet offers us what no A&R man could – the

potential for individuals to have access to the bands they love, both

big and small, from all around the world. Recommendation engines make

this possible, and raise the signal-to-noise ratio by presenting

music that, based on your prior listening tastes, you’re likely to

enjoy.


Ultimately, this obviates A&R: once somebody has heard a band that she

deems pleasant to listen to, it will be recommended to those of

similar taste – if they like it, it may get recommended to their

friends, etc. In this way, the popularity of music is decided more

by the taste of the people than by the marketing push of a major label. A

small folk artist in Oklahoma could become vastly popular in Northern

India; who knows? Everybody benefits from this technology: artists,

who get better exposure; consumers, who hear more music that they

enjoy; and sites, which have more satisfied customers than before.

To date, people have argued that online audio sites have not yet

adopted this technology because of a paucity of content: when there

are only 15 artists on a site, a recommendation engine is hardly

appropriate. But with the rising tide of acceptance of online

distribution, floods of artists have been flocking to centralized

music portals like eMusic, MP3.Com, and Audio Explosion. This newfound

influx has left the sites unable to provide tasteful experiences for

their users, leaving them instead awash in a flow of “exotic” (to

phrase it kindly) amateur music from around the world. They have

reached the size and maturity to move to collaborative filtering.

And move they must: the stakes are large in this brave new world and

the listeners plentiful. The winners will quickly adopt and manage

these new technologies, and those less nimble will be left wondering

why more people didn’t come listen to their 15,000 artists.