At the end of last year, I tentatively made the prediction that "the catalogue of music recordings readily available in the northern hemisphere will continue to increase by 50% every five years until 02025 when it may start to plateau or saturate". But I can't test this prediction until I have some reliable measure of the catalogue and of how much of it is 'readily available'.
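To make the arithmetic of that prediction concrete, here is a minimal sketch of what "50% every five years" compounds to. The function name `projected_catalogue` and the 20-year horizon (roughly from the time of the prediction to 02025) are my own assumptions for illustration. The starting size is left as a relative factor of 1.0, precisely because no reliable measure of the catalogue exists yet.

```python
def projected_catalogue(base_size: float, years: float,
                        growth_per_period: float = 0.5,
                        period_years: float = 5.0) -> float:
    """Compound growth: base_size grows by growth_per_period every period_years."""
    return base_size * (1 + growth_per_period) ** (years / period_years)


# Relative size of the catalogue after 20 years (four five-year periods),
# assuming the 50% rate holds for the whole stretch.
factor_20_years = projected_catalogue(1.0, 20)

# Equivalent year-on-year growth rate implied by 50% per five years.
annual_rate = projected_catalogue(1.0, 1) - 1

print(f"Growth factor over 20 years: {factor_20_years:.4f}")
print(f"Implied annual growth rate: {annual_rate:.2%}")
```

In other words, if the prediction holds, the readily available catalogue would be about five times its starting size by 02025, which is the figure any reliable measure would need to be able to confirm or refute.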
So far I'm drawing a blank even on the simple measure of how many CD titles have been issued to date. Last week Gracenote announced that their CDDB® database for music recognition has been used two billion times to identify CDs. They claim CDDB "contains the largest online database of music information in the world". As of today it has data for 3,598,785 CDs and 46,002,354 songs (note that the iTunes Music Store has only 2.5% of these songs available).
Is CDDB a good measure of the total catalogue of CDs? I've heard reports of up to 5% of CDs not being recognised by CDDB (though the only time I've experienced this was with a spoken word CD), which would suggest that CDDB underestimates the total catalogue. However, it also overestimates the number of CDs, because the database contains duplicate entries. I have the six CDs of the Anthology of American Folk Music, edited by Harry Smith, on my iPod, with metadata taken from CDDB. But two of the six CDs appear twice in the database, and one appears three times. You can see this by going to the web interface for the database and searching on 'Anthology of American Folk Music' (n.b. a fourth volume was released separately from the original three-volume, six-CD set). Try it for other albums as well.
There is no quick way to assess the scale of the underestimates or overestimates in CDDB, which undermines its usefulness as a measure of the CD catalogue.
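To show how the two errors pull in opposite directions, here is a rough sketch of the correction one would apply if the duplicate rate and the unrecognised share were actually known. The function `corrected_estimate` is hypothetical. The inputs are only the anecdotal figures from this post, namely a six-disc sample and a reported upper bound of 5%, so the result is an illustration of the method, not an estimate of the real catalogue.

```python
def corrected_estimate(listed_entries: float, entries_per_disc: float,
                       unrecognised_share: float) -> float:
    """Adjust a raw database count for duplicates and for missing discs.

    entries_per_disc: average number of database entries per distinct CD (>= 1).
    unrecognised_share: fraction of real CDs absent from the database (0 <= x < 1).
    """
    if entries_per_disc < 1:
        raise ValueError("entries_per_disc must be at least 1")
    if not 0 <= unrecognised_share < 1:
        raise ValueError("unrecognised_share must be in [0, 1)")
    distinct_listed = listed_entries / entries_per_disc  # removes the overestimate
    return distinct_listed / (1 - unrecognised_share)    # restores the underestimate


# The six-disc sample from this post: three discs listed once,
# two listed twice, one listed three times.
sample_discs = 6
sample_entries = 3 * 1 + 2 * 2 + 1 * 3
sample_entries_per_disc = sample_entries / sample_discs

# Illustration only: one box set is far too small a sample to generalise from.
estimate = corrected_estimate(3_598_785, sample_entries_per_disc, 0.05)
print(f"Entries per disc in sample: {sample_entries_per_disc:.2f}")
print(f"Illustrative corrected count: {estimate:,.0f}")
```

The point of the sketch is that the answer swings widely depending on two numbers nobody outside Gracenote can measure, which is exactly the problem.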
Another annoyance with CDDB is the inconsistent presentation of data. The metadata I got from CDDB for the six CDs has four different ways of writing the disc titles and numbering the discs in the collection:
If you're trying to locate a particular disc on a small iPod screen (or even on a 15" computer screen), this inconsistency makes it very difficult to work out which is which, and what order they're supposed to go in.
As I wrote last year, Gracenote is one of the new 'gatekeeper' organisations that have a 'silent' influence on how people discover and locate digital music. CDDB is used by AOL, Apple, Philips and Sony in their digital music offerings, and may establish itself as pre-eminent in the market through the network effect, which would have worrying implications if Gracenote doesn't clean up its data.
Happily there is a 'community' music metadatabase alternative to CDDB, in the shape of MusicBrainz, which — on my very small sample — seems more reliable: my search for 'Anthology of American Folk Music' found no duplicates and only one disc title in a slightly different format from the others.
But MusicBrainz cannot currently be a measure for the total CD catalogue, since, at the time of writing, it claims only 252,602 CDs and 3,102,305 tracks (compared with CDDB's 3.6 million and 46 million respectively).
Posted by David Jennings in section(s) Curatorial, Future of Music, Long Now, Music and Multimedia on 10 April 02005