Using Linux to Create a Audio CD Library
By John Weatherwax
Back a few years ago, some friends of mine got very interested in creating audio CDs from *.mp3 files. At the time, they had access to a large directory structure with numerous directories (read: CDs), each containing around 10-13 mp3 audio files. At first, when the need arose to create a CD, one of the many "interactive CD burning software" packages would be used. This entailed selecting the relevant mp3 files with a mouse, dragging them to another "window", selecting any CD-specific options, and then finally pressing the "go" button. In addition to this manually intensive process, the software itself usually took a good deal of time to complete the "burn", during which one dare not do any serious work on that computer, for the CD buffer may run dry, and the burn process would be ruined. Because of the amount of effort that went into creating a single CD from a set of MP3 files, only the most "important" CDs were created in this way.
At this same time, I began to wonder if I could use Linux to script the entire burn process. My idea was to write a script that would operate on a text file containing a list of directory one wanted CDs burned from. When the script executed, the first directory on the list would be used, and a CD would be burned. This directory would then be removed from the list. A user could run the script from the command line, and a CD would be produced without the use of a graphical user interface. As an additional bonus, this script could be run by cron in the middle of the night. Then, the user would need only leave a blank CD in the drawer at the end of the day, and come back to a newly created CD the next morning. One wouldn't get a huge number of CDs created at one time, but, over weeks and months, one naturally would obtain a sizable collection. In addition, running this script at night meant all the cycle-time is done when the computer is idle, so there is no disruption of one's daily routine.
</begin legal blub/>
Before I begin this article, I want to state that in no way am I
endorsing any illegal behavior. This article is for educational
purposes only. I do not support, endorse, or approve of the illegal
copying of music or other media.
</end legal blub/>
Before I began the task of scripting this burn process, I had a relatively good knowledge of shell scripting, but no understanding of the audio CD format, audio file formats, etc. Luckily, one doesn't need to know too much: One is able to find all the audio pieces on the Web, so the scripting was not too difficult.
Before I go into the scripting, there were some pieces of software that must be installed on your system, to perform the tasks we need to do. Most of these pieces come standard with a great number of Linux distributions, but I've included the links here, just in case a reader doesn't have them in his/hers. By the way, a person can check if they have the program named "programName" by typing at the shell prompt:
$ type programName
If you see something like "programName not found in blah blah blah", then you do not have this program, and you need to download the program from the links below, and install it.[1]
The first required program is the CD burning program itself, called "cdrecord". This program will interact at the device level with your CD-ROM drive, and perform the actual burning process. It can be obtained at http://cdrecord.berlios.de/old/private/cdrecord.html.
The next is an audio normalization program called "normalize". This program is used to ensure that all of the tracks on the audio CD play at the same "volume level". This is also delivered standard with several Linux distributions, but can also be downloaded at http://www1.cs.columbia.edu/~cvaill/normalize/.
The next is the program called "sox". This can be obtained at http://sox.sourceforge.net/
Next is a program called mpg321, which will allow one to decode mp3 into WAV files. This can be found at http://mpg321.sourceforge.net/.
Next is a program called cdrdao, which is used to write audio or data CD-Rs in disk-at-once (DAO) mode, based on a textual description of the CD contents: http://cdrdao.sourceforge.net/
The final required program is a Perl script that implements all of the suggestions in "Greg Wierzchowski's" Linux MP3 CD Burning mini-HOWTO. This Perl script basically glues all the above programs together. By the way, the Linux MP3 CD Burning mini-HOWTO is an excellent tutorial on all of the issues encountered in burning CDs, and I highly recommend reading it. The HOWTO can be found at http://tldp.org/HOWTO/MP3-CD-Burning/index.html.
The Perl script can be found at the following location: http://outflux.net/software/pkgs/mp3cd/
The first step I performed was to create a text file containing the lists of the desired directories. A very simple one would look like the following:
% cat desiredCDs.txt Aerosmith/Greatest Hits 3 Doors Down/Under The Sun ZZ Top/Greatest Hits
I broke this task up into two scripts. The first (called "mp32AudioCD") is a wrapper for mp3cd that will create a audio CD, given a directory of *.mp3 files. The second script (called "checkForCD_N_Burn.sh") is a wrapper to mp32AudioCD that extracts a single directory from the "desiredCDs.txt" file, and passes it to mp32AudioCD. The script checkForCD_N_Burn.sh is called from cron, every morning. I'll present the scripts below, and then explain the various pieces separately.
With that introduction, the shell script checkForCD_N_Burn.sh looks like the following. Note I've appended line numbers to the left of the code:
1 #! /bin/sh 2 3 CDLIST=~/desiredCDs.txt 4 TOPDIR=/mnt/mp3s 5 6 # Check if the file exists: 7 if [ ! -e $CDLIST ]; then 8 exit 1 9 fi 10 11 # Read the first directory to burn: 12 read currentCD < $CDLIST 13 14 # Check to make sure this directory is present on disk: 15 if [ ! -d "$TOPDIR/$currentCD" ]; then 16 echo "$0: CD $currentCD not found" 17 exit 1 18 fi 19 20 echo "Attempting to burn: $currentCD" 21 22 if ! ~/bin/mp32AudioCD "$TOPDIR/$currentCD"; then 23 echo "$0: CD burn failed" 24 exit 1 25 else 26 # Delete this line from the CD list 27 sed '1d' $CDLIST > /tmp/cdListOneLess 28 mv /tmp/cdListOneLess $CDLIST 29 fi
In this code, we declare the two variables: CDLIST, which points to the text file above, and TOPDIR, which points to the top of the directory structure that contains the subdirectories desired. After some argument checking, on line 12, we read the first directory, specified in the text file desiredCDs.txt, into the variable currentCD. This would be the string "Aerosmith/Greatest Hits", in the above example. Lines 15-18 check to make sure that we have requested a valid directory, and line 20 calls the function "mp32AudioCD" with the correct argument. If this command succeeds, we remove this directory from the desiredCD.txt list on lines 27-28.
Next, I show the code for the routine mp32AudioCD
1 #!/bin/sh 2 3 iso_dir=/tmp/isoimage_dir 4 5 echo "Removing data in $iso_dir" 6 # Clean up anything in the $iso_dir: 7 rm -f $iso_dir/* 8 9 echo "Copying data into $iso_dir" 10 cp "$1"/*.mp3 $iso_dir 11 # Exit if we have an error coping MP3 files: 12 if [ $? -ne 0 ]; then 13 exit 1 14 fi 15 16 # Got to mp3 directory: 17 cd $iso_dir 18 19 # Call mp3cd script: 20 $HOME/bin/mp3cd -E -c "--driver generic-mmc --speed 1" *.mp3 21 22 exit 0
Here, on line 3, we define a temporary directory "iso_dir" to copy the mp3 files to. This needed is because mp3cd converts the mp3 files into WAV files, and I needed this done on a local machine. On line 17, we change into this directory, and run mp3cd with some command flags. The flags I use here are:
-E, --no-eject Don't eject drive after the burn. -c, --cdrdao ARGS Pass the option string ARGS to cdrdao.
Here the -c command switch stores some specific options that I pass to cdrdao.
With this introduction, we can perform a test run. With all the directories set correctly, at a shell prompt enter:
$ ~/bin/checkFor_CD_N_Burn.sh
You should see some statements explaining what CD you are about to burn, followed by the burn process beginning. When the process finishes, you should have a new CD in your drive.
All that remains to be done is to create a cron script that will run this program every night, creating a new CD every morning. I choose to run this program only on Monday through Friday evenings, at 1:30 A.M. (Even computers need rest!) To have this happen at a shell, enter the following:
crontab -e
This should open an editor with your existing crontab file. In this file, put the following line:
30 1 * * 1-5 /home/wax/bin/checkForCD_N_Burn.sh
Save the file and you are done. The line above runs the checkFor_CD_N_Burn.sh script every Monday through Friday evening, at 1:33 AM. Thus, every Monday-Friday morning, you should have a new CD in your drive. I think you'll find that a great number of CDs can be created from *.mp3 very easily, this way.
Write, if you have any questions or suggestions for improvements to this script. I'd love to hear them. Flames to /dev/null.
[1] Rick Moen comments: While it's useful and worthwhile to know about a program's "upstream" development site, where (among other things) the author's latest source code can be downloaded, there are a few disadvantages that should be noted (and some alternative locations that should be usually be preferred, instead, if such are findable):
1. Absent extraordinary measures on your part, your Linux distribution's package-tracking system won't know about the program's presence on your system. Therefore, it won't know to avoid installing conflicting programs, removing libraries it depends on, etc.
2. You won't get any tweaks and enhancements that may be normal (or necessary!) for applications on your Linux distribution — unless you yourself implement them. You won't get security patches, either, except those written by the upstream author.
3. Along those same lines, the desirable version to compile and run may well not be the author's latest release: Sometimes, authors are trying out new concepts, and improvements & old bugs fixed are outweighed by misfeatures & new bugs introduced.
4. As a person downloading the upstream author's source code directly, you have to personally assume the burden of verifying that the tarball really is the author's work, and not that of (e.g.) a network intruder who cracked the download ftp site substituted a trojaned version. Although this concern applies mostly to software designed to run with elevated privilege, it's not a strictly academic risk: Linux-relevant codebases that have been (briefly) trojaned in this fashion, in recent years, on the upstream author's download sites, include Wietse Venema's TCP Wrappers (tcpd/libwrap), the util-linux package, sendmail, OpenSSH, and the Linux kernel (CVS gateway's archive, only). Unless you are prepared to meaningfully verify the author's cryptographic signature — if any — on that tarball, you risk sabotaging your system's security. (None of those upstream trojanings escaped into Linux distributions because of distribution packager vigilance. Make sure you can be as good a gatekeeper, or rely on those who already do the job well.)
All of the above are problems normally addressed (and the burden of solving them, shouldered) by Linux distributions' package maintainers, so that you won't have to. It's to your advantage to take advantage of that effort, if feasible. The memory of when a thousand Linux sysadmins, circa 1993, would need to do all of that work 999-times redundantly, is still fresh to us old-timers: We call those the Bad Old Days, given that today one expert package maintainer can instead do that task for a thousand sysadmins. And yes, sometimes there's nothing like such a package available, and you have no reasonable alternative but to grab upstream source tarballs — but the disadvantages justify some pains to search for suitable packages, instead.
Depending on your distribution, you may find that there are update packages available directly from the distribution's package updating utilities, or from ancillary, semi-official package archives (e.g., the Fedora Extras and "dag" repositories for Fedora/RH and similar distributions), or, failing that, third-party packages maintained by reputable outside parties, e.g., some of the Debian-and-compatible repositories registered at the apt-get.org and backports.org sites. Although those are certainly not unfailingly better than tarballs, I would say they're generally so.
The smaller, less popular, and less dependency-ridden a package is, the more you might be tempted to use an upstream source tarball. For example, I use locally compiled versions of the Leafnode pre-2.0 betas to run my server's local NNTP newsgroups, because release-version packages simply lack that functionality altogether. On the other hand, that package's one dependency, the Perl PCRE library, I satisfy from my distribution's official packages, for all the reasons stated above.
[RM's post-publication addendum: I should have also mentioned that package-oriented distributions generally also have simple toolsets to craft local packages from tarballs when that is the best option for whatever reason, rounding out my listing of ways to avoid the lingering menace of unpackaged system software on software architectures with good, useful package management. (Linux distributions that either deliberately or otherwise lack software & configuration management using package management tools are, obviously, a separate topic, not addressed here.)]
[And, in case it wasn't obvious, binary tarballs from upstream distribution sites aren't siginficantly better than source ones. Either way, you get software your distro package regime knows nothing about, that won't get maintenance updates, that isn't built for your distro specifically, that poses code-authentication challenges you probably aren't equipped to shoulder, that doesn't benefit from distro-package-maintainer quality control, and that you might not even be able to figure out how to cleanly remove. All you gain with binary tarballs over source tarballs is avoiding the need to compile locally.]
John Weatherwax started running Linux when his undergraduate Physics
laboratory switched to it from a proprietary UNIX system. He was
overwhelmed with the idea that by individual donations of time and
effort such a powerful operating system could be created. His
interests are particularly focused on numerical software and he is
currently working on some open source software for the solution of
partial differential equations. He hopes to complete that project
soon.
When he is not working on his various scientific endeavors he spends
his free time with his wife and their 9 month old daughter.