Sound and Music Software

freshmeat's Sound/Audio software category lists more than 200 varied applications dealing with audio and MIDI. The newcomer to this collection may find himself (forgivably) a bit bewildered, but I hope to dispel some of that confusion with this review. freshmeat has already defined the application subcategories (many of which are self-explanatory), so I will focus on how to access the particular software to suit a specific purpose.

Starting Out...

When I considered how to present this review, it occurred to me that there were essentially only three kinds of software presented in the category:

  • Software for recording/receiving & storing sound
  • Software for editing sound
  • Software for playing & transmitting sound

Given those broad types of applications, I further divided the potential user base into two basic groups:

  • Normal users
    • Professional
    • Everyone else
  • Developers
    • Professional
    • Everyone else

An audio professional is typically understood to be someone working in the commercial recording arts fields, such as an engineer employed by a recording studio or a soundperson running the mixing boards for a touring performance group. The term should be understood here to include composers and researchers at universities and conservatories offering courses centered on digital audio, such as electroacoustic music composition or digital signal processing and analysis techniques.

Yet another distinction must be made between sound and music software. Music software may be considered a subset of sound software, and it includes a number of applications that themselves do not deal directly with audio (notation programs and MIDI sequencers are two good examples). Music software can itself be divided into software directed at two primary groups of users, one composed of musicians who want to use Linux as a content production platform, and one made up of normal users who want to listen to music under Linux.

Linux can lay claim to an abundance of software for all of these groups, though, of course, in varying stages of development. The listings here on freshmeat provide a good overview of available programs, and my Linux Sound & Music Applications pages include the Web's most complete listing of such software.

The Driver Situation

Thanks to the work of such organizations as 4Front Technologies and the ALSA group, Linux can now claim fairly comprehensive support for the most popular soundcards and audio chipsets. However, it must be noted that consumer-grade cards are definitely not solutions for professional-quality recording, even with digital I/O. Such cards may be fine in their own right, and they are certainly excellent solutions for less demanding situations (games, media players, audio effects, and so forth). Thankfully, a few forward-thinking manufacturers of professional-quality audio boards have made their driver specifications available to the Linux audio development community, and we may see further involvement from other pro-level board manufacturers in the future.

Linux support for hardware and software at the pro-audio level is not extensive. However, that support is growing, and some excellent pro-audio solutions already exist for Linux. The ALSA and OSS/Linux driver packages support some higher-end digital audio boards such as the MidiMan Delta series and the Hammerfall cards from RME. The software side is equally small in numbers but high in quality.

The Latency Issue

An untuned Linux kernel can cause audio latencies of 150 milliseconds or more. Latencies of this order will cause disruption in recording and playback, so considerable effort has gone into creating a means for dramatically lowering latency. Thanks to the work of Ingo Molnar, Andrew Morton, Benno Senoner, and Roger Larsson, a Linux user can apply a simple kernel patch that will reduce latencies to well under 3 milliseconds. The process requires recompiling the kernel, but the patch is very easy to apply. Interested readers should take a look at my article on achieving low latency under Linux. Be sure to check the Resources for the most current news regarding the low-latency patches.
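To put those two figures in perspective, a latency can be translated into the number of audio frames that pass while the system is busy elsewhere. The following is only a rough sketch: the 44.1 kHz sample rate is my assumption (the CD-audio rate), and the function name is hypothetical.

```python
# Rough illustration only: convert a latency figure into audio frames.
# The 44,100 Hz sample rate is an assumption (the CD-audio rate).

SAMPLE_RATE_HZ = 44_100


def latency_to_frames(latency_ms: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Return roughly how many audio frames elapse during latency_ms."""
    return round(latency_ms * sample_rate_hz / 1000)


if __name__ == "__main__":
    for label, ms in (("untuned kernel", 150), ("patched kernel", 3)):
        print(f"{label}: {ms} ms is about {latency_to_frames(ms)} frames")
```

At this rate the untuned figure corresponds to several thousand frames, while the patched figure corresponds to little more than a hundred.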

Interfaces: Console Or X?

Sound editing and analysis programs are especially well-suited to the graphic user interface, but almost all other audio applications should be usable from the commandline. Console-based audio applications may be particularly valuable for an embedded music system or for systems set up for visually-impaired users. Given Linux's Unix heritage, it's not surprising to find a large number of applications designed for the console, and given Linux's meta-Unix heritage, it's not surprising to find an equal number of graphic frontends for many of those console apps. Hopefully, you will be able to find what you're looking for in just the interface you need, whether it's GTK, QT, FLTK, Tcl/Tk, X/Motif, or the unadorned commandline.

Mixers

Your software mixer is the first visible layer between your soundcard and your audio applications. Its most basic feature is its control of the volume levels for each of your soundcard's device channels (CD, line-in, microphone, etc.), but some mixers allow access to more extended features of your soundcard, such as effects processors, rear speaker controls, and digital I/O. The mixer also lets you select particular channels as sources for recording to disk via some of the software listed below.

Mixers typically configure themselves to your soundcard's capabilities (as supported by your soundcard driver), so, for the most part, mixers are mixers. A few are optimized for either ALSA or OSS/Linux, but the most significant difference between mixers lies in the choice of user interface.

The venerable aumix is commonly bundled with mainstream Linux distributions, sporting either a console (ncurses) or X (GTK) interface, but you can find a software mixer for virtually any Linux graphics toolkit. If you can't find one for your favorite GUI here on freshmeat, check the listings at the famous Ibiblio repository or in the Linux soundapps Mixers section.

Audio Media Players

The variety of commonly-encountered sound and music file formats breaks down into the following categories:

  • Purely digital audio file formats such as WAV, AIFF, AU, SND, and CD audio
  • Compressed audio formats such as MP3, OGG, and SHN
  • Tracker formats such as MOD, IT, and XM
  • MIDI files in MID and other formats
  • Streaming audio (RealPlayer, Shockwave/Flash, QuickTime, Vorbis, MP3)

Linux media players are available for all of these formats (and many others), in every imaginable interface style. Some players are available for specific formats, some are multi-format audio players, and some are truly multimedia players; for example, MikMod, mpg123, and TiMidity handle tracker modules, MP3s, and MIDI files, respectively. Fortunately for developers, each of these programs can be modularized, making it possible to easily embed their functions into other applications. XMMS is a multimedia player that lets users add new format players to the core application, expanding the original player to incorporate nearly all the formats listed above (Shockwave/Flash and QuickTime are the notable exceptions). Some video formats (such as AVI and MPEG) can be played (with sound, of course), streaming audio is supported, and other plugins are available for DSP effects, surround sound, and visual output displays.

Please note that CD players can be found under the CD Audio and Players categories, again in a rather bewildering variety of interfaces. My old favorite is Ti Kan's fine Xmcd for Motif, but you'll certainly be able to find a CD player here that suits your own user interface preference. Note also that multimedia players such as XMMS and Andy Lo A Foe's AlsaPlayer are capable of CD playback, too.

RealPlayer and Flash

Real.com thoughtfully provides Unix/Linux users with a compatible version of their RealPlayer multimedia playback software. A plugin for XMMS is available, as are some commandline "wrappers" for RealPlayer, but they all depend on proprietary software from Real. In other words, if you want to enjoy RealPlayer content in Linux, you'll have to go to Real.com for the software. A similar situation exists regarding Macromedia's Flash Internet multimedia technology. A plugin for Linux is available from the company, it is proprietary software with no source code, and it is the only way you can view Flash content under Linux.

QuickTime and Shockwave

QuickTime has seen incomplete support under Linux for many years, but recently it became possible to run any and all QuickTime content via the Crossover Plugin from CodeWeavers. It is also capable of handling Shockwave-enabled Web sites and includes support for viewers for Microsoft Word and Excel files. The Crossover software is commercial and proprietary, but the CodeWeavers crew also contributes development to the WINE project.

Recording/Audio Capture Software

Professional Level

Users looking for audio recording and processing software for professional purposes should especially consider Kai Vehmanen's ecasound and Paul Davis's Ardour.

The development of Ardour is squarely aimed at the high-end professional recording market. An optimal system for Ardour includes a dual-CPU machine, fast, large hard disks, and the RME Hammerfall (or comparable) digital audio board. Other specific necessary hardware and software is listed on the Ardour homepage. At the time of this writing, the project is still under heavy development, and Paul Davis (Ardour's principal author) advises that, at this point, Ardour should be considered a developer's project, i.e., it is not yet ready for normal users (although some public tarballs have recently been released).

Developer Kai Vehmanen has created a whole family of outstanding Linux audio applications, including the ecasound multitrack recorder, the ecawave soundfile editor, the ecamegapedal effects processor, and many others. ecasound is the centerpiece of the group. It is capable of high-quality multichannel and multitrack recording, realtime effects processing and signal routing, and audiofile playback in a variety of formats. When combined with one of the supported pro-audio boards, ecasound is indeed a professional-quality audio solution for Linux music makers, and its ease of use makes it an excellent choice for semi-pro and high-quality desktop audio recordists.

Note: Ecasound's native interface is the command shell, but graphic interfaces are also available. Please see Kai's ecasound projects page for a list of the latest GUIs for ecasound.

Semi-professional/Desktop Level

Nick Copeland's SLab is a hard disk recording system that will mix up to 64 tracks of recorded audio. SLab is optimized for use with the OSS/Linux drivers, but it can be built for use with the ALSA drivers in native mode. It currently supports only mono and stereo recording with consumer-grade soundcards, but future versions will support true multitrack/multichannel recording. It is an X application that uses Tcl/Tk for its GUI. Its interface and controls should be familiar to anyone with experience recording on multitrack tape recorders.

Multitrack is another mid-level recorder designed specifically for the OSS/Linux or OSS/Free drivers. It will probably work with ALSA in OSS-emulation mode, but your mileage may vary, depending on the particular soundcard you're using. It has a pleasant interface both in X and in SVGA mode at the console. It will record up to 16 tracks, supports full-duplex recording, and has a number of amenities to make life easier for the desktop recordist (the guitar tuner is a nice touch).

Audacity runs on Linux/BSD systems, MacOS, and Windows, making it an ideal solution if you need to work under multi-platform conditions. It began its existence as a soundfile editor, but may now be considered a multitrack recording/editing environment as well.

Lightweight/Fun

This sort of program is what you want if you'd like to embed small soundfiles into a personal Web page, create your own sounds to bind to your keyboard and mouse actions, or attach a soundbite to your next email to your grandmother. Many lightweight recorders exist for simple sound recording under Linux, such as the Quick Record and Ksoundrecord applets for the GNOME and KDE desktops. See the Players & Recorders section on the Linux soundapps pages for a list of other lightweight recorders.

CD Rippers & The MP3

An MP3 file is not an uncompressed recording; MP3 is a lossy compression format applied to audio from files in common formats such as WAV and AIFF. You don't usually record an MP3 directly. Instead, you convert your existing audio files by using an MP3 encoder such as LAME or BladeEnc. You can also convert CD audio to MP3 files by first "ripping" the audio tracks from the CD with a tool such as cdparanoia. The ripper lifts the audio data from the CD and writes it to WAV files, which are then ready for converting to MP3s with an encoder. See the Conversion section for the various utilities and frontends that facilitate this process.
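The rip-then-encode workflow can be sketched as a pair of commands. This is a minimal illustration in Python, not a definitive recipe: it assumes cdparanoia and lame are installed, uses only their simplest invocations (a track number plus output file for cdparanoia, an input and output file for lame), and the file names and helper functions are hypothetical. By default it only builds the commands so you can inspect them; check each tool's manual page before running anything against a real disc.

```python
# A sketch of the rip-then-encode workflow, not a definitive recipe.
# Assumes cdparanoia and lame are installed; file names are hypothetical.
import shutil
import subprocess


def rip_command(track: int, wav_path: str) -> list[str]:
    """Command asking cdparanoia to rip one CD track to a WAV file."""
    return ["cdparanoia", str(track), wav_path]


def encode_command(wav_path: str, mp3_path: str) -> list[str]:
    """Command asking LAME to encode a WAV file as an MP3."""
    return ["lame", wav_path, mp3_path]


def rip_and_encode(track: int, basename: str, dry_run: bool = True) -> list[list[str]]:
    """Build (and, if dry_run is False, run) both steps for one track."""
    wav_path, mp3_path = f"{basename}.wav", f"{basename}.mp3"
    commands = [rip_command(track, wav_path), encode_command(wav_path, mp3_path)]
    if not dry_run:
        for cmd in commands:
            if shutil.which(cmd[0]) is None:
                raise FileNotFoundError(f"{cmd[0]} is not installed")
            subprocess.run(cmd, check=True)  # stop at the first failure
    return commands


if __name__ == "__main__":
    for cmd in rip_and_encode(1, "track01"):
        print(" ".join(cmd))
```

Frontends automate the same two steps, adding track naming and tagging on top.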

Storage

In "The Old Days", we stored our sonic masterworks on a cassette or reel tape. Then along came the digital revolution, and we could use digital audio tape (DAT) for far better fidelity, and now we have CD burners right in our common desktop computers. Data and/or audio files can be stored directly on CD, and there are some superb Linux tools for doing that job, many of which have been in long-term development. Jörg Schilling's cdrecord is the indispensable utility for burning data or audio files to a compact disc. It comes with a rich set of process control options, but its native interface is the unadorned console. Fortunately, GUIs exist for cdrecord in Java (BurnIt), GTK (X-CD-Roast and Gcombust, my favorite), and QT (CD Bake Oven).

Audio Editors & Analysis Software

After recording, you may want to fix or process your new soundfiles. You'll need a soundfile editor for that work, and Linux provides a rather wide range of such programs. Once again, your particular needs (professional, semi-pro, desktop user) will help determine which editor you want. In my opinion, the most powerful soundfile editor for Linux (and a variety of other *NIX operating systems) is Bill Schottstaedt's Snd, but the other Linux soundfile editors have their own standout features. For example, Richard Kent's DAP includes an impressive loop editing panel that is particularly useful for editing the loop points in AIFF soundfiles. Josh Green's excellent Smurf audio editor is a good example of an even more single-minded dedicated application; it edits soundfonts, and only soundfonts.

Your graphics environment is a determining factor when selecting a soundfile editor. Audio editors come dressed in every graphics toolkit available to Linux, including X/Motif (Snd and Xforge), GTK (Snd again, GLAME, and GSMP), Java (LAoE), QT (ecawave), InterViews (MiXViews), Tcl/Tk (SoundStudio and WaveSurfer), wxWindows (Audacity), and even ncurses (Snuggles). Of course, you will have to weigh the editor's development status against the importance of a familiar graphics environment, and you may find it necessary to install the supporting toolkit in order to run the editor that suits your needs.

As I mentioned earlier, I've considered editing in a very broad sense. The ubiquitous MP3 also has its share of editing utilities, including frame-level editors (such as mp3asm and MPGEDIT) and the various utilities for managing audio databases, ID3 information tags, and M3U playlists.

Audio analysis is an aspect of sound editing software, but dedicated audio analysis software offers features such as a greater variety of analysis methods, realtime spectral display of an incoming signal, and sound transformations not commonly encountered in typical audio editors. Aglaophone, baudline, and eXtace are excellent examples of fascinating and instructive analysis display programs. Other interesting audio analysis packages include the HASAS passive sonar signal analysis system and the Ceres3 spectral domain soundfile editor.

Compressed/Streaming Audio

CD-quality audio requires about 10 megabytes of storage for one minute of sound. In this day of cheap and abundant disk space, perhaps we don't need to concern ourselves with soundfile compression schemes. However, when we consider the transfer of audio over a network, we find that uncompressed CD-quality audio still requires too much network bandwidth to make it feasible either as a static file transfer format or as streaming audio. Audio compression is a complex topic, but its practical demonstration includes the RealPlayer and QuickTime formats, as well as the ubiquitous MP3 and the more recent challenger, OGG/Vorbis.
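The storage and bandwidth figures follow directly from the CD audio parameters (44,100 samples per second, 16 bits per sample, two channels). Here is a back-of-the-envelope sketch; the 128 kbit/s MP3 rate used for comparison is my assumption, chosen only because it is a commonly used encoder setting.

```python
# Back-of-the-envelope figures for uncompressed CD-quality audio.
# CD audio: 44,100 samples/s, 16 bits (2 bytes) per sample, 2 channels.

SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2
CHANNELS = 2
ASSUMED_MP3_KBPS = 128  # assumption: a commonly used MP3 bitrate


def bytes_per_minute() -> int:
    """Storage needed for one minute of uncompressed CD-quality audio."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * 60


def bitrate_kbps() -> float:
    """Network bandwidth needed to stream it, in kilobits per second."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * 8 * CHANNELS / 1000


if __name__ == "__main__":
    print(f"One minute: {bytes_per_minute():,} bytes")
    print(f"Streaming:  {bitrate_kbps():.1f} kbit/s")
    print(f"Ratio to a {ASSUMED_MP3_KBPS} kbit/s MP3: "
          f"{bitrate_kbps() / ASSUMED_MP3_KBPS:.1f}x")
```

One minute works out to 10,584,000 bytes, and a stream needs roughly 1,411 kbit/s, which is about eleven times the assumed MP3 rate.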

The essential software for creating your own MP3s should include a ripper, an encoder, and a track information tag (ID3) editor. These components are all available as standalone programs, and they are often incorporated into a "complete solution" frontend. David Oliphant's Grip is a perfect example frontend, with preset configurations for a variety of rippers (such as the default cdparanoia), MP3 and Ogg encoders (such as LAME and OggEnc), a built-in tag editor, and a handy CD player.
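For readers curious about what a tag editor actually manipulates, here is a minimal sketch of reading a track information tag. It assumes the simple ID3v1 layout (a fixed 128-byte block at the end of the file, beginning with the marker "TAG"); newer ID3v2 tags are structured differently and are not handled. The function names are hypothetical.

```python
# Sketch of reading an ID3v1 tag: a fixed 128-byte block at the end of
# an MP3 file. Assumes ID3v1 only; ID3v2 tags are not handled here.
import struct

# marker, title, artist, album, year, comment, genre index
ID3V1_LAYOUT = struct.Struct("3s30s30s30s4s30sB")
assert ID3V1_LAYOUT.size == 128


def _text(raw: bytes) -> str:
    """Decode a fixed-width field, dropping NUL and space padding."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def parse_id3v1(data: bytes):
    """Return the tag fields as a dict, or None if no ID3v1 tag is found."""
    if len(data) < ID3V1_LAYOUT.size:
        return None
    fields = ID3V1_LAYOUT.unpack(data[-ID3V1_LAYOUT.size:])
    marker, title, artist, album, year, comment, genre = fields
    if marker != b"TAG":
        return None
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "comment": _text(comment),
        "genre": genre,
    }


if __name__ == "__main__":
    # Build a fake file in memory: some audio bytes followed by a tag.
    tag = (b"TAG" + b"Song".ljust(30, b"\x00") + b"Artist".ljust(30, b"\x00")
           + b"Album".ljust(30, b"\x00") + b"2001" + bytes(30) + bytes([12]))
    print(parse_id3v1(b"\xff\xfb" + bytes(100) + tag))
```

To read a real file, pass the bytes returned by reading the file in binary mode.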

Becoming an Internet broadcaster requires a little more than some MP3 or Ogg files. You'll need a streaming audio server and a (preferably) broadband network connection. With your cable, DSL, or T1/3 line, you can use popular "netcasting" file streamers such as Shoutcast, Icecast, or Real.com's RealSystem Producer. Database management programs are necessary for broadcasting large collections of files (Andromeda and MAES are excellent examples for X, while SMDP should work well for the console).

For those of us who just can't stand the thought of dealing with the legal hassles of the MP3, there's the relatively new OGG/Vorbis format. Jack Moffitt's splendid Icecast is a great choice for streaming OGG files, and you can find many other OGG tools in the Players, Conversion, and CD Audio categories.

Content Production: MIDI, MODs, And Software Synthesis

This heading broadly covers programs for music composition and notation and for designing sounds. Music composition programs for Linux include MIDI sequencers such as Anthem and MusE, the excellent SoundTracker module (MOD and XM) tracker, and notation-based editors such as NoteEdit and Brahms. And if you're looking for loop-based sample sequencers, you'll definitely want to check out SpiralLoops and terminatorX.

Music typesetting (the preparation of a manuscript for publication) is well-represented by the GNU Project's LilyPond and the TeX-based MusiXTeX. If your music notation needs are more oriented towards small-to-medium band and orchestral arrangements, I recommend looking at Mup.

Sound design programs for Linux run the gamut from text-based audio programming environments (such as the famous Csound and the not-yet-so-famous sfront) to the more modern software synthesizer interface, complete with graphic control panels (SpiralSynth, Cumulus, and the Ultramaster RS-101 are excellent examples).

Special mention should be made of jMax and PD. These programs have both evolved from the famous MAX iconic audio programming environment, jMax taking a route following the development of Java, while PD utilizes Tcl/Tk for its GUI. Both have recently acquired powerful multimedia capabilities with the addition of video and other graphics add-ons. Motion, mass, color, and sound can be related in arbitrary ways, making jMax and PD very appealing to artists working in true multimedia environments.

Note that some of the packages listed in the Speech category can also be used for general-purpose software synthesis and audio analysis.

Speech Software

This section is definitely one of those "rubber bag" categories. Here you can find software that deals with almost any manifestation of the spoken voice, such as the Bayonne GNU telephony server project, voice control software (CVoiceControl), text-to-speech converters (KDE SPeaker and Spk), and speech synthesizers (Festival is particularly noteworthy). It also includes Internet telephone software (such as Speak Freely) and GAIM-to-speech converters based on Viavoice output or the Festival synthesizer.

Development Tools & Toolkits

Linux is well-known as an especially rich software development environment. Mainstream Linux distributions typically include a wealth of languages, graphics toolkits, programming utilities, and many other amenities for the developer. Naturally, in such an environment, Linux audio support is evolving rapidly at both the system and the application development levels. I have already mentioned the work done at the kernel level to dramatically reduce audio performance latencies, and work in that area continues to evolve, accommodating new filesystems (reiserfs, ext3) and other kernel enhancements. At some point, it seems likely that the kernel tree will absorb the ALSA audio API and soundcard drivers, bringing system-level professional-standards capabilities to Linux audio applications developers, in an Open Source and GPLed package.

Meanwhile, applications-level toolkits are thriving. Most of these packages expect only the base-level OSS/Free API common to the kernel drivers, ALSA, and OSS/Linux, but they are quite powerful and versatile. For example, Kare Sjolander designed his SNACK toolkit as a set of audio tools and utilities optimized for the analysis and editing of speech, but it can certainly be used in many other Linux audio development domains. Other interesting audio development tools listed on freshmeat include the jWave package (for creating and handling WAV files under Java), Michael Pruett's Audio File Library (a Linux translation of the Silicon Graphics library), and the TSE3 MIDI sequencer engine.

That Belonging Feeling...

Programmers looking for new projects or support for their own projects should consider joining the Linux Audio Development group. LAD discussions are on-topic and very well-informed; many of its members are key developers with ALSA and other major Linux audio development projects.

Users of Linux audio software may be happy to find the Linux Audio Users group. Like LAD (its elder sibling), LAU is a lively and focused forum for the discussion of all things having to do with Linux audio from the user's perspective.

Some Other Resources

I've already mentioned my Linux Sound & Music Applications Web site, but there are some hard-copy resources I must also mention.

Two books are already available for Linux music and sound users and programmers. My own Book Of Linux Music & Sound is squarely aimed at the normal applications user, while Jeff Tranter's Linux Multimedia Guide is targeted towards developers working with the basic OSS/Free API.

Csound enthusiasts will find Richard Boulanger's Csound Book to be their indispensable reference, while newcomers may find Riccardo Bianchini's Virtual Sound a little gentler amanuensis.

I also recommend Curtis Roads's magnificent (and massive) Computer Music Tutorial and Charles Dodge's somewhat more manageable Computer Music: Synthesis, Composition, and Performance for exhaustive expositions of all you'll ever want to know about the essentials of computer music and digital audio.

Recent comments

21 Sep 2004 06:16 hansjuergen

open source MPEG-4 AAC encoder and decoder on Freshmeat
I would like to add to this comprehensive article that the FAAC project is listed on Freshmeat.net now, providing multichannel AAC codecs with gapless playback and MP4 tagging support. They have already been implemented in other open source projects and commercial applications, see the project page:

http://freshmeat.net/projects/faac/

20 Jan 2002 21:23 Serhstar15

Re: So....which Media Player for a normal user

> For a user with a decent, but certainly
> not great ( or even good ), set of
> speakers and a consumer sound card (sb
> live ), what are the recommended media
> players. I'm especially looking for one
> that will handle as many formats as
> possible, since I don't want to mess
> around with several.
>
> Which audio players do video?


For the Windows platform, I have found that Winamp is fairly simple and versatile. The program is free at www.winamp.com. It plays a wide variety of sound formats, and many plugins are available (also free) to play formats which are not supported by the base program. It offers DSP plugins and an EQ to tweak the sound to your liking. A plugin called Vidamp will play video. Skins and "visualization" plugins add to your viewing pleasure.
If you can't find a plugin to play a certain format, you can ask around in the Winamp forums, or you can write your own plugin (with VB, I believe).

The more plugins you add, the more versatile it becomes, but it can become a power hog, so only pick what you need (or would like).

There are many audio-enhancing and surround-emulating plugins, although my SB Live card sounds pretty good with just a pair of Soundworks speakers.

Good Luck!

17 Jan 2002 06:05 learfox

Don't forget YIFF
One of the original sound servers designed for UNIX games back in the OSS era:

http://wolfpack.twu.net/YIFF

18 Dec 2001 14:20 gauze

Re: Gramofile for recording and signal processing

> Gramofile
> is the only tool to use for recording
> phonograph records, and is quite useful
> general purpose stereo recording as
> well.
> It also has lots of nifty tools to
> split tracks and eliminate clicks.
> I highly recommend it.


As do I for the filtering of clicks and pops. However, I prefer gnoise as my WAV recording application, as I have a graphical representation of the sound
(far more readable IMHO than the gramofile interface), and it makes breaking up tracks a dream (with highlight-> save selection as ...)

gramofile's track separation is decent but for some sources inadequate.

Its tick-fixing, however, is excellent.


15 Dec 2001 19:08 dmazzoni

Developers, don't forget about libsndfile and portaudio
Great article!


I'd like to point out two other libraries that also deserve mentioning:

1. libsndfile (http://freshmeat.net/projects/libsndfile/), like Michael Pruett's audio file library aka libaudiofile, is a cross-platform library for reading and writing common sound file formats, written by Erik de Castro Lopo. However, while libaudiofile supports just AIFF, WAV, AU, and IRCAM, libsndfile supports all of these plus Amiga IFF, Ensoniq PARIS, NIST/Sphere, and more, plus it supports floating-point samples in the file types that allow it. I've found it to be easy to use and rock-solid stable. If you're writing an application that needs to read or write audio files, look no further.


2. PortAudio (http://www.portaudio.com) is a cross-platform library for doing audio I/O. Rather than writing your application to depend on OSS, ALSA, or aRts, write it based on PortAudio - that lets your users decide which drivers they want to use. PortAudio is also critical for truly cross-platform audio applications, because it supports not only Linux/OSS currently, but also two Windows drivers (WMME and DirectSound), two Mac drivers (OS 9 and OS X), and the Steinberg ASIO driver (used for high-end sound cards on Windows and Mac). Ports for BeOS, Linux/ALSA, SGI, and KDE/aRts are under development.

