  1. What is SoX?
  2. Is there a GUI available for SoX?
  3. Why doesn't this SoX command line I found on the internet work?
  4. Why can't this wav file (with more than 2 channels or bit-depth more than 16) that I created with SoX be read by other programs?
  5. What are the best 'rate' settings to resample a file and retain the highest quality?
  6. Is SoX's de-emphasis filter any good—I heard it reduces the stereo image?
  7. What's the best way to mix files at different points in time?
  8. Is SoX's functionality also available as a function library?
  9. How do I configure SoX for maximum speed (minimum CPU usage)?

1. What is SoX?

SoX is a command line utility that can convert audio files from one format into another. For example, it can convert a Microsoft WAV file into an Apple AIFF file. It can also apply various effects to these sound files during the conversion. As an added bonus, SoX can play and record audio files on many common platforms. For software developers, SoX functionality is also available as a library (see below).

2. Is there a GUI available for SoX?

There have been some GUI front-ends created for SoX in the past, but I'm afraid none has been kept up-to-date. However, see Scripts for batch processing, and Links for links to some GUI-based audio processors.

For resampling only, there is now a SoX resampler plug-in available for the MS-Windows 'Foobar' audio tool. Questions regarding this plug-in should be raised here.

3. Why doesn't this SoX command line I found on the internet work?

A number of potentially confusing options have been replaced with what we hope are less confusing ones. For example, -b and -w in SoX versions prior to 14.1.0 are now -b 8 and -b 16 respectively (-b can also be written --bits), and -e is now -n.

The ChangeLog contains full details of any backwards-compatibility issues.
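
As a rough illustration of the renames mentioned above, the sketch below maps an old-style option to its current spelling. The function name modernize_opt is hypothetical (it is not part of SoX), and it covers only the three options named in this answer; the ChangeLog remains the authoritative list.

```shell
# Hypothetical helper (not part of SoX): maps a pre-14.1.0 option to its
# current spelling. Only the three renames named in this FAQ are covered;
# any other argument is passed through unchanged.
modernize_opt() {
    case $1 in
        -b) printf '%s\n' '-b 8'  ;;  # old -b is now -b 8
        -w) printf '%s\n' '-b 16' ;;  # old -w is now -b 16
        -e) printf '%s\n' '-n'    ;;  # old -e is now -n
        *)  printf '%s\n' "$1"    ;;
    esac
}

# Example: translate the old -w option.
modernize_opt -w
```

Note that the input is assumed to be an old-style option: a current -b (which takes an argument) would be mistranslated, so this is only suitable for reading old command lines by hand.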

4. Why can't this wav file (with more than 2 channels or bit-depth more than 16) that I created with SoX be read by other programs?

Unfortunately, there are two variants of the WAV-file standard for such files: an unofficial one, and the official one from Microsoft. SoX can read such files regardless of which standard they conform to, but some applications can cope with only one of the two variants. By default, SoX creates such files according to the official standard. To create such a wav file that conforms to the unofficial standard, `-t wavpcm' must be given before the name of the output file, e.g.

 sox infile.any -b 24 -t wavpcm outfile.wav

5. What are the best 'rate' settings to resample a file and retain the highest quality?

Resampling is a series of compromises, so there's no one true answer for all situations, but the following rules of thumb should cover most people's needs 99% of the time:

  • Phase setting: if resampling to < 40k, use intermediate phase (-I), otherwise use linear phase (-L, or don't specify; linear phase is the default).
  • Quality setting: if resampling (or changing speed, as it amounts to the same thing) at/to > 16 bit depth (i.e. most commonly 24-bit), use VHQ (-v), otherwise, use HQ (-h, or don't specify).
  • Bandwidth setting: don't change from the default setting (95%).
  • If you're mastering to 16-bit, you also need to add 'dither' (and in most cases noise-shaping) after the rate.

Time for some examples:

 sox any-file -b 16 outfile rate 44100 dither -s

(mastering for CDDA).

 sox 24-bit-file outfile rate -v 88200

(outfile bit depth unchanged at 24-bit).

 sox any-file -b 16 outfile rate -I 22050 dither -s

N.B. Both resampling and dithering require some headroom. If SoX reports that any clipping has occurred during processing then the conversion should be redone with some attenuation, e.g.

 sox any-file -b 16 outfile gain -1 rate 44100 dither -s
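
The rules of thumb above can also be written down as a small shell function. This is only a sketch: suggest_rate_cmd is a hypothetical name, it prints a command rather than running it, it uses the output bit depth as a stand-in for "at/to > 16 bit depth", and it always passes -b explicitly. It does not add any attenuation, so the note about clipping still applies.

```shell
# Hypothetical helper (not part of SoX): prints, but does not run, a sox
# command that follows the rules of thumb in this FAQ.
# Arguments: infile outfile target-rate-in-Hz output-bit-depth
suggest_rate_cmd() (
    infile=$1 outfile=$2 rate=$3 bits=$4
    opts=""
    # Quality: VHQ (-v) above 16 bits, otherwise leave the default (HQ).
    if [ "$bits" -gt 16 ]; then opts="$opts -v"; fi
    # Phase: intermediate (-I) below 40k, otherwise leave the default (linear).
    if [ "$rate" -lt 40000 ]; then opts="$opts -I"; fi
    # Bandwidth is deliberately left at its default.
    cmd="sox $infile -b $bits $outfile rate$opts $rate"
    # Mastering to 16-bit: dither with noise-shaping after the rate effect.
    if [ "$bits" -eq 16 ]; then cmd="$cmd dither -s"; fi
    printf '%s\n' "$cmd"
)

# Example: a 22050 Hz, 16-bit target.
suggest_rate_cmd in.wav out.wav 22050 16
```

For the 22050 Hz, 16-bit case this prints the same options as the third example above. File names containing spaces are not quoted in the printed command, so check it before running it.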

6. Is SoX's de-emphasis filter any good—I heard it reduces the stereo image?

The deemph effect did once have a bug that affected stereo imaging—but this was fixed in SoX version 12.18.2 released in 2006.

The current deemph effect works fine and is accurate to 0.06dB (up to 20kHz)—much more accurate than analogue implementations in CD players; see here.

7. What's the best way to mix files at different points in time?

For example, if I have three wav files, how do I mix them so that f1.wav starts immediately, f2.wav comes in 4 seconds later, f3.wav comes in after a further 4 seconds?

There are a couple of options:

 sox -M f2.wav f3.wav f1.wav out.wav delay 4 4 8 8 remix 1,3,5 2,4,6

(assuming stereo), or

 sox -m f1.wav "|sox f2.wav -p pad 4" "|sox f3.wav -p pad 8" out.wav

The second way is probably better since it works with files with any number of channels.
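
For more than a handful of files, the second form can be generated rather than typed. The sketch below assumes a hypothetical helper, mix_cmd, that takes the output file followed by pairs of input file and start offset in seconds, and prints the corresponding command without running it.

```shell
# Hypothetical helper (not part of SoX): prints a "sox -m" command that mixes
# each input file starting at the given offset in seconds.
# Usage: mix_cmd outfile file1 offset1 file2 offset2 ...
mix_cmd() (
    out=$1
    shift
    cmd="sox -m"
    while [ "$#" -ge 2 ]; do
        file=$1 offset=$2
        shift 2
        if [ "$offset" = 0 ]; then
            # Starts immediately: use the file directly.
            cmd="$cmd $file"
        else
            # Starts later: pad the beginning with silence in a sub-command.
            cmd="$cmd \"|sox $file -p pad $offset\""
        fi
    done
    printf '%s\n' "$cmd $out"
)

# Example: the three-file case from the question.
mix_cmd out.wav f1.wav 0 f2.wav 4 f3.wav 8
```

With the arguments shown, the printed command is the same as the second option above. File names containing spaces or quotes are not escaped, so this is a starting point rather than a robust tool.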

8. Is SoX's functionality also available as a function library?

Yes, the source code for SoX contains both SoX, the application, and libSoX. LibSoX is a static library that can be used in other applications. It contains routines such as sox_open_read(), sox_open_write(), sox_read(), sox_write(), and sox_close() to allow reading and writing audio files.

Further information can be found in the libSoX manual page (though it is somewhat out of date) and in the examples supplied in the SoX source-code distribution.

9. How do I configure SoX for maximum speed (minimum CPU usage)?

First, there are the --buffer and --input-buffer options. Try changing these buffer sizes to find out what gives the best throughput on your system. For example, --buffer 128k has been seen to give significant speed increases on multi-core systems (but will reduce the responsiveness of 'play' to skip or quit commands).

By default, in SoX 14.3.0, multi-threading is used to take advantage of multi-core CPUs; however, in some cases, this appears to reduce performance. The --buffer option may solve this; otherwise, use the --single-threaded option.

SoX 14.3.0 also introduces automatic dithering by default. Here SoX is doing the right thing, i.e. processing the audio signal correctly, but at the cost of a small increase in CPU usage. Automatic dithering can be disabled using --no-dither.
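
One way to compare buffer sizes is to time the same job with several values. The sketch below only prints the commands to try; buffer_trials is a hypothetical name, the three sizes (given in bytes, with 131072 standing for 128k) and the rate 48000 workload are arbitrary choices, and the output goes to the null file (-n) so that disk writes do not dominate the timing.

```shell
# Hypothetical helper (not part of SoX): prints one timing command per
# candidate buffer size. Sizes are assumed to be given in bytes.
# Usage: buffer_trials infile
buffer_trials() (
    infile=$1
    for bytes in 8192 32768 131072; do
        # -n discards the output; "rate 48000" is just an example workload.
        printf 'time sox --buffer %s %s -n rate 48000\n' "$bytes" "$infile"
    done
)

# Example: list the commands for one input file.
buffer_trials in.wav
```

Run the printed commands one at a time and compare the timings; adding --single-threaded or --no-dither to each gives further points of comparison.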

Last modified: August 26, 2009, at 05:46 PM