Historically, almost every type of machine used its own file format for audio data, but some file formats are more generally applicable, and in general it is possible to define conversions between almost any pair of file formats -- sometimes losing information, however.
File formats are a separate issue from device characteristics. There are two types of file formats: self-describing formats, where the device parameters and encoding are made explicit in some form of header, and headerless formats (sometimes called "raw"), where the device parameters and encoding are fixed.
Self-describing file formats generally define a family of data encodings, where a header field indicates the particular encoding variant used.
The header of self-describing formats contains the parameters of the sampling device and sometimes other information (e.g. a human-readable description of the sound, or a copyright notice). Most headers begin with a simple "magic word". (Some formats do not simply define a header format, but may contain chunks of data intermingled with chunks of encoding info.) The data encoding defines how the actual samples are stored in the file, e.g. signed or unsigned, as bytes or short integers, in little-endian or big-endian byte order, etc. Strictly speaking, channel interleaving is also part of the encoding, although so far I have seen little variation in this area.
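As a rough illustration of how a "magic word" and the header fields behind it can be used, here is a minimal Python sketch. The function names (sniff_format, parse_au_header) are made up for this example, and the magic words and the 24-byte NeXT/Sun header layout are taken from the commonly published format descriptions, not from the text above, so treat them as assumptions to check against the format documents.

```python
import struct

AU_MAGIC = b".snd"  # NeXT/Sun magic word (0x2e736e64); assumed from the format docs


def sniff_format(head: bytes) -> str:
    """Guess a self-describing format from the first 20 bytes of a file."""
    if head[:4] == AU_MAGIC:
        return "au"
    # IFF-style files: "FORM", a 4-byte length, then the form type.
    if head[:4] == b"FORM" and head[8:12] in (b"AIFF", b"AIFC", b"8SVX"):
        return head[8:12].decode("ascii").lower()
    # RIFF files use the same idea with a different container tag.
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    if head[:19] == b"Creative Voice File":
        return "voc"
    return "unknown"


def parse_au_header(head: bytes) -> dict:
    """Decode the fixed part of a NeXT/Sun header (six big-endian 32-bit words)."""
    if len(head) < 24 or head[:4] != AU_MAGIC:
        raise ValueError("not a NeXT/Sun audio header")
    _magic, data_offset, data_size, encoding, rate, channels = struct.unpack(
        ">4s5I", head[:24]
    )
    return {
        "data_offset": data_offset,  # where the samples start; the info string sits before it
        "data_size": data_size,
        "encoding": encoding,        # numeric code selecting the encoding variant
        "sample_rate": rate,
        "channels": channels,
    }
```

A caller would typically read the first 24 bytes of the file, call sniff_format on them, and only then pick a format-specific parser. Headerless files fall through to "unknown" and need their parameters supplied by the user.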
Here's an overview of popular file formats.
extension, name    origin            variable parameters (fixed; comments)

.au or .snd        NeXT, Sun         rate, #channels, encoding, info string
.aif(f), AIFF      Apple, SGI        rate, #channels, sample width, lots of info
.aif(f), AIFC      Apple, SGI        same (extension of AIFF with compression)
.iff, IFF/8SVX     Amiga             rate, #channels, instrument info (8 bits)
.mp2, .mp3         MPEG standard     rate, #channels, sample quality
.ra                Real Networks     rate, #channels, sample quality
.sf                IRCAM             rate, #channels, encoding, info
.smp               Turtle Beach      loops, cues, (16 bits/1 ch)
.voc               Soundblaster      rate (8 bits/1 ch; can use silence deletion)
.wav, WAVE         Microsoft         rate, #channels, sample width, lots of info
.wve               Psion             (8 bits, 1 ch, a-law, 8khz)
none, HCOM         Mac               rate (8 bits/1 ch; uses Huffman compression)
none, MIME         Internet          (see below)
none, NIST SPHERE  DARPA speech community  (see below)
.mod or .nst       Amiga             (see below)
Note that the filename extension ".snd" is ambiguous: it can be either the self-describing NeXT format or the headerless Mac/PC format, or even a headerless Amiga format.
I know nothing for sure about the origin of HCOM files. The filenames usually don't have a ".hcom" extension, but this is what SOX (see the section File conversion) uses. The file format recognized by SOX includes a MacBinary header, where the file type field is "FSSD". The data fork begins with the magic word "HCOM" and contains Huffman compressed data; after decompression it is 8-bit unsigned data.
IFF/8SVX allows for amplitude contours for sounds (attack/decay/etc). Compression is optional (and extensible); volume is variable; there are properties for author, notes and copyright; etc.
AIFF, AIFC and WAVE are similar in spirit but allow more freedom, for instance in encoding style (sample widths other than 8 bits).
There are other sound formats in use on the Amiga by digitizers and music programs, such as IFF/SMUS.
An interesting "interchange format" for audio data is described in the proposed Internet Standard "MIME", which describes a family of transport encodings and structuring devices for electronic mail. This is an extensible format, and initially standardizes a type of audio data dubbed "audio/basic", which is 8-bit u-law data sampled at 8000 samples/sec.
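To give an idea of what "8-bit u-law data" means in practice, here is a small Python sketch that expands one u-law byte to a linear sample. The constants and bit layout follow the widely used G.711 expansion routine and are not taken from the text above; the function name ulaw_to_linear is made up for this example.

```python
def ulaw_to_linear(code: int) -> int:
    """Expand one 8-bit u-law code to a signed linear sample (16-bit range)."""
    BIAS = 0x84                      # bias added before compression (G.711 convention)
    code = ~code & 0xFF              # u-law bytes are stored bit-inverted
    sign = code & 0x80               # top bit: sign
    exponent = (code >> 4) & 0x07    # next three bits: segment number
    mantissa = code & 0x0F           # low four bits: position within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def decode_audio_basic(data: bytes) -> list:
    """Decode a buffer of audio/basic (u-law, 8000 samples/sec, mono) bytes."""
    return [ulaw_to_linear(b) for b in data]
```

At 8000 samples/sec and one byte per sample, one second of audio/basic data is simply 8000 bytes, so the duration of a buffer is len(data) / 8000 seconds.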
The "IRCAM" sound file system has now been superseded by the so-called "BICSF" (for Berkeley/IRCAM/CARL Sound File system) software release.
More recently, there has been an effort at Princeton (Prof. Paul Lansky) and Stanford (Stephen Travis Pope) to standardize several extensions to BICSF. A description of BICSF and the Princeton/Stanford extensions is available by anonymous ftp at ftp://ftp.cwi.nl/pub/audio/BICSF-info. This file contains further ftp pointers to software.
Headerless formats define a single encoding and usually allow no variation in device parameters (except sometimes sampling rate, which can be a pain to figure out other than by listening to the sample).
extension or name  origin            parameters

.snd, .fssd        Mac, PC           variable rate, 1 channel, 8 bits unsigned
.ul                US telephony      8 k, 1 channel, 8 bit "u-law" encoding
.snd?              Amiga             variable rate, 1 channel, 8 bits signed
It is usually easy to distinguish 8-bit signed formats from unsigned by looking at the beginning of the data with 'od -b <file | head'; since most sounds start with a little bit of silence containing small amounts of background noise, the signed formats will have an abundance of bytes with values 0376, 0377, 0, 1, 2, while the unsigned formats will have 0176, 0177, 0200, 0201, 0202 instead. (Using "od -c" will also show any headers that are tacked in front of the file.)
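The same eyeball test can be automated. The following Python sketch counts, in the first part of the data, how many bytes lie close to 0 (signed silence) versus close to 0200 octal (unsigned silence). The function name, the 4096-byte window and the margin of 2 are arbitrary choices for this example, and the result is only a guess: it will be fooled by files that start loud or carry a header.

```python
def guess_8bit_signedness(data: bytes, window: int = 4096, margin: int = 2) -> str:
    """Guess whether headerless 8-bit samples are signed or unsigned."""
    head = data[:window]
    # Signed silence clusters around 0: bytes 0376, 0377, 0, 1, 2 (octal).
    near_zero = sum(1 for b in head if b <= margin or b >= 256 - margin)
    # Unsigned silence clusters around 0200: bytes 0176 .. 0202 (octal).
    near_mid = sum(1 for b in head if abs(b - 0o200) <= margin)
    if near_zero > near_mid:
        return "signed"
    if near_mid > near_zero:
        return "unsigned"
    return "unknown"
```

Typical use would be to read the start of the file in binary mode and pass it in, e.g. guess_8bit_signedness(open(name, "rb").read(4096)).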
The Apple IIgs records raw data in the same format as the Mac, but uses a 0 byte as a terminator; samples with value 0 are replaced by 1.
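Converting between the two conventions is then a matter of a few lines. Here is a Python sketch under the assumption that the terminator is a single 0 byte at the end of the sample data, as described above; the function names are made up for this example.

```python
def mac_raw_to_iigs(samples: bytes) -> bytes:
    """Turn Mac-style raw 8-bit unsigned samples into Apple IIgs form."""
    # A sample value of 0 would be read as the terminator, so it becomes 1.
    body = bytes(1 if b == 0 else b for b in samples)
    return body + b"\x00"  # append the terminator


def iigs_to_mac_raw(data: bytes) -> bytes:
    """Strip the terminator (and anything after it) from IIgs sample data."""
    end = data.find(b"\x00")
    return data if end < 0 else data[:end]
```

Note that the round trip is not exact: samples that were 0 on the Mac side come back as 1, an error of one step at the extreme of the range.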