In this text, I will only use the term "sample" to refer to a single output value from an A/D converter, i.e., a small integer number (usually 8 or 16 bits).
Audio data is characterized by the following parameters, which correspond to settings of the A/D converter when the data was recorded. Naturally, the same settings must be used to play the data.
Approximate sampling rates are often quoted in Hz or kHz ([kilo-] Hertz); however, the politically correct term is samples per second (samples/sec). Sampling rates are always measured per channel, so for stereo data recorded at 8000 samples/sec, there are actually 16000 samples in a second. I will sometimes write 8 k as a shorthand for 8000 samples/sec.
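As a rough illustration of the arithmetic above, the following Python sketch computes the total number of samples and bytes per second. The function names are hypothetical, and it assumes each sample is stored in a whole number of bytes (8 or 16 bits), which need not hold for every format.

```python
def samples_per_second(sampling_rate, num_channels):
    """Total samples per second across all channels.

    sampling_rate is quoted per channel, as in the text.
    """
    return sampling_rate * num_channels


def bytes_per_second(sampling_rate, num_channels, bits_per_sample):
    """Raw data rate, assuming samples occupy whole bytes (assumption)."""
    if bits_per_sample % 8 != 0:
        raise ValueError("this sketch only handles whole-byte samples")
    return sampling_rate * num_channels * (bits_per_sample // 8)


# Stereo data recorded at 8000 samples/sec per channel.
print(samples_per_second(8000, 2))    # 16000
print(bytes_per_second(8000, 2, 16))  # 32000
```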
Multi-channel samples are generally interleaved on a frame-by-frame basis: if there are N channels, the data is a sequence of frames, where each frame contains N samples, one from each channel. (Thus, the sampling rate is really the number of *frames* per second.) For stereo, the left channel usually comes first.
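The frame layout described above can be sketched in a few lines of Python. The helper names are made up for this example, and the samples are assumed to be already decoded into a flat list of integers.

```python
def deinterleave(samples, num_channels):
    """Split an interleaved sample sequence into one list per channel."""
    if len(samples) % num_channels != 0:
        raise ValueError("sample count is not a whole number of frames")
    # Channel c owns every num_channels-th sample, starting at offset c.
    return [list(samples[c::num_channels]) for c in range(num_channels)]


def interleave(channels):
    """Merge equal-length per-channel lists back into a frame sequence."""
    # zip(*channels) yields one frame (one sample per channel) at a time.
    return [sample for frame in zip(*channels) for sample in frame]


# Three stereo frames: (left, right), (left, right), (left, right).
stereo = [10, -10, 20, -20, 30, -30]
left, right = deinterleave(stereo, 2)
print(left)   # [10, 20, 30]
print(right)  # [-10, -20, -30]
```

With this layout, the number of frames is simply the sample count divided by the number of channels.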
The specification of the number of bits for audio in a compressed format, such as u-law samples, is somewhat problematic. u-law samples are logarithmically encoded in 8 bits, like a tiny floating-point number; however, their dynamic range is that of 14-bit linear data.
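To make the "tiny floating point number" comparison concrete, here is a minimal Python sketch of u-law encoding and decoding. It assumes 16-bit signed linear input and uses the bias (0x84) and clip (32635) constants found in widely circulated G.711-style implementations; the function names are hypothetical, and this is an illustration rather than a reference implementation.

```python
BIAS = 0x84   # added before taking the logarithmic segment (assumed constant)
CLIP = 32635  # largest magnitude that survives adding BIAS in 16 bits


def linear_to_ulaw(sample):
    """Encode a 16-bit signed linear sample as one u-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # The exponent is the position of the highest set bit (3 bits),
    # the mantissa is the 4 bits just below it.
    exponent = (magnitude >> 7).bit_length() - 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # u-law bytes are stored with all bits inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(byte):
    """Decode one u-law byte back to a 16-bit signed linear sample."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


print(hex(linear_to_ulaw(0)))  # 0xff
print(ulaw_to_linear(0x80))    # 32124
```

The sign, exponent, and mantissa fields play the same roles as in a floating-point number, which is why small values keep fine resolution while large values are quantized coarsely.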
There are various other techniques for encoding linear data into fewer bits. See the section Compression Schemes for further information.