New WAVE Types
The necessary type, structure and constant definitions are in mmreg.h.
All newly defined WAVE types must contain both a fact chunk and an extended wave format description within the 'fmt' chunk. RIFF WAVE files of type WAVE_FORMAT_PCM need not have the extra chunk nor the extended wave format description.
Fact Chunk
This chunk stores file dependent information about the contents of the WAVE file. It currently specifies the length of the file in samples.
WAVEFORMATEX
The extended wave format structure is used to define all non-PCM format wave data, and is described as follows in the include file mmreg.h:
/* general extended waveform format structure */
/* Use this for all NON PCM formats */
/* (information common to all formats) */
typedef struct waveformat_extended_tag {
WORD wFormatTag; /* format type */
WORD nChannels; /* number of channels (i.e. mono, stereo...) */
DWORD nSamplesPerSec; /* sample rate */
DWORD nAvgBytesPerSec; /* for buffer estimation */
WORD nBlockAlign; /* block size of data */
WORD wBitsPerSample; /* Number of bits per sample of mono data */
WORD cbSize; /* The count in bytes of the extra size */
} WAVEFORMATEX;
| wFormatTag | Defines the type of WAVE file. |
| nChannels | Number of channels in the wave, 1 for mono, 2 for stereo |
| nSamplesPerSec | Frequency of the sample rate of the wave file. This should be 11025, 22050, or 44100. Other sample rates are allowed, but not encouraged. This rate is also used by the sample size entry in the fact chunk to determine the length in time of the data. |
| nAvgBytesPerSec | Average data rate. Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | The block alignment (in bytes) of the data in <data-ck>. Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |
| wBitsPerSample | This is the number of bits per sample per channel. Each channel is assumed to have the same sample resolution. If this field is not needed, then it should be set to zero. |
| cbSize | The size in bytes of the extra information in the WAVE format header, not including the size of the WAVEFORMATEX structure. As an example, in the IMA ADPCM format cbSize is calculated as sizeof(IMAADPCMWAVEFORMAT) - sizeof(WAVEFORMATEX), which yields two. |
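The cbSize arithmetic can be checked with a small sketch. It assumes fixed-width stand-ins for WORD and DWORD, byte packing of the structures, and that IMAADPCMWAVEFORMAT has the same layout as the DVIADPCMWAVEFORMAT structure defined later in this document. The helper name `ima_adpcm_cb_size` is hypothetical.

```c
#include <stdint.h>

/* Assumed stand-ins for the Windows types: WORD is 16 bits, DWORD is 32 bits. */
typedef uint16_t WORD;
typedef uint32_t DWORD;

#pragma pack(push, 1)   /* assumption: structures are packed on byte boundaries */
typedef struct {
    WORD  wFormatTag;       /* format type */
    WORD  nChannels;        /* number of channels */
    DWORD nSamplesPerSec;   /* sample rate */
    DWORD nAvgBytesPerSec;  /* for buffer estimation */
    WORD  nBlockAlign;      /* block size of data */
    WORD  wBitsPerSample;   /* bits per sample of mono data */
    WORD  cbSize;           /* count in bytes of the extra size */
} WAVEFORMATEX;

typedef struct {
    WAVEFORMATEX wfx;
    WORD wSamplesPerBlock;  /* the only extra field */
} IMAADPCMWAVEFORMAT;
#pragma pack(pop)

/* cbSize counts only the bytes that follow the WAVEFORMATEX header. */
static WORD ima_adpcm_cb_size(void)
{
    return (WORD)(sizeof(IMAADPCMWAVEFORMAT) - sizeof(WAVEFORMATEX));
}
```

Without byte packing a compiler may pad both structures, and the subtraction would then no longer yield two.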
Defined wFormatTags
| Definition | WAVE form Registration No - Hex | Company |
| #define WAVE_FORMAT_G723_ADPCM | 0x0014 | /* Antex Electronics Corporation */ |
| #define WAVE_FORMAT_ANTEX_ADPCME | 0x0033 | /* Antex Electronics Corporation */ |
| #define WAVE_FORMAT_G721_ADPCM | 0x0040 | /* Antex Electronics Corporation */ |
| #define WAVE_FORMAT_APTX | 0x0025 | /* Audio Processing Technology */ |
| #define WAVE_FORMAT_AUDIOFILE_AF36 | 0x0024 | /* Audiofile, Inc. */ |
| #define WAVE_FORMAT_AUDIOFILE_AF10 | 0x0026 | /* Audiofile, Inc. */ |
| #define WAVE_FORMAT_CONTROL_RES_VQLPC | 0x0034 | /* Control Resources Limited */ |
| #define WAVE_FORMAT_CONTROL_RES_CR10 | 0x0037 | /* Control Resources Limited */ |
| #define WAVE_FORMAT_CREATIVE_ADPCM | 0x0200 | /* Creative Labs, Inc */ |
| #define WAVE_FORMAT_DOLBY_AC2 | 0x0030 | /* Dolby Laboratories */ |
| #define WAVE_FORMAT_DSPGROUP_TRUESPEECH | 0x0022 | /* DSP Group, Inc */ |
| #define WAVE_FORMAT_DIGISTD | 0x0015 | /* DSP Solutions, Inc. */ |
| #define WAVE_FORMAT_DIGIFIX | 0x0016 | /* DSP Solutions, Inc. */ |
| #define WAVE_FORMAT_DIGIREAL | 0x0035 | /* DSP Solutions, Inc. */ |
| #define WAVE_FORMAT_DIGIADPCM | 0x0036 | /* DSP Solutions, Inc. */ |
| #define WAVE_FORMAT_ECHOSC1 | 0x0023 | /* Echo Speech Corporation */ |
| #define WAVE_FORMAT_FM_TOWNS_SND | 0x0300 | /* Fujitsu Corp. */ |
| #define WAVE_FORMAT_IBM_CVSD | 0x0005 | /* IBM Corporation */ |
| #define WAVE_FORMAT_OLIGSM | 0x1000 | /* Ing C. Olivetti & C., S.p.A. */ |
| #define WAVE_FORMAT_OLIADPCM | 0x1001 | /* Ing C. Olivetti & C., S.p.A. */ |
| #define WAVE_FORMAT_OLICELP | 0x1002 | /* Ing C. Olivetti & C., S.p.A. */ |
| #define WAVE_FORMAT_OLISBC | 0x1003 | /* Ing C. Olivetti & C., S.p.A. */ |
| #define WAVE_FORMAT_OLIOPR | 0x1004 | /* Ing C. Olivetti & C., S.p.A. */ |
| #define WAVE_FORMAT_IMA_ADPCM | (WAVE_FORMAT_DVI_ADPCM) | /* Intel Corporation */ |
| #define WAVE_FORMAT_DVI_ADPCM | 0x0011 | /* Intel Corporation */ |
| #define WAVE_FORMAT_UNKNOWN | 0x0000 | /* Microsoft Corporation */ |
| #define WAVE_FORMAT_PCM | 0x0001 | /* Microsoft Corporation */ |
| #define WAVE_FORMAT_ADPCM | 0x0002 | /* Microsoft Corporation */ |
| #define WAVE_FORMAT_ALAW | 0x0006 | /* Microsoft Corporation */ |
| #define WAVE_FORMAT_MULAW | 0x0007 | /* Microsoft Corporation */ |
| #define WAVE_FORMAT_GSM610 | 0x0031 | /* Microsoft Corporation */ |
| #define WAVE_FORMAT_MPEG | 0x0050 | /* Microsoft Corporation */ |
| #define WAVE_FORMAT_NMS_VBXADPCM | 0x0038 | /* Natural MicroSystems */ |
| #define WAVE_FORMAT_OKI_ADPCM | 0x0010 | /* OKI */ |
| #define WAVE_FORMAT_SIERRA_ADPCM | 0x0013 | /* Sierra Semiconductor Corp */ |
| #define WAVE_FORMAT_SONARC | 0x0021 | /* Speech Compression */ |
| #define WAVE_FORMAT_MEDIASPACE_ADPCM | 0x0012 | /* Videologic */ |
| #define WAVE_FORMAT_YAMAHA_ADPCM | 0x0020 | /* Yamaha Corporation of America */ |
Unknown Wave Type
Added: 05/01/92
Author: Microsoft
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the time length of the data in samples.
WAVE Format Header
Changed as of September 5, 1993: This wave format will not be defined. For development purposes, DO NOT USE 0x0000. Instead, USE 0xffff until an ID has been obtained.
# define WAVE_FORMAT_UNKNOWN (0x0000)
| wFormatTag | This must be set to WAVE_FORMAT_UNKNOWN. |
| nChannels | Number of channels in the wave (1 for mono). |
| nSamplesPerSec | Frequency of the sample rate of the wave file. |
| nAvgBytesPerSec | Average data rate. Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | Block alignment of the data. Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |
| wBitsPerSample | This is the number of bits per sample of data. |
| cbSize | The size in bytes of the extra information in the extended WAVE 'fmt' header. |
Microsoft ADPCM
Added 05/01/92
Author: Microsoft
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the time length of the data in samples.
WAVE Format Header
# define WAVE_FORMAT_ADPCM (0x0002)
typedef struct adpcmcoef_tag {
int iCoef1; /* 16 bit signed, fixed point 8.8 */
int iCoef2; /* 16 bit signed, fixed point 8.8 */
} ADPCMCOEFSET;
typedef struct adpcmwaveformat_tag {
WAVEFORMATEX wfx;
WORD wSamplesPerBlock;
WORD wNumCoef;
ADPCMCOEFSET aCoef[]; /* wNumCoef entries */
} ADPCMWAVEFORMAT;
| wFormatTag | This must be set to WAVE_FORMAT_ADPCM. |
| nChannels | Number of channels in the wave, 1 for mono, 2 for stereo. |
| nSamplesPerSec | Frequency of the sample rate of the wave file. This should be 11025, 22050, or 44100. Other sample rates are allowed, but not encouraged. |
| nAvgBytesPerSec | Average data rate. ((nSamplesPerSec / wSamplesPerBlock) * nBlockAlign). Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | The block alignment (in bytes) of the data in <data-ck>. The value depends on nSamplesPerSec x nChannels, as listed in the nBlockAlign table after this field list. Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |
| wBitsPerSample | This is the number of bits per sample of ADPCM. Currently only 4 bits per sample is defined. Other values are reserved. |
| cbSize | The size in bytes of the extended information after the WAVEFORMATEX structure. For the standard WAVE_FORMAT_ADPCM using the standard seven coefficient pairs, this is 32. If extra coefficients are added, then this value will increase. |
| wSamplesPerBlock | Count of the number of samples per block. (((nBlockAlign - (7 * nChannels)) * 8) / (wBitsPerSample * nChannels)) + 2. |
| wNumCoef | Count of the number of coefficient sets defined in aCoef. |
| aCoef | These are the coefficients used by the wave to play. They may be interpreted as fixed point 8.8 signed values. Currently there are 7 preset coefficient sets. They must appear in the order listed in the coefficient table after this field list. Note that even if only 1 coefficient set was used to encode the file, all coefficient sets are still included. More coefficients may be added by the encoding software, but the first 7 must always be the same. |

| nSamplesPerSec x nChannels | nBlockAlign |
| 8k | 256 |
| 11k | 256 |
| 22k | 512 |
| 44k | 1024 |

| Coef Set | Coef1 | Coef2 |
| 0 | 256 | 0 |
| 1 | 512 | -256 |
| 2 | 0 | 0 |
| 3 | 192 | 64 |
| 4 | 240 | 0 |
| 5 | 460 | -208 |
| 6 | 392 | -232 |
Note: 8.8 signed values can be divided by 256 to obtain the integer portion of the value.
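The derived header values in the table above can be sketched in C as follows. The function names are hypothetical, and the byte-rate helper multiplies before dividing to keep integer precision, which is a choice of this sketch rather than something the text prescribes.

```c
#include <stdint.h>

/* Samples per block:
   (((nBlockAlign - (7 * nChannels)) * 8) / (wBitsPerSample * nChannels)) + 2 */
static uint32_t adpcm_samples_per_block(uint32_t nBlockAlign,
                                        uint32_t nChannels,
                                        uint32_t wBitsPerSample)
{
    return ((nBlockAlign - 7 * nChannels) * 8) / (wBitsPerSample * nChannels) + 2;
}

/* Average data rate: (nSamplesPerSec / samplesPerBlock) * nBlockAlign,
   evaluated as one integer expression. */
static uint32_t adpcm_avg_bytes_per_sec(uint32_t nSamplesPerSec,
                                        uint32_t samplesPerBlock,
                                        uint32_t nBlockAlign)
{
    return (uint32_t)(((uint64_t)nSamplesPerSec * nBlockAlign) / samplesPerBlock);
}

/* cbSize: two WORD fields (samples per block, coefficient count)
   plus two 16 bit coefficients per coefficient set. */
static uint32_t adpcm_cb_size(uint32_t numCoef)
{
    return 2 + 2 + 4 * numCoef;
}
```

For a mono file with nBlockAlign = 256 and 4 bits per sample this gives 500 samples per block, and seven coefficient sets give the cbSize of 32 stated in the table.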
Block
The block has three parts, the header, data, and padding. The three together are <nBlockAlign> bytes.
typedef struct adpcmblockheader_tag {
BYTE bPredictor[nChannels];
int iDelta[nChannels];
int iSamp1[nChannels];
int iSamp2[nChannels];
} ADPCMBLOCKHEADER;
| Field | Description |
| bPredictor | Index into the aCoef array to define the predictor used to encode this block. |
| iDelta | Initial Delta value to use. |
| iSamp1 | The second sample value of the block. When decoding this will be used as the previous sample to start decoding with. |
| iSamp2 | The first sample value of the block. When decoding this will be used as the previous' previous sample to start decoding with. |
Data
The data is a bit string parsed in groups of (wBitsPerSample * nChannels).
For the case of Mono Voice ADPCM (wBitsPerSample = 4, nChannels = 1) we have:
a sequence of bytes, where byte N (N = 0, 1, 2, ...) holds <(Sample 2N + 2) (Sample 2N + 3)>:
Byte N = ((4 bit error delta for sample (2 * N) + 2) << 4) | (4 bit error delta for sample (2 * N) + 3)
For the case of Stereo Voice ADPCM (wBitsPerSample = 4, nChannels = 2) we have:
a sequence of bytes, where byte N (N = 0, 1, 2, ...) holds <(Left Channel of Sample N + 2) (Right Channel of Sample N + 2)>:
Byte N = ((4 bit error delta for left channel of sample N + 2) << 4) | (4 bit error delta for right channel of sample N + 2)
Padding
Bit Padding is used to round off the block to an exact byte length.
The size of the padding (in bits):
((nBlockAlign - (7 * nChannels)) * 8) -
(((nSamplesPerBlock - 2) * nChannels) * wBitsPerSample)
The padding does not store any data and should be made zero.
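As a quick check of the padding formula, a minimal sketch (the function name is hypothetical):

```c
#include <stdint.h>

/* Padding in bits:
   ((nBlockAlign - (7 * nChannels)) * 8)
     - (((samplesPerBlock - 2) * nChannels) * wBitsPerSample) */
static uint32_t adpcm_padding_bits(uint32_t nBlockAlign,
                                   uint32_t nChannels,
                                   uint32_t wBitsPerSample,
                                   uint32_t samplesPerBlock)
{
    uint32_t dataBits = (nBlockAlign - 7 * nChannels) * 8;            /* bits after the header */
    uint32_t usedBits = (samplesPerBlock - 2) * nChannels * wBitsPerSample; /* encoded samples */
    return dataBits - usedBits;
}
```

With nBlockAlign = 256, one channel, 4 bits per sample and 500 samples per block, every bit after the header is used and the padding is zero.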
ADPCM Algorithm
Each channel of the ADPCM file can be encoded/decoded independently. However this should not destroy phase and amplitude information since each channel will track the original. Since the channels are encoded/decoded independently, this document is written as if only one channel is being decoded. Since the channels are interleaved, multiple channels may be encoded/decoded in parallel using independent local storage and temporaries.
Note that the process for encoding/decoding one block is independent from the process for the next block. Therefore the process is described for one block only, and may be repeated for other blocks. While some optimizations may relate the process for one block to another, in theory they are still independent.
Note that in the description below the number designation appended to iSamp (i.e. iSamp1 and iSamp2) refers to the placement of the sample in relation to the current one being decoded. Thus when you are decoding sample N, iSamp1 would be sample N - 1 and iSamp2 would be sample N - 2. Coef1 is the coefficient for iSamp1 and Coef2 is the coefficient for iSamp2. This numbering is identical to that used in the block and format descriptions above.
A sample application will be provided to convert a RIFF waveform file to and from ADPCM and PCM formats.
Decoding
First the predictor coefficients are determined by using the bPredictor field of the block header. This value is an index into the aCoef array in the file header.
bPredictor = GETBYTE
The initial iDelta is also taken from the block header.
iDelta = GETWORD
Then the first two samples are taken from the block header. (They are stored as 16 bit PCM data as iSamp1 and iSamp2. iSamp2 is the first sample of the block, iSamp1 is the second sample.)
iSamp1 = GETINT
iSamp2 = GETINT
After taking this initial data from the block header, the process of decoding the rest of the block may begin. It can be done in the following manner:
While there are more samples in the block to decode:
    Predict the next sample from the previous two samples.
        lPredSamp = ((iSamp1 * iCoef1) + (iSamp2 * iCoef2)) / FIXED_POINT_COEF_BASE
    Get the 4 bit signed error delta.
        iErrorDelta = GETNIBBLE
    Add the 'error in prediction' to the predicted next sample and prevent over/underflow errors.
        lNewSamp = lPredSamp + (iDelta * iErrorDelta)
        if lNewSamp too large, make it the maximum allowable size.
        if lNewSamp too small, make it the minimum allowable size.
    Output the new sample.
        OUTPUT( lNewSamp )
    Adjust the quantization step size used to calculate the 'error in prediction'.
        iDelta = iDelta * AdaptionTable[ iErrorDelta ] / FIXED_POINT_ADAPTION_BASE
        if iDelta too small, make it the minimum allowable size.
    Update the record of previous samples.
        iSamp2 = iSamp1
        iSamp1 = lNewSamp
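The decoding steps above can be sketched in C for a single mono block. Several things here are assumptions that this section does not state: the values of FIXED_POINT_COEF_BASE, FIXED_POINT_ADAPTION_BASE and AdaptionTable and the minimum iDelta of 16 (taken from the commonly published msadpcm sample code), little-endian byte order in the header, and indexing AdaptionTable with the raw unsigned nibble. The function and helper names are hypothetical, and only the seven preset coefficient sets are supported.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed constants; not defined in the text above. */
#define FIXED_POINT_COEF_BASE     256
#define FIXED_POINT_ADAPTION_BASE 256
#define MIN_DELTA                 16

static const int AdaptionTable[16] = {
    230, 230, 230, 230, 307, 409, 512, 614,
    768, 614, 512, 409, 307, 230, 230, 230
};

/* The seven preset coefficient sets from the format header table. */
static const int Coef1[7] = { 256,  512, 0, 192, 240,  460,  392 };
static const int Coef2[7] = {   0, -256, 0,  64,   0, -208, -232 };

/* Read a signed 16 bit value, assuming little-endian storage. */
static int get_int16(const uint8_t *p)
{
    return (int16_t)(uint16_t)(p[0] | (p[1] << 8));
}

/* Decode one mono block. Pass the samples-per-block count as maxOut so that
   padding is not decoded. Returns the number of samples written. */
static size_t msadpcm_decode_mono_block(const uint8_t *block, size_t blockLen,
                                        int16_t *out, size_t maxOut)
{
    if (blockLen < 7 || maxOut < 2 || block[0] >= 7)
        return 0;

    int iCoef1 = Coef1[block[0]];
    int iCoef2 = Coef2[block[0]];
    int iDelta = get_int16(block + 1);
    int iSamp1 = get_int16(block + 3);   /* second sample of the block */
    int iSamp2 = get_int16(block + 5);   /* first sample of the block */
    size_t n = 0;

    out[n++] = (int16_t)iSamp2;
    out[n++] = (int16_t)iSamp1;

    for (size_t i = 7; i < blockLen; i++) {
        for (int half = 0; half < 2 && n < maxOut; half++) {
            /* The high nibble holds the earlier sample. */
            int nibble = (half == 0) ? (block[i] >> 4) : (block[i] & 0x0F);
            int iErrorDelta = (nibble >= 8) ? nibble - 16 : nibble;

            long lPredSamp = ((long)iSamp1 * iCoef1 + (long)iSamp2 * iCoef2)
                             / FIXED_POINT_COEF_BASE;
            long lNewSamp = lPredSamp + (long)iDelta * iErrorDelta;
            if (lNewSamp > 32767)  lNewSamp = 32767;
            if (lNewSamp < -32768) lNewSamp = -32768;
            out[n++] = (int16_t)lNewSamp;

            iDelta = (int)((long)iDelta * AdaptionTable[nibble]
                           / FIXED_POINT_ADAPTION_BASE);
            if (iDelta < MIN_DELTA) iDelta = MIN_DELTA;

            iSamp2 = iSamp1;
            iSamp1 = (int)lNewSamp;
        }
    }
    return n;
}
```

A stereo decoder would keep one such state per channel and alternate nibbles between them, following the stereo data layout described earlier.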
Encoding
For each block, the encoding process can be done through the following steps. (for each channel)
Determine the predictor to use for the block.
Determine the initial iDelta for the block.
Write out the block header.
Encode and write out the data.
The predictor to use for each block can be determined in many ways.
1. A static predictor for all files.
2. The block can be encoded with each possible predictor. Then the predictor that gave the least error can be chosen. The least error can be determined from:
   2.1. Sum of squares of differences (between the compressed/decompressed data and the original data).
   2.2. The least average absolute difference.
   2.3. The least average iDelta.
3. The predictor that has the smallest initial iDelta can be chosen. (This is an approximation of method 2.3.)
4. Statistics from either the previous or current block. (e.g. a linear combination of the first 5 samples of a block that corresponds to the average predicted error.)
The starting iDelta for each block can also be determined in a couple of ways.
1. One way is to always start off with the same initial iDelta.
2. Another way is to use the iDelta from the end of the previous block. (Note that for the first block an initial value must then be chosen.)
3. The initial iDelta may also be determined from the first few samples of the block. iDelta generally fluctuates around the value that makes the absolute value of the encoded output about half of the maximum absolute value of the encoded output. For 4 bit error deltas the maximum absolute value is 8, so the initial iDelta should be set so that the first output is around 4.
4. Finally the initial iDelta for this block may be determined from the last few samples of the last block. (Note that for the first block an initial value must then be chosen.)
Note that different choices for predictor and initial iDelta will result in different audio quality.
Once the predictor and starting quantization values are chosen, the block header may be written out.
First the choice of predictor is written out. (For each channel.)
Then the initial iDelta (quantization scale) is written out. (For each channel.)
Then the 16 bit PCM value of the second sample is written out. (iSamp1) (For each channel.)
Finally the 16 bit PCM value of the first sample is written out. (iSamp2) (For each channel.)
Then the rest of the block may be encoded. (Note that the first encoded value will be for the 3rd sample in the block since the first two are contained in the header.)
While there are more samples in the block to encode:
    Predict the next sample from the previous two samples.
        lPredSamp = ((iSamp1 * iCoef1) + (iSamp2 * iCoef2)) / FIXED_POINT_COEF_BASE
    Produce the 4 bit signed error delta and prevent overflow/underflow.
        iErrorDelta = (Sample(n) - lPredSamp) / iDelta
        if iErrorDelta is too large, make it the maximum allowable size.
        if iErrorDelta is too small, make it the minimum allowable size.
    Write out the nibble iErrorDelta.
        PutNIBBLE( iErrorDelta )
    Add the 'error in prediction' to the predicted next sample and prevent over/underflow errors.
        lNewSamp = lPredSamp + (iDelta * iErrorDelta)
        if lNewSamp too large, make it the maximum allowable size.
        if lNewSamp too small, make it the minimum allowable size.
    Adjust the quantization step size used to calculate the 'error in prediction'.
        iDelta = iDelta * AdaptionTable[ iErrorDelta ] / FIXED_POINT_ADAPTION_BASE
        if iDelta too small, make it the minimum allowable size.
    Update the record of previous samples.
        iSamp2 = iSamp1
        iSamp1 = lNewSamp
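One pass through the encoding loop can be sketched as a C function that encodes a single sample and updates the channel state. The constants and AdaptionTable are assumed values from the commonly published msadpcm sample code, not from this section. The `MsAdpcmState` type and the function name are hypothetical, and the state is assumed to hold a positive iDelta.

```c
#include <stdint.h>

/* Assumed constants; not defined in the text above. */
#define FIXED_POINT_COEF_BASE     256
#define FIXED_POINT_ADAPTION_BASE 256
#define MIN_DELTA                 16

static const int AdaptionTable[16] = {
    230, 230, 230, 230, 307, 409, 512, 614,
    768, 614, 512, 409, 307, 230, 230, 230
};

/* Hypothetical per-channel encoder state. */
typedef struct {
    int iCoef1, iCoef2;   /* predictor chosen for this block */
    int iDelta;           /* current quantization step, must be > 0 */
    int iSamp1, iSamp2;   /* previous and previous-previous samples */
} MsAdpcmState;

/* Encode one 16 bit sample and return the 4 bit code (0..15). */
static uint8_t msadpcm_encode_sample(MsAdpcmState *s, int sample)
{
    long lPredSamp = ((long)s->iSamp1 * s->iCoef1 + (long)s->iSamp2 * s->iCoef2)
                     / FIXED_POINT_COEF_BASE;

    /* Signed 4 bit error delta, limited to -8..7. */
    long iErrorDelta = (sample - lPredSamp) / s->iDelta;
    if (iErrorDelta > 7)  iErrorDelta = 7;
    if (iErrorDelta < -8) iErrorDelta = -8;
    uint8_t nibble = (uint8_t)(iErrorDelta & 0x0F);

    /* Track what the decoder will reconstruct, not the original sample. */
    long lNewSamp = lPredSamp + (long)s->iDelta * iErrorDelta;
    if (lNewSamp > 32767)  lNewSamp = 32767;
    if (lNewSamp < -32768) lNewSamp = -32768;

    s->iDelta = (int)((long)s->iDelta * AdaptionTable[nibble]
                      / FIXED_POINT_ADAPTION_BASE);
    if (s->iDelta < MIN_DELTA) s->iDelta = MIN_DELTA;

    s->iSamp2 = s->iSamp1;
    s->iSamp1 = (int)lNewSamp;
    return nibble;
}
```

Because the state follows the reconstructed sample, encoder and decoder stay in step even when the quantized error delta does not reproduce the input exactly.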
Sample C Code
Sample C Code is contained in the file msadpcm.c, which is available with this document in electronic form and separately. See the Overview section for how to obtain this sample code.
CVSD Wave Type
Added 07/21/92
Author: DSP Solutions, formerly Digispeech
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the time length of the data in samples.
WAVE Format Header
# define WAVE_FORMAT_IBM_CVSD (0x0005)
| wFormatTag | This must be set to WAVE_FORMAT_IBM_CVSD |
| nChannels | Number of channels in the wave, 1 for mono, 2 for stereo... |
| nSamplesPerSec | Frequency the source was sampled at. See chart below. |
| nAvgBytesPerSec | Average data rate. See chart below. (One of 1800, 2400, 3000, 3600, 4200, or 4800.) Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | Set to 2048 to provide efficient caching of file from CD-ROM. Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |
| wBitsPerSample | This is the number of bits per sample of data. This is always 1 for CVSD. |
| cbSize | The size in bytes of the rest of the wave format header. This is zero for CVSD. |
The Digispeech CVSD compression format is compatible with the IBM PS/2 Speech Adapter, which uses a Motorola MC3418 for CVSD modulation. The Motorola chip uses only one algorithm which can work at variable sampling clock rates. The CVSD algorithm compresses each input audio sample to 1 bit. An acceptable quality of sound is achieved using high sampling rates. The Digispeech DS201 adapter supports six CVSD sampling frequencies, which are being used by most software using the IBM PS/2 Speech Adapter:
| Sample Rate | Bytes/Second |
| 14,400Hz | 1800 Bytes |
| 19,200Hz | 2400 Bytes |
| 24,000Hz | 3000 Bytes |
| 28,800Hz | 3600 Bytes |
| 33,600Hz | 4200 Bytes |
| 38,400Hz | 4800 Bytes |
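The chart follows directly from CVSD storing one bit per sample, so eight samples fit in each byte. A minimal sketch for a single channel (the function name is hypothetical):

```c
#include <stdint.h>

/* CVSD uses 1 bit per sample, so the byte rate of one channel
   is the sample rate divided by 8. */
static uint32_t cvsd_avg_bytes_per_sec(uint32_t nSamplesPerSec)
{
    return nSamplesPerSec / 8;
}
```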
The CVSD format is a compression scheme which has been used by IBM and is supported by the IBM PS/2 Speech Adapter card. Digispeech also has a card that uses this compression scheme. It is not Digispeech's policy to disclose any of these algorithms to any third party vendor.
CCITT Standard Companded Wave Types
Added: 05/22/92
Author: Microsoft, DSP Solutions (formerly Digispeech), Vocaltec, Artisoft
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the time length of the data in samples.
WAVE Format Header
#define WAVE_FORMAT_ALAW (0x0006)
#define WAVE_FORMAT_MULAW (0x0007)
| wFormatTag | This must be set to one of WAVE_FORMAT_ALAW, WAVE_FORMAT_MULAW |
| nChannels | Number of channels in the wave, 1 for mono, 2 for stereo... |
| nSamplesPerSec | Frequency of the wave file. (8000, 11025, 22050, 44100). |
| nAvgBytesPerSec | Average data rate. Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | Size of the blocks in bytes. Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |
| wBitsPerSample | This is the number of bits per sample of data. (This is 8 for all the companded formats.) |
| cbSize | The size in bytes of the extra information in the extended WAVE 'fmt' header. This should be zero. |
See the CCITT G.711 specification for details of the data format.
This is a CCITT (International Telegraph and Telephone Consultative Committee) specification. Their address is:
Palais des Nations
CH-1211 Geneva 10, Switzerland
Phone: 22 7305111
OKI ADPCM Wave Types
Added: 05/22/92
Author: DigiSpeech, Vocaltec, Wang
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the time length of the data in samples.
WAVE Format Header
# define WAVE_FORMAT_OKI_ADPCM (0x0010)
typedef struct oki_adpcmwaveformat_tag {
WAVEFORMATEX wfx;
WORD wPole;
} OKIADPCMWAVEFORMAT;
| wFormatTag | This must be set to WAVE_FORMAT_OKI_ADPCM. |
| nChannels | Number of channels in the wave, 1 for mono, 2 for stereo. |
| nSamplesPerSec | Sample rate of the wave file. (8000, 11025, 22050, 44100). |
| nAvgBytesPerSec | Average data rate. Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | This is dependent upon the number of bits per sample, as listed in the nBlockAlign table after this field list. Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |
| wBitsPerSample | This is the number of bits per sample of data. (OKI can be 3 or 4.) |
| cbSize | The size in bytes of the extra information in the extended WAVE 'fmt' header. This should be 2. |
| wPole | High frequency emphasis value. |

| wBitsPerSample | nChannels | nBlockAlign |
| 3 | 1 | 3 |
| 3 | 2 | 6 |
| 4 | 1 | 1 |
| 4 | 2 | 1 |
This format is created and read by the OKI ADPCM chip set. This chip set is used by a number of card manufacturers.
IMA ADPCM Wave Type
The IMA ADPCM and the DVI ADPCM are identical. Please see the following section on the DVI ADPCM Wave Type for a full description.
# define WAVE_FORMAT_IMA_ADPCM (0x0011)
DVI ADPCM Wave Type
Added: 12/16/92
Author: Intel
Please note that the DVI ADPCM Wave Type is identical to the IMA ADPCM Wave Type.
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the time length of the data in samples.
WAVE Format Header
# define WAVE_FORMAT_DVI_ADPCM (0x0011)
typedef struct dvi_adpcmwaveformat_tag {
WAVEFORMATEX wfx;
WORD wSamplesPerBlock;
} DVIADPCMWAVEFORMAT;
| wFormatTag | This must be set to WAVE_FORMAT_DVI_ADPCM. |
| nChannels | Number of channels in the wave, 1 for mono, 2 for stereo... |
| nSamplesPerSec | Sample rate of the WAVE file. This should be 8000, 11025, 22050 or 44100. Other sample rates are allowed. |
| nAvgBytesPerSec | Total average data rate. Playback software can estimate the buffer size for a selected amount of time by using the <nAvgBytesPerSec> value. |
| nBlockAlign | This is dependent upon the number of bits per sample, as listed in the nBlockAlign table after this field list. The recommended block size for coding is 256 * <nChannels> bytes* min(1, ( Smaller values cause the block header to become a more significant storage overhead. But, it is up to the implementation of the coding portion of the algorithm to decide the optimal value for <nBlockAlign> within the given constraints (see above). The decoding portion of the algorithm must be able to handle any valid block size. Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so the value of <nBlockAlign> can be used for allocating buffers. |
| wBitsPerSample | This is the number of bits per sample of data. DVI ADPCM supports 3 or 4 bits per sample. |
| cbSize | The size in bytes of the extra information in the extended WAVE 'fmt' header. This should be 2. |
| wSamplesPerBlock | Count of the number of samples per channel per Block. |

| wBitsPerSample | nBlockAlign |
| 3 | (( N * 3 ) + 1 ) * 4 * nChannels |
| 4 | (N + 1) * 4 * nChannels |

Where N = 0, 1, 2, 3 . . .
Block
The block is defined to be <nBlockAlign> bytes in length. For DVI ADPCM this must be a multiple of 4 bytes since all information in the block is divided on 32 bit word boundaries.
The block has two parts, the header and the data. The two together are <nBlockAlign> bytes in length. The following diagram shows the Header and Data parts of one block.

Where:
M =
![]()
Header
This is a C structure that defines the DVI ADPCM block header.
typedef struct dvi_adpcmblockheader_tag {
int iSamp0;
BYTE bStepTableIndex;
BYTE bReserved;
} DVI_ADPCMBLOCKHEADER;
| Field | Description |
| iSamp0 | The first sample value of the block. When decoding, this will be used as the previous sample to start decoding with. |
| bStepTableIndex | The current index into the step table array. (0 - 88) |
| bReserved | This byte is reserved for future use. |
A block contains an array of <nChannels> header structures as defined above. This diagram gives a byte level description of the contents of each header word.

Data
The data words are interpreted differently depending on the number of bits per sample selected.
For 4 bit DVI ADPCM (where <wBitsPerSample> is equal to four) each data word contains eight sample codes as shown in the following diagram.

Where:
N = A data word for a given channel, in the range of 0 to
<nBlockAlign> / ( 4 * <nChannels> ) - <nChannels> - 1
P = ( N * 8 ) + 1
Sample 0 is always included in the block header for the channel.
Each Sample is 4 bits in length. Each block contains a total of <wSamplesPerBlock> samples for each channel.
For 3 bit DVI ADPCM (where <wBitsPerSample> is equal to three) each data word contains 10.667 sample codes. It takes three words to hold an integral number of sample codes at 3 bits per code. So for 3 bit DVI ADPCM, the number of data words is required to be a multiple of three words (12 bytes). These three words contain 32 sample codes as shown in the following diagram.

Where:
M = One of the channels, in the range of 1 to <nChannels>
N = The first data word in a group of three data words for channel M, in the range of 0 to <nBlockAlign> / ( 4 * <nChannels> ) - <nChannels> - 1
P = ( ( N / 3 ) * 32 ) + 1
Sample 0 is always included in the block header for the channel.
Each Sample is 3 bits in length. Each block contains a total of <wSamplesPerBlock> samples for each channel.
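The block layout described above implies a relationship between <nBlockAlign> and the number of samples per channel in a block. The following is a sketch derived from these statements: one 4 byte header word per channel carries Sample 0, each 4 bit data word carries 8 sample codes, and each group of three 3 bit data words carries 32 sample codes. The function name is hypothetical, and the <wSamplesPerBlock> value stored in the header remains the authoritative count.

```c
#include <stdint.h>

/* Samples per channel per block implied by the block layout.
   Returns 0 if nBlockAlign is not valid for the given bit depth. */
static uint32_t dvi_samples_per_block(uint32_t nBlockAlign,
                                      uint32_t nChannels,
                                      uint32_t wBitsPerSample)
{
    if (nChannels == 0 || nBlockAlign % (4 * nChannels) != 0)
        return 0;

    uint32_t wordsPerChannel = nBlockAlign / (4 * nChannels);
    if (wordsPerChannel == 0)
        return 0;

    uint32_t dataWords = wordsPerChannel - 1;   /* one header word per channel */

    if (wBitsPerSample == 4)
        return dataWords * 8 + 1;               /* 8 codes per word, plus Sample 0 */

    if (wBitsPerSample == 3) {
        if (dataWords % 3 != 0)                 /* data words come in groups of three */
            return 0;
        return (dataWords / 3) * 32 + 1;        /* 32 codes per group, plus Sample 0 */
    }
    return 0;
}
```

For example, a mono 4 bit file with nBlockAlign = 256 has 63 data words per block and therefore 505 samples per block.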
DVI ADPCM Algorithm
Each channel of the DVI ADPCM file can be encoded/decoded independently. Since the channels are encoded/decoded independently, this document is written as if only one channel is being decoded. Since the channels are interleaved, multiple channels may be encoded/decoded in parallel using independent local storage and temporaries.
Note that the process for encoding/decoding one block is independent from the process for the next block. Therefore the process is described for one block only, and may be repeated for other blocks.
The processes for encoding and decoding are discussed below.
Tables
The DVI ADPCM algorithm relies on two tables to encode and decode audio samples. These are the step table and the index table. The contents of these tables are fixed for this algorithm. The 3 and 4 bit versions of the DVI ADPCM algorithm use the same step table, which is:
const int StepTab[ 89 ] = {
7, 8, 9, 10, 11, 12, 13, 14,
16, 17, 19, 21, 23, 25, 28, 31,
34, 37, 41, 45, 50, 55, 60, 66,
73, 80, 88, 97, 107, 118, 130, 143,
157, 173, 190, 209, 230, 253, 279, 307,
337, 371, 408, 449, 494, 544, 598, 658,
724, 796, 876, 963, 1060, 1166, 1282, 1411,
1552, 1707, 1878, 2066, 2272, 2499, 2749, 3024,
3327, 3660, 4026, 4428, 4871, 5358, 5894, 6484,
7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794,
32767 };
The index table, however, differs between the two bit rates. For 4 bit DVI ADPCM the contents of the index table are:
const int IndexTab[ 16 ] = { -1, -1, -1, -1, 2, 4, 6, 8,
-1, -1, -1, -1, 2, 4, 6, 8 };
For 3 bit DVI ADPCM the contents of the index table are:
const int IndexTab[ 8 ] = { -1, -1, 1, 2,
-1, -1, 1, 2 };
Decoding
This section describes the algorithm used for decoding the 4 bit DVI ADPCM. This procedure must be followed for each block for each channel.
Get the first sample, Samp0, from the block header
Set the initial step table index, Index, from the block header
Output the first sample, Samp0
Set the previous Sample value:
    SampX-1 = Samp0
While there are still samples to decode
    Get the next sample code, SampX Code
    Calculate the new sample:
        Calculate the difference:
            Diff = 0
            if ( SampX Code & 4 )
                Diff = Diff + StepTab[ Index ]
            if ( SampX Code & 2 )
                Diff = Diff + ( StepTab[ Index ] >> 1 )
            if ( SampX Code & 1 )
                Diff = Diff + ( StepTab[ Index ] >> 2 )
            Diff = Diff + ( StepTab[ Index ] >> 3 )
        Check the sign bit:
            if ( SampX Code & 8 )
                Diff = -Diff
        SampX = SampX-1 + Diff
    Check for overflow and underflow errors:
        if SampX too large, make it the maximum allowable size (32767)
        if SampX too small, make it the minimum allowable size (-32768)
    Output the new sample, SampX
    Adjust the step table index:
        Index = Index + IndexTab[ SampX Code ]
    Check for step table index overflow and underflow:
        if Index too large, make it the maximum allowable size (88)
        if Index too small, make it the minimum allowable size (0)
    Save the previous Sample value:
        SampX-1 = SampX
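The body of the 4 bit decoding loop can be sketched as a C function that consumes one sample code. The tables are the ones given in the Tables section. The `DviState` type and the function name are hypothetical, and fetching the codes from the data words is left to the caller.

```c
#include <stdint.h>

static const int StepTab[89] = {
    7, 8, 9, 10, 11, 12, 13, 14,
    16, 17, 19, 21, 23, 25, 28, 31,
    34, 37, 41, 45, 50, 55, 60, 66,
    73, 80, 88, 97, 107, 118, 130, 143,
    157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658,
    724, 796, 876, 963, 1060, 1166, 1282, 1411,
    1552, 1707, 1878, 2066, 2272, 2499, 2749, 3024,
    3327, 3660, 4026, 4428, 4871, 5358, 5894, 6484,
    7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794,
    32767
};

static const int IndexTab4[16] = { -1, -1, -1, -1, 2, 4, 6, 8,
                                   -1, -1, -1, -1, 2, 4, 6, 8 };

/* Hypothetical per-channel decoder state, initialised from the block header. */
typedef struct {
    int sample;   /* previous output sample (SampX-1) */
    int index;    /* current step table index, 0..88 */
} DviState;

/* Decode one 4 bit sample code and return the new output sample. */
static int16_t dvi4_decode_code(DviState *s, unsigned code)
{
    int step = StepTab[s->index];
    int diff = step >> 3;
    if (code & 4) diff += step;
    if (code & 2) diff += step >> 1;
    if (code & 1) diff += step >> 2;
    if (code & 8) diff = -diff;           /* sign bit */

    int sample = s->sample + diff;
    if (sample > 32767)  sample = 32767;
    if (sample < -32768) sample = -32768;

    int index = s->index + IndexTab4[code & 15];
    if (index > 88) index = 88;
    if (index < 0)  index = 0;

    s->sample = sample;
    s->index = index;
    return (int16_t)sample;
}
```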
This section describes the algorithm used for decoding the 3 bit DVI ADPCM. This procedure must be followed for each block for each channel.
Get the first sample, Samp0, from the block header
Set the initial step table index, Index, from the block header
Output the first sample, Samp0
Set the previous Sample value:
    SampX-1 = Samp0
While there are still samples to decode
    Get the next sample code, SampX Code
    Calculate the new sample:
        Calculate the difference:
            Diff = 0
            if ( SampX Code & 2 )
                Diff = Diff + StepTab[ Index ]
            if ( SampX Code & 1 )
                Diff = Diff + ( StepTab[ Index ] >> 1 )
            Diff = Diff + ( StepTab[ Index ] >> 2 )
        Check the sign bit:
            if ( SampX Code & 4 )
                Diff = -Diff
        SampX = SampX-1 + Diff
    Check for overflow and underflow errors:
        if SampX too large, make it the maximum allowable size (32767)
        if SampX too small, make it the minimum allowable size (-32768)
    Output the new sample, SampX
    Adjust the step table index:
        Index = Index + IndexTab[ SampX Code ]
    Check for step table index overflow and underflow:
        if Index too large, make it the maximum allowable size (88)
        if Index too small, make it the minimum allowable size (0)
    Save the previous Sample value:
        SampX-1 = SampX
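The 3 bit variant differs only in the bit masks, the final shift and the index table. A sketch of one loop iteration in C, using the tables from the Tables section; the `Dvi3State` type and the function name are hypothetical.

```c
#include <stdint.h>

static const int StepTab[89] = {
    7, 8, 9, 10, 11, 12, 13, 14,
    16, 17, 19, 21, 23, 25, 28, 31,
    34, 37, 41, 45, 50, 55, 60, 66,
    73, 80, 88, 97, 107, 118, 130, 143,
    157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658,
    724, 796, 876, 963, 1060, 1166, 1282, 1411,
    1552, 1707, 1878, 2066, 2272, 2499, 2749, 3024,
    3327, 3660, 4026, 4428, 4871, 5358, 5894, 6484,
    7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794,
    32767
};

static const int IndexTab3[8] = { -1, -1, 1, 2,
                                  -1, -1, 1, 2 };

/* Hypothetical per-channel decoder state, initialised from the block header. */
typedef struct {
    int sample;   /* previous output sample (SampX-1) */
    int index;    /* current step table index, 0..88 */
} Dvi3State;

/* Decode one 3 bit sample code and return the new output sample. */
static int16_t dvi3_decode_code(Dvi3State *s, unsigned code)
{
    int step = StepTab[s->index];
    int diff = step >> 2;
    if (code & 2) diff += step;
    if (code & 1) diff += step >> 1;
    if (code & 4) diff = -diff;           /* sign bit */

    int sample = s->sample + diff;
    if (sample > 32767)  sample = 32767;
    if (sample < -32768) sample = -32768;

    int index = s->index + IndexTab3[code & 7];
    if (index > 88) index = 88;
    if (index < 0)  index = 0;

    s->sample = sample;
    s->index = index;
    return (int16_t)sample;
}
```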
Encoding
This section describes the algorithm used for encoding the 4 bit DVI ADPCM. This procedure must be followed for each block for each channel.
For the first block only, clear the initial step table index:
Index = 0
Get the first sample, Samp0
Create the block header:
Write the first sample, Samp0, to the header
Write the initial step table index, Index, to the header
Set the previously predicted sample value:
PredSamp = Samp0
While there are still samples to encode, and we're not at the end of the block
Get the next sample to encode, SampX
Calculate the new sample code:
Diff = SampX - PredSamp
Set the sign bit:
if ( Diff < 0 )
SampX Code = 8
Diff = -Diff
else
SampX Code = 0
Set the rest of the code:
if ( Diff >= StepTab[ Index ] )
SampX Code = SampX Code | 4
Diff = Diff - StepTab[ Index ]
if ( Diff >= ( StepTab[ Index ] >> 1 )
SampX Code = SampX Code | 2
Diff = Diff - ( StepTab[ Index ] >> 1 )
if ( Diff >= ( StepTab[ Index ] >> 2 )
SampX Code = SampX Code | 1
Save the sample code, SampX Code in the block
Predict the current sample based on the sample code:
Calculate the difference:
Diff = 0
if ( SampX Code & 4 )
Diff = Diff + StepTab[ Index ]
if ( SampX Code & 2 )
Diff = Diff + ( StepTab[ Index ] >> 1 )
if ( SampX Code & 1 )
Diff = Diff + ( StepTab[ Index ] >> 2 )
Diff = Diff + ( StepTab[ Index ] >> 3 )
Check the sign bit:
if ( SampX Code & 8 )
Diff = -Diff
PredSamp = PredSamp + Diff
Check for overflow and underflow errors:
if PredSamp too large, make it the maximum allowable size (32767)
if PredSamp too small, make it the minimum allowable size (-32768)
Adjust the step table index:
Index = Index + IndexTab[ SampX Code ]
Check for step table index overflow and underflow:
if Index too large, make it the maximum allowable size (88)
if Index too small, make it the minimum allowable size (0)
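The 4 bit encoding loop above can be sketched in C as follows. This is a minimal illustration under stated assumptions: Dvi4State and dvi4_encode_sample are hypothetical names, and the table values are reproduced on the assumption that they match the StepTab and 4 bit IndexTab definitions given earlier in this document. The encoder updates the predicted sample, not the input sample, so that it tracks exactly what the decoder will reconstruct.

```c
#include <stdint.h>

/* Assumed to match the StepTab defined earlier in this document. */
static const int StepTab[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

/* Assumed to match the 4 bit IndexTab defined earlier in this document. */
static const int IndexTab4[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8,
    -1, -1, -1, -1, 2, 4, 6, 8
};

/* Hypothetical per-channel encoder state. */
typedef struct {
    int32_t pred;   /* PredSamp, the predicted (decoder-side) sample */
    int     index;  /* current step table index, 0..88 */
} Dvi4State;

/* Encode one 16 bit sample and return its 4 bit sample code. */
static unsigned dvi4_encode_sample(Dvi4State *st, int16_t sample)
{
    int32_t step = StepTab[st->index];
    int32_t diff = (int32_t)sample - st->pred;
    unsigned code = 0;

    /* Set the sign bit. */
    if (diff < 0) { code = 8; diff = -diff; }

    /* Set the rest of the code by successive approximation. */
    if (diff >= step)        { code |= 4; diff -= step; }
    if (diff >= (step >> 1)) { code |= 2; diff -= step >> 1; }
    if (diff >= (step >> 2)) { code |= 1; }

    /* Predict the current sample exactly as the decoder will. */
    diff = step >> 3;
    if (code & 4) diff += step;
    if (code & 2) diff += step >> 1;
    if (code & 1) diff += step >> 2;
    if (code & 8) diff = -diff;

    st->pred += diff;
    if (st->pred > 32767)  st->pred = 32767;
    if (st->pred < -32768) st->pred = -32768;

    /* Adjust and clamp the step table index. */
    st->index += IndexTab4[code];
    if (st->index > 88) st->index = 88;
    if (st->index < 0)  st->index = 0;

    return code;
}
```

A caller would set pred to Samp0 and index to the value written in the block header, then call dvi4_encode_sample for each remaining sample in the block.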
This section describes the algorithm used for encoding the 3 bit DVI ADPCM. This procedure must be followed for each block for each channel.
For the first block only, clear the initial step table index:
Index = 0
Get the first sample, Samp0
Create the block header:
Write the first sample, Samp0, to the header
Write the initial step table index, Index, to the header
Set the previously predicted sample value:
PredSamp = Samp0
While there are still samples to encode, and we're not at the end of the block
Get the next sample to encode, SampX
Calculate the new sample code:
Diff = SampX - PredSamp
Set the sign bit:
if ( Diff < 0 )
SampX Code = 4
Diff = -Diff
else
SampX Code = 0
Set the rest of the code:
if ( Diff >= StepTab[ Index ] )
SampX Code = SampX Code | 2
Diff = Diff - StepTab[ Index ]
if ( Diff >= ( StepTab[ Index ] >> 1 ) )
SampX Code = SampX Code | 1
Save the sample code, SampX Code in the block
Predict the current sample based on the sample code:
Calculate the difference:
Diff = 0
if ( SampX Code & 2 )
Diff = Diff + StepTab[ Index ]
if ( SampX Code & 1 )
Diff = Diff + ( StepTab[ Index ] >> 1 )
Diff = Diff + ( StepTab[ Index ] >> 2 )
Check the sign bit:
if ( SampX Code & 4 )
Diff = -Diff
PredSamp = PredSamp + Diff
Check for overflow and underflow errors:
if PredSamp too large, make it the maximum allowable size (32767)
if PredSamp too small, make it the minimum allowable size (-32768)
Adjust the step table index:
Index = Index + IndexTab[ SampX Code ]
Check for step table index overflow and underflow:
if Index too large, make it the maximum allowable size (88)
if Index too small, make it the minimum allowable size (0)
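The 3 bit encoding loop above can be sketched in C in the same way. Again this is only an illustration: Dvi3EncState and dvi3_encode_sample are hypothetical names, and the table values are reproduced on the assumption that they match the StepTab and 3 bit IndexTab definitions given earlier in this document.

```c
#include <stdint.h>

/* Assumed to match the StepTab defined earlier in this document. */
static const int StepTab[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

/* Assumed to match the 3 bit IndexTab defined earlier in this document. */
static const int IndexTab3[8] = { -1, -1, 1, 2, -1, -1, 1, 2 };

/* Hypothetical per-channel encoder state. */
typedef struct {
    int32_t pred;   /* PredSamp, the predicted (decoder-side) sample */
    int     index;  /* current step table index, 0..88 */
} Dvi3EncState;

/* Encode one 16 bit sample and return its 3 bit sample code. */
static unsigned dvi3_encode_sample(Dvi3EncState *st, int16_t sample)
{
    int32_t step = StepTab[st->index];
    int32_t diff = (int32_t)sample - st->pred;
    unsigned code = 0;

    /* Set the sign bit. */
    if (diff < 0) { code = 4; diff = -diff; }

    /* Set the rest of the code. */
    if (diff >= step)        { code |= 2; diff -= step; }
    if (diff >= (step >> 1)) { code |= 1; }

    /* Predict the current sample exactly as the decoder will. */
    diff = step >> 2;
    if (code & 2) diff += step;
    if (code & 1) diff += step >> 1;
    if (code & 4) diff = -diff;

    st->pred += diff;
    if (st->pred > 32767)  st->pred = 32767;
    if (st->pred < -32768) st->pred = -32768;

    /* Adjust and clamp the step table index. */
    st->index += IndexTab3[code];
    if (st->index > 88) st->index = 88;
    if (st->index < 0)  st->index = 0;

    return code;
}
```

Compared with the 4 bit case, only the bit positions and the final rounding shift change: the sign moves to bit 2 and the prediction adds StepTab[ Index ] >> 2 instead of StepTab[ Index ] >> 3.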
DSP Solutions (formerly Digispeech) Wave Types
Added: 05/22/92
Author: Digispeech
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the time length of the data in samples.
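Because the fact chunk gives the length in samples, the playing time follows by dividing by the sample rate. A minimal sketch in C, assuming the sample count has already been read from the fact chunk; the function name and the parameter name sampleLength are hypothetical:

```c
#include <stdint.h>

/* Hypothetical helper: playing time in milliseconds, given the sample
   count from the fact chunk and nSamplesPerSec from the 'fmt' chunk.
   Returns 0 if the sample rate is zero. */
static uint32_t wave_duration_ms(uint32_t sampleLength, uint32_t nSamplesPerSec)
{
    if (nSamplesPerSec == 0)
        return 0;
    /* Widen to 64 bits so the multiplication cannot overflow. */
    return (uint32_t)(((uint64_t)sampleLength * 1000u) / nSamplesPerSec);
}
```

For example, 4000 samples at 8000 samples per second come out as 500 milliseconds.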
WAVE Format Header
# define WAVE_FORMAT_DIGISTD (0x0015)
# define WAVE_FORMAT_DIGIFIX (0x0016)
| wFormatTag | This must be set to either WAVE_FORMAT_DIGISTD or WAVE_FORMAT_DIGIFIX. |
| nChannels | Number of channels in the wave. (1 for mono) |
| nSamplesPerSec | Frequency of the sample rate of the wave file (8000). This value is also used by the fact chunk to determine the length in time units of the data. |
| nAvgBytesPerSec | Average data rate (1100 for DIGISTD or 1625 for DigiFix). Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | Block alignment of 2 for DIGISTD and 26 for DigiFix. Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |
| wBitsPerSample | This is the number of bits per sample of data. |
| cbSize | The size in bytes of the extra information in the extended WAVE 'fmt' header. This should be zero. |
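As an illustration of how playback software might combine nAvgBytesPerSec and nBlockAlign, the following C sketch estimates a buffer size for a given duration and rounds it up to a whole number of blocks. The function name buffer_bytes and the millisecond parameter are hypothetical; this is one possible policy, not a required one.

```c
#include <stdint.h>

/* Hypothetical helper: bytes needed to hold about `ms` milliseconds of
   data, rounded up to a multiple of nBlockAlign. */
static uint32_t buffer_bytes(uint32_t nAvgBytesPerSec, uint16_t nBlockAlign,
                             uint32_t ms)
{
    /* Estimate from the average data rate, rounding up. */
    uint64_t bytes = ((uint64_t)nAvgBytesPerSec * ms + 999u) / 1000u;

    if (nBlockAlign == 0)
        return (uint32_t)bytes;

    /* Round up to a whole number of blocks. */
    uint64_t blocks = (bytes + nBlockAlign - 1u) / nBlockAlign;
    return (uint32_t)(blocks * nBlockAlign);
}
```

With the DigiFix values (1625 bytes per second, 26 byte blocks), one second of data needs 1638 bytes, which is 63 whole blocks.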
The definition of the data contained in the DIGISTD and DigiFix formats is considered proprietary information of Digispeech. They can be contacted at:
DSP Solutions, Inc.
2464 Embarcadero Way
Palo Alto, CA 94303
DIGISTD is a format used in a compression technique developed by Digispeech, Inc. The DIGISTD format provides good speech quality with an average rate of about 1100 bytes/second. The blocks (or buffers) in this format cannot be cyclically repeated.
DigiFix is a format used in a compression technique developed by Digispeech, Inc. The DigiFix format provides good speech quality (similar to DIGISTD) with an average rate of exactly 1625 bytes/second. This format uses blocks 26 bytes long.
Yamaha ADPCM
Added 09/25/92
Author: Yamaha
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the time length of the data in samples.
WAVE Format Header
# define WAVE_FORMAT_YAMAHA_ADPCM (0x0020)
| wFormatTag | This must be set to WAVE_FORMAT_YAMAHA_ADPCM. |
| nChannels | Number of channels in the wave, 1 for mono, 2 for stereo. |
| nSamplesPerSec | Frequency of the sample rate of the wave file. This should be 5125, 7350, 9600, 11025, 22050, or 44100 Hz. Other sample rates are not allowed. |
| nAvgBytesPerSec | Average data rate. Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | This is dependent upon the number of bits per sample. |
| wBitsPerSample | nBlockAlign |
| 4 | 1 |
| 4 | 1 |
| wBitsPerSample | This is the number of bits per sample of YADPCM. Currently only 4 bits per sample is defined. Other values are reserved. |
| cbSize | The size in bytes of the extra information in the extended WAVE 'fmt' header. This should be zero. |
This format is created and read by the Yamaha chip included in the Gold Sound Standard (GSS), which is implemented on a number of manufacturers' boards. The algorithm and conversion routines are published in the source code provided in YADPCM.C with this technote.
Sonarc™ Compression
Added 10/21/92
Author: Speech Compression
Speech Compression has developed a new compression algorithm which, unlike ADPCM, is capable of lossless compression of digitized audio files to a degree far greater (50-60%) than that achievable with other compressors such as PKZIP and LHarc. "Lossy" compression is possible with even higher ratios. Information about the algorithm is available from the address below.
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the time length of the data in samples.
WAVE Format Header
typedef struct sonarcwaveformat_tag {
WAVEFORMATEX wfx;
WORD wCompType;
} SONARCWAVEFORMAT;
# define WAVE_FORMAT_SONARC (0x0021)
| wFormatTag | This must be set to WAVE_FORMAT_SONARC. |
| nChannels | Number of channels in the wave, 1 for mono, 2 for stereo. |
| nSamplesPerSec | Frequency of the sample rate of the wave file. This should be 11025, 22050, or 44100 Hz. Other sample rates are not allowed. |
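| nAvgBytesPerSec | Average data rate. Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | The valid values have not been defined. Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |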
| wBitsPerSample | This is the number of bits per sample of SONARC. |
| cbSize | The size in bytes of the extra information in the extended WAVE 'fmt' header. This should be 2. |
| wCompType | This value is not yet defined. |
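The cbSize value of 2 is simply the size of the fields that follow WAVEFORMATEX in the extended structure. The C sketch below shows one way to derive it instead of hard-coding it. The stdint typedefs stand in for the Windows WORD and DWORD types, byte packing is assumed to match the on-disk layout, and sonarc_init_header is a hypothetical helper; fields whose values this section leaves undefined are zeroed.

```c
#include <stdint.h>
#include <string.h>

typedef uint16_t WORD;   /* stand-ins for the Windows types */
typedef uint32_t DWORD;

#define WAVE_FORMAT_SONARC (0x0021)

#pragma pack(push, 1)    /* assumed: no padding, as in the file layout */
typedef struct waveformat_extended_tag {
    WORD  wFormatTag;
    WORD  nChannels;
    DWORD nSamplesPerSec;
    DWORD nAvgBytesPerSec;
    WORD  nBlockAlign;
    WORD  wBitsPerSample;
    WORD  cbSize;
} WAVEFORMATEX;

typedef struct sonarcwaveformat_tag {
    WAVEFORMATEX wfx;
    WORD wCompType;
} SONARCWAVEFORMAT;
#pragma pack(pop)

/* Hypothetical helper: fill in the fields this section defines.
   Undefined fields (nAvgBytesPerSec, nBlockAlign, wCompType) stay zero. */
static void sonarc_init_header(SONARCWAVEFORMAT *fmt, WORD nChannels,
                               DWORD nSamplesPerSec)
{
    memset(fmt, 0, sizeof *fmt);
    fmt->wfx.wFormatTag     = WAVE_FORMAT_SONARC;
    fmt->wfx.nChannels      = nChannels;
    fmt->wfx.nSamplesPerSec = nSamplesPerSec;
    /* Extra bytes after WAVEFORMATEX: just wCompType, so 2. */
    fmt->wfx.cbSize = (WORD)(sizeof(SONARCWAVEFORMAT) - sizeof(WAVEFORMATEX));
}
```

The same subtraction gives the cbSize of any other extended format structure that begins with a WAVEFORMATEX member.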
"Sonarc" is a trademark of Speech Compression.
To get information on this format please contact:
Speech Compression
1682 Langley Ave.
Irvine, CA 92714
Telephone: 714-660-7727 Fax: 714-660-7155
Creative Labs ADPCM
Added 10/01/92
Author: Creative Labs
Creative has defined a new ADPCM compression scheme, which will be implemented on their hardware and will support real-time compression and decompression. They do not provide a description of this algorithm. Information about the algorithm is available from the address below.
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the time length of the data in samples.
WAVE Format Header
typedef struct creative_adpcmwaveformat_tag {
WAVEFORMATEX wfx;
WORD wRevision;
} CREATIVEADPCMWAVEFORMAT;
# define WAVE_FORMAT_CREATIVE_ADPCM (0x0200)
| wFormatTag | This must be set to WAVE_FORMAT_CREATIVE_ADPCM. |
| nChannels | Number of channels in the wave, 1 for mono, 2 for stereo. |
| nSamplesPerSec | Frequency of the sample rate of the wave file. This should be 8000, 11025, 22050, or 44100 Hz. Other sample rates are not allowed. |
| nAvgBytesPerSec | Average data rate. Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | This is dependent upon the number of bits per sample. |
| wBitsPerSample | nChannels | nBlockAlign |
| 4 | 1 | 1 |
| 4 | 2 | 1 |
| | Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |
| wBitsPerSample | This is the number of bits per sample of CADPCM. |
| cbSize | The size in bytes of the extra information in the extended WAVE 'fmt' header. This should be 2. |
| wRevision | Revision of algorithm. This should be one for the current definition. |
To get information on this format please contact:
Creative Developer Support
1901, McCarthy Blvd, Milpitas, CA 95035.
Tel : 408-428 6644 Fax : 408-428 6655
DSP Group Wave Type
Added: 01/04/93
Author: Paul Beard, DSP Group
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the length of the data in samples.
WAVE Format Header
# define WAVE_FORMAT_DSPGROUP_TRUESPEECH (0x0022)
| wFormatTag | This must be set to WAVE_FORMAT_DSPGROUP_TRUESPEECH. |
| nChannels | Number of channels in the wave, 1 for mono. |
| nSamplesPerSec | Frequency of the sample rate of the wave file. This should be 8000. |
| nAvgBytesPerSec | Average data rate (1067). Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | This is the block alignment of the data in bytes (32). Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |
| wBitsPerSample | This is the number of bits per sample of TRUESPEECH. Not used; set to zero. |
| cbExtraSize | The size in bytes of the extra information in the extended WAVE 'fmt' header. This should be 32. |
| wRevision | Revision no (1,...) |
| nSamplesPerBlock | Number of samples per block (240). |
(<nAvgBytesPerSec> = <nSamplesPerSec> / <nSamplesPerBlock> * <nBlockAlign>)
The definition of the data contained in the TRUESPEECH format is considered proprietary information of DSP Group Inc. They can be contacted at:
DSP Group Inc.,
4050 Moorpark Ave.,
San Jose CA. 95117
(408) 985 0722
TRUESPEECH is a format used in a compression technique developed by DSP Group Inc. The TRUESPEECH format provides high quality telephony bandwidth voice vocoding at a rate of 1067 bytes per second. This format uses blocks 32 bytes long.
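The figures quoted for TRUESPEECH are consistent with one another: 8000 samples per second in blocks of 240 samples, each occupying 32 bytes, gives 8000 / 240 * 32, or about 1066.7 bytes per second, which rounds to the stated 1067. The C sketch below shows that arithmetic, on the assumption that the average rate of a fixed-block format is simply blocks per second times block size; the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: average data rate of a fixed-block format,
   rounded to the nearest byte per second. Returns 0 if
   nSamplesPerBlock is zero. */
static uint32_t block_avg_bytes_per_sec(uint32_t nSamplesPerSec,
                                        uint32_t nSamplesPerBlock,
                                        uint32_t nBlockAlign)
{
    if (nSamplesPerBlock == 0)
        return 0;
    /* Multiply first, in 64 bits, so no precision is lost. */
    uint64_t num = (uint64_t)nSamplesPerSec * nBlockAlign;
    return (uint32_t)((num + nSamplesPerBlock / 2) / nSamplesPerBlock);
}
```

Called with the TRUESPEECH values (8000, 240, 32), this yields 1067.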
Echo Speech Wave Type
Added: 01/21/93
Author: Echo Speech Corporation
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the length of the data in samples.
WAVE Format Header
# define WAVE_FORMAT_ECHOSC1 (0x0023)
| wFormatTag | This must be set to WAVE_FORMAT_ECHOSC1. |
| nChannels | Number of channels in the wave, always 1 for mono. |
| nSamplesPerSec | Frequency of the sample rate of the wave file. This should be 11025. |
| nAvgBytesPerSec | Average data rate (450). Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | This is the block alignment of the data in bytes (6). Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |
| wBitsPerSample | This is the number of bits per sample. Not used; set to zero. |
| cbSize | The size in bytes of the extra information in the extended WAVE 'fmt' header. This should be 0. |
The definition of the data contained in the ECHO SC-1 format is considered proprietary information of Echo Speech Corporation. They can be contacted at:
Echo Speech Corporation
6460 Via Real
Carpinteria, CA. 93013
805 684-4593
ECHO SC-1 is a format used in a compression technique developed by Echo Speech Corporation. ECHO SC-1 format provides excellent speech quality with an average data rate of exactly 450 bytes/second. This format uses blocks 6 bytes long.
ECHO is a registered trademark of Echo Speech Corporation.
AUDIOFILE Wave Type AF36
Added: April 29, 1993
Author: AudioFile
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the length of the data in samples.
WAVE Format Header
# define WAVE_FORMAT_AUDIOFILE_AF36 (0x0024)
| wFormatTag | This must be set to WAVE_FORMAT_AUDIOFILE_AF36 |
| nChannels | Number of channels in the wave (1 for mono). |
| nSamplesPerSec | Frequency of the sample rate of the wave file. |
| nAvgBytesPerSec | Average data rate. Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | Block alignment of the data. Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |
| wBitsPerSample | This is the number of bits per sample of data. |
| cbSize | The size in bytes of the extra information in the extended WAVE 'fmt' header. |
The AudioFile AF36 format provides very high compression for speech-based waveform audio. (Relative to 11 kHz, 16-bit PCM, a compression ratio of 36-to-1 is achieved with AF36.)
For more information on AF36 and other AudioFile host-based and DSP-based compression software contact:
AudioFile, Inc.
Four Militia Drive
Lexington, MA, 02173
(617) 861-2996
Audio Processing Technology Wave Type
Added: 06/22/93
Author: Calypso Software Limited
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the length of the data in samples.
WAVE Format Header
# define WAVE_FORMAT_APTX (0x0025)
| wFormatTag | This must be set to WAVE_FORMAT_APTX. |
| nChannels | Number of channels in the wave, 1 for mono, 2 for stereo. |
| nSamplesPerSec | Frequency of the sample rate of the wave file. (8000, 11025, 22050, 44100, 48000) |
| nAvgBytesPerSec | Average data rate = nChannels * nSamplesPerSec / 2 (16-bit audio). Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | Should be set to 2 (bytes) for mono data or 4 (bytes) for stereo. For mono data, 4 sixteen-bit samples will be compressed into 1 sixteen-bit word. For stereo data, 4 sixteen-bit left channel samples will be compressed into the first 16-bit word and 4 sixteen-bit right channel samples will be compressed into the next 16-bit word. Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |
| wBitsPerSample | This is the number of bits per sample. Not used; set to four. |
| cbSize | The size in bytes of the extra information in the extended WAVE 'fmt' header. This should be 0 (zero). |
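The two derived header fields for APTX follow directly from the channel count and sample rate. A minimal C sketch of the rules in the table above; the AptxRates structure and the aptx_rates function are hypothetical names used only for illustration:

```c
#include <stdint.h>

/* Hypothetical container for the two derived header fields. */
typedef struct {
    uint16_t nBlockAlign;
    uint32_t nAvgBytesPerSec;
} AptxRates;

/* Derive nBlockAlign and nAvgBytesPerSec as described in the table:
   one 16 bit word per channel per block, and 4 bits (half a byte)
   per sample per channel. */
static AptxRates aptx_rates(uint16_t nChannels, uint32_t nSamplesPerSec)
{
    AptxRates r;
    r.nBlockAlign     = (uint16_t)(2u * nChannels);
    /* Integer division truncates for odd products such as 11025 mono. */
    r.nAvgBytesPerSec = nChannels * nSamplesPerSec / 2u;
    return r;
}
```

For stereo data at 44100 samples per second this gives a block alignment of 4 bytes and an average rate of 44100 bytes per second, one quarter of the 176400 bytes per second of the equivalent 16-bit PCM.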
The definition of the data contained in the APTX format is considered proprietary information of Audio Processing Technology Limited. They can be contacted at:
Audio Processing Technology Limited
Edgewater Road
Belfast, Northern Ireland, BT3 9QJ
Tel 44 232 371110
Fax 44 232 371137
This format is a proprietary audio format using 4:1 compression, i.e. 16 bits of audio are compressed to 4 bits. It is only encoded/decoded by dedicated hardware from MM_APT.
AUDIOFILE Wave Type AF10
Added: June 22, 1993
Author: AudioFile
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the length of the data in samples.
WAVE Format Header
# define WAVE_FORMAT_AUDIOFILE_AF10 (0x0026)
| wFormatTag | This must be set to WAVE_FORMAT_AUDIOFILE_AF10 |
| nChannels | Number of channels in the wave (1 for mono). |
| nSamplesPerSec | Frequency of the sample rate of the wave file. |
| nAvgBytesPerSec | Average data rate. Playback software can estimate the buffer size using the <nAvgBytesPerSec> value. |
| nBlockAlign | Block alignment of the data. Playback software needs to process a multiple of <nBlockAlign> bytes of data at a time, so that the value of <nBlockAlign> can be used for buffer alignment. |
| wBitsPerSample | This is the number of bits per sample of data. |
| cbSize | The size in bytes of the extra information in the extended WAVE 'fmt' header. |
For more information on AF36 and other AudioFile host-based and DSP-based compression software contact:
AudioFile, Inc.
Four Militia Drive
Lexington, MA, 02173
(617) 861-2996
Dolby Labs AC-2 Wave Type
Added: 06/24/93
Author: Dolby Laboratories, Inc.
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the length of the data in samples.
WAVE Format Header
# define WAVE_FORMAT_DOLBY_AC2 (0x0030)
| wFormatTag | This must be set to WAVE_FORMAT_DOLBY_AC2 |
| nChannels | Number of channels, 1 for mono, 2 for stereo |
| nSamplesPerSec | Three sample rates allowed: 48000, 44100, 32000 samples per second |
| nAvgBytesPerSec | Average data rate: (nSamplesPerSec * nBlockAlign) / 512. |
| nBlockAlign | The block alignment (in bytes) of the data in <data-ck>. |
| nSamplesPerSec | nBlockAlign |
| 48000 | nChannels*168 |
| 44100 | nChannels*184 |
| 32000 | nChannels*190 |
| wBitsPerSample | Approximately 3 bits per sample |
| cbExtraSize | 2 extra bytes of information in format header |
| nAuxBitsCode | Auxiliary bits code indicating number of Aux. bits per block. The amount of audio data bits is reduced by this number in the decoder, such that the overall block size remains constant. |
| nAuxBitsCode | Number of Aux bits in block |
| 0 | 0 |
| 1 | 8 |
| 2 | 16 |
| 3 | 32 |
For the specific structure of the <data-ck>, contact:
Dolby Laboratories
100 Potrero Avenue
San Francisco, CA 94103-4813
Tel 415-558-0200
/* Dolby's AC-2 wave format structure definition */
typedef struct dolbyac2waveformat_tag {
WAVEFORMATEX wfx;
WORD nAuxBitsCode;
} DOLBYAC2WAVEFORMAT;
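The AC-2 header values in the tables above can be computed mechanically. The following C sketch encodes the nBlockAlign table, the nAvgBytesPerSec formula and the nAuxBitsCode table; the three function names are hypothetical, and returning 0 or -1 for values outside the tables is a choice made for this sketch, not something the format defines.

```c
#include <stdint.h>

/* Block alignment in bytes for the three allowed sample rates.
   Returns 0 for a sample rate the table does not list. */
static uint16_t ac2_block_align(uint32_t nSamplesPerSec, uint16_t nChannels)
{
    switch (nSamplesPerSec) {
    case 48000: return (uint16_t)(nChannels * 168);
    case 44100: return (uint16_t)(nChannels * 184);
    case 32000: return (uint16_t)(nChannels * 190);
    default:    return 0;
    }
}

/* Average data rate: (nSamplesPerSec * nBlockAlign) / 512. */
static uint32_t ac2_avg_bytes_per_sec(uint32_t nSamplesPerSec,
                                      uint16_t nBlockAlign)
{
    return (uint32_t)(((uint64_t)nSamplesPerSec * nBlockAlign) / 512u);
}

/* Number of auxiliary bits per block for a given nAuxBitsCode.
   Returns -1 for a code the table does not list. */
static int ac2_aux_bits(uint16_t nAuxBitsCode)
{
    static const int bits[4] = { 0, 8, 16, 32 };
    return nAuxBitsCode < 4 ? bits[nAuxBitsCode] : -1;
}
```

For a stereo file at 48000 samples per second, the block alignment is 336 bytes and the average data rate is 31500 bytes per second.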
Sierra ADPCM
Added 07/26/93
Author: Sierra Semiconductor Corp.
Sierra Semiconductor has developed a compression scheme similar to the standard CCITT ADPCM. This scheme has been implemented in Aria™-based sound boards and is capable of supporting compression and decompression in real time. A description of the algorithm is not available at this time.
Fact Chunk
This chunk is required for all WAVE formats other than WAVE_FORMAT_PCM. It stores file dependent information about the contents of the WAVE data. It currently specifies the time length of the data in samples.
WAVE Format Header
typedef struct sierra_adpcmwaveformat_tag {
EXTWAVEFORMAT ewf;
WORD wRevision;
} SIERRAADPCMWAVEFORMAT;
# define WAVE_FORMAT_SIERRA_ADPCM (0x0013)
| wFormatTag | This must be set to WAVE_FORMAT_SIERRA_ADPCM. | ||
| nChannels | Number of channels in the wave, 1 for mono, 2 for stereo. | ||
| nSamplesPerSec | |||