RTP Buffering - Frame Based Buffering
Introduction
In Song module version 8 a new RTP buffering method called frame based buffering was introduced. The algorithm calculates the audio buffer level in milliseconds rather than in bytes.
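To illustrate why a buffer level in milliseconds is more useful than one in bytes, the following minimal sketch converts a byte count of uncompressed audio into playback time. The function name and parameters are hypothetical and are not part of the Song module. The 16-bit sample size used for PCM is an assumption.

```python
def buffer_level_ms(buffered_bytes: int, sample_rate_hz: int,
                    channels: int, bytes_per_sample: int) -> float:
    """Convert a byte count of uncompressed audio into playback time in ms."""
    bytes_per_second = sample_rate_hz * channels * bytes_per_sample
    return buffered_bytes * 1000.0 / bytes_per_second


# The same 8000 bytes represent very different playback times:
ulaw_ms = buffer_level_ms(8000, 8000, 1, 1)   # uLaw 8kHz mono, 1 byte per sample
pcm_ms = buffer_level_ms(8000, 48000, 2, 2)   # PCM 48kHz stereo, assumed 16-bit samples
```

Here the same number of buffered bytes corresponds to a full second of uLaw audio but only about 42ms of 48kHz stereo PCM, which is why a byte-based level cannot express a format-independent delay.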
Features
Frame based buffering allows:
- configurable decoding delay with one frame accuracy
- synchronisation of several decoders to the same stream (just by configuring them to the same initial delay)
- stable delay over long periods of time
- automatic correction of clock difference between encoder and decoder
Applications
The following applications use frame based buffering:
Application Name | Version |
Streaming Client | 2.17 |
Annuncicom Full Duplex | 0.21 |
RTP STL | 2.01 |
Configuration
The only configuration parameter for the RTP decoder is the delay in milliseconds.
The delay parameter is the desired processing delay of the decoder (between the network input and the audio output). Please note that the end-to-end delay between the encoder and the decoder might be (significantly) different to the value configured.
In an ideal case the delay parameter would be 0ms; however, due to the device's internal buffers, a small delay (depending on the hardware) is inevitable. The delay value should also cover temporary network hiccups (jitter). For example, if the network sometimes delays packet delivery by 20ms due to temporary load, the configured value should not be less than 20ms.
The maximum configurable delay is limited by the device's internal buffer (64, 32 or 16kB).
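As a rough illustration of how the buffer size caps the delay, the sketch below estimates how much audio fits into a buffer at a given constant bitrate. It is a simplified upper bound and not the device's actual calculation. The function name is hypothetical, and 1kB is assumed to be 1024 bytes.

```python
def max_delay_upper_bound_ms(buffer_kb: int, bitrate_kbps: float) -> float:
    """Estimate the playback time (ms) that fits into the audio buffer.

    Assumes 1kB = 1024 bytes and that the whole buffer holds audio data.
    """
    buffer_bits = buffer_kb * 1024 * 8
    # bits divided by kbit/s yields milliseconds
    return buffer_bits / bitrate_kbps


sc_160 = max_delay_upper_bound_ms(64, 160)    # 64kB buffer, 160kbps stream
abcl_160 = max_delay_upper_bound_ms(32, 160)  # 32kB buffer, 160kbps stream
```

For 160kbps this gives roughly 3277ms and 1638ms. Other bitrates in the tables of this document deviate from this estimate by a few percent, so the tables should be treated as authoritative.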
Recommended Settings
The following table lists recommended delay values for various audio formats. Each value includes a jitter allowance of two frames and is independent of the hardware and software.
Audio format | Delay |
MP3 | 600ms |
uLaw/ALaw 8kHz mono | 444ms |
PCM 8kHz mono | 444ms |
uLaw/ALaw 12kHz mono | 316ms |
PCM 12kHz mono | 316ms |
uLaw/ALaw 24kHz mono | 188ms |
PCM 24kHz mono | 188ms |
uLaw/ALaw 32kHz mono | 156ms |
PCM 32kHz mono | 152ms |
PCM 44.1kHz stereo | 110ms |
PCM 44.1kHz mono | 79ms |
PCM 48kHz stereo | 72ms |
Minimum and Maximum Settings
This section explains the minimum and the maximum delay values for different audio formats and platforms.
The hardware is divided into two groups:
- Micronas (MAS) based devices: Annuncicom 100/155/200/1000, Exstreamer 1000
- VLSI based devices: Exstreamer 100/110/200
MP3 CBR
The following table shows the minimum and maximum possible delay for MP3 at constant bitrate. The maximum delay differs between the Streaming Client (SC), which has a 64kB audio buffer available, and ABCL (Annuncicom FDX, STL), which has only a 32kB buffer. The minimum delay includes 100ms of network jitter.
MP3 CBR bitrate | Min delay | Max delay (SC) | Max delay (ABCL) |
320kbps | 150ms | 1,588ms | 769ms |
256kbps | 163ms | 2,011ms | 987ms |
192kbps | 183ms | 2,741ms | 1,349ms |
160kbps | 200ms | 3,277ms | 1,638ms |
128kbps | 225ms | 4,121ms | 2,073ms |
64kbps | 350ms | 8,342ms | 4,246ms |
32kbps | 600ms | 16,784ms | 8,592ms |
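The minimum delays in the table above are consistent with a simple pattern: the 100ms jitter allowance plus the time needed to receive 16000 bits at the stream bitrate. This is an inference from the tabulated values, not a documented formula, and the names `min_delay_ms` and `fill_bits` are hypothetical. A minimal sketch:

```python
def min_delay_ms(bitrate_kbps: int, jitter_ms: int = 100,
                 fill_bits: int = 16000) -> int:
    """Jitter allowance plus the time to receive fill_bits at the bitrate.

    fill_bits = 16000 is inferred from the table, not documented.
    """
    # fill_bits / bitrate_kbps is in ms; the integer arithmetic rounds half up
    fill_ms = (2 * fill_bits + bitrate_kbps) // (2 * bitrate_kbps)
    return jitter_ms + fill_ms


# Minimum delays copied from the MP3 CBR table
TABLE_MIN_MS = {320: 150, 256: 163, 192: 183, 160: 200, 128: 225, 64: 350, 32: 600}
computed = {kbps: min_delay_ms(kbps) for kbps in TABLE_MIN_MS}
```

For bitrates not listed in the table, such a function could give a first estimate, which should then be verified on the actual device.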
MP3 VBR and ABR
With variable bitrate (VBR) or average bitrate (ABR), the minimum and maximum delay depend on the bitrate variation interval. The minimum delay is taken from the CBR table for the low end of the interval, whereas the maximum delay is the CBR value for the high end of the interval.
Please note that most MP3 encoders use the whole bitrate range starting from the lowest bitrate of 32kbps. For example, a 128kbps VBR stream varies from 32 to 128kbps.
MP3 Format | Min delay | Max delay (SC) | Max delay (ABCL) |
32-320kbps | 600ms | 1,588ms | 769ms |
32-256kbps | 600ms | 2,011ms | 987ms |
32-192kbps | 600ms | 2,741ms | 1,349ms |
32-160kbps | 600ms | 3,277ms | 1,638ms |
32-128kbps | 600ms | 4,121ms | 2,073ms |
32-64kbps | 600ms | 8,342ms | 4,246ms |
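The lookup rule described above can be sketched as follows. The dictionary holds the values from the MP3 CBR table; the names `CBR_LIMITS_MS` and `vbr_limits_ms` are hypothetical and serve only as an illustration.

```python
# bitrate in kbps -> (min delay, max delay SC, max delay ABCL), all in ms,
# copied from the MP3 CBR table
CBR_LIMITS_MS = {
    320: (150, 1588, 769),
    256: (163, 2011, 987),
    192: (183, 2741, 1349),
    160: (200, 3277, 1638),
    128: (225, 4121, 2073),
    64: (350, 8342, 4246),
    32: (600, 16784, 8592),
}


def vbr_limits_ms(low_kbps: int, high_kbps: int) -> tuple:
    """Return (min, max SC, max ABCL) for a VBR/ABR bitrate interval."""
    # Minimum delay comes from the low end of the interval
    min_delay = CBR_LIMITS_MS[low_kbps][0]
    # Maximum delays come from the high end of the interval
    _, max_sc, max_abcl = CBR_LIMITS_MS[high_kbps]
    return (min_delay, max_sc, max_abcl)


limits_128 = vbr_limits_ms(32, 128)
```

With these inputs, `limits_128` reproduces the 32-128kbps row of the table: 600ms, 4,121ms and 2,073ms.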
PCM
In uncompressed audio (PCM, uLaw or ALaw) the minimum and maximum delay depend on the bit rate and on the hardware.
The following table lists minimum and maximum settings for all standard RTP audio formats:
Format | Min delay MAS | Min delay VLSI | Max delay (SC) | Max delay (ABCL) | Max delay (ABCL full duplex) |
uLaw/ALaw 8kHz mono | 80ms | 424ms | 8171ms | 4075ms | 2027ms |
PCM 8kHz mono | 60ms | 424ms | 4075ms | 2027ms | 1003ms |
uLaw/ALaw 12kHz mono | 67ms | 296ms | 5441ms | 2710ms | 1345ms |
PCM 12kHz mono | 54ms | 296ms | 2710ms | 1345ms | 662ms |
uLaw/ALaw 24kHz mono | 54ms | 168ms | 2710ms | 1345ms | 662ms |
PCM 24kHz mono | 47ms | 168ms | 1345ms | 662ms | 321ms |
uLaw/ALaw 32kHz mono | 50ms | 136ms | 2027ms | 1003ms | 491ms |
PCM 32kHz mono | 43ms | 134ms | 1005ms | 493ms | 237ms |
PCM 44.1kHz stereo | 31ms | 97ms | 729ms | 358ms | 172ms |
PCM 44.1kHz mono | 16ms | 72ms | 364ms | 179ms | 86ms |
PCM 48kHz stereo | 15ms | 66ms | 335ms | 164ms | 79ms |
Multiple Device Synchronisation
Multiple devices receiving the same RTP stream can be configured to play in sync by entering the same delay parameter.
Barix recommends using broadcast or multicast together with synchronisation; otherwise, network delivery to different locations might cause a small inaccuracy of a few milliseconds.
Deliberate Delays
In some applications it is desirable to delay the audio artificially, for example in a tunnel, to compensate for the delay caused by the distance between the devices.
An artificial delay can be introduced by configuring the devices with different delay values, for example 100ms, 120ms, 140ms, 160ms and so on.
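The sketch below shows one possible way to derive such staggered values. It assumes that the delay to be compensated is the acoustic propagation time between loudspeakers and uses a speed of sound of 343m/s. Both are assumptions, and the function name and distances are hypothetical.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air at room temperature


def staggered_delays_ms(base_delay_ms: int, distances_m: list) -> list:
    """Return one delay per device: base delay plus sound travel time.

    distances_m holds each device's distance from the first device.
    """
    return [
        round(base_delay_ms + d / SPEED_OF_SOUND_M_PER_S * 1000.0)
        for d in distances_m
    ]


# Four devices spaced 6.86m apart, i.e. about 20ms of sound travel per step
delays = staggered_delays_ms(100, [0.0, 6.86, 13.72, 20.58])
```

Under these assumptions the result is 100ms, 120ms, 140ms and 160ms. The base delay must still lie within the minimum and maximum values for the audio format in use.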