Playing Audio

audio is just a 1d signal; digital audio is a 1d discrete signal

each value in this 1d discrete signal is called an “audio sample”

you put audio samples into a buffer in the sound card, and then you tell the sound card to start playing the samples in said buffer

this buffer is of a limited size, so you have to be careful not to put too much in there at once

as the sound card is playing the samples, it is taking them out of the buffer, so there is room for new samples

you also want to watch out for the sound card running out of samples to play (an underflow); this results in an audible artifact

the best way to handle all this is to tell the sound card to call a callback when it needs new samples to play

your callback will be called by an internal sound card thread, so any data the callback touches should not be accessed from other threads, or if it is, it should be guarded with a mutex

              +-----------------------+
              |   Sound Card          |
You  ---->    |                       |  ----> Sound Card removes
put           |  +-----------------+  |        and plays them
samples       |  | Audio Samples   |  |
into          |  +-----------------+  |
buffer        |                       |
              +-----------------------+

Here’s how playing audio works.

Audio is basically a 1d signal. Digital audio is then just a 1d discrete signal: a sequence of values called “audio samples”.

You send these audio samples to the sound card, which then drives your speakers. The sound card converts each audio sample to a voltage and feeds that voltage to the speaker; the bigger the magnitude of the sample, the larger the voltage.
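To make “a sequence of values” concrete, here is a minimal sketch that builds the samples for a short pure tone. The 44100 samples-per-second rate, the 440 Hz frequency, and the convention that samples are floats between -1.0 and 1.0 are assumptions chosen for illustration; `sine_samples` is a hypothetical helper, not part of any library.

```python
import math

SAMPLE_RATE = 44100  # samples per second (assumed; a common audio rate)

def sine_samples(frequency_hz, duration_s, amplitude=0.5):
    """Return a list of samples for a pure tone.

    Each sample is a float; its magnitude never exceeds `amplitude`.
    """
    count = round(SAMPLE_RATE * duration_s)
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * n / SAMPLE_RATE)
        for n in range(count)
    ]

samples = sine_samples(440, 0.01)  # 10 ms of a 440 Hz tone
print(len(samples))                # number of samples in those 10 ms
print(samples[:3])                 # the first few values of the signal
```

A larger `amplitude` gives samples of larger magnitude, which is what ends up as a larger voltage at the speaker (a louder tone).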

The sound card has a buffer where you can place audio samples. This buffer has a finite size, so you have to ensure that you don’t overflow it. Additionally, as the sound card is “playing” your samples, you need to keep putting in new samples for it to play, so you don’t get weird gaps/artifacts in the played audio.

So generally, you want to put audio samples into the buffer at roughly the same rate the sound card is taking them out (i.e. playing them). Because the buffer holds a decent number of samples, you can also “buffer” up a certain number of them before you start playing. This way, if there is a slight pause in incoming samples, the sound card still has something to play.
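The overflow and underflow behaviour described above can be sketched with a toy buffer. This is only a simulation under stated assumptions: `SampleBuffer` is a hypothetical class standing in for the sound card’s buffer, the capacity of 8 is arbitrary, and padding with zeros (silence) on underflow is one possible policy, not necessarily what real hardware does.

```python
from collections import deque

class SampleBuffer:
    """Toy stand-in for the sound card's finite buffer (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.samples = deque()

    def write(self, new_samples):
        """Accept as many samples as fit; return how many were accepted."""
        room = self.capacity - len(self.samples)
        accepted = new_samples[:room]
        self.samples.extend(accepted)
        return len(accepted)

    def read(self, count):
        """Remove up to `count` samples; pad with silence on underflow."""
        available = min(count, len(self.samples))
        out = [self.samples.popleft() for _ in range(available)]
        underflowed = available < count
        out.extend([0.0] * (count - available))
        return out, underflowed

buf = SampleBuffer(capacity=8)
accepted = buf.write([0.1] * 10)   # only 8 fit; the other 2 are rejected
played, underflow = buf.read(4)    # the "sound card" plays 4 samples
refilled = buf.write([0.2] * 4)    # which frees room for 4 new ones
drained, starved = buf.read(12)    # asking for 12 when 8 remain underflows
print(accepted, underflow, refilled, starved)
```

Pre-filling the buffer before the first `read` is what gives you slack: the reader can keep going for a while even if `write` is briefly late.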

Rather than worrying about buffer overflow or underflow (no samples for the sound card to play) yourself, once you start playing you can tell the sound card to call a callback of your choosing whenever it needs new samples. In this callback, you simply place new samples in the audio buffer. Your callback will be called on a thread owned by the audio system rather than on your own thread, so be wary of that: whatever data your callback accesses, either make sure no other thread accesses it, or guard it with a mutex.

Here’s an example of doing this in Python, using the sounddevice library:

import sounddevice as sd

def on_samples_needed(audio_buffer, frames, time, status):
    audio_buffer[:] = 1 # fill entire audio buffer with 1's 
                        # (obviously *you* should put real audio 
                        # samples in here instead of just 1's)

# create an audio output stream, somewhere you can put audio samples
# - 'samplerate' is the rate at which samples in the stream are played
#   (44100 samples per second is a standard rate)
# - 'callback' is the function that is called when the audio buffer needs
#   new samples to play
audio_stream = sd.OutputStream(samplerate=44100, callback=on_samples_needed)

audio_stream.start() # tell sound card to start playing samples in this audio buffer

sd.sleep(1000) # keep the stream running for 1 second (1000 ms) so the callback gets called

audio_stream.stop() # tell sound card to stop playing samples in this audio buffer
audio_stream.close() # clean up any resources associated with this audio stream
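The callback above writes a constant into the buffer. If the real samples are produced on another thread, the data shared with the callback needs the mutex mentioned earlier. Here is a minimal sketch of one way to do that: `SharedSamples` is a hypothetical helper (not part of sounddevice), and the callback assumes a single output channel, i.e. the stream would be created with `channels=1`.

```python
import threading
from collections import deque

class SharedSamples:
    """Sample queue shared between a producer thread and the audio
    callback (hypothetical helper, not part of sounddevice)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queue = deque()

    def push(self, samples):
        """Called from your own thread to queue samples for playback."""
        with self._lock:
            self._queue.extend(samples)

    def pull(self, count):
        """Called from the audio callback; always returns `count` samples."""
        with self._lock:
            available = min(count, len(self._queue))
            out = [self._queue.popleft() for _ in range(available)]
        # Pad with silence if the producer has fallen behind.
        return out + [0.0] * (count - available)

shared = SharedSamples()

def on_samples_needed(audio_buffer, frames, time, status):
    # Assumes one output channel, so the buffer has shape (frames, 1).
    audio_buffer[:, 0] = shared.pull(frames)

# Simulate a producer thread queuing six samples, then two callback-sized pulls.
producer = threading.Thread(target=shared.push, args=([0.25] * 6,))
producer.start()
producer.join()
first = shared.pull(4)   # four queued samples
second = shared.pull(4)  # two queued samples, then two of silence
print(first, second)
```

The lock is held only while samples are moved in or out of the queue, so neither thread blocks the other for long.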