Earlier in this series I talked about an essential component in the analogue signal path - the phono stage or phono preamp. In this article I'm going to introduce the DAC or 'Digital to Analogue Converter'.


DACs are a hotly debated subject. Spend just five minutes online and you'll find thousands of amateur electronics experts lining up to tell you that all DACs are the same, and that a $3 chip you can buy from RS Components does the same as a $10,000 DAC from a high-end manufacturer. You'll even find people claiming that it's impossible to hear the differences. I'm not going to talk about that side of things just yet. Today I'm going to cover the groundwork of how a DAC actually works, and why it's necessary. This should give rise to some questions about implementation, which we can cover in later articles, and which should help explain why so many different sorts of DACs are available and why, plainly, they do, in fact, sound different.

What's the analogue in a DAC?

Sound is a type of energy made by vibrations. When something vibrates, say the wings of an insect, those vibrations cause air particles to move - to vibrate themselves - and bump into more air particles, which in turn vibrate, until there's not enough energy to cause further vibrations.

Music is made up of a whole mass of complicated vibrations, which happen in space and time. If you're at a live performance of a string quartet, you are directly experiencing the sound waves, through your ears. If you're listening to a recording of a performance, those sound waves needed to be captured in some way, so that they could be reproduced.

In an earlier article, I mentioned Vaughan Williams using a phonograph to record carols. Invented in 1877, it worked by capturing sound waves via a horn, and etching a spiral groove onto a rotating cylinder. In essence it tried to "write the sound" - hence "phonograph". Phonographs (which later became gramophones) could also be used to play back the recordings. This is pure analogue - the capturing and replay of sound waves.

phonograph

When we hear a musical note, and describe its pitch, for example A4 - so-called concert pitch - we're talking about a wave with a frequency of 440Hz.

If I were to connect a loudspeaker to a signal generator, and send a 440Hz signal through it, you might recognise the pitch of the note, but it would sound nothing like an instrument, because in reality instruments don't produce perfect waves. In fact, the notes they produce are made up of various harmonics (sine waves whose frequencies are integer multiples of the fundamental frequency). As you add these together, you get a more complicated wave.
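To make the idea of adding harmonics concrete, here is a minimal Python sketch. The function names and the harmonic amplitudes are assumptions chosen purely for illustration; they are not measurements of any real instrument.

```python
import math

def pure_tone(t, frequency_hz=440.0):
    """A perfect sine wave, as a signal generator would produce."""
    return math.sin(2 * math.pi * frequency_hz * t)

def harmonic_wave(t, fundamental_hz=440.0, amplitudes=(1.0, 0.5, 0.33, 0.25)):
    """Sum sine waves at integer multiples of the fundamental frequency.

    amplitudes[0] scales the fundamental, amplitudes[1] the second harmonic,
    and so on. The values are illustrative assumptions only.
    """
    return sum(
        amp * math.sin(2 * math.pi * fundamental_hz * n * t)
        for n, amp in enumerate(amplitudes, start=1)
    )

# Compare the two waves at a few instants within one cycle of A4.
period = 1 / 440.0
for fraction in (0.0, 0.125, 0.25, 0.375, 0.5):
    t = fraction * period
    print(f"{fraction:5.3f} of a cycle: pure={pure_tone(t):+.3f} "
          f"with harmonics={harmonic_wave(t):+.3f}")
```

Both waves repeat 440 times a second, so they have the same pitch, but the shape within each cycle is different.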

Here's a computer-generated cello note, taking harmonics into account:

[Image: waveform of a computer-generated cello note]

And here's a recording of a real cello:

[Image: waveform of a recorded cello note]

So, immediately you can see that real life sounds are complex waveforms. Now try to imagine how complicated the sound of a piano being played would be. Now add in an orchestra.

These are the 'analogue' sounds that we perceive at a live concert.

What's the digital in a DAC?

Most modern music is captured and stored digitally. There are lots of reasons for this, but size, reliability, ease of maintenance, storage capacity, and relative lack of deterioration of media are amongst the reasons for its uptake.

To store and manipulate music digitally, we need to make a representation of the complex series of sound waves, which can be understood by a computer. Effectively this is a series of numbers, which within the computer are stored as collections of on and off states - binary words.

Most music - music on the radio, on a streaming service, on CD (and, incidentally, much which has subsequently been pressed to vinyl) - is stored and manipulated in digital form.

How do we go from analogue to digital?

Making a digital representation of an analogue waveform is conceptually simple. In practice it's very complicated, but it's worth understanding how it actually happens, from first principles.

What we need to do is take a number of samples from the wave. You can think of the original wave as having, effectively, an infinite number of samples. We can't take an infinite number of samples, but we can take a large enough number as to get very very close to the original waveform.

If we record a sound, using a microphone, the output we get is still a wave, but a wave expressed with electricity rather than sound. In order to take samples, we need to agree a range with which to represent the wave. These are voltage levels which cover the highest and lowest points of the voltage, and all changes in between.

Next we take a point along the wave, and using a 'divide and question' approach we simply say: is this value in the top half or bottom half of the agreed voltage range? If it's in the top half, we record 1 and if it's in the bottom half, we record 0. Now we go to the half in which our sample was found (say the top half), and we divide it again. Is the point on our wave in the top half or the bottom half? Again, we write a 1 if it's in the top, and a 0 if it's in the bottom. We can keep doing this as many times as we like. Each time we divide, we add another "bit" of information. If we decided to do this three times, our range, in binary terms, would be 000-111 - 8 possible values. This would allow us to reproduce the sound wave - with gaps. The more times we divide, the greater the 'resolution'.
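The 'divide and question' steps can be sketched in a few lines of Python. This is only an illustration of the idea under assumed names and an assumed voltage range of -1V to +1V; it is not how any particular converter chip is implemented.

```python
def divide_and_question(voltage, v_min, v_max, bits):
    """Return the binary word for `voltage` as a string of '0's and '1's."""
    low, high = v_min, v_max
    word = ""
    for _ in range(bits):
        midpoint = (low + high) / 2
        if voltage >= midpoint:
            word += "1"
            low = midpoint    # the sample is in the top half, so keep that half
        else:
            word += "0"
            high = midpoint   # the sample is in the bottom half
    return word

# Three questions give a 3 bit word, one of 8 possible values.
print(divide_and_question(0.7, -1.0, 1.0, 3))   # 110
print(divide_and_question(-0.9, -1.0, 1.0, 3))  # 000
```

Asking more questions (a larger `bits`) narrows the range further each time, which is exactly what greater resolution means here.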

We then need to do this an agreed number of times along the wave. So our recording is defined by the number of times we divide and question (the bit depth), and by the number of samples we take each second (the sample rate). In our primitive example, we could take 8 samples a second, and divide and question 8 times - that would be '8/0.008': 8 bit samples at a sample rate of 8 Hz, or 0.008kHz (8 samples a second). CDs are sampled at 16 bit depth, at a frequency of 44.1kHz. So each sample can take one of 65,536 possible values. For this reason, it's sometimes called 16/44 - 16 bit, 44.1kHz. The frequency was chosen in the late 1970s, based on a number of factors, including the scientific premise that the sampling frequency needs to be at least twice the maximum frequency you wish to record - the theoretical upper limit for the human ear is 20kHz, so the sampling rate needs to be at least 40kHz.

Recent years have seen the development of 'high resolution audio', or 24/192 - 24 bit, 192kHz - so 16.7 million values rather than 66 thousand, and more than four times as many samples per second.
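The arithmetic behind these figures is easy to check. The sketch below uses hypothetical helper names and takes the CD rate as 44,100 samples a second (the figure usually rounded to 44kHz).

```python
def levels(bits):
    """Number of distinct values a sample of the given bit depth can take."""
    return 2 ** bits

def highest_frequency_hz(sample_rate_hz):
    """Highest frequency that can be captured: half the sample rate."""
    return sample_rate_hz / 2

formats = [
    ("toy example", 8, 8),
    ("CD", 16, 44_100),
    ("high resolution", 24, 192_000),
]

for name, bits, rate in formats:
    print(f"{name}: {levels(bits):,} values per sample, "
          f"frequencies up to {highest_frequency_hz(rate):,.0f} Hz")

# How many times more samples per second is 192kHz than the CD rate?
print(f"{192_000 / 44_100:.2f} times as many samples per second")
```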

How do we go back from digital to analogue?

Now we understand the basics of the digital and the analogue in DAC, how do we actually get back to the analogue again?

We need to go back from our two state data - high/low, on/off, expressed as a voltage at either zero volts or whatever the power supply voltage is, say 4.8 volts - to the samples we took when we did our divide and question approach.

I'll explain how to do this by describing a primitive approach to building a DAC.

Let's suppose that we've recorded our signal at a resolution of 8 bits. That is, there are 256 possible values (2 x 2 x 2 x 2 x 2 x 2 x 2 x 2) representing the relative amplitude of our waveform. We want to reproduce the waveform, or rather, we want to reproduce the samples we took of the waveform at a point in time.

To do this we build a voltage divider using a number of resistors. The number corresponds to the resolution - we are using 8 bit words, so we need 16 resistors - two for each bit. Each bit is given two resistors - one half the resistance of the other. These are made into a ladder:

[Image: circuit diagram of the resistor ladder]

The principle of this ladder is that as voltages are applied, the resistor ladder passes on a fraction of the input voltage, in proportion to the value of the binary word.

If we feed in the binary word '00000000', then we'll obviously get zero volts. If we feed in the word '10000000', which is 128, or halfway between the minimum and maximum signal range, we'll get 2.3V or roughly half the maximum voltage we can provide. The figure isn't exactly half because resistors are not 100% accurate.

Screen-Shot-2017-12-05-at-12.44.42

Whatever number we enter, the output will correspond to the proportion of the size of the maximum output. You can work this forwards or backwards - look at the voltage outputted, and determine what binary word was fed into the ladder, or feed a number into the ladder, and predict the output. It will always be proportional.
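This proportionality can be written down directly. The sketch below models an ideal ladder with perfectly accurate resistors and the 4.8V supply from the example; the function names are assumptions for illustration, and a real ladder will deviate slightly because of resistor tolerances.

```python
def ladder_output(word, v_ref=4.8):
    """Ideal output voltage for a binary word given as a string, e.g. '10000000'."""
    code = int(word, 2)
    return v_ref * code / (2 ** len(word))

def word_for_voltage(voltage, bits=8, v_ref=4.8):
    """Work backwards: which binary word would produce this output voltage?"""
    code = round(voltage / v_ref * 2 ** bits)
    code = max(0, min(2 ** bits - 1, code))  # keep within the available range
    return format(code, f"0{bits}b")

for word in ("00000000", "01000000", "10000000", "11111111"):
    print(word, f"{ladder_output(word):.3f} V")

print(word_for_voltage(2.4))  # the word for half the supply voltage
```

With ideal resistors the word '10000000' gives exactly half of 4.8V, which is 2.4V; the measured 2.3V reflects the imperfections of real components.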

So now we have a way to reproduce the samples we took.

If we were to use a simple computer such as an Arduino to generate an input which gradually increased up to 255 (the largest 8 bit value) and then dropped back to 0, we'd create a ramp wave form. If we connected this to an oscilloscope, we'd see that waveform.

If we zoomed in on the oscilloscope, we'd see that the wave isn't completely linear - it's made up of steps.

Similarly, if we wrote a function which emulated a sine wave, we would see a sine wave on the scope. Again, if we zoomed in, we'd see it wasn't a 'pure' sine wave, because the resolution isn't great enough.
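As a rough sketch of what such functions might produce, here are the 8 bit values for a ramp and for one cycle of a sine wave. It is written in Python rather than Arduino code, and the function names and the choice of 16 samples per cycle are assumptions for illustration.

```python
import math

def ramp_table(bits=8):
    """Every value from 0 up to the largest word: one sweep of a ramp."""
    return list(range(2 ** bits))

def sine_table(samples_per_cycle=16, bits=8):
    """One cycle of a sine wave, shifted and scaled to fit 0..(2**bits - 1)."""
    top = 2 ** bits - 1
    return [
        round((math.sin(2 * math.pi * i / samples_per_cycle) + 1) / 2 * top)
        for i in range(samples_per_cycle)
    ]

print(ramp_table()[:5], "...", ramp_table()[-1])
print(sine_table())
```

Because every entry is a whole number between 0 and 255, the output can only move in steps, which is what shows up when you zoom in on the scope.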

However, what we have done is go back to the samples we took. It should be obvious that the output is only as good as the input. If we sampled 8 times a second, with a resolution of 8 bits, our analogue wave form will bear a resemblance to the thing we sampled, but there will be a lot missing.

How do we hear the analogue?

OK, we now have a digital to analogue converter. We can pass in a series of digital values, and get out a wave form. We can reproduce the samples we took.

Can we now just take the output of our resistor ladder and connect it directly to a loudspeaker?

Not really. The reason is down to the impedance of the loudspeaker. Without getting too bogged down in the electronics, because the loudspeaker has a relatively low impedance (say 8 ohms) relative to the resistors in our ladder, there won't be enough power to drive it.
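A quick calculation shows the scale of the problem. The ladder and the loudspeaker together behave like a simple voltage divider. The sketch below assumes the ladder presents a source resistance of 10,000 ohms; that value, like the function name, is an assumption for illustration.

```python
def loaded_voltage(v_open, source_ohms, load_ohms):
    """Voltage left across the load once it is connected to the source."""
    return v_open * load_ohms / (source_ohms + load_ohms)

v_open = 2.4          # ideal ladder output with nothing connected
ladder_ohms = 10_000  # assumed source resistance of the ladder

# An 8 ohm loudspeaker connected directly to the ladder.
print(f"{loaded_voltage(v_open, ladder_ohms, 8):.4f} V across the speaker")

# A very high impedance load (10 million ohms) barely disturbs the output.
print(f"{loaded_voltage(v_open, ladder_ohms, 10_000_000):.4f} V across the load")
```

Under these assumptions less than two thousandths of a volt reaches the speaker, whereas a load with a very high impedance sees almost the full 2.4V.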

To solve this we need to add a buffer, which doesn't change the signal, but which has a very high input impedance and a low output impedance, so the ladder isn't loaded down and the speaker can be driven. We can do that with a simple op amp, configured as a voltage follower. It looks like this:

[Image: circuit diagram of an op amp configured as a voltage follower]

This amplifier doesn't in any way change the signal - it doesn't amplify it or attenuate it, it simply buffers the signal, allowing us to drive the loudspeaker with the wave we recreated.

So! Now we can hear the signal, and we have a functioning, primitive digital to analogue converter!

What next?

So there we have it - digital to analogue conversion, from first principles. We understand why it's needed, and how it works. Obviously commercial DACs are much more sophisticated than this, but before going on to discuss some of the finer points of DACs, or indeed digital and analogue recording and playback, it's very valuable to have an understanding of the underpinning principles.

This article hopefully raises a number of questions in your mind - questions about digital resolution, perhaps. Possibly you've heard about and are curious about high resolution audio, or SACD. You might also be wondering what all the fuss is about, given we've been able to make a DAC using a few pence worth of electronic components. What's the difference between this simple proof-of-concept DAC, and the ones in so-called audiophile systems? More on this in later articles.


Stephen Nelson-Smith is founder and principal consultant at Expressive Audio, in Chichester, West Sussex. He has been fascinated by electronics since he was a very small child, and has happy memories of being with his father, poring over transmitter circuit diagrams, and building home-brew amateur radio components.
