The 24/192 digital audio format and why it makes no sense. Part 2 [Translation]



What is one of the most common and deep-rooted misconceptions in the world of music lovers?


Note translation:

This is a translation of the second of four parts of an extensive article by Christopher "Monty" Montgomery (creator of the free Ogg and Vorbis formats) about what he believes is one of the most common and deep-rooted misconceptions in the world of music lovers.

[First part]

192 kHz considered harmful

Digital music files at 192 kHz offer no benefit, but they are not neutral either. In practice, their playback fidelity is slightly worse, because the ultrasonic content becomes a liability during playback.

Neither audio transducers (speakers and headphones) nor power amplifiers are free of distortion, and distortion tends to increase rapidly at the lowest and highest frequencies. If the same speaker reproduces ultrasound along with frequencies from the audible range, any nonlinearity in its response will shift part of the ultrasonic content down into the audible spectrum as an uncontrolled spray of intermodulation distortion products covering the entire audible range. Nonlinearity in a power amplifier has the same effect. These effects are slight, but listening tests have confirmed that both can be audible.

The graph above shows the intermodulation distortion products of a 30 kHz tone and a 33 kHz tone in a theoretical amplifier with a constant total harmonic distortion (THD) of about 0.09%. Distortion products appear throughout the spectrum, including at frequencies lower than either tone.

Inaudible ultrasonic content contributes to intermodulation distortion in the audible range (light blue area). Systems not designed to reproduce ultrasound typically have much higher distortion levels above 20 kHz, which contributes further to intermodulation. Extending a design's frequency range to include ultrasound requires compromises that worsen noise and distortion performance within the audible spectrum. Either way, unnecessary reproduction of ultrasonic content degrades playback quality.
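The mechanism can be illustrated with a small numerical sketch. It is not the amplifier model behind the graph: the polynomial nonlinearity and its coefficients `A2` and `A3` are hypothetical, as are the simulation rate and the helper `tone_amplitude`. It passes the 30 kHz and 33 kHz tones through a slightly nonlinear stage and measures what appears at their 3 kHz difference frequency.

```python
import math

FS = 384_000          # simulation rate in Hz (assumed; high enough to hold all products)
N = 3_840             # 10 ms of signal, so every tone involved fits a whole number of cycles
F1, F2 = 30_000, 33_000
A2, A3 = 1e-3, 1e-3   # hypothetical 2nd- and 3rd-order nonlinearity coefficients


def tone_amplitude(signal, freq, fs=FS):
    """Amplitude of the sinusoidal component at `freq` (single-bin DFT)."""
    re = sum(s * math.cos(2 * math.pi * freq * k / fs) for k, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * k / fs) for k, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


# Two inaudible tones, each at half of full scale.
clean = [0.5 * math.sin(2 * math.pi * F1 * k / FS) +
         0.5 * math.sin(2 * math.pi * F2 * k / FS) for k in range(N)]

# Toy amplifier: the ideal output plus small squared and cubed error terms.
distorted = [x + A2 * x**2 + A3 * x**3 for x in clean]

clean_3k = tone_amplitude(clean, F2 - F1)
distorted_3k = tone_amplitude(distorted, F2 - F1)
print(f"3 kHz component before: {clean_3k:.2e}, after: {distorted_3k:.2e}")
```

In this toy model the input has nothing at 3 kHz, while the output contains a small but clearly nonzero 3 kHz tone, squarely in the audible band, produced entirely by content nobody can hear.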

There are several ways to avoid additional distortion:

  1. A dedicated ultrasound-only speaker, amplifier, and crossover to separate and independently reproduce the ultrasound you cannot hear, just so that it does not affect the sounds you can.
  2. Amplifiers and transducers designed to reproduce a wider range of frequencies, so that ultrasound does not cause audible intermodulation. At equal cost and complexity, the additional frequency range comes at the expense of some reproduction quality in the audible part of the spectrum.
  3. Speakers and amplifiers carefully designed not to reproduce ultrasound at all.
  4. Not encoding such a wide range of frequencies to begin with. You cannot (and will not) get ultrasonic intermodulation distortion in the audible band if the signal contains no ultrasonic content.

All these methods are aimed at solving one problem, but only method 4 makes any sense.

If you're curious about what your own system does, the following samples contain: 30 kHz and 33 kHz tones in a 24/96 WAV file, a longer version in FLAC, some warbling tones, and a clip of a regular song shifted up by 24 kHz so that it falls entirely within the ultrasonic range from 24 kHz to 46 kHz.

Intermodulation tests:

  • 30 kHz tone + 33 kHz tone (24 bit / 96 kHz) [5 second WAV] [30 second FLAC]
  • Warbling tones, 26 kHz – 48 kHz (24 bit / 96 kHz) [10 second WAV]
  • Warbling tones, 26 kHz – 96 kHz (24 bit / 192 kHz) [10 second WAV]
  • Song clip shifted up by 24 kHz (24 bit / 96 kHz WAV) [10 second WAV] (original version of the clip) (16 bit / 44.1 kHz WAV)

Let's assume that your system is actually capable of full 96 kHz playback [6]. When playing the above files, you should hear nothing at all: no noise, no tones, no whistling, no clicks, no other sounds. If you hear something, your system has a nonlinearity that is causing audible intermodulation of the ultrasonic content. Be careful when increasing the volume; if you hit digital or analog clipping, even soft clipping, it will suddenly produce loud intermodulation tones.

In summary, it is not certain that intermodulation from ultrasound will be audible on a particular system. The added distortion can be either insignificant or quite noticeable. In any case, ultrasonic content is never a benefit, and on many audio systems it will audibly reduce reproduction quality. On systems that are not harmed by it, the cost and complexity of handling ultrasound could have been saved, or spent on improving performance in the audible range instead.

Misunderstanding of the sampling process

Sampling theory is often unintuitive without a background in signal processing. It is not surprising that most people, even brilliant PhDs in other fields, routinely misunderstand it. It's also not surprising that many people don't even realize they're getting it wrong.

Sampled signals are often depicted as a jagged staircase, like the one above (in red), which looks like a rough approximation of the original signal. However, this representation is mathematically exact, and when converted back to analog, the signal recovers the smooth shape of the original (blue line in the figure).

The most common misconception is that sampling is fundamentally rough and lossy. A sampled signal is often depicted as a jagged, angular, stair-step copy of the original perfectly smooth wave. If you believe this is how sampling works, you might conclude that the higher the sampling rate (and the more bits per sample), the finer the steps and the closer the approximation. The digital signal would then resemble the analog signal more and more closely, matching it only as the sampling rate tends to infinity.

Similarly, many people who do not work in digital signal processing will look at the image below and say: "Ugh!" It may seem that a sampled signal represents the high frequencies of an analog wave poorly, or in other words, that as the audio frequency increases, the sampling quality drops and the frequency response falls off or becomes sensitive to the phase of the input signal.

It just looks that way. These beliefs are wrong!

Added 04/04/2013: In response to all the mail I received about digital signals and stairsteps, I demonstrate the actual behavior of a digital signal on real equipment in our Digital Show & Tell video, so you don't have to take my word for it.

All signals whose content lies entirely below the Nyquist frequency (half the sampling rate) are captured perfectly and completely by sampling, and an infinitely high sampling rate is not needed for this. Sampling does not affect frequency response or phase. The analog signal can be reconstructed without loss: as smooth as the original and with exactly the same timing.
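A minimal sketch of this claim, using Whittaker-Shannon (sinc) interpolation, follows. The tone frequency, phase, record length, and function names are illustrative assumptions. Because the sum here runs over a finite number of samples rather than infinitely many, the result is only approximately exact; real DACs approach the same result with practical reconstruction filters.

```python
import math

FS = 48_000        # sampling rate in Hz
F = 15_000         # test tone, well below the 24 kHz Nyquist frequency
N = 2_001          # number of stored samples (finite, so the result is approximate)


def original(t):
    """The 'analog' signal: a continuous 15 kHz sine with an arbitrary phase."""
    return math.sin(2 * math.pi * F * t + 0.7)


def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


samples = [original(n / FS) for n in range(N)]


def reconstruct(t):
    """Whittaker-Shannon interpolation: a sum of sinc pulses, one per sample."""
    return sum(s * sinc(t * FS - n) for n, s in enumerate(samples))


# Evaluate between the sampling instants, near the middle of the record,
# at 8 points per sample period.
times = [(1000 + k / 8) / FS for k in range(81)]
max_error = max(abs(reconstruct(t) - original(t)) for t in times)
print(f"largest reconstruction error: {max_error:.1e}")
```

The tone has only 3.2 samples per cycle, yet the values recovered between the samples stay close to the original curve; the small remaining error comes from cutting the sum off at 2,001 samples, not from the sampling itself.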

The math is ideal, but what about real-world complications? The best known is the band-limiting requirement. Signals with content above the Nyquist frequency must be lowpass filtered before sampling to avoid aliasing distortion. This lowpass is the infamous anti-aliasing filter. Anti-aliasing cannot be perfect in practice, but modern techniques get very close to the ideal. And with that we come to oversampling.
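What the anti-aliasing filter protects against can be shown in a few lines. This is a sketch with illustrative values: a 30 kHz tone sampled at 48 kHz with no filter in front produces exactly the same sample values as an inverted 18 kHz tone, so after sampling the two cannot be told apart.

```python
import math

FS = 48_000                   # sampling rate in Hz
F_ULTRASONIC = 30_000         # above the 24 kHz Nyquist frequency
F_ALIAS = FS - F_ULTRASONIC   # 18 kHz: where the unfiltered tone lands

above = [math.sin(2 * math.pi * F_ULTRASONIC * n / FS) for n in range(48)]
alias = [-math.sin(2 * math.pi * F_ALIAS * n / FS) for n in range(48)]

# Once sampled, the two tones are the same list of numbers (up to rounding).
worst = max(abs(a - b) for a, b in zip(above, alias))
print(f"largest difference between the two sample sets: {worst:.1e}")
```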

Oversampling

Sampling rates above 48 kHz are irrelevant to high-fidelity audio reproduction, but they are essential internally to several modern digital audio techniques. Oversampling is the most significant of them [7].

The idea behind oversampling is simple and elegant. You may remember from my video "A Digital Media Primer for Geeks" that high sample rates provide a much larger gap between the highest frequency we care about (20 kHz) and the Nyquist frequency (half the sample rate). This allows for simpler, smoother, and more reliable analog anti-aliasing filters, and thus higher fidelity. This extra space between 20 kHz and the Nyquist frequency is essentially just spectral padding for the analog filter.

The figure above shows diagrams from the video "A Digital Media Primer for Geeks," illustrating the transition band width available to a DAC or ADC at 48 kHz (left) and 96 kHz (right).

That is only half the story. Because digital filters have few of the practical limitations of analog filters, we can complete the anti-aliasing digitally with greater accuracy and efficiency. The raw high-rate digital signal passes through a digital anti-aliasing filter, which has no trouble fitting its transition band into a tight space. Once this filtering is complete, the extra padding samples are simply thrown away. Oversampled playback works roughly in reverse.
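The digital half of that process can be sketched as follows. This is an illustration only, not how any particular converter is built: the 192 kHz raw rate, the 255-tap Blackman-windowed sinc filter, the 22 kHz cutoff, and all names are assumptions. The raw signal holds an audible 1 kHz tone plus a 30 kHz ultrasonic tone, and the sketch compares dropping samples with and without the digital lowpass in front.

```python
import math

FS_HIGH = 192_000      # oversampled rate in Hz (assumed)
RATIO = 4              # keep every 4th sample -> 48 kHz
FS_LOW = FS_HIGH // RATIO
TAPS = 255             # FIR length (assumed)
CUTOFF = 22_000        # lowpass cutoff in Hz (assumed)
N_OUT = 480            # 10 ms of output at 48 kHz


def lowpass_taps(num_taps, cutoff, fs):
    """Blackman-windowed sinc lowpass, normalized to unity gain at DC."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = 2 * cutoff / fs * (n - mid)
        ideal = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = (0.42 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
                  + 0.08 * math.cos(4 * math.pi * n / (num_taps - 1)))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def tone_amplitude(signal, freq, fs):
    """Amplitude of the sinusoidal component at `freq` (single-bin DFT)."""
    re = sum(s * math.cos(2 * math.pi * freq * k / fs) for k, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * k / fs) for k, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


# Raw high-rate signal: audible 1 kHz tone plus a 30 kHz ultrasonic tone.
raw = [math.sin(2 * math.pi * 1_000 * m / FS_HIGH) +
       0.5 * math.sin(2 * math.pi * 30_000 * m / FS_HIGH)
       for m in range(RATIO * N_OUT + TAPS)]

# Naive decimation: drop samples with no filtering first.
naive = raw[:RATIO * N_OUT:RATIO]

# Proper decimation: digital lowpass, computed only at the samples we keep.
taps = lowpass_taps(TAPS, CUTOFF, FS_HIGH)
filtered = [sum(h * raw[RATIO * k + j] for j, h in enumerate(taps))
            for k in range(N_OUT)]

naive_alias = tone_amplitude(naive, 18_000, FS_LOW)        # 30 kHz folds to 18 kHz
filtered_alias = tone_amplitude(filtered, 18_000, FS_LOW)
filtered_1k = tone_amplitude(filtered, 1_000, FS_LOW)
print(f"18 kHz alias, naive: {naive_alias:.3f}, filtered: {filtered_alias:.1e}")
print(f"1 kHz tone after filtering: {filtered_1k:.4f}")
```

Without the filter, the ultrasonic tone folds down to 18 kHz at its full level. With it, the alias is suppressed by several orders of magnitude while the 1 kHz tone passes through essentially unchanged.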

This means that signals with a low sample rate (44.1 kHz or 48 kHz) can have all the fidelity benefits of a 192 kHz or higher sample rate (smooth frequency response, low aliasing) with none of the disadvantages (ultrasonic content causing intermodulation distortion, increased file size). Almost all modern DACs and ADCs oversample at very high rates, and few people know about it because it happens automatically inside the device.
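The file-size cost is easy to estimate. The sketch below assumes uncompressed stereo PCM; the function name is hypothetical, and losslessly compressed formats such as FLAC will be smaller in absolute terms.

```python
def pcm_bytes_per_minute(sample_rate, bits, channels=2):
    """Size of one minute of uncompressed PCM audio, in bytes."""
    return sample_rate * (bits // 8) * channels * 60


cd_quality = pcm_bytes_per_minute(44_100, 16)
high_res = pcm_bytes_per_minute(192_000, 24)
print(f"16/44.1: {cd_quality / 1e6:.1f} MB per minute")
print(f"24/192:  {high_res / 1e6:.1f} MB per minute")
print(f"ratio:   {high_res / cd_quality:.2f}x")
```

Under these assumptions a 24/192 file is roughly six and a half times the size of the same music at 16/44.1.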

DACs and ADCs did not always oversample transparently. Thirty years ago, some recording consoles recorded at high sampling rates using only analog filters, and production and mastering simply used that high-rate signal. Digital anti-aliasing and decimation (resampling to a lower rate for CD and DAT) happened in the final stages of mastering. This may have been one of the early reasons why sample rates of 96 kHz and 192 kHz became associated with professional recording production [8].

16 bit vs 24 bit

Okay, now we know that storing music at 192 kHz makes no sense. Topic closed. But what about 16-bit and 24-bit audio? Which is better?

It is true that 16-bit PCM audio does not fully cover the theoretical dynamic range that humans can hear under ideal conditions. There are also (and always will be) reasons to use more than 16 bits for recording and production.

None of these reasons has anything to do with playback, where 24-bit audio is as useless as 192 kHz sampling. The good news is that 24-bit quantization at least does not harm sound quality. It just does not improve it, and it takes up extra space.
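For orientation, the dynamic range of linear PCM can be estimated from the bit depth alone. The sketch below uses the common rule of thumb of about 6 dB per bit (the ratio between full scale and one quantization step); it is a simplification that ignores dither and noise shaping, and the function name is hypothetical.

```python
import math


def pcm_dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB (about 6.02 dB per bit)."""
    return 20 * math.log10(2 ** bits)


range_16 = pcm_dynamic_range_db(16)
range_24 = pcm_dynamic_range_db(24)
print(f"16 bit: {range_16:.1f} dB, 24 bit: {range_24:.1f} dB")
```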

Notes for Part 2

6.

Many systems that are unable to play 96 kHz samples will not refuse to play them, but will silently downsample them to a rate they support, such as 48 kHz. In this case, the test tones will not be reproduced at all, and playback will be silent regardless of how nonlinear the system is.

7.

Oversampling is not the only use for high sampling rates in signal processing. There are a few theoretical advantages to producing band-limited audio at a high sample rate without decimation, even if it is later downsampled for distribution. It is unclear which of them, if any, are used in practice, since the inner workings of most professional consoles are trade secrets.

8.

Historical reasons or not, many professionals today use high sampling rates because they mistakenly believe that audio with content preserved beyond 20 kHz sounds better. Just like consumers.

[Part 3]

Let's see what we have with high-resolution formats, starting with FLAC:

Flac 96000 Hz 24 bit

  • 24000 Hz 4p(2b) -12dB
  • 12000 Hz 8p(3b) -18dB
  • 6000 Hz 16p(4b) -24dB
  • 3000 Hz 32p(5b) -30dB
  • 20 Hz 4800p(13b) -78dB

Flac 192000 Hz 24 bit

  • 24000 Hz 8p(3b) -18dB
  • 12000 Hz 16p(4b) -24dB
  • 6000 Hz 32p(5b) -30dB
  • 3000 Hz 64p(6b) -36dB
  • 20 Hz 9600p(14b) -84dB

It can be seen that with increasing sampling frequency the figures get better, but not by much.

Additionally, we will consider a couple of WAVE formats with ultra-high sampling rates

WAVE 384000 Hz 32 bit

  • 24000 Hz 16p(4b) -24dB
  • 12000 Hz 32p(5b) -30dB
  • 6000 Hz 64p(6b) -36dB
  • 3000 Hz 128p(7b) -42dB
  • 20 Hz 19200p(15b) -90dB

WAVE 768000 Hz 32 bit

  • 24000 Hz 32p(5b) -30dB
  • 12000 Hz 64p(6b) -36dB
  • 6000 Hz 128p(7b) -42dB
  • 3000 Hz 256p(8b) -48dB
  • 20 Hz 38400p(16b) -96dB

The results are already much better, but still not ideal =) It is clear that ultra-high frequency formats are still inaccessible to almost anyone.
