Why do you need ASIO for audiophiles? / Sudo Null IT News

Anyone who is faced with the issue of high-quality sound reproduction sooner or later comes across the abbreviation ASIO as an important and necessary option.

What is it, and what does it mean in practice? First of all, ASIO concerns only the audio part of recording and playback on a computer through a sound card or USB DAC under Windows. Those who listen to music from a smartphone or from a network player with its own OS will also benefit from understanding ASIO, because knowing about this “option” helps avoid a number of problems found in smartphones and explains why not all audio platforms are equally useful.

ASIO is a software interface for transferring data from a program that plays or receives an audio signal directly to the sound card driver, bypassing the OS audio subsystem.

The need for ASIO arose purely from professional tasks. The biggest problem was, and remains, keeping the audio transmission delay to a minimum. When we watch a movie, it doesn’t matter whether playback starts a fraction of a millisecond or a couple of seconds after pressing the “play” button. What matters is that the video and audio stay in sync with each other. In the studio the requirements are much stricter, because the work often involves playing virtual instruments live, with the sound processed in real time. You cannot play a MIDI keyboard properly if you hear each key press not immediately but a second later.

Typically, in Windows operating systems, the delay ranges from 7 to 300 ms and depends on the current system load. As you might guess, the sound system is not a priority in Windows: all that is required of it is that the sound does not stutter, so audio data is collected in a separate buffer and transmitted all at once in one large chunk. For extremely low latencies, the buffer must be small and the data must be transmitted continuously in small packets.

ASIO is an alternative bridge that carries the audio stream from a program to the driver with a fixed buffer size, bypassing the standard OS data path. Because ASIO is not a Microsoft development (Microsoft, by the way, has traditionally paid little attention to sound), supporting ASIO output and input falls on the shoulders of the software and sound device manufacturers. ASIO was originally developed by Steinberg for its own products at the time of the transition from MIDI to virtual synthesis, and today it is supported by almost all professional software and audio interfaces.

As you might guess, audiophiles don’t care what the delay is in the system. But it is useful to know what the OS spends its energy on for sound transmission and how this affects the quality.

How does the OS audio subsystem affect sound?

There are many programs in the OS that act as sound sources: Skype, ICQ, a browser playing music from VKontakte, system sounds, a video player and other applications. All these audio streams differ in both bit depth and sampling frequency, yet only one stereo stream with a specific bit depth and sampling frequency can arrive at the DAC. Accordingly, all audio streams must be mixed beforehand. To picture the scale of the problem, imagine several photographs with different original resolutions that must be displayed simultaneously on an LCD monitor, each filling the entire screen. Rendering a photo pixel for pixel, so that it covers only part of the screen, is the equivalent of playing the sound slower or faster.

If the photo resolution is 600x480 pixels and the monitor resolution is 1024x768, the photo must first be converted to 1024x768, and its clarity will undoubtedly decrease. Sound that the system recalculates from 44.1 kHz to 48 or 96 kHz suffers in much the same way. The quality of the resampler in Windows leaves much to be desired, because it is built to save as much in resources as possible.
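To make the photo analogy concrete for sound, here is a minimal sketch of where the output samples of a 44.1 kHz to 48 kHz conversion fall relative to the input samples. The function name `resample_positions` is made up for illustration, and the sketch says nothing about how the Windows resampler is actually implemented; it only shows the arithmetic of the rate ratio.

```python
from fractions import Fraction


def resample_positions(src_rate: int, dst_rate: int, count: int):
    """For each of the first `count` output samples, return its position
    on the input timeline, measured in input samples (exact fractions)."""
    step = Fraction(src_rate, dst_rate)  # input samples advanced per output sample
    return [k * step for k in range(count)]


# One full cycle of the 48000/44100 ratio is 160 output samples long.
positions = resample_positions(44_100, 48_000, 160)

# Output samples that land exactly on an existing input sample.
on_grid = [p for p in positions if p.denominator == 1]

print(Fraction(48_000, 44_100))  # 160/147
print(len(on_grid))              # 1
```

Within each block of 160 output samples only the first one coincides with an original sample; the other 159 have to be computed by interpolation, which is where the quality of the resampler matters.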

Returning to photography: suppose we have three photos with resolutions of 600x480, 1024x768 and 2048x1536 pixels, and all of them must be displayed at 1024x768. Before they can be combined, the 600x480 and 2048x1536 photos have to be converted to 1024x768, and only then are the three photos summed by superimposing one picture on another.

Usually only one program plays the main sound, while the others play the sound periodically (ICQ, Skype) and can be compared to logos and inscriptions on top of the main photo. It is quite obvious that a picture with an original resolution of 1024x768 will suffer the least in quality, and if it is the main one and matches the monitor resolution, then the quality of only the auxiliary pictures will decrease: the logo and inscriptions.

Likewise, in the system you can formally set the final sampling frequency to 44.1 kHz for the audio player and disregard the quality of system sounds, which are heard only from time to time.

However, to minimize quality losses when mixing audio streams, a special noise (dither) is added, and the system does not care whether one program is playing sound or several. Thus, even a single audio stream that is not converted to another sampling frequency is still processed and will no longer reach the DAC “bit for bit”.
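The effect of dither on a single stream can be illustrated with a toy model. The sketch below is an assumption made for illustration, not the actual Windows pipeline: it converts 16-bit samples to floating point, adds triangular (TPDF) dither of about one least significant bit, and rounds back to 16 bits. The function name `shared_mode_pass` is hypothetical.

```python
import random

INT16_MIN, INT16_MAX = -32768, 32767


def shared_mode_pass(samples, rng):
    """Toy model of a mixer path: int16 -> float -> add dither -> int16."""
    out = []
    for s in samples:
        x = float(s)
        # Triangular (TPDF) dither: the sum of two uniform random values,
        # spanning roughly +/-1 LSB in total.
        x += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        q = int(round(x))
        out.append(max(INT16_MIN, min(INT16_MAX, q)))  # clip to 16-bit range
    return out


rng = random.Random(0)  # fixed seed so the run is repeatable
original = [1000, -1000, 12345, -20000, 0] * 200
processed = shared_mode_pass(original, rng)

changed = sum(1 for a, b in zip(original, processed) if a != b)
largest = max(abs(a - b) for a, b in zip(original, processed))
print("samples changed:", changed, "of", len(original))
print("largest change (LSB):", largest)
```

No sample moves by more than one LSB, so the change is tiny, but the output is no longer identical to the input, which is all that “bit for bit” means.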

Earlier versions of the OS monitored the sampling frequencies of the incoming streams and automatically switched the sound card to the highest of them that it supported (for example, with incoming 22, 44.1 and 48 kHz streams it chose 48 kHz, and with only 22 and 44.1 kHz it dropped to 44.1 kHz). Starting with Win7, the system forcibly fixes one overall sampling frequency, and there is no automatic switching of the reference frequency. OS stability has improved, but not everyone was happy with the approach.
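The older selection rule described above can be written down in a few lines. This is only a sketch of the rule as the text states it; the function name `pick_shared_rate` and the list of rates supported by the card are assumptions, and “22 kHz” is taken to mean 22050 Hz.

```python
def pick_shared_rate(stream_rates, supported_rates):
    """Return the highest incoming sampling rate that the card supports."""
    candidates = [r for r in stream_rates if r in supported_rates]
    if not candidates:
        raise ValueError("no incoming rate is supported by the card")
    return max(candidates)


# Hypothetical card that supports 44.1, 48 and 96 kHz.
card = {44_100, 48_000, 96_000}

print(pick_shared_rate([22_050, 44_100, 48_000], card))  # 48000
print(pick_shared_rate([22_050, 44_100], card))          # 44100
```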

The described situation is equally valid for all operating systems and platforms that can play sound simultaneously from different programs. In a mobile phone, this is, for example, playback of a telephone conversation and a system signal about a low battery.

Conventionally, the general scheme looks like this.
When using ASIO, the audio stream goes directly to the sound card driver's mixer (Mixer Driver), bypassing the resampler (SRC) and the OS mixer. For bit-for-bit playback through the OS itself there are special modes: in Windows these are “Kernel Streaming” (Windows XP and earlier) and WASAPI in exclusive mode (versions after XP, that is, Vista and later). In this mode, only one program in the system has the right to transmit an audio stream, and mixing and recalculation of the data are eliminated entirely. Moreover, the system supports automatic switching of the reference frequency (given appropriate support from the sound card driver).

This mode is not recommended for the average user, because it brings various problems with it. For example, a user starts Foobar2000 with WASAPI and then plays a video clip in a browser that was already open. The sound driver does not accept the audio stream from the browser, and the Flash plugin crashes. To the user it looks as if the system has collapsed, and this is “sadness, misfortune and grief.” Software makers very rarely offer KS/WASAPI output, because militant users will blame the problems not on their own crooked hands but on the program “because of which everything collapsed.”

KS/WASAPI modes can only be found in audio editors, sequencers and rare software players intended for audiophiles - for trained users who understand that the stability of the OS will suffer and there will be no sound except for the player/audio editor/sequencer. Advanced audiophiles who have abandoned built-in sound usually use a separate sound card for music, and send system sounds to the built-in sound, which ensures high stability of the OS.

That is, in essence, KS/WASAPI is the ideal audio output option for the audiophile. It is supported by Foobar2000, AIMP and Winamp. For those who want to watch movies with high-quality sound, there is the Light Alloy player.

How to download and install the driver on your PC

The ASIO4ALL driver for PC is free and can be downloaded from the official website.

The same installer works for any version and bitness of Windows, so there is no need to pick specific files. To get the Russian version, look for the Russian flag on the download page.

ASIO4ALL is installed in the usual way:

Run the .exe file from the folder where the file was downloaded;
Click on the “Next” button;
Check the box indicating that you agree to the terms of the license agreement;
Click “Next” again and specify the installation path;
Complete the process by clicking "Finish".

A “User Guide” shortcut for the universal driver will appear on your desktop. It can be opened in any browser on your computer, since it is a .pdf file. If you accidentally delete the shortcut, or it does not appear for some reason, the manual can be found in the program folder at the path you specified during installation. You can start sound output through the settings of the sound-reproducing device.

ASIO or WASAPI?

Professionals use ASIO mode, which passes a bit-for-bit audio stream to the driver and provides a fixed latency. WASAPI does not let you control the latency through the standard OS settings. In professional work latency is the priority, and “bit-for-bit” is just a nice bonus.

What happens when both OS and ASIO sound are enabled?

The sound driver then sees two audio streams, one coming from the OS subsystem and the other from ASIO. How the final stream sent to the DAC is mixed depends solely on how the driver was written. In some cases the sound from the OS subsystem is muted whenever an ASIO stream is present; in others the OS and ASIO streams are mixed, and “bit-for-bit” remains only in theory. ASIO, like WASAPI, only lets you avoid the SRC (resampling) algorithms and the mixing of the OS subsystem, and nothing more. The integrity of the final stream depends on the driver.

In any case, the sound card almost always runs at the same sampling frequency as the incoming ASIO stream, which gives ASIO a certain advantage.

Mixing in the driver can be done in software or in hardware. Particularly funny are the attempts of audiophiles to use a professional interface as a source of “quality digital” when its digital output is taken after hardware mixing. However, some people like the remixed sound more than the original... cleaner, more transparent and more soulful...

If you logically look at the chain through which the audio stream should pass, then for the “bit-to-bit” ideology, the audio interface should support only one option, or disable stream mixing when only one interface is working. Only in this case the chances of getting “bit-for-bit” are maximum.

For example, let's take OPPO HA-1 with ASIO support. If we run Foobar2000 with WASAPI and AIMP with ASIO at the same time, then at the output we will hear both audio streams simultaneously. OPPO does not have digital outputs and, accordingly, there is no way to check the audio stream for “bit-for-bit” separately for ASIO and WASAPI before the DAC.

But with ASUS Essence STU the situation is different. If AIMP with ASIO plays, then Foobar2000 with WASAPI is already silent, audio streams are not mixed, giving priority to ASIO. There is no way to check the digital stream in the same way, but the chances that the audio stream arrived “bit-for-bit” are an order of magnitude greater.

It is believed that a USB DAC must support ASIO, but in practice we get an additional link where streams from the OS sound system and ASIO must be mixed or switched. And here the absence of ASIO means the absence of an unknown link where there may be forced mixing that cannot be tested without digital outputs. At the same time, mixing at this stage is usually carried out in 24 or 32 bits and, accordingly, it is unlikely to hear dither noise. The only problem is the “Hi-End” ideology.

The nuances of using drivers in programs for writing music

Here we can immediately give advice on choosing the preferred driver type when ASIO4ALL is installed (in Windows 10, for example). On relatively weak configurations it is best to use the second or third type from the list of driver types (Generic Low Latency ASIO Driver or FL Studio ASIO), since they load the system less.

In addition, if you are not going to play a connected electronic instrument, pay special attention to the driver settings panel, where you can set the maximum buffer size (2048 samples). This helps avoid distortion and stuttering during playback when you use a large number of virtual VST instruments or additional effects on each track (third-party ones, as opposed to those preinstalled in the program itself).

But if you connect an instrument or keyboard with a MIDI interface, the buffer should be set to 512 samples. At higher values the delay after pressing a key will be too long, while lower values shorten the delay further but put more load on the system. One more tip: in Windows 10, to get maximum performance for the music program, you can enable the special game mode (Win + G) and then specify the location of the program's executable file.
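The link between buffer size and delay is simple arithmetic: one buffer of N samples at a sampling rate of R Hz holds N / R seconds of audio. The sketch below computes this for the two buffer sizes mentioned here. The function name is made up for illustration, and the figure covers a single buffer only; the real delay between key press and sound also includes driver and hardware overhead, so treat it as a lower bound.

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """Duration of one buffer in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz


for size in (512, 2048):
    for rate in (44_100, 48_000):
        ms = buffer_latency_ms(size, rate)
        print(f"{size:>4} samples @ {rate} Hz: {ms:.1f} ms")
```

At 44.1 kHz a 512-sample buffer is about 11.6 ms long and a 2048-sample buffer about 46.4 ms, which is why the larger setting suits playback-only work and the smaller one suits live playing.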

Universal driver ASIO4ALL

The ASIO4ALL driver is extremely popular, but it is really a bridge between the ASIO output of a program and the KS/WASAPI input of the OS. This is important to know: if your sound card does not support ASIO, then after installing ASIO4ALL you can choose, in Foobar2000 for example, between output straight to KS/WASAPI and output to ASIO through ASIO4ALL, which will direct the audio stream to the same KS/WASAPI in the OS.

A note for fans of ASIO4ALL - yes, there are also various settings, such as buffer selection, etc., but these features are needed only for professional work and do not provide anything useful to audiophiles for whom this material is intended.

What gives us the right to say that ASIO4ALL delivers bit-for-bit data to KS/WASAPI? After all, theory and practice often give opposite results. To evaluate the quality of ASIO4ALL's work, Audiolab M-DAC was used with the function of checking the incoming audio stream for “bit perfect” by playing a special sound file. The test confirmed that the data is indeed "bit-for-bit" when played back from Foobar2000 via ASIO4ALL.
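The idea behind such a check can be sketched as a comparison of a known test pattern with what arrived at the other end. How the M-DAC performs its test internally is not described here, so everything below is an assumption for illustration: the names `make_test_pattern` and `first_mismatch` are hypothetical, and the “received” data is simulated rather than captured from a device.

```python
import random


def make_test_pattern(length: int, seed: int = 1):
    """Pseudo-random 16-bit samples; the same seed gives the same pattern."""
    rng = random.Random(seed)
    return [rng.randint(-32768, 32767) for _ in range(length)]


def first_mismatch(sent, received):
    """Index of the first differing sample, or None if the streams match."""
    for i, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return i
    if len(sent) != len(received):
        return min(len(sent), len(received))  # one stream ended early
    return None


pattern = make_test_pattern(1000)

intact = list(pattern)   # simulated bit-perfect path
altered = list(pattern)
altered[10] ^= 1         # flip the lowest bit of one sample

print(first_mismatch(pattern, intact))   # None
print(first_mismatch(pattern, altered))  # 10
```

A single flipped low-order bit is enough to fail the comparison, which is what makes this kind of test stricter than any listening check.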

By the way, there have been statements from programmers that, for example, the ASIO driver for external E-MU cards (USB versions) is made similarly to ASIO4ALL in the form of a bridge and this is precisely the source of low stability of the cards...

Types of ASIO drivers

It is especially worth noting that in comparison with the initial state of affairs, the technology is constantly improving, and today you can find several main varieties of these drivers. The easiest way to explain them is using the example of the popular sequencer FL Studio.

If you call up the audio settings, it is easy to see that the following types of interfaces are presented there:

ASIO4ALL v2;
Generic Low Latency ASIO Driver;
FL Studio ASIO (Yamaha ASIO, Steinberg ASIO, etc.).

The first type is the standard driver; the second is software with a reduced load on system resources (in particular CPU and RAM); and the third sits roughly midway between the first two but was developed by Image-Line, the company that created the sequencer itself (as with the other packages of this type named in the list).

We figured out what ASIO4ALL is. The only question is which type and how to use it for maximum performance and ease of recording or audio processing.

Android and bit-to-bit

Returning to players based on Android OS.
This OS has a mode similar to KS/WASAPI, but there are no explicit settings for it. The only software player with a direct output mode to the DAC is the one used in the iBasso DX100. Of course, its own software player works only in the DX100; it cannot be downloaded and installed, for example, in a Sony player. It is very easy to check whether Android is playing bit-for-bit. Launch any alarm clock and a player. If you hear the alarm over the sound from the player, then there is no “bit-for-bit” output.

How to install

Now let's look at how to install ASIO4ALL:

Download the installation file from the official website.
Unpack the archive and run the setup file.
The installation wizard will first ask you to accept the license agreement - check the box provided and click Next.
There is no need to edit anything in the next window, unless you want to install additional software in addition. Just click Next again.
Now specify the folder where the program files will be stored. In principle, you can leave the default one. Click Install.
The installation of ASIO4ALL will begin, wait until the installation completion notification appears.

Initial setup of Foobar2000

For example, consider one of the most popular players Foobar2000. It takes up minimal space, is free, and is still advanced enough for complex DSP audio signal processing. But now we will not talk about all the capabilities of Foobar2000, but about its initial configuration for working in ASIO mode with your sound card.

In paid audio players such as Audirvana or JRiver, ASIO compatibility is built into the original distribution, and the player itself picks up the available ASIO connections. For Foobar2000 you will need to install the ASIO support module, which has to be downloaded separately from https://www.foobar2000.org/components/view/foo_out_asio. After that, in the Preferences section of the player, select the very first line, Components. Click Install and point the program to the saved file foo_out_asio.fb2k-component. Then restart the player for the changes to take effect.

Software volume control

Many users prefer to adjust the volume directly in the software player. This is very convenient, for example, in the case of listening using desktop systems with active monitors.

If your audio path operates in ASIO mode, you will no longer be able to reduce the volume using the slider in the corner of the screen (on the Windows taskbar). You will have to use the player's own signal level control (Volume Control). However, keep in mind that if you change the volume in software, the bit-for-bit transmission of the original data to the DAC is not preserved. It is preserved only when the volume is at its maximum position, 100%.
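Why software volume breaks bit-for-bit transmission can be seen with two neighbouring sample values. The sketch below is a simplified model, assuming the player multiplies each integer sample by a gain and rounds the result; real players may work in floating point and add dither, and the name `apply_volume` is made up.

```python
def apply_volume(samples, gain):
    """Scale integer samples by `gain` and round back to integers."""
    return [int(round(s * gain)) for s in samples]


original = [100, 101]

print(apply_volume(original, 1.0))  # [100, 101]
print(apply_volume(original, 0.5))  # [50, 50]
```

At 100% the samples pass through unchanged. At 50% two different input values collapse into one, so the original data can no longer be recovered from what reaches the DAC.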

DSP

When processing a digital signal (DSP, digital signal processing), each sample is scaled to at least a 64-bit floating-point number (a double) in the range from -1 to 1. The most commonly used conversions are upsampling/downsampling and upscale/downscale. The latter changes the sample bit depth, and in the vast majority of implementations it comes down to simply scaling the 64-bit double to the desired bit depth. Besides scaling the useful signal, this transformation scales the noise in exactly the same way, so upscale does not change the signal-to-noise ratio of the original signal, while downscale increases the share of noise because the bit depth of the useful signal is reduced.
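A minimal sketch of bit-depth conversion through a floating-point intermediate follows. It assumes plain scaling with rounding and no dither, which is the simple case described above; the function names are made up for illustration.

```python
def to_float(sample: int, bits: int) -> float:
    """Map a signed integer sample of the given depth into [-1, 1)."""
    return sample / float(1 << (bits - 1))


def from_float(x: float, bits: int) -> int:
    """Map a float in [-1, 1) to a signed integer of the given depth."""
    full = 1 << (bits - 1)
    q = int(round(x * full))
    return max(-full, min(full - 1, q))  # clip to the valid range


def change_depth(sample: int, src_bits: int, dst_bits: int) -> int:
    return from_float(to_float(sample, src_bits), dst_bits)


# Upscale 16 -> 24 bits: the value is multiplied by 256 and the eight
# new low-order bits are zero, so no detail is added.
up = change_depth(1234, 16, 24)
print(up, up == 1234 * 256)

# Downscale 24 -> 16 bits: distinct 24-bit values collapse into one.
print(change_depth(25605, 24, 16), change_depth(25700, 24, 16))
```

Going up and back down returns the original 16-bit value, while going down from genuine 24-bit material discards the low-order detail.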

Upsampling/downsampling is very often done by evaluating an nth-order polynomial (usually cubic). A sequence of K samples is taken, the coefficients of the interpolating polynomial are calculated from them, and the resulting polynomial is then evaluated at the new sampling points. Ideally, according to the Nyquist-Kotelnikov theorem, upsampling can only preserve the resolution of the original signal at the new sampling rate. In a non-ideal case, noise may appear at higher harmonics. Interestingly, downsampling after upsampling will return the original value of the signal, even if distortion and noise appeared in it after upsampling.
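A minimal sketch of cubic upsampling is given below. It assumes a 4-point Lagrange polynomial and an integer factor of 2, with edge samples repeated at the boundaries; these are illustrative choices, not the method of any particular player or resampler, and the function names are made up.

```python
def cubic_lagrange(y0, y1, y2, y3, t):
    """Value at offset t (0..1) between y1 and y2 of the cubic that passes
    through four equally spaced samples located at x = -1, 0, 1, 2."""
    c0 = -t * (t - 1) * (t - 2) / 6
    c1 = (t + 1) * (t - 1) * (t - 2) / 2
    c2 = -(t + 1) * t * (t - 2) / 2
    c3 = (t + 1) * t * (t - 1) / 6
    return c0 * y0 + c1 * y1 + c2 * y2 + c3 * y3


def upsample_2x(samples):
    """Double the sampling rate: keep every input sample and insert one
    interpolated sample halfway to the next one."""
    n = len(samples)

    def at(i):
        # Repeat the edge samples outside the valid index range.
        return samples[min(max(i, 0), n - 1)]

    out = []
    for i in range(n):
        out.append(float(samples[i]))
        out.append(cubic_lagrange(at(i - 1), at(i), at(i + 1), at(i + 2), 0.5))
    return out


squares = [0, 1, 4, 9, 16, 25]   # samples of x**2 at x = 0..5
up = upsample_2x(squares)

print(up[3])    # interpolated value at x = 1.5
print(up[::2])  # every second output sample
```

For this integer factor the original samples survive untouched at the even positions, so taking every second sample restores the input exactly, whatever error the interpolated samples in between may carry.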

Studios use algorithms that combine upsampling and upscale into a single process to increase signal resolution and dynamic range. These algorithms cannot be used for real-time playback.

Another case of DSP processing is convolution, which is used to adapt the signal to the acoustic properties of the room. Here the original signal is decomposed into harmonics in a Fourier series up to the nth order. Unfortunately, all fast algorithms usually work with the amplitude of a signal of a certain frequency without taking into account the phase (which is still very difficult to measure correctly). Moreover, fast algorithms do not solve the integral, but take the average value in the range. As a result, all correction is reduced to a parametric equalizer. Simple bandpass filters introduce phase distortion at the crossover frequencies, which is why the convolution parameters need to be adjusted again and again.

MQA at high harmonics, in my opinion, incrementally encodes the first derivative (slope) of the signal's amplitude function. Knowing the frequency of the encoding harmonics, it is very easy to extract and restore the behavior of the derivative using a simple Fourier series expansion. And having a derivative, you can already do upsampling not with polynomials, but with splines with smoothing. Then, in real time, you can do upsampling and upscale with increasing resolution and dynamic range of the signal. Of course, this will not be the original Hi-Res, but it will be something.

Conclusions: Upscale does not improve the signal-to-noise ratio. Upsampling does not improve signal resolution. Upsampling from the 44100 family to the 48000 family makes sense if your device's oscillator is better for 48000. Room correction requires iterative tuning and is largely unpredictable.

How to use the ASIO4ALL program

Most aspects of working with the program are described in the instructions, which will be at your fingertips at all times. Let's look at the main ways to use the ASIO4ALL program. For example, if you need to use a universal driver for a specific application, you need to specify it through the software settings.

Using Cubase as an example:

In the toolbar, select the “Devices” tab;
Next, click “Device Setup” in the drop-down list;
In the list of drivers for playback you need to select ASIO4ALL;
Confirm the settings in the program for them to be applied.

You can open the universal driver's control panel from the settings of the audio program that uses it, or from the Windows tray through the background applications window. When installing some music programs (for example, certain versions of FL Studio), you may notice that the driver is installed alongside them, simply because the program needs it.