In this post I want to talk about how to get the best results when compressing music into the FLAC and MP3 formats. In the era of ubiquitous audio and video streaming services this topic may seem of little relevance, but practice shows otherwise. Firstly, not everyone wants to depend on third-party services, which can change their behavior at any time, from restricting access to content to deleting it altogether. Secondly, there are many places in the world where the Internet is slow, unreliable and intermittent. Thirdly, while the sound quality of online listening is acceptable for most users, it can disappoint experienced listeners with good equipment. Considering all of the above, we can assume that compressing audio recordings yourself for offline storage and listening will remain relevant for quite a long time.
Since this article will discuss Windows console applications, it is assumed that the reader is familiar with the basics of working on the command line under this operating system.
Basic Concepts
PCM (pulse code modulation) is a way of representing an analog signal in digital form. It works like this: the electrical oscillations that make up an analog audio signal are fed to the input of a device called an ADC (analog-to-digital converter). The ADC measures the level of this signal at regular intervals (the number of measurements per second is the sampling rate) and outputs the resulting values, which are then stored. In this way, a data array is formed, which is a sequence of amplitude values of the original signal. The process described is called “digitization”. The main problem with storing PCM data in “bare” form is its rather large volume, so various digital audio compression algorithms are used to make more efficient use of storage space.
CDDA (Compact Disc Digital Audio) is the good old audio CD, historically the first mass-produced digital media standard for audio recordings. Although CDDA itself is no longer very relevant today, the parameters it uses to represent sound (PCM, 16 bit / 44.1 kHz / stereo) remain the baseline for almost all published music recordings.
WAV is the standard audio format for storing uncompressed PCM audio in Windows. The format may also contain compressed data, but in practice this is extremely rare and, one might say, bad manners. Playing WAV files requires a minimum of system resources, since no additional processing is needed. Saving material in this format is almost always an intermediate step when processing sound in audio editors, CD rippers and other similar software. The bitrate of uncompressed WAV with parameters 16 bit/44.1 kHz/stereo is 1411 kbps, and the file size of a five-minute recording in this form is about 53 MB.
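The bitrate and file size quoted above follow directly from the PCM parameters. A minimal sketch of the arithmetic (the function names are illustrative, and the result ignores the small WAV header):

```python
def pcm_bitrate_bps(bit_depth: int, sample_rate_hz: int, channels: int) -> int:
    """Bits per second of uncompressed PCM audio."""
    return bit_depth * sample_rate_hz * channels


def pcm_size_bytes(bitrate_bps: int, seconds: float) -> int:
    """Size of the raw audio payload in bytes (header not included)."""
    return int(bitrate_bps * seconds / 8)


# CD-quality audio: 16 bit / 44.1 kHz / stereo
cdda_bitrate = pcm_bitrate_bps(16, 44_100, 2)
five_minutes = pcm_size_bytes(cdda_bitrate, 5 * 60)

print(f"{cdda_bitrate / 1000:.1f} kbps")    # 1411.2 kbps
print(f"{five_minutes / 1_000_000:.2f} MB")  # 52.92 MB (decimal megabytes)
```

In decimal megabytes this comes to roughly 52.9 MB; counted in binary units (MiB) the same payload is about 50.5.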
An encoder is software designed to convert WAV to some other format in order to reduce the amount of stored data.
A decoder is software or hardware used to play files compressed into the corresponding format or to convert them back into uncompressed form.
Lossy is the generic name for a family of audio formats that use lossy data compression. Typical representatives of the family are MP3, AAC, WMA, Ogg Vorbis. The main feature of lossy formats is that when material is compressed into any of them, a significant part of the original audio information is lost irretrievably and cannot be subsequently restored in any way. Due to this, a high degree of compression is achieved, while aural losses are hardly noticeable or even unnoticeable, since only data that is not critical for human perception is discarded.
Lossless is the general name for a family of audio formats that use lossless data compression. Typical representatives of the family: FLAC, Monkey's Audio, ALAC, WavPack. Unlike lossy formats, here no information is lost during compression; everything happens much like in conventional archivers. The price for complete data safety is a significantly lower degree of compression compared to lossy.
MP3 (MPEG-1 Audio Layer III) is historically the first and most common lossy compression format. Although, due to its age, MP3 no longer shines in terms of compression efficiency, its popularity remains very high thanks to its versatility: practically any hardware can play this format. Moreover, if an adequate encoder and decoder are used, MP3 sound quality is at a very decent level. The combination of these two factors makes the use of the format justified even now. The MP3 compression ratio while maintaining high sound quality is 6-9 times. The average bitrate of such an MP3 made from a 16 bit/44.1 kHz/stereo source is 150-240 kbps, and the file size of a five-minute recording in this form is 6-9 MB.
FLAC (Free Lossless Audio Codec) is the currently most popular lossless audio compression format. If any software or hardware claims to support lossless, it is almost certain that this software/hardware can play FLAC. The format is a de facto standard among lovers of high-quality sound. FLAC compression ratio is 1.2-3.5 times. The FLAC bitrate with 16-bit/44.1 kHz/stereo parameters is 400-1200 kbps, the file size of a five-minute recording in this form is 15-44 MB. For lossless formats, of which FLAC is a representative, the rule “higher bitrate - higher sound quality” does not work; the quality always remains identical to the original. The degree of compression and bitrate vary depending on the complexity of the material being compressed - for example, singing with a guitar lends itself to compression better than a recording of a symphony orchestra.
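The bitrate and file-size ranges given for MP3 and FLAC can be cross-checked against the stated compression ratios. A short sketch, assuming the uncompressed source is 16 bit/44.1 kHz/stereo PCM at 1411.2 kbps (the helper names are illustrative):

```python
PCM_KBPS = 1411.2  # 16 bit x 44.1 kHz x 2 channels


def compressed_kbps(compression_ratio: float) -> float:
    """Average bitrate after compressing PCM by the given factor."""
    return PCM_KBPS / compression_ratio


def size_mb(kbps: float, seconds: float) -> float:
    """File size in decimal megabytes for a given average bitrate."""
    return kbps * 1000 * seconds / 8 / 1_000_000


# Compression ratio ranges quoted in the text
ratios = {"MP3": (6, 9), "FLAC": (1.2, 3.5)}

for name, (weakest, strongest) in ratios.items():
    high = compressed_kbps(weakest)    # weaker compression, higher bitrate
    low = compressed_kbps(strongest)   # stronger compression, lower bitrate
    print(f"{name}: {low:.0f}-{high:.0f} kbps, "
          f"{size_mb(low, 300):.1f}-{size_mb(high, 300):.1f} MB per 5 minutes")
```

The computed ranges land close to the figures in the text: roughly 157-235 kbps for MP3 and 403-1176 kbps for FLAC.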
To conclude this section, I will provide a picture that clearly illustrates the key features of the audio formats described above:
Lossless compatible devices
For example, owners of Android devices can use the andLess player. It is capable of playing FLAC, APE, uncompressed WAV and other formats supported by Android.
The situation is worse for owners of devices on the BlackBerry platform: only the Bold 9000, the 8900 and later models can play lossless formats.
Owners of Apple devices can use the ALAC codec without any problems. It is supported by iPod (except shuffle), iPhone and iPad. For FLAC format, you can download FLAC Player from the App Store.
The FLAC codec is supported by Samsung Galaxy devices, some Sony Ericsson smartphones and iriver players.
Stationary devices from many manufacturers have also gained FLAC support. Media players and media centers let you listen to music without loss of quality and without a personal computer.
General issues
Is it possible to compress material from lossless to lossy - for example, from FLAC to MP3?
It is possible and often necessary. For example, if you want to listen to music on the go from a portable device with no audiophile pretensions, and your source material is stored in a lossless format, then it makes sense to convert the tracks you need to lossy before transferring them. This way you will reduce the file size and be able to fit significantly more music on your mobile device. You most likely will not notice any sound degradation from such a conversion at all.
Is it possible to compress material from lossy to lossless - for example, from MP3 to FLAC?
This should not be done under any circumstances, as the sound quality will not improve and the file size will increase significantly. Moreover, such pseudo-lossless files, if they later reach other people, will mislead them. Read this article on how to reject such fakes when downloading lossless from the Internet.
Is it possible to compress material from lossy to lossy - for example, MP3 with a lower bitrate to MP3 with a higher bitrate?
If you are hoping to improve quality this way, then no, there is no point: the sound will not only fail to improve, but will even deteriorate slightly. If your goal is to reduce the file size and the sound quality is not very critical, then recompressing from a higher to a lower bitrate is quite justified.
Which lossless format is better in terms of sound quality - FLAC, Monkey's Audio, WavPack?
As mentioned earlier, lossless formats compress data without loss. This means that in terms of sound quality they are all absolutely identical. You should choose a lossless format for use in each specific case, focusing solely on its compatibility with software/hardware and your personal preferences.
Do the choice of encoder and its settings affect the sound quality when compressing material into MP3?
Yes, significantly. Below I will explain which encoder and which settings give the optimal result.
Does the decoder affect the sound quality when listening to MP3 material? What's the best way to play MP3 on PC?
The MP3 decoder may affect the sound quality. Some (especially old) decoders noticeably distort the sound when playing MP3, which can create a false impression that this format as such is inferior. To play music on a PC, you should use proven player programs, preferably the latest versions. I use foobar2000, which I recommend to everyone - it has no problems with the playback quality of MP3 or other supported formats.
Do encoder settings affect the sound quality when compressing material in FLAC? What compression level is better to choose?
The FLAC encoder settings do not affect the sound quality at all. They only affect the size of the resulting files and the time spent on compression, and even then only slightly. Therefore, most often I don’t bother and encode to FLAC with the default settings, which I advise you to do too. In rare cases, when you need files of the minimum possible size and have to save every byte, it makes sense to raise the compression level to the maximum.
Does the decoder affect the sound quality when listening to FLAC material? What's the best way to play FLAC on PC?
In the case of FLAC and other lossless formats, the decoder does not affect the sound quality; the original audio data is restored with bit accuracy during playback. Therefore, to listen to FLAC, you can use any player that supports this format and that you personally like. However, I will once again recommend foobar2000 as a time-tested universal solution for playing music under Windows.
Apple Lossless Format
Apple's lossless audio codec (Apple Lossless, ALAC) lets you listen to high-quality music with no loss of quality. The format was developed by Apple for use on its own devices. It is compatible with iPod players that have a dock connector and the latest firmware. The format itself does not use any digital rights management (DRM), although the container format provides for such capabilities. It is also supported by QuickTime and is included as a feature in iTunes.
The format is available in freely distributed libraries, which makes it possible to play the files in Windows applications. In 2011, Apple published the source code of the codec, which opens up broad prospects for it. In the future, it may become a serious competitor to other formats. Tests showed good results: compressed files are 40-60% of the size of the originals. The decoding speed is also impressive, which justifies its use on mobile devices with low performance.
One disadvantage of the codec is that its files share an extension with the AAC (Advanced Audio Coding) codec: Apple Lossless data is stored in an MP4 container with the .m4a extension, which is also used for AAC. This leads to confusion, because AAC is a lossy format rather than a lossless one.
Among other formats, it is worth mentioning Windows Media Audio 9 Lossless, which is part of the Windows Media application. It works with Windows and Mac OS X. However, users do not respond very favorably to it. There are often problems with codec compatibility, and the number of supported channels is limited to six.
MP3 Encoding and Decoding with LAME
As mentioned above, in the case of MP3, the sound quality of the resulting files directly depends on the choice of encoder and its compression settings. To date, the best results for this format are produced by the LAME encoder. The original project website looks a little confusing, so I’ll immediately provide a link to the files. From the archive we need the lame.exe file. Open the command line.
For quick help on using LAME, type lame --help (information will be printed on the screen) or lame --help > usage.txt (information will be printed to the usage.txt file). For detailed help, replace --help with --longhelp.
Let's move directly to the compression functionality. Should I specify bitrate and other compression settings separately? No, this is not at all necessary; the developers did most of the work for us, making the encoder as easy to use as possible. LAME contains a set of presets that allow the user to get excellent results with a minimum of technical knowledge. Almost all presets use the VBR (Variable Bit Rate) mode, which gives the optimal ratio of sound quality to file size. Help for presets can be accessed with the command lame --preset help. Let's look at the most relevant presets.
Preset standard. Description from the built-in help:
This preset should be "transparent" to most people on most music, while being of fairly high quality.
The average bitrate when using standard is 170-210 kbps, the HF cutoff starts at approximately 18.7 kHz. I recommend using this preset as the default mode; it is the most balanced combination of characteristics.
Encoding WAV to MP3 with this preset:
lame --preset standard infile.wav outfile.mp3, where infile.wav is the name of the source WAV file, outfile.mp3 is the name of the resulting MP3 file (the latter can be omitted).
Preset extreme. Description from the built-in help:
If you have extremely good hearing and the same equipment, this preset will give slightly higher quality than standard.
The average bitrate when using extreme is 220-260 kbps, and no HF cutoff (low-pass filter) is applied. I recommend using this preset in cases where you want to get an MP3 with very high sound quality. When listening to music on average equipment, this preset, compared to standard, usually does nothing except increase the file size.
Encoding WAV to MP3 with this preset:
lame --preset extreme infile.wav outfile.mp3
Preset insane. Unlike previous presets that use VBR, this one uses a constant bitrate mode of 320 kbps. Description from the built-in help:
This preset will be overkill for most people in most situations, but if you need the highest quality without regard to file size, then go ahead.
I do not recommend using insane because it is practically pointless. If you are tormented by attacks of perfectionism, use one of the lossless formats, for example FLAC, rather than MP3 with sky-high settings.
Decoding MP3 to WAV:
lame --decode infile.mp3 outfile.wav
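For batch work, the same LAME commands can be assembled from a script. The sketch below only builds the argument lists using the options shown above (--preset and --decode). The function names and the default lame.exe location are assumptions made for illustration; actually running the commands requires lame.exe to be reachable.

```python
import subprocess
from pathlib import Path

# Presets discussed in this section
LAME_PRESETS = ("standard", "extreme", "insane")


def lame_encode_args(wav_path, preset="standard", lame_exe="lame.exe"):
    """Build the command line for WAV -> MP3 with a named preset."""
    if preset not in LAME_PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    wav = Path(wav_path)
    return [lame_exe, "--preset", preset, str(wav), str(wav.with_suffix(".mp3"))]


def lame_decode_args(mp3_path, lame_exe="lame.exe"):
    """Build the command line for MP3 -> WAV."""
    mp3 = Path(mp3_path)
    return [lame_exe, "--decode", str(mp3), str(mp3.with_suffix(".wav"))]


def run(args):
    """Run a prepared command; raises CalledProcessError on failure."""
    subprocess.run(args, check=True)


print(lame_encode_args("infile.wav", "extreme"))
print(lame_decode_args("infile.mp3"))
```

Passing the arguments as a list rather than a single string avoids quoting problems with file names that contain spaces.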
Digital players with lossless support
Users respond well to the software players jetAudio, foobar2000 and Spider Player. There are no fundamental differences between them. The choice of player comes down to a music lover's subjective opinion about how convenient its interface is for lossless playback. Trying out these players is a good way to get acquainted with lossless formats.
The Apple Lossless format is played using iTunes. In addition, this codec is supported by the popular video player VLC.
Owners of Apple-compatible computers can use two interesting programs: Vox and Cog.
They support the following lossless formats:
In addition to this, there are many useful features, for example, Last.fm services are supported.
Owners of Windows computers can use any application that is compatible with lossless codecs, such as foobar2000 or Winamp. Winamp requires special plugins. Lossless music also plays well in iTunes and KMPlayer. An advantage of iTunes that other players do not have is the ability to support tags.
FLAC Encoding and Decoding
Let's look at file compression using the FLAC encoder, which you can download here. To work, we need the flac.exe file. If you run it without parameters, you will receive brief help on how to use the encoder. For detailed help, type flac --help (information will be displayed on the screen) or flac --help > usage.txt (information will be displayed in the usage.txt file).
Encoding WAV to FLAC with the default compression level (5):
flac infile.wav
Encoding WAV to FLAC with a specified compression level:
flac -n infile.wav, where n is a number from 0 (minimum compression) to 8 (maximum compression).
Decoding FLAC to WAV:
flac -d infile.flac
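The three FLAC commands above can be generated from a script in the same way. This sketch only builds the argument lists; the function names and the default flac.exe location are illustrative assumptions, and nothing here runs the encoder.

```python
DEFAULT_LEVEL = 5  # default compression level mentioned above


def flac_encode_args(wav_path, level=DEFAULT_LEVEL, flac_exe="flac.exe"):
    """Build the command line for WAV -> FLAC at compression level 0..8."""
    if level not in range(0, 9):
        raise ValueError("compression level must be between 0 and 8")
    return [flac_exe, f"-{level}", str(wav_path)]


def flac_decode_args(flac_path, flac_exe="flac.exe"):
    """Build the command line for FLAC -> WAV."""
    return [flac_exe, "-d", str(flac_path)]


print(flac_encode_args("infile.wav"))           # default level
print(flac_encode_args("infile.wav", level=8))  # maximum compression
print(flac_decode_args("infile.flac"))
```

The resulting lists can be passed to subprocess.run(args, check=True) once flac.exe is available.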
To make sure that FLAC truly compresses data without loss, you can use any software that can compare files bit by bit - for example, the fc utility built into Windows. For the experiment, we select any WAV file and do the following transformations with it: original.wav (original file) > compressed.flac (we encode the file into FLAC) > decompressed.wav (we decode FLAC back to WAV). Next, compare original.wav and decompressed.wav using fc in binary comparison mode:
fc /b original.wav decompressed.wav
Upon completion of the check, the utility will display the message “no differences between the files were found,” which indicates their complete identity. This means that not a single bit was lost when converting to FLAC and back, which was what needed to be proven. If you do the same experiment with MP3, the result will be completely different; there will be a huge number of differences between the files.
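If fc is not at hand, the same binary comparison is easy to reproduce. The following is a rough Python equivalent of the check, not a replacement for the utility; the function name and the temporary demo files are made up for illustration.

```python
import tempfile
from pathlib import Path


def first_difference(path_a, path_b, chunk_size=1 << 16):
    """Return the offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # Common prefix matches, so one file is simply shorter
                return offset + min(len(a), len(b))
            if not a:  # both files ended at the same point
                return None
            offset += len(a)


# Demo on throwaway files standing in for original.wav / decompressed.wav
with tempfile.TemporaryDirectory() as tmp:
    data = b"RIFF" + bytes(1000)
    damaged = bytearray(data)
    damaged[500] ^= 0xFF  # flip one byte

    original = Path(tmp, "original.bin")
    restored = Path(tmp, "restored.bin")
    changed = Path(tmp, "changed.bin")
    original.write_bytes(data)
    restored.write_bytes(data)
    changed.write_bytes(bytes(damaged))

    identical_result = first_difference(original, restored)
    changed_result = first_difference(original, changed)

print(identical_result)  # None
print(changed_result)    # 500
```

A result of None corresponds to the "no differences" message from fc; any integer is the position of the first mismatch.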
WavPack format
WavPack is another freely available audio codec that compresses audio without loss of quality. It offers a distinctive hybrid mode that creates two files. The first is a relatively small lossy .wv file that can be played on its own. The second, a .wvc file, holds the correction data for the .wv file and, combined with it, makes it possible to fully restore the original. Some users may find this approach promising, since there is no need to choose between the two types of compression: both are always available.
Also worth mentioning is Lagarith, a lossless video codec. It works quickly and efficiently.
Automating compression using foobar2000
Working through the console is, of course, good, but for regular use, I would like to make the format conversion process simpler and more convenient. The foobar2000 player, which I already mentioned above, is perfect for solving this problem. This player has a built-in file converter, the configuration of which we will look at step by step.
First, let's add presets for LAME to the converter:
1) Open foobar, add several files to its playlist. Right-click on any track from the playlist, select Convert >... In the Converter Setup window that opens, in the Current Settings block, select Destination - here you can configure where and how foobar will save the created files. If necessary, adjust these parameters, then click Back.
2) Click Output format > Add New, in the window that appears, fill in the fields as in the screenshot:
In the Encoder file field you should specify the full path to the lame.exe file. After everything is filled in, click OK, then Back.
3) Returning to the Converter Setup window, save the created preset with the Save button. We go through steps 2 and 3 again, but this time in the parameters and name of the preset we change “standard” to “extreme”. As a result, two items will be added to the Saved presets list, launching LAME in standard and extreme modes. Now you can re-encode any files from the foobar playlist to MP3 by simply selecting them with the mouse and selecting Convert > the name of the desired preset in the context menu:
Setting up FLAC is even easier. In the Converter Setup window, in the Current Settings block, select Output format, then select FLAC in the list of presets, and click Back. Returning to Converter Setup, save the new preset with the name FLAC:
That's it, now FLAC compression is available through the Convert context menu in the same way as MP3 compression. A caveat: when you first start the conversion process, a window will open in which you will need to indicate where the flac.exe file is located.
By the way, users often ask how to split an album into separate tracks when it was downloaded as one large FLAC file accompanied by a cue sheet. Having configured foobar as described above, we can do this in a few clicks: open the file with the .cue extension, select the tracks that appear in the playlist and convert them into separate files via the Convert > FLAC context menu.
Listening equipment
It is difficult to recommend anything specific to fans of high-quality sound equipment (Hi-Fi or Hi-End). The choice in this area is limited only by budget and taste: there are many options for equalizers, amplifiers and speakers. PC owners looking for high-quality speakers for their computer are better off choosing budget monitor speakers from any well-known brand. Users respond well to the Microlab SOLO series. To make lossless music sound good, it is important to buy speakers with a subwoofer, since two-way speakers cannot cope with reproducing the lowest frequencies.
Preparing for the test
2.1 Equipment
All power saving technologies and HyperThreading are disabled in the motherboard BIOS. To eliminate delays associated with writing/reading the HDD, a RAM disk (Z:, 3 GB) is used for tests.
2.2 Software
Where possible, 64-bit versions of encoders are used. For the test, the latest version of the portable foobar2000 with the necessary plugins and encoders was specially installed, after which converter profiles were previously created for each encoder.
OS:
Windows 10 Pro 64-bit
Video driver:
ForceWare 385.41 WHQL
foobar2000:
Core (2017-07-10 05:24:08 UTC)
foobar2000 core 1.3.16
foo_benchmark.dll (2017-09-04 20:26:52 UTC) Decoding Speed Test 1.2.3
foo_bitcompare.dll (2017-09-04 20:26:52 UTC) Binary Comparator 2.1.1
foo_converter.dll (2017-07-10 05:22:28 UTC) Converter 1.5
foo_input_la.dll (2010-12-08 22:45:00 UTC) Lossless Audio(La) decoder 0.01
foo_input_monkey.dll (2017-09-04 20:26:54 UTC) Monkey's Audio Decoder 2.1.7
foo_input_ofr.dll (2017-09-05 14:18:58 UTC) OptimFROG Lossless/DualStream Decoder 1.31
foo_input_std.dll (2017-07-10 05:22:04 UTC) FFmpeg Decoders 3.2.4 Standard Input Array 1.0
foo_input_tak.dll (2017-09-04 20:26:54 UTC) TAK Decoder 0.4.7
foo_input_tta.dll (2017-09-04 20:26:54 UTC) TTA Audio Decoder 3.4
foo_ui_std.dll (2017-07-10 05:22:34 UTC) Default User Interface 0.9.5
Encoders:
FLAC 1.3.2 GIT20170314 x64 ICL
Flake 0.11
FLACCL 2.1.6
Lossless Audio (LA) Compressor v0.4b
Monkey's Audio Console Front End v4.22
OptimFROG Lossless Audio Compressor v5.100 x64
refalac 1.64
TAK v2.3.0
TTA 2.3 64-bit SSE4
WMA 0.2.9c 64-bit
WavPack v5.1.0
The player process (as well as the converter in its settings) had real-time priority set.
2.3 Musical material
For the test, a CD image of an album by a contemporary electronic music composer was chosen. The recording has a wide frequency range and a relatively good dynamic range.
2.4 Selection of encoding parameters
Initially, I planned to compare the encoders at their maximum compression settings. This turned out to be impractical, and the value of such a test would be very doubtful anyway. For example, encoding a 30-second segment of ordinary audio material with the OptimFROG codec at maximum settings takes 230 seconds (an encoding speed of about 0.13x). I therefore formulated the following requirements:
- encoding speed at least 1x
- decoding speed at least 2x
- ability to seek within compressed files.
Since compression gains diminish as the settings are raised, the compression ratio with the parameters I selected is practically no different from the maximum.
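The speed figures used in these requirements are simply the duration of the audio divided by the processing time. A small sketch of that arithmetic and of the selection rule, with illustrative names and thresholds taken from the list above:

```python
MIN_ENCODE_SPEED = 1.0  # at least real time
MIN_DECODE_SPEED = 2.0  # at least twice real time


def speed_factor(audio_seconds: float, processing_seconds: float) -> float:
    """How many seconds of audio are processed per second of wall time."""
    return audio_seconds / processing_seconds


def meets_requirements(encode_speed: float, decode_speed: float,
                       seekable: bool) -> bool:
    """Apply the three selection criteria to one encoder configuration."""
    return (encode_speed >= MIN_ENCODE_SPEED
            and decode_speed >= MIN_DECODE_SPEED
            and seekable)


# OptimFROG at maximum settings: 30 s of audio encoded in 230 s
optimfrog_max = speed_factor(30, 230)
print(f"{optimfrog_max:.2f}x")  # 0.13x

# Hypothetical decode speed and seek support, for illustration only
print(meets_requirements(optimfrog_max, decode_speed=5.0, seekable=True))  # False
```

The decode speed and seek flag in the last call are placeholders rather than measured values; the configuration fails on encoding speed alone.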
General converter parameters:
Output bit depth: Auto
Dither: Never
Output folder: Source track folder
Filename pattern: %filename%
Processing: None
When finished: Do nothing
Encoders and parameters
Note: For encoders that cannot encode on the fly, the input file was specified directly in the parameters (instead of the %s variable). This is done so that encoding goes directly from the source file to the final one, without creating a temporary file (which takes a significant amount of time and distorts the results). Below, the parameters for these encoders are listed with the %s variable.
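Results
The test results are presented in the table below, sorted from strongest to weakest compression. The compression ratio is the compressed size as a percentage of the original, so lower is better.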
Codec | Compression ratio | Bitrate, kbit/s | File size, MB | Compression time, s | Compression speed | Decoding time, s | Decoding speed |
LA | 66.55% | 939 | 535.55 | 589.28 | 8.11x | 519.90 | 9.20x |
OFR | 66.74% | 941 | 537.06 | 1028.78 | 4.65x | 558.19 | 8.57x |
APE | 67.23% | 948 | 540.98 | 252.92 | 18.91x | 327.8 | 14.59x |
TAK | 67.79% | 956 | 545.53 | 128.91 | 37.10x | 12.62 | 379.02x |
TAK (2 cores) | 67.79% | 956 | 545.53 | 78.92 | 60.60x | 12.67 | 377.36x |
WV | 68.75% | 970 | 553.28 | 1741.86 | 2.74x | 39.86 | 119.99x |
FLACCL | 69.51% | 980 | 559.36 | 14.01 | 341.40x | 10.99 | 435.21x |
TTA | 69.60% | 982 | 560.12 | 22.94 | 208.51x | 32.42 | 147.53x |
Flake | 69.67% | 983 | 560.66 | 432.33 | 11.06x | 14.07 | 340.03x |
FLAC | 69.90% | 986 | 562.49 | 39.98 | 119.61x | 9.13 | 523.48x |
ALAC | 71.00% | 1002 | 571.38 | 42.56 | 112.37x | 18.92 | 252.82x |
WMA | 71.68% | 1011 | 576.81 | 38.72 | 123.52x | 32.26 | 148.28x |
PCM | 100% | 1411 | 804.67 | - | - | - | - |
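The compression ratio and bitrate columns can be recomputed from the file sizes alone, which is a handy sanity check on the table. A sketch using four of the rows, assuming the PCM source is 16 bit/44.1 kHz/stereo at 1411.2 kbit/s (the dictionary layout and function names are illustrative):

```python
PCM_SIZE_MB = 804.67  # size of the uncompressed source from the table
PCM_KBPS = 1411.2     # 16 bit x 44.1 kHz x 2 channels

# codec: (file size in MB, published ratio in %, published bitrate in kbit/s)
table = {
    "OFR": (537.06, 66.74, 941),
    "FLAC": (562.49, 69.90, 986),
    "ALAC": (571.38, 71.00, 1002),
    "WMA": (576.81, 71.68, 1011),
}


def ratio_percent(size_mb: float) -> float:
    """Compressed size as a percentage of the uncompressed size."""
    return size_mb / PCM_SIZE_MB * 100


def average_kbps(size_mb: float) -> float:
    """Average bitrate implied by the compressed size."""
    return size_mb / PCM_SIZE_MB * PCM_KBPS


for codec, (size_mb, published_ratio, published_kbps) in table.items():
    print(f"{codec}: {ratio_percent(size_mb):.2f}% "
          f"(table: {published_ratio:.2f}%), "
          f"{average_kbps(size_mb):.1f} kbit/s (table: {published_kbps})")
```

The recomputed values agree with the published ones to within 0.02 percentage points and 1 kbit/s; the small residual is consistent with the table figures having been rounded or truncated.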