What is Code Compression?

Mike Howells

Last Modified Date: February 20, 2024

Computer code can be considered the DNA of the digital world: the raw data that makes up all programs, graphic files, and digital music. The larger and more complex the file or application, the more code it contains. Because hard drives and other types of data storage have finite capacity, code compression is often used to shrink files temporarily. It works by using special algorithms to make a smaller piece of code stand for a larger piece. Data can be compressed and decompressed in this way, as long as the decompression program knows which algorithm was used to pack it.

Most people who know how to use a computer have at least a passing familiarity with the way data is stored, in terms of kilobytes, megabytes, gigabytes, and so on. What they may not understand is the relationship between these units of measurement and the actual words, graphics, music, and programs they manipulate on-screen. In simple text encodings such as ASCII, a single byte represents a single character of text, and each byte is made up of eight smaller units known as bits. Bits are the raw components of digital information, and the way they are arranged determines which letter of the alphabet, number, or other character is stored.
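To make the relationship between characters, bytes, and bits concrete, here is a minimal Python sketch. It assumes plain ASCII text, where each character occupies exactly one byte; the variable names are only illustrative.

```python
# Turn a short piece of text into bytes, then show the eight bits in each byte.
text = "Hi"
data = text.encode("ascii")               # one byte per character in ASCII
bits = [format(b, "08b") for b in data]   # each byte written as 8 binary digits

for ch, pattern in zip(text, bits):
    print(ch, pattern)
# H 01001000
# i 01101001
```

Other encodings, such as UTF-8, may use more than one byte for a single character, so the one-to-one mapping holds only under the ASCII assumption.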

A code compression program takes the bits and bytes that make up a given file and encodes them so that one or two characters of the compressed version represent a larger number of the original. The two main types of code compression are known as lossy and lossless. Lossy compression can be used in cases where some data loss is acceptable, such as music files in which some frequencies are inaudible to most listeners and can be dropped. Greater size reduction can typically be achieved using this method, and the MP3 format is an example of this type of compressed file. A lossless algorithm, by contrast, works by removing redundancy: in a simple case, it counts the number of times a given section of data repeats and stores a short code recording that tally in place of the repeats.
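The idea of counting how often a piece of data repeats and storing the tally is commonly called run-length encoding. The sketch below is a toy illustration of that idea, not the method any particular compression program uses; `rle_encode` and `rle_decode` are hypothetical helper names. Because the tallies can be expanded back into the exact original, this technique loses no data.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (character, count) pair."""
    return [(ch, len(list(group))) for ch, group in groupby(text)]

def rle_decode(pairs):
    """Expand (character, count) pairs back into the original text."""
    return "".join(ch * count for ch, count in pairs)

encoded = rle_encode("AAAABBBCCD")
print(encoded)               # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded))   # AAAABBBCCD
```

Run-length encoding only pays off when the data contains long runs of repeated values. On data with few repeats, the list of tallies can end up larger than the input.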

The basic functional difference between lossy and lossless compression is that lossy compression permanently discards the data judged least important, whereas lossless compression keeps everything, so the original can be rebuilt exactly. Lossless compression therefore produces larger compressed files, but it retains the original file quality. Text documents and other similar files, in which information cannot be lost, must be compressed in this way.
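The contrast can be sketched in a few lines of Python. The lossless half uses the standard-library `zlib` module. The lossy half uses a simple rounding step (`quantize`, a hypothetical helper) as a stand-in for the far more sophisticated processing that real audio formats perform; the sample values are made up for illustration.

```python
import zlib

# Lossless: the round trip returns exactly the bytes that went in.
original = b"the quick brown fox " * 50
packed = zlib.compress(original)
restored = zlib.decompress(packed)
print(len(original), len(packed), restored == original)

# Lossy (toy stand-in): round each sample to the nearest multiple of `step`.
def quantize(samples, step):
    return [round(s / step) * step for s in samples]

samples = [3, 7, 12, 18, 24]
approx = quantize(samples, 10)
print(approx)   # [0, 10, 10, 20, 20]
```

After quantizing, several different inputs share the same value, so the original samples cannot be recovered. That is the sense in which lossy compression trades accuracy for size.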

Generally speaking, a compressed file cannot be used or manipulated unless it is first decompressed. Compression is therefore a temporary state, used mainly for storage or transmission. Compressed music and video files are a partial exception: they still have to be decoded, but playback programs can do so on the fly, a little at a time, rather than unpacking the whole file first.
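Decoding on the fly can be illustrated with incremental decompression: the program feeds in small pieces of the compressed data and uses each decoded piece as soon as it is available. The sketch below uses `zlib.decompressobj` from the Python standard library as a stand-in. Real media players rely on format-specific decoders, and `decode_in_chunks` is a hypothetical helper name.

```python
import zlib

payload = b"la " * 1000               # stand-in for a long recording
compressed = zlib.compress(payload)

def decode_in_chunks(data, chunk_size=8):
    """Yield decoded pieces while reading the compressed data a few bytes at a time."""
    decoder = zlib.decompressobj()
    for start in range(0, len(data), chunk_size):
        piece = decoder.decompress(data[start:start + chunk_size])
        if piece:
            yield piece               # a player would hand this to the speakers
    tail = decoder.flush()            # anything still buffered at the end
    if tail:
        yield tail

played = b"".join(decode_in_chunks(compressed))
print(played == payload)
```

Only a small window of decoded data needs to be held in memory at any moment, which is why playback can begin before the whole file has been read.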