How to Work with .tar.gz Archives in Linux
As software evolves and data volumes grow, files and directory trees keep getting larger. To make them easier to store and transfer, they are often packaged into archives. In Linux, one of the most common formats is still .tar.gz.
Despite its popularity, beginners often have questions: what’s the difference between .tar and .tar.gz, how to open such archives, and which commands to use.
Let’s break down how these files work and how to handle them in Linux.
What are .tar and .tar.gz files
The name .tar comes from Tape Archive. When the format was first introduced, data was often stored on magnetic tapes, and tar was used to package files before writing them to tape.
Today, tapes are mostly a thing of the past, but the format turned out to be so convenient that it is still widely used.
It’s important to understand one detail: tar itself does not compress data. It simply bundles files into a single archive. That’s why a .tar file is about the same size as the original data, and usually slightly larger, since tar adds a header for every file and pads the contents to 512-byte blocks.
To reduce the size, additional compression tools are used. The most common one is gzip. When a .tar archive is compressed with gzip, it becomes a .tar.gz file.
The two parts play different roles:
- .tar — a container that groups files together
- gzip — a tool that compresses that container
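The two roles can be seen side by side in a small sketch. The directory and file names (demo, a.txt, b.txt, two-step, one-step) are made up for illustration; -c (create) and -z (filter through gzip) are standard tar options.

```shell
# Work in a throwaway directory so nothing real is touched.
work=$(mktemp -d)
cd "$work"
mkdir demo
printf 'hello\n' > demo/a.txt
printf 'world\n' > demo/b.txt

# Two steps: bundle with tar, then compress the bundle with gzip.
tar -cf two-step.tar demo
gzip two-step.tar              # replaces two-step.tar with two-step.tar.gz

# One step: -z tells tar to pass the data through gzip itself.
tar -czf one-step.tar.gz demo

# Both archives list the same entries.
tar -tzf two-step.tar.gz
tar -tzf one-step.tar.gz
```

In everyday use the one-step form is the usual choice; the two-step form mainly shows what happens under the hood.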
There are other compression methods as well, for example:
- .tar.bz2 — uses the bzip2 algorithm
- .tar.zst — uses zstd
- .tar.br — uses Brotli
Each option differs in speed and compression ratio, but the general idea is the same.
Working with .tar.gz archives in Linux
In Linux, the main tool for working with these archives is still the tar utility. It can both create archives and extract them.
The simplest extraction command looks like this:
tar -xvf yourarch.tar.gz
Let’s break down the options:
- -x — extract files from the archive
- -v — show the list of extracted files
- -f — specify the archive file
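After running the command, the contents will be extracted into the current directory. GNU tar recognizes gzip compression automatically when reading an archive; adding -z (tar -xzvf yourarch.tar.gz) names the gzip filter explicitly and may be required by older tar implementations.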
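To try the command safely, here is a minimal sketch that first builds a small sample archive and then extracts it in an empty directory. The names project and README and the temporary directories are assumptions for the demo only.

```shell
# Build a small sample archive (hypothetical contents).
src=$(mktemp -d)
mkdir "$src/project"
printf 'readme\n' > "$src/project/README"
tar -czf "$src/yourarch.tar.gz" -C "$src" project

# Extract it into the current directory; -z states the gzip filter explicitly.
dest=$(mktemp -d)
cd "$dest"
tar -xzvf "$src/yourarch.tar.gz"

# Show what was unpacked.
ls -R project
```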
Extracting to a different directory
Sometimes you need to extract an archive into a specific directory instead of the current one. For that, use the -C option:
tar -xf yourarch.tar.gz -C /tmp/yourdir
In this case, the archive contents will be extracted into /tmp/yourdir. The target directory must already exist; tar does not create it.
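Since -C expects an existing directory, a common pattern is to create it with mkdir -p right before extracting. The sketch below uses a temporary path in place of /tmp/yourdir; file.txt is a hypothetical sample file.

```shell
# Sample archive with one file (hypothetical contents).
src=$(mktemp -d)
printf 'data\n' > "$src/file.txt"
tar -czf "$src/yourarch.tar.gz" -C "$src" file.txt

# Stand-in for /tmp/yourdir; mkdir -p also creates missing parent directories.
target="$src/out/yourdir"
mkdir -p "$target"

# Extract into the target directory instead of the current one.
tar -xzf "$src/yourarch.tar.gz" -C "$target"
ls "$target"
```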
Viewing archive contents without extracting
It can be useful to check what’s inside an archive before extracting it. The tar utility can list its contents:
tar -tf yourarch.tar.gz
The -t option prints the list of files and directories without extracting them.
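Listing combines well with -v and with ordinary text tools. A short sketch, assuming a sample archive with a hypothetical notes.txt file inside a subdirectory:

```shell
# Sample archive with a nested file (hypothetical contents).
src=$(mktemp -d)
mkdir -p "$src/yourdir/sub"
printf 'x\n' > "$src/yourdir/sub/notes.txt"
tar -czf "$src/yourarch.tar.gz" -C "$src" yourdir

# Names only.
tar -tzf "$src/yourarch.tar.gz"

# With -v: permissions, owner, size and modification time as well.
tar -tzvf "$src/yourarch.tar.gz"

# Filter the listing to find a particular entry in a large archive.
tar -tzf "$src/yourarch.tar.gz" | grep 'notes'
```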
Extracting specific files
If the archive is large, there’s no need to extract everything. You can extract only selected items:
tar -xf yourarch.tar.gz yourdir2 yourfile3 yourdir33
In this case, only the specified files and directories will be extracted.
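It’s important to use exact names as they appear inside the archive, including any leading directory or ./ prefix shown by tar -tf. If a name does not match, tar reports that it was not found in the archive and exits with an error.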
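A practical way to get the names right is to list the archive first and copy the entries from that output. The sketch below assumes a sample archive with three hypothetical members (yourdir2, yourfile3, unwanted) and also shows what happens when a name does not match.

```shell
# Sample archive with three members (hypothetical contents).
src=$(mktemp -d)
mkdir "$src/yourdir2"
printf 'two\n' > "$src/yourdir2/inside.txt"
printf 'three\n' > "$src/yourfile3"
printf 'other\n' > "$src/unwanted"
tar -czf "$src/yourarch.tar.gz" -C "$src" yourdir2 yourfile3 unwanted

dest=$(mktemp -d)
cd "$dest"

# Look up the exact member names first...
tar -tzf "$src/yourarch.tar.gz"

# ...then pass the ones you need after the archive name.
tar -xzf "$src/yourarch.tar.gz" yourdir2 yourfile3

# A name that is not in the archive makes tar exit with a non-zero status.
if ! tar -xzf "$src/yourarch.tar.gz" no-such-file 2>/dev/null; then
    echo "no-such-file is not in the archive"
fi
```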
The gzip utility
Although tar is often used together with gzip, gzip itself works independently and can be used directly.
To compress a file:
gzip yourarch.tar
After running the command, the original file will be replaced with a compressed archive yourarch.tar.gz.
To decompress it, use the -d option:
gzip -d yourarch.tar.gz
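This decompresses the file and restores the original .tar archive; the .tar.gz file is removed in the process. The files inside the archive are not unpacked, so you still need tar to extract them.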
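The round trip can be checked in a scratch directory. This is a small sketch with a hypothetical file.txt; gzip -t, which tests the integrity of a compressed file, is included as an optional extra step.

```shell
# Scratch directory with a small uncompressed tar archive.
work=$(mktemp -d)
cd "$work"
printf 'payload\n' > file.txt
tar -cf yourarch.tar file.txt

gzip yourarch.tar            # yourarch.tar is replaced by yourarch.tar.gz
gzip -t yourarch.tar.gz      # optional: verify the compressed file is intact
gzip -d yourarch.tar.gz      # back to yourarch.tar; the .gz file is removed

# The restored file is a plain tar archive again.
tar -tf yourarch.tar
```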
Compressing while keeping the original file
Sometimes you may want to keep the original file unchanged. In that case, use the -c option, which writes the compressed data to standard output and leaves the input file alone; redirect that output to a new file:
gzip -c yourarch.tar > newyourarch.tar.gz
The original archive remains intact.
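The sketch below runs the -c form in a scratch directory and, as an alternative, the -k (keep) option. The -k option is available in GNU gzip 1.6 and later, so treat it as an assumption about your installed version; file.txt is a hypothetical sample file.

```shell
# Scratch directory with a small uncompressed tar archive.
work=$(mktemp -d)
cd "$work"
printf 'payload\n' > file.txt
tar -cf yourarch.tar file.txt

# -c writes the compressed data to standard output; the shell redirects it.
gzip -c yourarch.tar > newyourarch.tar.gz

# -k keeps the input and writes yourarch.tar.gz next to it (gzip 1.6+).
gzip -k yourarch.tar

# All three files are now present.
ls
```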
Compressing and decompressing files in a directory
gzip cannot pack a directory into a single file, but with the -r option it walks a directory recursively and compresses each file in it individually.
To compress files in a directory:
gzip -r yourdir
This replaces every file inside yourdir, including those in subdirectories, with its own compressed .gz version. The directory structure itself stays as it is.
To decompress them:
gzip -dr yourdir
This command decompresses all .gz files within the directory and its subdirectories, restoring the original files.
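The difference between gzip -r and a .tar.gz archive is easiest to see on a small tree. In this sketch the layout (yourdir, sub, a.txt, b.txt) is hypothetical and GNU gzip is assumed: tar -czf produces one archive for the whole directory, while gzip -r produces one .gz file per original file.

```shell
# Scratch directory with a small nested tree.
work=$(mktemp -d)
cd "$work"
mkdir -p yourdir/sub
printf 'one\n' > yourdir/a.txt
printf 'two\n' > yourdir/sub/b.txt

# One archive for the whole tree.
tar -czf yourdir.tar.gz yourdir

# Per-file compression: each file is replaced by its own .gz.
gzip -r yourdir
compressed=$(find yourdir -type f | sort)
printf '%s\n' "$compressed"

# Undo it: every .gz inside the tree is decompressed in place.
gzip -dr yourdir
find yourdir -type f | sort
```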
Conclusion
The .tar.gz format has long been a standard in the Linux ecosystem. It is widely used for distributing software, source code, and various data packages. Despite the seemingly complex double extension, the principle is simple: tar bundles files, and gzip compresses them.
Once you learn a few basic commands, working with most archives in Linux becomes straightforward. Over time, these operations become routine and almost automatic.