How to Work with .tar.gz Archives in Linux

As software evolves and data volumes grow, files become larger and heavier. To make them easier to store and transfer, they are often packaged into archives. In Linux, one of the most common formats is still .tar.gz.

Despite its popularity, beginners often have questions: what’s the difference between .tar and .tar.gz, how to open such archives, and which commands to use.

Let’s break down how these files work and how to handle them in Linux.

What are .tar and .tar.gz files

The name .tar comes from Tape Archive. When the format was first introduced, data was often stored on magnetic tapes, and tar was used to package files before writing them to tape.

Today, tape is rarely used outside backup and long-term storage systems, but the format turned out to be so convenient that it is still widely used.

It’s important to understand one detail: tar itself does not compress data. It simply bundles files into a single archive. That’s why the size of a .tar file may be very close to the total size of the original data.

To reduce the size, additional compression tools are used. The most common one is gzip. When a .tar archive is compressed with gzip, it becomes a .tar.gz file.

The two parts play different roles:

  • .tar — a container that groups files together
  • gzip — a tool that compresses that container
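
The difference between bundling and compressing is easy to check by comparing sizes. The following is a minimal sketch, assuming a scratch directory from mktemp and made-up file names (demo, file1.txt and so on are not from any real project):

```shell
# Scratch directory so nothing in the current directory is touched
work=$(mktemp -d)
cd "$work"
mkdir demo

# Three repetitive (and therefore highly compressible) text files
for i in 1 2 3; do
  awk -v n="$i" 'BEGIN { for (j = 0; j < 20000; j++) print "sample line", n }' > "demo/file$i.txt"
done

tar -cf demo.tar demo             # bundle only, no compression
gzip -c demo.tar > demo.tar.gz    # compress the bundle, keep demo.tar

raw_size=$(cat demo/file*.txt | wc -c)
tar_size=$(wc -c < demo.tar)
gz_size=$(wc -c < demo.tar.gz)
echo "files: $raw_size bytes, tar: $tar_size bytes, tar.gz: $gz_size bytes"
```

The .tar file should come out slightly larger than the raw data, because tar adds headers and padding, while the .tar.gz file should be much smaller for repetitive input like this.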

There are other compression methods as well, for example:

  • .tar.bz2 — uses the bzip2 algorithm
  • .tar.zst — uses zstd
  • .tar.br — uses Brotli

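Each option differs in speed and compression ratio: bzip2 usually produces smaller files than gzip but runs slower, while zstd is designed for fast compression and decompression. The general idea is the same in every case.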

Working with .tar.gz archives in Linux

In Linux, the main tool for working with these archives is still the tar utility. It can both create archives and extract them.
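
Creating an archive uses the same utility. Here is a minimal sketch, assuming a tar that supports -z (GNU tar and bsdtar both do); the project directory and its files are hypothetical and exist only for the demonstration:

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical directory tree to archive
mkdir -p project/docs
echo "hello" > project/readme.txt
echo "notes" > project/docs/notes.txt

# -c create an archive, -z filter it through gzip, -f name of the archive file
tar -czf project.tar.gz project

# List the result to confirm what was stored
tar -tzf project.tar.gz
```

The source directory is left untouched; only project.tar.gz is added next to it.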

The simplest extraction command looks like this:

tar -xvf yourarch.tar.gz

Let’s break down the options:

  • -x — extract files from the archive
  • -v — show the list of extracted files
  • -f — specify the archive file

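After running the command, the contents will be extracted into the current directory. GNU tar detects gzip compression automatically when reading an archive; with older or non-GNU implementations you may need to add the -z option (tar -xzvf yourarch.tar.gz).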

Extracting to a different directory

Sometimes you need to extract an archive into a specific directory instead of the current one. For that, use the -C option:

tar -xf yourarch.tar.gz -C /tmp/yourdir

In this case, the archive contents will be extracted into /tmp/yourdir. The directory must already exist: tar does not create it and stops with an error if it is missing.
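
Since -C expects an existing directory, it is common to create it first. A short sketch, with a throwaway archive built on the spot (src, a.txt, and the scratch paths are assumptions for illustration):

```shell
work=$(mktemp -d)
cd "$work"

# Build a small sample archive
mkdir src
echo "data" > src/a.txt
tar -czf yourarch.tar.gz src

dest="$work/yourdir"
mkdir -p "$dest"                      # make sure the target exists
tar -xf yourarch.tar.gz -C "$dest"    # extract into it
ls -R "$dest"
```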

Viewing archive contents without extracting

It can be useful to check what’s inside an archive before extracting it. The tar utility can list its contents:

tar -tf yourarch.tar.gz

The -t option prints the list of files and directories without extracting them.
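
Listing combines well with other options and tools. The sketch below, built around a hypothetical sample archive, shows a plain listing, a detailed listing with -v (permissions, owner, size, date), and a listing filtered with grep:

```shell
work=$(mktemp -d)
cd "$work"

# Sample archive with one nested directory
mkdir -p yourdir/sub
echo one > yourdir/one.txt
echo two > yourdir/sub/two.log
tar -czf yourarch.tar.gz yourdir

tar -tf yourarch.tar.gz                   # names only
tar -tvf yourarch.tar.gz                  # long listing
tar -tf yourarch.tar.gz | grep '\.log$'   # only entries ending in .log
```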

Extracting specific files

If the archive is large, there’s no need to extract everything. You can extract only selected items:

tar -xf yourarch.tar.gz yourdir2 yourfile3 yourdir33

In this case, only the specified files and directories will be extracted.

Names must match the paths stored in the archive exactly, as printed by tar -tf. If a name does not match, tar reports that it was not found in the archive and exits with an error.
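
A safe habit is to look up the stored name first and only then extract it. The following sketch assumes a throwaway archive with made-up members; yourfile33 is a deliberately misspelled name:

```shell
work=$(mktemp -d)
cd "$work"

# Sample archive with three top-level members
mkdir yourdir2
echo a > yourdir2/a.txt
echo b > yourfile3
echo c > other.txt
tar -czf yourarch.tar.gz yourdir2 yourfile3 other.txt

mkdir out
tar -tf yourarch.tar.gz | grep 'yourfile3'           # check the exact stored name
tar -xf yourarch.tar.gz -C out yourfile3 yourdir2    # extract only these two

# A misspelled member name makes tar exit with a non-zero status
if tar -xf yourarch.tar.gz -C out yourfile33 2>/dev/null; then
  typo_result=ok
else
  typo_result=failed
fi
echo "misspelled name: $typo_result"
```

Only yourfile3 and yourdir2 end up in out; other.txt stays inside the archive.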

The gzip utility

Although tar is often used together with gzip, gzip itself works independently and can be used directly.

To compress a file:

gzip yourarch.tar

After running the command, the original file is replaced with the compressed file yourarch.tar.gz.

To decompress it, use the -d option:

gzip -d yourarch.tar.gz

This decompresses the file and restores the original .tar archive, removing the .gz file. The files inside the archive are not unpacked; that still requires tar.
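
The round trip can be sketched as follows, with a small sample archive created on the spot (file.txt and the scratch directory are assumptions). The -t option of gzip, which tests a compressed file's integrity, is added here as an extra check:

```shell
work=$(mktemp -d)
cd "$work"

echo "payload" > file.txt
tar -cf yourarch.tar file.txt

gzip yourarch.tar          # yourarch.tar is replaced by yourarch.tar.gz
gzip -t yourarch.tar.gz    # verify the compressed file is intact
gzip -d yourarch.tar.gz    # yourarch.tar.gz is replaced by yourarch.tar
tar -tf yourarch.tar       # the contents are still packed inside the .tar
```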

Compressing while keeping the original file

Sometimes you may want to keep the original file unchanged. In that case, use the -c option, which writes the compressed data to standard output instead of replacing the file, and redirect that output to a new file:

gzip -c yourarch.tar > newyourarch.tar.gz

The original archive remains intact.
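
Recent versions of gzip (1.6 and later) also provide the -k option, which keeps the input file without any redirection. A brief sketch comparing both approaches on a hypothetical sample archive:

```shell
work=$(mktemp -d)
cd "$work"

echo "payload" > file.txt
tar -cf yourarch.tar file.txt

gzip -k yourarch.tar                        # creates yourarch.tar.gz, keeps yourarch.tar
gzip -c yourarch.tar > newyourarch.tar.gz   # same effect via standard output
ls -l
```

With -k the output name is chosen by gzip; with -c you pick the name yourself.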

Compressing and decompressing files in a directory

gzip cannot turn a directory into a single archive, but with the -r option it processes every file in a directory tree recursively.

To compress files in a directory:

gzip -r yourdir

This replaces each file inside yourdir, including those in subdirectories, with its own compressed .gz file. The directory structure stays in place, and no single archive is created.

To decompress them:

gzip -dr yourdir

This command decompresses all .gz files within the directory and its subdirectories, restoring the original files.
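
The contrast between recursive gzip and a tar archive of the same directory can be sketched like this (the directory layout is made up for the example):

```shell
work=$(mktemp -d)
cd "$work"

mkdir -p yourdir/sub
echo one > yourdir/one.txt
echo two > yourdir/sub/two.txt

tar -czf yourdir.tar.gz yourdir    # one archive for the whole tree, originals untouched

gzip -r yourdir                    # every file gets its own .gz and the original is removed
after_gzip=$(find yourdir -type f | sort)
echo "$after_gzip"

gzip -dr yourdir                   # bring the original files back
find yourdir -type f | sort
```

Use tar when you need one file to transfer or store, and gzip -r when the files should stay separate but take less space.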

Conclusion

The .tar.gz format has long been a standard in the Linux ecosystem. It is widely used for distributing software, source code, and various data packages. Despite the seemingly complex double extension, the principle is simple: tar bundles files, and gzip compresses them.

Once you learn a few basic commands, working with most archives in Linux becomes straightforward. Over time, these operations become routine and almost automatic.
