Linux Administration
File System Types
In this lesson
A filesystem type defines the on-disk structure — how data blocks are allocated, how metadata is stored, how the system recovers from crashes, and what features are available for managing data at scale. Choosing the wrong filesystem for a workload can mean poor performance, lost data after a crash, or hitting hard limits years later. Knowing the strengths and trade-offs of each type is a core skill for any Linux administrator.
What Makes Filesystems Different
Every filesystem must answer the same fundamental questions: how are files and directories represented? How is free space tracked? What happens if power is cut mid-write? The answers to these questions — journalling strategy, block allocation algorithm, metadata design — determine the filesystem's performance characteristics, reliability, and feature set.
Fig 1 — The three fundamental filesystem design approaches used by Linux filesystems
# Check the filesystem type of any mounted path
df -T /home
stat -f /home
# Check filesystem type of a specific device
lsblk -f /dev/sda3
# Identify filesystem type of an unmounted partition
sudo blkid /dev/sdb1
# Show detailed filesystem info (ext4/xfs specific)
sudo tune2fs -l /dev/sda3 # ext4
sudo xfs_info /dev/sdb1 # xfs (must be mounted)

# df -T /home
Filesystem     Type  1K-blocks    Used Available Use% Mounted on
/dev/sda3      ext4  102400000 4512000  92688000   5% /

# sudo tune2fs -l /dev/sda3 | grep -E "Filesystem|Block|Inode|Journal"
Filesystem volume name:   rootfs
Filesystem UUID:          a1b2c3d4-1111-2222-3333-444455556666
Filesystem features:      has_journal ext_attr resize_inode dir_index
Block count:              25600000
Block size:               4096
Inode count:              6553600
Journal size:             128M
What just happened? tune2fs -l read the ext4 superblock — the master record of everything about the filesystem. The has_journal feature flag confirms journalling is active, and the 128M journal size shows how much space is reserved for write-ahead logging before data blocks are committed.
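To see the full superblock rather than a filtered view, dumpe2fs from the same e2fsprogs package prints everything tune2fs -l shows plus per-block-group detail. A minimal sketch, assuming the same /dev/sda3 device as above:
# Print only the superblock header, skipping per-block-group detail
sudo dumpe2fs -h /dev/sda3
# The full output adds free block and inode counts for every block group
sudo dumpe2fs /dev/sda3 | less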
ext4 — The Linux Workhorse
ext4 (fourth extended filesystem) is the default filesystem on Ubuntu, Debian, and most general-purpose Linux distributions. It evolved from ext2 and ext3, adding delayed allocation, extents, a larger journal, and better performance with large directories. Its stability, broad tool support, and well-understood behaviour make it the safest default for most workloads.
Maximum sizes: 16 TiB per file and filesystem volumes up to 1 EiB. More than sufficient for almost all workloads.
Journalling modes: data=writeback (fastest), data=ordered (default, balanced), data=journal (safest, slowest). See the mount sketch below.
Resizing: can be grown while mounted or unmounted; can be shrunk only while unmounted. Use resize2fs.
Best for: root partitions, home directories, general-purpose data. Any workload that values stability and tool familiarity over cutting-edge features.
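The journalling mode is chosen at mount time, not at mkfs time. A minimal sketch, assuming /dev/sdb1 is an ext4 data partition and /mnt/data exists; note the mode cannot be switched on a plain remount, so unmount first:
# Mount with full data journalling (safest, slowest)
sudo mount -o data=journal /dev/sdb1 /mnt/data
# Hypothetical /etc/fstab entry to make the mode persistent:
# /dev/sdb1  /mnt/data  ext4  data=journal  0  2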
# Create ext4 with custom inode ratio (more inodes for many-small-files workloads)
sudo mkfs.ext4 -T news /dev/sdb1 # optimised for many small files
sudo mkfs.ext4 -T largefile /dev/sdb1 # optimised for large files
# Check and repair an ext4 filesystem (must be unmounted)
sudo fsck.ext4 -f /dev/sdb1
# Grow an ext4 filesystem to fill all available space on the partition
sudo resize2fs /dev/sdb1
# Grow to a specific size
sudo resize2fs /dev/sdb1 80G
# Tune reserved block percentage (default 5% is reserved for root — wasteful on large disks)
sudo tune2fs -m 1 /dev/sdb1 # reduce to 1% reserved blocks
# Set a filesystem label after creation
sudo tune2fs -L "appdata" /dev/sdb1
# Enable dir_index feature for large directories (usually already on)
sudo tune2fs -O dir_index /dev/sdb1

# sudo tune2fs -m 1 /dev/sdb1
tune2fs 1.46.5 (30-Dec-2021)
Setting reserved blocks percentage to 1% (262144 blocks)

# sudo resize2fs /dev/sdb1
resize2fs 1.46.5 (30-Dec-2021)
Filesystem at /dev/sdb1 is mounted on /mnt/data; on-line resizing required
old_desc_blocks = 13, new_desc_blocks = 13
The filesystem on /dev/sdb1 is now 26214400 (4k) blocks long.
What just happened? tune2fs -m 1 reduced the reserved-blocks percentage from the default 5% to 1%. On a 100GB partition that frees roughly 4GB — the 5% default was designed for small 1990s disks where root processes needed guaranteed space. On modern large data partitions it is unnecessary waste. resize2fs extended the filesystem online without any downtime.
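With the label set by tune2fs -L above, the mount can also be made persistent by label, which survives device renaming. A hypothetical /etc/fstab entry as a sketch (the /srv/appdata mountpoint is an assumption):
# /etc/fstab: mount by label rather than by device name
LABEL=appdata  /srv/appdata  ext4  defaults  0  2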
xfs — High-Performance Journalled Filesystem
xfs was developed by Silicon Graphics in the 1990s and has been the default filesystem on RHEL and its derivatives such as Rocky Linux since RHEL 7. It is engineered for high concurrency and large file throughput — its allocation group design allows multiple parallel writes without contention, making it excellent for database, media, and high-I/O server workloads.
ext4 — generalist
- Can be shrunk (offline)
- Better for small random I/O
- Simpler toolchain (e2fsprogs)
- Default on Debian/Ubuntu
- fsck is slightly slower on large volumes
- 5% reserved blocks by default
xfs — high-throughput specialist
- Cannot be shrunk — only grown
- Superior for large sequential I/O
- Parallel allocation groups
- Default on RHEL / Rocky
- Near-instant fsck on huge volumes
- No reserved blocks overhead
# Create an xfs filesystem
sudo mkfs.xfs -L "database" /dev/sdb1
# Display detailed xfs filesystem information (must be mounted)
sudo xfs_info /mnt/data
# Check and repair xfs (must be unmounted)
sudo xfs_repair /dev/sdb1
# Grow an xfs filesystem to fill all available space (must be mounted)
sudo xfs_growfs /mnt/data
# Freeze xfs filesystem for consistent snapshots (then unfreeze)
sudo xfs_freeze -f /mnt/data
# ... take snapshot ...
sudo xfs_freeze -u /mnt/data
# Dump filesystem metadata for backup/restore
sudo xfsdump -l 0 -f /backup/xfs-dump.img /mnt/data

# sudo xfs_info /mnt/data
meta-data=/dev/sdb1 isize=512 agcount=4, agsize=6553600 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
data = bsize=4096 blocks=26214400, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal bsize=4096 blocks=12800, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
What just happened? xfs_info revealed that this volume has 4 allocation groups (agcount=4) — the parallel write regions that give xfs its concurrency advantage. The crc=1 flag shows metadata checksumming is enabled (xfs v5 feature), protecting against silent corruption at the block level.
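Because agcount is fixed when the filesystem is created, a volume intended for heavy parallel writes can be given more allocation groups up front. A sketch; the value 8 is an assumption for a workload with many concurrent writers, and the mkfs.xfs computed default is usually sensible:
# Create xfs with 8 allocation groups instead of the computed default
sudo mkfs.xfs -d agcount=8 -L "database" /dev/sdb1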
btrfs — Modern Copy-on-Write Filesystem
btrfs (B-tree filesystem) takes a fundamentally different approach — it never overwrites existing data in place. Instead it writes new data to new blocks and atomically updates the tree of pointers. This copy-on-write design enables instant snapshots at no space cost, per-file checksums that detect silent corruption, and built-in RAID support. It is the default on Fedora desktop installations and is growing in container and cloud environments.
Instant snapshots: a snapshot shares blocks with the original — it only consumes additional space as data diverges. Used heavily in system update rollbacks and container layer storage.
Checksumming: every data and metadata block has a checksum. On read, the checksum is verified — silent corruption (bitrot) is detected and, with RAID, automatically repaired.
Subvolumes: they act like separate filesystems but share the same pool of space. They can be mounted independently and snapshotted individually — the basis for Docker's btrfs storage driver.
Transparent compression: files are compressed on write and decompressed on read, transparently to applications. zstd provides the best balance of speed and ratio on modern hardware.
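Compression and a default subvolume are mount options, so they belong in /etc/fstab to persist across reboots. A hypothetical sketch reusing the containers label and appdata subvolume from the commands below:
# /etc/fstab: mount the appdata subvolume with zstd compression
LABEL=containers  /mnt/data  btrfs  compress=zstd,subvol=appdata  0  0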
# Create a btrfs filesystem
sudo mkfs.btrfs -L "containers" /dev/sdb1
# Show btrfs filesystem info
sudo btrfs filesystem show /mnt/data
sudo btrfs filesystem usage /mnt/data
# Create a subvolume
sudo btrfs subvolume create /mnt/data/appdata
# List subvolumes
sudo btrfs subvolume list /mnt/data
# Take a snapshot of a subvolume (read-write snapshot)
sudo btrfs subvolume snapshot /mnt/data/appdata /mnt/data/appdata-snap-$(date +%Y%m%d)
# Mount with compression enabled
sudo mount -o compress=zstd /dev/sdb1 /mnt/data
# Run a filesystem scrub — verify all checksums and repair if RAID is active
sudo btrfs scrub start /mnt/data
sudo btrfs scrub status /mnt/data

# sudo btrfs filesystem show /mnt/data
Label: 'containers' uuid: d4e5f6a7-1234-5678-9abc-def012345678
Total devices 1 FS bytes used 1.02GiB
devid 1 size 100.00GiB used 3.02GiB path /dev/sdb1
# sudo btrfs subvolume list /mnt/data
ID 256 gen 12 top level 5 path appdata
ID 257 gen 14 top level 5 path appdata-snap-20250312
# sudo btrfs scrub status /mnt/data
UUID: d4e5f6a7-1234-5678-9abc-def012345678
Scrub started: Wed Mar 12 11:00:00 2025
Status: finished
Duration: 0:00:42
Total to scrub: 1.02GiB
Rate: 24.86MiB/s
Error summary: no errors found
What just happened? The snapshot appdata-snap-20250312 was created instantly and at zero initial cost — it shares all blocks with appdata until data diverges. The scrub completed in 42 seconds on 1GB of data and found no corruption. Running scrubs monthly is a best practice on btrfs volumes holding important data.
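Scheduling that monthly scrub takes one cron entry. A minimal sketch as a hypothetical /etc/cron.d drop-in; the path and schedule are assumptions to adapt:
# /etc/cron.d/btrfs-scrub: scrub at 03:00 on the 1st of every month
# (-B keeps the scrub in the foreground so cron sees its exit status)
0 3 1 * * root /usr/bin/btrfs scrub start -B /mnt/data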
tmpfs and Special Filesystems
Not all Linux filesystems store data on physical disks. Several virtual filesystems present kernel data structures as files — making them navigable with standard tools — or use RAM for extremely fast temporary storage. Understanding them is essential: they explain much of what looks confusing when first exploring /proc, /sys, and /dev.
| Filesystem | Mount point | Purpose |
|---|---|---|
| tmpfs | /tmp, /run | RAM-backed temporary storage. Size-limited. Data lost on reboot. Much faster than disk for scratch space. Can spill to swap if needed. |
| proc | /proc | Virtual filesystem exposing kernel and process information. /proc/meminfo, /proc/cpuinfo, and /proc/PID/ are generated on the fly. |
| sysfs | /sys | Exposes the kernel device model, driver parameters, and hardware state. Used by udev. Writing to files here changes kernel settings live. |
| devtmpfs | /dev | Device node filesystem managed by the kernel. Creates device files (/dev/sda, /dev/null) automatically as hardware is detected. |
| hugetlbfs | /dev/hugepages | Provides access to huge memory pages (2MB or 1GB) for high-performance applications like databases and virtual machine hypervisors. |
# Show all currently mounted filesystems including virtual ones
mount | grep -E "tmpfs|proc|sysfs|devtmpfs"
# tmpfs — check how much /tmp is using
df -h /tmp
du -sh /tmp/*
# proc — read kernel data as files
cat /proc/meminfo | grep -E "MemTotal|MemFree|MemAvailable"
cat /proc/cpuinfo | grep "model name" | uniq
cat /proc/loadavg
# sysfs — check and set kernel parameters live
cat /sys/block/sda/queue/scheduler # I/O scheduler for sda
echo mq-deadline | sudo tee /sys/block/sda/queue/scheduler
# Mount a size-limited tmpfs manually
sudo mount -t tmpfs -o size=512m tmpfs /mnt/ramdisk

# mount | grep tmpfs
tmpfs on /run type tmpfs (rw,nosuid,nodev,size=794688k,mode=755)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,size=2097152k)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=794688k,mode=700)

# cat /proc/meminfo | grep -E "MemTotal|MemFree|MemAvailable"
MemTotal:        8142508 kB
MemFree:         4012344 kB
MemAvailable:    6218832 kB

# cat /proc/loadavg
0.12 0.08 0.05 1/312 9821
What just happened? /proc/loadavg shows four values: 1-minute, 5-minute, and 15-minute load averages, followed by running/total processes and the last PID created. The load averages of 0.12/0.08/0.05 indicate a nearly idle system — values above the number of CPU cores indicate the system is under load. All of this data is generated on-demand by the kernel, not read from any file on disk.
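To judge a load average at a glance, compare it to the core count. A small sketch:
nproc                # number of CPU cores
cat /proc/loadavg    # 1-, 5- and 15-minute load averages
# Rough per-core load from the 1-minute average
awk -v c="$(nproc)" '{printf "load per core: %.2f\n", $1/c}' /proc/loadavg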
Choosing the Right Filesystem
There is no single best filesystem — the right choice depends on the workload, the distribution, the data's value, and the operational toolchain your team is comfortable with. The decision matrix below covers the most common scenarios.
Root partition (/) and home directories
ext4 on Debian/Ubuntu, xfs on RHEL/Rocky. Both are excellent and well-supported by the respective distro's rescue and recovery tooling. Stick with the distribution default unless you have a specific reason not to.
Database storage (PostgreSQL, MySQL, MongoDB)
xfs for high-concurrency write workloads on large volumes. ext4 is perfectly capable for smaller databases. Avoid btrfs for databases — its CoW design causes write amplification that hurts database performance and can interact poorly with database journalling.
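If a database must live on btrfs anyway, copy-on-write can be disabled per directory so that new files inside avoid the write amplification. A sketch, assuming a hypothetical /mnt/data/pgdata directory; the attribute only affects files created after it is set:
# Disable CoW for a data directory; new files inherit the No_COW flag
sudo mkdir -p /mnt/data/pgdata
sudo chattr +C /mnt/data/pgdata
lsattr -d /mnt/data/pgdata    # verify: the C attribute appears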
Container hosts and system snapshots
btrfs is compelling — subvolumes map naturally to container layers, snapshots enable instant rollback after updates, and compression reduces storage cost. Used as the default on Fedora and openSUSE installations.
Temporary files, build caches, session data
tmpfs. No disk I/O, no persistence needed, automatically cleared on reboot. Size-limit it to prevent runaway processes from consuming all RAM.
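Size-limiting /tmp is a single /etc/fstab line. A hypothetical sketch; the 2G cap is an assumption to size against available RAM:
# /etc/fstab: RAM-backed /tmp capped at 2 GiB
tmpfs  /tmp  tmpfs  size=2G,nosuid,nodev  0  0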
Long-term archival storage on spinning disks
btrfs with regular scrubs — its checksumming catches bitrot that silently corrupts files over years on spinning media. Alternatively ext4 with periodic fsck passes if your team prefers familiar tooling.
xfs Partitions Cannot Be Shrunk — Plan Your Layout Before Formatting
Unlike ext4, an xfs filesystem can only be grown — never shrunk. If you format a 500GB partition as xfs and later need to reclaim space, your only options are to back up the data, delete the partition, create a smaller one, reformat, and restore. This is not a theoretical limitation — it has stranded data on undersized partitions in real deployments. With xfs, always allocate the final intended size from the start, or use LVM underneath so the volume can be managed independently of the partition.
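Putting LVM underneath is the standard escape hatch. A sketch with hypothetical volume group and volume names (vgdata, dbvol), assuming the filesystem ends up mounted at /mnt/data:
# Carve a 200G logical volume and format it as xfs
sudo pvcreate /dev/sdb
sudo vgcreate vgdata /dev/sdb
sudo lvcreate -L 200G -n dbvol vgdata
sudo mkfs.xfs /dev/vgdata/dbvol
# Later: grow the LV, then the filesystem (growing only; xfs never shrinks)
sudo lvextend -L +100G /dev/vgdata/dbvol
sudo xfs_growfs /mnt/data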
Lesson Checklist
- ext4 is the safe general-purpose default: inspect and tune it with tune2fs, grow it with resize2fs, and check it with fsck.ext4
- xfs cannot be shrunk, grows online with xfs_growfs, and is the RHEL/Rocky default for good performance reasons
- btrfs gives instant copy-on-write snapshots and transparent compression via compress=zstd; run scrubs to verify integrity
- /proc, /sys, and /dev are virtual kernel-generated filesystems, not directories backed by disk storage
Teacher's Note
The single most actionable tip from this lesson for production work: reduce ext4's reserved-blocks percentage from 5% to 1% on any large data partition with tune2fs -m 1. On a 1TB partition, the default 5% reserves 50GB for root processes that will never need it. That is storage you paid for sitting permanently unused. It takes one command and zero downtime.
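To see the effect, read the reserved block count from the superblock before and after. A quick sketch, assuming the /dev/sdb1 device used earlier:
sudo tune2fs -l /dev/sdb1 | grep -i "reserved block count"
sudo tune2fs -m 1 /dev/sdb1
sudo tune2fs -l /dev/sdb1 | grep -i "reserved block count"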
Practice Questions
1. A team is provisioning a new 2TB disk on a Rocky Linux server to hold PostgreSQL database files. They are debating between ext4 and xfs. What would you recommend and why? What specific operational limitation of your chosen filesystem should they plan for before formatting?
Recommend xfs: it is the Rocky Linux default, and its parallel allocation groups handle high-concurrency database writes well on a large volume. The limitation to plan for is that xfs can only be grown, never shrunk, so allocate the final intended size up front or put the filesystem on LVM so the volume can be resized independently of the partition.
2. You have a btrfs volume at /mnt/data with a subvolume called webroot. Write the commands to take a read-write snapshot named webroot-before-deploy, then explain how much additional disk space the snapshot consumes immediately after creation.
sudo btrfs subvolume snapshot /mnt/data/webroot /mnt/data/webroot-before-deploy. Immediately after creation the snapshot consumes virtually no additional disk space — btrfs uses copy-on-write, so the snapshot initially shares all data blocks with the original subvolume. New space is only allocated as files in either the snapshot or the original are modified, causing blocks to diverge.
3. A developer asks why their script can read values from /proc/meminfo and /sys/block/sda/queue/scheduler like ordinary files, but these paths take up no disk space. Explain what kind of filesystem these paths belong to and where the data actually comes from.
procfs mounted at /proc and sysfs mounted at /sys. They are not stored on disk at all — the data is generated on-the-fly by the kernel when the file is read. Reading /proc/meminfo asks the kernel to report current memory statistics; writing to /sys paths changes live kernel parameters and driver settings.
Lesson Quiz
1. You have an xfs filesystem on a 500GB partition but now need it to be only 200GB to reclaim space. What must you do?
2. What is the primary mechanism that makes btrfs snapshots instant and space-efficient at creation time?
3. Which filesystem type would you choose to mount /tmp for maximum performance, and what is the key trade-off?
Up Next
Lesson 20 — Disk Usage and Cleanup
Finding what is consuming disk space, cleaning up safely, and preventing storage surprises on production systems