Linux Administration
File System Types
In this lesson
A filesystem type defines the on-disk structure — how data blocks are allocated, how metadata is stored, how the system recovers from crashes, and what features are available for managing data at scale. Choosing the wrong filesystem for a workload can mean poor performance, lost data after a crash, or hitting hard limits years later. Knowing the strengths and trade-offs of each type is a core skill for any Linux administrator.
What Makes Filesystems Different
Every filesystem must answer the same fundamental questions: how are files and directories represented? How is free space tracked? What happens if power is cut mid-write? The answers to these questions — journalling strategy, block allocation algorithm, metadata design — determine the filesystem's performance characteristics, reliability, and feature set.
Fig 1 — The three fundamental filesystem design approaches used by Linux filesystems
# Check the filesystem type of any mounted path
df -T /home
stat -f /home
# Check filesystem type of a specific device
lsblk -f /dev/sda3
# Identify filesystem type of an unmounted partition
sudo blkid /dev/sdb1
# Show detailed filesystem info (ext4/xfs specific)
sudo tune2fs -l /dev/sda3 # ext4
sudo xfs_info /dev/sdb1 # xfs (must be mounted)

# df -T /home
Filesystem     Type  1K-blocks    Used Available Use% Mounted on
/dev/sda3      ext4  102400000 4512000  92688000   5% /

# sudo tune2fs -l /dev/sda3 | grep -E "Filesystem|Block|Inode|Journal"
Filesystem volume name:   rootfs
Filesystem UUID:          a1b2c3d4-1111-2222-3333-444455556666
Filesystem features:      has_journal ext_attr resize_inode dir_index
Block count:              25600000
Block size:               4096
Inode count:              6553600
Journal size:             128M
What just happened? tune2fs -l read the ext4 superblock — the master record of everything about the filesystem. The has_journal feature flag confirms journalling is active, and the 128M journal size shows how much space is reserved for write-ahead logging before data blocks are committed.
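To see the full superblock rather than a filtered view, dumpe2fs from the same e2fsprogs package prints everything tune2fs -l shows plus per-block-group detail. A minimal sketch, assuming the same /dev/sda3 device as above:
# Print only the superblock header, skipping per-block-group detail
sudo dumpe2fs -h /dev/sda3
# The full output adds free block and inode counts for every block group
sudo dumpe2fs /dev/sda3 | less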
ext4 — The Linux Workhorse
ext4 (fourth extended filesystem) is the default filesystem on Ubuntu, Debian, and most general-purpose Linux distributions. It evolved from ext2 and ext3, adding delayed allocation, extents, a larger journal, and better performance with large directories. Its stability, broad tool support, and well-understood behaviour make it the safest default for most workloads.
Maximum sizes: 16 TiB per file and filesystem volumes up to 1 EiB. More than sufficient for almost all workloads.
Journalling modes: data=writeback (fastest), data=ordered (default, balanced), data=journal (safest, slowest). See the mount sketch below.
Resizing: can be grown while mounted or unmounted; can be shrunk only while unmounted. Use resize2fs.
Best for: root partitions, home directories, general-purpose data. Any workload that values stability and tool familiarity over cutting-edge features.
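The journalling mode is chosen at mount time, not at mkfs time. A minimal sketch, assuming /dev/sdb1 is an ext4 data partition and /mnt/data exists; note the mode cannot be switched on a plain remount, so unmount first:
# Mount with full data journalling (safest, slowest)
sudo mount -o data=journal /dev/sdb1 /mnt/data
# Hypothetical /etc/fstab entry to make the mode persistent:
# /dev/sdb1  /mnt/data  ext4  data=journal  0  2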
# Create ext4 with custom inode ratio (more inodes for many-small-files workloads)
sudo mkfs.ext4 -T news /dev/sdb1 # optimised for many small files
sudo mkfs.ext4 -T largefile /dev/sdb1 # optimised for large files
# Check and repair an ext4 filesystem (must be unmounted)
sudo fsck.ext4 -f /dev/sdb1
# Grow an ext4 filesystem to fill all available space on the partition
sudo resize2fs /dev/sdb1
# Grow to a specific size
sudo resize2fs /dev/sdb1 80G
# Tune reserved block percentage (default 5% is reserved for root — wasteful on large disks)
sudo tune2fs -m 1 /dev/sdb1 # reduce to 1% reserved blocks
# Set a filesystem label after creation
sudo tune2fs -L "appdata" /dev/sdb1
# Enable dir_index feature for large directories (usually already on)
sudo tune2fs -O dir_index /dev/sdb1

# sudo tune2fs -m 1 /dev/sdb1
tune2fs 1.46.5 (30-Dec-2021)
Setting reserved blocks percentage to 1% (262144 blocks)

# sudo resize2fs /dev/sdb1
resize2fs 1.46.5 (30-Dec-2021)
Filesystem at /dev/sdb1 is mounted on /mnt/data; on-line resizing required
old_desc_blocks = 13, new_desc_blocks = 13
The filesystem on /dev/sdb1 is now 26214400 (4k) blocks long.
What just happened? tune2fs -m 1 reduced the reserved-blocks percentage from the default 5% to 1%. On a 100GB partition that frees roughly 4GB — the 5% default was designed for small 1990s disks where root processes needed guaranteed space. On modern large data partitions it is unnecessary waste. resize2fs extended the filesystem online without any downtime.
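With the label set by tune2fs -L above, the mount can also be made persistent by label, which survives device renaming. A hypothetical /etc/fstab entry as a sketch (the /srv/appdata mountpoint is an assumption):
# /etc/fstab: mount by label rather than by device name
LABEL=appdata  /srv/appdata  ext4  defaults  0  2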
xfs — High-Performance Journalled Filesystem
xfs was developed by Silicon Graphics in the 1990s and has been the default filesystem on RHEL and its derivatives such as Rocky Linux since RHEL 7. It is engineered for high concurrency and large file throughput — its allocation group design allows multiple parallel writes without contention, making it excellent for database, media, and high-I/O server workloads.
ext4 — generalist
- Can be shrunk (offline)
- Better for small random I/O
- Simpler toolchain (e2fsprogs)
- Default on Debian/Ubuntu
- fsck is slightly slower on large volumes
- 5% reserved blocks by default
xfs — high-throughput specialist
- Cannot be shrunk — only grown
- Superior for large sequential I/O
- Parallel allocation groups
- Default on RHEL / Rocky
- Near-instant fsck on huge volumes
- No reserved blocks overhead
# Create an xfs filesystem
sudo mkfs.xfs -L "database" /dev/sdb1
# Display detailed xfs filesystem information (must be mounted)
sudo xfs_info /mnt/data
# Check and repair xfs (must be unmounted)
sudo xfs_repair /dev/sdb1
# Grow an xfs filesystem to fill all available space (must be mounted)
sudo xfs_growfs /mnt/data
# Freeze xfs filesystem for consistent snapshots (then unfreeze)
sudo xfs_freeze -f /mnt/data
# ... take snapshot ...
sudo xfs_freeze -u /mnt/data
# Dump filesystem metadata for backup/restore
sudo xfsdump -l 0 -f /backup/xfs-dump.img /mnt/data

# sudo xfs_info /mnt/data
meta-data=/dev/sdb1 isize=512 agcount=4, agsize=6553600 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
data = bsize=4096 blocks=26214400, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal bsize=4096 blocks=12800, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
What just happened? xfs_info revealed that this volume has 4 allocation groups (agcount=4) — the parallel write regions that give xfs its concurrency advantage. The crc=1 flag shows metadata checksumming is enabled (xfs v5 feature), protecting against silent corruption at the block level.
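Because agcount is fixed when the filesystem is created, a volume intended for heavy parallel writes can be given more allocation groups up front. A sketch; the value 8 is an assumption for a workload with many concurrent writers, and the mkfs.xfs computed default is usually sensible:
# Create xfs with 8 allocation groups instead of the computed default
sudo mkfs.xfs -d agcount=8 -L "database" /dev/sdb1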
btrfs — Modern Copy-on-Write Filesystem
btrfs (B-tree filesystem) takes a fundamentally different approach — it never overwrites existing data in place. Instead it writes new data to new blocks and atomically updates the tree of pointers. This copy-on-write design enables instant snapshots at no space cost, per-file checksums that detect silent corruption, and built-in RAID support. It is the default on Fedora desktop installations and is growing in container and cloud environments.
Instant snapshots: a snapshot shares blocks with the original — it only consumes additional space as data diverges. Used heavily in system update rollbacks and container layer storage.
Checksumming: every data and metadata block has a checksum. On read, the checksum is verified — silent corruption (bitrot) is detected and, with RAID, automatically repaired.
Subvolumes: they act like separate filesystems but share the same pool of space. They can be mounted independently and snapshotted individually — the basis for Docker's btrfs storage driver.
Transparent compression: files are compressed on write and decompressed on read, transparently to applications. zstd provides the best balance of speed and ratio on modern hardware.
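Compression and a default subvolume are mount options, so they belong in /etc/fstab to persist across reboots. A hypothetical sketch reusing the containers label and appdata subvolume from the commands below:
# /etc/fstab: mount the appdata subvolume with zstd compression
LABEL=containers  /mnt/data  btrfs  compress=zstd,subvol=appdata  0  0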
# Create a btrfs filesystem
sudo mkfs.btrfs -L "containers" /dev/sdb1
# Show btrfs filesystem info
sudo btrfs filesystem show /mnt/data
sudo btrfs filesystem usage /mnt/data
# Create a subvolume
sudo btrfs subvolume create /mnt/data/appdata
# List subvolumes
sudo btrfs subvolume list /mnt/data
# Take a snapshot of a subvolume (read-write snapshot)
sudo btrfs subvolume snapshot /mnt/data/appdata /mnt/data/appdata-snap-$(date +%Y%m%d)
# Mount with compression enabled
sudo mount -o compress=zstd /dev/sdb1 /mnt/data
# Run a filesystem scrub — verify all checksums and repair if RAID is active
sudo btrfs scrub start /mnt/data
sudo btrfs scrub status /mnt/data

# sudo btrfs filesystem show /mnt/data
Label: 'containers' uuid: d4e5f6a7-1234-5678-9abc-def012345678
Total devices 1 FS bytes used 1.02GiB
devid 1 size 100.00GiB used 3.02GiB path /dev/sdb1
# sudo btrfs subvolume list /mnt/data
ID 256 gen 12 top level 5 path appdata
ID 257 gen 14 top level 5 path appdata-snap-20250312
# sudo btrfs scrub status /mnt/data
UUID: d4e5f6a7-1234-5678-9abc-def012345678
Scrub started: Wed Mar 12 11:00:00 2025
Status: finished
Duration: 0:00:42
Total to scrub: 1.02GiB
Rate: 24.86MiB/s
Error summary: no errors found
What just happened? The snapshot appdata-snap-20250312 was created instantly and at zero initial cost — it shares all blocks with appdata until data diverges. The scrub completed in 42 seconds on 1GB of data and found no corruption. Running scrubs monthly is a best practice on btrfs volumes holding important data.
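Scheduling that monthly scrub takes one cron entry. A minimal sketch as a hypothetical /etc/cron.d drop-in; the path and schedule are assumptions to adapt:
# /etc/cron.d/btrfs-scrub: scrub at 03:00 on the 1st of every month
# (-B keeps the scrub in the foreground so cron sees its exit status)
0 3 1 * * root /usr/bin/btrfs scrub start -B /mnt/data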
tmpfs and Special Filesystems
Not all Linux filesystems store data on physical disks. Several virtual filesystems present kernel data structures as files — making them navigable with standard tools — or use RAM for extremely fast temporary storage. Understanding them is essential: they explain much of what looks confusing when first exploring /proc, /sys, and /dev.
| Filesystem | Mount point | Purpose |
|---|---|---|
| tmpfs | /tmp, /run | RAM-backed temporary storage. Size-limited. Data lost on reboot. Much faster than disk for scratch space. Can spill to swap if needed. |
| proc | /proc | Virtual filesystem exposing kernel and process information. /proc/meminfo, /proc/cpuinfo, and /proc/PID/ are generated on the fly. |
| sysfs | /sys | Exposes the kernel device model, driver parameters, and hardware state. Used by udev. Writing to files here changes kernel settings live. |
| devtmpfs | /dev | Device node filesystem managed by the kernel. Creates device files (/dev/sda, /dev/null) automatically as hardware is detected. |
| hugetlbfs | /dev/hugepages | Provides access to huge memory pages (2MB or 1GB) for high-performance applications like databases and virtual machine hypervisors. |
# Show all currently mounted filesystems including virtual ones
mount | grep -E "tmpfs|proc|sysfs|devtmpfs"
# tmpfs — check how much /tmp is using
df -h /tmp
du -sh /tmp/*
# proc — read kernel data as files
cat /proc/meminfo | grep -E "MemTotal|MemFree|MemAvailable"
cat /proc/cpuinfo | grep "model name" | uniq
cat /proc/loadavg
# sysfs — check and set kernel parameters live
cat /sys/block/sda/queue/scheduler # I/O scheduler for sda
echo mq-deadline | sudo tee /sys/block/sda/queue/scheduler
# Mount a size-limited tmpfs manually
sudo mount -t tmpfs -o size=512m tmpfs /mnt/ramdisk

# mount | grep tmpfs
tmpfs on /run type tmpfs (rw,nosuid,nodev,size=794688k,mode=755)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,size=2097152k)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=794688k,mode=700)

# cat /proc/meminfo | grep -E "MemTotal|MemFree|MemAvailable"
MemTotal:        8142508 kB
MemFree:         4012344 kB
MemAvailable:    6218832 kB

# cat /proc/loadavg
0.12 0.08 0.05 1/312 9821
What just happened? /proc/loadavg shows four values: 1-minute, 5-minute, and 15-minute load averages, followed by running/total processes and the last PID created. The load averages of 0.12/0.08/0.05 indicate a nearly idle system — values above the number of CPU cores indicate the system is under load. All of this data is generated on-demand by the kernel, not read from any file on disk.
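To judge a load average at a glance, compare it to the core count. A small sketch:
nproc                # number of CPU cores
cat /proc/loadavg    # 1-, 5- and 15-minute load averages
# Rough per-core load from the 1-minute average
awk -v c="$(nproc)" '{printf "load per core: %.2f\n", $1/c}' /proc/loadavg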
Choosing the Right Filesystem
There is no single best filesystem — the right choice depends on the workload, the distribution, the data's value, and the operational toolchain your team is comfortable with. The decision matrix below covers the most common scenarios.
Root partition (/) and home directories
ext4 on Debian/Ubuntu, xfs on RHEL/Rocky. Both are excellent and well-supported by the respective distro's rescue and recovery tooling. Stick with the distribution default unless you have a specific reason not to.
Database storage (PostgreSQL, MySQL, MongoDB)
xfs for high-concurrency write workloads on large volumes. ext4 is perfectly capable for smaller databases. Avoid btrfs for databases — its CoW design causes write amplification that hurts database performance and can interact poorly with database journalling.
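If a database must live on btrfs anyway, copy-on-write can be disabled per directory so that new files inside avoid the write amplification. A sketch, assuming a hypothetical /mnt/data/pgdata directory; the attribute only affects files created after it is set:
# Disable CoW for a data directory; new files inherit the No_COW flag
sudo mkdir -p /mnt/data/pgdata
sudo chattr +C /mnt/data/pgdata
lsattr -d /mnt/data/pgdata    # verify: the C attribute appears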
Container hosts and system snapshots
btrfs is compelling — subvolumes map naturally to container layers, snapshots enable instant rollback after updates, and compression reduces storage cost. Used as the default on Fedora and openSUSE installations.
Temporary files, build caches, session data
tmpfs. No disk I/O, no persistence needed, automatically cleared on reboot. Size-limit it to prevent runaway processes from consuming all RAM.
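Size-limiting /tmp is a single /etc/fstab line. A hypothetical sketch; the 2G cap is an assumption to size against available RAM:
# /etc/fstab: RAM-backed /tmp capped at 2 GiB
tmpfs  /tmp  tmpfs  size=2G,nosuid,nodev  0  0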
Long-term archival storage on spinning disks
btrfs with regular scrubs — its checksumming catches bitrot that silently corrupts files over years on spinning media. Alternatively ext4 with periodic fsck passes if your team prefers familiar tooling.
xfs Partitions Cannot Be Shrunk — Plan Your Layout Before Formatting
Unlike ext4, an xfs filesystem can only be grown — never shrunk. If you format a 500GB partition as xfs and later need to reclaim space, your only options are to back up the data, delete the partition, create a smaller one, reformat, and restore. This is not a theoretical limitation — it has stranded data on undersized partitions in real deployments. With xfs, always allocate the final intended size from the start, or use LVM underneath so the volume can be managed independently of the partition.
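Putting LVM underneath is the standard escape hatch. A sketch with hypothetical volume group and volume names (vgdata, dbvol), assuming the filesystem ends up mounted at /mnt/data:
# Carve a 200G logical volume and format it as xfs
sudo pvcreate /dev/sdb
sudo vgcreate vgdata /dev/sdb
sudo lvcreate -L 200G -n dbvol vgdata
sudo mkfs.xfs /dev/vgdata/dbvol
# Later: grow the LV, then the filesystem (growing only; xfs never shrinks)
sudo lvextend -L +100G /dev/vgdata/dbvol
sudo xfs_growfs /mnt/data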
Lesson Checklist
- ext4 is the safe general-purpose default: inspect and tune it with tune2fs, grow it with resize2fs, and check it with fsck.ext4
- xfs cannot be shrunk, grows online with xfs_growfs, and is the RHEL/Rocky default for good performance reasons
- btrfs gives instant copy-on-write snapshots and transparent compression via compress=zstd; run scrubs to verify integrity
- /proc, /sys, and /dev are virtual kernel-generated filesystems, not directories backed by disk storage
Teacher's Note
The single most actionable tip from this lesson for production work: reduce ext4's reserved-blocks percentage from 5% to 1% on any large data partition with tune2fs -m 1. On a 1TB partition, the default 5% reserves 50GB for root processes that will never need it. That is storage you paid for sitting permanently unused. It takes one command and zero downtime.
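To see the effect, read the reserved block count from the superblock before and after. A quick sketch, assuming the /dev/sdb1 device used earlier:
sudo tune2fs -l /dev/sdb1 | grep -i "reserved block count"
sudo tune2fs -m 1 /dev/sdb1
sudo tune2fs -l /dev/sdb1 | grep -i "reserved block count"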
Practice Questions
1. A team is provisioning a new 2TB disk on a Rocky Linux server to hold PostgreSQL database files. They are debating between ext4 and xfs. What would you recommend and why? What specific operational limitation of your chosen filesystem should they plan for before formatting?
Recommend xfs: it is the Rocky Linux default, and its parallel allocation groups handle high-concurrency database writes well on a large volume. The limitation to plan for is that xfs can only be grown, never shrunk, so allocate the final intended size up front or put the filesystem on LVM so the volume can be resized independently of the partition.
2. You have a btrfs volume at /mnt/data with a subvolume called webroot. Write the commands to take a read-write snapshot named webroot-before-deploy, then explain how much additional disk space the snapshot consumes immediately after creation.
sudo btrfs subvolume snapshot /mnt/data/webroot /mnt/data/webroot-before-deploy. Immediately after creation the snapshot consumes virtually no additional disk space — btrfs uses copy-on-write, so the snapshot initially shares all data blocks with the original subvolume. New space is only allocated as files in either the snapshot or the original are modified, causing blocks to diverge.
3. A developer asks why their script can read values from /proc/meminfo and /sys/block/sda/queue/scheduler like ordinary files, but these paths take up no disk space. Explain what kind of filesystem these paths belong to and where the data actually comes from.
procfs mounted at /proc and sysfs mounted at /sys. They are not stored on disk at all — the data is generated on-the-fly by the kernel when the file is read. Reading /proc/meminfo asks the kernel to report current memory statistics; writing to /sys paths changes live kernel parameters and driver settings.
Lesson Quiz
1. You have an xfs filesystem on a 500GB partition but now need it to be only 200GB to reclaim space. What must you do?
2. What is the primary mechanism that makes btrfs snapshots instant and space-efficient at creation time?
3. Which filesystem type would you choose to mount /tmp for maximum performance, and what is the key trade-off?
Up Next
Lesson 20 — Disk Usage and Cleanup
Finding what is consuming disk space, cleaning up safely, and preventing storage surprises on production systems