Linux Administration
Disk Usage and Cleanup
In this lesson
Disk usage and cleanup is the practice of understanding what is consuming storage on a Linux system, identifying waste and growth hotspots, and reclaiming space safely. A full disk is one of the most disruptive failures on a production server — it stops databases from writing, prevents logs from rotating, and can corrupt running services. Proactive monitoring and disciplined cleanup habits prevent this class of incident entirely.
Checking Disk Space with df
df (disk free) reports the amount of space used and available on each mounted filesystem. It answers the question "how full is each partition?" — giving a system-wide view in seconds. It is always the first command to run when investigating a disk space issue.
# Human-readable sizes — the most common form
df -h
# Include filesystem type in the output
df -hT
# Show a specific path's filesystem usage
df -h /var/log
# Show inode usage instead of block usage (critical for inode exhaustion diagnosis)
df -i
# Exclude tmpfs and other virtual filesystems — show only real disks
df -h -x tmpfs -x devtmpfs

# df -hT
Filesystem     Type   Size  Used Avail Use% Mounted on
/dev/sda3      ext4    98G   41G   52G  45% /
/dev/sda2      ext4   974M  213M  694M  24% /boot
/dev/sdb1      ext4   200G  187G  6.4G  97% /var/log
tmpfs          tmpfs  3.9G     0  3.9G   0% /run
/dev/sdc1      xfs    500G   12G  488G   3% /data

# df -i
Filesystem       Inodes   IUsed   IFree IUse% Mounted on
/dev/sda3       6553600  312400 6241200    5% /
/dev/sdb1       1310720 1310719       1  100% /var/log
What just happened? Two critical situations appear in this output. First, /var/log is at 97% capacity — a log write failure is imminent. Second, the inode output shows /var/log at 100% inode usage with only 1 inode remaining. This means no new files can be created on that partition even if block space exists — both crises need immediate attention.
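This kind of check is easy to automate. Below is a minimal sketch (the 90% default threshold and the warning wording are choices made here, not conventions) that parses df -P, whose POSIX output format guarantees one record per line:

```shell
# disk_alert: print a WARNING line for each real filesystem whose Use%
# meets or exceeds the given threshold (default 90).
# df -P (POSIX format) guarantees exactly one record per line, so awk
# can parse the columns safely.
disk_alert() {
    df -P -x tmpfs -x devtmpfs | awk -v limit="${1:-90}" '
        NR > 1 {
            use = $5; sub(/%/, "", use)      # strip the trailing % sign
            if (use + 0 >= limit)
                printf "WARNING: %s at %s%% (%s)\n", $1, use, $6
        }'
}

# Example: report anything at 90% or more
disk_alert 90
```

Dropped into cron, a one-line call to this function turns the reactive df check into proactive monitoring.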
Measuring Directory Sizes with du
While df tells you how full a filesystem is, du (disk usage) tells you what is using it. By recursively summing the sizes of files and directories, du lets you drill down from filesystem to partition to subdirectory to individual files until you find the source of consumption.
Analogy: df is like reading the fuel gauge in your car — it tells you how much is left. du is like opening the boot and counting the luggage — it tells you what is taking up all the space.
# Show total size of a directory and all its contents
du -sh /var/log
# Show sizes of all immediate subdirectories — one level deep
du -h --max-depth=1 /var/log
# Sort directories by size — largest at the bottom
du -h --max-depth=1 /var | sort -h
# Sort directories by size — largest at the top (most useful for cleanup)
du -h --max-depth=1 /var | sort -rh | head -20
# Show size of every file in a directory recursively, sorted largest first
du -ah /var/log | sort -rh | head -20
# Summarise total usage for multiple paths
du -sh /var/log /var/cache /tmp /home

# du -h --max-depth=1 /var | sort -rh | head -10
181G    /var
168G    /var/log
11G     /var/cache
1.4G    /var/lib
312M    /var/backups
88M     /var/spool
12M     /var/tmp

# du -ah /var/log | sort -rh | head -10
168G    /var/log
94G     /var/log/app.log
42G     /var/log/app.log.1
18G     /var/log/nginx
9G      /var/log/nginx/access.log
5.4G    /var/log/nginx/access.log.1
3.2G    /var/log/journal
What just happened? The combination of du and sort -rh (reverse, human-readable sort) immediately exposed the culprit — a single app.log file consuming 94GB. This is the standard drill-down pattern: start at the full filesystem, descend one directory level at a time with --max-depth=1, sort by size, repeat until you reach the actual offending file.
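The drill-down step can be wrapped in a small helper. A sketch (the function name biggest is made up here), demonstrated on a throwaway directory tree so it is safe to run anywhere:

```shell
# biggest: list the heaviest entries one level below a directory,
# largest first -- the du | sort -rh drill-down step as one function.
biggest() {
    du -h --max-depth=1 "${1:-.}" 2>/dev/null | sort -rh | head -n "${2:-10}"
}

# Demo on a scratch tree so the result is predictable
demo=$(mktemp -d)
mkdir -p "$demo/big" "$demo/small"
dd if=/dev/zero of="$demo/big/file" bs=1024 count=2048 2>/dev/null   # ~2MB
dd if=/dev/zero of="$demo/small/file" bs=1024 count=16 2>/dev/null   # ~16KB
biggest "$demo"          # the "big" subdirectory sorts above "small"
rm -rf "$demo"
```

Calling biggest repeatedly on whichever directory tops the previous call is exactly the descend-sort-repeat loop described above.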
Finding Large and Old Files with find
find is the most powerful tool for locating specific files by size, age, type, or ownership. Combined with -ls or -delete, it can both identify and clean up space consumers in a single pass.
# Find files larger than 1GB anywhere on the system
sudo find / -xdev -size +1G -ls 2>/dev/null
# Find files larger than 100MB in /var, sorted by size
sudo find /var -size +100M -printf "%s\t%p\n" 2>/dev/null | \
sort -rn | awk '{printf "%.1fM\t%s\n", $1/1048576, $2}' | head -20
# Find files not accessed in more than 90 days
find /var/log -atime +90 -type f -ls
# Find and delete log files older than 30 days (preview first with -ls)
find /var/log/archive -name "*.log" -mtime +30 -ls
find /var/log/archive -name "*.log" -mtime +30 -delete
# Find core dump files — often large and forgotten
find / -xdev \( -name "core" -o -name "core.[0-9]*" \) 2>/dev/null | xargs -r ls -lh
# Find files with no owner (orphaned — user deleted but file remains)
sudo find / -xdev -nouser -ls 2>/dev/null

# sudo find /var -size +100M -printf "%s\t%p\n" | sort -rn | head -5
98566144000     /var/log/app.log
45097156608     /var/log/app.log.1
18874368000     /var/log/nginx/access.log
5368709120      /var/log/journal
1073741824      /var/cache/apt/archives/linux-image-6.5.0-1021.amd64.deb

# find /var/log -atime +90 -type f -ls
12348   2048 -rw-r--r--  1 syslog  adm  2097152 Nov 12  2024 /var/log/syslog.4.gz
12349   1024 -rw-r--r--  1 syslog  adm  1048576 Oct 18  2024 /var/log/auth.log.4.gz
What just happened? The -xdev flag restricted the search to the current filesystem, preventing find from crossing mount boundaries and accidentally scanning network mounts or other slow volumes. The old compressed logs found with -atime +90 are prime cleanup candidates — they have not been read in three months, suggesting they are past any reasonable retention window.
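The preview-then-delete pattern is worth rehearsing somewhere harmless before pointing it at /var/log. A sketch using a scratch directory, with touch -d backdating a file so that -mtime +30 matches it:

```shell
# Preview-then-delete on a scratch "archive" directory.
# touch -d backdates a file's mtime so find -mtime +30 will match it.
archive=$(mktemp -d)
touch -d "60 days ago" "$archive/old.log"
touch "$archive/fresh.log"

# Preview pass: -ls prints what WOULD be deleted -- always run this first
find "$archive" -name "*.log" -mtime +30 -ls

# Delete pass: only old.log matches; fresh.log survives
find "$archive" -name "*.log" -mtime +30 -delete
ls "$archive"
rm -rf "$archive"
```

Because -delete reuses the exact tests from the preview command, swapping -ls for -delete is the only change between the dry run and the real run.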
Package Cache and System Cleanup
Linux package managers accumulate downloaded package files in a local cache. After installation these cached .deb and .rpm files are no longer needed but remain on disk indefinitely. On active servers this cache can grow to several gigabytes. Old kernels similarly accumulate and are a common source of /boot partition exhaustion.
Cached .deb files in /var/cache/apt/archives/. Safe to clear after packages are installed.
Previous kernel packages accumulate in /boot. Keep the current and one prior; remove the rest.
The systemd journal can grow without bound. Vacuuming it by time or size reclaims space immediately.
Build artefacts, session files, and crash dumps accumulate in /tmp and /var/tmp over time.
# ── Debian / Ubuntu ──────────────────────────────────────────────
# Show how much space the apt cache is using
du -sh /var/cache/apt/archives/
# Remove cached packages that can no longer be downloaded (obsolete versions)
sudo apt autoclean
# Remove ALL cached package files (safe — they can be re-downloaded)
sudo apt clean
# Remove packages that were auto-installed as dependencies but are no longer needed
sudo apt autoremove --purge
# List installed kernels to identify old ones
dpkg --list | grep linux-image
# Remove a specific old kernel (keep current + 1 previous)
sudo apt purge linux-image-6.5.0-18-generic
# ── RHEL / Rocky ─────────────────────────────────────────────────
# Show dnf cache size
du -sh /var/cache/dnf/
# Clean all dnf cache
sudo dnf clean all
# Remove old kernels — keep only the 2 most recent
sudo dnf remove --oldinstallonly --setopt installonly_limit=2 kernel

# du -sh /var/cache/apt/archives/
3.2G    /var/cache/apt/archives/

# sudo apt autoclean
Del linux-image-6.5.0-15-generic 6.5.0-15.15 [12.4 MB]
Del linux-image-6.5.0-17-generic 6.5.0-17.17 [12.4 MB]

# sudo apt autoremove --purge
The following packages will be REMOVED:
  linux-image-6.5.0-15-generic* linux-headers-6.5.0-15*
  linux-image-6.5.0-17-generic* linux-headers-6.5.0-17*
0 upgraded, 0 newly installed, 4 to remove and 0 not upgraded.
After this operation, 198 MB disk space will be freed.
What just happened? apt autoremove --purge identified four old kernel packages consuming 198MB in /boot. This is the most common cause of a full /boot partition — each kernel upgrade adds ~60MB but nothing removes the old ones automatically unless autoremove is run regularly.
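The keep-current-plus-one-prior policy can be expressed as a small filter. This sketch only prints removal candidates from a supplied version list, it never calls apt purge itself, and the version strings in the example are hypothetical:

```shell
# prune_candidates: given installed kernel versions on stdin (one per
# line) and the running version as $1, print the versions safe to
# remove -- everything except the running kernel and the newest other
# version. Selection logic only; feed the output to apt purge yourself.
prune_candidates() {
    running="$1"
    grep -v "^$running\$" | sort -V | head -n -1
}

# Example with hypothetical version strings; the running kernel would
# normally come from uname -r and the list from dpkg --list.
printf '%s\n' 6.5.0-15-generic 6.5.0-17-generic 6.5.0-18-generic 6.5.0-21-generic |
    prune_candidates 6.5.0-21-generic
# prints 6.5.0-15-generic and 6.5.0-17-generic
```

sort -V (version sort) is what makes this safe: it orders 6.5.0-9 before 6.5.0-18, which a plain lexicographic sort would get wrong.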
Log Management and Journal Cleanup
Logs are the second most common cause of disk exhaustion after data files. Linux manages logs through two parallel systems: logrotate for traditional text log files under /var/log/, and journald for the systemd binary journal. Both need active size management on production systems.
| Directive | Effect |
|---|---|
| daily / weekly / monthly | How often to rotate the log file. |
| rotate N | Keep N rotated copies; older files are deleted. rotate 7 keeps one week of daily logs. |
| compress | Gzip rotated log files. Combined with delaycompress to skip compressing the most recent rotated file. |
| size N | Rotate when the log reaches size N (e.g. 100M, 1G). Overrides time-based rotation. |
| missingok | Do not error if the log file is missing. |
| postrotate / endscript | Run a shell command after rotation — typically to signal the application to re-open its log file handle. |
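Put together, a per-service logrotate file combines several of these directives. A sketch for a hypothetical application: the path app.log, the service name app.service, and the choice of SIGHUP are all assumptions for illustration, not a standard layout.

```text
# /etc/logrotate.d/app -- hypothetical example
/var/log/app.log {
    daily
    size 100M
    rotate 7
    compress
    delaycompress
    missingok
    postrotate
        # Signal the app to re-open its log file handle; the service
        # name and signal here are illustrative assumptions.
        systemctl kill -s HUP app.service
    endscript
}
```

The postrotate signal is the critical piece: without it, the application keeps writing to the rotated (now renamed) file through its old descriptor, which is exactly the deleted-but-open trap covered later in this lesson.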
# View logrotate configuration for a specific service
cat /etc/logrotate.d/nginx
# Force an immediate rotation (useful for testing or emergency cleanup)
sudo logrotate -f /etc/logrotate.d/nginx
# Run logrotate in debug mode — shows what would happen without making changes
sudo logrotate -d /etc/logrotate.conf
# ── systemd journal cleanup ───────────────────────────────────────
# Check current journal disk usage
journalctl --disk-usage
# Remove journal entries older than 2 weeks
sudo journalctl --vacuum-time=2weeks
# Limit journal to a maximum total size
sudo journalctl --vacuum-size=500M
# Set permanent journal size limit in journald config
sudo mkdir -p /etc/systemd/journald.conf.d/
sudo tee /etc/systemd/journald.conf.d/size-limit.conf <<'EOF'
[Journal]
SystemMaxUse=500M
SystemKeepFree=1G
MaxRetentionSec=2weeks
EOF
sudo systemctl restart systemd-journald

# journalctl --disk-usage
Archived and active journals take up 4.2G in the file system.

# sudo journalctl --vacuum-time=2weeks
Vacuuming done, freed 3.1G of archived journals from /var/log/journal.

# sudo journalctl --vacuum-size=500M
Vacuuming done, freed 204.5M of archived journals from /var/log/journal.
What just happened? The journal had accumulated 4.2GB — mostly older archived log files. --vacuum-time=2weeks deleted everything older than 14 days, freeing 3.1GB immediately. The permanent configuration in /etc/systemd/journald.conf.d/ will prevent this from happening again — the journal will now self-limit to 500MB and automatically discard old entries.
Inode Exhaustion and Deleted-but-Open Files
Two subtle disk-space problems trip up even experienced administrators. Inode exhaustion means a filesystem is out of file-tracking slots — no new files can be created even though block space remains. Deleted-but-open files means disk space is not freed after rm because a running process still holds the file open — the space is only released when that process closes or is restarted.
Inode exhaustion
Symptoms: No space left on device but df -h shows free space. Diagnosis: df -i shows 100% inode usage. Cause: millions of tiny files (mail queues, session caches, tmp files).
df -i
# Find dir with most files:
find /var -xdev -printf '%h\n' | \
sort | uniq -c | sort -rn | head
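The many-tiny-files failure mode is easy to reproduce safely. This sketch builds a scratch directory full of zero-byte files, which consume inodes but no data blocks, then locates the densest directory with the same find pipeline:

```shell
# Reproduce the "many tiny files" pattern in a scratch directory, then
# find the densest directory with the pipeline from the text.
scratch=$(mktemp -d)
mkdir -p "$scratch/sessions" "$scratch/quiet"
i=0
while [ "$i" -lt 500 ]; do
    : > "$scratch/sessions/sess.$i"   # zero-byte file: one inode, no blocks
    i=$((i + 1))
done
: > "$scratch/quiet/one-file"

# %h prints each file's parent directory; uniq -c counts files per dir.
# The sessions directory tops the list with 500 entries.
find "$scratch" -type f -printf '%h\n' | sort | uniq -c | sort -rn | head
rm -rf "$scratch"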
Deleted-but-open files
Symptoms: du shows less usage than df. Cause: a process opened a log file, then logrotate deleted it, but the process is still writing to the old open file descriptor — occupying space invisible to du.
# Find deleted files still held open:
sudo lsof +L1 | grep deleted
# Fix: restart the process holding the file
# ── Inode exhaustion diagnosis ────────────────────────────────────
# Check inode usage across all filesystems
df -i
# Find which directory contains the most files (the inode hog)
sudo find /var -xdev -printf '%h\n' 2>/dev/null | \
sort | uniq -c | sort -rn | head -10
# Count files in a specific directory non-recursively
ls /var/spool/exim4/input | wc -l
# ── Deleted-but-open files ────────────────────────────────────────
# List all deleted files currently held open by running processes
sudo lsof +L1
# Filter to just the large ones (column 7 is file size in bytes)
sudo lsof +L1 | awk 'NR>1 && $7 > 104857600 {print $7/1048576"MB", $1, $2, $9}'
# The fix: restart the process that holds the deleted file open
# Example: nginx is holding a deleted access.log
sudo systemctl restart nginx

# sudo lsof +L1
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NAME
nginx 1235 root 10w REG 8,17 94489280512 0 /var/log/nginx/access.log (deleted)
postgres 3012 postgres 3w REG 8,17 42949672960 0 /var/log/pg/postgresql.log (deleted)
# sudo lsof +L1 | awk 'NR>1 && $7 > 104857600 {print $7/1048576"MB", $1, $2, $9}'
90112MB nginx 1235 /var/log/nginx/access.log (deleted)
40960MB postgres 3012 /var/log/pg/postgresql.log (deleted)
What just happened? lsof +L1 found two files marked (deleted) that together occupy over 130GB of space — but du would show zero for them because they no longer have a directory entry. This is the classic scenario where a disk appears "full" but du / accounts for far less than df reports. Restarting nginx and postgres will cause them to close their file descriptors and release all that space immediately.
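The whole phenomenon can be reproduced in a few lines of shell, entirely inside a temporary file, using /proc to observe the deleted-but-open state:

```shell
# Reproduce a deleted-but-open file: hold an open descriptor, rm the
# file, and observe that /proc still sees it as "(deleted)".
f=$(mktemp)
exec 3>"$f"                  # fd 3 now holds the file open
echo "still being written" >&3
rm "$f"                      # directory entry gone; blocks NOT freed yet

# The kernel marks the open-but-unlinked file in this shell's fd table:
readlink "/proc/$$/fd/3"     # the target path ends with "(deleted)"

exec 3>&-                    # closing the descriptor releases the space
```

Restarting a service is just this final exec 3>&- performed for you: the process exits, the kernel closes every descriptor it held, and the unlinked blocks are returned to the filesystem.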
Never Truncate a Log File with > — Restart the Process Instead
A common emergency response to a full disk is to run > /var/log/app.log to zero out a large log file. This does reclaim the space (shell redirection truncates the file in place) but it carries two traps. Under sudo, the redirection is performed by your own unprivileged shell rather than the elevated command, so it fails with permission denied on root-owned logs. More subtly, if the writing process did not open the log in append mode, its file descriptor keeps its old offset: the next write lands back at that offset, recreating a sparse file whose apparent size immediately jumps back to where it was. The correct approach is to use logrotate -f to rotate the file and signal the process, or to restart the service so it opens a fresh file handle. If you must truncate without a restart, truncate -s 0 /var/log/app.log zeroes the file in place without changing the inode, so open descriptors remain valid and it works cleanly under sudo — but the append-mode caveat applies to it as well, which is why a rotate or restart is still the safer fix.
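That truncate -s 0 empties a file in place, leaving the inode untouched, is easy to verify directly:

```shell
# Show that truncate -s 0 empties a file without replacing it: the
# inode number is unchanged, so any descriptor already open on the
# file stays valid afterwards.
f=$(mktemp)
echo "lots of log data" > "$f"
before=$(stat -c %i "$f")    # inode number before truncation

truncate -s 0 "$f"

after=$(stat -c %i "$f")
[ "$before" = "$after" ] && echo "same inode: $before"
[ ! -s "$f" ] && echo "file is now empty"
rm -f "$f"
```

Contrast this with rm followed by recreating the file, which allocates a brand-new inode and leaves any process still holding the old one writing into unreachable space.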
Lesson Checklist
df -hT for a system-wide space overview and du -h --max-depth=1 | sort -rh to drill down to the space consumer
find with -size, -mtime, and -xdev to locate large, old, and orphaned files on a specific filesystem
apt clean, apt autoremove, and journalctl --vacuum-time to prevent gradual storage accumulation
df -i when a disk appears full but df -h shows free space — inode exhaustion is the likely cause
lsof +L1 to find deleted-but-open files when du and df disagree, restarting the holding process rather than truncating with >
Teacher's Note
The deleted-but-open file scenario (lsof +L1) is responsible for a surprising number of "disk full" incidents where the operator is completely baffled because du -sh / accounts for far less space than df -h reports. Memorise the pattern: df ≠ du? Run lsof +L1. It solves this class of problem every time.
Practice Questions
1. A production server alerts at 95% disk usage on /var. Describe the exact sequence of commands you would run to identify what is consuming the space, working from the filesystem level down to the individual file — including what to check if du and df disagree.
df -h to confirm the full filesystem. Then drill down: sudo du -h --max-depth=1 /var | sort -rh to find the largest subdirectory, then repeat with that directory until you reach the offending files. Use sudo find /var -type f -size +500M to surface large individual files. If df shows high usage but du totals are low, a process has the file open after deletion — find it with sudo lsof +L1 /var and restart that process to release the blocks.
2. An application server is throwing No space left on device errors when creating session files, but df -h shows 40% free space on the partition. What is the likely cause and what command would you run to confirm and diagnose it?
df -i, which shows inode usage per filesystem. If IUse% is at or near 100%, the inode table is full. The fix is to find and remove directories containing huge numbers of small files — a common culprit is a mail queue, session store, or cache directory. Use sudo find /var -xdev -printf '%h\n' 2>/dev/null | sort | uniq -c | sort -rn | head -20 to find the densest directories.
3. Write the command to permanently configure the systemd journal to keep no more than 300MB of logs and retain entries for a maximum of 10 days. Where does this configuration go, and what must you run for it to take effect?
Edit /etc/systemd/journald.conf (or add a drop-in under /etc/systemd/journald.conf.d/, as shown earlier in this lesson) and set SystemMaxUse=300M and MaxRetentionSec=10day under the [Journal] section. Then restart the journal service: sudo systemctl restart systemd-journald. To immediately vacuum existing logs down to the new limits without waiting: sudo journalctl --vacuum-size=300M --vacuum-time=10d.
Lesson Quiz
1. df -h shows 30% free space on /var, but creating a new file there fails with No space left on device. What is the most likely cause?
2. You delete a 50GB log file with rm /var/log/app.log but df -h shows no change in free space. What is the most likely explanation?
3. Which command shows the top disk-consuming subdirectories under /var, one level deep, sorted largest first?
Up Next
Lesson 21 — Log Files and Log Rotation
Reading system logs, configuring logrotate, and managing application logging across a Linux server