Ansible Lesson 18 – File and Package Management | Dataplexa
Section II · Lesson 18

File and Package Management

In this lesson

File module in depth
copy & fetch
lineinfile & blockinfile
Package management
Version pinning

File and package management are the two most frequent categories of tasks in real-world Ansible automation. Nearly every provisioning playbook creates directories, deploys files, sets permissions, and installs software — and nearly every application lifecycle task involves updating, pinning, or removing packages. This lesson covers the full depth of Ansible's file and package modules: creating directory trees, managing symlinks, editing files in place, fetching files back to the control node, handling version pinning, and installing from non-standard sources.

The ansible.builtin.file Module in Depth

The file module manages the existence, type, ownership, and permissions of filesystem objects. Its behaviour is entirely driven by the state parameter.

file module state values

directory Creates the directory and any missing parent directories, equivalent to mkdir -p. Idempotent: does nothing if the directory already exists.
file Sets owner, group, and mode on an existing regular file. It does not create the file (the task fails if the path is missing) and does not manage content; use touch to create an empty file, or copy or template to deploy content.
link Creates a symbolic link at path (alias dest) pointing to src. If the symlink already exists and points elsewhere, it is updated. Use for versioned application directories, such as /var/www/current pointing to a release.
touch Creates an empty file if it does not exist, or updates the access and modification timestamps if it does. Useful for creating lock files, marker files, or placeholder configs.
absent Removes the file, directory (recursively), or symlink at path. Idempotent: no error if the path does not exist.
- name: Create application directory tree
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
    owner: "{{ app_user }}"
    group: "{{ app_group }}"
    mode: "0755"
  loop:
    - /var/www/app
    - /var/www/app/releases
    - /var/www/app/shared
    - /var/log/app

- name: Create symlink for current release
  ansible.builtin.file:
    src: "/var/www/app/releases/{{ app_version }}"
    dest: /var/www/app/current
    state: link
    force: true        # also replace a regular file at dest, or link to a src that does not exist yet

- name: Remove stale lock file
  ansible.builtin.file:
    path: /tmp/app.lock
    state: absent      # idempotent — no error if already missing
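
The touch and file states are not shown in the tasks above. The following is a minimal sketch of both, assuming a hypothetical marker file path and reusing the app_user and app_group variables from the earlier tasks. By default, state: touch updates timestamps and reports changed on every run; setting both time options to preserve is one way to make it idempotent.

```yaml
# Sketch only: the marker file path is an illustrative assumption.
- name: Ensure a marker file exists without changing timestamps on reruns
  ansible.builtin.file:
    path: /var/www/app/shared/.provisioned   # hypothetical marker file
    state: touch
    mode: "0644"
    modification_time: preserve   # keep mtime if the file already exists
    access_time: preserve         # keep atime if the file already exists

- name: Tighten permissions on an existing config file
  ansible.builtin.file:
    path: /etc/app/app.conf
    state: file                   # fails if the file is missing; never creates it
    owner: "{{ app_user }}"
    group: "{{ app_group }}"
    mode: "0640"
```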

The Property Manager Analogy

Think of the file module as a property manager maintaining a building to a specification. You describe what each room should look like — type, owner, permissions — and the manager makes it so. If the room already matches the spec, they leave it alone and report ok.

copy and fetch

The copy module pushes content from the control node to managed nodes. fetch does the reverse — pulling files from managed nodes back to the control node.

ansible.builtin.copy — push files to managed nodes

# Push a file from the control node
- name: Deploy application config
  ansible.builtin.copy:
    src: files/app.conf
    dest: /etc/app/app.conf
    owner: "{{ app_user }}"
    mode: "0640"
    backup: true           # keep a timestamped backup of the previous file beside dest

# Write inline content directly
- name: Create motd banner
  ansible.builtin.copy:
    content: |
      ############################################
      # Managed by Ansible — do not edit manually
      # Host: {{ inventory_hostname }}
      ############################################
    dest: /etc/motd
    mode: "0644"

# Copy on the managed node itself (no control node involved)
- name: Back up config before modifying
  ansible.builtin.copy:
    src: /etc/app/app.conf
    dest: /etc/app/app.conf.bak
    remote_src: true       # src is on the managed node, not the control node
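
copy also accepts a validate parameter, which runs a command against the staged file before it is moved into place. The sketch below assumes a hypothetical sudoers drop-in named /etc/sudoers.d/deploy and assumes visudo lives at /usr/sbin/visudo on the managed node; adjust both for your systems.

```yaml
# Sketch only: the drop-in file name and visudo path are assumptions.
- name: Install a sudoers drop-in only if it passes a syntax check
  ansible.builtin.copy:
    content: |
      deploy ALL=(ALL) NOPASSWD: ALL
    dest: /etc/sudoers.d/deploy
    owner: root
    group: root
    mode: "0440"
    validate: /usr/sbin/visudo -cf %s   # %s is replaced with the temporary file path
```

If the validation command exits non-zero, the task fails and the destination file is left unchanged.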

ansible.builtin.fetch — pull files from managed nodes

# Pull error log from every node to the control node
- name: Fetch application error log for analysis
  ansible.builtin.fetch:
    src: /var/log/app/error.log
    dest: ./fetched_logs/    # saves to ./fetched_logs/hostname/var/log/app/error.log
    flat: false              # default — preserve full path under dest/hostname/

# flat: true — save directly to the dest path
- name: Fetch SSL certificate for inspection
  ansible.builtin.fetch:
    src: /etc/ssl/certs/app.crt
    dest: "./certs/{{ inventory_hostname }}.crt"
    flat: true
# Files collected after fetch with flat: false across three nodes:
./fetched_logs/
├── web01.example.com/var/log/app/error.log
├── web02.example.com/var/log/app/error.log
└── web03.example.com/var/log/app/error.log
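
By default, fetch fails on any host where the source file does not exist. When some nodes may not have produced a log yet, one option is to relax that check, as in this sketch that reuses the log path from the task above.

```yaml
- name: Fetch error log from hosts that have one
  ansible.builtin.fetch:
    src: /var/log/app/error.log
    dest: ./fetched_logs/
    fail_on_missing: false   # report ok instead of failing when the file is absent
```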

lineinfile and blockinfile

When you cannot replace an entire config file — because it is shared or managed by another tool — use lineinfile to manage individual lines or blockinfile to manage multi-line sections. Both are idempotent.

lineinfile
Manages a single line in a file
Finds a line by regex and replaces or removes it
Best for key=value config lines, SSH directives, PATH entries
blockinfile
Manages a multi-line block in a file
Wraps the block with marker comments to identify it
Best for SSH config stanzas, sudoers entries, cron blocks
# lineinfile — disable SSH root login
- name: Disable SSH root login
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: "^PermitRootLogin"
    line: "PermitRootLogin no"
    state: present
    backup: true
  notify: Restart SSH

# lineinfile — safe sudoers edit with validation
- name: Grant deploy user passwordless sudo
  ansible.builtin.lineinfile:
    path: /etc/sudoers
    line: "deploy ALL=(ALL) NOPASSWD: ALL"
    validate: "visudo -cf %s"      # validate BEFORE writing — prevents lockout!

# blockinfile — manage an SSH client config stanza
- name: Add GitHub SSH config block
  ansible.builtin.blockinfile:
    path: /home/{{ deploy_user }}/.ssh/config
    marker: "# {mark} ANSIBLE MANAGED BLOCK — github"
    block: |
      Host github.com
        HostName github.com
        User git
        IdentityFile ~/.ssh/id_ed25519
        StrictHostKeyChecking accept-new
    create: true
    owner: "{{ deploy_user }}"
    mode: "0600"
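
The first task above notifies a handler named Restart SSH, which must be defined in the same play or role for the notification to resolve. A minimal sketch of such a handler follows; the service name is an assumption (typically sshd on RHEL-family hosts and ssh on Debian and Ubuntu).

```yaml
handlers:
  - name: Restart SSH
    ansible.builtin.service:
      name: sshd          # assumption: use "ssh" on Debian/Ubuntu
      state: restarted
```

Handlers run once at the end of the play, and only if a notifying task reported changed.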

What just happened?

The blockinfile task wraps the SSH config stanza between # BEGIN ANSIBLE MANAGED BLOCK — github and # END ANSIBLE MANAGED BLOCK — github markers; the {mark} placeholder in the marker parameter expands to BEGIN and END. On subsequent runs, Ansible finds these markers and replaces only the content between them, leaving the rest of the file untouched. This gives Ansible a bounded, identifiable footprint on shared files.
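
Because the block is delimited by markers, it can also be removed cleanly. The sketch below assumes the same path and marker string as the task that added the block; the marker must match exactly or the block will not be found.

```yaml
- name: Remove the GitHub SSH config block
  ansible.builtin.blockinfile:
    path: "/home/{{ deploy_user }}/.ssh/config"
    marker: "# {mark} ANSIBLE MANAGED BLOCK — github"   # must match the marker used to add it
    state: absent
```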

Package Management in Depth

Ansible has a generic package module and distro-specific modules. Knowing when to use each matters.

Generic

ansible.builtin.package

Auto-detects apt, dnf, yum, or zypper. Write once, works on any distro. Limited to install / remove / latest — no distro-specific options.

Debian / Ubuntu

ansible.builtin.apt

Full apt control — cache updates, autoremove, upgrade, dpkg options, .deb file installation, PPA management via apt_repository.

RHEL / Fedora

ansible.builtin.dnf

Full dnf control — module streams, group installs, repo management, GPG key handling, .rpm installation. Replaces the deprecated yum module.

Python

ansible.builtin.pip

Installs Python packages via pip. Supports virtual environments, requirements files, specific versions, and editable installs.
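
One common way to combine these is to use the generic module for packages whose names are the same everywhere, and guard distro-specific tasks with a fact. The sketch below assumes facts are gathered and uses git and curl purely as example package names.

```yaml
- name: Install common tools on any supported distro
  ansible.builtin.package:
    name:
      - git
      - curl
    state: present

- name: Refresh the apt cache on Debian-family hosts only
  ansible.builtin.apt:
    update_cache: true
    cache_valid_time: 3600   # seconds; skip the refresh if the cache is newer than this
  when: ansible_facts['os_family'] == "Debian"
```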

Version Pinning and Advanced Package Tasks

Version pinning is one of the most important practices in production package management. Allowing packages to install their latest version means a routine playbook run can unexpectedly upgrade a critical dependency — potentially breaking a running application.

Pinning to a specific version

# apt — pin to exact version
- name: Install specific Nginx version
  ansible.builtin.apt:
    name: nginx=1.24.0-1~focal
    state: present
    update_cache: true

# dnf — pin using name-version format
- name: Install specific PostgreSQL version
  ansible.builtin.dnf:
    name: postgresql15-server-15.4-1.rhel9.x86_64
    state: present

# pip — pin Python package version in a virtual environment
- name: Install pinned gunicorn into app virtualenv
  ansible.builtin.pip:
    name: gunicorn==21.2.0
    virtualenv: /var/www/app/venv
    state: present
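
Installing a specific version does not by itself stop a later system-wide apt upgrade from moving the package. On Debian-family hosts, one option is to place the package on hold after installing it. This is a sketch using the nginx package from the example above.

```yaml
- name: Hold nginx at its installed version
  ansible.builtin.dpkg_selections:
    name: nginx
    selection: hold      # set back to "install" to release the hold
```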

Full apt workflow — add repo, import GPG key, install pinned package

---
- name: Install Docker CE from official repository
  hosts: all
  become: true

  tasks:
    - name: Install prerequisite packages
      ansible.builtin.apt:
        name: [ca-certificates, curl, gnupg]
        state: present
        update_cache: true

    - name: Create keyring directory
      ansible.builtin.file:
        path: /etc/apt/keyrings
        state: directory
        mode: "0755"

    - name: Download Docker GPG key
      ansible.builtin.get_url:
        url: https://download.docker.com/linux/ubuntu/gpg
        dest: /etc/apt/keyrings/docker.asc
        mode: "0644"

    - name: Add Docker apt repository
      ansible.builtin.apt_repository:
        repo: >
          deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.asc]
          https://download.docker.com/linux/ubuntu
          {{ ansible_distribution_release }} stable
        state: present
        filename: docker

    - name: Install Docker CE (pinned version)
      ansible.builtin.apt:
        name:
          - docker-ce=5:24.0.7-1~ubuntu.22.04~jammy
          - docker-ce-cli=5:24.0.7-1~ubuntu.22.04~jammy
          - containerd.io
        state: present
        update_cache: true

    - name: Ensure Docker service is running
      ansible.builtin.service:
        name: docker
        state: started
        enabled: true
PLAY [Install Docker CE from official repository] *****************************

TASK [Gathering Facts] ********************************************************
ok: [192.168.1.10]

TASK [Install prerequisite packages] ******************************************
ok: [192.168.1.10]      <-- already installed

TASK [Create keyring directory] ************************************************
ok: [192.168.1.10]      <-- already exists

TASK [Download Docker GPG key] *************************************************
changed: [192.168.1.10]

TASK [Add Docker apt repository] ***********************************************
changed: [192.168.1.10]

TASK [Install Docker CE (pinned version)] **************************************
changed: [192.168.1.10]

TASK [Ensure Docker service is running] ****************************************
changed: [192.168.1.10]

PLAY RECAP ********************************************************************
192.168.1.10   : ok=7  changed=4  unreachable=0  failed=0  skipped=0

What just happened?

The play followed the correct ordering for adding a third-party apt repository — prerequisites first, then keyring, then GPG key, then the repo definition, then the install. Each step depends on the previous one. Prerequisites and keyring directory reported ok (already present); the remaining steps made changes. A second run produces all ok — fully idempotent.
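
To confirm the result from within the play, tasks like the following could be appended. This is a sketch, not part of the original playbook, and it assumes the managed node has the Python apt bindings that package_facts needs on Debian-family systems.

```yaml
- name: Gather installed package facts
  ansible.builtin.package_facts:
    manager: auto

- name: Confirm docker-ce is installed
  ansible.builtin.assert:
    that:
      - "'docker-ce' in ansible_facts.packages"
    fail_msg: docker-ce is not installed

- name: Show the installed docker-ce version
  ansible.builtin.debug:
    msg: "{{ ansible_facts.packages['docker-ce'][0].version }}"
```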

Package State Reference

Choosing the wrong state value is one of the most common causes of accidental upgrades in production.

Package module state values

present Install if not present; skip if already installed at any version. Use this in production — never causes unexpected upgrades.
latest Install or upgrade to the newest available version. Use with caution in production — a routine run can silently upgrade a critical dependency. Safe for dev, dangerous for prod.
absent Remove the package if installed. Idempotent — no error if already absent. Use for decommissioning services or removing conflicting packages.
fixed Attempt to correct a broken package state (apt only) — equivalent to apt --fix-broken install. Use when a previous install left the system in a partially broken state.

Never Use state: latest on Production Package Tasks

state: latest upgrades the package to its newest version every time the playbook runs. In production, a scheduled configuration run can silently upgrade a critical dependency and break a running application. Always use state: present. When you need to upgrade, do it deliberately with a version-pinned task — not by changing a state flag in an automated playbook.
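
A deliberate upgrade can be sketched as a separate play in which the target version is a variable changed through a reviewed commit. The group name webservers and the variable nginx_version are hypothetical, and the version string is a placeholder taken from the earlier pinning example.

```yaml
- name: Upgrade nginx deliberately
  hosts: webservers            # hypothetical inventory group
  become: true
  serial: 1                    # one host at a time to limit the impact of a bad release
  vars:
    nginx_version: "1.24.0-1~focal"   # placeholder; bump this value on purpose
  tasks:
    - name: Install the pinned nginx version
      ansible.builtin.apt:
        name: "nginx={{ nginx_version }}"
        state: present
        update_cache: true
```

With state: present and an explicit version, the package moves only when the variable changes, so routine runs stay at ok.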

Key Takeaways

state: directory with a loop creates full directory trees — equivalent to mkdir -p for each path, idempotent on every run.
Always validate sudoers changes with validate: "visudo -cf %s" — a syntax error in /etc/sudoers locks all users out of sudo. The validate parameter runs the check before writing.
Use blockinfile for files you do not fully own — its marker-delimited blocks leave a clear, removable footprint and are safe on shared config files managed by other tools.
Pin package versions in production. Using state: present with an explicit version prevents accidental upgrades. Only upgrade deliberately.
Use backup: true on copy, template, and lineinfile — it keeps a timestamped copy of the previous version, giving you an instant rollback path when a config change causes a problem.

Teacher's Note

Go through your existing playbooks and check two things: no package task uses state: latest, and any lineinfile task that touches sudoers includes the validate parameter. These two checks take five minutes and prevent two categories of production incidents.

Practice Questions

1. You want to use ansible.builtin.copy to duplicate a file that already exists on the managed node — not transfer from the control node. Which parameter and value do you add?



2. Which lineinfile parameter runs a validation command against the file before writing — essential for preventing broken sudoers or SSH configs?



3. Which package module state value should always be used in production tasks to prevent unexpected upgrades during routine playbook runs?



Quiz

1. Why is blockinfile safer than copy for adding configuration to a file also managed by another process?


2. After a deployment you need to collect application logs from every server to the control node. Which module and direction handles this?


3. You need to create four application directories with the same owner and permissions. What is the most concise and idempotent approach?


Up Next · Lesson 19

Service Management

Master Ansible's service and systemd modules — starting, stopping, enabling services, managing unit files, and orchestrating rolling service restarts across a fleet.