Jenkins Lesson 44 – Jenkins Anti-Patterns | Dataplexa
Section V · Lesson 44

Jenkins Anti-Patterns

Every Jenkins problem in Lesson 43 had a root cause. Most of those root causes were anti-patterns — habits, shortcuts, and "good enough for now" decisions that accumulated quietly until they broke something. This lesson names them all so you can recognise and stop them before they find you.

This lesson covers

Security anti-patterns → Pipeline anti-patterns → Operations anti-patterns → Organisational anti-patterns → The warning signs checklist → What good Jenkins looks like

Anti-patterns are the opposite of best practices — they're the things that seem reasonable at the time but compound into serious problems. Each one in this lesson was observed in a real production Jenkins environment, caused a real incident, and is more common than you'd expect. Consider this lesson a field guide written by people who hit every one of these walls.

The Analogy

Jenkins anti-patterns are like technical debt in a codebase. Each one is small and harmless on its own. But they compound. A credential hardcoded "just for now" becomes a permanent fixture. A pipeline with no timeout "never hangs anyway." No backup "because nothing has gone wrong yet." Three years later, one outage hits all three simultaneously and the post-mortem reads like a checklist of everything this lesson warned about.

Security Anti-Patterns

AP-01

Hardcoding credentials in Jenkinsfiles

What it looks like

sh "docker login -u admin -p SuperSecret123 registry.acmecorp.com"

Why it's catastrophic

The Jenkinsfile is in Git. Git history is forever. Rotating the password doesn't remove it from history. Anyone who ever cloned the repo has the credential. Automated secret scanners will find it.

The fix

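withCredentials([usernamePassword(credentialsId: 'docker-registry', usernameVariable: 'USER', passwordVariable: 'PASS')]) {
    // Single quotes: the shell expands the variables, so Groovy never interpolates the secret
    sh 'echo "$PASS" | docker login -u "$USER" --password-stdin registry.acmecorp.com'
}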
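In a declarative pipeline, the same idea can be sketched with the credentials() helper in an environment block. This assumes the 'docker-registry' ID is a username/password credential; the variable name REGISTRY is illustrative. For that credential type, Jenkins also defines REGISTRY_USR and REGISTRY_PSW.

```groovy
pipeline {
    agent any
    environment {
        // For a username/password credential, Jenkins sets REGISTRY (user:password)
        // plus REGISTRY_USR and REGISTRY_PSW
        REGISTRY = credentials('docker-registry')
    }
    stages {
        stage('Login') {
            steps {
                // Single quotes so the shell, not Groovy, expands the secret
                sh 'echo "$REGISTRY_PSW" | docker login -u "$REGISTRY_USR" --password-stdin registry.acmecorp.com'
            }
        }
    }
}
```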

AP-02

Running builds on the master

Setting the executor count on the master (called the controller, or built-in node, in current Jenkins releases) above 0 means build scripts run with access to the entire JENKINS_HOME, all credentials, and all plugin state. A malicious or accidentally broken build script can read, modify, or delete anything Jenkins manages. Set master executors to 0. Always.

The fix

Manage Jenkins → Manage Nodes → master → Configure → Number of executors: 0 (recent releases label the node "Built-In Node")
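The same setting can also be applied from the script console. This is a minimal sketch, assuming you have admin rights and run it under Manage Jenkins → Script Console; it is not the only way to make the change.

```groovy
import jenkins.model.Jenkins

// Run in the Jenkins script console (requires admin rights)
def jenkins = Jenkins.get()
println "Executors before: ${jenkins.numExecutors}"
jenkins.setNumExecutors(0)   // no builds run on the master itself
jenkins.save()               // make sure the change is written to disk
println "Executors after: ${jenkins.numExecutors}"
```

If the configuration is managed as code, the value belongs there instead, so the next reload does not reset it.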

AP-03

"Logged-in users can do anything" authorisation

Any authenticated user, including contractors, interns, and compromised accounts, has full admin rights. They can install plugins, modify security settings, read all credentials, and delete any job. Lesson 28 covered this, but it is worth repeating because it still turns up in production constantly.

The fix

Switch to Matrix-based or Role-Based authorisation. Run the security audit Groovy script from Lesson 28 against every Jenkins you inherit.
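As a quick first check on an inherited Jenkins, a script-console sketch like the one below reports which authorisation strategy is active. It is an illustrative check, not the Lesson 28 audit script, and it only flags the two most permissive built-in strategies.

```groovy
import jenkins.model.Jenkins
import hudson.security.AuthorizationStrategy
import hudson.security.FullControlOnceLoggedInAuthorizationStrategy

def strategy = Jenkins.get().authorizationStrategy
println "Authorisation strategy: ${strategy.class.name}"

if (strategy instanceof AuthorizationStrategy.Unsecured) {
    println 'RISK: anyone can do anything, even without logging in'
} else if (strategy instanceof FullControlOnceLoggedInAuthorizationStrategy) {
    println 'RISK: every logged-in user is effectively an admin'
    if (strategy.allowAnonymousRead) {
        println 'RISK: anonymous users also have read access'
    }
} else {
    println 'A finer-grained strategy is configured; review its individual grants next'
}
```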

Pipeline Anti-Patterns

AP-04

No pipeline timeout

A pipeline with no timeout() will hang forever if a waitUntil never resolves, a shell command blocks on stdin, or an external API stops responding. One hung build holds an executor for days, silently starving every other build in the queue.

The fix

options { timeout(time: 30, unit: 'MINUTES') }
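Timeouts can also be set per stage or around a single step. The sketch below shows all three levels; the script paths and the shorter limits are hypothetical and only illustrate the placement.

```groovy
pipeline {
    agent any
    options {
        // Hard ceiling for the whole run
        timeout(time: 30, unit: 'MINUTES')
    }
    stages {
        stage('Integration tests') {
            options {
                // Tighter, stage-level limit (10 minutes is illustrative)
                timeout(time: 10, unit: 'MINUTES')
            }
            steps {
                sh './scripts/integration-tests.sh'   // hypothetical script
            }
        }
        stage('Wait for service') {
            steps {
                // Step-level limit around a waitUntil that might never resolve
                timeout(time: 5, unit: 'MINUTES') {
                    waitUntil {
                        sh(script: './scripts/health-check.sh', returnStatus: true) == 0
                    }
                }
            }
        }
    }
}
```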

AP-05

500-line Jenkinsfiles with inline logic

A Jenkinsfile that parses JSON, generates dynamic stage names, loops through service lists, and implements custom retry logic — all inline — is unmaintainable. Nobody reviews it properly in PRs because it's too long to read. Nobody fixes it because nobody fully understands it. Changes break things nobody expected.

The fix

The Jenkinsfile orchestrates — it calls shell scripts and shared library functions. Complex logic moves to scripts/ in the repo or to the shared library. Target: Jenkinsfile under 50 lines for a typical service.
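A thin, orchestration-only Jenkinsfile might look like the sketch below. The script names under scripts/ are hypothetical, and the when { branch } condition assumes a multibranch pipeline.

```groovy
pipeline {
    agent any
    options { timeout(time: 30, unit: 'MINUTES') }
    stages {
        // Each stage delegates to a script that can be run and tested locally
        stage('Build')  { steps { sh './scripts/build.sh' } }
        stage('Test')   { steps { sh './scripts/test.sh' } }
        stage('Deploy') {
            when { branch 'main' }   // deploy only from the main branch
            steps { sh './scripts/deploy.sh' }
        }
    }
}
```

Because the logic lives in ordinary scripts, it can be reviewed, linted, and tested outside Jenkins.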

AP-06

Using retry() to hide flaky tests

Wrapping a test step in retry(3) because "it sometimes fails" is hiding a real problem — a non-deterministic test, a race condition, or an external dependency without proper stubbing. The test passes in CI, ships to production, and fails in production where you can't retry three times automatically.

The fix

Fix the flaky test. Use retry only for genuine transient infrastructure failures (network timeouts, rate limits) — not for test failures. If a test needs three attempts to pass, it's broken.
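One way to draw that line in a Jenkinsfile is sketched below: retry wraps only the network-bound step, and the test step runs exactly once. Both script names are hypothetical.

```groovy
pipeline {
    agent any
    stages {
        stage('Fetch dependencies') {
            steps {
                // Transient network failures are a reasonable use of retry
                retry(3) {
                    sh './scripts/fetch-dependencies.sh'   // hypothetical script
                }
            }
        }
        stage('Test') {
            steps {
                // No retry: a failing test should fail the build
                sh './scripts/test.sh'                     // hypothetical script
            }
        }
    }
}
```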

AP-07

Notifying on every green build

Sending a Slack message for every successful build trains the channel to be ignored. When a real failure notification arrives, it's buried in the noise. Teams mute the channel. The value of the notification system collapses to zero.

The fix

Notify on failure and recovery only. Use currentBuild.previousBuild?.result == 'FAILURE' to detect recovery. Every notification should be actionable — if it isn't, don't send it.
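A sketch of failure-and-recovery notifications follows. It assumes the Slack Notification plugin is installed (it provides slackSend); the channel name and build script are hypothetical.

```groovy
pipeline {
    agent any
    stages {
        stage('Build') { steps { sh './scripts/build.sh' } }
    }
    post {
        failure {
            slackSend(channel: '#ci-alerts', color: 'danger',
                      message: "FAILED: ${env.JOB_NAME} #${env.BUILD_NUMBER} (${env.BUILD_URL})")
        }
        success {
            script {
                // Only announce success when it follows a failure
                if (currentBuild.previousBuild?.result == 'FAILURE') {
                    slackSend(channel: '#ci-alerts', color: 'good',
                              message: "RECOVERED: ${env.JOB_NAME} #${env.BUILD_NUMBER}")
                }
            }
        }
    }
}
```

An ordinary green build that follows another green build sends nothing.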

AP-08

Using image:latest in Docker agents

agent { docker { image 'node:latest' } } means your build environment changes silently whenever the upstream image is updated. A Node.js major version bump, a security patch that breaks an API, a dependency change in the base image: all of these land invisibly in your CI without a PR, a review, or a changelog entry.

The fix

agent { docker { image 'node:20.11-alpine3.19' } }   // pinned — explicit upgrade required

Operations Anti-Patterns

AP-09

No backup — "nothing has gone wrong yet"

The most expensive anti-pattern. JENKINS_HOME contains every job config, all credentials, the entire build history, and all plugin configurations. It lives on one disk, on one server. When that disk fails — and it will — the question is whether you have a backup from last night or a backup from never.

The fix

Daily automated backup of JENKINS_HOME to a different physical location. Test the restore process quarterly. If you haven't restored from the backup, you don't have a backup — you have an untested file.

AP-10

Never cleaning up build history or workspaces

Without buildDiscarder(logRotator()), jobs accumulate builds indefinitely. Each build stores logs, test results, and artifacts. After 6 months of active use, JENKINS_HOME can reach 200GB, and the UI slows down because job pages have to load ever more build records.

The fix

options {
    buildDiscarder(logRotator(numToKeepStr: '20', artifactNumToKeepStr: '5'))
}
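The option above only trims build history. For workspaces, a minimal sketch is to delete the workspace when the build finishes, here with the built-in deleteDir() step; the build script name is hypothetical.

```groovy
pipeline {
    agent any
    options {
        buildDiscarder(logRotator(numToKeepStr: '20', artifactNumToKeepStr: '5'))
    }
    stages {
        stage('Build') { steps { sh './scripts/build.sh' } }
    }
    post {
        always {
            // deleteDir() removes the current directory, which here is the workspace
            deleteDir()
        }
    }
}
```

The trade-off is that every build starts from a clean checkout, so incremental build caches kept in the workspace are lost.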

AP-11

Installing every plugin that looks useful

Each plugin is code that loads into the Jenkins process, hooks into Jenkins internals, and must be kept updated for security. A Jenkins with 150 plugins, 80 of which nobody uses, carries 80 unnecessary pieces of attack surface, 80 potential compatibility failures on upgrade, and 80 extra suspects when the UI loads slowly.

The fix

Quarterly plugin audit. Disable any plugin with zero active usage. Remove it after one release cycle if nothing breaks. Maintain a plugins.txt as the authoritative list — if a plugin isn't in plugins.txt, it shouldn't be installed.
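As a starting point for the audit, the script-console sketch below lists every installed plugin as shortName:version, which can be compared against plugins.txt. It does not measure usage; deciding which plugins are unused remains a manual step.

```groovy
import jenkins.model.Jenkins

// sort(false) returns a sorted copy and leaves Jenkins' own list untouched
def plugins = Jenkins.get().pluginManager.plugins.sort(false) { it.shortName }

// One "shortName:version" line per installed plugin
plugins.each { plugin ->
    println "${plugin.shortName}:${plugin.version}"
}
```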

AP-12

Configuring Jenkins entirely through the UI

A Jenkins configured only through the UI has no version history, no review process, and no recovery path if the server dies. Every change is a silent, unreviewable modification to a black box. When the server needs to be rebuilt, you start from scratch trying to remember what you configured three years ago.

The fix

Use JCasC (Lesson 37), Job DSL (Lesson 36), and Jenkinsfiles (throughout Section II). The UI is for reading, not for writing. If you can't commit it to Git, you shouldn't be configuring it in Jenkins.
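To illustrate the Job DSL part, a seed script can define a pipeline job that would otherwise be created by hand in the UI. The job name and repository URL below are hypothetical.

```groovy
// Job DSL seed script; job name and repository URL are hypothetical
pipelineJob('payments-service') {
    description('Managed by Job DSL. Manual edits in the UI will be overwritten.')
    definition {
        cpsScm {
            scm {
                git {
                    remote { url('https://git.example.com/acme/payments-service.git') }
                    branch('main')
                }
            }
            // The pipeline itself lives in the repository
            scriptPath('Jenkinsfile')
        }
    }
}
```

Committed to Git, this script gives the job definition a history and a review step, and it can recreate the job on a rebuilt server.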

Organisational Anti-Patterns

AP-13

One Jenkins admin who's the only person who understands it

When the only person who knows how Jenkins works goes on holiday, gets sick, or leaves the company, the entire CI/CD system becomes a black box nobody can maintain. Any outage becomes a crisis because the institutional knowledge walked out the door.

The fix

At least two people should understand Jenkins deeply. JCasC and shared libraries are documented and in Git. The 5-minute triage checklist from Lesson 43 is in your team's runbook. Any engineer can diagnose common problems without the Jenkins expert.

AP-14

Treating a broken build as someone else's problem

A red build that everyone ignores because "it'll fix itself" or "that's the infrastructure team's job" creates a broken windows effect. Teams stop trusting the pipeline. They start bypassing it. Deployment discipline erodes. By the time someone fixes the root cause, nobody remembers the pipeline was supposed to be the safety net.

The fix

The team that broke the build owns fixing it. A broken build is a P1 — nothing else merges until the main branch is green. This is a cultural norm, not a technical one. Engineering leads enforce it.

AP-15

Adding agents to fix a slow master

When the Jenkins UI is slow and builds are queueing, the instinct is to add more agents. But if the master is the bottleneck — high CPU, heap pressure, thread exhaustion — adding agents increases master load and makes the problem worse. This anti-pattern causes expensive cloud bills and continued degradation.

The fix

Diagnose before scaling. Check executor utilisation and queue depth (Lesson 32). If executors are free but the UI is slow — tune the master (Lesson 35). Only add agents when the queue is full and executors are genuinely saturated.
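A rough first reading of executor utilisation and queue depth can be taken from the script console, as sketched below. The interpretation messages are a simplification; the dashboards from Lesson 32 give the trend over time that a single snapshot cannot.

```groovy
import jenkins.model.Jenkins

def jenkins = Jenkins.get()
int total = 0
int busy = 0

// Count executors on online nodes only
jenkins.computers.each { computer ->
    if (computer.isOnline()) {
        total += computer.countExecutors()
        busy  += computer.countBusy()
    }
}
int queued = jenkins.queue.items.length

println "Executors busy: ${busy}/${total}"
println "Queue depth:    ${queued}"
if (queued > 0 && busy < total) {
    println 'Executors are free while builds wait: check labels and master health before adding agents'
} else if (queued > 0) {
    println 'Executors are saturated: adding agents may help'
}
```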

Warning Signs — Your Jenkins Needs Attention

If you recognise three or more of these in your Jenkins, take a maintenance day:

The UI takes more than 5 seconds to load

OutOfMemoryError in jenkins.log in the last 30 days

Any pipeline has no timeout() configured

JENKINS_HOME has no backup, or the backup has never been restored as a test

Master executor count is above 0

Any job has 100+ builds retained with no log rotation

Authorisation is "Logged-in users can do anything"

Any Jenkinsfile contains a hardcoded password or token

Only one person in the team knows how Jenkins works

Jenkins is more than 2 LTS versions behind current

What Good Jenkins Looks Like

Every secret comes from credentials() or withCredentials, no exceptions
Master executor count is 0 — all builds on agents
Every pipeline has a timeout() in options
Jenkinsfiles are under 50 lines — logic is in scripts or shared library
JENKINS_HOME backed up daily, restore tested quarterly
Log rotation on every job — max 20 builds kept
Matrix-based or Role-Based authorisation — no "everyone is admin"
JCasC defines all master configuration — UI is read-only
Prometheus metrics watched — queue depth and heap in dashboards
Two or more people can diagnose and fix common Jenkins problems

Teacher's Note

If you walked through this lesson and recognised five of these anti-patterns in your own Jenkins — that's normal. This is what real production Jenkins systems look like. The value of this lesson is recognising them before the next incident makes them unavoidable.

Practice Questions

1. Why is using retry(3) to make a failing test pass in CI an anti-pattern rather than a solution?



2. Which Jenkinsfile options directive prevents build history from accumulating indefinitely and consuming all available disk and memory?



3. The Jenkins UI is slow and builds are queueing. Why is adding more build agents the wrong first response?



Quiz

1. A hardcoded password in a Jenkinsfile was discovered and rotated immediately. Why is this still a serious security problem?


2. What is the correct approach to pipeline notifications to avoid alert fatigue?


3. What is the fix for the anti-pattern of configuring Jenkins entirely through the UI?


Up Next · Lesson 45

Mini Project — End-to-End CI/CD Pipeline

Everything from this course comes together. Build a production-grade CI/CD pipeline from scratch — Git, Docker, Kubernetes, security, notifications, and failure handling — in one complete real-world project.