CI/CD Lesson 21 – CI/CD Architecture Design | Dataplexa
Section III · Lesson 21

CI/CD Architecture Design

In this lesson

Architecture Principles · Runner Architecture · Monorepo vs Polyrepo · Reusable Workflows · Scaling the Pipeline

CI/CD architecture design is the discipline of structuring pipelines, runners, workflows, and repository layouts so that the delivery system scales with the organisation — remaining fast, maintainable, and reliable as the team grows, the codebase expands, and the number of services multiplies. A pipeline that works for a team of five often becomes a bottleneck for a team of fifty, not because the tools changed but because the architecture was never designed to scale. Section III covers how to build CI/CD systems that grow well — starting with the architectural decisions that shape everything else.

Core Architecture Principles

Before choosing tools or writing pipeline YAML, the architecture of a CI/CD system is shaped by a small set of principles that determine whether the system will remain manageable as it grows. Violating these principles early produces technical debt that compounds — pipelines that are hard to change, slow to run, and brittle to maintain.

Five Principles of Scalable CI/CD Architecture

🔁
Pipelines are code
Pipeline definitions live in version control alongside the application code they build. They are reviewed, versioned, and changed through the same PR process as everything else. A pipeline defined only in a UI is invisible, unreviewed, and impossible to audit. This is covered in depth in Lesson 22.
🧩
Reuse over repetition
Common pipeline steps — dependency installation, Docker builds, deployment scripts — are defined once and referenced everywhere. Copy-pasting pipeline YAML across repositories is the pipeline equivalent of code duplication: it multiplies maintenance cost with every copy.
⚡
Fast feedback first
The architecture must prioritise developer feedback speed. The fastest checks run earliest. Parallelism is used wherever possible. A pipeline designed for correctness but not speed will be worked around — developers will merge without waiting for results.
🔒
Least privilege throughout
Each pipeline job should have only the permissions it needs for its specific task. A test job does not need production deployment credentials. A build job does not need database access. Scoping permissions tightly limits the blast radius if a pipeline is compromised. Covered in Lesson 26.
📊
Observability built in
Pipeline runs, durations, failure rates, and deployment frequencies should be measurable. An architecture that produces no metrics about itself cannot be improved systematically. Build pipelines that emit the data needed to track DORA metrics from day one.
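The least-privilege principle maps directly onto GitHub Actions' permissions key: set a read-only default at the workflow level and widen it only for the jobs that need more. The sketch below is illustrative — the job names, scripts, and use of OIDC for cloud auth are assumptions, not a prescribed setup.

```yaml
# Illustrative workflow fragment — job names and scripts are hypothetical.
# Workflow-level default: every job can read the repo and nothing else.
permissions:
  contents: read

jobs:
  test:
    runs-on: ubuntu-latest
    # Inherits the read-only default — no deploy credentials, no token writes.
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test

  deploy:
    needs: test
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write   # only the deploy job may mint an OIDC token for cloud auth
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/deploy.sh   # hypothetical deploy script
```

The point is the asymmetry: the test job physically cannot deploy, so a compromised test dependency cannot reach production.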

The City Infrastructure Analogy

A city designed for a population of 10,000 does not scale gracefully to 500,000. The roads are too narrow, the power grid is undersized, and the water system was never designed for the load. Retrofitting infrastructure into a grown city is enormously expensive and disruptive. CI/CD architecture has the same property: decisions made early — repository structure, runner capacity, workflow reuse patterns — become expensive to change later. Designing for scale from the beginning does not mean building everything now; it means making decisions that leave room to grow without requiring a full rebuild.

Runner Architecture — Hosted vs Self-Hosted

Every pipeline job runs on a runner — a machine that executes the steps. GitHub Actions offers hosted runners (managed by GitHub, provisioned on demand, destroyed after each job) and self-hosted runners (machines the team provisions, maintains, and registers with GitHub). The choice between them is one of the most consequential infrastructure decisions in CI/CD architecture.

Hosted vs Self-Hosted Runners — Trade-offs

GitHub-Hosted Runners
Zero maintenance — GitHub provisions, patches, and retires runners automatically
Clean environment on every job — no state leakage between runs
Standard hardware only — 2–4 CPU cores, 7–14 GB RAM on the free tier
Per-minute pricing — cost scales directly with pipeline usage

Self-Hosted Runners
Full control over hardware, OS, installed tools, and network access
Can run inside a private network — access to internal services without exposing them publicly
Any hardware — GPU runners, high-memory machines, ARM instances
Fixed infrastructure cost — efficient at high volume, but requires capacity planning

Most teams start with hosted runners and move selected workloads to self-hosted runners as volume or requirements grow. Common triggers for self-hosted runners: compliance requirements that prohibit code running on third-party infrastructure, integration tests that need access to internal services, builds that require specialised hardware, or volume high enough that self-hosted is significantly cheaper than per-minute billing.
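In practice the split is expressed per job through runs-on labels. A sketch of a mixed workflow, assuming a self-hosted runner was registered with the labels shown (the internal label and script names are illustrative):

```yaml
jobs:
  unit-tests:
    runs-on: ubuntu-latest               # GitHub-hosted: clean, disposable, per-minute billing
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test

  integration-tests:
    # Self-hosted: runs inside the private network, so it can reach internal
    # services directly. The labels must match those assigned when the runner
    # was registered — `internal` here is a hypothetical label.
    runs-on: [self-hosted, linux, internal]
    steps:
      - uses: actions/checkout@v4
      - run: ./run-integration-tests.sh   # hypothetical test script
```

Mixing runner types in one workflow lets a team keep cheap, stateless work on hosted runners while routing only the jobs with special requirements to self-hosted capacity.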

Monorepo vs Polyrepo — Repository Structure and Pipeline Implications

The structure of repositories is one of the most debated decisions in software engineering, and it has direct consequences for CI/CD architecture. A monorepo stores multiple services or packages in a single repository. A polyrepo gives each service its own repository. Each approach requires a different pipeline architecture.

Pipeline Implications of Monorepo vs Polyrepo

Monorepo
One repository, one set of CI workflows. Requires path filtering — only triggering pipeline jobs for the services whose code has actually changed. Without path filtering, every commit rebuilds and redeploys everything, making pipelines slow and expensive. GitHub Actions supports path filtering natively with the paths trigger filter.
Polyrepo
Each service has its own pipeline, independently triggered and independently deployed. Simpler per-service pipeline logic, but requires coordination tooling when cross-service integration testing is needed. Reusable workflows (below) become critical — without them, the same pipeline boilerplate is duplicated across dozens of repositories.
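Path filtering in a monorepo might look like the fragment below — a sketch assuming a hypothetical services/ directory layout with a payments service and a shared library:

```yaml
# .github/workflows/payments-service.yml — hypothetical monorepo layout
on:
  push:
    branches: [main]
    paths:
      - 'services/payments/**'   # run only when the payments service changes...
      - 'shared/libs/**'         # ...or a shared library it depends on

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make -C services/payments build   # illustrative build command
```

A commit touching only services/billing/ never triggers this workflow, so runner time and feedback latency scale with the size of the change, not the size of the repository.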

Reusable Workflows — Eliminating Pipeline Duplication

GitHub Actions supports reusable workflows — workflow files that can be called from other workflows, passing inputs and secrets, and returning outputs. This is the mechanism that prevents pipeline logic from being duplicated across every repository in a polyrepo organisation, or across every service in a monorepo.

A platform team defines a reusable workflow for the standard build-and-push process, the standard test matrix, and the standard deploy procedure. Each service's pipeline calls these central workflows rather than reimplementing them. When the platform team improves the build process, every service gets the improvement automatically on the next run — not after each team manually updates their own copy. This is the pipeline equivalent of a shared library, and it is the foundation of platform engineering as described in Lesson 10.

Calling a Reusable Workflow — GitHub Actions

# .github/workflows/service-pipeline.yml
# Each service's own pipeline — calls centralised reusable workflows

on:
  push:
    branches: [main]

jobs:
  build:
    uses: myorg/.github/.github/workflows/build-and-push.yml@main   # Centralised build workflow
    with:
      image-name: my-service
      dockerfile: ./Dockerfile
    secrets: inherit                                                  # Pass secrets from caller

  test:
    needs: build
    uses: myorg/.github/.github/workflows/run-tests.yml@main        # Centralised test workflow
    with:
      image-tag: ${{ needs.build.outputs.image-tag }}

  deploy:
    needs: test
    uses: myorg/.github/.github/workflows/deploy.yml@main           # Centralised deploy workflow
    with:
      environment: production
      image-tag: ${{ needs.build.outputs.image-tag }}
    secrets: inherit

What just happened?

This service pipeline contains almost no implementation logic — it calls three centralised reusable workflows maintained by a platform team. Every service in the organisation uses the same build, test, and deploy workflows. When the platform team improves one, every service benefits automatically. The service team owns only the inputs — the image name, the Dockerfile path, the target environment.
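The called side of this arrangement is not shown above. A centralised workflow of that shape might look roughly like the sketch below — the inputs, outputs, and tagging scheme are assumptions modelled on what the caller passes in, not the definitive implementation:

```yaml
# myorg/.github/.github/workflows/build-and-push.yml — sketch of the called side
on:
  workflow_call:
    inputs:
      image-name:
        required: true
        type: string
      dockerfile:
        required: false
        type: string
        default: ./Dockerfile
    outputs:
      image-tag:
        description: Tag of the image that was built
        value: ${{ jobs.build.outputs.image-tag }}

jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      image-tag: ${{ steps.tag.outputs.tag }}
    steps:
      - uses: actions/checkout@v4
      - id: tag
        # Hypothetical tagging scheme: image name plus short commit SHA
        run: echo "tag=${{ inputs.image-name }}:${GITHUB_SHA::7}" >> "$GITHUB_OUTPUT"
      - run: docker build -f ${{ inputs.dockerfile }} -t ${{ steps.tag.outputs.tag }} .
```

The workflow_call trigger is what makes a workflow reusable: it declares the typed inputs the caller supplies via with, and the outputs the caller reads via needs.build.outputs.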

Warning: Copy-Pasted Pipeline YAML Across Repositories Creates a Maintenance Crisis

When a security patch requires updating the Node.js version used in the build step, a team with 30 repositories that each contain a copy of the same pipeline YAML must make that change 30 times — across 30 PRs, with 30 code reviews, with the risk that some repositories are missed. This is not a hypothetical: it is the most common pipeline maintenance failure pattern in growing organisations. Reusable workflows, composite actions, and centralised pipeline libraries exist precisely to prevent it. If your organisation's pipeline code is not DRY, it is accumulating a maintenance debt that will eventually be paid during a security incident.

Key Takeaways from This Lesson

Architecture decisions compound early — repository structure, runner choice, and workflow reuse patterns made for a five-person team become expensive to change at fifty. Design for the organisation you will be, not just the one you are today.
Self-hosted runners solve specific problems, not general ones — start with hosted runners and move workloads to self-hosted only when compliance, network access, hardware, or cost requirements make it necessary.
Monorepos require path filtering — without it, every commit triggers the entire pipeline for every service, wasting runner time and slowing feedback. The paths trigger filter in GitHub Actions enables per-service pipeline scoping.
Reusable workflows eliminate pipeline duplication — centralising build, test, and deploy logic means improvements propagate automatically and security patches are applied once, not once per repository.
Least privilege and observability are architectural requirements, not afterthoughts — scoping job permissions tightly and building in metrics collection from the start is far easier than retrofitting them into a pipeline system that grew without them.

Teacher's Note

If your organisation has more than five repositories and no reusable workflows, count how many files contain the line npm ci — that number is the minimum size of your pipeline maintenance burden every time a Node version needs updating.

Practice Questions

Answer in your own words — then check against the expected answer.

1. What GitHub Actions feature allows a workflow file to be called from other workflows — enabling a platform team to define centralised build, test, and deploy logic that every service's pipeline references rather than reimplementing?



2. What technique — supported natively by GitHub Actions using the paths trigger key — prevents a monorepo pipeline from rebuilding and redeploying every service on every commit by only triggering jobs for services whose code has actually changed?



3. What type of GitHub Actions runner does a team provision and maintain themselves — giving full control over hardware, operating system, and private network access — typically adopted when compliance, specialised hardware, or high pipeline volume makes hosted runners unsuitable?



Lesson Quiz

1. A team migrates from a polyrepo to a monorepo containing 12 microservices. They copy their existing pipelines into the monorepo without modification. Every PR now takes 40 minutes. What is the most likely cause?


2. A security review finds that the test job in a pipeline has access to the same production deployment credentials as the deploy job. What principle does this violate and what is the risk?


3. An organisation has 30 repositories, each with a copy of the same pipeline YAML that includes a setup-node step. A new Node.js LTS version needs to be adopted. What architecture change would make this a single-file update rather than 30 pull requests?


Up Next · Lesson 22

Pipeline as Code

Pipelines defined in UI tools are invisible and unauditable. Pipeline as code means your delivery system is versioned, reviewed, and tested like everything else — and it changes the way teams think about pipeline quality.