Python Course
Multiprocessing
In the last lesson you saw that Python's Global Interpreter Lock prevents threads from running CPU-bound code in true parallel. Multiprocessing is the solution. Instead of multiple threads sharing one process, multiprocessing spawns entirely separate processes — each with its own Python interpreter and memory space, each with its own GIL. True parallelism, full use of every CPU core.
This lesson covers Python's multiprocessing module, the ProcessPoolExecutor, inter-process communication, shared memory, and the practical patterns for squeezing maximum performance out of CPU-intensive work.
Processes vs Threads — The Key Differences
- Threads share memory within one process — lightweight, fast to create, limited by the GIL for CPU work
- Processes have completely separate memory — heavier to start, immune to the GIL, true parallelism on multiple cores
- Use threads for I/O-bound work: downloading files, making API calls, reading databases
- Use processes for CPU-bound work: image processing, number crunching, machine learning preprocessing, compression
Creating Processes
multiprocessing.Process works almost identically to threading.Thread — pass a target function, call start(), then join(). The critical difference is that each process runs in a completely separate memory space.
Real-world use: a video transcoding pipeline spawns one process per video file — each process runs on its own CPU core, transcoding all files simultaneously.
```python
# Creating processes — CPU-bound work in parallel
import multiprocessing
import time

def cpu_task(name, n):
    """Compute-heavy task — sum of squares."""
    result = sum(i * i for i in range(n))
    print(f"[{name}] Result: {result:,}")

if __name__ == "__main__":  # required guard on Windows and macOS
    n = 5_000_000

    # Sequential — one after another
    start = time.perf_counter()
    cpu_task("Task-1", n)
    cpu_task("Task-2", n)
    cpu_task("Task-3", n)
    print(f"Sequential: {time.perf_counter() - start:.2f}s\n")

    # Parallel — all three at once on separate CPU cores
    start = time.perf_counter()
    processes = [
        multiprocessing.Process(target=cpu_task, args=(f"Task-{i}", n))
        for i in range(1, 4)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print(f"Parallel: {time.perf_counter() - start:.2f}s")
```
```
[Task-1] Result: 41,666,658,333,325,000
[Task-2] Result: 41,666,658,333,325,000
[Task-3] Result: 41,666,658,333,325,000
Sequential: 4.82s

[Task-2] Result: 41,666,658,333,325,000
[Task-1] Result: 41,666,658,333,325,000
[Task-3] Result: 41,666,658,333,325,000
Parallel: 1.74s
```
- The if __name__ == "__main__": guard is required on Windows and macOS — without it, spawning new processes causes infinite recursion
- Each process gets a full copy of the program's memory at the time of spawning — changes in one process do not affect others
- p.start() launches the process, p.join() waits for it to finish — same pattern as threads
- Spawning processes is slower than starting threads — only worthwhile for tasks that take more than a fraction of a second
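The guard exists because of how new processes are started. A minimal sketch, assuming nothing beyond the standard multiprocessing API (the work function is a stand-in):

```python
# Sketch: inspecting the start method — why the __main__ guard exists
import multiprocessing

def work(x):
    return x + 1  # trivial stand-in task

if __name__ == "__main__":
    # "spawn" (the default on Windows and macOS) launches a fresh interpreter
    # that re-imports this module — code outside the guard would re-run and
    # spawn processes recursively. "fork" (the historical Linux default)
    # copies the running parent instead.
    print(multiprocessing.get_start_method())

    ctx = multiprocessing.get_context("spawn")  # request spawn explicitly
    p = ctx.Process(target=work, args=(1,))
    p.start()
    p.join()
    print(p.exitcode)  # 0 on a clean exit
```

Requesting a specific start method via get_context keeps behavior consistent across operating systems.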
ProcessPoolExecutor — The Modern Approach
concurrent.futures.ProcessPoolExecutor is the high-level, recommended way to use multiprocessing. It manages a pool of worker processes, distributes work automatically, and collects results cleanly.
Real-world use: processing a dataset of 100,000 images — submit all resize operations to the pool and collect results as they finish, using all available CPU cores automatically.
```python
# ProcessPoolExecutor — high-level process pool
from concurrent.futures import ProcessPoolExecutor
import time

def is_prime(n):
    """CPU-bound — check if a number is prime."""
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

def count_primes(start, end):
    """Count primes in a range."""
    return sum(1 for n in range(start, end) if is_prime(n))

if __name__ == "__main__":
    ranges = [(0, 250_000), (250_000, 500_000),
              (500_000, 750_000), (750_000, 1_000_000)]
    start = time.perf_counter()
    with ProcessPoolExecutor() as executor:
        futures = [executor.submit(count_primes, s, e) for s, e in ranges]
        results = [f.result() for f in futures]
    total = sum(results)
    print(f"Primes under 1,000,000: {total:,}")
    print(f"Time: {time.perf_counter() - start:.2f}s")
```
```
Primes under 1,000,000: 78,498
Time: 0.81s
```
- ProcessPoolExecutor() with no arguments defaults to the number of CPU cores on the machine
- executor.submit(fn, *args) schedules work and returns a Future — identical API to ThreadPoolExecutor
- executor.map(fn, iterable) is a simpler form that returns results in submission order
- The with block waits for all tasks and shuts down the pool cleanly
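When tasks take different amounts of time, concurrent.futures.as_completed yields each Future the moment it finishes rather than in submission order. A minimal sketch reusing prime-counting functions like those above:

```python
# Sketch: handling results in completion order with as_completed
from concurrent.futures import ProcessPoolExecutor, as_completed

def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True

def count_primes(start, end):
    return sum(1 for n in range(start, end) if is_prime(n))

if __name__ == "__main__":
    ranges = [(0, 100_000), (100_000, 200_000), (200_000, 300_000)]
    with ProcessPoolExecutor() as executor:
        # map each future back to its input so results can be labeled
        future_to_range = {executor.submit(count_primes, s, e): (s, e)
                           for s, e in ranges}
        for fut in as_completed(future_to_range):  # fastest result first
            s, e = future_to_range[fut]
            print(f"{s:,}-{e:,}: {fut.result():,} primes")
```

The dict-of-futures pattern is the standard way to recover which input produced which result when completion order is unpredictable.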
Pool.map — Simple Parallel Mapping
For the common pattern of applying one function to many inputs, multiprocessing.Pool.map is the most concise tool. It splits the work across processes and collects results in the original order.
```python
# Pool.map — apply a function to many inputs in parallel
from multiprocessing import Pool
import time

def square(n):
    return n * n

if __name__ == "__main__":
    numbers = list(range(1, 13))

    start = time.perf_counter()
    with Pool() as pool:
        results = pool.map(square, numbers)  # distributes across all cores
    print("Squares:", results)
    print(f"Time: {time.perf_counter() - start:.4f}s")

    # pool.starmap for functions that take multiple arguments
    pairs = [(2, 3), (4, 5), (6, 7)]
    with Pool() as pool:
        results = pool.starmap(pow, pairs)  # [8, 1024, 279936]
    print("Powers:", results)
```
```
Squares: [1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144]
Time: 0.0821s
Powers: [8, 1024, 279936]
```
- pool.map(fn, iterable) — apply fn to every item; results returned in order
- pool.starmap(fn, iterable_of_tuples) — like map but unpacks each tuple as separate arguments
- pool.imap(fn, iterable) — lazy version, yields results one at a time without loading all into memory
- Always use with Pool() as pool: to ensure the pool is properly terminated
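The lazy variant deserves its own sketch. pool.imap yields results as an iterator, and its chunksize parameter batches items per worker round-trip, which matters for large inputs (cube is a stand-in function):

```python
# Sketch: lazy results with Pool.imap and chunksize
from multiprocessing import Pool

def cube(n):
    return n ** 3

if __name__ == "__main__":
    with Pool() as pool:
        # chunksize=2 ships items to workers in batches of two,
        # reducing inter-process round-trips for long iterables
        for result in pool.imap(cube, range(5), chunksize=2):
            print(result)  # 0, 1, 8, 27, 64 — yielded lazily, in order
```

Unlike pool.map, nothing forces the whole result list into memory at once, so imap suits pipelines that process results as they stream in.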
Inter-Process Communication — Queue and Pipe
Processes cannot share variables directly — each has its own memory. To pass data between processes, use Queue (multi-producer, multi-consumer) or Pipe (two-way connection between exactly two processes).
```python
# Queue — safe communication between processes
from multiprocessing import Process, Queue

def producer(q, items):
    for item in items:
        q.put(item)
        print(f"Produced: {item}")
    q.put(None)  # sentinel — signals consumer to stop

def consumer(q):
    while True:
        item = q.get()
        if item is None:
            break
        print(f"Consumed: {item * 2}")

if __name__ == "__main__":
    q = Queue()
    p1 = Process(target=producer, args=(q, [10, 20, 30, 40]))
    p2 = Process(target=consumer, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
```
```
Produced: 10
Produced: 20
Produced: 30
Produced: 40
Consumed: 20
Consumed: 40
Consumed: 60
Consumed: 80
```
- Queue.put(item) adds to the queue; Queue.get() removes and returns the next item — both are process-safe
- Use a sentinel value (like None) to signal a consumer that no more items are coming
- multiprocessing.Pipe() returns a pair of connection objects for two-way communication between exactly two processes
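Pipe, mentioned above but not shown, can be sketched like this — each end is a connection object with send() and recv():

```python
# Sketch: two-way communication over a Pipe
from multiprocessing import Pipe, Process

def worker(conn):
    msg = conn.recv()        # block until the other end sends
    conn.send(msg.upper())   # reply over the same connection
    conn.close()

if __name__ == "__main__":
    parent_conn, child_conn = Pipe()  # two connected endpoints
    p = Process(target=worker, args=(child_conn,))
    p.start()
    parent_conn.send("hello")
    print(parent_conn.recv())  # HELLO
    p.join()
```

A Pipe is faster than a Queue for exactly two endpoints, but unlike Queue it is not safe for multiple processes writing to the same end concurrently.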
Shared Memory — Value and Array
For simple cases where processes need to share a single number or a fixed-size array, multiprocessing.Value and multiprocessing.Array provide shared memory with process-safe locking.
```python
# Shared memory — Value and Array
from multiprocessing import Process, Value, Array
import ctypes

def increment(counter, n):
    for _ in range(n):
        with counter.get_lock():  # process-safe lock
            counter.value += 1

if __name__ == "__main__":
    counter = Value(ctypes.c_int, 0)  # shared integer, starting at 0
    processes = [Process(target=increment, args=(counter, 50_000))
                 for _ in range(4)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print("Final counter:", counter.value)  # always 200,000
```
- Value(typecode, initial) — shared scalar; use C type codes like ctypes.c_int, ctypes.c_double
- Array(typecode, size) — shared fixed-size array
- Always use value.get_lock() as a context manager when modifying shared values from multiple processes
- For anything more complex, use a Queue or a Manager instead
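For structures richer than a scalar or fixed-size array, a Manager provides proxy objects — dicts, lists, locks — that forward operations to a server process. A minimal sketch:

```python
# Sketch: a Manager-backed dict shared across processes
from multiprocessing import Manager, Process

def record(shared, key, value):
    shared[key] = value  # the proxy forwards this update to the manager

if __name__ == "__main__":
    with Manager() as manager:
        shared = manager.dict()  # also available: manager.list(), manager.Lock()
        procs = [Process(target=record, args=(shared, f"p{i}", i * 10))
                 for i in range(3)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(dict(shared))  # entries from all three workers
```

Manager proxies are slower than Value/Array because every access is a round-trip to the manager process, but they handle arbitrary picklable data.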
Choosing Between Threads and Processes
| Factor | Threads | Processes |
|---|---|---|
| Best for | I/O-bound (network, disk, DB) | CPU-bound (computation, image processing) |
| Memory | Shared — fast, but needs locking | Separate — safe, but needs IPC |
| GIL impact | Limited by GIL for CPU work | Each process has its own GIL — true parallelism |
| Startup cost | Fast — milliseconds | Slower — tens of milliseconds |
| Communication | Direct shared variables (with locks) | Queue, Pipe, Value, Array |
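The table can be condensed into a small dispatch helper — run_parallel is a hypothetical name for illustration, not a standard API:

```python
# Hypothetical helper: pick the pool class from the workload type
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def run_parallel(fn, items, cpu_bound):
    """Map fn over items with processes (CPU-bound) or threads (I/O-bound)."""
    pool_cls = ProcessPoolExecutor if cpu_bound else ThreadPoolExecutor
    # With processes, fn must be picklable — i.e. a module-level function.
    with pool_cls() as executor:
        return list(executor.map(fn, items))

if __name__ == "__main__":
    print(run_parallel(abs, [-1, 2, -3], cpu_bound=False))  # [1, 2, 3]
```

Because both executors share the same Executor interface, switching between threads and processes is a one-line change.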
Practice Questions
Practice 1. Why does multiprocessing bypass the GIL where multithreading cannot?
Practice 2. Why is the if __name__ == "__main__": guard required when using multiprocessing?
Practice 3. What is the difference between pool.map() and pool.starmap()?
Practice 4. What tool do you use to safely pass data between two processes?
Practice 5. What lock method should you use when modifying a shared Value from multiple processes?
Quiz
Quiz 1. What is the key reason to use multiprocessing over multithreading for CPU-bound tasks?
Quiz 2. What does ProcessPoolExecutor() with no arguments default to?
Quiz 3. Why can processes not share variables directly the way threads can?
Quiz 4. What is the role of a sentinel value like None in a multiprocessing Queue?
Quiz 5. When would you choose threads over processes even for a performance-sensitive task?
Next up — Working with APIs: making HTTP requests, handling responses, and consuming REST APIs in Python.