Multiprocessing in Python
Multiprocessing allows a program to run multiple processes at the same time. Unlike threads, each process has its own memory space, enabling true parallel execution. This is extremely useful for CPU-heavy tasks such as calculations, data processing, and scientific computing.
Why Do We Need Multiprocessing?
In CPython, the Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, so threads cannot speed up CPU-bound work. Multiprocessing sidesteps this limitation by launching separate processes instead of threads, each with its own interpreter and its own GIL, which makes it well suited to computation-heavy tasks.
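To see the effect, you can time the same CPU-bound function run serially and then across several processes (using the process pool introduced later on this page). This is a minimal sketch: busy_work and the input sizes are made up for illustration, and the exact speed-up depends on your machine.

import time
from multiprocessing import Pool

def busy_work(n):
    # Purely CPU-bound loop: threads cannot run this in parallel because of the GIL.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    inputs = [2_000_000] * 4

    start = time.perf_counter()
    serial = [busy_work(n) for n in inputs]
    print("Serial:  ", round(time.perf_counter() - start, 2), "seconds")

    start = time.perf_counter()
    with Pool() as pool:
        parallel = pool.map(busy_work, inputs)
    print("Parallel:", round(time.perf_counter() - start, 2), "seconds")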
The multiprocessing Module
Python provides the multiprocessing module to create and manage multiple processes.
Each process operates independently and runs on a different CPU core if available.
This allows large tasks to be divided into smaller parallel operations.
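One quick thing to check is how many cores are available, since that is a natural upper bound on the number of worker processes worth starting. A minimal sketch:

import multiprocessing

# Number of CPU cores the operating system reports to Python.
print("CPU cores available:", multiprocessing.cpu_count())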
Creating a Simple Process
A process behaves like a separate mini-program running alongside your script. Below is the simplest example of starting a new process that runs a custom function: start() launches it, and the main program is free to continue with other work until join() is called to wait for the child to finish.
from multiprocessing import Process

def greet():
    print("Hello from a process!")

if __name__ == "__main__":
    p = Process(target=greet)
    p.start()
    p.join()
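The if __name__ == "__main__": guard matters here: on Windows and macOS, child processes are started by importing the main module, and the guard keeps that import from creating processes all over again. You can also confirm that the child really is a separate operating-system process by printing process IDs. A small sketch using os.getpid():

from multiprocessing import Process
import os

def report():
    # Runs in the child, so this PID differs from the parent's.
    print("Child PID:", os.getpid())

if __name__ == "__main__":
    print("Parent PID:", os.getpid())
    p = Process(target=report)
    p.start()
    p.join()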
Processes with Arguments
You can pass arguments to a process's target function just as you would with threads, which makes it easy to divide a workload among processes. Each process receives its own copy of the arguments rather than a shared reference.
from multiprocessing import Process

def show(msg):
    print("Message:", msg)

if __name__ == "__main__":
    p = Process(target=show, args=("Running...",))
    p.start()
    p.join()
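Because each process gets its own copy, changes the child makes to an argument are not visible back in the parent. A minimal sketch to illustrate (the data list is just an example):

from multiprocessing import Process

def extend(items):
    items.append("added in child")
    print("Inside child:", items)

if __name__ == "__main__":
    data = ["original"]
    p = Process(target=extend, args=(data,))
    p.start()
    p.join()
    print("In parent: ", data)   # still ['original']: the child modified its own copy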
Running Multiple Processes
Multiple processes can run at the same time across different CPU cores. This helps large datasets or heavy-computation tasks complete much faster. Each process runs independently and does not block others.
from multiprocessing import Process

def task(n):
    print("Task number:", n)

if __name__ == "__main__":
    processes = []
    for i in range(5):
        p = Process(target=task, args=(i,))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()
Using a Process Pool
A process pool manages a group of worker processes for you. Work submitted to the pool is distributed among the workers, and by default one worker is created per CPU core. This approach is efficient when the same function must be applied to many inputs.
from multiprocessing import Pool

def square(n):
    return n * n

if __name__ == "__main__":
    with Pool() as pool:
        results = pool.map(square, [1, 2, 3, 4, 5])
        print(results)
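A couple of useful variations: Pool(processes=N) caps the number of workers, and pool.starmap unpacks a tuple of arguments for functions that take more than one parameter. A short sketch:

from multiprocessing import Pool

def power(base, exponent):
    return base ** exponent

if __name__ == "__main__":
    pairs = [(2, 3), (3, 2), (4, 2)]
    with Pool(processes=2) as pool:           # use at most two worker processes
        print(pool.starmap(power, pairs))     # [8, 9, 16]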
Sharing Data Between Processes
Because processes do not share memory by default, Python provides tools such as
Value, Array, and Manager for communication.
These help tasks coordinate results safely across multiple processes.
from multiprocessing import Process, Value

def increment(value):
    with value.get_lock():    # the lock makes the read-modify-write atomic
        value.value += 1

if __name__ == "__main__":
    num = Value('i', 0)
    p1 = Process(target=increment, args=(num,))
    p2 = Process(target=increment, args=(num,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print(num.value)    # 2
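For richer shared state than a single number, a Manager provides shared lists and dictionaries that processes update through a proxy object. A minimal sketch:

from multiprocessing import Process, Manager

def record(results, key, value):
    results[key] = value              # writes go through the manager's proxy

if __name__ == "__main__":
    with Manager() as manager:
        results = manager.dict()
        workers = [Process(target=record, args=(results, i, i * i)) for i in range(3)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        print(dict(results))          # {0: 0, 1: 1, 2: 4}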
Multiprocessing vs Multithreading
Multithreading is ideal for I/O-bound tasks such as reading files or making network requests, because threads spend most of their time waiting. Multiprocessing is ideal for CPU-bound tasks such as numerical calculations or large data transformations. Choosing the right model improves performance and resource usage.
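The standard-library concurrent.futures module makes this choice easy to express, because ThreadPoolExecutor and ProcessPoolExecutor share the same interface. The sketch below uses a made-up CPU-bound function; for I/O-bound work you would swap in ThreadPoolExecutor.

from concurrent.futures import ProcessPoolExecutor

def crunch(n):
    # CPU-bound work: benefits from processes, not threads.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    numbers = [100_000, 200_000, 300_000]
    with ProcessPoolExecutor() as executor:
        print(list(executor.map(crunch, numbers)))
    # For I/O-bound tasks (file reads, network calls), ThreadPoolExecutor
    # would be used the same way.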
Real-World Uses of Multiprocessing
It is used for image processing and video encoding where heavy computation is required. It speeds up data cleaning and transformations during data analysis. It powers many scientific and machine-learning workflows that demand parallel execution.
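As one concrete sketch of the data-cleaning case, a per-record cleaning function can be mapped over the records with a pool; clean_record and the sample strings here are placeholders for illustration.

from multiprocessing import Pool

def clean_record(record):
    # Hypothetical cleaning step: strip whitespace and normalize case.
    return record.strip().lower()

if __name__ == "__main__":
    raw = ["  Alice ", "BOB", "  Carol  "]
    with Pool() as pool:
        print(pool.map(clean_record, raw))    # ['alice', 'bob', 'carol']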
📝 Practice Exercises
Exercise 1
Create a process that prints numbers from 1 to 5.
Exercise 2
Create three processes that each print a unique ID.
Exercise 3
Use a process pool to compute cubes of numbers from 1 to 5.
Exercise 4
Create a shared counter and increment it using two processes.
✅ Practice Answers
Answer 1
from multiprocessing import Process

def show():
    for i in range(1, 6):
        print(i)

if __name__ == "__main__":
    p = Process(target=show)
    p.start()
    p.join()
Answer 2
from multiprocessing import Process

def display(worker_id):
    print("Process ID:", worker_id)

if __name__ == "__main__":
    processes = []
    for i in range(3):
        p = Process(target=display, args=(i,))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()
Answer 3
from multiprocessing import Pool

def cube(n):
    return n ** 3

if __name__ == "__main__":
    with Pool() as pool:
        result = pool.map(cube, [1, 2, 3, 4, 5])
        print(result)
Answer 4
from multiprocessing import Process, Value

def inc(val):
    with val.get_lock():    # keep the increment atomic across processes
        val.value += 1

if __name__ == "__main__":
    counter = Value('i', 0)
    p1 = Process(target=inc, args=(counter,))
    p2 = Process(target=inc, args=(counter,))
    p1.start(); p2.start()
    p1.join(); p2.join()
    print("Final value:", counter.value)