NumPy Lesson 10 – Joining & Slicing | Dataplexa

Joining and Splitting NumPy Arrays

In real-world data processing, data is often spread across multiple arrays. NumPy provides powerful functions to join arrays together and split arrays into smaller parts.

These operations are heavily used in data preprocessing, feature engineering, and numerical workflows.


Why Joining and Splitting Matters

Joining and splitting arrays helps when:

  • Combining data from multiple sources
  • Separating training and testing datasets
  • Breaking large datasets into manageable chunks
  • Performing batch computations

Joining Arrays Using concatenate()

The concatenate() function joins a sequence of arrays along an existing axis, which defaults to axis=0.

import numpy as np

a = np.array([10, 20, 30])
b = np.array([40, 50, 60])

combined = np.concatenate((a, b))
print(combined)

Output:

[10 20 30 40 50 60]

Both arrays are joined end-to-end into a single array.
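Because concatenate() takes a sequence of arrays, more than two can be joined in a single call, and 1D arrays do not need to have the same length. A minimal sketch, with array names chosen only for illustration:

```python
import numpy as np

# Hypothetical example arrays (names are illustrative, not from the lesson)
first = np.array([1, 2])
second = np.array([3, 4, 5])
third = np.array([6])

# 1D arrays of different lengths are joined end-to-end in the order given
joined = np.concatenate((first, second, third))
print(joined)        # [1 2 3 4 5 6]
print(joined.shape)  # (6,)
```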


Concatenating 2D Arrays

When working with 2D arrays, the axis parameter controls how arrays are joined.

x = np.array([[1, 2],
              [3, 4]])

y = np.array([[5, 6],
              [7, 8]])

result = np.concatenate((x, y), axis=0)
print(result)

Output:

[[1 2]
 [3 4]
 [5 6]
 [7 8]]

Arrays are stacked vertically (row-wise).


Horizontal Joining Using Axis

Set axis=1 to join arrays column-wise.

horizontal = np.concatenate((x, y), axis=1)
print(horizontal)

Output:

[[1 2 5 6]
 [3 4 7 8]]
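One way to see the effect of the axis parameter is to compare the shapes of the two results. The short sketch below reuses the x and y arrays from this lesson; the names rows_joined and cols_joined are illustrative only.

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])
y = np.array([[5, 6],
              [7, 8]])

rows_joined = np.concatenate((x, y), axis=0)  # adds rows: 2 + 2 = 4
cols_joined = np.concatenate((x, y), axis=1)  # adds columns: 2 + 2 = 4

print(rows_joined.shape)  # (4, 2)
print(cols_joined.shape)  # (2, 4)
```

Only the size along the joined axis grows; the other dimension stays the same.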

Stacking Arrays

NumPy also provides vstack() and hstack(), shorthand functions that join arrays without an explicit axis argument.

Vertical Stack (vstack)

vstacked = np.vstack((x, y))
print(vstacked)

This stacks arrays row-wise, giving the same result as np.concatenate((x, y), axis=0).


Horizontal Stack (hstack)

hstacked = np.hstack((x, y))
print(hstacked)

This stacks arrays column-wise, giving the same result as np.concatenate((x, y), axis=1).
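The relationship between the stacking helpers and concatenate() can be checked directly. The sketch below reuses the lesson's arrays and also shows how the two helpers differ for 1D input, a case the lesson does not cover.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

# For 2D inputs, each helper matches concatenate() along the corresponding axis
same_v = np.array_equal(np.vstack((x, y)), np.concatenate((x, y), axis=0))
same_h = np.array_equal(np.hstack((x, y)), np.concatenate((x, y), axis=1))
print(same_v, same_h)  # True True

# For 1D inputs, the two helpers behave differently
a = np.array([10, 20, 30])
b = np.array([40, 50, 60])
v1 = np.vstack((a, b))  # each 1D array becomes a row
h1 = np.hstack((a, b))  # arrays are joined end-to-end
print(v1.shape)  # (2, 3)
print(h1.shape)  # (6,)
```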


Splitting Arrays

Splitting divides an array into multiple smaller arrays.

data = np.array([5, 10, 15, 20, 25, 30])

parts = np.split(data, 3)
print(parts)

Output:

[array([ 5, 10]), array([15, 20]), array([25, 30])]

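The array is split into three equal parts, returned as a list of arrays. When given an integer, np.split() raises a ValueError if the array cannot be divided equally.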
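When the length does not divide evenly, or when the parts should have specific sizes, two alternatives may help: np.array_split() allows unequal parts, and np.split() accepts a list of indices instead of a count. The sketch below is illustrative; the names train_part and test_part are hypothetical and simply echo the train/test use case mentioned earlier.

```python
import numpy as np

data = np.array([5, 10, 15, 20, 25, 30])

# An integer count must divide the length evenly; 6 elements into 4 parts fails
try:
    np.split(data, 4)
except ValueError as err:
    print("split failed:", err)

# np.array_split() allows unequal parts
uneven = np.array_split(data, 4)
print([part.tolist() for part in uneven])  # [[5, 10], [15, 20], [25], [30]]

# A list of indices splits at those positions: here, before index 4
train_part, test_part = np.split(data, [4])
print(train_part)  # [ 5 10 15 20]
print(test_part)   # [25 30]
```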


Splitting 2D Arrays

You can split arrays along rows or columns using the axis parameter.

matrix = np.array([[10, 20, 30],
                   [40, 50, 60],
                   [70, 80, 90]])

row_split = np.split(matrix, 3, axis=0)
print(row_split)

This splits the matrix row-wise into three parts, each a 2D array of shape (1, 3).
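The same call with axis=1 splits the matrix column-wise instead. A brief sketch using the matrix from this lesson (the name col_split is illustrative):

```python
import numpy as np

matrix = np.array([[10, 20, 30],
                   [40, 50, 60],
                   [70, 80, 90]])

# axis=1 cuts between columns, so each part keeps all three rows
col_split = np.split(matrix, 3, axis=1)

print(len(col_split))        # 3
print(col_split[0].shape)    # (3, 1)
print(col_split[0].ravel())  # [10 40 70]
```

Each part is still two-dimensional; ravel() is used here only to flatten the first column for display.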


When to Use Join vs Split

  • Use joining when combining related datasets
  • Use splitting when dividing data for analysis or modeling
  • Ensure the arrays have the same shape along every axis except the one being joined
  • Always confirm array shape after splitting
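The shape checks in the last two points can be sketched as follows. The arrays left and right are hypothetical examples, not data from the lesson.

```python
import numpy as np

left = np.ones((2, 3))
right = np.ones((2, 2))

# Joining along axis=1 works because the row counts match (2 and 2)
wide = np.concatenate((left, right), axis=1)
print(wide.shape)  # (2, 5)

# Joining along axis=0 fails because the column counts differ (3 vs 2)
try:
    np.concatenate((left, right), axis=0)
except ValueError as err:
    print("cannot join:", err)

# After splitting, inspect the shape of every part
parts = np.array_split(left, 2, axis=1)
part_shapes = [p.shape for p in parts]
print(part_shapes)  # [(2, 2), (2, 1)]
```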

Practice Exercise

Exercise

Create two 2D arrays of shape 2 × 3 and:

  • Join them vertically
  • Join them horizontally
  • Split each joined result into two equal parts

Expected Outcome

You should confidently combine and divide NumPy arrays using multiple methods.


What’s Next?

In the next lesson, you will learn the difference between copy and view and how memory sharing works in NumPy.