Reading and Writing Data with NumPy
Working with data often requires saving arrays to files and loading them later. NumPy provides simple and efficient functions to read and write numerical data in different file formats.
In this lesson, you will learn how to:
- Save NumPy arrays to files
- Load arrays from files
- Work with text and binary formats
Why Reading and Writing Data Is Important
Saving and loading data allows you to:
- Reuse processed data
- Share datasets with others
- Store intermediate results
- Work with large numerical files efficiently
NumPy supports both text-based and binary file formats.
Creating a Sample Array
Let us start with a simple numerical array.
import numpy as np
data = np.array([[10, 20, 30],
                 [40, 50, 60],
                 [70, 80, 90]])
print(data)
This array represents tabular numerical data.
Saving Arrays to Text Files
Use np.savetxt() to store arrays in text format.
np.savetxt("data.txt", data)
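This creates a human-readable text file with one array row per line and values separated by spaces. By default, each value is written in scientific notation (for example, 10 is stored as 1.000000000000000000e+01).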
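If the scientific notation is not wanted, the fmt argument of np.savetxt() controls how each value is written. The sketch below is only an illustration: the file name data_int.txt and the temporary directory are assumptions made so the example cleans up after itself.

```python
import os
import tempfile

import numpy as np

data = np.array([[10, 20, 30],
                 [40, 50, 60],
                 [70, 80, 90]])

# Write into a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data_int.txt")  # hypothetical file name

    # fmt="%d" writes each value as a plain integer instead of the
    # default scientific notation ("%.18e").
    np.savetxt(path, data, fmt="%d")

    # Read the raw text back to see exactly what was written.
    with open(path) as f:
        text = f.read()

print(text)
```

Each line of the file then holds one row, such as 10 20 30, which is easier to read and diff than the default format.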
Loading Data from Text Files
Use np.loadtxt() to read numerical data back into NumPy.
loaded_data = np.loadtxt("data.txt")
print(loaded_data)
Output:
[[10. 20. 30.]
 [40. 50. 60.]
 [70. 80. 90.]]
Note that np.loadtxt() returns floating-point values (float64) by default, even though the original array contained integers.
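When integers are needed after loading, one option is to write the file with an integer format and pass dtype to np.loadtxt(). This is a minimal sketch under that assumption; the file name is hypothetical, and the exact integer width (for example int64) depends on the platform.

```python
import os
import tempfile

import numpy as np

data = np.array([[10, 20, 30],
                 [40, 50, 60],
                 [70, 80, 90]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data_int.txt")  # hypothetical file name

    # Store the values as plain integers so they can be parsed as integers.
    np.savetxt(path, data, fmt="%d")

    as_float = np.loadtxt(path)           # default: float64
    as_int = np.loadtxt(path, dtype=int)  # request an integer dtype

print(as_float.dtype)  # float64
print(as_int.dtype)    # platform-dependent integer type
```

Converting after loading, with loaded_data.astype(int), is another common approach when the file was written in the default format.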
Saving Arrays in CSV Format
CSV (Comma-Separated Values) is commonly used for data exchange.
np.savetxt("data.csv", data, delimiter=",")
This file can be opened in spreadsheets or other data tools.
Loading CSV Files
Use the same np.loadtxt() function with a delimiter.
csv_data = np.loadtxt("data.csv", delimiter=",")
print(csv_data)
The loaded array has the same shape and values as the original, although the values are again returned as floats.
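CSV files shared with spreadsheets often carry a header row. The following sketch shows one way to write and skip such a row with the header, comments, and skiprows arguments. The column names a, b, c and the file name are made up for illustration.

```python
import os
import tempfile

import numpy as np

data = np.array([[10, 20, 30],
                 [40, 50, 60],
                 [70, 80, 90]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data_with_header.csv")  # hypothetical name

    # header adds a first line; comments="" stops NumPy from prefixing it
    # with "# ", so spreadsheets see ordinary column names.
    np.savetxt(path, data, delimiter=",", fmt="%d",
               header="a,b,c", comments="")

    # skiprows=1 ignores the header line when reading the numbers back.
    csv_data = np.loadtxt(path, delimiter=",", skiprows=1)

print(csv_data.shape)  # (3, 3)
```

Without skiprows=1, np.loadtxt() would try to parse the column names as numbers and raise an error.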
Binary Files with NumPy
NumPy's binary .npy format is faster to read and write than text, and it preserves the array's shape and data type exactly.
Use np.save() to store arrays in binary format.
np.save("data.npy", data)
Loading Binary Files
Load binary files using np.load().
binary_data = np.load("data.npy")
print(binary_data)
Binary files are preferred for large numerical datasets.
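The difference in type handling can be seen by round-tripping the same array through both formats. This is an illustrative sketch; the int16 array and the file names are assumptions chosen to make the effect visible.

```python
import os
import tempfile

import numpy as np

# A small array with an explicit, non-default integer type.
small = np.array([[1, 2], [3, 4]], dtype=np.int16)

with tempfile.TemporaryDirectory() as tmp_dir:
    npy_path = os.path.join(tmp_dir, "small.npy")  # hypothetical names
    txt_path = os.path.join(tmp_dir, "small.txt")

    np.save(npy_path, small)      # binary: dtype and shape stored in the file
    np.savetxt(txt_path, small)   # text: only the printed values are stored

    from_npy = np.load(npy_path)
    from_txt = np.loadtxt(txt_path)

print(from_npy.dtype)  # int16
print(from_txt.dtype)  # float64
```

The values are equal in both cases, but only the .npy round trip returns the original int16 type.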
Saving Multiple Arrays Together
Use np.savez() to store multiple arrays in one file.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.savez("arrays.npz", first=a, second=b)
Load them back with np.load() and access each array by the keyword name used when saving:
loaded = np.load("arrays.npz")
print(loaded["first"])
print(loaded["second"])
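The object returned by np.load() for an .npz file keeps the file open until it is closed, so a with block is a tidy way to use it. The sketch below also uses np.savez_compressed(), a variant of np.savez() that compresses the archive. The file location is an assumption for the example.

```python
import os
import tempfile

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "arrays.npz")

    # Same keyword-based interface as np.savez(), but the archive is compressed.
    np.savez_compressed(path, first=a, second=b)

    # The with block closes the archive once the arrays have been read.
    with np.load(path) as archive:
        names = archive.files      # names of the stored arrays
        first = archive["first"]
        second = archive["second"]

print(names)
print(first, second)
```

Arrays are read from the archive only when they are indexed, so they should be pulled out before the with block ends.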
Best Practices
- Use CSV when sharing data with other tools or people
- Use binary formats (.npy, .npz) for performance and exact data types
- Always validate loaded data
- Keep file naming clear and consistent
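As one possible way to validate loaded data, the sketch below checks shape, data type, and the absence of NaN or infinite values. The helper validate_array is hypothetical, not part of NumPy, and the checks a real project needs may differ.

```python
import numpy as np


def validate_array(arr, expected_shape, expected_dtype):
    """Hypothetical helper: raise ValueError if arr looks wrong, else return it."""
    if arr.shape != expected_shape:
        raise ValueError(f"expected shape {expected_shape}, got {arr.shape}")
    if arr.dtype != np.dtype(expected_dtype):
        raise ValueError(f"expected dtype {expected_dtype}, got {arr.dtype}")
    # NaN and infinity only exist for floating-point arrays.
    if np.issubdtype(arr.dtype, np.floating) and not np.isfinite(arr).all():
        raise ValueError("array contains NaN or infinite values")
    return arr


# Stand-in for an array that was just loaded from a text file.
loaded_data = np.array([[10.0, 20.0, 30.0],
                        [40.0, 50.0, 60.0],
                        [70.0, 80.0, 90.0]])

checked = validate_array(loaded_data, (3, 3), np.float64)
print(checked.shape, checked.dtype)
```

Raising an error early, right after loading, tends to be easier to debug than a wrong result several steps later.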
Practice Exercise
Task
- Create a NumPy array
- Save it as text, CSV, and binary
- Load each file back
- Verify that each loaded array matches the original, for example with np.array_equal()
What’s Next?
In the next lesson, you will learn how to handle missing values efficiently in NumPy, a critical step in real-world data processing.