Reading and Writing Data from TXT and CSV Files in NumPy
Introduction
Whether you're analyzing a dataset or saving the results of a computation, the ability to read from and write to files is foundational in data science workflows. In this tutorial, we’ll explore how to work with TXT and CSV files using NumPy’s efficient file I/O methods.
Why Use NumPy for File I/O?
While Python’s built-in file handling works fine, NumPy functions such as np.loadtxt and np.savetxt are designed for numerical data. They read and write tabular data as arrays in a single call, with no hand-written parsing loop.
1. Reading Data from a TXT File using np.loadtxt
Step-by-Step
- Create or download a simple data.txt file with numeric values separated by spaces or commas.
- Use np.loadtxt to load the data.
Example Code
import numpy as np
# Assuming data.txt contains:
# 1.0 2.0 3.0
# 4.0 5.0 6.0
data = np.loadtxt('data.txt')
print(data)
Output
[[1. 2. 3.]
 [4. 5. 6.]]
Explanation
Each row from the text file is interpreted as a row in a 2D NumPy array. By default, np.loadtxt splits each line on any whitespace (spaces or tabs) and converts every value to a float.
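To illustrate the default behaviour, here is a minimal sketch that uses io.StringIO as a stand-in for a file on disk. The mixed spacing in the sample text is an assumption made for the example, not part of the original data.txt.

```python
import io
import numpy as np

# io.StringIO stands in for a file on disk. The columns are separated
# by a mix of single spaces, repeated spaces, and a tab.
text = io.StringIO("1.0 2.0   3.0\n4.0\t5.0 6.0\n")

data = np.loadtxt(text)
print(data.shape)  # (2, 3)
print(data.dtype)  # float64
```

Passing a filename instead of the StringIO object should behave the same way.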
Common Errors & Checks
- Ensure all rows in your file have the same number of columns.
- If the values are separated by commas, pass delimiter=','.
- Watch out for non-numeric values: np.loadtxt raises ValueError when a field cannot be converted to a number.
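The checks above can be sketched as follows. The in-memory strings and the "n/a" field are made up for illustration; with a real file you would pass its path instead.

```python
import io
import numpy as np

# Comma-separated input needs an explicit delimiter.
good = io.StringIO("1.0,2.0,3.0\n4.0,5.0,6.0\n")
data = np.loadtxt(good, delimiter=",")
print(data)

# A non-numeric field ("n/a" is a made-up example) cannot be
# converted to float, so loadtxt raises ValueError.
bad = io.StringIO("1.0,2.0,3.0\n4.0,n/a,6.0\n")
try:
    np.loadtxt(bad, delimiter=",")
    failed = False
except ValueError as exc:
    failed = True
    print("Could not parse file:", exc)
```

Catching ValueError like this is one way to report a bad file cleanly instead of letting the script crash.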
2. Reading CSV Files with np.genfromtxt
Why genfromtxt?
CSV files sometimes contain headers, missing values, or mixed data types. np.genfromtxt is more flexible than loadtxt for such cases.
Example Code
# sample.csv:
# Name,Age,Salary
# Alice,30,70000
# Bob,25,50000
data = np.genfromtxt('sample.csv', delimiter=',', dtype=None, encoding='utf-8', names=True)
print(data)
Output
[('Alice', 30, 70000) ('Bob', 25, 50000)]
Explanation
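Here, names=True tells NumPy to use the first row as column headers, and dtype=None makes it infer a data type for each column separately. The result is a structured array whose columns are accessed by field name, for example data['Age'].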
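As a rough sketch of how the result can be used, the snippet below reads the same sample data from an in-memory string and accesses columns by name. The second part, a purely numeric input with an empty field, is an assumption added to show how genfromtxt treats missing values when the default float dtype is used.

```python
import io
import numpy as np

# Same contents as sample.csv, held in memory for the example.
csv_text = io.StringIO("Name,Age,Salary\nAlice,30,70000\nBob,25,50000\n")
data = np.genfromtxt(csv_text, delimiter=",", dtype=None,
                     encoding="utf-8", names=True)

print(data.dtype.names)       # ('Name', 'Age', 'Salary')
print(data["Age"])            # the Age column as an integer array
print(data["Salary"].mean())  # 60000.0

# Numeric data with an empty field (made up for illustration).
# With the default float dtype, the missing value becomes nan.
raw = "1,2,3\n4,,6\n"
with_nan = np.genfromtxt(io.StringIO(raw), delimiter=",")

# filling_values replaces missing entries with a chosen value.
filled = np.genfromtxt(io.StringIO(raw), delimiter=",", filling_values=0)
print(with_nan)
print(filled)
```

Whether nan or a fill value is the better choice depends on how the missing entries should be treated later in the analysis.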
3. Writing Data to Files with np.savetxt
Example Code
arr = np.array([[10, 20, 30], [40, 50, 60]])
# Save to text file
np.savetxt('output.txt', arr, fmt='%d')
print("Data written to output.txt")
Output in File (output.txt)
10 20 30
40 50 60
Parameters to Know
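- fmt='%d': Specify the output format (here, integer).
- delimiter=',': Use this to write CSV files.
- header='Title': Add a header row. By default it is written with a leading '# '; pass comments='' to omit the prefix.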
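The parameters above can be combined as in the following sketch. It writes to an in-memory buffer rather than a file so the result can be inspected directly; the array values and the header text "x,y" are made up for the example.

```python
import io
import numpy as np

arr = np.array([[1.5, 2.25], [3.0, 4.5]])

# An in-memory buffer is used here in place of a filename.
buf = io.StringIO()
np.savetxt(buf, arr, fmt="%.2f", delimiter=",", header="x,y")
text = buf.getvalue()
print(text)
# # x,y
# 1.50,2.25
# 3.00,4.50
```

Note that the header line starts with '# ' because comments was left at its default.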
4. Writing CSV Format
np.savetxt('output.csv', arr, delimiter=',', fmt='%d', header='A,B,C', comments='')
Output in File (output.csv)
A,B,C
10,20,30
40,50,60
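One way to confirm that the CSV was written as intended is to read it back. The sketch below does this in a temporary directory so it does not touch real files; the use of tempfile and the variable names are assumptions made for the example.

```python
import os
import tempfile
import numpy as np

arr = np.array([[10, 20, 30], [40, 50, 60]])

# A temporary directory keeps the example from touching real files.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.csv")
    np.savetxt(path, arr, delimiter=",", fmt="%d",
               header="A,B,C", comments="")

    # skiprows=1 skips the header line written above.
    loaded = np.loadtxt(path, delimiter=",", skiprows=1, dtype=int)

print(np.array_equal(arr, loaded))  # True
```

Because comments='' removes the '# ' prefix, the header is no longer treated as a comment when reading, which is why skiprows=1 is needed.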
Best Practices & Safety Checks
- Backup original files before writing.
- Check if file paths are correct using os.path.exists.
- Use the with open() context manager if combining with vanilla Python for more control.
- Be cautious with encoding, especially for CSVs with text data.
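A minimal sketch of these practices, assuming a recent NumPy version in which np.savetxt accepts a file handle opened in text mode. The path, the comment line, and the choice to refuse overwriting are illustrative assumptions, not requirements.

```python
import os
import tempfile
import numpy as np

arr = np.array([[10, 20, 30], [40, 50, 60]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.csv")  # hypothetical path

    # Refuse to overwrite a file that already exists.
    if os.path.exists(path):
        raise FileExistsError(path)

    # Opening the file ourselves lets us choose the encoding and
    # write extra lines around the array.
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("# generated by example script\n")
        np.savetxt(fh, arr, delimiter=",", fmt="%d")

    with open(path, encoding="utf-8") as fh:
        contents = fh.read()

print(contents)
```

The context manager closes the file even if np.savetxt raises an error partway through.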
Summary
With just a few lines of NumPy code, you can streamline your workflow of reading and writing data in structured formats. Whether it's a space-separated TXT file or a header-rich CSV, NumPy makes the process efficient and beginner-friendly.
What’s Next?
Now that you know how to handle real-world file input/output, you're ready to move toward data preprocessing, statistical analysis, or even visualization using libraries like Matplotlib or Pandas. NumPy is your first solid step in the data-driven journey.