Files and System Interaction
By completing this module you will be able to:
- Read and write text files
- Work with CSV and JSON files
- Handle paths and directories
- Interact with the operating system
Why Work with Files?
Until now, all the data in your programs disappeared when you closed the program. Variables live in RAM, and when the program ends… poof! Everything vanishes. It’s like your program has amnesia.
Files are the solution to this problem. They’re the way to store information permanently: on the hard drive, on a USB drive, in the cloud… Think about all the things that need to persist:
- The configuration of an application (user preferences)
- The data you process (customer list, products, transactions)
- The results of your program (reports, logs, exports)
- The input data your program needs to read
In this chapter you’ll learn to read and write files in different formats, navigate the file system, and organize your data professionally.
Reading and Writing Files
Working with files in Python is like opening a book: you can read it from beginning to end, jump to a specific page, write notes in the margins, or even create a new book from scratch.
Opening Files
The open() function opens the “door” to the file. The mode tells Python what you want to do with it:
# Opening modes (think of them as "intentions")
# 'r' - Read: "I just want to read" (default)
# 'w' - Write: "I want to write from scratch" (careful: erases everything!)
# 'a' - Append: "I want to add at the end" (without deleting existing content)
# 'x' - Exclusive: "Create new" (error if already exists)

file = open('data.txt', 'r')
content = file.read()
file.close()

It’s important to close files after using them to release resources. The most reliable way to do that is a with statement (a context manager).
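To make the difference between the modes concrete, here is a small sketch. It works on a scratch file in a temporary directory so nothing on your disk is overwritten; the file name notes.txt and the variable names are only illustrative.

```python
import os
import shutil
import tempfile

# Work in a throwaway directory so no real file is overwritten.
# The file name 'notes.txt' is only an illustration.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'notes.txt')

# 'w' creates the file (or empties it if it already exists)
file = open(path, 'w')
file.write("first\n")
file.close()

# 'a' keeps what is there and adds to the end
file = open(path, 'a')
file.write("second\n")
file.close()

# 'r' reads the result
file = open(path, 'r')
after_append = file.read()
file.close()

# 'w' again: the previous content is gone
file = open(path, 'w')
file.write("third\n")
file.close()

file = open(path, 'r')
after_rewrite = file.read()
file.close()

# 'x' refuses to touch a file that already exists
try:
    open(path, 'x')
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

print(repr(after_append))   # 'first\nsecond\n'
print(repr(after_rewrite))  # 'third\n'
print(exclusive_failed)     # True

shutil.rmtree(workdir)  # remove the throwaway directory
```

Notice how every open() needs its own close(). That repetition is easy to get wrong.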
Using Context Managers (with)
Here comes a very useful trick. Imagine you have a butler who opens the door, lets you do what you need, and when you’re done, closes the door automatically. That’s exactly what with does:
# Recommended way: file is closed automatically
with open('data.txt', 'r') as file:
    content = file.read()
    print(content)
# File is already closed here

Reading Methods
How do you prefer to read a book? All at once, page by page, or chapter by chapter? Python gives you similar options. The choice depends on file size: a small file can be read all at once, while a huge file is better read line by line so that it never has to fit in memory in one piece.
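As a rough sketch of that trade-off, the snippet below counts the lines of the same file in both styles. The file name sample.txt and its 1,000 lines are made up for the demonstration, and the file lives in a temporary directory.

```python
import os
import shutil
import tempfile

# Build a sample file with 1,000 short lines (name and size are illustrative).
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'sample.txt')
with open(path, 'w') as f:
    for number in range(1000):
        f.write(f"line {number}\n")

# Small file: load everything, then work on the string in memory.
with open(path, 'r') as f:
    whole_text = f.read()
lines_in_memory = len(whole_text.splitlines())

# Large file: iterate, so only one line is held in memory at a time.
line_count = 0
with open(path, 'r') as f:
    for line in f:
        line_count += 1

print(lines_in_memory, line_count)  # 1000 1000

shutil.rmtree(workdir)  # remove the throwaway directory
```

Both approaches give the same answer. The difference is how much memory they need along the way. The individual reading methods look like this: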
# Read all content
with open('data.txt', 'r') as f:
    content = f.read()
    print(content)

# Read N characters
with open('data.txt', 'r') as f:
    first_100 = f.read(100)

# Read line by line
with open('data.txt', 'r') as f:
    line1 = f.readline()  # First line
    line2 = f.readline()  # Second line
    print(line1)
    print(line2)

# Read all lines as list
with open('data.txt', 'r') as f:
    lines = f.readlines()
    for line in lines:
        print(line.strip())  # strip() removes \n

# Most efficient way for large files
with open('data.txt', 'r') as f:
    for line in f:
        print(line.strip())

Writing Methods
Writing is just as simple. Just remember the crucial difference: mode 'w' is like starting a new notebook (erases everything before), while 'a' is like continuing to write where you left off.
# Write text (overwrites file - careful!)
with open('output.txt', 'w') as f:
    f.write("First line\n")
    f.write("Second line\n")

# Write multiple lines
lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
with open('output.txt', 'w') as f:
    f.writelines(lines)

# Append to end of file
with open('output.txt', 'a') as f:
    f.write("New line appended\n")

Encoding Handling
If you work with text that has special characters (accents, etc.), you need to specify utf-8 encoding. Without it, Python falls back to a default that depends on your system, and you might see strange characters like “EspaÃ±a” instead of “España”.
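The garbling is easy to reproduce. This sketch writes a word as UTF-8 and then deliberately reads it back with a different encoding (Latin-1 is chosen here just as an example of a wrong guess). The file name greeting.txt is illustrative and lives in a temporary directory.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'greeting.txt')

# Save the text as UTF-8...
with open(path, 'w', encoding='utf-8') as f:
    f.write("España")

# ...then read it back with the wrong encoding (Latin-1 here).
with open(path, 'r', encoding='latin-1') as f:
    garbled = f.read()

# Reading with the encoding that was used for writing restores the text.
with open(path, 'r', encoding='utf-8') as f:
    correct = f.read()

print(garbled)  # EspaÃ±a
print(correct)  # España

shutil.rmtree(workdir)  # remove the throwaway directory
```

The bytes on disk never changed. Only the interpretation did, which is why stating the encoding on both the writing and the reading side avoids the problem.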
# Specify encoding (ALWAYS for text with special characters)
with open('data.txt', 'r', encoding='utf-8') as f:
    content = f.read()

# Write with encoding
with open('output.txt', 'w', encoding='utf-8') as f:
    f.write("Text with accents: España, Señor")

Working with CSV
Have you ever exported an Excel sheet? You probably got a .csv file. CSV stands for “Comma Separated Values”, and it’s basically a spreadsheet saved as plain text.
It’s the universal format for exchanging tabular data: Excel, Google Sheets, databases, and practically any other program can read it. If you need to pass data from one system to another, CSV is your friend.
Reading CSV Files
import csv

# Read as list of lists
with open('data.csv', 'r', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)  # ['column1', 'column2', ...]

# Read as dictionaries (uses first row as header)
with open('data.csv', 'r', newline='', encoding='utf-8') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row['name'], row['age'])

Writing CSV Files
import csv

# Write from list of lists
data = [
    ['name', 'age', 'city'],
    ['Ana', 25, 'Madrid'],
    ['Luis', 30, 'Barcelona']
]

with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerows(data)

# Write from dictionaries
people = [
    {'name': 'Ana', 'age': 25},
    {'name': 'Luis', 'age': 30}
]

with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    fields = ['name', 'age']
    writer = csv.DictWriter(f, fieldnames=fields)
    writer.writeheader()  # Write header
    writer.writerows(people)

Always pass newline='' when opening CSV files. Without it, files written on Windows end up with an extra blank line after every row.
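One detail worth seeing in action: CSV has no data types, so numbers you write come back as strings. The sketch below round-trips the people from the example through an in-memory buffer (io.StringIO stands in for a real file so the snippet runs without touching the disk) and then converts the age column back to integers.

```python
import csv
import io

# io.StringIO stands in for a file opened with newline=''.
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=['name', 'age'])
writer.writeheader()
writer.writerows([{'name': 'Ana', 'age': 25}, {'name': 'Luis', 'age': 30}])

# Rewind and read the same data back.
buffer.seek(0)
rows = list(csv.DictReader(buffer))
age_as_read = rows[0]['age']
print(type(age_as_read))  # <class 'str'>: the number came back as text

# Convert the columns you need back to numbers.
for row in rows:
    row['age'] = int(row['age'])

average_age = sum(row['age'] for row in rows) / len(rows)
print(average_age)  # 27.5
```

If a column may contain empty cells or non-numeric text, int() raises ValueError, so real data usually needs a check before converting.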
Working with JSON
JSON is the universal language of the internet. Every time an app on your phone connects to a server, they’re probably speaking JSON. It’s also the preferred format for configuration files.
The best part? JSON syntax is almost identical to Python dictionaries. It’s as if someone designed a file format thinking about Python (although it actually comes from JavaScript).
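"Almost identical" hides a few differences that are worth knowing. This short sketch converts a Python dictionary to JSON text and back; the record itself is made up for the illustration.

```python
import json

# A made-up record that touches each difference
record = {'name': 'Ana', 'active': True, 'nickname': None,
          'scores': (9, 7), 1: 'one'}

text = json.dumps(record)
print(text)
# {"name": "Ana", "active": true, "nickname": null, "scores": [9, 7], "1": "one"}

restored = json.loads(text)
print(restored['scores'])  # [9, 7]  (the tuple came back as a list)
print('1' in restored)     # True    (the integer key became a string)
print(1 in restored)       # False
```

In short: True/False/None become true/false/null, strings always use double quotes, tuples turn into lists, and dictionary keys are always strings.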
Reading JSON
import json

# From file
with open('data.json', 'r', encoding='utf-8') as f:
    data = json.load(f)
    print(data)

# From string
json_text = '{"name": "Ana", "age": 25}'
data = json.loads(json_text)
print(data['name'])  # Ana

Writing JSON
import json

data = {
    'name': 'Ana',
    'age': 25,
    'hobbies': ['reading', 'music'],
    'active': True
}

# To file
with open('output.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, indent=2, ensure_ascii=False)

# To string
text = json.dumps(data, indent=2, ensure_ascii=False)
print(text)

Useful parameters:
- indent: Indentation for readable format
- ensure_ascii=False: Allows non-ASCII characters (accents)
- sort_keys=True: Sorts keys alphabetically
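To see what each parameter changes, the sketch below serializes the same dictionary four ways. The sample data is invented for the demonstration.

```python
import json

# Invented sample data with accented characters
data = {'name': 'Señor López', 'city': 'Madrid', 'age': 40}

# Default: non-ASCII characters are written as \uXXXX escapes
escaped = json.dumps(data)
print(escaped)
# {"name": "Se\u00f1or L\u00f3pez", "city": "Madrid", "age": 40}

# ensure_ascii=False keeps the characters readable
readable = json.dumps(data, ensure_ascii=False)
print(readable)
# {"name": "Señor López", "city": "Madrid", "age": 40}

# sort_keys=True orders the keys alphabetically
sorted_text = json.dumps(data, ensure_ascii=False, sort_keys=True)
print(sorted_text)
# {"age": 40, "city": "Madrid", "name": "Señor López"}

# indent=2 spreads the output over several lines
pretty = json.dumps(data, ensure_ascii=False, indent=2)
print(pretty)
```

Both the escaped and the readable versions load back to the same dictionary, so ensure_ascii only affects how the file looks to a human, not the data.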
Path and Directory Handling
Navigating the file system from code is like using Windows Explorer or Finder, but by writing commands. You need to know where you are, how to move between folders, and how to build paths to files.
Python has two ways to do this: the traditional (os.path) and the modern (pathlib). I’ll teach you both, but pathlib is clearly superior and you should use it in new code.
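As a quick preview of the difference, here are the same questions asked in both styles. The path reports/2024/summary.txt is purely illustrative and does not need to exist, because these operations only manipulate the path itself.

```python
import os
from pathlib import Path

# Illustrative path; it does not need to exist for these operations.
text_path = os.path.join('reports', '2024', 'summary.txt')
path_obj = Path('reports') / '2024' / 'summary.txt'

# File name
name_os = os.path.basename(text_path)
name_pl = path_obj.name

# Extension
ext_os = os.path.splitext(text_path)[1]
ext_pl = path_obj.suffix

# Containing folder
parent_os = os.path.dirname(text_path)
parent_pl = str(path_obj.parent)

# Changing the extension: string surgery vs. a single method call
csv_os = os.path.splitext(text_path)[0] + '.csv'
csv_pl = str(path_obj.with_suffix('.csv'))

print(name_os == name_pl, ext_os == ext_pl,
      parent_os == parent_pl, csv_os == csv_pl)  # True True True True
```

The results are identical. What changes is readability: with os.path you pass strings through functions, while a Path object carries its own attributes and methods.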
os.path Module (the veteran)
import os

# Get current directory
print(os.getcwd())

# Build paths portably
path = os.path.join('folder', 'subfolder', 'file.txt')
print(path)  # folder/subfolder/file.txt (or with \ on Windows)

# Path information
path = '/home/user/document.txt'
print(os.path.dirname(path))   # /home/user
print(os.path.basename(path))  # document.txt
print(os.path.splitext(path))  # ('/home/user/document', '.txt')

# Check existence
print(os.path.exists('file.txt'))
print(os.path.isfile('file.txt'))
print(os.path.isdir('folder'))

pathlib Module (modern and elegant)
pathlib treats paths as smart objects that “know” things about themselves. Instead of calling functions passing strings, you create a Path object and ask it directly. It’s much more intuitive:
from pathlib import Path

# Create Path object
path = Path('folder') / 'subfolder' / 'file.txt'
print(path)

# Current directory
current = Path.cwd()
home = Path.home()

# Properties
file = Path('/home/user/document.txt')
print(file.name)    # document.txt
print(file.stem)    # document
print(file.suffix)  # .txt
print(file.parent)  # /home/user

# Checks
print(file.exists())
print(file.is_file())
print(file.is_dir())

# Read/write directly
content = Path('data.txt').read_text(encoding='utf-8')
Path('output.txt').write_text('Content', encoding='utf-8')

Directory Operations
Creating folders, listing contents, searching for files… the typical tasks you would do manually in the file explorer.
import os
from pathlib import Path

# Create directory
os.makedirs('new/folder', exist_ok=True)
# or with pathlib
Path('new/folder').mkdir(parents=True, exist_ok=True)

# List contents
for item in os.listdir('.'):
    print(item)

# With pathlib (more powerful)
for item in Path('.').iterdir():
    if item.is_file():
        print(f"File: {item}")

# Find files with pattern
for file in Path('.').glob('*.txt'):
    print(file)

# Find recursively
for file in Path('.').rglob('*.py'):
    print(file)

Copy, Move, Delete
With great power comes great responsibility. These operations can destroy data if you’re not careful. Unlike when you delete something manually (which goes to the recycle bin), here deletion is permanent.
import shutil
import os
from pathlib import Path

# Copy file
shutil.copy('source.txt', 'destination.txt')

# Copy entire directory
shutil.copytree('source_folder', 'destination_folder')

# Move/rename
shutil.move('old.txt', 'new.txt')
# or with pathlib
Path('old.txt').rename('new.txt')

# Delete file
os.remove('file.txt')
# or with pathlib
Path('file.txt').unlink()

# Delete empty directory
os.rmdir('empty_folder')

# Delete directory with contents
shutil.rmtree('folder_with_files')

shutil.rmtree() deletes all content irreversibly. Always verify the path before using it.
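One possible way to "verify the path" in code is to refuse any deletion outside a folder you control. The helper below is a hypothetical sketch, not a standard library function: remove_tree_inside and its parameters are names invented for this example, and the demo runs in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def remove_tree_inside(target, allowed_root):
    """Delete `target` only if it is a directory strictly inside `allowed_root`.

    Hypothetical helper for illustration; returns True if something was deleted.
    """
    target = Path(target).resolve()
    allowed_root = Path(allowed_root).resolve()

    # `parents` lists every ancestor folder of the resolved path.
    if allowed_root not in target.parents:
        return False
    if not target.is_dir():
        return False

    shutil.rmtree(target)
    return True


# Demo in a throwaway directory
root = Path(tempfile.mkdtemp())
cache = root / 'cache'
cache.mkdir()
(cache / 'old.tmp').write_text('temporary data', encoding='utf-8')

deleted_inside = remove_tree_inside(cache, root)          # allowed
deleted_outside = remove_tree_inside(root.parent, root)   # refused: not inside root
deleted_root = remove_tree_inside(root, root)             # refused: root itself

print(deleted_inside, deleted_outside, deleted_root)  # True False False
print(cache.exists())                                 # False

shutil.rmtree(root)  # clean up the demo directory
```

Resolving both paths first means that relative paths and ".." segments are compared by where they really point, not by how they were typed.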
Exception Handling with Files
Files are a common source of errors: the file doesn’t exist, you don’t have permissions, the disk is full… Your code should be prepared for these situations.
from pathlib import Path

try:
    with open('nonexistent_file.txt', 'r', encoding='utf-8') as f:
        content = f.read()
except FileNotFoundError:
    print("File does not exist")
except PermissionError:
    print("You don't have permission to read the file")
except OSError as e:
    # Any other I/O problem (IOError is an alias of OSError in Python 3)
    print(f"I/O error: {e}")

# Check before opening
file = Path('data.txt')
if file.exists():
    content = file.read_text(encoding='utf-8')
else:
    print("File not found")

Which Format to Use?
With so many options, it’s normal to wonder which one to choose. Here’s a practical guide:
| Type of data | Recommended format | Why? |
|---|---|---|
| Simple table (rows and columns) | CSV | Easy to open in Excel, lightweight |
| Structured/nested data | JSON | Supports lists within lists, nested dictionaries |
| Unformatted text | .txt | Simple, universal |
| Application configuration | JSON | Readable, supports data types |
| Logs/records | .txt or .log | Easy to append lines ('a' mode) |
| Exchange with web APIs | JSON | It’s the de facto standard |
If your data looks like a spreadsheet (rows with the same columns), use CSV. If your data has variable structure (objects with different fields, nested lists), use JSON.
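A small experiment illustrates the rule. The sketch below sends one nested record (invented for the example) through both formats, using in-memory text instead of real files, and compares what comes back.

```python
import csv
import io
import json

# Invented record with a nested list
person = {'name': 'Ana', 'age': 25, 'hobbies': ['reading', 'music']}

# JSON keeps the nested list as a list.
from_json = json.loads(json.dumps(person))

# CSV stores every cell as text, so the list is flattened into a string.
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=['name', 'age', 'hobbies'])
writer.writeheader()
writer.writerow(person)
buffer.seek(0)
from_csv = next(csv.DictReader(buffer))

print(type(from_json['hobbies']))  # <class 'list'>
print(type(from_csv['hobbies']))   # <class 'str'>
print(from_csv['hobbies'])         # ['reading', 'music']  (one string, not a list)
```

The JSON round trip returns the original structure, while the CSV version would need extra parsing to recover the list.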
Practical Exercises
Create a program that reads a text file and shows:
- Total number of lines
- Total number of words
- The 5 most frequent words
Write a function that converts a CSV file to JSON.
Create a script that organizes files in a folder into subfolders by extension.