Files and System Interaction

Module Objectives

By completing this module you will be able to:

  • Read and write text files
  • Work with CSV and JSON files
  • Handle paths and directories
  • Interact with the operating system

Why Work with Files?

Until now, all the data in your programs disappeared when you closed the program. Variables live in RAM, and when the program ends… poof! Everything vanishes. It’s like your program has amnesia.

Files are the solution to this problem. They’re the way to store information permanently: on the hard drive, on a USB drive, in the cloud… Think about all the things that need to persist:

  • The configuration of an application (user preferences)
  • The data you process (customer list, products, transactions)
  • The results of your program (reports, logs, exports)
  • The input data your program needs to read

In this chapter you’ll learn to read and write files in different formats, navigate the file system, and organize your data professionally.


Reading and Writing Files

Working with files in Python is like opening a book: you can read it from beginning to end, jump to a specific page, write notes in the margins, or even create a new book from scratch.

Opening Files

The open() function opens the “door” to the file. The mode tells Python what you want to do with it:

# Opening modes (think of them as "intentions")
# 'r' - Read: "I just want to read" (default)
# 'w' - Write: "I want to write from scratch" (careful: erases everything!)
# 'a' - Append: "I want to add at the end" (without deleting existing content)
# 'x' - Exclusive: "Create new" (error if already exists)

file = open('data.txt', 'r')
content = file.read()
file.close()
Always close files

It’s important to close files after using them to release resources. The best way is to use the with statement, shown next.

Using Context Managers (with)

Here comes a very useful trick. Imagine you have a butler who opens the door, lets you do what you need, and when you’re done, closes the door automatically. That’s exactly what with does:

# Recommended way: the file is closed automatically
with open('data.txt', 'r') as file:
    content = file.read()
    print(content)
# File is already closed here

Reading Methods

How do you prefer to read a book? All at once, page by page, or chapter by chapter? Python gives you similar options. The choice depends on file size: for small files, read it all at once; for huge files, better line by line.

# Read all content
with open('data.txt', 'r') as f:
    content = f.read()
    print(content)

# Read N characters
with open('data.txt', 'r') as f:
    first_100 = f.read(100)

# Read line by line
with open('data.txt', 'r') as f:
    line1 = f.readline()  # First line
    line2 = f.readline()  # Second line
    print(line1)
    print(line2)

# Read all lines as a list
with open('data.txt', 'r') as f:
    lines = f.readlines()
    for line in lines:
        print(line.strip())  # strip() removes \n

# Most efficient way for large files
with open('data.txt', 'r') as f:
    for line in f:
        print(line.strip())

Writing Methods

Writing is just as simple. Just remember the crucial difference: mode 'w' is like starting a new notebook (erases everything before), while 'a' is like continuing to write where you left off.

# Write text (overwrites the file - careful!)
with open('output.txt', 'w') as f:
    f.write("First line\n")
    f.write("Second line\n")

# Write multiple lines
lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
with open('output.txt', 'w') as f:
    f.writelines(lines)

# Append to the end of the file
with open('output.txt', 'a') as f:
    f.write("New line appended\n")

Encoding Handling

If you work with text that has special characters (accents, ñ, etc.), you need to specify utf-8 encoding. Without it, you might see garbled characters like “EspaÃ±a” instead of “España”.

# Specify the encoding (ALWAYS for text with special characters)
with open('data.txt', 'r', encoding='utf-8') as f:
    content = f.read()

# Write with encoding
with open('output.txt', 'w', encoding='utf-8') as f:
    f.write("Text with accents: España, Señor")

Working with CSV

Have you ever exported an Excel sheet? You probably got a .csv file. CSV stands for “Comma-Separated Values”, and it’s basically a spreadsheet saved as plain text.

It’s the universal format for exchanging tabular data: Excel, Google Sheets, databases, and practically any other program can read it. If you need to pass data from one system to another, CSV is your friend.
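
To make the format concrete, here is a minimal sketch of what the raw text of a (hypothetical) .csv file looks like, parsed with the csv module. Note that every value comes back as a string, even the numbers:

```python
import csv
import io

# Hypothetical raw contents of a .csv file: one record per line,
# values separated by commas, first line as the header
raw = "name,age,city\nAna,25,Madrid\nLuis,30,Barcelona\n"

for row in csv.reader(io.StringIO(raw)):
    print(row)
# ['name', 'age', 'city']
# ['Ana', '25', 'Madrid']
# ['Luis', '30', 'Barcelona']
```

If you need the ages as numbers, you have to convert them yourself with int(); CSV itself has no notion of data types.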

Reading CSV Files

import csv

# Read as a list of lists
with open('data.csv', 'r', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)  # ['column1', 'column2', ...]

# Read as dictionaries (uses the first row as the header)
with open('data.csv', 'r', encoding='utf-8') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row['name'], row['age'])

Writing CSV Files

import csv

# Write from a list of lists
data = [
    ['name', 'age', 'city'],
    ['Ana', 25, 'Madrid'],
    ['Luis', 30, 'Barcelona']
]

with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerows(data)

# Write from dictionaries
people = [
    {'name': 'Ana', 'age': 25},
    {'name': 'Luis', 'age': 30}
]

with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    fields = ['name', 'age']
    writer = csv.DictWriter(f, fieldnames=fields)
    writer.writeheader()  # Write the header row
    writer.writerows(people)
newline=''

On Windows, it’s important to use newline='' when opening CSV files to avoid extra blank lines.


Working with JSON

JSON is the universal language of the internet. Every time an app on your phone connects to a server, they’re probably speaking JSON. It’s also the preferred format for configuration files.

The best part? JSON syntax is almost identical to Python dictionaries. It’s as if someone designed a file format thinking about Python (although it actually comes from JavaScript).
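
Almost identical, but not quite: a few Python values are spelled differently in JSON. A quick sketch of the mapping:

```python
import json

# True/False become true/false, and None becomes null
print(json.dumps({'active': True, 'score': None}))
# {"active": true, "score": null}
```

The json module handles this translation for you in both directions, so you rarely need to think about it.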

Reading JSON

import json

# From a file
with open('data.json', 'r', encoding='utf-8') as f:
    data = json.load(f)
    print(data)

# From a string
json_text = '{"name": "Ana", "age": 25}'
data = json.loads(json_text)
print(data['name'])  # Ana

Writing JSON

import json

data = {
    'name': 'Ana',
    'age': 25,
    'hobbies': ['reading', 'music'],
    'active': True
}

# To a file
with open('output.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, indent=2, ensure_ascii=False)

# To a string
text = json.dumps(data, indent=2, ensure_ascii=False)
print(text)

Useful parameters:

  • indent: Indentation for readable format
  • ensure_ascii=False: Allows non-ASCII characters (accents)
  • sort_keys=True: Sorts keys alphabetically
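
A small example of what these parameters change (the dictionary here is just an illustration):

```python
import json

data = {'city': 'Málaga', 'b': 2, 'a': 1}

# Default: non-ASCII characters are escaped, keys keep insertion order
print(json.dumps(data))
# {"city": "M\u00e1laga", "b": 2, "a": 1}

# Readable accents and alphabetically sorted keys
print(json.dumps(data, ensure_ascii=False, sort_keys=True))
# {"a": 1, "b": 2, "city": "Málaga"}
```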

Path and Directory Handling

Navigating the file system from code is like using Windows Explorer or Finder, but by writing commands. You need to know where you are, how to move between folders, and how to build paths to files.

Python has two ways to do this: the traditional (os.path) and the modern (pathlib). I’ll teach you both, but pathlib is clearly superior and you should use it in new code.

os.path Module (the veteran)

import os

# Get the current directory
print(os.getcwd())

# Build paths portably
path = os.path.join('folder', 'subfolder', 'file.txt')
print(path)  # folder/subfolder/file.txt (or with \ on Windows)

# Path information
path = '/home/user/document.txt'
print(os.path.dirname(path))   # /home/user
print(os.path.basename(path))  # document.txt
print(os.path.splitext(path))  # ('/home/user/document', '.txt')

# Check existence
print(os.path.exists('file.txt'))
print(os.path.isfile('file.txt'))
print(os.path.isdir('folder'))

pathlib Module (modern and elegant)

pathlib treats paths as smart objects that “know” things about themselves. Instead of calling functions passing strings, you create a Path object and ask it directly. It’s much more intuitive:

from pathlib import Path

# Create a Path object
path = Path('folder') / 'subfolder' / 'file.txt'
print(path)

# Current and home directories
current = Path.cwd()
home = Path.home()

# Properties
file = Path('/home/user/document.txt')
print(file.name)      # document.txt
print(file.stem)      # document
print(file.suffix)    # .txt
print(file.parent)    # /home/user

# Checks
print(file.exists())
print(file.is_file())
print(file.is_dir())

# Read/write directly
content = Path('data.txt').read_text(encoding='utf-8')
Path('output.txt').write_text('Content', encoding='utf-8')

Directory Operations

Creating folders, listing contents, searching for files… the typical tasks you would do manually in the file explorer.

import os
from pathlib import Path

# Create a directory
os.makedirs('new/folder', exist_ok=True)
# or with pathlib
Path('new/folder').mkdir(parents=True, exist_ok=True)

# List contents
for item in os.listdir('.'):
    print(item)

# With pathlib (more powerful)
for item in Path('.').iterdir():
    if item.is_file():
        print(f"File: {item}")

# Find files matching a pattern
for file in Path('.').glob('*.txt'):
    print(file)

# Find recursively
for file in Path('.').rglob('*.py'):
    print(file)

Copy, Move, Delete

With great power comes great responsibility. These operations can destroy data if you’re not careful. Unlike when you delete something manually (which goes to the recycle bin), here deletion is permanent.

import shutil
import os
from pathlib import Path

# Copy a file
shutil.copy('source.txt', 'destination.txt')

# Copy an entire directory
shutil.copytree('source_folder', 'destination_folder')

# Move/rename
shutil.move('old.txt', 'new.txt')
# or with pathlib
Path('old.txt').rename('new.txt')

# Delete a file
os.remove('file.txt')
# or with pathlib
Path('file.txt').unlink()

# Delete an empty directory
os.rmdir('empty_folder')

# Delete a directory with contents
shutil.rmtree('folder_with_files')
Be careful with shutil.rmtree

shutil.rmtree() deletes all content irreversibly. Always verify the path before using it.
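
One way to reduce the risk is to wrap the call in a helper that refuses obviously dangerous targets. This is only a sketch: the helper name and the specific checks are my own, not part of shutil:

```python
from pathlib import Path
import shutil

def safe_rmtree(folder):
    """Delete a directory tree, refusing a few obviously dangerous targets."""
    p = Path(folder).resolve()
    # Never delete the home directory or a filesystem root
    if p == Path.home().resolve() or p == Path(p.anchor):
        raise ValueError(f"Refusing to delete {p}")
    if p.is_dir():
        shutil.rmtree(p)
```

Checks like these are no substitute for verifying the path yourself, but they catch the worst mistakes.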

Exception Handling with Files

Files are a common source of errors: the file doesn’t exist, you don’t have permissions, the disk is full… Your code should be prepared for these situations.

from pathlib import Path

try:
    with open('nonexistent_file.txt', 'r') as f:
        content = f.read()
except FileNotFoundError:
    print("File does not exist")
except PermissionError:
    print("You don't have permission to read the file")
except OSError as e:
    # OSError covers other I/O problems (disk full, etc.);
    # it must come after its more specific subclasses above
    print(f"I/O error: {e}")

# Check before opening
file = Path('data.txt')
if file.exists():
    content = file.read_text(encoding='utf-8')
else:
    print("File not found")

Which Format to Use?

With so many options, it’s normal to wonder which one to choose. Here’s a practical guide:

Type of data                     | Recommended format | Why?
Simple table (rows and columns)  | CSV                | Easy to open in Excel, lightweight
Structured/nested data           | JSON               | Supports nested lists and dictionaries
Unformatted text                 | .txt               | Simple, universal
Application configuration        | JSON               | Readable, supports data types
Logs/records                     | .txt or .log       | Easy to append lines ('a' mode)
Exchange with web APIs           | JSON               | It’s the de facto standard
General rule

If your data looks like a spreadsheet (rows with the same columns), use CSV. If your data has variable structure (objects with different fields, nested lists), use JSON.
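
As an illustration, a record with nested structure fits JSON naturally but has no obvious flat CSV layout (the order data here is made up):

```python
import json

# One "row" whose items field is itself a list of dictionaries:
# natural in JSON, awkward to squeeze into fixed CSV columns
order = {'id': 1, 'items': [{'sku': 'A1', 'qty': 2}, {'sku': 'B7', 'qty': 1}]}
print(json.dumps(order, indent=2))
```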


Practical Exercises

Exercise 1: Word Counter

Create a program that reads a text file and shows:

  • Total number of lines
  • Total number of words
  • The 5 most frequent words
import re
from collections import Counter

def analyze_text(file_path):
    with open(file_path, 'r', encoding='utf-8') as f:
        content = f.read()

    lines = content.split('\n')
    words = content.lower().split()

    # Strip punctuation from each word
    clean_words = [re.sub(r'[^\w]', '', w) for w in words if w]

    frequency = Counter(clean_words)

    print(f"Lines: {len(lines)}")
    print(f"Words: {len(clean_words)}")
    print("Top 5 words:")
    for word, count in frequency.most_common(5):
        print(f"  {word}: {count}")

# Usage
analyze_text('my_text.txt')
Exercise 2: CSV to JSON Converter

Write a function that converts a CSV file to JSON.

import csv
import json

def csv_to_json(csv_file, json_file):
    data = []

    with open(csv_file, 'r', encoding='utf-8') as f:
        reader = csv.DictReader(f)
        for row in reader:
            data.append(row)

    with open(json_file, 'w', encoding='utf-8') as f:
        json.dump(data, f, indent=2, ensure_ascii=False)

    print(f"Converted {len(data)} records to {json_file}")

# Usage
csv_to_json('data.csv', 'data.json')
Exercise 3: File Organizer

Create a script that organizes files in a folder into subfolders by extension.

from pathlib import Path
import shutil

def organize_by_extension(source_folder):
    source = Path(source_folder)

    if not source.exists():
        print("Folder does not exist")
        return

    # Snapshot the listing first: we create subfolders inside
    # the directory while iterating over it
    for file in list(source.iterdir()):
        if file.is_file():
            # Get the extension without the dot
            extension = file.suffix.lower()[1:] or 'no_extension'

            # Create the destination folder
            destination = source / extension
            destination.mkdir(exist_ok=True)

            # Move the file
            new_name = destination / file.name
            shutil.move(str(file), str(new_name))
            print(f"Moved: {file.name} -> {extension}/")

# Usage
organize_by_extension('downloads')

Previous: Strings and Dates Next: Basic Functions