VM-LEARNING /class.xii ·track.cs ·ch-1-6 session: 2026_27
UNIT 1 ▪ CHAPTER 6
Binary Files
pickle.dump · pickle.load · Create · Read · Search · Append · Update · seek & tell
A binary file stores raw bytes — the exact pattern of zeros and ones that Python (or any other program) decided to write. Binary files can hold anything: a Python list, a dictionary, an image, an audio clip, or a complete ML model. In this chapter you will learn how to save and reload entire Python objects with the pickle module.
Real-life analogy. Writing to a text file is like dictating — you speak the characters out loud and the listener writes them down. Writing to a binary file with pickle is like vacuum-sealing a fruit basket for storage — everything inside (layout, labels, even the smallest detail) is preserved exactly so that tomorrow you can unseal it and the basket is identical to how you packed it.

6.1 What is a Binary File?

In Chapter 4 you saw that every file on disk is ultimately a stream of bytes. A text file is a stream where those bytes happen to be the character codes of letters, digits and punctuation. A binary file is a stream where the bytes can mean anything the writing program decides.

Examples of binary files you already use include images (.jpg, .png), audio (.mp3), video (.mp4) and executable programs (.exe).

Try opening a .jpg file in Notepad — what you see is meaningless-looking scrambled text. The file is perfectly fine; it just isn’t meant for human eyes.
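To make the "stream of bytes" idea concrete, the short sketch below writes three characters in text mode and then reads the same file back in binary mode, so the underlying character codes become visible. The file name demo_bytes.txt and the temporary folder are illustrative assumptions, not part of the chapter.

```python
import os
import tempfile

# Illustrative file in a temporary folder (the name is an assumption).
path = os.path.join(tempfile.mkdtemp(), "demo_bytes.txt")

# Write three characters in text mode.
with open(path, "w", encoding="utf-8") as f:
    f.write("Hi!")

# Read the same file in binary mode: we get bytes, not str.
with open(path, "rb") as f:
    raw = f.read()

print(type(raw))   # <class 'bytes'>
print(raw)         # b'Hi!'
print(list(raw))   # [72, 105, 33]  (codes for H, i and !)
```

The file on disk is the same in both cases; only the way Python hands it to you (str or bytes) changes.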

6.2 Text vs Binary — Recap & Refinement

| Aspect | Text mode | Binary mode |
|---|---|---|
| Mode string | "r" / "w" / "a" | "rb" / "wb" / "ab" |
| Data you read | str (characters) | bytes |
| Data you write | str | bytes (or via pickle) |
| Line endings | OS-converted | Left untouched |
| Human-readable in Notepad? | Yes | No |
| Best for | Notes, logs, CSV | Images, audio, full Python objects |

6.3 Opening a Binary File

Just add a "b" to any mode from Chapter 4. Everything else — closing, with, exceptions — works the same way.

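| Mode | Purpose | File must exist? | Starts at |
|---|---|---|---|
| "rb" | Read binary | Yes | Start |
| "wb" | Write binary (truncates) | No | Start |
| "ab" | Append binary | No | End |
| "rb+" | Read + write binary | Yes | Start |
| "wb+" | Write + read (truncates) | No | Start |
| "ab+" | Append + read | No | End |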
Do not open a pickle file in text mode. pickle.dump() fails with a TypeError because a text-mode file accepts only str, and pickle.load() typically fails with a UnicodeDecodeError or a TypeError.
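The behaviour of the modes in the table can be checked with a few lines. This is a minimal sketch that uses an illustrative file name (modes_demo.dat) in a temporary folder; it shows "wb" truncating, "ab" appending, and what happens when pickle is handed a text-mode file.

```python
import os
import pickle
import tempfile

# Illustrative file in a temporary folder (the name is an assumption).
path = os.path.join(tempfile.mkdtemp(), "modes_demo.dat")

with open(path, "wb") as f:      # creates the file (or empties it)
    f.write(b"abc")

with open(path, "ab") as f:      # keeps the old bytes, adds at the end
    f.write(b"de")

with open(path, "rb") as f:
    after_append = f.read()      # b'abcde'

with open(path, "wb") as f:      # "wb" again: old contents are wiped
    f.write(b"x")

with open(path, "rb") as f:
    after_rewrite = f.read()     # b'x'

# pickle produces bytes, so a text-mode file rejects the write.
try:
    with open(path, "w") as f:
        pickle.dump({"roll": 1}, f)
    text_mode_result = "no error"
except TypeError:
    text_mode_result = "TypeError"

print(after_append, after_rewrite, text_mode_result)
```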

6.4 The pickle Module — saving Python objects

Pickling (also called serialisation) is the process of converting a Python object into a byte-stream that can be stored on disk. Unpickling (deserialisation) rebuilds the original object from those bytes.
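The two directions can be seen without touching a file at all. The sketch below uses pickle.dumps and pickle.loads, the in-memory relatives of dump and load, purely to illustrate that pickling produces bytes and unpickling builds a new, equal object.

```python
import pickle

student = {"name": "Asha", "cls": 12, "marks": [78, 85, 92]}

blob = pickle.dumps(student)      # pickling: object -> bytes
restored = pickle.loads(blob)     # unpickling: bytes -> new object

print(type(blob))                 # <class 'bytes'>
print(restored == student)        # True: same contents
print(restored is student)        # False: a separate copy
```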

6.4.1 Why do we need pickle?

Suppose you want to save a dictionary of student marks:

student = {"name": "Asha", "cls": 12, "marks": [78, 85, 92]}

If you save this in a text file with write(), you must convert every piece to a string (and later write code to parse it back). pickle does all of that for you in one line: lists, dicts, tuples, nested structures and even instances of your own classes round-trip intact, provided the class definition is available to the program that loads the file.

6.4.2 Importing the module

import pickle

6.4.3 pickle.dump(object, file_object) — write

Takes any Python object and a file opened in a binary-write mode, and stores the object in that file.

    import pickle

    student = {"name": "Asha", "cls": 12, "marks": [78, 85, 92]}

    with open("student.dat", "wb") as f:
        pickle.dump(student, f)

    print("Saved.")

6.4.4 pickle.load(file_object) — read

Takes a file opened in a binary-read mode and rebuilds one object. Each load reads exactly one object that was written by one matching dump.

    import pickle

    with open("student.dat", "rb") as f:
        s = pickle.load(f)

    print(type(s), s)
    print("Name :", s["name"])
    print("Marks:", s["marks"])

Output:

    <class 'dict'> {'name': 'Asha', 'cls': 12, 'marks': [78, 85, 92]}
    Name : Asha
    Marks: [78, 85, 92]
Rule of thumb: one dump → one load. If you wrote three dictionaries with three dump calls, you must call load three times to read them all back.
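The rule of thumb can be verified directly. In this sketch (file name and sample dictionaries are illustrative), three dump calls are matched by three load calls, and a fourth load finds nothing left and raises EOFError.

```python
import os
import pickle
import tempfile

# Illustrative file in a temporary folder (the name is an assumption).
path = os.path.join(tempfile.mkdtemp(), "three.dat")
items = [{"roll": 1}, {"roll": 2}, {"roll": 3}]

# Three dump calls: three separate objects in the file.
with open(path, "wb") as f:
    for item in items:
        pickle.dump(item, f)

loaded = []
with open(path, "rb") as f:
    # Three load calls read them back in the same order.
    for _ in range(3):
        loaded.append(pickle.load(f))
    # A fourth load has nothing to read.
    try:
        pickle.load(f)
        fourth = "got a record"
    except EOFError:
        fourth = "EOFError"

print(loaded)
print(fourth)
```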

6.4.5 What pickle can (and cannot) save

| Can pickle | Cannot pickle |
|---|---|
| int, float, bool, str, bytes, None | Open file objects |
| list, tuple, dict, set (any nesting) | Network sockets |
| Your own classes and their instances | Lambda / local functions (in most cases) |
| Many standard-library objects | GUI window handles |
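A quick way to explore the table is to try pickling a few objects and see which ones are refused. The helper can_pickle below is a hypothetical name introduced only for this sketch; it simply reports whether pickle.dumps succeeded.

```python
import os
import pickle
import tempfile

def can_pickle(obj):
    """Hypothetical helper: True if pickle.dumps accepts obj, else False."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False

# Built-in containers, nested in any way, are accepted.
nested = {"names": ["Asha", "Rahul"], "pair": (1, 2.5), "tags": {"a", "b"}, "none": None}
nested_ok = can_pickle(nested)

# A lambda has no importable name, so pickle refuses it.
lambda_ok = can_pickle(lambda x: x + 1)

# An open file object is tied to the operating system and is refused too.
path = os.path.join(tempfile.mkdtemp(), "handle.txt")  # illustrative name
with open(path, "w") as fh:
    file_ok = can_pickle(fh)

print(nested_ok, lambda_ok, file_ok)   # True False False
```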

6.5 Writing Multiple Records

A “record” is whatever object you wrote with one dump call. You have two clean patterns:

6.5.1 Pattern A — one big list, one dump

    import pickle

    students = [
        {"roll": 1, "name": "Asha", "marks": 92},
        {"roll": 2, "name": "Rahul", "marks": 78},
        {"roll": 3, "name": "Riya", "marks": 85},
    ]

    with open("students.dat", "wb") as f:
        pickle.dump(students, f)     # one dump, one list

    with open("students.dat", "rb") as f:
        data = pickle.load(f)        # one load, one list

    print(data)

Easiest when the data fits in memory. One dump/load, simple to read, simple to search (just loop over the list).

6.5.2 Pattern B — many small dumps

    import pickle

    # students is the list of dictionaries defined in Pattern A
    with open("students.dat", "wb") as f:
        for s in students:
            pickle.dump(s, f)        # one dump per record

    # Reading back: loop until EOFError
    with open("students.dat", "rb") as f:
        while True:
            try:
                rec = pickle.load(f)
                print(rec)
            except EOFError:
                break                # file finished

Useful when records arrive over time and you want to append them without rewriting the whole file.

Catching EOFError is how you detect end-of-file in Pattern B. pickle.load() raises EOFError when there is nothing left to read.
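Because the try / except EOFError loop appears in almost every Pattern B program, it can be wrapped once in a small generator. The function name read_records and the file name below are assumptions made for this sketch; the chapter itself writes the loop out each time.

```python
import os
import pickle
import tempfile

def read_records(path):
    """Yield each pickled record in the file, stopping quietly at end-of-file."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return               # nothing left to read

# Build a small Pattern B file to read from (illustrative data and name).
path = os.path.join(tempfile.mkdtemp(), "students_b.dat")
students = [
    {"roll": 1, "name": "Asha", "marks": 92},
    {"roll": 2, "name": "Rahul", "marks": 78},
    {"roll": 3, "name": "Riya", "marks": 85},
]
with open(path, "wb") as f:
    for s in students:
        pickle.dump(s, f)

names = [rec["name"] for rec in read_records(path)]
print(names)   # ['Asha', 'Rahul', 'Riya']
```

With such a helper, displaying, counting or searching becomes an ordinary for loop over read_records(path).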

6.6 Appending New Records — mode "ab"

Open the file in "ab" and call pickle.dump — the new object is appended at the end, the old records are untouched.

    import pickle

    new_rec = {"roll": 4, "name": "Ankit", "marks": 71}

    with open("students.dat", "ab") as f:
        pickle.dump(new_rec, f)

    print("Appended Ankit.")
Only Pattern B files can be truly appended. If you used Pattern A (one big list), you must load the list, append to it, and dump it again in "wb".

6.6.1 Append pattern for a “one-big-list” file

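    import pickle

    # 1) Load the whole list
    with open("students.dat", "rb") as f:
        students = pickle.load(f)

    # 2) Add the new record in memory
    students.append({"roll": 4, "name": "Ankit", "marks": 71})

    # 3) Write the whole list back ("wb" truncates the old file)
    with open("students.dat", "wb") as f:
        pickle.dump(students, f)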
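It may help to see what goes wrong if "ab" is used on a one-big-list file anyway. In this sketch (file name and data are illustrative), the appended dictionary does not join the list; it sits after it as a second, separate object, so a program that calls load once never sees it.

```python
import os
import pickle
import tempfile

# Illustrative file in a temporary folder (the name is an assumption).
path = os.path.join(tempfile.mkdtemp(), "pattern_a.dat")

# Pattern A: one list, one dump.
with open(path, "wb") as f:
    pickle.dump([{"roll": 1}, {"roll": 2}], f)

# Mistaken "append" on a Pattern A file.
with open(path, "ab") as f:
    pickle.dump({"roll": 3}, f)

with open(path, "rb") as f:
    first = pickle.load(f)    # still the original two-item list
    second = pickle.load(f)   # the stray dictionary, stored separately

print(len(first))   # 2
print(second)       # {'roll': 3}
```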

6.7 Searching for a Record

Searching a Pattern B file means reading the records one by one until you find the one you want (a linear search).

    import pickle

    target_roll = int(input("Roll to find: "))
    found = False

    with open("students.dat", "rb") as f:
        while True:
            try:
                rec = pickle.load(f)
            except EOFError:
                break
            if rec["roll"] == target_roll:
                print("Match:", rec)
                found = True
                break                # stop at the first hit

    if not found:
        print("No student with roll", target_roll)
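The same search can be packaged as a function that returns the record (or None) instead of printing, which makes it easier to reuse and to test without keyboard input. The name find_by_roll and the sample file are assumptions for this sketch.

```python
import os
import pickle
import tempfile

def find_by_roll(path, roll):
    """Return the first record whose "roll" matches, or None if there is none."""
    with open(path, "rb") as f:
        while True:
            try:
                rec = pickle.load(f)
            except EOFError:
                return None          # reached the end without a match
            if rec["roll"] == roll:
                return rec           # stop at the first hit

# Small Pattern B file for the demonstration (illustrative data and name).
path = os.path.join(tempfile.mkdtemp(), "search_demo.dat")
with open(path, "wb") as f:
    for s in [{"roll": 1, "name": "Asha"}, {"roll": 2, "name": "Rahul"}]:
        pickle.dump(s, f)

hit = find_by_roll(path, 2)
miss = find_by_roll(path, 9)
print(hit)    # {'roll': 2, 'name': 'Rahul'}
print(miss)   # None
```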

6.8 Updating an Existing Record

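A pickled record cannot safely be edited in the middle of a file: the new version usually occupies a different number of bytes, so writing it in place would damage the records around it. The standard CBSE pattern is read everything → modify → write everything back.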

    import pickle

    roll_to_update = 2
    new_marks = 88

    # 1) Read every record into a list
    records = []
    with open("students.dat", "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:
                break

    # 2) Modify the target record
    for r in records:
        if r["roll"] == roll_to_update:
            r["marks"] = new_marks
            break

    # 3) Write everything back (truncates the old file)
    with open("students.dat", "wb") as f:
        for r in records:
            pickle.dump(r, f)

    print(f"Updated roll {roll_to_update} → marks = {new_marks}")
Exactly the same pattern works for deleting: filter the unwanted record out of the list before step 3, then rewrite.
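One possible refinement of the read-all / modify / write-all pattern is sketched below. It reports whether the roll number was actually found, and it writes the new contents to a temporary file before swapping it in with os.replace, so a crash mid-write does not leave a half-written students file. The temporary-file step is an addition that is not in the chapter, and the names update_marks and the ".tmp" suffix are assumptions.

```python
import os
import pickle
import tempfile

def update_marks(path, roll, new_marks):
    """Rewrite the file with one record changed; return True if roll was found."""
    # 1) Read every record into a list.
    records = []
    with open(path, "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:
                break

    # 2) Modify the target record, remembering whether it existed.
    found = False
    for r in records:
        if r["roll"] == roll:
            r["marks"] = new_marks
            found = True
            break
    if not found:
        return False                 # leave the file untouched

    # 3) Write to a temporary file, then replace the original in one step.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        for r in records:
            pickle.dump(r, f)
    os.replace(tmp_path, path)
    return True

# Demonstration on an illustrative file.
path = os.path.join(tempfile.mkdtemp(), "update_demo.dat")
with open(path, "wb") as f:
    for s in [{"roll": 1, "marks": 92}, {"roll": 2, "marks": 78}]:
        pickle.dump(s, f)

changed = update_marks(path, 2, 88)
missing = update_marks(path, 7, 50)

with open(path, "rb") as f:
    first = pickle.load(f)
    second = pickle.load(f)
print(changed, missing, second)   # True False {'roll': 2, 'marks': 88}
```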

6.9 Deleting a Record

    import pickle

    roll_to_delete = 3

    # 1) Read every record into a list
    records = []
    with open("students.dat", "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:
                break

    # 2) Keep every record except the one to delete
    records = [r for r in records if r["roll"] != roll_to_delete]

    # 3) Write the remaining records back
    with open("students.dat", "wb") as f:
        for r in records:
            pickle.dump(r, f)

    print("Deleted.")

6.10 File Pointer in Binary Mode — seek() & tell()

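Binary mode is the only mode where seek(offset, whence) works fully with whence = 1 or 2. In text mode, Python allows only seeks measured from the start of the file, plus seek(0, 2) to jump to the very end. In binary mode the position is always measured in bytes.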

| whence | Constant | Reference point |
|---|---|---|
| 0 | os.SEEK_SET | Start of the file |
| 1 | os.SEEK_CUR | Current pointer position |
| 2 | os.SEEK_END | End of the file |
    import pickle

    with open("students.dat", "rb") as f:
        f.seek(0, 2)                 # jump to end
        print("File size =", f.tell(), "bytes")
        f.seek(0)                    # back to start
        rec = pickle.load(f)
        print("First record:", rec)
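The three whence values are easiest to follow on a file whose contents are known byte by byte. This sketch writes the ten bytes 0 to 9 into an illustrative file and moves the pointer relative to the start, the current position and the end.

```python
import os
import tempfile

# Illustrative file in a temporary folder (the name is an assumption).
path = os.path.join(tempfile.mkdtemp(), "seek_demo.bin")
with open(path, "wb") as f:
    f.write(b"0123456789")           # ten bytes, positions 0..9

with open(path, "rb") as f:
    f.seek(3, os.SEEK_SET)           # 3 bytes from the start
    a = f.read(2)                    # b'34', pointer is now at 5

    f.seek(2, os.SEEK_CUR)           # 2 bytes forward from the current position
    pos = f.tell()                   # 7
    b = f.read(1)                    # b'7'

    f.seek(-3, os.SEEK_END)          # 3 bytes before the end
    c = f.read()                     # b'789'
    size = f.tell()                  # 10, the pointer is at the end

print(a, pos, b, c, size)
```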

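6.10.1 Reading the last record

Pickled records differ in length, so the position of the last record cannot be worked out from the file size alone. Unless the byte offset of each record was saved when it was written (using tell()), the only way to reach the last record is to scan the file and keep the most recent load. This is the read/scan method that CBSE typically asks for.

    import pickle

    with open("students.dat", "rb") as f:
        last = None
        while True:
            try:
                last = pickle.load(f)   # overwrite until the file ends
            except EOFError:
                break

    print("Last record:", last)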
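For completeness, here is a sketch of the more advanced alternative: noting f.tell() just before each dump gives the byte offset at which that record starts, and seek() can later jump straight to it. The offsets list is an assumption of this example and lives only in memory; a real program would have to store it somewhere (for instance in a second file) to use it in a later run.

```python
import os
import pickle
import tempfile

# Illustrative file and data (names are assumptions).
path = os.path.join(tempfile.mkdtemp(), "offsets_demo.dat")
students = [
    {"roll": 1, "name": "Asha", "marks": 92},
    {"roll": 2, "name": "Rahul", "marks": 78},
    {"roll": 3, "name": "Riya", "marks": 85},
]

# While writing, remember where each record begins.
offsets = []
with open(path, "wb") as f:
    for s in students:
        offsets.append(f.tell())     # byte position before this dump
        pickle.dump(s, f)

# Later: jump directly to a record without loading the earlier ones.
with open(path, "rb") as f:
    f.seek(offsets[-1])              # start of the last record
    last = pickle.load(f)
    f.seek(offsets[1])               # start of the second record
    second = pickle.load(f)

print(offsets[0])   # 0: the first record starts at the beginning
print(last)
print(second)
```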

6.11 Safety Note — never unpickle untrusted data

pickle.load reconstructs arbitrary Python objects, including ones that run code while being re-created. Never unpickle a .dat / .pkl file that came from a stranger — it is equivalent to running their Python program. For data exchange with other systems, prefer text formats like JSON or CSV.
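As a point of comparison, the student dictionary used earlier in the chapter can be saved with the standard json module instead. This is a minimal sketch with an illustrative file name; note that the file is opened in text mode, and that JSON handles only simple types such as dict, list, str, numbers, booleans and None.

```python
import json
import os
import tempfile

student = {"name": "Asha", "cls": 12, "marks": [78, 85, 92]}

# Illustrative file in a temporary folder (the name is an assumption).
path = os.path.join(tempfile.mkdtemp(), "student.json")

# JSON is a text format, so the file is opened in text mode.
with open(path, "w", encoding="utf-8") as f:
    json.dump(student, f)

with open(path, "r", encoding="utf-8") as f:
    restored = json.load(f)

print(restored == student)   # True
```

Loading a JSON file only ever produces plain data, which is why it is the safer choice for files that come from outside your own program.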

6.12 CBSE-style Worked Programs

6.12.1 Create a binary file of student records

    import pickle

    def create_file():
        while True:
            roll = int(input("Roll : "))
            name = input("Name : ")
            marks = float(input("Marks: "))
            rec = {"roll": roll, "name": name, "marks": marks}

            with open("students.dat", "ab") as f:
                pickle.dump(rec, f)

            if input("Another? (y/n): ").lower() != "y":
                break

    create_file()

6.12.2 Display every record from the file

    import pickle

    with open("students.dat", "rb") as f:
        while True:
            try:
                rec = pickle.load(f)
                print(rec)
            except EOFError:
                break

6.12.3 Count how many students scored more than 75

    import pickle

    count = 0
    with open("students.dat", "rb") as f:
        while True:
            try:
                r = pickle.load(f)
                if r["marks"] > 75:
                    count += 1
            except EOFError:
                break

    print("Students above 75:", count)

6.12.4 Search by name (case-insensitive)

    import pickle

    target = input("Name to search: ").lower()
    found = False

    with open("students.dat", "rb") as f:
        while True:
            try:
                r = pickle.load(f)
            except EOFError:
                break
            if r["name"].lower() == target:
                print("Found:", r)
                found = True
                break

    # Report a miss only once, after the whole file has been checked
    if not found:
        print("No match.")

6.12.5 Update the marks for a given roll number

Shown in §6.8 — the classic read-all / modify / write-all pattern.

6.12.6 Copy every “pass” student (≥ 33) to a new file

    import pickle

    with open("students.dat", "rb") as src, open("passed.dat", "wb") as dst:
        while True:
            try:
                r = pickle.load(src)
                if r["marks"] >= 33:
                    pickle.dump(r, dst)
            except EOFError:
                break

    print("Copied passing students to passed.dat")

6.12.7 Phone book — add / display / search

    import pickle, os

    FILE = "phonebook.dat"

    def load():
        if not os.path.exists(FILE):
            return {}
        with open(FILE, "rb") as f:
            return pickle.load(f)

    def save(book):
        with open(FILE, "wb") as f:
            pickle.dump(book, f)

    def menu():
        while True:
            print("\n1. Add  2. Show  3. Search  4. Quit")
            c = input("Choice: ")
            book = load()
            if c == "1":
                name = input("Name : ")
                ph = input("Phone: ")
                book[name] = ph
                save(book)
            elif c == "2":
                for n, p in book.items():
                    print(n, "->", p)
            elif c == "3":
                n = input("Name: ")
                print(book.get(n, "Not found"))
            else:
                break

    menu()

6.13 Common Mistakes to Avoid

| # | Mistake | Fix |
|---|---|---|
| 1 | Opening a pickle file in text mode ("r" / "w") | Always use "rb" / "wb" / "ab" |
| 2 | Forgetting the try / except EOFError loop | It is the standard way to reach end-of-file in Pattern B |
| 3 | Wrong pairing: dumped 3 objects but called load() only once | One dump → one load |
| 4 | Trying to update a record in place | Binary files need read-all / modify / write-all |
| 5 | Forgetting import pickle | Always add it at the top |
| 6 | Using "wb" when you meant "ab", so the file was wiped | "wb" truncates; use "ab" to add |
| 7 | Unpickling a file received from someone you don't trust | Never do it: pickle can execute arbitrary code |

Quick-revision summary

  • Binary files store raw bytes; open them with "rb", "wb" or "ab".
  • pickle.dump(obj, f) writes one Python object; pickle.load(f) reads one back.
  • Two writing patterns: (A) one big list — simple; (B) many small dumps — allows genuine append.
  • End-of-file in Pattern B is detected with try … except EOFError.
  • Records cannot be edited in place. To update or delete, load everything, change the list, and write the whole thing back in "wb".
  • Binary mode allows full seek(offset, whence) with whence = 0, 1, 2; positions are in bytes.
  • Never unpickle files from untrusted sources.