pickle module.
pickle is like vacuum-sealing a fruit basket for storage — everything inside (layout, labels, even the smallest detail) is preserved exactly so that tomorrow you can unseal it and the basket is identical to how you packed it.
6.1 What is a Binary File?
In Chapter 4 you saw that every file on disk is ultimately a stream of bytes. A text file is a stream where those bytes happen to be the character codes of letters, digits and punctuation. A binary file is a stream where the bytes can mean anything the writing program decides.
Examples of binary files you already use:
- Images: .jpg, .png, .gif
- Audio / video: .mp3, .mp4, .wav
- Programs: .exe, .dll
- Compressed archives: .zip, .rar
- Python “pickle” files: .dat, .pkl (the focus of this chapter)
Open a .jpg file in Notepad and what you see is meaningless-looking scrambled text. The file is perfectly fine; it just isn’t meant for human eyes.
6.2 Text vs Binary — Recap & Refinement
| | Text mode | Binary mode |
|---|---|---|
| Mode string | "r" / "w" / "a" | "rb" / "wb" / "ab" |
| Data you read | str (characters) | bytes |
| Data you write | str | bytes (or via pickle) |
| Line endings | OS-converted | Left untouched |
| Human-readable in Notepad? | Yes | No |
| Best for | Notes, logs, CSV | Images, audio, full Python objects |
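To make the table concrete, here is a minimal sketch that writes one short file and reads it back in both modes. The file name demo.txt and the temporary folder are assumptions made purely for illustration.

```python
import os
import tempfile

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w", encoding="utf-8") as f:
    f.write("Hi\n")

with open(path, "r", encoding="utf-8") as f:
    text_data = f.read()   # text mode gives a str

with open(path, "rb") as f:
    byte_data = f.read()   # binary mode gives bytes, line ending untouched

print(type(text_data).__name__, repr(text_data))
print(type(byte_data).__name__, repr(byte_data))
```

On Windows the bytes version would typically show the extra carriage-return byte that text mode hides from you.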
6.3 Opening a Binary File
Just add a "b" to any mode from Chapter 4. Everything else — closing, with, exceptions — works the same way.
| Mode | Purpose | File must exist? | Starts at |
|---|---|---|---|
| "rb" | Read binary | Yes | Start |
| "wb" | Write binary (truncates) | No | Start |
| "ab" | Append binary | No | End |
| "rb+" | Read + write binary | Yes | Start |
| "wb+" | Write + read (truncates) | No | Start |
| "ab+" | Append + read | No | End |
If you forget the "b" and open a binary file in text mode, you may get a UnicodeDecodeError or, worse, a subtly corrupted file.
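The three most common modes can be tried out with a few raw bytes. This is a small sketch, not a required pattern; the file name raw.dat and the byte values are made up for the example.

```python
import os
import tempfile

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "raw.dat")

# "wb" creates the file (or empties an existing one) and writes bytes
with open(path, "wb") as f:
    f.write(bytes([80, 89, 0, 255]))

# "rb" reads the bytes back exactly as they were stored
with open(path, "rb") as f:
    data = f.read()

# "ab" adds to the end without touching what is already there
with open(path, "ab") as f:
    f.write(b"\x01")

with open(path, "rb") as f:
    data_after_append = f.read()

print(data)                    # b'PY\x00\xff'
print(len(data_after_append))  # 5
```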
6.4 The pickle Module — saving Python objects
6.4.1 Why do we need pickle?
Suppose you want to save a dictionary of student marks:
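A made-up example of such a dictionary (the names and marks are purely illustrative):

```python
# Hypothetical data: student name -> marks in three subjects
marks = {
    "Asha": [88, 92, 79],
    "Ravi": [65, 70, 58],
    "Meena": [95, 91, 99],
}

print(marks["Asha"])  # [88, 92, 79]
```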
If you save this in a text file with write(), you must convert every piece to a string (and later write code to parse it back). pickle does all of that for you in one line — lists, dicts, tuples, nested structures, even your own classes — all round-trip perfectly.
6.4.2 Importing the module
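pickle ships with Python's standard library, so a plain import is all that is needed. The quick check below is only there to show that the two functions used in this chapter are available.

```python
import pickle  # standard library, nothing to install

# The two functions this chapter relies on
print(callable(pickle.dump), callable(pickle.load))  # True True
```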
6.4.3 pickle.dump(object, file_object) — write
Takes any Python object and a file opened in a binary-write mode, and stores the object in that file.
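A minimal sketch of a single dump, assuming a small hypothetical dictionary and a file called marks.dat in a temporary folder:

```python
import os
import pickle
import tempfile

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "marks.dat")

marks = {"Asha": 88, "Ravi": 65}  # made-up data

# The file must be opened in a binary-write mode
with open(path, "wb") as f:
    pickle.dump(marks, f)

size = os.path.getsize(path)
print("bytes written:", size)
```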
6.4.4 pickle.load(file_object) — read
Takes a file opened in a binary-read mode and rebuilds one object. Each load reads exactly one object that was written by one matching dump.
One dump → one load. If you wrote three dictionaries with three dump calls, you must call load three times to read them all back.
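The pairing rule can be sketched as follows. The three records and the file name are assumptions for illustration only.

```python
import os
import pickle
import tempfile

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "three.dat")

records = [
    {"roll": 1, "name": "Asha"},
    {"roll": 2, "name": "Ravi"},
    {"roll": 3, "name": "Meena"},
]

# Three separate dump calls
with open(path, "wb") as f:
    for rec in records:
        pickle.dump(rec, f)

# Three matching load calls, returned in the order they were written
with open(path, "rb") as f:
    first = pickle.load(f)
    second = pickle.load(f)
    third = pickle.load(f)

print(first, second, third)
```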
6.4.5 What pickle can (and cannot) save
| Can pickle | Cannot pickle |
|---|---|
| int, float, bool, str, bytes, None | Open file objects |
| list, tuple, dict, set (any nesting) | Network sockets |
| Your own classes and their instances | Lambda / local functions (in most cases) |
| Many standard-library objects | GUI window handles |
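One way to check a few rows of this table yourself is a small helper that simply tries to pickle an object. The helper can_pickle and the class Student are hypothetical names introduced for this sketch; they are not part of the pickle module.

```python
import os
import pickle

class Student:  # hypothetical class, defined at module level
    def __init__(self, name, marks):
        self.name = name
        self.marks = marks

def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False if it refuses."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False

print(can_pickle({"a": [1, 2, (3, 4)], "b": {5, 6}}))  # nested built-in types
print(can_pickle(Student("Asha", 88)))                  # instance of your own class
print(can_pickle(lambda x: x + 1))                      # lambda function
with open(os.devnull, "rb") as f:
    print(can_pickle(f))                                # open file object
```

When the script is run directly, the first two checks are expected to succeed and the last two to fail.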
6.5 Writing Multiple Records
A “record” is whatever object you wrote with one dump call. You have two clean patterns:
6.5.1 Pattern A — one big list, one dump
Easiest when the data fits in memory. One dump/load, simple to read, simple to search (just loop over the list).
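A minimal sketch of Pattern A, assuming each record is a dictionary with hypothetical keys roll, name and marks:

```python
import os
import pickle
import tempfile

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "students.dat")

students = [
    {"roll": 1, "name": "Asha", "marks": 88},
    {"roll": 2, "name": "Ravi", "marks": 65},
]

# One dump writes the whole list
with open(path, "wb") as f:
    pickle.dump(students, f)

# One load brings the whole list back
with open(path, "rb") as f:
    loaded = pickle.load(f)

for s in loaded:
    print(s["roll"], s["name"], s["marks"])
```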
6.5.2 Pattern B — many small dumps
Useful when records arrive over time and you want to append them without rewriting the whole file.
EOFError is how you detect end-of-file in Pattern B. pickle.load() raises EOFError when there is nothing left to read.
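Pattern B and its end-of-file loop might look like the sketch below. The helper name read_all and the sample records are assumptions for illustration.

```python
import os
import pickle
import tempfile

def read_all(path):
    """Load records one at a time until pickle.load raises EOFError."""
    records = []
    with open(path, "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:
                break  # nothing left to read
    return records

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "students.dat")

# Each record gets its own dump call
with open(path, "wb") as f:
    pickle.dump({"roll": 1, "name": "Asha"}, f)
    pickle.dump({"roll": 2, "name": "Ravi"}, f)

records = read_all(path)
print(len(records))  # 2
```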
6.6 Appending New Records — mode "ab"
Open the file in "ab" and call pickle.dump — the new object is appended at the end, the old records are untouched.
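A short sketch of appending, assuming the file already holds two records written with separate dump calls (file name and data are made up):

```python
import os
import pickle
import tempfile

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "students.dat")

# Existing file with two records
with open(path, "wb") as f:
    pickle.dump({"roll": 1, "name": "Asha"}, f)
    pickle.dump({"roll": 2, "name": "Ravi"}, f)

# "ab" adds a third record after the existing two
with open(path, "ab") as f:
    pickle.dump({"roll": 3, "name": "Meena"}, f)

# Read everything back to confirm nothing was lost
records = []
with open(path, "rb") as f:
    while True:
        try:
            records.append(pickle.load(f))
        except EOFError:
            break

print([r["name"] for r in records])  # ['Asha', 'Ravi', 'Meena']
```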
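Take care not to open the file in "wb" by mistake: that mode truncates the file, so every existing record is wiped before the new one is written.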
6.6.1 Append pattern for a “one-big-list” file
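When the file holds one pickled list (Pattern A), a possible approach is to load the list, append to it in memory, and write the whole list back. The sketch below assumes hypothetical student dictionaries and a temporary file.

```python
import os
import pickle
import tempfile

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "students.dat")

# Existing file: one list, written with one dump
with open(path, "wb") as f:
    pickle.dump([{"roll": 1, "name": "Asha"}], f)

# Step 1: load the whole list
with open(path, "rb") as f:
    students = pickle.load(f)

# Step 2: append in memory
students.append({"roll": 2, "name": "Ravi"})

# Step 3: write the whole list back ("wb" is correct here)
with open(path, "wb") as f:
    pickle.dump(students, f)

with open(path, "rb") as f:
    check = pickle.load(f)
print(len(check))  # 2
```

Using "ab" here would not extend the list; it would store a second, separate pickled object after the first one.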
6.7 Searching for a Record
Searching means reading the records one by one until you find the one you want.
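A sketch of such a search, assuming Pattern B records stored as dictionaries with a hypothetical roll key. The function name find_by_roll is an illustration, not a library call.

```python
import os
import pickle
import tempfile

def find_by_roll(path, roll):
    """Return the first record with the given roll number, or None."""
    with open(path, "rb") as f:
        while True:
            try:
                rec = pickle.load(f)
            except EOFError:
                return None  # reached the end without a match
            if rec["roll"] == roll:
                return rec

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "students.dat")
with open(path, "wb") as f:
    for rec in [{"roll": 1, "name": "Asha"}, {"roll": 2, "name": "Ravi"}]:
        pickle.dump(rec, f)

print(find_by_roll(path, 2))   # {'roll': 2, 'name': 'Ravi'}
print(find_by_roll(path, 99))  # None
```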
6.8 Updating an Existing Record
Pickled records vary in length, so a record in the middle of the file cannot safely be overwritten in place. The standard CBSE pattern is read everything → modify → write everything back.
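The read-all / modify / write-all pattern can be sketched like this. The helper names (read_all, write_all, update_marks) and the record layout are assumptions for illustration.

```python
import os
import pickle
import tempfile

def read_all(path):
    """Return every record in the file as a list."""
    records = []
    with open(path, "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:
                break
    return records

def write_all(path, records):
    """Rewrite the file from scratch, one dump per record."""
    with open(path, "wb") as f:
        for rec in records:
            pickle.dump(rec, f)

def update_marks(path, roll, new_marks):
    """Change the marks of one roll number; return True if it was found."""
    records = read_all(path)
    found = False
    for rec in records:
        if rec["roll"] == roll:
            rec["marks"] = new_marks
            found = True
    if found:
        write_all(path, records)
    return found

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "students.dat")
write_all(path, [{"roll": 1, "marks": 60}, {"roll": 2, "marks": 70}])

print(update_marks(path, 2, 85))  # True
print(read_all(path))             # [{'roll': 1, 'marks': 60}, {'roll': 2, 'marks': 85}]
```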
6.9 Deleting a Record
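Deleting follows the same idea as updating: load every record, keep only the ones you still want, and write those back. The sketch below assumes Pattern B records with a hypothetical roll key; delete_by_roll is an illustrative name.

```python
import os
import pickle
import tempfile

def read_all(path):
    """Return every record in the file as a list."""
    records = []
    with open(path, "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:
                break
    return records

def delete_by_roll(path, roll):
    """Remove records with the given roll number; return how many were removed."""
    records = read_all(path)
    kept = [rec for rec in records if rec["roll"] != roll]
    if len(kept) != len(records):
        with open(path, "wb") as f:  # rewrite the file without the deleted record
            for rec in kept:
                pickle.dump(rec, f)
    return len(records) - len(kept)

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "students.dat")
with open(path, "wb") as f:
    for rec in [{"roll": 1}, {"roll": 2}, {"roll": 3}]:
        pickle.dump(rec, f)

print(delete_by_roll(path, 2))  # 1
print(read_all(path))           # [{'roll': 1}, {'roll': 3}]
```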
6.10 File Pointer in Binary Mode — seek() & tell()
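In text mode, Python only allows seeks relative to the start of the file (the one exception is seek(0, 2), which jumps to the very end). Binary mode accepts any offset with whence 0, 1 or 2, and the position is always measured in bytes.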
| whence | Constant | Reference point |
|---|---|---|
| 0 | os.SEEK_SET | Start of the file |
| 1 | os.SEEK_CUR | Current pointer position |
| 2 | os.SEEK_END | End of the file |
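All three reference points can be tried on a ten-byte file. This is a small sketch; the file name and contents are made up for the example.

```python
import os
import tempfile

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "letters.dat")
with open(path, "wb") as f:
    f.write(b"ABCDEFGHIJ")  # byte positions 0 to 9

with open(path, "rb") as f:
    f.seek(3)                  # whence defaults to 0: 3 bytes from the start
    a = f.read(1)              # b'D'
    pos = f.tell()             # 4, because reading moved the pointer forward

    f.seek(2, os.SEEK_CUR)     # 2 bytes ahead of the current position -> 6
    b = f.read(1)              # b'G'

    f.seek(-2, os.SEEK_END)    # 2 bytes before the end -> 8
    c = f.read()               # b'IJ'

print(a, pos, b, c)  # b'D' 4 b'G' b'IJ'
```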
6.10.1 Reading the last record without scanning
This is advanced; CBSE typically asks only the read/scan method — but it’s worth seeing once.
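One possible way to do this, shown here only as a sketch, is to note the byte offset with tell() just before each dump and later seek() straight to the last offset. The offsets list is an assumption of this example; in a real program it would have to be saved somewhere to be useful in a later run.

```python
import os
import pickle
import tempfile

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "students.dat")

records = [
    {"roll": 1, "name": "Asha"},
    {"roll": 2, "name": "Ravi"},
    {"roll": 3, "name": "Meena"},
]

offsets = []  # byte position where each record starts
with open(path, "wb") as f:
    for rec in records:
        offsets.append(f.tell())
        pickle.dump(rec, f)

# Jump directly to the start of the last record and load only that one
with open(path, "rb") as f:
    f.seek(offsets[-1])
    last = pickle.load(f)

print(last)  # {'roll': 3, 'name': 'Meena'}
```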
6.11 Safety Note — never unpickle untrusted data
pickle.load reconstructs arbitrary Python objects, including ones that run code while being re-created. Never unpickle a .dat / .pkl file that came from a stranger — it is equivalent to running their Python program. For data exchange with other systems, prefer text formats like JSON or CSV.
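As a rough illustration of the JSON alternative, the sketch below round-trips a hypothetical marks dictionary through the standard json module. Loading JSON only builds basic types such as dict, list, str, numbers, booleans and None.

```python
import json

marks = {"Asha": [88, 92], "Ravi": [65, 70]}  # made-up data

text = json.dumps(marks)     # a plain, human-readable string
restored = json.loads(text)  # back to a dictionary of lists

print(text)
print(restored == marks)  # True
```

The trade-off is that JSON handles only these basic types, so tuples come back as lists and custom class instances need extra conversion code.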
6.12 CBSE-style Worked Programs
6.12.1 Create a binary file of student records
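A possible version of this program is sketched below. The records are hard-coded so the sketch runs on its own; in an exam answer the values would usually come from input(). The keys roll, name and marks and the file name are assumptions.

```python
import os
import pickle
import tempfile

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "students.dat")

# Made-up records; an exam program would typically read these with input()
students = [
    {"roll": 1, "name": "Asha", "marks": 88},
    {"roll": 2, "name": "Ravi", "marks": 65},
    {"roll": 3, "name": "Meena", "marks": 95},
]

with open(path, "wb") as f:
    for s in students:
        pickle.dump(s, f)  # one dump per student record

print(len(students), "records written")
```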
6.12.2 Display every record from the file
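A sketch of a display routine, assuming the file was written with one dump per record as dictionaries with hypothetical keys roll, name and marks:

```python
import os
import pickle
import tempfile

def display_all(path):
    """Print every record in the file and return how many were shown."""
    count = 0
    with open(path, "rb") as f:
        while True:
            try:
                rec = pickle.load(f)
            except EOFError:
                break
            print(rec["roll"], rec["name"], rec["marks"])
            count += 1
    return count

# Hypothetical file in a throwaway temporary folder, filled with made-up data
path = os.path.join(tempfile.mkdtemp(), "students.dat")
with open(path, "wb") as f:
    for rec in [
        {"roll": 1, "name": "Asha", "marks": 88},
        {"roll": 2, "name": "Ravi", "marks": 65},
        {"roll": 3, "name": "Meena", "marks": 95},
    ]:
        pickle.dump(rec, f)

shown = display_all(path)
print("Total records:", shown)
```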
6.12.3 Count how many students scored more than 75
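One way to write the counting program is shown below. The function name count_above and the sample marks are assumptions; note that a student with exactly 75 is not counted, because the question says "more than 75".

```python
import os
import pickle
import tempfile

def count_above(path, cutoff=75):
    """Count the records whose marks are strictly greater than cutoff."""
    count = 0
    with open(path, "rb") as f:
        while True:
            try:
                rec = pickle.load(f)
            except EOFError:
                break
            if rec["marks"] > cutoff:
                count += 1
    return count

# Hypothetical file in a throwaway temporary folder, filled with made-up data
path = os.path.join(tempfile.mkdtemp(), "students.dat")
with open(path, "wb") as f:
    for rec in [
        {"roll": 1, "marks": 88},
        {"roll": 2, "marks": 65},
        {"roll": 3, "marks": 75},
        {"roll": 4, "marks": 91},
    ]:
        pickle.dump(rec, f)

high_scorers = count_above(path)
print(high_scorers)  # 2
```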
6.12.4 Search by name (case-insensitive)
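A sketch of a case-insensitive search: both names are converted to lower case before they are compared. The function name find_by_name and the record layout are assumptions for illustration.

```python
import os
import pickle
import tempfile

def find_by_name(path, name):
    """Return all records whose name matches, ignoring upper/lower case."""
    matches = []
    wanted = name.lower()
    with open(path, "rb") as f:
        while True:
            try:
                rec = pickle.load(f)
            except EOFError:
                break
            if rec["name"].lower() == wanted:
                matches.append(rec)
    return matches

# Hypothetical file in a throwaway temporary folder, filled with made-up data
path = os.path.join(tempfile.mkdtemp(), "students.dat")
with open(path, "wb") as f:
    for rec in [
        {"roll": 1, "name": "Asha", "marks": 88},
        {"roll": 2, "name": "Ravi", "marks": 65},
    ]:
        pickle.dump(rec, f)

print(find_by_name(path, "RAVI"))   # [{'roll': 2, 'name': 'Ravi', 'marks': 65}]
print(find_by_name(path, "kiran"))  # []
```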
6.12.5 Update the marks for a given roll number
Shown in §6.8 — the classic read-all / modify / write-all pattern.
6.12.6 Copy every “pass” student (≥ 33) to a new file
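A possible version of the copy program reads the source file record by record and dumps the passing ones into a second file. The function name copy_pass, both file names and the sample data are assumptions.

```python
import os
import pickle
import tempfile

def copy_pass(src, dst, pass_mark=33):
    """Copy records with marks >= pass_mark from src to dst; return the count."""
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            try:
                rec = pickle.load(fin)
            except EOFError:
                break
            if rec["marks"] >= pass_mark:
                pickle.dump(rec, fout)
                copied += 1
    return copied

# Hypothetical files in a throwaway temporary folder
folder = tempfile.mkdtemp()
src = os.path.join(folder, "students.dat")
dst = os.path.join(folder, "pass.dat")

with open(src, "wb") as f:
    for rec in [
        {"name": "Asha", "marks": 88},
        {"name": "Ravi", "marks": 20},
        {"name": "Meena", "marks": 33},
    ]:
        pickle.dump(rec, f)

copied = copy_pass(src, dst)
print(copied)  # 2
```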
6.12.7 Phone book — add / display / search
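The phone book can be sketched as three small functions built on the append pattern. All names here (add_contact, all_contacts, search) and the phone numbers are invented for illustration; a menu-driven exam answer would wrap them in a loop that reads choices with input().

```python
import os
import pickle
import tempfile

def add_contact(path, name, number):
    """Append one contact; "ab" creates the file if it does not exist yet."""
    with open(path, "ab") as f:
        pickle.dump({"name": name, "number": number}, f)

def all_contacts(path):
    """Return every contact in the file (an empty list if there is no file)."""
    contacts = []
    if not os.path.exists(path):
        return contacts
    with open(path, "rb") as f:
        while True:
            try:
                contacts.append(pickle.load(f))
            except EOFError:
                break
    return contacts

def search(path, name):
    """Return the number stored for name (case-insensitive), or None."""
    for contact in all_contacts(path):
        if contact["name"].lower() == name.lower():
            return contact["number"]
    return None

# Hypothetical file in a throwaway temporary folder
path = os.path.join(tempfile.mkdtemp(), "phonebook.dat")

add_contact(path, "Asha", "98765-00001")  # made-up numbers
add_contact(path, "Ravi", "98765-00002")

for contact in all_contacts(path):        # display
    print(contact["name"], contact["number"])

print(search(path, "asha"))    # 98765-00001
print(search(path, "nobody"))  # None
```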
6.13 Common Mistakes to Avoid
| # | Mistake | Fix |
|---|---|---|
| 1 | Opening a pickle file in text mode ("r" / "w") | Always use "rb" / "wb" / "ab" |
| 2 | Forgetting the try / except EOFError loop | It is the standard way to reach end-of-file in Pattern B |
| 3 | Wrong pairing — dumped 3 objects but called load() only once | One dump → one load |
| 4 | Trying to update a record in place | Binary files need read-all / modify / write-all |
| 5 | Forgetting import pickle | Always add it at the top |
| 6 | Using "wb" when you meant "ab" — file was wiped | "wb" truncates; use "ab" to add |
| 7 | Unpickling a file received from someone you don’t trust | Never do it — pickle can execute arbitrary code |
Quick-revision summary
- Binary files store raw bytes; open them with "rb", "wb" or "ab".
- pickle.dump(obj, f) writes one Python object; pickle.load(f) reads one back.
- Two writing patterns: (A) one big list, which is simple; (B) many small dumps, which allows genuine append.
- End-of-file in Pattern B is detected with try … except EOFError.
- Records cannot be edited in place. To update or delete, load everything, change the list, and write the whole thing back in "wb".
- Binary mode allows full seek(offset, whence) with whence = 0, 1, 2; positions are in bytes.
- Never unpickle files from untrusted sources.