Reading Files: Line-by-Line vs Loading Entire File into RAM

Tags: Systems, Performance, Memory, File I/O

When your program reads a file, it makes a fundamental choice: load everything into memory at once, or process it piece by piece. For small files, it doesn't matter. For large ones—logs, datasets, ETL pipelines—this decision can be the difference between a fast program and an out-of-memory crash.


Reading Line-by-Line (Streaming)

Instead of pulling the whole file into RAM, your program reads one line at a time, processes it, and moves on. The previous line is eligible for garbage collection immediately.

Python example

# Streams the file — only one line in memory at a time
with open("access.log", "r") as f:
    for line in f:
        if "ERROR" in line:
            print(line.strip())

Even if access.log is 50 GB, this script holds only the current line plus a small read buffer in RAM, typically a few KB, as long as individual lines stay short.

Go example

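// Imports needed: bufio, fmt, log, os, strings
file, err := os.Open("access.log")
if err != nil {
    log.Fatal(err) // without this check, a missing file leads to reads on a nil handle
}
defer file.Close()

scanner := bufio.NewScanner(file)
for scanner.Scan() {
    line := scanner.Text()
    if strings.Contains(line, "ERROR") {
        fmt.Println(line)
    }
}
// Scan returns false on I/O errors and over-long lines too, so check why it stopped
if err := scanner.Err(); err != nil {
    log.Fatal(err)
}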

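bufio.Scanner handles buffering internally: it reads from the OS in chunks (its buffer starts at 4 KB and grows as needed), then delivers lines one at a time to your code. By default a single line may not exceed about 64 KB (bufio.MaxScanTokenSize); a longer line makes Scan stop with bufio.ErrTooLong unless you raise the limit with scanner.Buffer.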

Advantages

  • Low memory usage. Only a small portion of the file lives in RAM at once.
  • Immediate processing. Your program can start outputting results before the file has been fully read.
  • Scales to any file size. A 1 GB and a 1 TB file cost about the same in memory.
  • Safer for constrained environments like containers with tight memory limits.
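To make the "immediate processing" point concrete, here is a minimal sketch of a generator-based filter. The function name error_lines and the sample log text are illustrative assumptions, and io.StringIO stands in for a real file handle.

```python
import io
from typing import Iterable, Iterator


def error_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield lines containing 'ERROR', one at a time, without storing the rest."""
    for line in lines:
        if "ERROR" in line:
            yield line.rstrip("\n")


# io.StringIO stands in for open("access.log"); both are iterated line by line
sample = io.StringIO("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")

matches = error_lines(sample)
first = next(matches)  # produced after consuming only the first two lines
rest = list(matches)   # the remaining matches, pulled lazily

print(first)
print(rest)
```

Because the generator yields each match as soon as it is found, a downstream consumer can act on the first result while most of the file is still unread.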

Disadvantages

  • Single-pass only. If you need to go back and re-read earlier lines, you have to seek back or reopen the file and scan again.
  • Complex multi-step operations. Sorting, joins, and deduplication are hard without holding everything in memory.
  • Slightly higher overhead. More per-line work (loop iterations, line splitting, buffer refills) than a single bulk read.
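Sorting is the classic case where streaming alone falls short. One well-known workaround is an external merge sort: sort bounded runs in memory, spill each run to a temporary file, then merge the runs lazily. The sketch below is one way to do that with the standard library; the function name external_sort and the max_lines_in_memory parameter are assumptions made for illustration.

```python
import heapq
import tempfile
from itertools import islice
from typing import Iterable, Iterator


def external_sort(lines: Iterable[str], max_lines_in_memory: int = 100_000) -> Iterator[str]:
    """Yield newline-terminated lines in sorted order, sorting one bounded run at a time."""
    # Normalize so every line ends with "\n"; otherwise lines would fuse in the run files
    normalized = (l if l.endswith("\n") else l + "\n" for l in lines)
    runs = []
    while True:
        run = sorted(islice(normalized, max_lines_in_memory))
        if not run:
            break
        tmp = tempfile.TemporaryFile("w+", encoding="utf-8")
        tmp.writelines(run)  # spill the sorted run to disk
        tmp.seek(0)
        runs.append(tmp)
    try:
        # heapq.merge reads each run lazily, keeping roughly one line per run in memory
        yield from heapq.merge(*runs)
    finally:
        for tmp in runs:
            tmp.close()


data = ["pear\n", "apple\n", "fig\n", "banana\n", "cherry"]
result = list(external_sort(data, max_lines_in_memory=2))
print(result)
```

After such a sort, duplicates sit next to each other, so deduplication becomes a single streaming pass over the merged output.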

Loading the Entire File into RAM

Read the file once into a data structure (list, array, string), then work with it freely.

Python example

# Loads the entire file into a list of strings (newlines stripped)
with open("transactions.csv", "r") as f:
    lines = f.read().splitlines()

# Now you can sort, slice, and scan multiple times
lines.sort()
# Start at index 1: at i == 0, lines[i - 1] would wrap around to the last line
for i in range(1, len(lines)):
    if lines[i] == lines[i - 1]:
        print(f"Duplicate found: {lines[i]}")

Go example

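// Imports needed: fmt, log, os, sort, strings
data, err := os.ReadFile("transactions.csv")
if err != nil {
    log.Fatal(err)
}
// Trim the final newline so Split does not produce a trailing empty line
lines := strings.Split(strings.TrimSuffix(string(data), "\n"), "\n")

sort.Strings(lines)
for i := 1; i < len(lines); i++ {
    if lines[i] == lines[i-1] {
        fmt.Printf("Duplicate: %s\n", lines[i])
    }
}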

Advantages

  • Random access. Jump to line 10,000 or scan backwards — no seeking required.
  • Multiple passes. Sort, filter, then scan again without re-opening the file.
  • Simpler code for operations that naturally need the whole dataset (sorting, aggregations, joins).
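  • Better CPU cache locality when the data is contiguous in RAM (a byte buffer or numeric array), so the CPU prefetcher works efficiently. A list of separately allocated strings or objects gets less of this benefit.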
  • Fewer system calls. One large read beats thousands of small ones.

Disadvantages

  • High RAM consumption. A 2 GB file may consume 6–8 GB of RAM after parsing into strings or objects due to per-object overhead.
  • Risk of OOM crashes. If the file grows beyond what RAM can hold, your process dies.
  • Higher startup latency. Processing can't begin until the full load completes.

The Memory Overhead Trap

A common mistake: assuming a 500 MB file uses 500 MB of RAM when loaded.

In Python, each string object carries metadata (type pointer, reference count, length, hash). A file with 10 million short lines can easily occupy several times the raw file size once loaded (3–5× is common, and very short lines push the ratio higher).

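import sys

# Build one million distinct 11-character strings.
# (["hello world"] * 1_000_000 would store one string object a million times over.)
lines = [f"line {i:06d}" for i in range(1_000_000)]

raw_bytes = sum(len(s) for s in lines)            # ~11 MB of text
list_bytes = sys.getsizeof(lines)                 # the list's pointer array only
str_bytes = sum(sys.getsizeof(s) for s in lines)  # header + data of every string object
print(f"raw text:  {raw_bytes / 1e6:.0f} MB")
print(f"in memory: {(list_bytes + str_bytes) / 1e6:.0f} MB")  # several times the raw size on 64-bit CPython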

When streaming, you pay this overhead for only one line at a time: each string is processed and then discarded.
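sys.getsizeof adds up object sizes but does not show what a real read costs. As a rough cross-check, the sketch below uses tracemalloc to compare the peak Python-level allocation of a full load against a streaming pass over the same temporary file. The helper names (peak_bytes, load_all, stream) and the 200,000-line test file are assumptions chosen for illustration, and tracemalloc only sees allocations made through Python's allocator.

```python
import os
import tempfile
import tracemalloc


def peak_bytes(fn, path):
    """Run fn(path) and return the peak traced allocation in bytes."""
    tracemalloc.start()
    fn(path)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


def load_all(path):
    with open(path) as f:
        return len(f.readlines())  # every line is alive at the same time


def stream(path):
    count = 0
    with open(path) as f:
        for _ in f:  # only the current line (plus the read buffer) is alive
            count += 1
    return count


# Write a throwaway file of 200,000 short lines
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.writelines(f"line {i:06d}\n" for i in range(200_000))
path = tmp.name

try:
    file_size = os.path.getsize(path)
    load_peak = peak_bytes(load_all, path)
    stream_peak = peak_bytes(stream, path)
finally:
    os.remove(path)

print(f"file on disk: {file_size / 1e6:.1f} MB")
print(f"full load:    {load_peak / 1e6:.1f} MB peak")
print(f"streaming:    {stream_peak / 1e6:.1f} MB peak")
```

Exact numbers vary by Python version and platform, so treat the output as a ratio to compare rather than a fixed result.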


Important Nuance: OS Page Cache

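Even when reading line-by-line, your program is not actually making one disk read per line. The language runtime reads through its own buffer, and modern operating systems read ahead in multiples of the page size (pages are commonly 4 KB, read-ahead windows often tens to hundreds of KB) and cache those pages in RAM.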

Your code → reads one line
OS        → fetches 64 KB from disk (or serves from page cache)
Your code → reads the next line
OS        → already in cache, no disk I/O

This means:

  • Line-by-line reading doesn't cause excessive disk seeks.
  • If you read the same file twice in a short window, the second read may be served entirely from the OS page cache — nearly free.
  • The gap in I/O performance between streaming and full-load is smaller than it appears, especially on warm cache.
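The page cache itself is hard to observe from a script, but the user-space half of the story is easy to check. The sketch below counts how many times Python's buffered reader asks the underlying raw file for data while iterating line by line. CountingFileIO is a hypothetical helper written for this demonstration, and the 64 KB buffer size is an arbitrary choice; the sketch measures the runtime's buffering, not the OS cache.

```python
import io
import os
import tempfile


class CountingFileIO(io.FileIO):
    """Raw file object that counts the read requests issued by the buffered layer."""

    def __init__(self, path):
        super().__init__(path, "r")
        self.read_calls = 0

    def readinto(self, buffer):
        self.read_calls += 1
        return super().readinto(buffer)


# Write a throwaway file of 50,000 short lines (about 600 KB)
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.writelines(b"line %06d\n" % i for i in range(50_000))
path = tmp.name

try:
    raw = CountingFileIO(path)
    with io.BufferedReader(raw, buffer_size=64 * 1024) as reader:
        line_count = sum(1 for _ in reader)  # plain line-by-line iteration
    read_calls = raw.read_calls
finally:
    os.remove(path)

print(f"lines read:        {line_count}")
print(f"raw read requests: {read_calls}")
```

The number of raw read requests should be far smaller than the number of lines, which is why per-line iteration does not translate into per-line system calls.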

Hybrid Approach: Chunk Reads

For very large files where line-by-line is too slow but full-load is too expensive, chunk reads offer a middle ground.

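CHUNK_SIZE = 64 * 1024  # 64 KB

with open("bigfile.bin", "rb") as f:
    # f.read() returns b"" at end of file, which ends the loop (":=" needs Python 3.8+)
    while chunk := f.read(CHUNK_SIZE):
        process(chunk)  # process() is a placeholder for your own handler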
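One catch with chunk reads on text data: chunk boundaries rarely line up with line boundaries, so a line can be split across two chunks. A minimal sketch of one way to handle that is to carry the unfinished tail into the next chunk. The function name iter_lines_chunked is an assumption, and io.BytesIO stands in for a file opened in binary mode.

```python
import io
from typing import BinaryIO, Iterator


def iter_lines_chunked(f: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield complete lines (without the newline) from fixed-size chunk reads."""
    tail = b""  # bytes after the last newline seen so far
    while chunk := f.read(chunk_size):
        parts = (tail + chunk).split(b"\n")
        tail = parts.pop()  # last piece may be an incomplete line
        yield from parts
    if tail:
        yield tail  # final line when the file does not end with a newline


# A tiny chunk size forces lines to straddle chunk boundaries
lines = list(iter_lines_chunked(io.BytesIO(b"alpha\nbeta\ngamma"), chunk_size=4))
print(lines)
```

For ordinary text files the built-in line iterator already does this for you; a hand-rolled version mainly pays off when you want control over chunk size or a custom record delimiter.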

Memory-mapped files (mmap) go further — the OS maps the file into the process's virtual address space and loads only the pages you actually touch:

import mmap

with open("largefile.dat", "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        # Seek and read arbitrary offsets without loading the whole file
        mm.seek(1_000_000)
        header = mm.read(128)

mmap is particularly useful for binary formats where you need random access but can't afford to load everything.
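As an illustration of that binary random-access case, the sketch below maps a file of fixed-size records and reads one record by index without touching the rest. The record layout (a little-endian uint32 id followed by a uint64 value) and the helper read_record are assumptions invented for this example.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: uint32 id + uint64 value, little-endian, 12 bytes
RECORD = struct.Struct("<IQ")


def read_record(mm, index):
    """Unpack record number `index` directly from the mapped file."""
    return RECORD.unpack_from(mm, index * RECORD.size)


# Write a throwaway file of 10,000 records
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    for i in range(10_000):
        tmp.write(RECORD.pack(i, i * i))
path = tmp.name

try:
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            record_count = len(mm) // RECORD.size
            sample = read_record(mm, 7_500)  # only the pages holding this record are touched
finally:
    os.remove(path)

print(record_count, sample)
```

With fixed-size records the offset is simple arithmetic, which is what makes mmap a good fit; variable-length records would need an index of offsets first.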


Rule of Thumb

Scenario                                        Approach
Log scanning, ETL pipelines, large CSVs         Stream line-by-line
Sorting, deduplication, multi-pass analytics    Load into RAM
File larger than ~25% of available RAM          Stream or chunk
Random access patterns on large files           mmap
Small files (<100 MB, low memory pressure)      Either works

High-performance systems often don't choose one or the other — they use buffered chunk reads to balance throughput and memory, or mmap to let the OS manage what actually stays in RAM.
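The table can be turned into a small decision helper. This is only a sketch of one possible reading: the function name choose_strategy, its parameters, and the order in which the rules are checked are assumptions, and available_ram is passed in by the caller because the standard library has no portable way to query it.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def choose_strategy(
    file_size: int,
    available_ram: int,
    needs_whole_file: bool = False,
    random_access: bool = False,
) -> str:
    """Map the rule-of-thumb table onto a single recommendation string."""
    small = file_size < 100 * MB
    fits_budget = file_size <= 0.25 * available_ram  # the ~25% threshold from the table

    if random_access and not small:
        return "mmap"
    if not fits_budget:
        return "stream or chunk"
    if needs_whole_file:  # sorting, deduplication, multi-pass analytics
        return "load into RAM"
    if small:
        return "either"
    return "stream"


print(choose_strategy(10 * MB, 8 * GB))
print(choose_strategy(50 * GB, 16 * GB))
print(choose_strategy(1 * GB, 16 * GB, needs_whole_file=True))
print(choose_strategy(20 * GB, 16 * GB, random_access=True))
```

The 25% budget leaves headroom for the per-object overhead described earlier; a workload that parses each line into rich objects may need a stricter threshold.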


Conclusion

Loading a file into RAM wins on simplicity and speed for small-to-medium datasets — random access, sorting, and multi-pass operations all become trivial. Streaming wins on memory efficiency and scalability, letting you process arbitrarily large files safely. The OS page cache narrows the I/O gap between them, but the RAM cost of full-load remains real and can be 3–5× the file size in practice. When in doubt, ask: does your algorithm need the whole file at once? If not, stream it.