84. Handling Large Data with Generators

Generators in Python are a great way to handle large datasets efficiently without loading everything into memory. They let you iterate over data one item at a time, keeping memory usage low, which is especially useful when working with large files, streamed responses, or unbounded sequences.

Here are 10 Python code snippets demonstrating various ways of handling large data with generators:

1. Basic Generator for Large Data

def large_range(start, end):
    for number in range(start, end):
        yield number

# Example usage
for num in large_range(1, 1000000):
    if num > 10:
        break
    print(num)

This generator yields numbers one at a time without materializing the whole sequence in memory. (In Python 3, range is itself lazy, so this snippet mainly illustrates the basic yield pattern.)


2. Reading Large Files Line by Line Using a Generator

def read_large_file(file_path):
    with open(file_path, 'r') as file:
        for line in file:
            yield line.strip()

# Example usage
for line in read_large_file('large_file.txt'):
    if 'keyword' in line:
        print(line)

This example reads a file line by line, yielding one line at a time, which is memory efficient for large files.


3. Using yield to Simulate a Chunked File Reader

This generator reads the file in chunks, allowing you to process large binary files in parts.
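A minimal sketch of such a reader (the chunk size and file name here are illustrative):

def read_in_chunks(file_path, chunk_size=1024):
    with open(file_path, 'rb') as file:
        while True:
            chunk = file.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            yield chunk

# Example usage
for chunk in read_in_chunks('large_file.bin'):
    print(len(chunk))  # process each chunk (up to 1 KB) as it arrives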


4. Filtering Data with Generators

This generator filters out even numbers from a large dataset.
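A minimal sketch of this filter, reading "filters out even numbers" as skipping the evens and yielding the odds:

def filter_out_evens(numbers):
    for number in numbers:
        if number % 2 != 0:  # keep only odd numbers
            yield number

# Example usage
for num in filter_out_evens(range(1, 11)):
    print(num)  # 1, 3, 5, 7, 9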


5. Generating Infinite Sequences

This example demonstrates an infinite generator that keeps yielding numbers indefinitely.
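A minimal sketch of such an infinite counter:

def infinite_numbers(start=0):
    number = start
    while True:  # never terminates; the caller decides when to stop
        yield number
        number += 1

# Example usage
counter = infinite_numbers()
for _ in range(5):
    print(next(counter))  # 0, 1, 2, 3, 4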


6. Working with Large Data from an API (Mocked Example)

This generator streams data from an API, yielding each line without storing the entire response in memory.
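Because the example is mocked, a sketch can fake the streamed response in-process; the URL is a placeholder, and a real client might iterate over requests.get(url, stream=True).iter_lines() instead:

def stream_api_data(url):
    # Mocked streaming response; a real implementation would read lines
    # from the network as they arrive rather than generating them.
    for i in range(1_000_000):
        yield f'{{"id": {i}, "source": "{url}"}}'

# Example usage
for line in stream_api_data('https://api.example.com/records'):
    print(line)
    if '"id": 4' in line:
        break  # stop early; the rest of the "response" is never generated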


7. Processing Large Logs with a Generator

This generator processes each log line, splitting it into parts, without storing the entire log file in memory.
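A minimal sketch, assuming whitespace-separated log fields and an illustrative file name:

def process_log_lines(log_path):
    with open(log_path, 'r') as log_file:
        for line in log_file:
            yield line.strip().split()  # one list of fields per log line

# Example usage
for parts in process_log_lines('server.log'):
    print(parts)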


8. Creating a Generator to Calculate Large Fibonacci Sequences

This generator produces an infinite Fibonacci sequence, one number at a time.
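A minimal sketch of the classic generator version:

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b  # advance to the next pair in the sequence

# Example usage
fib = fibonacci()
for _ in range(10):
    print(next(fib))  # 0, 1, 1, 2, 3, 5, 8, 13, 21, 34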


9. Using Generators for Lazy Data Transformation

This generator transforms data lazily, converting each string to uppercase as it's processed.
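A minimal sketch; the sample words are illustrative:

def to_uppercase(strings):
    for text in strings:
        yield text.upper()  # each string is transformed only when requested

# Example usage
words = ['lazy', 'data', 'transformation']
for word in to_uppercase(words):
    print(word)  # LAZY, DATA, TRANSFORMATION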


10. Generator with itertools for Efficient Data Processing

Using itertools.islice, we can efficiently work with large ranges and take slices from a generator.
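A minimal sketch pairing an unbounded generator with itertools.islice:

import itertools

def numbers():
    n = 0
    while True:
        yield n
        n += 1

# Example usage: take items 10 through 19 without generating anything beyond them
for num in itertools.islice(numbers(), 10, 20):
    print(num)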


These examples demonstrate how to use generators to handle large datasets in an efficient, memory-friendly manner. Whether you're reading large files, processing data from APIs, or working with infinite sequences, generators allow you to process data one item at a time, significantly reducing memory usage.
