Data Flow and Reliability in the Google File System (Part 3)

2025-06-01
Intro
Welcome back to our deep dive into the Google File System, or GFS. In the previous parts, we explored why GFS was needed and its fundamental architecture.
Today, we delve deeper into how GFS handles data through its read and write paths, and explore advanced features like its consistency model, data replication, recovery mechanisms, and atomic record appends.
Data Replication
Why Replication?
GFS runs on commodity hardware prone to failures, and losing data due to hardware failure is unacceptable.
To prevent this, GFS stores each chunk on multiple servers, typically using a replication factor of 3.
Benefits
- Fault tolerance
- High availability
- Load distribution
These replicas are leveraged during both reads and writes, which we’ll cover in the next sections.
Read Path Flow
In GFS, the master server controls the metadata, while the chunk servers handle actual data. Here’s how a read operation works:
- Client → Master: Client requests metadata about the chunk containing the required file block.
- Master → Client: Master responds with the chunk handle and locations of all its replicas.
- Client → Chunk Server: Client picks one of the chunk servers and reads the data directly.
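The three steps above can be sketched in Python. All names here (Master, ChunkServer, client_read) are illustrative, not the real GFS API; only the division of labor — metadata at the master, bytes at the chunk servers — comes from the design described above.

```python
# Minimal sketch of the GFS read path (illustrative names, not real GFS APIs).

CHUNK_SIZE = 64 * 1024 * 1024  # GFS uses fixed-size 64 MB chunks

class Master:
    """Holds only metadata: (file, chunk index) -> (chunk handle, replica list)."""
    def __init__(self, chunk_table):
        self.chunk_table = chunk_table  # {(filename, idx): (handle, [server names])}

    def lookup(self, filename, byte_offset):
        idx = byte_offset // CHUNK_SIZE
        return self.chunk_table[(filename, idx)]  # (handle, replicas)

class ChunkServer:
    """Holds the actual chunk bytes."""
    def __init__(self, name, chunks):
        self.name = name
        self.chunks = chunks  # {handle: bytes}

    def read(self, handle, offset, length):
        return self.chunks[handle][offset:offset + length]

def client_read(master, servers, filename, byte_offset, length):
    # 1. Client -> Master: translate (file, offset) into chunk handle + replicas
    handle, replicas = master.lookup(filename, byte_offset)
    # 2. Master -> Client: handle and replica locations returned above
    # 3. Client -> Chunk server: read directly from any replica (here: the first)
    server = servers[replicas[0]]
    return server.read(handle, byte_offset % CHUNK_SIZE, length)
```

Note that the master never touches file data; it only answers the lookup, which is what keeps it from becoming a bandwidth bottleneck.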

Optimizations
- Clients cache metadata responses from the master with a TTL.
- Clients request metadata for multiple chunks in a single call.
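The first optimization — caching metadata with a TTL — can be sketched as a small expiring map. The TTL value and class name are illustrative; the point is that a stale entry forces a fresh master lookup rather than being served forever.

```python
import time

# Sketch of client-side metadata caching with a TTL (values are illustrative).
class MetadataCache:
    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self.entries = {}  # key -> (value, expiry time)

    def get(self, key):
        hit = self.entries.get(key)
        if hit is None:
            return None
        value, expires = hit
        if time.monotonic() >= expires:
            del self.entries[key]  # stale: caller must re-ask the master
            return None
        return value

    def put(self, key, value):
        self.entries[key] = (value, time.monotonic() + self.ttl)
```

Combined with batched lookups, this keeps most reads from ever contacting the master.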
Analogy
Imagine going to a library and asking a head librarian for a book. Since the book has multiple copies, the librarian tells you several shelf locations. You then go to the nearest shelf and pick up your book.
Failure Scenario in Read Path
If one of the chunk servers is down, the client can simply fetch the chunk from another replica returned by the master. This fault tolerance is one of the key advantages of replication.
Write Path Flow & Atomic Record Appends
Writing data in GFS is more complex than reading, and for good reason. GFS must ensure all chunk replicas remain consistent, even with multiple clients writing concurrently. Let’s explore how GFS handles this using leases and a primary replica.
🔄 Step-by-step Write Flow
- Client → Master: The client sends a request to the master asking for the chunk’s metadata and lease information.
- Master assigns the lease: The master designates one primary chunk server for that chunk and returns the list of all replicas (e.g., CS1, CS2, CS4).
- Client → All replicas: The client pushes the data to all replicas (including the primary); each holds the data in memory and responds with an acknowledgement.
- Client → Primary replica: Once all acknowledgements are received, the client sends a commit request to the primary chunk server (say, CS2).
- Primary → All replicas: The primary applies the write and then instructs the other replicas to apply it in the same serial order, ensuring all replicas reach a consistent chunk state.
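The two-phase shape of this flow — push data to every replica’s memory first, commit through the primary second — can be sketched as follows. Class and method names are invented for illustration; the phases and their ordering are what the steps above describe.

```python
# Sketch of the GFS two-phase write: buffer everywhere, then commit via the
# primary. All names are illustrative, not real GFS APIs.

class Replica:
    def __init__(self, name):
        self.name = name
        self.buffer = {}   # data_id -> bytes held in memory, not yet applied
        self.chunk = b""   # committed chunk contents

    def push(self, data_id, data):
        self.buffer[data_id] = data
        return "ack"

    def apply(self, data_id):
        self.chunk += self.buffer.pop(data_id)

class Primary(Replica):
    def commit(self, data_id, secondaries):
        # Only the primary decides when (and in what order) data is applied.
        self.apply(data_id)
        for s in secondaries:
            s.apply(data_id)
        return "committed"

def client_write(primary, secondaries, data_id, data):
    # Phase 1: push the data to all replicas, primary included
    replicas = [primary, *secondaries]
    if not all(r.push(data_id, data) == "ack" for r in replicas):
        return "retry"  # a missing ack aborts the write before commit
    # Phase 2: ask the primary to commit
    return primary.commit(data_id, secondaries)
```

Separating data flow (phase 1) from control flow (phase 2) is the key design choice: the bulk bytes can take any network path, while ordering decisions stay centralized at the primary.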

🧩 Why There Are No Conflicting Writes
GFS also supports atomic record appends, which are especially useful when multiple clients write to the same file simultaneously.
You might wonder: If multiple clients write to the same chunk, won’t there be conflicting data?
The answer is no — and here’s why:
- Only one chunk server is the primary for a given chunk at any point in time.
- Even if multiple clients attempt to write to that chunk simultaneously, they all go through the same primary.
- The primary decides the final order of the writes and sends the same ordered instructions to the secondary replicas.
This ensures that all replicas apply writes in the exact same order, maintaining consistency and avoiding conflicts.
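The ordering argument above can be made concrete with a toy sketch: a single primary hands out sequence numbers, so no matter how many clients append concurrently, every replica ends up with the same record order. The class names are hypothetical.

```python
# Sketch: the primary serializes concurrent appends so every replica applies
# them in the same order. Illustrative only.

class Chunk:
    def __init__(self):
        self.records = []  # list of (sequence number, record bytes)

class PrimaryOrdering:
    def __init__(self, replicas):
        self.replicas = replicas  # Chunk objects; index 0 is the primary's copy
        self.next_seq = 0

    def append(self, record):
        seq = self.next_seq      # the single point that assigns write order
        self.next_seq += 1
        for chunk in self.replicas:
            chunk.records.append((seq, record))
        return seq
```

Because there is exactly one `next_seq` counter (one primary per chunk), two clients can never be assigned the same slot.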
🔥 Failure Scenarios in Write Flow
- If the primary chunk server goes down before commit: The commit coordination doesn’t happen. Since only the primary can assign the write order, the write fails, and the client must retry the operation after the old primary’s lease expires and the master grants the lease to a new primary.
- If a replica is down during the commit phase: The primary won’t receive acknowledgements from all replicas, so it reports the write as failed. The client gets an error and retries; replicas may be left temporarily inconsistent in that region until a retried write succeeds.
- If the client does not receive acknowledgements from all replicas during the data push (before commit): The client considers the write incomplete and does not proceed to commit. Since the primary never receives a commit request, the data is never applied. Chunk servers eventually discard the uncommitted buffered data, and the client retries the full write.
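All three failure scenarios resolve the same way: the client retries the whole write. A minimal sketch of that retry loop, with an invented helper name and arbitrary retry limits, might look like this:

```python
import time

# Sketch of the client-side retry loop implied by the failure scenarios above.
# do_write performs the full push + commit and raises on any failure.
def write_with_retry(do_write, max_attempts=3, backoff_s=0.0):
    last_error = None
    for _ in range(max_attempts):
        try:
            return do_write()
        except (ConnectionError, TimeoutError) as e:
            last_error = e
            time.sleep(backoff_s)  # e.g. wait for the master to grant a new lease
    raise RuntimeError("write failed after retries") from last_error
```

Pushing retries to the client keeps the servers simple: they only ever see complete, freshly initiated write attempts.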
Consistency Model
GFS follows an eventual consistency model. It’s possible that, temporarily:
- A chunk may be considered replicated by the master.
- But one of the replicas may be lost or corrupted due to hardware failure.
However, GFS is built to self-heal. Over time, it ensures that all chunks meet the desired replication level, restoring consistency automatically.
Recovery
Failure Detection
- Chunk servers send heartbeats to the master regularly.
- These heartbeats include metadata about the chunks they store.
Recovery Process
- If a chunk server crashes or reports corrupted chunks, the master initiates re-replication of the lost or corrupted chunks from healthy replicas onto other servers.
This ensures that even after failures, GFS can recover and continue serving requests without interruption.
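The master’s side of this recovery loop is essentially a comparison between the desired and actual replica counts. Here is a sketch of that planning step, with an assumed replication target of 3 (matching the factor mentioned earlier) and invented function names:

```python
# Sketch of heartbeat-driven re-replication planning (illustrative names).
REPLICATION_TARGET = 3

def plan_rereplication(chunk_locations, live_servers):
    """chunk_locations: {handle: [server, ...]} from heartbeat reports.
    Returns {handle: number of new replicas to create}."""
    plan = {}
    for handle, servers in chunk_locations.items():
        alive = [s for s in servers if s in live_servers]
        missing = REPLICATION_TARGET - len(alive)
        if missing > 0:
            plan[handle] = missing  # schedule this many copies on healthy servers
    return plan
```

Because heartbeats carry each server’s chunk list, the master can recompute this plan continuously and converge every chunk back to full replication.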
Outro
To recap, in this part, we explored advanced GFS features:
- Consistency Model: GFS ensures eventual consistency across chunk replicas.
- Data Replication: Chunks are stored on multiple servers for fault tolerance and availability.
- Recovery: GFS uses heartbeat-based detection and automatic replication to recover from failures.
- Write Flow & Atomic Appends: GFS ensures all replicas have a consistent view by designating a primary chunk server to order writes, even for concurrent client appends.
These features make GFS not just scalable and reliable, but also resilient enough to support Google’s massive data infrastructure.
Coming Up Next
In the next part of our series, we’ll explore how GFS manages data for performance and reliability using its unique data placement and load balancing strategies. Stay tuned!
References
https://storage.googleapis.com/gweb-research2023-media/pubtools/4446.pdf
GFS Other Parts
- https://medium.com/@shivamgor498/understanding-the-architecture-of-google-file-system-part-2-a65841727961
- https://medium.com/@shivamgor498/the-google-file-system-gfs-why-traditional-file-systems-werent-enough-d4963bbf4d3d
- https://medium.com/@shivamgor498/the-brain-behind-gfs-master-integrity-and-legacy-google-file-system-part-4-534836dc93e5
Other Blogs
https://medium.com/@shivamgor498/java-virtual-thread-ced98c382212