GFS Architecture
Google organized the GFS into clusters of computers. A cluster is simply a network of computers. Each cluster might contain hundreds or even thousands of machines. Within GFS clusters there are three kinds of entities: clients, master servers and chunkservers. In the world of GFS, the term "client" refers to any entity that makes a file request. Requests can range from retrieving and manipulating existing files to creating new files on the system. Clients can be other computers or computer applications. You can think of clients as the customers of the GFS.

The master server acts as the coordinator for the cluster. The master's duties include maintaining an operation log, which keeps track of the activities of the master's cluster. The operation log helps keep service interruptions to a minimum -- if the master server crashes, a replacement server that has monitored the operation log can take its place. The master server also keeps track of metadata, which is the information that describes chunks. The metadata tells the master server to which files the chunks belong and where they fit within the overall file. Upon startup, the master polls all the chunkservers in its cluster. The chunkservers respond by telling the master server the contents of their inventories. From that moment on, the master server keeps track of the location of chunks within the cluster.

There's only one active master server per cluster at any one time (though each cluster has multiple copies of the master server in case of a hardware failure). That might sound like a good recipe for a bottleneck -- after all, if there's only one machine coordinating a cluster of thousands of computers, wouldn't that cause data traffic jams? The GFS gets around this sticky situation by keeping the messages the master server sends and receives very small. The master server doesn't actually handle file data at all. It leaves that up to the chunkservers.

Chunkservers are the workhorses of the GFS. They're responsible for storing the 64-MB file chunks. The chunkservers don't send chunks to the master server. Instead, they send requested chunks directly to the client. The GFS copies every chunk multiple times and stores it on different chunkservers. Each copy is called a replica. By default, the GFS makes three replicas per chunk, but users can change the setting and make more or fewer replicas if desired.
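To make the replication default concrete, here is a minimal sketch in Python (all names invented; the real GFS placement policy also weighs disk utilization and spreads replicas across racks) of a master picking distinct chunkservers for a new chunk:

```python
import random

def place_replicas(chunkservers, replication_factor=3):
    """Pick distinct chunkservers to hold copies of a new chunk.

    A toy stand-in for GFS's real placement policy, which also weighs
    disk utilization and spreads replicas across racks.
    """
    if len(chunkservers) < replication_factor:
        raise ValueError("not enough chunkservers for requested replication")
    return random.sample(chunkservers, replication_factor)

servers = ["cs-01", "cs-02", "cs-03", "cs-04", "cs-05"]
print(place_replicas(servers))          # e.g. ['cs-03', 'cs-01', 'cs-05']
```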
Things to Remember
- Minimal state is stored persistently on the master; chunk-location state is gathered from the chunkservers in near real time (at startup and via heartbeats), allowing faults to be detected and addressed faster.
- Large chunk sizes are used, which fits the design goal of handling huge files. Communication between clients and the master is kept minimal: the master just directs clients to the appropriate chunkservers. This improves scalability.
- Chunk replication and chunk checksumming improve reliability and fault tolerance.
- While the operation log is maintained on the master, it is periodically checkpointed to improve performance and minimize replay overhead during recovery.
- The master is a single point of failure in this design. While master replication is available, the model doesn't look like a hot-standby one, so some downtime seems to exist.
- The consistency model looks flaky to me. A lot seems to be left to the applications to figure out.
- The filesystem will perform rather badly for small files, and if there's a huge volume of small files, metadata pressure on the master can build up.
- The directory structuring of filenames seems rather artificial given the flat namespace.
- A lazy strategy is adopted in many places to improve performance: in the operation log, in the chunk-to-chunkserver mapping, in garbage collection, and so on.
- A copy-on-write model is employed for snapshots. But the main technique is chunking and the distribution of chunks across the cluster.
- A major tradeoff is in selecting the chunk size. Larger chunks reduce metadata volume and improve master performance, whereas smaller chunks would have supported small files better (see the rough calculation below).
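To put rough numbers on that tradeoff (my own illustration; the GFS paper quotes less than 64 bytes of master metadata per chunk), compare the master's metadata footprint for one petabyte of data at two chunk sizes:

```python
PiB = 2**50
MiB = 2**20
META_PER_CHUNK = 64            # approx. bytes of master metadata per chunk

for chunk_mb in (64, 1):       # GFS default vs. a hypothetical small chunk
    chunks = (1 * PiB) // (chunk_mb * MiB)
    meta_mib = chunks * META_PER_CHUNK / MiB
    print(f"{chunk_mb:>2} MB chunks -> {chunks:,} chunks, "
          f"~{meta_mib:,.0f} MiB of master metadata")
# 64 MB chunks -> 16,777,216 chunks, ~1,024 MiB of master metadata
#  1 MB chunks -> 1,073,741,824 chunks, ~65,536 MiB of master metadata
```

With 64 MB chunks the metadata for a petabyte fits comfortably in one machine's RAM; at 1 MB chunks it would not.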

Architecture
On a single-machine file system:
- An upper layer maintains the metadata.
- A lower layer (i.e., the disk) stores data in units called blocks.
In GFS:
- A master process maintains the metadata.
- A lower layer (i.e., a set of chunkservers) stores data in units called "chunks".

- A GoogleFS cluster consists of a single master and multiple chunkservers.
- The basic entities of GFS are the master, clients, and chunkservers.
- A single GFS master and multiple GFS chunkservers are accessed by multiple GFS clients.
- GFS files are divided into fixed-size chunks (of 64 MB); see the sketch after this list.
- Each chunk is identified by a globally unique 64-bit "chunk handle".
- Chunkservers store chunks on local disks as Linux files.
- The master maintains all file system metadata.
- For reliability, each chunk is replicated on multiple chunkservers (three by default).
- Clients interact with the master for metadata operations only.
- Chunkservers need not cache file data.
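As a small illustration of the fixed-size chunking mentioned above (function name invented), the client-side translation from a byte offset to a chunk index is simple integer arithmetic:

```python
CHUNK_SIZE = 64 * 2**20        # the fixed 64 MB GFS chunk size

def to_chunk_coords(offset: int) -> tuple[int, int]:
    """Translate a file byte offset into (chunk index, offset within chunk)."""
    return offset // CHUNK_SIZE, offset % CHUNK_SIZE

# A read at byte 200,000,000 of a file falls in chunk 2 (the third chunk):
print(to_chunk_coords(200_000_000))    # (2, 65782272)
```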

Master
A single process, running on a separate machine, that maintains all file system metadata:
- namespace, access control, chunk mapping and locations (files -> chunks -> replicas)
It exchanges periodic heartbeat messages with the chunkservers:
- instructions + state monitoring; see the sketch below
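A toy sketch of that heartbeat monitoring (the names and the 60-second timeout are my assumptions, not values from the paper): the master tracks when each chunkserver last replied and presumes silent servers dead, so their chunks can be re-replicated.

```python
import time

def heartbeat_round(last_reply, timeout_s=60.0, now=time.monotonic):
    """One monitoring pass over the chunkservers.

    `last_reply` maps chunkserver id -> timestamp of its last heartbeat
    reply. Servers silent for longer than `timeout_s` are presumed dead,
    so the master can re-replicate the chunks they held.
    """
    cutoff = now() - timeout_s
    alive = [cs for cs, t in last_reply.items() if t >= cutoff]
    dead = [cs for cs, t in last_reply.items() if t < cutoff]
    return alive, dead

replies = {"cs-01": time.monotonic(), "cs-02": time.monotonic() - 300}
print(heartbeat_round(replies))        # (['cs-01'], ['cs-02'])
```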
Client
Contacts the master to get the metadata needed to reach the chunkservers (control plane)
Contacts the chunkservers for read/write operations (data plane)
Chunk
- Similar to the concept of a block in ordinary file systems.
- Compared to typical file system blocks, a chunk is large: 64 MB.
- Fewer chunks mean less chunk metadata in the master.
- Chunks are stored on chunkservers as files, named by their chunk handle.
- A chunk file is extended only as needed, which avoids wasted space (no internal fragmentation); see the sketch below.
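A sketch of that storage convention (the path layout and file naming here are my own invention): each chunk is an ordinary local file named by its handle, grown only as data is actually written.

```python
import os

def write_into_chunk(storage_dir: str, handle: int, offset: int, data: bytes):
    """Write bytes into the local Linux file backing one chunk.

    The file is named after the 64-bit chunk handle and is created and
    extended only as writes arrive, so an almost-empty chunk costs
    almost no disk space (lazy allocation).
    """
    path = os.path.join(storage_dir, f"{handle:016x}.chunk")
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        f.seek(offset)
        f.write(data)

write_into_chunk("/tmp", 0x2EF0, 0, b"first bytes of the chunk")
```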
Why a large chunk size?
Advantages
- Reduces client interaction with the GFS master for chunk-location information
- Reduces the size of the metadata stored on the master (which is kept entirely in memory)
- Reduces network overhead by keeping a persistent TCP connection to the chunkserver over an extended period of time
Disadvantages
- A chunk holding a small but popular file can become a hotspot, since many clients hit the same few chunkservers
Metadata
The master stores three types of metadata (sketched below):
- the file and chunk namespaces (in memory + operation log)
- the mapping from files to chunks (in memory + operation log)
- the location of each chunk's replicas (in memory only)
The first two types are kept persistent via an operation log stored on the master's local disk.
Because all metadata is kept in the GFS master's memory (i.e., RAM), master operations are fast, and it is easy and efficient for the master to scan its entire state periodically.
- Periodic scanning is used to implement chunk garbage collection, re-replication, and chunk migration.
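The three metadata types map naturally onto a few in-memory structures. A minimal sketch (field names invented), highlighting that only the first two are ever written to the operation log:

```python
from dataclasses import dataclass, field

@dataclass
class MasterMetadata:
    """Toy layout of the master's in-memory state (field names invented)."""
    # 1. File and chunk namespaces -- persisted via the operation log.
    namespace: set[str] = field(default_factory=set)
    # 2. File -> ordered list of chunk handles -- also persisted via the log.
    file_chunks: dict[str, list[int]] = field(default_factory=dict)
    # 3. Chunk handle -> chunkserver ids -- in memory only; rebuilt by
    #    polling the chunkservers, never written to the log.
    locations: dict[int, list[str]] = field(default_factory=dict)

meta = MasterMetadata()
meta.namespace.add("/logs/web.0")
meta.file_chunks["/logs/web.0"] = [0xABC]
meta.locations[0xABC] = ["cs-01", "cs-02", "cs-03"]
```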
Operation Log
- The logical timeline that defines the order of concurrent operations
- Contains only metadata (namespaces + chunk mapping)
- Is kept on the GFS master's local disk and replicated on remote machines
- Has no persistent record of chunk locations (the master polls chunkservers at startup)
- The GFS master checkpoints its state (in a compact B-tree form) whenever the log grows beyond a certain size; see the sketch below
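A compressed sketch of that discipline (the class and the threshold are my assumptions; real GFS checkpoints in a compact B-tree-like form and replicates the log remotely before acknowledging clients). Mutations are logged before being applied, and a checkpoint lets recovery replay only the short log tail:

```python
import json

class OperationLog:
    """Toy write-ahead log with size-triggered checkpointing."""

    def __init__(self, checkpoint_threshold=4096):
        self.entries = []                       # stand-in for the on-disk log
        self.checkpoint_threshold = checkpoint_threshold
        self.snapshot = {}                      # stand-in for the B-tree dump

    def record(self, op: dict, state: dict):
        # Log the mutation first, then apply it to in-memory state.
        self.entries.append(json.dumps(op))
        state[op["path"]] = op["chunks"]
        # When the log grows past the threshold, checkpoint and truncate,
        # so recovery only replays the short log tail.
        if len(self.entries) >= self.checkpoint_threshold:
            self.snapshot = dict(state)
            self.entries.clear()

log, state = OperationLog(), {}
log.record({"path": "/logs/web.0", "chunks": [0xABC]}, state)
```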
Read Algorithm
- The application issues a read request.
- The GFS client translates the request from (filename, byte range) to (filename, chunk index) and sends it to the master.
- The master responds with the chunk handle and replica locations (i.e., the chunkservers where replicas are stored).
- The client picks a location and sends a (chunk handle, byte range) request to that chunkserver.
- The chunkserver sends the requested data to the client.
- The client forwards the data to the application. (The toy code below walks through these steps.)
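The whole read path, compressed into a runnable toy (every class and method name here is invented; real GFS clients also cache chunk locations and pick the nearest replica). Note the master only ever answers metadata lookups:

```python
CHUNK_SIZE = 64 * 2**20

class Chunkserver:
    """Toy replica holding raw chunk bytes keyed by chunk handle."""
    def __init__(self):
        self.chunks: dict[int, bytes] = {}
    def read(self, handle, offset, length):
        return self.chunks[handle][offset:offset + length]

class Master:
    """Toy metadata-only master; it never touches file data."""
    def __init__(self):
        self.file_chunks: dict[str, list[int]] = {}
        self.locations: dict[int, list[Chunkserver]] = {}
    def lookup(self, filename, chunk_index):
        handle = self.file_chunks[filename][chunk_index]
        return handle, self.locations[handle]

def gfs_read(master, filename, offset, length):
    # Steps 1-3: translate the byte range to a chunk index, ask the master.
    handle, replicas = master.lookup(filename, offset // CHUNK_SIZE)
    # Steps 4-6: fetch the bytes directly from one replica.
    return replicas[0].read(handle, offset % CHUNK_SIZE, length)

cs, m = Chunkserver(), Master()
cs.chunks[0xABC] = b"hello from chunk 0"
m.file_chunks["/logs/web.0"] = [0xABC]
m.locations[0xABC] = [cs]
print(gfs_read(m, "/logs/web.0", 0, 5))    # b'hello'
```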

Write Algorithm
- The application issues the write request.
- The GFS client translates the request from (filename, data) to (filename, chunk index) and sends it to the master.
- The master responds with the chunk handle and (primary + secondary) replica locations.
- The client pushes the write data to all locations.
- The data is stored in the chunkservers' internal buffers.
- The client sends the write command to the primary.
- The primary determines the serial order for the data instances stored in its buffer and writes them in that order to the chunk.
- The primary sends that order to the secondaries and tells them to perform the write; the secondaries respond to the primary.
- The primary responds back to the client.
- If the write fails at one of the chunkservers, the client is informed and retries the write. (See the toy code below.)
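The same flow as runnable toy code (all names invented; real GFS pipelines the data push along a chain of chunkservers and grants the primary its role via a lease from the master). The `serial` value stands in for the primary-chosen ordering that all replicas follow:

```python
class Replica:
    """Toy chunk replica: a staging buffer plus the chunk contents."""
    def __init__(self):
        self.chunk = bytearray()
        self.staged = None
        self.serial = 0

    def push(self, data):                  # steps 4-5: buffer pushed data
        self.staged = data

    def apply(self, offset, serial):       # steps 7-8: write in serial order
        end = offset + len(self.staged)
        if len(self.chunk) < end:
            self.chunk.extend(b"\x00" * (end - len(self.chunk)))
        self.chunk[offset:end] = self.staged
        self.staged = None
        return True

def gfs_write(primary, secondaries, offset, data):
    for r in [primary] + secondaries:      # push data to every replica
        r.push(data)
    primary.serial += 1                    # primary picks the serial order
    primary.apply(offset, primary.serial)  # ... and applies the write
    acks = [s.apply(offset, primary.serial) for s in secondaries]
    if not all(acks):                      # any failure -> client retries
        raise RuntimeError("write failed at a replica; retry")

p, s1, s2 = Replica(), Replica(), Replica()
gfs_write(p, [s1, s2], 0, b"ordered bytes")
print(bytes(p.chunk) == bytes(s1.chunk) == bytes(s2.chunk))  # True
```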

Record Append Algorithm
Record append is an important operation at Google, for example:
- using a file as a producer-consumer queue
- merging results from multiple machines into one file
The steps:
- The application issues a record append request.
- The GFS client translates the request and sends it to the master.
- The master responds with the chunk handle and (primary + secondary) replica locations.
- The client pushes the record data to all locations; the primary then checks whether the record fits in the specified chunk.
- If the record does not fit, the primary:
  - pads the chunk,
  - tells the secondaries to do the same,
  - and informs the client.
- The client then retries the append with the next chunk.
- If the record fits, the primary:
  - appends the record,
  - tells the secondaries to do the same,
  - receives responses from the secondaries,
  - and sends the final response to the client. (See the sketch below.)
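The fit-or-pad decision is the subtle part; here is a toy sketch of the primary's side (names invented, the 64 MB limit taken from the chunk size above):

```python
CHUNK_SIZE = 64 * 2**20

def primary_record_append(chunk: bytearray, record: bytes):
    """Toy primary-side decision for record append.

    Returns the offset the record landed at, or None when the chunk was
    padded out and the client must retry on the next chunk.
    """
    used = len(chunk)
    if used + len(record) > CHUNK_SIZE:
        # Does not fit: pad to the chunk boundary (secondaries are told
        # to do the same) and make the client retry on a fresh chunk.
        chunk.extend(b"\x00" * (CHUNK_SIZE - used))
        return None
    chunk.extend(record)                   # fits: append at the end
    return used

chunk = bytearray(b"existing records...")
print(primary_record_append(chunk, b"|new record"))   # 19
```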
Consistency Model
- Write: data is written at an application-specified offset.
- Record append: data is appended atomically, at least once, at an offset of GFS's choosing (a regular append is a write at the offset the client believes to be the end of file).
- GFS:
  - applies mutations to a chunk in the same order on all replicas
  - uses chunk version numbers to detect stale replicas; see the sketch below
  - stale replicas are garbage collected and are updated the next time they contact the master
- Additional safeguards: regular handshakes between the master and chunkservers, and checksumming.
- Data is lost only if all replicas are lost before GFS can react.
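A sketch of the version-number check (the structure is invented): the master bumps a chunk's version whenever it grants a new mutation lease, so any replica reporting an older version must have missed writes:

```python
def find_stale_replicas(master_versions, replica_reports):
    """Flag replicas whose chunk version lags the master's record.

    `master_versions`: chunk handle -> latest version the master granted.
    `replica_reports`: (handle, chunkserver id) -> version the replica
    reported in its last heartbeat. Stale entries are later garbage
    collected and re-replicated elsewhere.
    """
    return [(h, cs)
            for (h, cs), v in replica_reports.items()
            if v < master_versions.get(h, 0)]

reports = {(7, "cs-01"): 3, (7, "cs-02"): 2}    # cs-02 missed a mutation
print(find_stale_replicas({7: 3}, reports))     # [(7, 'cs-02')]
```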

References:
- "Google FIlesystem : Google Research" , S Ghemawat , 2003
- "Distributed file system", Books,LLC
- "How google file system works?" at computer.howstuffswork.com
- "Colossus: Successor to the Google FIle System(GFS) at www.systutorials.com