# DDIA - Chapter 9 - Consistency and Consensus - Thoughts and notes

*Disclaimer: this blog has been written with the help of LLM, please do not hate me*

I never imagined that consensus would be such a difficult problem, in a democracy one has consensus right, it is so easy. Even if there are multiple failure points the leader gets elected.

Anyway, this chapter speaks a lot about linearisability, consistency, consensus, how to be fault tolerant during consensus. Thing is that the tradeoff with making the system consistent is performance. There wasn’t much to discover in this chapter, I did find it a bit boring. The basic idea while building a distributed system should be that time is linear. I’d always use epochs.

# Linearizability: Core Concepts

## Definition

* Makes a distributed system appear as a single replica without replication lag
    
* Ensures all clients see the most recent value once it's written
    
* Requires consistent data across all replicas at the same time
    

## Key Characteristics

* Strict consistency model
    
* Once a client sees a new value, all other clients must see that same new value
    
* Critical for scenarios like:
    
    * Leader election in single-leader systems
        
    * Choosing unique usernames
        
    * Seat booking systems
        

## Trade-Offs

* Performance-intensive
    
* Often sacrificed for availability
    
* Costly in multi-data center environments
    
* Not many systems provide true linearizability
    

## Limitations

* High performance overhead
    
* Difficult to implement across distributed systems
    
* Network delays can significantly impact implementation
    

# Distributed Transactions

## Types

1. Database-internal transactions
    
    * Used within same database system
        
    * Easier to implement
        
2. Heterogeneous distributed transactions
    
    * Involve multiple, different systems
        
    * Focus on ensuring atomic commits
        

## Key Mechanism: Two-Phase Commit (2PC)

* Two phases: Prepare and Commit
    
* Transaction coordinator manages process
    
* Ensures all nodes commit or abort together
    

## Challenges

* Performance bottlenecks
    
* Complexity of coordinating across systems
    
* Potential for system-wide blocking if coordinator fails
    

# Consensus

## Definition

Getting multiple nodes to agree on a value with specific properties:

* Uniform agreement
    
* Integrity
    
* Validity
    
* Termination
    

## Key Characteristics

* Requires majority of nodes to be functioning
    
* Uses epoch numbering for leader selection
    
* Implements total order broadcast
    

## Limitations

* Requires strict majority of nodes
    
* Difficult to add/remove nodes dynamically
    
* Relies on timeout mechanisms
    
* Performance can be impacted by frequent leader elections
    

# Coordination Services (e.g., Zookeeper, etcd)

## Features

* Linearizable atomic operations
    
* Total ordering of operations
    
* Failure detection
    
* Change notifications
    
* Service discovery
