Course:CICS525/Notes/CaseStudy-1
Dynamo: Overview
- This is basically a database
- But not your conventional database
- Conventional (relational) database:
- Data organized in tables
- Primary and secondary keys
- Tables sorted by primary/secondary keys
- Designed to answer any imaginable query
- Does not scale to thousands of nodes
- Difficult to replicate
- Amazon’s Dynamo
- Access by primary key only
ACID properties
- Atomicity – yes
- Updates are atomic by definition
- There are no transactions
- Consistency – no
- Data is eventually consistent
- Loose consistency is tolerated
- Reconciliation is performed by the client
- Isolation
- No isolation – one update at a time
- Durability – yes
- Durability is provided via replication
Availability
- Good service time is key for Amazon
- Not good when a credit card transaction times out
- Service-level agreement: each client request must be answered within 300 ms
- Must meet this for 99.9% of requests at a load of 500 requests/second
Service-level agreements and their implications
- Loose consistency
- Replicas cannot be reconciled synchronously while a request is being served
- Contact a few replicas; if some do not reply, the request is considered failed
- When to resolve conflicting updates? During reads or during writes?
- Conventionally, conflicts are resolved during writes
- Dynamo resolves them during reads
- Motivation: must have an always-writable data store (cannot lose customer shopping cart data)
System interface
get(key)
- Locate object replicas
- Return:
- A single object
- A list of objects with conflicting versions
- Context (opaque information about object versioning)
put(key, value, context)
- Determines where the replicas should be placed
- Writes them to disk
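- A minimal sketch of this interface in Python (the names DynamoStore and VersionedValue and the dictionary-based context are illustrative assumptions, not Amazon's actual API):
 # Illustrative Dynamo-style key-value interface (single-process stand-in).
 from dataclasses import dataclass

 @dataclass
 class VersionedValue:
     value: bytes
     context: dict                      # opaque versioning info (e.g. a vector clock)

 class DynamoStore:
     def __init__(self):
         self._data = {}                # key -> list of (possibly conflicting) versions

     def get(self, key):
         """Return all versions of the object plus an opaque context."""
         versions = self._data.get(key, [])
         context = {"clocks": [v.context for v in versions]}
         return versions, context

     def put(self, key, value, context):
         """Write a new version; the context ties it to the versions it supersedes."""
         self._data.setdefault(key, []).append(VersionedValue(value, context))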
Architectural considerations
- Partitioning
- Replication
- Versioning
- Membership
- Failure Handling
- Scaling
Partitioning
Consistent hashing
- How to partition data among nodes?
- Use consistent hashing
- Output of the hash maps to a circular space
- The largest hash value wraps to the smallest hash value
- Each node is assigned a random value in the space
- This represents its position in the ring
Assigning a key to a node
- Hash the key
- Find the node with the corresponding ring position
- Walk the ring clockwise to find the first node with a position greater than that of the key
- Similar search algorithms are used in distributed hash tables
Problems with consistent hashing
- May lead to imbalance in load distribution
- Solution:
- Each node is a virtual node
- Assign multiple virtual nodes to one physical node
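- A small sketch of a consistent-hash ring with virtual nodes; the hash function and the number of tokens per physical node are arbitrary choices for illustration:
 # Consistent hashing sketch: virtual nodes on a ring, clockwise lookup.
 import bisect
 import hashlib

 def ring_position(s: str) -> int:
     # Map any string onto the circular hash space.
     return int(hashlib.md5(s.encode()).hexdigest(), 16)

 class Ring:
     def __init__(self, physical_nodes, tokens_per_node=3):
         self._ring = []                        # sorted list of (position, physical node)
         for node in physical_nodes:
             for t in range(tokens_per_node):   # each token is one virtual node
                 self._ring.append((ring_position(f"{node}#vn{t}"), node))
         self._ring.sort()

     def node_for_key(self, key: str) -> str:
         pos = ring_position(key)
         # Walk clockwise: first virtual node at or after the key's position,
         # wrapping around to the smallest position if we fall off the end.
         i = bisect.bisect_left(self._ring, (pos,)) % len(self._ring)
         return self._ring[i][1]

 ring = Ring(["A", "B", "C"])
 print(ring.node_for_key("shopping-cart:42"))   # prints whichever node owns the key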
Replication
- Each key has a coordinator (the node determined by the hash of the key)
- The coordinator replicates the data at N other nodes
- These are the N nodes that follow the coordinator in the ring in the clockwise direction
- Virtual nodes are skipped to ensure that replicas are located on different physical nodes
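- A sketch of how such a list of replicas (a preference list) might be built on the ring; the example ring positions are made up:
 # Preference-list sketch: walk the ring clockwise from the key's position and
 # collect N distinct physical nodes, skipping extra virtual nodes that map to a
 # physical node already chosen.
 import bisect

 def preference_list(ring, key_pos, n):
     """ring: sorted list of (position, physical_node); returns n distinct nodes."""
     start = bisect.bisect_left(ring, (key_pos,))
     nodes, i = [], start
     while len(nodes) < n and i < start + len(ring):
         node = ring[i % len(ring)][1]
         if node not in nodes:              # skip virtual nodes on the same host
             nodes.append(node)
         i += 1
     return nodes                           # nodes[0] is the coordinator

 # Tiny example ring: two virtual nodes each for physical nodes A, B, C.
 ring = [(10, "A"), (40, "B"), (55, "A"), (70, "C"), (90, "B"), (95, "C")]
 print(preference_list(ring, key_pos=50, n=3))   # ['A', 'C', 'B']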
Versioning
- Dynamo stores multiple versions of each data item
- Each update creates a new immutable version of the data item
- Versions are reconciled
- By the system
- By the client
- Versioning is achieved using vector clocks
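- A minimal vector-clock sketch; the node names Sx, Sy, Sz follow the paper's example, and the dictionary representation is just one possible encoding:
 # Vector clocks: each update by a node increments that node's counter.
 # Two versions conflict when neither clock descends from the other.
 def descends(a: dict, b: dict) -> bool:
     """True if clock a is equal to or newer than clock b for every node."""
     return all(a.get(node, 0) >= count for node, count in b.items())

 def increment(clock: dict, node: str) -> dict:
     new = dict(clock)
     new[node] = new.get(node, 0) + 1
     return new

 v1 = increment({}, "Sx")       # {"Sx": 1}
 v2 = increment(v1, "Sy")       # {"Sx": 1, "Sy": 1}  -- descends from v1
 v3 = increment(v1, "Sz")       # {"Sx": 1, "Sz": 1}  -- concurrent with v2
 print(descends(v2, v1))                         # True: v2 supersedes v1
 print(descends(v2, v3) or descends(v3, v2))     # False: conflict, must be reconciled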
Request routing
- Through a generic load balancer
- May forward the request to a node that is NOT the coordinator
- The recipient node will forward the request to the coordinator
- Through a partition-aware client library that directly selects a coordinator
Consistency
Quorums
- Dynamo is configured with two parameters: R and W
- R is the minimum number of nodes that must participate in a successful read operation
- W is the minimum number of nodes that must participate in a successful write operation
- Request handling protocol (for writes):
- Coordinator receives request
- Coordinator computes vector clock and writes new version to disk
- Coordinator sends the new version and vector clock to the N replicas
- If at least W-1 respond, the request is successful
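- A sketch of this write path; send_to_replica is a hypothetical stand-in for the RPC that ships the new version to a replica, and the data layout is illustrative:
 # Write-path sketch with an (N, R, W) configuration.
 def send_to_replica(replica, version):
     return True                            # stand-in: pretend every replica acknowledged

 def handle_put(replicas, new_version, w):
     """replicas: the other nodes holding the key; returns True on success."""
     acks = 1                               # the coordinator's own local write
     for replica in replicas:
         if send_to_replica(replica, new_version):
             acks += 1
         if acks >= w:                      # i.e. at least W-1 replicas responded
             return True
     return acks >= w

 new_version = {"key": "cart:42", "value": b"...", "clock": {"A": 1}}
 print(handle_put(["B", "C"], new_version, w=2))   # True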
Sloppy quorum
- What if some of the N replicas are temporarily unavailable?
- This could limit the system's availability
- Cannot use strict quorum
- Use sloppy quorum
- If one of the N replicas is unavailable, use another node that is not a replica
- That node will temporarily store the data
- Will forward it to the real replica when the replica is back up
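- A sketch of choosing write targets under a sloppy quorum; the node names and the simple up/down map are illustrative:
 # Sloppy-quorum sketch: if a preferred replica is down, write to the next spare
 # node instead, tagged with a "hint" naming the replica the data really belongs
 # to, so it can be handed back when that replica recovers.
 def choose_targets(preference_list, spare_nodes, is_up):
     """Return (node, hint) pairs; hint is None when the node is a real replica."""
     targets, spares = [], list(spare_nodes)
     for node in preference_list:
         if is_up(node):
             targets.append((node, None))
         elif spares:
             targets.append((spares.pop(0), node))   # temporary holder + hint
     return targets

 up = {"A": True, "B": False, "C": True}
 print(choose_targets(["A", "B", "C"], ["D", "E"], lambda n: up.get(n, True)))
 # [('A', None), ('D', 'B'), ('C', None)] -- D holds B's data until B is back up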
Synchronizing replicas
- Uses Merkle trees
- Leaves are hashes of the values of individual keys
- Can compare trees incrementally, without transferring the whole tree
- If a part of the tree is not modified, the corresponding parent nodes' hashes will be identical
- So parts of the tree can be compared without sending data between two replicas
- Only keys that are out of sync are transferred
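- A small Merkle-tree sketch (fixed number of leaves, SHA-256 chosen arbitrarily); a real implementation would walk the two trees top-down, but the idea of locating differing keys by comparing hashes is the same:
 # Merkle-tree sketch: leaves are hashes of the values of individual keys,
 # parents are hashes of their children, and replicas compare trees to find the
 # few keys that actually differ.
 import hashlib

 def h(data: bytes) -> bytes:
     return hashlib.sha256(data).digest()

 def build_tree(leaf_hashes):
     """Return the tree as a list of levels, from the leaves up to the root."""
     levels = [leaf_hashes]
     while len(levels[-1]) > 1:
         prev = levels[-1]
         levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
     return levels

 def differing_leaves(a, b):
     """Leaf indices whose hashes differ (a real system recurses from the root)."""
     return [i for i, (x, y) in enumerate(zip(a[0], b[0])) if x != y]

 replica1 = build_tree([h(v) for v in [b"v1", b"v2", b"v3", b"v4"]])
 replica2 = build_tree([h(v) for v in [b"v1", b"vX", b"v3", b"v4"]])
 print(replica1[-1] == replica2[-1])          # False: roots differ, some key is out of sync
 print(differing_leaves(replica1, replica2))  # [1]: only key 1 needs to be transferred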
Group management
Membership
- Membership is always explicit
- Nodes are added/removed by the operator
- So there is no need for “coordinator election”
- If a node is unavailable, this is considered temporary
- A node that starts up chooses a set of tokens (virtual nodes) and maps virtual nodes to physical nodes
- Membership information is eventually propagated via gossip protocol
- Mapping is made persistent on disk
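- A sketch of gossip-style propagation of the membership view; the (version, status) entries and the number of rounds are illustrative:
 # Gossip sketch: each node keeps a versioned membership view and periodically
 # reconciles it with a randomly chosen peer, keeping the newer entry for each node.
 import random

 def reconcile(view_a, view_b):
     """Merge two membership views entry by entry, keeping the higher version."""
     merged = dict(view_a)
     for node, (version, status) in view_b.items():
         if node not in merged or merged[node][0] < version:
             merged[node] = (version, status)
     return merged

 def gossip_round(views):
     """Every node exchanges and merges its view with one random peer."""
     for node in list(views):
         peer = random.choice([n for n in views if n != node])
         merged = reconcile(views[node], views[peer])
         views[node] = views[peer] = merged

 views = {
     "A": {"A": (1, "up"), "B": (1, "up")},
     "B": {"A": (1, "up"), "B": (2, "up"), "C": (1, "up")},   # B has learned about C
     "C": {"C": (1, "up")},
 }
 for _ in range(3):
     gossip_round(views)
 print(all(set(v) == {"A", "B", "C"} for v in views.values()))   # True: views converge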
Preventing logical partitions
- A new node may be unaware of other nodes until membership information has propagated
- If several such nodes are added simultaneously, we may have a logical partition
- Partitions are prevented using seed nodes
- Seed nodes are obtained from a static source, and they are known to everyone
- Memberships are propagated to everyone via seed nodes
Failure detection
- Failure discovery is local
- Node A discovers that Node B has failed if Node B does not respond
- Failures (like memberships) are propagated via gossip protocol
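- A tiny sketch of local failure detection; a plain TCP connect with a timeout stands in for Dynamo's actual inter-node messages:
 # Local failure detection sketch: node A considers node B failed if B does not
 # answer within a timeout. Detected failures are then spread by the same gossip
 # mechanism as membership changes.
 import socket

 def peer_responds(host: str, port: int, timeout: float = 0.3) -> bool:
     try:
         with socket.create_connection((host, port), timeout=timeout):
             return True
     except OSError:                 # timed out, refused, unreachable, ...
         return False

 print(peer_responds("127.0.0.1", 65000))   # False unless something is listening there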
References
- Eventual consistency
- Amazon's Dynamo
- Google File System (original article that appeared in SOSP 2003)
- GFS
- GFS evolution
- Programming futures for cloud computing (extracted from a talk by Joe Hellerstein)