Course:CICS525/Notes/Consistency-1
Introduction
- Consistency
- Consistency models
- Strict consistency
- Sequential consistency
Replicated data is a huge theme in distributed systems
- For performance and fault tolerance
- Often easier to replicate data than computation
- Examples:
- Caching of web pages (Web browser)
- memcached for web servers
- All these examples involve optimizations for performance
- How do we know if an optimization is correct?
- We need to know how to think about correct execution of distributed programs.
- Most of these ideas come from multiprocessor and database research 20-30 years ago.
- For now, just correctness and efficiency, not fault-tolerance.
- Replicating content that isn't modified (or has a single writer) is "simple"
- See, for example, Web browser
- Browser obeys HTTP expiration directive and last-modified
- More interesting with multiple writers!
Naive distributed memory
- (diagram that you can draw)
- CPU0, CPU1, CPU2, network
- each host has a local copy of all of memory
- read: from local memory
- write: send update msg to each other host (but don't wait)
- fast: never waits for communication
- Does this memory work well?
- Example 1:
  CPU0:
    v0 = f0();
    done0 = true;
  CPU1:
    while(done0 == false)
      ;
    v1 = f1(v0);
    done1 = true;
  CPU2:
    while(done1 == false)
      ;
    v2 = f2(v0, v1);
- Intuitive intent:
- CPU2 should execute f2() with results from CPU0 and CPU1
- waiting for CPU1 implies waiting for CPU0
- Example 1 won't work with naive distributed memory:
- Problem A:
  [time diagram]
  CPU0's update messages for v0 and done0 may be reordered by the network,
  leaving CPU1 with done0=true but v0 still unset
- how to fix this problem?
- Problem B:
  [time diagram]
  CPU2 sees CPU1's writes before CPU0's writes,
  i.e. CPU2 and CPU1 disagree on the order of CPU0's and CPU1's writes
- how to fix?
- Naive distributed memory is fast but has unexpected behavior
- maybe it isn't "correct"
- maybe we should never have expected Example 1 to work
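Problem A can be made concrete with a small simulation, a minimal sketch assuming hypothetical Host/write names: each host keeps a local copy of memory, writes queue update messages without waiting, and we deliver CPU0's two messages out of order, as the network is free to do.

```python
# Sketch of the naive distributed memory: reads are local, writes send
# update messages to other hosts without waiting. Names are illustrative.

class Host:
    def __init__(self):
        self.mem = {"v0": 0, "done0": False}  # local copy of all of memory

    def read(self, addr):
        return self.mem[addr]                 # read: local, never waits

    def apply(self, addr, val):
        self.mem[addr] = val                  # deliver one update message

def write(sender, others, addr, val):
    sender.mem[addr] = val                    # update own copy immediately...
    return [(h, addr, val) for h in others]   # ...queue msgs, don't wait

cpu0, cpu1 = Host(), Host()
msgs = write(cpu0, [cpu1], "v0", 42) + write(cpu0, [cpu1], "done0", True)

# The network may reorder: suppose the done0 message arrives first,
# and the v0 message is still in flight.
host, addr, val = msgs[1]
host.apply(addr, val)

# CPU1's while-loop would now exit, but v0 is still unset: Problem A.
print(cpu1.read("done0"), cpu1.read("v0"))  # True 0
```

CPU1 observes done0=true yet reads the stale v0, exactly the interleaving the time diagram above describes.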
Time synchronization and challenges
- Global time sync can help resolve some consistency issues in a distributed system
- A global clock is hard to implement
- Protocols such as NTP may sometimes not offer sufficient precision
- Memory system promises to behave according to certain rules.
- We write programs assuming those rules.
- Rules are a "consistency model"
- Contract between memory system and programmer
What makes a good consistency model?
- There are no "right" or "wrong" models
- A model may make it harder or easier to program
- i.e. lead to more or less intuitive results
- A model may be harder or easier to implement efficiently
- Also application dependent
- A consistency model for Web pages is different from one for memory
How about "strict consistency"?
- each instruction stamped with its start time (global time)
- Rule 1: LD gets value of most recent previous ST to same address
- Rule 2: each CPU executes instructions one at a time, in order
- Essentially the same as on a uniprocessor
- Very intuitive consistency model
- Would strict consistency avoid problem A and B?
- How do you implement strict consistency?
  Time:  1    2    3    4
  CPU0:  ST             ST
  CPU1:       LD   LD
- Time between instructions << speed-of-light propagation delay between CPUs!
- How is LD@2 even aware of ST@1?
- How does ST@4 know to pause until LD@3 has finished?
- How does ST@4 know how long to wait?
- Too hard to implement!
A reasonable model: sequential consistency
- Is an execution (a set of operations) correct?
- There must be some total order of operations such that
- each CPU's instructions appear in-order in the total order
- all CPUs see results consistent with that total order
- i.e. reads see most recent write in the total order
- A sequentially consistent system would not have Problems A/B
- Problem A
  CPU0's execution order was: v0=  done0=
  CPU1 saw:                   done0=  v0=
  Each CPU's operations must appear in program order in the total order,
  so this cannot happen with sequential consistency.
- Problem B
  CPU1 saw: v0=  done0=  done1=
  CPU2 saw: done1=  v0= (unset)
  No single total order is consistent with both views,
  so this cannot happen with sequential consistency.
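The definition above can be checked mechanically for tiny histories: try every total order that keeps each CPU's ops in program order, and accept if some order makes every read return the most recent write. This brute-force checker is a sketch (exponential, fine only for a handful of ops); the encoding of ops as tuples is an assumption, not from the notes.

```python
from itertools import permutations

def seq_consistent(histories):
    # histories: one list of ops per CPU, in program order.
    # An op is ('W', addr, val) or ('R', addr, val); memory starts at 0.
    ops = [(cpu, i) for cpu, h in enumerate(histories) for i in range(len(h))]
    for order in permutations(ops):
        nxt = [0] * len(histories)       # next program-order index per CPU
        mem, ok = {}, True
        for cpu, i in order:
            if i != nxt[cpu]:            # rule: in-order within each CPU
                ok = False
                break
            nxt[cpu] += 1
            kind, addr, val = histories[cpu][i]
            if kind == 'W':
                mem[addr] = val
            elif mem.get(addr, 0) != val:  # rule: read sees most recent write
                ok = False
                break
        if ok:
            return True                  # found a legal total order
    return False

# Problem A as a history: CPU1 saw done0 set but v0 still 0.
problem_a = [
    [('W', 'v0', 42), ('W', 'done0', 1)],   # CPU0
    [('R', 'done0', 1), ('R', 'v0', 0)],    # CPU1
]
print(seq_consistent(problem_a))  # False
```

With CPU1's second read returning 42 instead of 0, the checker finds a valid total order and returns True.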
- Better performance than strict consistency
- System has some freedom in how it interleaves different CPUs' ops
- not forced to order by op start time, as in strict consistency
- system can delay a read or write while it finds current values
- Performance is still not great
- Once the system reveals a written value to a read operation, it has committed to a little bit of partial order; this may have transitive effects.
- example: the system can delay revealing CPU1's done1=true to CPU2, but once revealed, it must also reveal CPU0's writes. So the system only has freedom in ordering concurrent operations, i.e. ones that haven't been observed yet.
- A simple implementation of sequential consistency
- each CPU sends R/W operations to a single memory server
- memory server chooses order (interleaves requests from diff CPUs)
- server executes operations one at a time, sends R replies
- Simple implementation will be slow
- single server will get overloaded
- no local cache, so all operations are slow
- each CPU must usually block waiting for each read reply
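The single-server implementation above can be sketched as follows: one server thread pulls operations off a queue and applies them one at a time, choosing the interleaving simply by arrival order, while each client blocks for its reply. Class and method names are illustrative.

```python
import queue
import threading

class MemoryServer:
    """Sketch of sequential consistency via one server that serializes ops."""
    def __init__(self):
        self.ops = queue.Queue()   # all CPUs' R/W requests funnel in here
        self.mem = {}
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            kind, addr, val, reply = self.ops.get()  # one op at a time
            if kind == 'W':
                self.mem[addr] = val
            reply.put(self.mem.get(addr, 0))         # R reply (ack for W)

    def write(self, addr, val):
        reply = queue.Queue()
        self.ops.put(('W', addr, val, reply))
        reply.get()                                  # block until applied

    def read(self, addr):
        reply = queue.Queue()
        self.ops.put(('R', addr, None, reply))
        return reply.get()                           # block for the reply

server = MemoryServer()
server.write('v0', 42)
server.write('done0', 1)
print(server.read('done0'), server.read('v0'))  # 1 42
```

Because every operation blocks until the server has applied it, a CPU that sees done0=1 can never then read a stale v0; the cost is exactly the slowness listed above: no local cache, and a round trip per operation.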