Course:CICS525/Notes/Consistency-1
Introduction
- Consistency
- Consistency models
- Strict consistency
- Sequential consistency
Replicated data is a huge theme in distributed systems
- For performance and fault tolerance
- Often easier to replicate data than computation
- Examples:
- Caching of web pages (Web browser)
- memcached for web servers
- All these examples involve optimizations for performance
- How do we know if an optimization is correct?
- We need to know how to think about correct execution of distributed programs.
- Most of these ideas come from multiprocessor and database research 20-30 years ago.
- For now, just correctness and efficiency, not fault-tolerance.
- Replicating content that isn't modified (or has a single writer) is "simple"
- See, for example, Web browser
- Browser obeys HTTP expiration directive and last-modified
- More interesting with multiple writers!
Naive distributed memory
- (diagram that you can draw)
- CPU0, CPU1, CPU2, network
- each host has a local copy of all of memory
- read: from local memory
- write: send update msg to each other host (but don't wait)
- fast: never waits for communication
- Does this memory work well?
- Example 1:
  CPU0:
    v0 = f0();
    done0 = true;
  CPU1:
    while(done0 == false)
      ;
    v1 = f1(v0);
    done1 = true;
  CPU2:
    while(done1 == false)
      ;
    v2 = f2(v0, v1);
- Intuitive intent:
- CPU2 should execute f2() with results from CPU0 and CPU1
- waiting for CPU1 implies waiting for CPU0
- Example 1 won't work with naive distributed memory:
- Problem A:
  [time diagram]
  CPU0's update messages for v0 and done0 may be reordered by the network,
  leaving CPU1 with done0=true but v0 still unset
- how to fix this problem?
- Problem B:
  [time diagram]
  CPU2 sees CPU1's writes before CPU0's writes,
  i.e. CPU2 and CPU1 disagree on the order of CPU0's and CPU1's writes
- how to fix?
- Naive distributed memory is fast but has unexpected behavior
- maybe it isn't "correct"
- maybe we should never have expected Example 1 to work
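Problem A can be made concrete with a small simulation, a minimal sketch assuming hypothetical Host/write names: each host keeps a local copy of memory, writes queue update messages without waiting, and we deliver CPU0's two messages out of order, as the network is free to do.

```python
# Sketch of the naive distributed memory: reads are local, writes send
# update messages to other hosts without waiting. Names are illustrative.

class Host:
    def __init__(self):
        self.mem = {"v0": 0, "done0": False}  # local copy of all of memory

    def read(self, addr):
        return self.mem[addr]                 # read: local, never waits

    def apply(self, addr, val):
        self.mem[addr] = val                  # deliver one update message

def write(sender, others, addr, val):
    sender.mem[addr] = val                    # update own copy immediately...
    return [(h, addr, val) for h in others]   # ...queue msgs, don't wait

cpu0, cpu1 = Host(), Host()
msgs = write(cpu0, [cpu1], "v0", 42) + write(cpu0, [cpu1], "done0", True)

# The network may reorder: suppose the done0 message arrives first,
# and the v0 message is still in flight.
host, addr, val = msgs[1]
host.apply(addr, val)

# CPU1's while-loop would now exit, but v0 is still unset: Problem A.
print(cpu1.read("done0"), cpu1.read("v0"))  # True 0
```

CPU1 observes done0=true yet reads the stale v0, exactly the interleaving the time diagram above describes.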
Time synchronization and challenges
- Global time sync can help resolve some consistency issues in a distributed system
- A global clock is hard to implement
- Protocols such as NTP may sometimes not offer sufficient precision
- Memory system promises to behave according to certain rules.
- We write programs assuming those rules.
- Rules are a "consistency model"
- Contract between memory system and programmer
What makes a good consistency model?
- There are no "right" or "wrong" models
- A model may make it harder or easier to program
- i.e. lead to more or less intuitive results
- A model may be harder or easier to implement efficiently
- Also application dependent
- A consistency model for Web pages is different from one for memory
How about "strict consistency"?
- each instruction stamped with its start time (global time)
- Rule 1: LD gets value of most recent previous ST to same address
- Rule 2: each CPU executes instructions one at a time, in order
- Essentially the same as on a uniprocessor
- Very intuitive consistency model
- Would strict consistency avoid problem A and B?
- How do you implement strict consistency?
  Time:  1    2    3    4
  CPU0:  ST             ST
  CPU1:       LD   LD
- Time between instructions << speed-of-light propagation delay between CPUs!
- How is LD@2 even aware of ST@1?
- How does ST@4 know to pause until LD@3 has finished?
- How does ST@4 know how long to wait?
- Too hard to implement!
A reasonable model: sequential consistency
- Is an execution (a set of operations) correct?
- There must be some total order of operations such that
- each CPU's instructions appear in-order in the total order
- all CPUs see results consistent with that total order
- i.e. reads see most recent write in the total order
- A sequentially consistent system would not have Problems A/B
- Problem A
  CPU0's execution order was: v0=  done0=
  CPU1 saw:                   done0=  v0=
  Each CPU's operations must appear in program order in the total order,
  so this cannot happen with sequential consistency.
- Problem B
  CPU1 saw: v0=  done0=  done1=
  CPU2 saw: done1=  v0= (unset)
  No single total order is consistent with both views,
  so this cannot happen with sequential consistency.
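The definition above can be checked mechanically for tiny histories: try every total order that keeps each CPU's ops in program order, and accept if some order makes every read return the most recent write. This brute-force checker is a sketch (exponential, fine only for a handful of ops); the encoding of ops as tuples is an assumption, not from the notes.

```python
from itertools import permutations

def seq_consistent(histories):
    # histories: one list of ops per CPU, in program order.
    # An op is ('W', addr, val) or ('R', addr, val); memory starts at 0.
    ops = [(cpu, i) for cpu, h in enumerate(histories) for i in range(len(h))]
    for order in permutations(ops):
        nxt = [0] * len(histories)       # next program-order index per CPU
        mem, ok = {}, True
        for cpu, i in order:
            if i != nxt[cpu]:            # rule: in-order within each CPU
                ok = False
                break
            nxt[cpu] += 1
            kind, addr, val = histories[cpu][i]
            if kind == 'W':
                mem[addr] = val
            elif mem.get(addr, 0) != val:  # rule: read sees most recent write
                ok = False
                break
        if ok:
            return True                  # found a legal total order
    return False

# Problem A as a history: CPU1 saw done0 set but v0 still 0.
problem_a = [
    [('W', 'v0', 42), ('W', 'done0', 1)],   # CPU0
    [('R', 'done0', 1), ('R', 'v0', 0)],    # CPU1
]
print(seq_consistent(problem_a))  # False
```

With CPU1's second read returning 42 instead of 0, the checker finds a valid total order and returns True.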
- Better performance than strict consistency
- System has some freedom in how it interleaves different CPUs' ops
- not forced to order by op start time, as in strict consistency
- system can delay a read or write while it finds current values
- Performance is still not great
- Once the system reveals a written value to a read operation, it has committed to a little bit of partial order; this may have transitive effects.
- example: the system can delay revealing CPU1's done1=true to CPU2, but once revealed, it must also reveal CPU0's writes. So the system only has freedom in ordering concurrent operations, i.e. ones that haven't been observed yet.
- A simple implementation of sequential consistency
- each CPU sends R/W operations to a single memory server
- memory server chooses order (interleaves requests from diff CPUs)
- server executes operations one at a time, sends R replies
- Simple implementation will be slow
- single server will get overloaded
- no local cache, so all operations are slow
- each CPU must usually block waiting for each read reply
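The single-server implementation above can be sketched as follows: one server thread pulls operations off a queue and applies them one at a time, choosing the interleaving simply by arrival order, while each client blocks for its reply. Class and method names are illustrative.

```python
import queue
import threading

class MemoryServer:
    """Sketch of sequential consistency via one server that serializes ops."""
    def __init__(self):
        self.ops = queue.Queue()   # all CPUs' R/W requests funnel in here
        self.mem = {}
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            kind, addr, val, reply = self.ops.get()  # one op at a time
            if kind == 'W':
                self.mem[addr] = val
            reply.put(self.mem.get(addr, 0))         # R reply (ack for W)

    def write(self, addr, val):
        reply = queue.Queue()
        self.ops.put(('W', addr, val, reply))
        reply.get()                                  # block until applied

    def read(self, addr):
        reply = queue.Queue()
        self.ops.put(('R', addr, None, reply))
        return reply.get()                           # block for the reply

server = MemoryServer()
server.write('v0', 42)
server.write('done0', 1)
print(server.read('done0'), server.read('v0'))  # 1 42
```

Because every operation blocks until the server has applied it, a CPU that sees done0=1 can never then read a stale v0; the cost is exactly the slowness listed above: no local cache, and a round trip per operation.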