Overview

All coherence-related activity is broadcast to all cache controllers in the system.

All caches are reachable over a broadcast interconnect (e.g., a bus or network), and every cache controller monitors (snoops) the interconnect to determine whether it holds a copy of a requested block.

Because the interconnect is shared, only one CPU at a time can acquire the medium and gain exclusive access for writing. Medium arbitration and acquisition therefore enforce write serialization.

Broadcasting makes snooping easy to implement but limits scalability: every message may be broadcast to all devices over the same medium, so the medium becomes a bottleneck.
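The scalability cost can be made concrete with a minimal sketch of a broadcast medium. The `Bus`, `Snooper`, and `snoop_deliveries` names are hypothetical and exist only for this illustration; the model assumes every transaction is delivered to every controller except the sender.

```python
class Bus:
    """Hypothetical shared broadcast medium: one transaction at a time."""

    def __init__(self):
        self.snoopers = []
        self.log = []          # global order in which transactions appeared
        self.deliveries = 0    # total snoop messages delivered

    def attach(self, snooper):
        self.snoopers.append(snooper)

    def broadcast(self, sender, kind, addr):
        # The bus order is the order of this log.
        self.log.append((sender.name, kind, addr))
        for snooper in self.snoopers:
            if snooper is not sender:
                snooper.snoop(kind, addr)
                self.deliveries += 1


class Snooper:
    """Hypothetical cache controller that only checks for a matching block."""

    def __init__(self, name, bus):
        self.name = name
        self.blocks = set()    # addresses currently cached
        self.matches = 0
        bus.attach(self)

    def snoop(self, kind, addr):
        if addr in self.blocks:
            self.matches += 1


def snoop_deliveries(n_caches, n_transactions):
    """Count snoop messages for a given number of caches and transactions."""
    bus = Bus()
    caches = [Snooper(f"cpu{i}", bus) for i in range(n_caches)]
    for t in range(n_transactions):
        bus.broadcast(caches[t % n_caches], "read_miss", addr=t)
    return bus.deliveries
```

Under these assumptions the number of delivered snoop messages is `n_transactions * (n_caches - 1)`, so the load on the medium grows with the number of caches even when no other cache holds the block.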

Here we assume the following simplifications:

Bus transactions are atomic: only one transaction is in progress on the bus at a time
Memory operations are atomic: the CPU waits until its previous memory operation is complete before issuing another memory operation
Transitions between states (stable states) are atomic

In reality, these simplifications do not hold. See Snoop-Based Multiprocessor Design for implementation details.

MSI

MSI is a write-back, write-invalidate snooping coherence protocol.

Modified/Exclusive (M): exclusive read/write access, dirty
Shared (S): shared read access, clean
Invalid (I): line is invalid

FSM for MSI (here “exclusive” means “modified”):

[Figure: MSI FSM.png]

In contrast to the write-through SI protocol, a dirty line is written back to memory only when it is evicted (to make room on a write miss or read miss) or when another cache's write miss or read miss for that line is snooped.
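The MSI state machine can be sketched as a transition table. Since the figure itself is not reproduced here, the event and action names (`cpu_read`, `bus_read_miss`, `write_back`, and so on) are assumptions chosen to match the terminology of this page, and evictions are left out for brevity.

```python
# (state, event) -> (next_state, action placed on the interconnect or None)
# Event and action names are illustrative, not taken from the figure.
MSI = {
    # Requests from the local CPU
    ("I", "cpu_read"):  ("S", "read_miss"),
    ("I", "cpu_write"): ("M", "write_miss"),
    ("S", "cpu_read"):  ("S", None),
    ("S", "cpu_write"): ("M", "invalidate"),
    ("M", "cpu_read"):  ("M", None),
    ("M", "cpu_write"): ("M", None),
    # Requests snooped from other caches
    ("S", "bus_read_miss"):  ("S", None),
    ("S", "bus_write_miss"): ("I", None),
    ("S", "bus_invalidate"): ("I", None),
    ("M", "bus_read_miss"):  ("S", "write_back"),
    ("M", "bus_write_miss"): ("I", "write_back"),
}


def step(state, event):
    """Return (next_state, action); unlisted pairs leave the state unchanged."""
    return MSI.get((state, event), (state, None))
```

For example, `step("M", "bus_read_miss")` yields `("S", "write_back")`: the dirty line is flushed only because another cache asked for it, which is the write-back behavior described above.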

Satisfying Coherency

MSI satisfies coherency conditions:

“Program order” is implicitly satisfied, as on a uniprocessor
“Write propagation” is satisfied by the combination of invalidation and write-back actions
“Write serialization” is satisfied:
1. Writes that appear on interconnect are ordered by the order they appear on interconnect
2. Reads that appear on interconnect are ordered by the order they appear on interconnect
3. Writes that don’t appear on interconnect (write-back cache):
  1. Such writes fall between two interconnect transactions
  2. All of these writes are performed by the same CPU, so they are ordered by the first condition (“program order”)
  3. All other CPUs can observe these writes only after the next interconnect transaction
  4. So all CPUs see writes in the same order
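The argument above can be illustrated with a small simulation. This is a sketch under the simplifications listed in the overview (atomic bus transactions, atomic memory operations), reduced to a single cache line per CPU; `System`, `Cache`, and the transaction names are hypothetical.

```python
class Cache:
    def __init__(self, name):
        self.name = name
        self.state = "I"     # one line per cache, for brevity
        self.value = None


class System:
    def __init__(self, n_cpus):
        self.memory = 0
        self.bus_log = []    # global order of interconnect transactions
        self.caches = [Cache(f"cpu{i}") for i in range(n_cpus)]

    def _bus(self, requester, kind):
        # Atomic transaction: log it, then let every other cache snoop it.
        self.bus_log.append((requester.name, kind))
        for cache in self.caches:
            if cache is requester:
                continue
            if cache.state == "M":
                self.memory = cache.value            # write back dirty data
                cache.state = "S" if kind == "read_miss" else "I"
            elif cache.state == "S" and kind != "read_miss":
                cache.state = "I"                    # invalidate shared copy

    def read(self, cpu):
        cache = self.caches[cpu]
        if cache.state == "I":
            self._bus(cache, "read_miss")
            cache.state, cache.value = "S", self.memory
        return cache.value

    def write(self, cpu, value):
        cache = self.caches[cpu]
        if cache.state == "I":
            self._bus(cache, "write_miss")
        elif cache.state == "S":
            self._bus(cache, "invalidate")
        # In state M the write is silent: nothing appears on the interconnect.
        cache.state, cache.value = "M", value


system = System(2)
for v in (1, 2, 3):
    system.write(0, v)       # only the first write appears on the bus
seen_by_cpu1 = system.read(1)
```

Here `cpu0` performs three writes but only one of them appears on the interconnect; `cpu1` observes the line only after its own read miss, at which point it sees the last value written. The silent writes are ordered by `cpu0`'s program order and are bracketed by the two logged transactions.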

MESI

The Modified, Shared, and Invalid states are the same as in MSI

Exclusive clean (E) state: exclusive read/write access, clean

MSI requires two interconnect transactions to read an address and then write to it, even when the line is not shared at all:

Read miss on interconnect on CPU read to move from Invalid to Shared
Invalidate on interconnect on CPU write to move from Shared to Modified (Exclusive)

In the uniprocessor scenario (no shared lines), only the first transaction is needed

With the Exclusive clean state, MESI requires only one interconnect transaction to read and then write a line that is not shared:

Read miss on interconnect on CPU read to move from Invalid to Exclusive clean if no sharing in other caches
On CPU write, move from Exclusive clean to Modified (Exclusive) with no interconnect transaction

Thus, the behavior is the same as in a uniprocessor scenario
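The difference in interconnect traffic can be checked with a short sketch that counts transactions for one line in one cache. The `bus_transactions` function and its `others_have_copy` flag are hypothetical; the flag stands in for the snoop response that tells the requester whether any other cache holds the line.

```python
def bus_transactions(protocol, ops, others_have_copy=False):
    """Count interconnect transactions for a sequence of CPU reads/writes.

    protocol: "MSI" or "MESI"; ops: iterable of "read" / "write".
    Returns (transaction_count, final_state).
    """
    if protocol not in ("MSI", "MESI"):
        raise ValueError(f"unknown protocol: {protocol}")
    state, count = "I", 0
    for op in ops:
        if op == "read":
            if state == "I":
                count += 1                      # read miss on interconnect
                if protocol == "MESI" and not others_have_copy:
                    state = "E"                 # sole clean copy
                else:
                    state = "S"
        elif op == "write":
            if state == "I":
                count += 1                      # write miss on interconnect
            elif state == "S":
                count += 1                      # invalidate on interconnect
            # From E or M the write is silent.
            state = "M"
        else:
            raise ValueError(f"unknown operation: {op}")
    return count, state
```

With these assumptions, `bus_transactions("MSI", ["read", "write"])` gives two transactions and `bus_transactions("MESI", ["read", "write"])` gives one; when another cache already holds the line, MESI falls back to the Shared state and also needs two.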

For implementation details, see: Snoop-Based Multiprocessor Design.

MOESI

The “Modified,” “Shared,” “Invalid,” and “Exclusive clean” states are the same as in MESI

Owned (O): shared read access, dirty

The Owned state allows dirty cache lines to be sent directly between caches instead of being written back to a shared lower-level cache (or memory) and then read from there
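A minimal sketch of what happens when another cache's read miss is snooped for a dirty line, comparing MOESI with MSI/MESI. The function name and the returned field names are hypothetical, and the MSI/MESI branch assumes the write-back-then-read path described above.

```python
def snooped_read_of_dirty_line(protocol, state):
    """Describe the response of a cache holding a dirty line to a snooped read.

    Returns a dict with the holder's next state, who supplies the data,
    and how many memory writes the response causes.
    """
    if protocol == "MOESI":
        if state not in ("M", "O"):
            raise ValueError("line is not dirty in MOESI")
        # The holder keeps the dirty data and supplies it cache-to-cache.
        return {"next_state": "O", "supplied_by": "owner cache", "memory_writes": 0}
    if protocol in ("MSI", "MESI"):
        if state != "M":
            raise ValueError("line is not dirty in MSI/MESI")
        # The holder writes back first; the requester then reads from memory.
        return {"next_state": "S", "supplied_by": "memory", "memory_writes": 1}
    raise ValueError(f"unknown protocol: {protocol}")
```

In this model the Owned state defers the memory write: the owner remains responsible for eventually writing the line back (for example on eviction), while sharers read the dirty data directly from it.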
