Internals of Replicate
This information is out of date
and does not contain information related to the current version of Gluster
This document describes in detail the internal algorithms of replicate. For a more user-level view, see Understanding AFR Translator.
Read operations
Read
- Read subvolume
Readdir
- Failover in readdir
Write operations
Changelog
- changelog format
trusted.afr.volume1 = 0x000000230000000100000004
trusted.afr.volume2 = 0x000000000000000000000000
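Each value appears to pack three 32-bit counters in the order data, metadata, entry. Reading the counters as hexadecimal, this means that: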
| Pending on | data | metadata | entry |
|---|---|---|---|
| volume1 | 23 | 1 | 4 |
| volume2 | 0 | 0 | 0 |
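The decoding in the table can be reproduced with a short sketch. It assumes each `trusted.afr.*` value is 12 bytes holding three big-endian 32-bit counters in the order data, metadata, entry, which is what the example values and the table suggest. `decode_changelog` is a hypothetical helper name, not part of Gluster. The counters print in decimal, so the `23` in the table (hexadecimal digits) comes out as 35.

```python
import struct


def decode_changelog(hex_value):
    """Split an AFR changelog xattr value into its three pending counters.

    Assumption: 12 bytes, three big-endian unsigned 32-bit integers,
    ordered data, metadata, entry.
    """
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes (three 32-bit counters)")
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


print(decode_changelog("0x000000230000000100000004"))
# {'data': 35, 'metadata': 1, 'entry': 4}
```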
Transaction
- Types: DATA, METADATA, ENTRY, RENAME
- steps of transaction
- Lock: inode or entry.
- Pre-op: Write changelog.
- Op: Do the op.
- Post-op: Erase changelog.
- Unlock: What was locked.
- Special hacks
- RMDIR
- post-post op hook for open-fd self-heal
- optimizations (quick unwind, lock server count = 0, changelog = off)
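The five transaction steps can be illustrated with a toy, in-memory model. This is a sketch under stated assumptions, not the replicate implementation: `Replica`, `write_transaction`, and `fail_on` are hypothetical names, locking is reduced to comments, and the changelog is collapsed to a single pending counter per subvolume rather than separate data, metadata, and entry counters. It also assumes that the post-op clears the pending mark only for subvolumes where the op succeeded.

```python
class Replica:
    """Hypothetical in-memory stand-in for one subvolume."""

    def __init__(self, name, all_names):
        self.name = name
        self.data = b""
        # pending[target]: operations this replica believes are pending on target
        self.pending = {n: 0 for n in all_names}


def write_transaction(replicas, payload, fail_on=frozenset()):
    """Run one DATA transaction; replicas named in fail_on die after the pre-op."""
    names = [r.name for r in replicas]

    # 1. Lock: a real implementation would take an inode or entry lock here.

    # 2. Pre-op: every replica records the operation as pending on every replica.
    for r in replicas:
        for n in names:
            r.pending[n] += 1

    # 3. Op: perform the write on the replicas that are still reachable.
    survivors = [r for r in replicas if r.name not in fail_on]
    for r in survivors:
        r.data += payload
    succeeded = [r.name for r in survivors]

    # 4. Post-op: on reachable replicas, erase the mark for replicas that succeeded.
    for r in survivors:
        for n in succeeded:
            r.pending[n] -= 1

    # 5. Unlock: release what was locked in step 1.
    return succeeded


names = ["volume1", "volume2"]
v1, v2 = (Replica(n, names) for n in names)
write_transaction([v1, v2], b"hello", fail_on={"volume2"})
print(v1.pending)  # {'volume1': 0, 'volume2': 1}
print(v2.pending)  # {'volume1': 1, 'volume2': 1}
```

In this model the surviving replica ends up with a changelog that records pending operations only on the failed one, while the failed replica keeps the marks written during its pre-op.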
Self-heal
- detecting when self-heal is needed
- States of subvolumes
We say that a subvolume A accuses a subvolume B if the changelog on A states that operations are pending on B. With this definition in mind, a subvolume can be in one of four states:
- Ignorant: There is no changelog present on this subvolume.
- Innocent: Changelog is present on this subvolume but it accuses no other subvolume.
- Fool: Changelog is present and the subvolume accuses itself.
- Wise: Changelog is present and it accuses other subvolumes but not itself.
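The four definitions above translate directly into a small classifier. This is a minimal sketch: `classify` is a hypothetical helper, and the changelog is modeled as a dict mapping each subvolume name to one pending count (or `None` when no changelog exists), which simplifies the per-type counters.

```python
def classify(name, changelog):
    """Return the state of subvolume `name` given its own changelog.

    changelog: dict of {subvolume_name: pending_count}, or None if absent.
    """
    if changelog is None:
        return "Ignorant"  # no changelog present
    accused = {target for target, pending in changelog.items() if pending > 0}
    if not accused:
        return "Innocent"  # changelog present, accuses nobody
    if name in accused:
        return "Fool"      # accuses itself
    return "Wise"          # accuses others but not itself


print(classify("volume1", None))                          # Ignorant
print(classify("volume1", {"volume1": 0, "volume2": 0}))  # Innocent
print(classify("volume1", {"volume1": 1, "volume2": 1}))  # Fool
print(classify("volume1", {"volume1": 0, "volume2": 3}))  # Wise
```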
- State transition matrix
| From -> To | Ignorant | Innocent | Fool | Wise |
|---|---|---|---|---|
| Ignorant | - | write, self-heal | failed write | write |
| Innocent | backend rm | - | failed write | write (other failed) |
| Fool | backend rm | self-heal | - | ?? |
| Wise | backend rm | self-heal | ?? | - |
- Source selection matrix
| Subvolume 1 (row) / Subvolume 2 (column) | Ignorant | Innocent | Fool | Wise |
|---|---|---|---|---|
| Ignorant | Biggest | ↑ | ↑ | ↑ |
| Innocent | ← | N/A | ← | ↑ |
| Fool | ← | ↑ | Biggest | ↑ |
| Wise | ← | ← | ← | Split-brain |
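The source selection matrix can be transcribed as a lookup table. The sketch below rests on an interpretation that the matrix itself does not spell out: rows are taken to be the state of subvolume 1 and columns the state of subvolume 2, `←` is read as "subvolume 1 (the row) is the source", and `↑` as "subvolume 2 (the column) is the source". That reading is consistent across the matrix, since each pair of states picks the same winner whichever way round it is looked up. `select_source` is a hypothetical name.

```python
STATES = ("Ignorant", "Innocent", "Fool", "Wise")

# One row per state of subvolume 1; entries follow the column order in STATES.
# "1" stands for the left arrow, "2" for the up arrow (assumed reading).
MATRIX = {
    "Ignorant": ("Biggest", "2", "2", "2"),
    "Innocent": ("1", "N/A", "1", "2"),
    "Fool":     ("1", "2", "Biggest", "2"),
    "Wise":     ("1", "1", "1", "Split-brain"),
}


def select_source(state1, state2):
    """Look up the self-heal source for a pair of subvolume states."""
    return MATRIX[state1][STATES.index(state2)]


print(select_source("Wise", "Fool"))      # 1
print(select_source("Ignorant", "Wise"))  # 2
print(select_source("Fool", "Fool"))      # Biggest
print(select_source("Wise", "Wise"))      # Split-brain
```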
- setting read subvolume to source
- doing self-heal in background
- picking the data sync algo
- diff and full
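The difference between the two sync algorithms can be sketched as follows. The outline does not describe them, so this assumes that "full" copies the whole file from the source to the sink and that "diff" compares per-block checksums and copies only the blocks that differ. The function names, the 4-byte block size, and the use of SHA-256 are illustrative choices, not what replicate actually uses.

```python
import hashlib

BLOCK = 4  # deliberately tiny block size, for illustration only


def full_sync(source, sink):
    """Overwrite the sink with the whole source; return bytes copied."""
    sink[:] = source
    return len(source)


def diff_sync(source, sink, block=BLOCK):
    """Copy only the blocks whose checksums differ; return bytes copied."""
    del sink[len(source):]  # drop any tail beyond the source's length
    copied = 0
    for off in range(0, len(source), block):
        src_block = source[off:off + block]
        dst_block = bytes(sink[off:off + block])
        if hashlib.sha256(src_block).digest() != hashlib.sha256(dst_block).digest():
            sink[off:off + block] = src_block
            copied += len(src_block)
    return copied


source = b"aaaabbbbcccc"
stale = bytearray(b"aaaaXXXXcccc")
print(diff_sync(source, stale), bytes(stale) == source)               # 4 True
print(full_sync(source, bytearray(b"aaaaXXXXcccc")))                  # 12
```

In this toy case the diff algorithm transfers one block instead of three, at the cost of checksumming every block on both sides.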


