Internals of Replicate


This information is out of date and does not cover the current version of Gluster.



This document describes in detail the internal algorithms of replicate. For a more user-level view, see Understanding AFR Translator.

Read operations

Read

  • Read subvolume

Readdir

  • Failover in readdir

Write operations

Changelog

  • changelog format
 trusted.afr.volume1 = 0x000000230000000100000004
 trusted.afr.volume2 = 0x000000000000000000000000

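Each value holds three 32-bit big-endian counters, in the order data, metadata, entry. The values above therefore mean that:

Pending on   data        metadata   entry
volume1      0x23 (35)   1          4
volume2      0           0          0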
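As an illustration of the format, the following is a minimal sketch that splits a changelog value into its three counters. The function name decode_changelog is hypothetical and not part of Gluster; the sketch assumes the value is given as the hexadecimal string shown above.

```python
import struct


def decode_changelog(hex_value):
    """Split a 12-byte changelog value into its three pending counters."""
    if hex_value.startswith("0x"):
        hex_value = hex_value[2:]
    raw = bytes.fromhex(hex_value)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    # Three unsigned 32-bit big-endian integers: data, metadata, entry.
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


print(decode_changelog("0x000000230000000100000004"))
# {'data': 35, 'metadata': 1, 'entry': 4}
```

Note that the data counter 0x23 is 35 in decimal.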

Transaction

  • Types: DATA, METADATA, ENTRY, RENAME
  • Steps of a transaction
    • Lock: Lock the inode or the entry.
    • Pre-op: Write the changelog.
    • Op: Perform the operation.
    • Post-op: Erase the changelog.
    • Unlock: Release whatever was locked.
  • Special hacks
    • RMDIR
    • post-post op hook for open-fd self-heal
  • optimizations (quick unwind, lock server count = 0, changelog = off)
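The five transaction steps can be sketched with a small in-memory model. This is not the replicate code: the Replica class, the write_transaction function, and the single process-wide lock are hypothetical stand-ins. The sketch also assumes that "write changelog" means incrementing a pending counter for every subvolume and that "erase changelog" means decrementing it for the subvolumes where the operation succeeded.

```python
import threading


class Replica:
    """Hypothetical in-memory stand-in for one subvolume."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.content = b""
        self.pending = {}  # accused subvolume name -> pending op count


_file_lock = threading.Lock()  # stands in for the inode/entry lock


def write_transaction(replicas, new_content):
    names = [r.name for r in replicas]
    live = [r for r in replicas if r.up]
    with _file_lock:  # Lock (released on exit: Unlock)
        # Pre-op: record on every reachable replica that an operation
        # is pending on all subvolumes.
        for r in live:
            for n in names:
                r.pending[n] = r.pending.get(n, 0) + 1
        # Op: perform the write where possible.
        succeeded = []
        for r in live:
            r.content = new_content
            succeeded.append(r.name)
        # Post-op: clear the pending mark for subvolumes that succeeded.
        for r in live:
            for n in succeeded:
                r.pending[n] -= 1
    return succeeded


v1 = Replica("volume1")
v2 = Replica("volume2", up=False)
write_transaction([v1, v2], b"hello")
print(v1.pending)  # {'volume1': 0, 'volume2': 1}
```

In this model, a write that misses volume2 leaves volume1 with a non-zero counter for volume2, which is the kind of record self-heal later relies on.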

Self-heal

  • detecting when self-heal is needed
  • States of subvolumes

Let us say that a subvolume A accuses a subvolume B if the changelog on A states that operations are pending on B. With this definition in mind, a subvolume can be in one of four states:

  • Ignorant: There is no changelog present on this subvolume.
  • Innocent: Changelog is present on this subvolume but it accuses no other subvolume.
  • Fool: Changelog is present and the subvolume accuses itself.
  • Wise: Changelog is present but it accuses other subvolumes and not itself.
  • State transition matrix
From -> To   Ignorant     Innocent           Fool           Wise
Ignorant     -            write, self-heal   failed write   write
Innocent     backend rm   -                  failed write   write (other failed)
Fool         backend rm   self-heal          -              ??
Wise         backend rm   self-heal          ??             -
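The four states defined above can be derived mechanically from a subvolume's changelog. The following is a minimal sketch; the function name classify and the dictionary representation of a changelog are assumptions made for illustration, not structures taken from the replicate source.

```python
def classify(name, changelog):
    """Return the state of subvolume `name`.

    `changelog` is None when no changelog is present, otherwise a dict
    mapping each subvolume name to its pending operation count.
    """
    if changelog is None:
        return "ignorant"
    accused = {n for n, count in changelog.items() if count > 0}
    if not accused:
        return "innocent"  # changelog present, accuses nobody
    if name in accused:
        return "fool"      # accuses itself
    return "wise"          # accuses others, not itself


print(classify("volume1", {"volume1": 0, "volume2": 3}))  # wise
```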
  • Source selection matrix
1 / 2      Ignorant   Innocent   Fool      Wise
Ignorant   Biggest
Innocent              N/A
Fool                             Biggest
Wise                                       Split-brain
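A rough sketch of source selection for two subvolumes follows. It reads the four matrix entries as the cases where both subvolumes are in the same state, which is an assumption; mixed-state pairs are not covered by the matrix and are left unimplemented here. The names pick_source and SplitBrain are hypothetical, and breaking a size tie in favour of subvolume 1 is an arbitrary choice.

```python
class SplitBrain(Exception):
    """Raised when each subvolume accuses the other."""


def pick_source(state1, size1, state2, size2):
    """Return 1 or 2 for the chosen source, or None if nothing applies."""
    if state1 != state2:
        raise NotImplementedError("mixed-state pairs are not covered here")
    if state1 in ("ignorant", "fool"):
        # "Biggest": the larger file wins (ties go to subvolume 1).
        return 1 if size1 >= size2 else 2
    if state1 == "innocent":
        return None  # "N/A": neither subvolume accuses the other
    if state1 == "wise":
        raise SplitBrain("both subvolumes accuse each other")
    raise ValueError("unknown state: %r" % state1)


print(pick_source("fool", 100, "fool", 4096))  # 2
```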


  • setting read subvolume to source
  • doing self-heal in background
  • picking the data sync algo
    • diff and full
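The difference between the two data sync algorithms can be illustrated with a toy model. The sketch assumes that "full" copies the whole file from source to sink, while "diff" compares per-block checksums and copies only the blocks that differ. The function names, the 4-byte block size, and the use of SHA-256 are illustrative choices, not what replicate uses.

```python
import hashlib

BLOCK = 4  # unrealistically small, for illustration only


def sync_full(source, sink):
    """Overwrite the sink with the whole source; return bytes copied."""
    sink[:] = source
    return len(source)


def sync_diff(source, sink, block=BLOCK):
    """Copy only blocks whose checksums differ; return bytes copied."""
    copied = 0
    del sink[len(source):]  # drop any data past the end of the source
    for off in range(0, len(source), block):
        src_chunk = source[off:off + block]
        dst_chunk = bytes(sink[off:off + block])
        if hashlib.sha256(src_chunk).digest() != hashlib.sha256(dst_chunk).digest():
            sink[off:off + block] = src_chunk
            copied += len(src_chunk)
    return copied


good = b"aaaabbbbcccc"
stale = bytearray(b"aaaaXXXXcccc")
print(sync_diff(good, stale))  # 4
print(bytes(stale) == good)    # True
```

In this model, diff trades extra checksum work for less copying, which pays off when only a small part of a large file is stale.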


Copyright © Gluster, Inc. All Rights Reserved.