GlusterFS Features


  • User-space Design: A user-space design offers a number of advantages. No kernel patches or kernel modules are required, complex features can be added relatively easily, the code is easy to debug and maintain, and bugs do not crash the OS. Despite running in user space, GlusterFS can be as fast as, or even faster than, kernel-based file systems.
  • Stackable Modules: The stackable modular design allows GlusterFS features to be extended beyond the scope of a regular file system without compromising the elegance of the framework. Almost all of the features (from performance options and distributed locking to replication and striping) are implemented as stackable modules (translators). Users can select the translators that suit their application and hardware needs and build an optimized storage system. GlusterFS borrowed the concept of a user-space stackable file system from GNU Hurd.
  • No Meta-data: Unlike other cluster file systems, which address parallelization at the block level, GlusterFS engineers believed that the problem lies at the volume management and I/O scheduling level. This allows meta-data to be offloaded to the mature underlying disk file systems. Eliminating the centralized meta-data server gives GlusterFS significant scaling and reliability advantages.
  • Self-healing: As a volume grows beyond 32 TB, fsck (file system check) downtime becomes a huge problem. GlusterFS has no fsck. It heals itself transparently with very little impact on performance.
  • NFS-like Backend: Users' files and folders are stored as-is on the backend. Users can always access the data through scp or ftp (as with NFS), even without GlusterFS installed. This simplicity gives a lot of confidence when scaling to multiple petabytes.
  • Automatic Replication: The Automatic File Replication (AFR) feature in GlusterFS replicates all your I/O in real time. With AFR, GlusterFS can withstand hardware failures.
  • Aggregation: The Unify feature in GlusterFS aggregates multiple storage bricks (servers) into one large volume. It distributes data at the file level. The distribution policy is decided by the chosen I/O scheduler.
  • Scalable Striping: GlusterFS striping scales to a huge number of bricks unlike a meta-data based approach. Even striped files can easily be recovered by simply dd'ing strided blocks back into regular files.
  • Pluggable I/O Schedulers: Users can choose different I/O schedulers depending on the application's requirements. The available options are the adaptive-least-usage self-tuning I/O scheduler, the round-robin I/O scheduler, the non-uniform-memory-access I/O scheduler, the random I/O scheduler, and the wild-card scheduler. It is fairly easy to develop custom schedulers.
  • Pluggable Transport: GlusterFS supports TCP/IP-based networks such as Fast Ethernet, GigE, and 10 GigE, as well as RDMA-based InfiniBand. The available transports are TCP, IB-verbs, and Unix-IPC.
  • Pluggable Auth: GlusterFS supports IP-based and username/password-based authentication. It is possible to extend the auth interface to support MySQL-based or LDAP-based authentication.
  • Distributed Locking: The Locks translator in GlusterFS supports full-featured POSIX distributed locking.
  • Distributed BDB: The BerkeleyDB backend module enables GlusterFS to store small files very efficiently. Billions of small files can be packed into a much smaller number of BDB files spread across multiple storage bricks. The user is still presented with a POSIX-compliant file system view.
  • Embeddable: The entire GlusterFS file system can be embedded into a web farm of Apache or Lighttpd web servers. This enables web requests to bypass the kernel and access data directly. With InfiniBand in particular, Apache or Lighttpd performs RDMA I/O without even being aware of it.
  • Performance Modules: A number of performance modules, such as IO-Cache, IO-Threads, Read-Ahead and Write-Behind, are available for optimizing your storage performance.
  • Flexible Volume Management: Every feature in GlusterFS (e.g. network, scheduler, cache to disk) is represented as a logical volume. Users can stack them up in any meaningful order to build a highly customized, optimized storage environment.
  • Undelete: The Trashcan module provides undelete functionality by transparently moving all deleted/modified files into a /trash directory.
  • Encryption: As of now, GlusterFS supports only the rot-13 encryption module. Rot-13 is a trivially weak substitution cipher that offers no real protection. The main purpose of this module is to act as a reference implementation for future development.
  • Trace: Application I/O can be traced call by call by inserting the trace translator at specific points in the file system. It is useful for debugging.
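
The Scalable Striping item above says that a striped file can be recovered by copying strided blocks back into a regular file. The Python sketch below illustrates that idea. It is not GlusterFS code, and the on-disk layout is an assumption made for illustration: block i of the original file is taken to live on brick i mod N, at the same byte offset it had in the original file, with holes everywhere else. The function names stripe_file and reassemble_striped_file are hypothetical; check the layout your stripe translator actually produces before relying on anything like this.

```python
import os
import tempfile


def stripe_file(src_path, brick_paths, block_size):
    """Demo helper: spread src_path over sparse per-brick files.

    ASSUMED layout (hypothetical): block i goes to brick i % N and is
    written at its original offset, leaving holes for the other blocks.
    """
    bricks = [open(p, "wb") for p in brick_paths]
    try:
        with open(src_path, "rb") as src:
            block = 0
            while True:
                data = src.read(block_size)
                if not data:
                    break
                dst = bricks[block % len(bricks)]
                dst.seek(block * block_size)  # keep the original offset
                dst.write(data)
                block += 1
    finally:
        for f in bricks:
            f.close()


def reassemble_striped_file(brick_paths, block_size, total_size, out_path):
    """Rebuild a regular file by copying strided blocks from each brick."""
    bricks = [open(p, "rb") for p in brick_paths]
    try:
        with open(out_path, "wb") as out:
            offset = 0
            block = 0
            while offset < total_size:
                length = min(block_size, total_size - offset)
                src = bricks[block % len(bricks)]  # brick owning this block
                src.seek(offset)
                data = src.read(length)
                # Pad a short read with zeros so offsets stay aligned.
                out.write(data.ljust(length, b"\0"))
                offset += length
                block += 1
    finally:
        for f in bricks:
            f.close()
```

Each loop iteration plays the role of one dd invocation that skips to a block's offset in the owning brick file and copies a single block to the same offset in the output. The original file size has to be known separately, since in this assumed layout no single brick file is guaranteed to record it.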
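
To make concrete why the Encryption item above treats rot-13 as a reference implementation rather than real protection, here is a minimal Python sketch of the cipher itself. It is not the GlusterFS module, and the name rot13 is just an illustrative helper: the point is that the transformation has no key and that applying it twice returns the original data.

```python
import string

# Translation table: rotate ASCII letters by 13 places; every other
# byte value is left unchanged.
_LOWER = string.ascii_lowercase
_UPPER = string.ascii_uppercase
_ROT13_TABLE = bytes.maketrans(
    (_LOWER + _UPPER).encode("ascii"),
    (_LOWER[13:] + _LOWER[:13] + _UPPER[13:] + _UPPER[:13]).encode("ascii"),
)


def rot13(data: bytes) -> bytes:
    """Apply rot-13 to a byte string (hypothetical helper name)."""
    return data.translate(_ROT13_TABLE)


if __name__ == "__main__":
    secret = rot13(b"Hello, GlusterFS!")
    print(secret)
    # The same function decrypts, because rot-13 is its own inverse.
    print(rot13(secret))
```

Because there is no key, anyone who sees the stored bytes can recover the plaintext by running the same function again, which is why such a module is only useful as a template for writing a real encryption translator.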

 

Copyright © Gluster, Inc. All Rights Reserved.