Advanced Striping with GlusterFS 1.3
Mixed Storage Requirements
There are two ways of scheduling I/O: at the file level (using the unify translator) and at the block level (using the stripe translator). Striped I/O suits files that are potentially large and require high parallel throughput (for example, a single 400GB file accessed simultaneously and randomly by hundreds or thousands of systems). In most cases, file-level scheduling works best.
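To make the block-level idea concrete, the following is a minimal sketch of how a byte offset in a striped file could map to a sub-volume. It assumes a plain round-robin block layout; the function name `locate_block` is hypothetical, and this is an illustration rather than the stripe translator's actual implementation.

```python
def locate_block(offset, block_size, num_subvolumes):
    """Map a byte offset to (sub-volume index, stripe block number).

    Assumes blocks are laid out round-robin across the sub-volumes.
    """
    block = offset // block_size
    return block % num_subvolumes, block


BLOCK = 2 * 1024 * 1024  # 2MB stripe block size, as an example value

if __name__ == "__main__":
    # Show where a few offsets would land with four sub-volumes.
    for off in (0, BLOCK, 5 * BLOCK + 17):
        print(off, locate_block(off, BLOCK, 4))
```

Under this assumption, random reads from many clients spread over all sub-volumes, which is where the parallel throughput of striping comes from.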
In the real world, it is desirable to mix file-level and block-level scheduling on a single storage volume. Alternatively, users can create two separate volumes, and hence two mount points, but applications may demand a single storage system that hosts both.
This document explains how to mix file level scheduling with stripe.
Configuration Brief
This setup demonstrates how to configure the unify translator with an appropriate I/O scheduler for file-level scheduling, and stripe only for files matching given patterns. This way, GlusterFS chooses the appropriate I/O profile and handles both types of data efficiently.
A simple technique to achieve this effect is to create a stripe set of unify and stripe blocks, where unify is the first sub-volume. Files that do not match the stripe policy are passed on to the first (unify) sub-volume and are in turn scheduled across the cluster by its file-level I/O scheduler.
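The routing decision described above can be pictured as a simple file-name match. The sketch below is only an illustration: the helper `io_profile` and the pattern list are hypothetical, and the real decision is made inside the GlusterFS translators, not by client code.

```python
from fnmatch import fnmatchcase

# Example stripe policy: file names matching these patterns are striped.
STRIPE_PATTERNS = ["*.img"]


def io_profile(filename, patterns=STRIPE_PATTERNS):
    """Return 'stripe' if the name matches a stripe pattern, else 'unify'."""
    if any(fnmatchcase(filename, p) for p in patterns):
        return "stripe"
    return "unify"


if __name__ == "__main__":
    for name in ("disk01.img", "notes.txt"):
        print(name, "->", io_profile(name))
```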
Preparing GlusterFS Environment
- Create the directories /export/for-namespace, /export/for-unify and /export/for-stripe on all the storage bricks.
- Place the following server and client volume spec files under /etc/glusterfs (or the appropriate installed path) and replace the IP addresses / access control fields to match your environment.
Volume Specification
Place both the server and client volume specification files under /etc/glusterfs on all the storage bricks. Clients will fetch the client volume spec file from any of the storage bricks (depending on the IP address passed).
GlusterFS Server Volume Specification File:
## file: /etc/glusterfs/glusterfs-server.vol
volume posix-unify
type storage/posix
option directory /export/for-unify
end-volume
volume posix-stripe
type storage/posix
option directory /export/for-stripe
end-volume
volume posix-namespace
type storage/posix
option directory /export/for-namespace
end-volume
volume server
type protocol/server
option transport-type tcp/server
option auth.ip.posix-unify.allow 192.168.1.*
option auth.ip.posix-stripe.allow 192.168.1.*
option auth.ip.posix-namespace.allow 192.168.1.*
subvolumes posix-unify posix-stripe posix-namespace
end-volume
GlusterFS Client Volume Specification File:
## file: /etc/glusterfs/glusterfs-client.vol
volume client-namespace
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.1
option remote-subvolume posix-namespace
end-volume
volume client-unify-1
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.1
option remote-subvolume posix-unify
end-volume
volume client-unify-2
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.2
option remote-subvolume posix-unify
end-volume
volume client-unify-3
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.3
option remote-subvolume posix-unify
end-volume
volume client-unify-4
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.4
option remote-subvolume posix-unify
end-volume
volume client-stripe-1
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.1
option remote-subvolume posix-stripe
end-volume
volume client-stripe-2
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.2
option remote-subvolume posix-stripe
end-volume
volume client-stripe-3
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.3
option remote-subvolume posix-stripe
end-volume
volume client-stripe-4
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.4
option remote-subvolume posix-stripe
end-volume
volume client-ns-1
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.1
option remote-subvolume posix-namespace
end-volume
volume stripe
type cluster/stripe
option block-size *:2MB # Every file routed to this volume (*.img, per the switch scheduler below) is striped with a 2MB block size.
subvolumes client-stripe-1 client-stripe-2 client-stripe-3 client-stripe-4
end-volume
volume unify
type cluster/unify
option namespace client-ns-1
option scheduler switch
option switch.case *.img:stripe # files matching *.img go to the stripe volume; the other nodes are assumed for default pattern matching.
subvolumes stripe client-unify-1 client-unify-2 client-unify-3 client-unify-4
end-volume
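A name listed under subvolumes that does not match any defined volume is an easy mistake in a spec file this long. The following is a minimal sketch of a checker, assuming the simple line-oriented layout shown above and that every volume is defined before it is referenced. The function `undefined_subvolumes` is hypothetical and not part of GlusterFS; it inspects only `subvolumes` lines, not options such as `namespace`.

```python
def undefined_subvolumes(spec_text):
    """Return (volume, missing_name) pairs for subvolumes never defined earlier."""
    defined = set()
    missing = []
    current = None
    for raw in spec_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        words = line.split()
        if not words:
            continue
        if words[0] == "volume" and len(words) > 1:
            # Start of a new volume definition.
            current = words[1]
            defined.add(current)
        elif words[0] == "subvolumes":
            # Every referenced name must already have been defined.
            missing += [(current, n) for n in words[1:] if n not in defined]
    return missing


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as spec:
        for volume, name in undefined_subvolumes(spec.read()):
            print(f"volume {volume}: undefined subvolume {name}")
```

Running the script with the path of a spec file as its only argument prints one line per unresolved name and nothing when all references resolve.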
Bring up the Storage
Starting GlusterFS Server: If you installed from a binary package, you can start the service with the init.d startup script. If not, start the daemon directly:
# glusterfsd
Mounting GlusterFS Volumes:
# glusterfs -s [BRICK-IP-ADDRESS] /mnt/cluster
Improving upon this Setup
- InfiniBand Verbs RDMA transport is much faster than TCP/IP GigE transport.
- Use of performance translators such as read-ahead, write-behind, io-cache, io-threads, booster is recommended.
- Replace the round-robin (rr) scheduler with ALU to handle more dynamic storage environments.

