Community/GlusterFS on Scalr/EC2
From GlusterDocumentation
This information is out of date
and does not contain information related to the current version of Gluster
These instructions walk step by step through installing Gluster on Scalr. That means they are specific to Ubuntu Hardy on a Xen kernel. However, it is trivial to run it on another distro, and I will give pointers along the way if you happen to be trying another distro or you are not on Amazon.
Let's get started!
The first thing we are going to do is set up our storage node(s). Gluster allows for all sorts of cool configurations where you can unify multiple storage nodes together so they appear as one logical volume, automatically replicate files to other storage nodes for redundancy, stripe parts of files, etc. This tutorial is meant to get you up and running, so I will go through a setup with one storage node. If you take a look at the Gluster wiki (http://gluster.org/docs/index.php/GlusterFS) there are lots of tutorials on setting up configurations with multiple storage nodes.
Storage Node
Ok, from Scalr boot up a base image (32-bit or 64-bit). I personally run 32-bit (m1.small) because it only costs $0.10/hour, but you can definitely use m1.large if you really want to.
SSH into your machine and run the following commands:
* apt-get install flex bison xfsprogs ntp
* ln -sf /usr/share/zoneinfo/US/Pacific /etc/localtime
(optional, but whatever you do make sure ntp is installed and working and all your nodes are on the same timezone)
* wget http://ftp.gluster.com/pub/gluster/glusterfs/1.3/glusterfs-1.3.12.tar.gz
(this is the current release, but you can obviously pick a newer one if this is out of date)
* tar -zxvf glusterfs-1.3.12.tar.gz
* cd glusterfs-1.3.12
* 32-bit machines run this: ./configure --prefix=
* 64-bit machines run this: ./configure --prefix= --libdir=/usr/lib64
* make install
Note: the end of the configure output will say that ib-verbs and fuse are not supported, but you don't need to worry about that since it only applies to machines that will mount the Gluster volume (clients).
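If you script the build, the 32-bit/64-bit choice above can be picked automatically. This is only a sketch: the function name configure_flags is made up for illustration, and it assumes that `uname -m` reports x86_64 on your 64-bit instances.

```shell
#!/bin/sh
# Print the ./configure flags from the steps above for a given architecture.
# configure_flags is a hypothetical helper, not part of GlusterFS.
# Assumption: 64-bit instances report "x86_64" from `uname -m`.
configure_flags() {
    arch="${1:-$(uname -m)}"
    case "$arch" in
        x86_64) echo "--prefix= --libdir=/usr/lib64" ;;
        *)      echo "--prefix=" ;;
    esac
}

# Usage, from inside the unpacked glusterfs-1.3.12 directory:
#   ./configure $(configure_flags) && make install
```

Passing the architecture as an argument (for example `configure_flags i686`) lets you see what would be used on another machine without logging into it.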
Next get whatever volume and folder you want to share ready. I personally used an EBS volume but you could use any drive connected to your machine. (Also, /dev/sdb was the device I wanted to use. Obviously change that to whatever you want to use)
* mkfs.xfs /dev/sdb
* echo "/dev/sdb /gluster xfs noatime 0 0" >> /etc/fstab
* mkdir /gluster
* mount /gluster
* mkdir /gluster/export
* optional for unify: mkdir /gluster/export-ns
* nano /etc/glusterfs/glusterfs-server.vol (you can use vi if you wish or emacs even ;-))
# /etc/glusterfs/glusterfs-server.vol
volume brick-raw
  type storage/posix
  option directory /gluster/export
end-volume

volume brick
  type features/posix-locks
  subvolumes brick-raw
end-volume

# optional for unify
#
#volume brick-ns
#  type storage/posix
#  option directory /gluster/export-ns
#end-volume

### Add network serving capability to above brick.
volume server
  type protocol/server
  option transport-type tcp/server   # For TCP/IP transport
  subvolumes brick                   # For unify use: subvolumes brick brick-ns
  option auth.ip.brick.allow 10.*    # Allow access to "brick" volume
  # optional for unify
  #
  #option auth.ip.brick-ns.allow 127.0.0.1 # Allow access to "brick-ns" volume
  option auth.login.brick.allow gluster_client
  # optional for unify
  #
  #option auth.login.brick-ns.allow gluster_client
  option auth.login.gluster_client.password GiveMeData
end-volume
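If you set up more than one storage node, typing this file by hand gets old. Here is a rough sketch of a shell function that prints the single-brick version of the file (without the unify parts) so you can redirect it into place. The function name make_server_vol and its four arguments are my own invention for illustration; the defaults are simply the values used in this tutorial.

```shell
#!/bin/sh
# Print a single-brick server volume file to stdout.
# make_server_vol and its arguments are hypothetical helpers:
#   $1 export directory, $2 allowed client IP pattern,
#   $3 login name,       $4 password
make_server_vol() {
    export_dir="${1:-/gluster/export}"
    allow_ip="${2:-10.*}"
    login="${3:-gluster_client}"
    password="${4:-GiveMeData}"
    cat <<EOF
volume brick-raw
  type storage/posix
  option directory ${export_dir}
end-volume

volume brick
  type features/posix-locks
  subvolumes brick-raw
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  subvolumes brick
  option auth.ip.brick.allow ${allow_ip}
  option auth.login.brick.allow ${login}
  option auth.login.${login}.password ${password}
end-volume
EOF
}

# Usage:
#   make_server_vol /gluster/export > /etc/glusterfs/glusterfs-server.vol
```

Do pick your own password rather than leaving the default in place.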
* glusterfsd -f /etc/glusterfs/glusterfs-server.vol
The Gluster server daemon should now be running, and you should see it with: ps aux | grep gluster. If it doesn't start, take a look at /var/log/glusterfs/glusterfsd.log
Client Node
At this point we have a Gluster server sharing its data. Now we just need to access it with a client and mount it!
Boot up an app/web/base whatever instance.
SSH into your new instance and enter these commands:
* apt-get remove fuse-utils (don't be alarmed if your instance doesn't have it, some do some don't)
* rmmod fuse
Now for the fun part. The stock fuse module in Ubuntu Hardy doesn't work with Gluster for some reason. Anyways we want better performance out of the special Gluster patched version of fuse, so we are going to install that. I will make another post at some point to explain how to roll your own module, but it took me days to figure out and is incredibly complicated. Why don't you enjoy the fruits of my efforts and download a tarball that has the precompiled module all ready to go? That's what open source is all about anyways!
* 32-bit machine (Scalr): wget http://www.envoymediagroup.com/downloads/fuse-2.6.16-xenU.32.tar.gz (md5sum: 812664836d50ca14527b85821c6e8cb4)
* 64-bit machine (Scalr): wget http://www.envoymediagroup.com/downloads/fuse-2.6.16.33-xenU.64.tar.gz (md5sum: c257c015fe1c6aadac70c13bf8ce7314)
* mv fuse-*.tar.gz /
* cd /
* tar -zxvf fuse-*
* depmod -a
* modprobe fuse
If you tail /var/log/messages you should see the Fuse module loaded into the kernel at this point.
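Since the tarball drops a kernel module onto your machine, it is worth comparing the download against the md5sum listed next to it before you unpack it. A small sketch of how that could look, assuming md5sum and awk are available (they are on a stock Ubuntu image); verify_md5 is just a name I made up:

```shell
#!/bin/sh
# Return 0 when the file's MD5 digest equals the expected hex string.
# verify_md5 is a hypothetical helper name.
verify_md5() {
    file="$1"
    expected="$2"
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(md5sum "$file" | awk '{print $1}')"
    [ "$actual" = "$expected" ]
}

# Usage with the 32-bit tarball and the checksum listed above:
#   verify_md5 fuse-2.6.16-xenU.32.tar.gz 812664836d50ca14527b85821c6e8cb4 \
#       || echo "checksum mismatch, do not unpack"
```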
Let's say hypothetically you came across this page and you aren't running on Scalr. Well, you should! But if you want to install the custom fuse module on any other distro do this (Scalr users just ignore this section):
* wget http://ftp.gluster.com/pub/gluster/glusterfs/fuse/fuse-2.7.3glfs10.tar.gz
* tar -zxvf fuse*.tar.gz
* cd fuse*
* ./configure --prefix=/usr --enable-kernel-module
* make install
* ldconfig
* depmod -a
* rmmod fuse
* modprobe fuse
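Whichever route you took, you can confirm from a script that the module actually loaded instead of eyeballing the log. The sketch below reads /proc/modules, where the first field of each line is the module name. The function name and the optional second argument (a different file to read, handy for trying it out) are my own additions. Note that a fuse driver compiled directly into the kernel would not show up there.

```shell
#!/bin/sh
# Return 0 when the named module appears in the kernel's module list.
# module_loaded is a hypothetical helper; $2 optionally overrides the
# file to read and defaults to /proc/modules.
module_loaded() {
    name="$1"
    modules_file="${2:-/proc/modules}"
    awk -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$modules_file"
}

# Usage:
#   module_loaded fuse && echo "fuse is loaded"
```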
Ok, continuing on for Scalr users and everyone else!
* wget http://ftp.gluster.com/pub/gluster/glusterfs/1.3/glusterfs-1.3.12.tar.gz (this is the current release, but you can obviously pick a newer one if this is out of date; just make sure you use the SAME version as you did on the storage node)
* tar -zxvf glusterfs-1.3.12.tar.gz
* cd glusterfs-1.3.12
* 32-bit machines run this: ./configure --prefix=
* 64-bit machines run this: ./configure --prefix= --libdir=/usr/lib64
* make install
* nano /etc/glusterfs/glusterfs-client.vol (you can use vi if you wish or emacs even ;-))
# /etc/glusterfs/glusterfs-client.vol
volume sto1-brick
  type protocol/client
  option transport-type tcp/client
  option remote-host <ip or hostname of storage node>
  option remote-subvolume brick
  option username gluster_client
  option password GiveMeData
end-volume

# Performance Boosters
volume iot
  type performance/io-threads
  option thread-count 2
  subvolumes sto1-brick
end-volume

volume wb
  type performance/write-behind
  subvolumes iot
end-volume

volume ioc
  type performance/io-cache
  option cache-size 512MB # Choose this according to how much memory you want to dedicate
  subvolumes wb
end-volume

# optional for unify
#
#volume sto-ns
#  type protocol/client
#  option transport-type tcp/client
#  option remote-host <ip or hostname of storage node with namespace>
#  option remote-subvolume brick-ns
#  option username gluster_client
#  option password GiveMeData
#end-volume
#
#volume unify0
#  type cluster/unify
#  option scheduler rr # round robin
#  option namespace sto-ns
#  subvolumes sto1-brick
#end-volume
#
# Performance Boosters
# - If you use this section remove the Performance Boosters section above
#volume iot
#  type performance/io-threads
#  option thread-count 2
#  subvolumes unify0
#end-volume
#
#volume wb
#  type performance/write-behind
#  subvolumes iot
#end-volume
#
#volume ioc
#  type performance/io-cache
#  option cache-size 512MB # Choose this according to how much memory you want to dedicate
#  subvolumes wb
#end-volume
There is a lot of info in this file and there are many config possibilities. You will see I included the performance boosters that I am currently using in production, but depending on your application you may or may not want to use these. For more info on these and other possible config parameters please take a look at: http://www.gluster.org/docs/index.php/GlusterFS
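With this many blocks, and half of them commented out, it is easy to lose an end-volume while editing. The sketch below is a quick sanity check of my own (the name volfile_balanced is made up): it only counts uncommented "volume" and "end-volume" lines and checks that the counts match. It does not validate translator names or options, so treat a pass as "not obviously broken" rather than "correct".

```shell
#!/bin/sh
# Return 0 when the file has at least one "volume" line and the same
# number of "end-volume" lines. Lines whose first word starts with #
# are ignored. volfile_balanced is a hypothetical helper name.
volfile_balanced() {
    awk '
        $1 ~ /^#/          { next }
        $1 == "volume"     { opened++ }
        $1 == "end-volume" { closed++ }
        END { exit !(opened > 0 && opened == closed) }
    ' "$1"
}

# Usage:
#   volfile_balanced /etc/glusterfs/glusterfs-client.vol \
#       || echo "volume/end-volume count mismatch"
```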
Almost there!
* mkdir /san
* glusterfs -f /etc/glusterfs/glusterfs-client.vol /san
Typing "mount" should now show that glusterfs is mounted on /san (you can obviously use whatever folder name you want). Also a "df -h" will show you the amount of space available and how much has been used just like any other filesystem.
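If you want a script to make the same check, you can look the mount point up in /proc/mounts, where the second field of each line is the directory a filesystem is mounted on. This is a sketch only: is_mounted and its optional second argument (an alternative file to read) are names I picked for illustration.

```shell
#!/bin/sh
# Return 0 when something is mounted on the given directory.
# is_mounted is a hypothetical helper; $2 optionally overrides the
# mount table to read and defaults to /proc/mounts.
is_mounted() {
    mount_point="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v m="$mount_point" '$2 == m { found = 1 } END { exit !found }' "$mounts_file"
}

# Usage:
#   is_mounted /san || echo "/san is not mounted"
```

This is handy as a guard in front of anything that writes to /san, so files don't silently land on the local disk when the mount is missing.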
What if I want Gluster to start automatically?
Great question. Here is how you do it:
* nano /etc/init.d/glusterfs
#!/bin/sh -e
# /etc/init.d/glusterfs
ENV="env -i LANG=C PATH=/usr/local/bin:/usr/bin:/bin:/sbin"

do_start() {
    echo "Running ldconfig";
    ldconfig;
    sleep 1;
    echo "Running depmod -a";
    depmod -a;
    sleep 1;
    echo "Running modprobe fuse";
    modprobe fuse;
    sleep 1;
    # Create the mount point if it is missing
    if [ ! -d /san ] ; then
        echo "Mount point: /san doesn't exist, creating..."
        mkdir /san;
    fi
    sleep 1;
    echo "Starting glusterfs";
    glusterfs -f /etc/glusterfs/glusterfs-client.vol /san;
    # Add whatever else you want to start after gluster here
    # Imagine if we were serving /etc/nginx/nginx.conf off of gluster
    #echo "Starting nginx";
    #sleep 3;
    #/etc/init.d/nginx start;
}

do_stop() {
    echo "Unmounting /san";
    umount /san;
}

do_restart() {
    do_stop;
    sleep 3;
    do_start;
}

case "$1" in
    start)
        do_start
        ;;
    restart|reload|force-reload)
        do_restart
        ;;
    stop)
        do_stop
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|reload|force-reload}"
        exit 1
        ;;
esac
* chmod +x /etc/init.d/glusterfs
* update-rc.d glusterfs defaults 51
Note: change /san to whatever folder you want to mount gluster on and take a look at the commented out section to find out how you can start services that may rely on files in the Gluster volume.
You can now use /san just like any other folder. Try making two clients, then make a file on one, switch to the other, and you will see the file there. The possibilities are endless. For an example of what is possible, I do this on my app instances:
* mv /etc/apache2/apache2.conf /etc/apache2/apache2.conf.orig
* ln -s /san/www/conf/apache2.conf /etc/apache2/apache2.conf
This lets me share my Apache settings across all my app instances!
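If you do this for more than one file, the two steps can be wrapped in a function. The following is a sketch under my own naming (link_shared_conf is not a standard tool). It refuses to touch anything when the shared copy is missing, which is what you would see if /san were not mounted, and it only makes the .orig backup the first time.

```shell
#!/bin/sh
# Replace a local config file with a symlink to a shared copy.
# link_shared_conf is a hypothetical helper:
#   $1 path of the shared file, $2 path of the local file
link_shared_conf() {
    shared="$1"
    local_path="$2"
    # Bail out if the shared copy is not there (e.g. volume not mounted).
    [ -e "$shared" ] || { echo "missing shared file: $shared" >&2; return 1; }
    # Keep the original once; skip if the path is already a symlink.
    if [ -e "$local_path" ] && [ ! -L "$local_path" ]; then
        mv "$local_path" "$local_path.orig"
    fi
    ln -sf "$shared" "$local_path"
}

# Usage:
#   link_shared_conf /san/www/conf/apache2.conf /etc/apache2/apache2.conf
```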
Ok, this post got a bit longer than I thought. I will follow it up with another one soon on how to encrypt the connection between your storage nodes and clients using stunnel.
Enjoy!


