Community/GlusterFS on Scalr/EC2

This information is out of date
and does not contain information related to the current version of Gluster


These instructions walk step by step through installing Gluster on Scalr, so they are specific to Ubuntu Hardy on a Xen kernel. Running it on another distro is straightforward, though, and I will give pointers along the way in case you are trying another distro or you are not on Amazon.


Let's get started!

The first thing we are going to do is set up our storage node(s). Gluster allows for all sorts of cool configurations where you can unify multiple storage nodes together so they appear as one logical volume, automatically replicate files to other storage nodes for redundancy, stripe parts of files, etc. This tutorial is meant to get you up and running, so I will go through a setup with a single storage node. If you take a look at the Gluster wiki (http://gluster.org/docs/index.php/GlusterFS) there are lots of tutorials on setting up configurations with multiple storage nodes.


Storage Node

Ok, from Scalr boot up a base image (32-bit or 64-bit). I personally run 32-bit (m1.small) because it only costs $0.10/hour, but you can definitely do m1.large if you really want to.


SSH into your machine and run the following commands:

   * apt-get install flex bison xfsprogs ntp
   * ln -sf /usr/share/zoneinfo/US/Pacific /etc/localtime
     (optional, but whatever you do make sure ntp is installed and working and all your nodes are on the same timezone)
   * wget http://ftp.gluster.com/pub/gluster/glusterfs/1.3/glusterfs-1.3.12.tar.gz
     (this is the current release, but you can obviously pick a newer one if this is out of date)
   * tar -zxvf glusterfs-1.3.12.tar.gz
   * cd glusterfs-1.3.12
   * 32-bit machines run this: ./configure --prefix=
   * 64-bit machines run this: ./configure --prefix= --libdir=/usr/lib64
   * make install

Note: the end of the configure output will say that ib-verbs and fuse are not supported. You don't need to worry about that here, since fuse only matters on machines that will mount the Gluster volume (clients).
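
If you script the install across a mix of instance sizes, the 32-bit/64-bit choice above can be made automatically. This is only a sketch: configure_flags is a name I made up for the example, and it assumes that anything uname -m does not report as x86_64 or amd64 should get the 32-bit flags.

```sh
# configure_flags ARCH
# Print the ./configure flags used in this guide for a machine
# architecture string as reported by `uname -m`.
configure_flags() {
    case "$1" in
        x86_64|amd64)
            # 64-bit: same flags as the 64-bit step above
            echo "--prefix= --libdir=/usr/lib64"
            ;;
        *)
            # assumption: everything else is treated as 32-bit
            echo "--prefix="
            ;;
    esac
}
```

From inside the glusterfs source directory you would then run: ./configure $(configure_flags "$(uname -m)")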


Next get whatever volume and folder you want to share ready. I personally used an EBS volume but you could use any drive connected to your machine. (Also, /dev/sdb was the device I wanted to use. Obviously change that to whatever you want to use)

   * mkfs.xfs /dev/sdb
   * echo "/dev/sdb /gluster xfs noatime 0 0" >> /etc/fstab
   * mkdir /gluster
   * mount /gluster
   * mkdir /gluster/export
   * optional for unify: mkdir /gluster/export-ns
   * nano /etc/glusterfs/glusterfs-server.vol  (you can use vi if you wish or emacs even ;-))

# /etc/glusterfs/glusterfs-server.vol
volume brick-raw
 type storage/posix
 option directory /gluster/export
end-volume

volume brick
 type features/posix-locks
 subvolumes brick-raw
end-volume

# optional for unify #
#volume brick-ns
#  type storage/posix
#  option directory /gluster/export-ns
#end-volume

### Add network serving capability to above brick.
volume server
 type protocol/server
 option transport-type tcp/server     # For TCP/IP transport
 subvolumes brick                     # for unify use: subvolumes brick brick-ns
 option auth.ip.brick.allow 10.* # Allow access to "brick" volume
 # optional for unify #
 #option auth.ip.brick-ns.allow 127.0.0.1 # Allow access to "brick-ns" volume
 option auth.login.brick.allow gluster_client
 # optional for unify #
 #option auth.login.brick-ns.allow gluster_client
 option auth.login.gluster_client.password GiveMeData
end-volume


   * glusterfsd -f /etc/glusterfs/glusterfs-server.vol

The gluster server daemon should now be running, and you should see it in the output of: ps aux | grep gluster. If it doesn't start, take a look at /var/log/glusterfs/glusterfsd.log
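
One easy mistake when toggling the optional unify sections is to leave a name on a subvolumes line whose volume definition is commented out. The sketch below catches that before you start the daemon. check_volfile is a helper name made up for this example, not something that ships with Gluster, and it assumes every volume is defined above the line that references it, as in the files in this guide.

```sh
# check_volfile FILE
# Print every name listed after "subvolumes" that has no uncommented
# "volume NAME" line earlier in FILE. Returns non-zero if any is found.
check_volfile() {
    awk '
        { sub(/#.*/, "") }                      # ignore comments
        $1 == "volume"     { defined[$2] = 1 }  # remember each defined volume
        $1 == "subvolumes" {
            for (i = 2; i <= NF; i++)
                if (!($i in defined)) {
                    print "undefined subvolume: " $i
                    bad = 1
                }
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

For example: check_volfile /etc/glusterfs/glusterfs-server.vol && glusterfsd -f /etc/glusterfs/glusterfs-server.vol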


Client Node

At this point we have a Gluster server sharing its data. Now we just need to access it with a client and mount it!

Boot up an app/web/base whatever instance.

SSH into your new instance and enter these commands:

   * apt-get remove fuse-utils  (don't be alarmed if your instance doesn't have it, some do some don't)
   * rmmod fuse

Now for the fun part. The stock fuse module in Ubuntu Hardy doesn't work with Gluster for some reason. Anyways we want better performance out of the special Gluster patched version of fuse, so we are going to install that. I will make another post at some point to explain how to roll your own module, but it took me days to figure out and is incredibly complicated. Why don't you enjoy the fruits of my efforts and download a tarball that has the precompiled module all ready to go? That's what open source is all about anyways!

   * 32-bit machine (Scalr): wget http://www.envoymediagroup.com/downloads/fuse-2.6.16-xenU.32.tar.gz  (md5sum: 812664836d50ca14527b85821c6e8cb4)
   * 64-bit machine (Scalr): wget http://www.envoymediagroup.com/downloads/fuse-2.6.16.33-xenU.64.tar.gz  (md5sum: c257c015fe1c6aadac70c13bf8ce7314)
   * mv fuse-*.tar.gz /
   * cd /
   * tar -zxvf fuse-*.tar.gz
   * depmod -a
   * modprobe fuse

If you tail /var/log/messages you should see the Fuse module loaded into the kernel at this point.
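
Since the tarball contains a kernel module, it is worth comparing it against the md5sum listed next to its download link before unpacking it. A minimal sketch, assuming the md5sum tool from coreutils is installed; verify_md5 is a name made up for this example.

```sh
# verify_md5 FILE EXPECTED
# Succeeds only if the MD5 checksum of FILE equals EXPECTED.
verify_md5() {
    # md5sum prints "<checksum>  <filename>"; keep only the checksum
    actual=$(md5sum "$1" | awk '{ print $1 }')
    if [ "$actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (got $actual)" >&2
        return 1
    fi
}
```

Run it right after the wget, for example on a 32-bit machine: verify_md5 fuse-2.6.16-xenU.32.tar.gz 812664836d50ca14527b85821c6e8cb4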

Let's say hypothetically you came across this page and you aren't running on Scalr. Well, you should! But if you want to install the custom fuse module on any other distro do this (Scalr users just ignore this section):

   * wget http://ftp.gluster.com/pub/gluster/glusterfs/fuse/fuse-2.7.3glfs10.tar.gz
   * tar -zxvf fuse*.tar.gz
   * cd fuse*
   * ./configure --prefix=/usr --enable-kernel-module
   * make install
   * ldconfig
   * depmod -a
   * rmmod fuse
   * modprobe fuse
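
If you would rather test for the module in a script than read the log, something like the following should do. It is a sketch under one assumption: fuse is loaded as a module and therefore shows up in /proc/modules (a fuse compiled directly into the kernel will not be listed there). module_loaded is a made-up helper name.

```sh
# module_loaded NAME [MODULES_FILE]
# Succeeds if NAME appears as a loaded module. MODULES_FILE defaults to
# /proc/modules, where each line starts with the module name and a space.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}
```

For example: module_loaded fuse || echo "fuse is not loaded, check /var/log/messages"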

Ok, continuing on for Scalr users and everyone else!

   * wget http://ftp.gluster.com/pub/gluster/glusterfs/1.3/glusterfs-1.3.12.tar.gz
     (this is the current release, but you can obviously pick a newer one if this is out of date. Just make sure you use the SAME version as you did on the storage node.)
   * tar -zxvf glusterfs-1.3.12.tar.gz
   * cd glusterfs-1.3.12
   * 32-bit machines run this: ./configure --prefix=
   * 64-bit machines run this: ./configure --prefix= --libdir=/usr/lib64
   * make install
   * nano /etc/glusterfs/glusterfs-client.vol  (you can use vi if you wish or emacs even ;-))

# /etc/glusterfs/glusterfs-client.vol

volume sto1-brick
  type protocol/client
  option transport-type tcp/client
  option remote-host <ip or hostname of storage node>
  option remote-subvolume brick
  option username gluster_client
  option password GiveMeData
end-volume

# Performance Boosters
volume iot
  type performance/io-threads
  option thread-count 2
  subvolumes sto1-brick
end-volume

volume wb
  type performance/write-behind
  subvolumes iot
end-volume

volume ioc
  type performance/io-cache
  option cache-size 512MB # Choose this according to how much memory you want to dedicate
  subvolumes wb
end-volume


# optional for unify #
#volume sto-ns
#  type protocol/client
#  option transport-type tcp/client
#  option remote-host <ip or hostname of storage node with namespace>
#  option remote-subvolume brick-ns
#  option username gluster_client
#  option password GiveMeData
#end-volume
#
#volume unify0
#  type cluster/unify
#  option scheduler rr # round robin
#  option namespace sto-ns
#  subvolumes sto1-brick
#end-volume
#

# Performance Boosters
#  - If you use this section remove the Performance Boosters section above
#volume iot
#  type performance/io-threads
#  option thread-count 2
#  subvolumes unify0
#end-volume
#
#volume wb
#  type performance/write-behind
#  subvolumes iot
#end-volume
#
#volume ioc
#  type performance/io-cache
#  option cache-size 512MB # Choose this according to how much memory you want to dedicate
#  subvolumes wb
#end-volume

There is a lot of info in this file and there are many config possibilities. You will see I included the performance boosters that I am currently using in production, but depending on your application you may or may not want to use these. For more info on these and other possible config parameters please take a look at: http://www.gluster.org/docs/index.php/GlusterFS


Almost there!

   * mkdir /san
   * glusterfs -f /etc/glusterfs/glusterfs-client.vol /san

Typing "mount" should now show that glusterfs is mounted on /san (you can obviously use whatever folder name you want). Also a "df -h" will show you the amount of space available and how much has been used just like any other filesystem.
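
The same check can be done from a script, which is handy before starting anything that depends on the volume. A small sketch, assuming a Linux /proc/mounts where the second field of each line is the mount point; is_mounted is a name made up for this example.

```sh
# is_mounted DIR [MOUNTS_FILE]
# Succeeds if DIR is listed as a mount point. MOUNTS_FILE defaults to
# /proc/mounts (fields: device, mount point, fs type, options, ...).
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}
```

For example: is_mounted /san && echo "/san is ready"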


What if I want Gluster to start automatically?

Great question. Here is how you do it:

   * nano /etc/init.d/glusterfs

#!/bin/sh -e
# /etc/init.d/glusterfs

ENV="env -i LANG=C PATH=/usr/local/bin:/usr/bin:/bin:/sbin"

do_start() {
        echo "Running ldconfig";
        ldconfig;
        sleep 1;
        echo "Running depmod -a";
        depmod -a;
        sleep 1;
        echo "Running modprobe fuse";
        modprobe fuse;
        sleep 1;
        # -d tests for a directory; create the mount point if it is missing
        if [ ! -d /san ] ; then
                echo "Mount point /san doesn't exist, creating..."
                mkdir /san;
        fi
        sleep 1;
        echo "Starting glusterfs";
        glusterfs -f /etc/glusterfs/glusterfs-client.vol /san;
       

        # Add whatever else you want to start after gluster here
        # Imagine if we were serving /etc/nginx/nginx.conf off of gluster
        #echo "Starting nginx";
        #sleep 3;
        #/etc/init.d/nginx start;
       
}

do_stop() {
        echo "Unmounting /san";
        umount /san;
}

do_restart() {
        do_stop;
        sleep 3;
        do_start;
}

case "$1" in
        start)
                do_start
                ;;
        restart|reload|force-reload)
                do_restart
                ;;
        stop)
                do_stop
                ;;
esac



   * chmod +x /etc/init.d/glusterfs
   * update-rc.d glusterfs defaults 51

Note: change /san to whatever folder you want to mount gluster on and take a look at the commented out section to find out how you can start services that may rely on files in the Gluster volume.
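
A fixed sleep before starting a dependent service is a guess at how long the mount takes. One alternative is to poll for a file that only exists once the volume is mounted. This is just a sketch: wait_for_file is a made-up helper, and the default of 10 tries is arbitrary.

```sh
# wait_for_file PATH [TRIES]
# Poll once per second until PATH exists. Gives up after TRIES attempts
# (default 10) and returns non-zero.
wait_for_file() {
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        [ -e "$1" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

In the commented-out nginx example this would read: wait_for_file /etc/nginx/nginx.conf 15 && /etc/init.d/nginx start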


You can now use /san just like any other folder. Try making two clients, then create a file on one, switch to the other, and you will see the file there. The possibilities are endless. For an example of what is possible, I do this on my app instances:

   * mv /etc/apache2/apache2.conf /etc/apache2/apache2.conf.orig
   * ln -s /san/www/conf/apache2.conf /etc/apache2/apache2.conf

This lets me share my Apache settings across all my app instances!
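
If you do this on every new instance, it helps to make the step safe to run more than once. Below is a minimal sketch; link_shared_conf is a made-up name, and it assumes you want the local file left alone whenever the shared copy is not reachable (for example because /san is not mounted yet).

```sh
# link_shared_conf SHARED LOCAL
# Replace LOCAL with a symlink to SHARED, keeping the original file as
# LOCAL.orig. Does nothing if SHARED does not exist.
link_shared_conf() {
    shared=$1
    conf=$2
    if [ ! -e "$shared" ]; then
        echo "shared file $shared not found; leaving $conf alone" >&2
        return 1
    fi
    # back up a real file only; an existing symlink is simply replaced
    if [ -e "$conf" ] && [ ! -L "$conf" ]; then
        mv "$conf" "$conf.orig"
    fi
    ln -sfn "$shared" "$conf"
}
```

For the Apache example: link_shared_conf /san/www/conf/apache2.conf /etc/apache2/apache2.conf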


Ok, this post got a bit longer than I thought. I will follow it up with another one soon on how to encrypt the connection between your storage nodes and clients using stunnel.


Enjoy!


Copyright © Gluster, Inc. All Rights Reserved.