The Gluster Blog


Automated Hadoop Deployment on GlusterFS with Apache Ambari

Gluster
2013-10-16

The glusterfs-hadoop team is pleased to announce that the Apache Ambari project now supports the automated deployment and configuration of Hadoop on top of GlusterFS.

What is Apache Ambari?

Apache Ambari is a browser-based web UI for provisioning, managing and monitoring Hadoop clusters. Once Apache Ambari is installed on a management server, a user can select a particular Hadoop stack to deploy on a group of servers. The user can also specify which services within the stack to deploy, along with the appropriate configuration for each service. Until now, Apache Ambari has only supported the automated deployment and configuration of Hortonworks Data Platform (HDP) stacks on top of the Hadoop Distributed File System (HDFS).

Deploying HDP 1.3.2 on GlusterFS within Apache Ambari

Over the last several months, a number of engineers from Hortonworks and Red Hat have collaborated within the Apache Ambari Incubator project to modify the core HDP 1.3.2 stack so that users can choose either HDFS or GlusterFS. If a user selects GlusterFS, the Hadoop distribution is configured to use the Hadoop FileSystem plugin for GlusterFS.

Prior to the GlusterFS support in Ambari, one had to separately download Apache Hadoop and configure it to use the glusterfs-hadoop Hadoop FileSystem plugin in order to get Hadoop to run on GlusterFS. All of these steps are now automated.
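To give a rough idea of the manual configuration that Ambari now takes care of, here is a minimal Python sketch that renders a core-site.xml fragment pointing Hadoop at a GlusterFS-backed FileSystem. The property names and values (fs.glusterfs.impl, fs.default.name, the GlusterFileSystem class name and the glusterfs:/// URI) are assumptions based on typical glusterfs-hadoop setups and are not taken from this post; the render_core_site helper is hypothetical. Check the glusterfs-hadoop project wiki for the settings that match your plugin version.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: these property names and values are illustrative only and may
# differ between glusterfs-hadoop releases; verify them against the project wiki.
ASSUMED_PROPERTIES = {
    "fs.glusterfs.impl": "org.apache.hadoop.fs.glusterfs.GlusterFileSystem",
    "fs.default.name": "glusterfs:///",
}


def render_core_site(properties):
    """Render a dict of Hadoop properties as a <configuration> XML string.

    Hypothetical helper for illustration; Ambari generates the real files itself.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_core_site(ASSUMED_PROPERTIES))
```

With Ambari's GlusterFS support, settings of this kind are written for you when GlusterFS is chosen as the Hadoop FileSystem, so a sketch like this is only useful for understanding what the automation replaces.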

Figure 1 – Users can select a stack that includes the GlusterFS Hadoop FileSystem


Figure 2 – Users can choose whether they want HDFS or GlusterFS as the Hadoop FileSystem


To take advantage of this new feature, please follow the instructions on the glusterfs-hadoop project wiki.

So what’s next?

The Apache Ambari project is currently re-architecting the stack definition so that Ambari stacks can be arbitrarily defined and extended. This should go a long way toward enabling broader support for Hadoop Compatible FileSystems and improving Hadoop interoperability.

Lastly, at the time of writing, Apache Ambari only works on RHEL, CentOS, OEL and SLES. We have therefore also been spending time getting Apache Ambari working on Fedora so that the Fedora community has access to it. This should also make integration with the existing glusterfs-hadoop and related projects a lot simpler.

 
