[Gluster-users] df causes hang

phil cryer phil at cryer.us
Fri Feb 4 17:48:50 UTC 2011


On Thu, Feb 3, 2011 at 11:02 PM, Anand Avati <anand.avati at gmail.com> wrote:
> Ah! You must be mounting it wrong. Please mount it from a server (not
> using a volfile):
> mount -t glusterfs SERVER:/vol /mnt
> or
> glusterfs -s SERVER --volfile-id vol /mnt
> that should fix it
> Avati

And that's it! The command I was using was getting info from the old
(pre 3.x) setup in fstab. This command worked for me:
mount -t glusterfs clustr-01:bhl-volume /mnt/glusterfs

and now df -h works, and I can see my files:
df -h | tail -n1
clustr-01:bhl-volume   96T   85T   11T  90% /mnt/glusterfs
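
For anyone checking their own fstab after a similar upgrade, here is a minimal sketch (not from the original thread) that lists the glusterfs entries and flags those whose device field is a local path, as in the old volfile-style setup, rather than the SERVER:VOLUME form shown above. The function name check_gluster_fstab and its output labels are made up for illustration; it only reads the file and changes nothing.

```shell
# Sketch: classify glusterfs entries in an fstab-format file.
# Assumption: standard fstab columns (device, mount point, fs type, ...).
# check_gluster_fstab is a hypothetical helper name, not a gluster tool.
check_gluster_fstab() {
    awk '$1 !~ /^#/ && $3 == "glusterfs" {
        # A device field starting with "/" is a local path (volfile style).
        if ($1 ~ /^\//) print "volfile-style: " $1
        else            print "server-style: " $1
    }' "${1:-/etc/fstab}"
}

# Usage: check_gluster_fstab            (reads /etc/fstab)
#        check_gluster_fstab /some/file (reads the given file)
```

An entry reported as volfile-style is a candidate for rewriting in the SERVER:VOLUME form that worked above.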

I still have the error:
[2011-02-04 12:41:53.79745] E
[client-handshake.c:1079:client_query_portmap_cbk]
bhl-volume-client-98: failed to get the port number for remote
subvolume
[2011-02-04 12:41:53.79807] I [client.c:1590:client_rpc_notify]
bhl-volume-client-98: disconnected
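
To see whether only one client translator is reporting this, a rough sketch (not part of the original exchange) can count the portmap failures per client name in a log file. It assumes each log entry sits on a single line in the file (the entries above are wrapped by the mail client) and that the client name is the fifth whitespace-separated field, as in the lines quoted here. The function name failing_clients is hypothetical.

```shell
# Sketch: count "failed to get the port number" errors per client name.
# Assumes one log entry per line, with fields laid out as:
#   [date time] LEVEL [file:line:function] client-name: message
# failing_clients is a hypothetical helper name, not a gluster tool.
failing_clients() {
    awk '/client_query_portmap_cbk/ && /failed to get the port number/ {
        name = $5
        sub(/:$/, "", name)      # drop the trailing colon
        count[name]++
    }
    END { for (n in count) print count[n], n }' "$1" | sort -k2
}

# Usage: failing_clients /var/log/glusterfs/mnt-glusterfs.log
```

If only bhl-volume-client-98 shows up, the problem is confined to a single brick connection rather than the whole volume.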

But since that's not affecting this, I'll write another email for it.
Thanks again for the help!

P

>
> On Thu, Feb 3, 2011 at 7:07 PM, phil cryer <phil at cryer.us> wrote:
>>
>> Avati - thanks for your reply, my comments below
>>
>> >> [name.c:251:af_inet_client_get_remote_sockaddr] glusterfs: DNS
>> >> resolution failed on host /etc/glusterfs/glusterfs.vol
>>
>> > Please make sure you are able to resolve hostnames as given in volume
>> > info in all of your servers via 'dig'. The logs clearly show that host
>> > resolution seems to be failing.
>>
>> Agreed, however that doesn't seem to be the issue because I can dig the
>> host (they're all defined in my hosts file too so it doesn't have to
>> look them up) named clustr-02 and in fact there are 23 other 'bricks'
>> on that host that are working fine:
>>
>> # gluster volume info | grep clustr-02
>> Brick2: clustr-02:/mnt/data01
>> Brick8: clustr-02:/mnt/data02
>> Brick14: clustr-02:/mnt/data03
>> Brick20: clustr-02:/mnt/data04
>> Brick26: clustr-02:/mnt/data05
>> Brick32: clustr-02:/mnt/data06
>> Brick38: clustr-02:/mnt/data07
>> Brick44: clustr-02:/mnt/data08
>> Brick50: clustr-02:/mnt/data09
>> Brick56: clustr-02:/mnt/data10
>> Brick62: clustr-02:/mnt/data11
>> Brick68: clustr-02:/mnt/data12
>> Brick74: clustr-02:/mnt/data13
>> Brick80: clustr-02:/mnt/data14
>> Brick86: clustr-02:/mnt/data15
>> Brick92: clustr-02:/mnt/data16
>> Brick98: clustr-02:/mnt/data17
>> Brick104: clustr-02:/mnt/data18
>> Brick110: clustr-02:/mnt/data19
>> Brick116: clustr-02:/mnt/data20
>> Brick122: clustr-02:/mnt/data21
>> Brick128: clustr-02:/mnt/data22
>> Brick134: clustr-02:/mnt/data23
>> Brick140: clustr-02:/mnt/data24
>>
>> I logged into that host, unmounted that mount, ran fsck.ext4 on it,
>> but it came back clean.
>>
>> Another thing: the log says "glusterfs: DNS resolution failed on host
>> /etc/glusterfs/glusterfs.vol" - however, there is obviously no host
>> named /etc/glusterfs/glusterfs.vol - does this point to an issue?
>>
>> And lastly, I don't even have a file named /etc/glusterfs/glusterfs.vol:
>>
>> ls -ls /etc/glusterfs
>> -rw-r--r-- 1 root root  229 Jan 16 21:15 glusterd.vol
>> -rw-r--r-- 1 root root 1908 Jan 16 21:15 glusterfsd.vol.sample
>> -rw-r--r-- 1 root root 2005 Jan 16 21:15 glusterfs.vol.sample
>>
>> I created all of the configs via the gluster> commandline tool.
>>
>> Thanks
>>
>> P
>>
>>
>>
>>
>> On Thu, Feb 3, 2011 at 6:39 PM, Anand Avati <anand.avati at gmail.com> wrote:
>> > Please make sure you are able to resolve hostnames as given in volume
>> > info in all of your servers via 'dig'. The logs clearly show that host
>> > resolution seems to be failing.
>> > Avati
>> >
>> > On Thu, Feb 3, 2011 at 1:08 PM, phil cryer <phil at cryer.us> wrote:
>> >>
>> >> This wasn't my issue, but I'm still having the issue. Today I purged
>> >> glusterfs 3.1.1 and installed 3.1.2 fresh from deb. I recreated my
>> >> volume, started it, everything was going fine, mounted the share, then
>> >> ran df -h to see it, and now every few seconds my logs post this:
>> >>
>> >> ==> /var/log/glusterfs/nfs.log <==
>> >> [2011-02-03 15:55:57.145626] E
>> >> [client-handshake.c:1079:client_query_portmap_cbk]
>> >> bhl-volume-client-98: failed to get the port number for remote
>> >> subvolume
>> >> [2011-02-03 15:55:57.145694] I [client.c:1590:client_rpc_notify]
>> >> bhl-volume-client-98: disconnected
>> >>
>> >> ==> /var/log/glusterfs/mnt-glusterfs.log <==
>> >> [2011-02-03 15:55:57.605802] E [common-utils.c:124:gf_resolve_ip6]
>> >> resolver: getaddrinfo failed (Name or service not known)
>> >> [2011-02-03 15:55:57.605834] E
>> >> [name.c:251:af_inet_client_get_remote_sockaddr] glusterfs: DNS
>> >> resolution failed on host /etc/glusterfs/glusterfs.vol
>> >>
>> >> over and over. Any clues as to how I can fix this? This one issue has
>> >> made our entire 100TB store unusable.
>> >>
>> >> and again, gluster volume info shows all the bricks are OK, including
>> >> 98:
>> >>
>> >> gluster> volume info
>> >>
>> >> Volume Name: bhl-volume
>> >> Type: Distributed-Replicate
>> >> Status: Started
>> >> Number of Bricks: 72 x 2 = 144
>> >> Transport-type: tcp
>> >> Bricks:
>> >> [...]
>> >> Brick92: clustr-02:/mnt/data16
>> >> Brick93: clustr-03:/mnt/data16
>> >> Brick94: clustr-04:/mnt/data16
>> >> Brick95: clustr-05:/mnt/data16
>> >> Brick96: clustr-06:/mnt/data16
>> >> Brick97: clustr-01:/mnt/data17
>> >> Brick98: clustr-02:/mnt/data17
>> >> Brick99: clustr-03:/mnt/data17
>> >> Brick100: clustr-04:/mnt/data17
>> >> Brick101: clustr-05:/mnt/data17
>> >> Brick102: clustr-06:/mnt/data17
>> >> Brick103: clustr-01:/mnt/data18
>> >> Brick104: clustr-02:/mnt/data18
>> >> Brick105: clustr-03:/mnt/data18
>> >> [...]
>> >>
>> >>
>> >> P
>> >>
>> >>
>> >> On Mon, Jan 31, 2011 at 4:26 PM, Anand Avati <anand.avati at gmail.com>
>> >> wrote:
>> >> > Can you post your server logs? What happens if you run 'df -k' on
>> >> > your backend export filesystems?
>> >> >
>> >> > Thanks
>> >> > Avati
>> >> >
>> >> > On Mon, Jan 17, 2011 at 5:27 AM, Joe Warren-Meeks
>> >> > <joe at encoretickets.co.uk> wrote:
>> >> >
>> >> >>
>> >> >> (sorry about top-posting.)
>> >> >>
>> >> >> Just changing the timeout would only mask the problem. The real
>> >> >> issue is that running 'df' on either node causes a hang.
>> >> >>
>> >> >> All other operations seem fine, files can be created and deleted as
>> >> >> normal with the results showing up on both.
>> >> >>
>> >> >> I'd like to work out why it's hanging on df so I can fix it and get
>> >> >> my monitoring and cron scripts running again :)
>> >> >>
>> >> >>  -- joe.
>> >> >>
>> >> >> -----Original Message-----
>> >> >> From: gluster-users-bounces at gluster.org
>> >> >> [mailto:gluster-users-bounces at gluster.org] On Behalf Of Daniel Maher
>> >> >> Sent: 17 January 2011 12:48
>> >> >> To: gluster-users at gluster.org
>> >> >> Subject: Re: [Gluster-users] df causes hang
>> >> >>
>> >> >> On 01/17/2011 10:47 AM, Joe Warren-Meeks wrote:
>> >> >> > Hey chaps,
>> >> >> >
>> >> >> > Anyone got any pointers as to what this might be? This is still
>> >> >> > causing a lot of problems for us whenever we attempt to do df.
>> >> >> >
>> >> >> >   -- joe.
>> >> >> >
>> >> >> > -----Original Message-----
>> >> >>
>> >> >> > However, for some reason, they've got into a bit of a state such
>> >> >> > that typing 'df -k' causes both to hang, resulting in a loss of
>> >> >> > service for 42 seconds. I see the following messages in the log
>> >> >> > files:
>> >> >> >
>> >> >> >
>> >> >>
>> >> >> 42 seconds is the default tcp timeout time for any given node - you
>> >> >> could try tuning that down and seeing how it works for you.
>> >> >>
>> >> >>
>> >> >>
>> >> >> http://www.gluster.com/community/documentation/index.php/Gluster_3.1:_Setting_Volume_Options
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Daniel Maher <dma+gluster AT witbe DOT net>
>> >> >> _______________________________________________
>> >> >> Gluster-users mailing list
>> >> >> Gluster-users at gluster.org
>> >> >> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> http://philcryer.com
>> >
>> >
>>
>>
>>
>> --
>> http://philcryer.com
>
>



-- 
http://philcryer.com


