[Gluster-users] uninterruptible processes writing to glusterfsshare

bxmatus at gmail.com bxmatus at gmail.com
Wed Jun 8 19:21:33 UTC 2011


I've been using kernel 2.6.34 + fuse 2.5.5 + gluster 3.2 from the beginning,
and it happened again today ...
php-fpm froze and a reboot was the only solution.

Matus


2011/6/7 Markus Fröhlich <markus.froehlich at xidras.com>:
> hi!
>
> there is no relevant output from dmesg.
> there are no entries in the server log - only the one line in the
> client-server log that I already posted.
>
> the glusterfs version on the server was updated to gfs 3.2.0 more than
> a month ago.
> because of the trouble on the backup server, I deleted the whole backup
> share and started from scratch.
>
>
> I looked for an update of "fuse" and upgraded from 2.7.2-61.18.1 to
> 2.8.5-41.1.
> maybe this helps.
>
> here is the changelog info:
>
> Authors:
> --------
>    Miklos Szeredi <miklos at szeredi.hu>
> Distribution: systemsmanagement:baracus / SLE_11_SP1
> * Tue Mar 29 2011 dbahi at novell.com
> - remove the --no-canonicalize usage for suse_version <= 11.3
>
> * Mon Mar 21 2011 coolo at novell.com
> - licenses package is about to die
>
> * Thu Feb 17 2011 mszeredi at suse.cz
> - In case of failure to add to /etc/mtab don't umount. [bnc#668820]
>  [CVE-2011-0541]
>
> * Tue Nov 16 2010 mszeredi at suse.cz
> - Fix symlink attack for mount and umount [bnc#651598]
>
> * Wed Oct 27 2010 mszeredi at suse.cz
> - Remove /etc/init.d/boot.fuse [bnc#648843]
>
> * Tue Sep 28 2010 mszeredi at suse.cz
> - update to 2.8.5
>  * fix option escaping for fusermount [bnc#641480]
>
> * Wed Apr 28 2010 mszeredi at suse.cz
> - keep examples and internal docs in devel package (from jnweiger)
>
> * Mon Apr 26 2010 mszeredi at suse.cz
> - update to 2.8.4
>  * fix checking for symlinks in umount from /tmp
>  * fix umounting if /tmp is a symlink
>
>
> kind regards
> markus froehlich
>
> On 06.06.2011 21:19, Anthony J. Biacco wrote:
>>
>> Could be fuse; check 'dmesg' for kernel module timeouts.
>>
>> In a similar vein, has anyone seen significant performance/reliability
>> differences between fuse versions? Say, latest source vs. RHEL distro RPM
>> versions.
>>
>> -Tony
>>
>>
>>
>> -----Original Message-----
>> From: Mohit Anchlia<mohitanchlia at gmail.com>
>> Sent: June 06, 2011 1:14 PM
>> To: Markus Fröhlich<markus.froehlich at xidras.com>
>> Cc: gluster-users at gluster.org<gluster-users at gluster.org>
>> Subject: Re: [Gluster-users] uninterruptible processes writing to
>> glusterfsshare
>>
>> Is there anything in the server logs? Does it follow any particular
>> pattern before going into this state?
>>
>> Did you upgrade Gluster or is this new install?
>>
>> 2011/6/6 Markus Fröhlich<markus.froehlich at xidras.com>:
>>>
>>> hi!
>>>
>>> sometimes, on some of our client-servers, processes hang in
>>> uninterruptible sleep ("ps aux" shows STAT "D"), and on one of them the
>>> CPU I/O wait grows to 100% within a few minutes.
>>> you cannot kill such processes - even "kill -9" doesn't work. when you
>>> attach to such a process with "strace", you won't see anything and you
>>> cannot detach again.
>>>
>>> there are only two ways out:
>>> killing the glusterfs process (umount GFS share) or rebooting the server.
>>>
>>> the only log entry I found, was on one client - just a single line:
>>> [2011-06-06 10:44:18.593211] I [afr-common.c:581:afr_lookup_collect_xattr] 0-office-data-replicate-0: data self-heal is pending for /pc-partnerbet-public/Promotionaktionen/Mailakquise_2009/Webmaster_2010/HTML/bilder/Thumbs.db.
>>>
>>> one of the client-servers is a samba server, the other one is an
>>> rsync-based backup server with millions of small files.
>>>
>>> gfs-servers + gfs-clients: SLES11 x86_64, glusterfs V 3.2.0
>>>
>>> and here are the configs from server and client:
>>> server config "/etc/glusterd/vols/office-data/office-data.gfs-01-01.GFS-office-data02.vol":
>>> volume office-data-posix
>>>    type storage/posix
>>>    option directory /GFS/office-data02
>>> end-volume
>>>
>>> volume office-data-access-control
>>>    type features/access-control
>>>    subvolumes office-data-posix
>>> end-volume
>>>
>>> volume office-data-locks
>>>    type features/locks
>>>    subvolumes office-data-access-control
>>> end-volume
>>>
>>> volume office-data-io-threads
>>>    type performance/io-threads
>>>    subvolumes office-data-locks
>>> end-volume
>>>
>>> volume office-data-marker
>>>    type features/marker
>>>    option volume-uuid 3c6e633d-a0bb-4c52-8f05-a2db9bc9c659
>>>    option timestamp-file /etc/glusterd/vols/office-data/marker.tstamp
>>>    option xtime off
>>>    option quota off
>>>    subvolumes office-data-io-threads
>>> end-volume
>>>
>>> volume /GFS/office-data02
>>>    type debug/io-stats
>>>    option latency-measurement off
>>>    option count-fop-hits off
>>>    subvolumes office-data-marker
>>> end-volume
>>>
>>> volume office-data-server
>>>    type protocol/server
>>>    option transport-type tcp
>>>    option auth.addr./GFS/office-data02.allow *
>>>    subvolumes /GFS/office-data02
>>> end-volume
>>>
>>>
>>> --------------
>>> client config "/etc/glusterd/vols/office-data/office-data-fuse.vol":
>>> volume office-data-client-0
>>>    type protocol/client
>>>    option remote-host gfs-01-01
>>>    option remote-subvolume /GFS/office-data02
>>>    option transport-type tcp
>>> end-volume
>>>
>>> volume office-data-replicate-0
>>>    type cluster/replicate
>>>    subvolumes office-data-client-0
>>> end-volume
>>>
>>> volume office-data-write-behind
>>>    type performance/write-behind
>>>    subvolumes office-data-replicate-0
>>> end-volume
>>>
>>> volume office-data-read-ahead
>>>    type performance/read-ahead
>>>    subvolumes office-data-write-behind
>>> end-volume
>>>
>>> volume office-data-io-cache
>>>    type performance/io-cache
>>>    subvolumes office-data-read-ahead
>>> end-volume
>>>
>>> volume office-data-quick-read
>>>    type performance/quick-read
>>>    subvolumes office-data-io-cache
>>> end-volume
>>>
>>> volume office-data-stat-prefetch
>>>    type performance/stat-prefetch
>>>    subvolumes office-data-quick-read
>>> end-volume
>>>
>>> volume office-data
>>>    type debug/io-stats
>>>    option latency-measurement off
>>>    option count-fop-hits off
>>>    subvolumes office-data-stat-prefetch
>>> end-volume
>>>
>>>
>>>  -- Kind regards
>>>
>>> Markus Fröhlich
>>> Technician
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>>
>
>
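
The hang described in this thread shows up as processes stuck in state "D" (uninterruptible sleep). As a minimal sketch for spotting them without "ps", assuming a Linux /proc filesystem, the script below scans /proc/<pid>/stat and also reports /proc/<pid>/wchan, the kernel function the process is waiting in, where the kernel exposes it. The function names are illustrative only and are not part of GlusterFS or fuse; on some kernels wchan reads "0" or is unavailable, in which case "?" is printed.

```python
import os


def parse_stat_line(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the LAST ')' instead of on spaces.
    """
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren].strip())
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def find_uninterruptible(proc_root="/proc"):
    """Return a sorted list of (pid, comm, wchan) for processes in state 'D'."""
    if not os.path.isdir(proc_root):
        return []  # no procfs here (e.g. not Linux)
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited during the scan, or unexpected format
        if state != "D":
            continue
        try:
            with open(os.path.join(proc_root, entry, "wchan")) as fh:
                wchan = fh.read().strip() or "?"
        except OSError:
            wchan = "?"
        blocked.append((pid, comm, wchan))
    return sorted(blocked)


if __name__ == "__main__":
    for pid, comm, wchan in find_uninterruptible():
        print(f"{pid:>7}  {comm:<20}  {wchan}")
```

Run it on the affected client while the hang is in progress. The wchan column can hint at whether the blocked processes are waiting inside the fuse module, but it does not by itself prove where the fault lies.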


