[Gluster-users] Weird glusterfs behaviour after add-bricks and fix-layout

Barak Sason Rofman bsasonro at redhat.com
Mon Nov 9 11:29:37 UTC 2020


Greetings Thomas,

I'll try to assist in determining the root cause and resolving the issue.
The following will help me assist you:

   1. Please create an issue on GitHub with all the relevant details;
   it'll be easier to track
   2. Please provide all client-side, brick-side and fix-layout logs

With that information I could begin an initial assessment of the situation.
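
In the meantime, and purely as a rough sketch (I'm assuming here that the
volume is FUSE-mounted at /mnt/hotcache and that "path/to/affected/dir" and
"path/to/affected/file" stand in for one of the problematic paths - please
adjust for your setup), you could check whether the interrupted fix-layout
left the directory layouts in a bad state:

    # On each brick host: dump the DHT layout xattr of an affected directory
    getfattr -n trusted.glusterfs.dht -e hex \
        /data/glusterfs/drive1/hotcache/path/to/affected/dir

    # Check whether the fix-layout task is still running or reported failures
    gluster volume rebalance hotcache status

    # From a client: ask the volume where it thinks an affected file lives
    getfattr -n trusted.glusterfs.pathinfo -e text \
        /mnt/hotcache/path/to/affected/file

If the layout ranges on the new bricks look missing or inconsistent compared
to the old bricks, that would at least point back at the interrupted
fix-layout, but the logs requested above are what I'd need in order to say
anything definitive.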

Thank you,

On Sun, Nov 8, 2020 at 5:35 PM Thomas Bätzler <t.baetzler at bringe.com> wrote:

> Hi,
>
> the other day we decided to expand our gluster storage by adding two
> bricks, going from a 3x2 to a 4x2 distributed-replicated setup. In order
> to get to this point we had done rolling upgrades from 3.something to
> 5.13 to 7.8, all without issues. We ran into a spot of trouble during
> the fix-layout when both of the new nodes crashed in the space of two
> days. We rebooted them and the fix-layout process seemed to cope by
> starting itself again on these nodes.
>
> Now weird things are happening. We noticed that we can't access some
> files directly anymore. Doing a stat on them returns "file not found".
> However, if we list the directory containing such a file, it shows up as
> present and can subsequently be accessed on the client that ran the
> listing, but not on other clients! Also, if we unmount and remount, the
> file is inaccessible again.
>
> Does anybody have any ideas what's going on here? Is there any way to
> fix the volume without taking it offline for days? We have about 60T of
> data online and we need that data to be consistent and available.
>
> OS is Debian 10 with glusterfs-server 7.8-3.
>
> Volume configuration:
>
> Volume Name: hotcache
> Type: Distributed-Replicate
> Volume ID: 4c006efa-6fd6-4809-93b0-28dd33fee2d2
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 4 x 2 = 8
> Transport-type: tcp
> Bricks:
> Brick1: hotcache1:/data/glusterfs/drive1/hotcache
> Brick2: hotcache2:/data/glusterfs/drive1/hotcache
> Brick3: hotcache3:/data/glusterfs/drive1/hotcache
> Brick4: hotcache4:/data/glusterfs/drive1/hotcache
> Brick5: hotcache5:/data/glusterfs/drive1/hotcache
> Brick6: hotcache6:/data/glusterfs/drive1/hotcache
> Brick7: hotcache7:/data/glusterfs/drive1/hotcache
> Brick8: hotcache8:/data/glusterfs/drive1/hotcache
> Options Reconfigured:
> performance.readdir-ahead: off
> diagnostics.client-log-level: INFO
> diagnostics.brick-log-level: INFO
> diagnostics.count-fop-hits: on
> diagnostics.latency-measurement: on
> server.statedump-path: /var/tmp
> diagnostics.brick-sys-log-level: ERROR
> storage.linux-aio: on
> performance.read-ahead: off
> performance.write-behind-window-size: 4MB
> performance.cache-max-file-size: 200kb
> nfs.disable: on
> performance.cache-refresh-timeout: 1
> performance.io-cache: on
> performance.stat-prefetch: off
> performance.quick-read: on
> performance.io-thread-count: 16
> auth.allow: *
> cluster.readdir-optimize: on
> performance.flush-behind: off
> transport.address-family: inet
> cluster.self-heal-daemon: enable
>
> TIA!
>
> Best regards,
> Thomas Bätzler
> --
> BRINGE Informationstechnik GmbH
> Zur Seeplatte 12
> D-76228 Karlsruhe
> Germany
>
> Fon: +49 721 94246-0
> Fon: +49 171 5438457
> Fax: +49 721 94246-66
> Web: http://www.bringe.de/
>
> Geschäftsführer: Dipl.-Ing. (FH) Martin Bringe
> Ust.Id: DE812936645, HRB 108943 Mannheim
>


-- 
Barak Sason Rofman

Gluster Storage Development

Red Hat Israel <https://www.redhat.com/>

34 Jerusalem rd. Ra'anana, 43501

bsasonro at redhat.com    T: +972-9-7692304
M: +972-52-4326355