<div dir="ltr"><div>Milos,<br><br></div>I just managed to take a look into a similar issue and my analysis is at [1]. I remember you mentioning about some incorrect /etc/hosts entries which lead to this same problem in earlier case, do you mind to recheck the same?<br><br>[1] <a href="http://www.gluster.org/pipermail/gluster-users/2016-December/029443.html">http://www.gluster.org/pipermail/gluster-users/2016-December/029443.html</a> </div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Dec 14, 2016 at 2:57 AM, MiloÅ¡ ÄŒuÄulović - MDPI <span dir="ltr"><<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi All,<br>
<br>
Moving forward with my issue, sorry for the late reply!<br>
<br>
I had some issues with the storage2 server (original volume), then decided to use 3.9.0, so I now have the latest version.<br>
<br>
For that, I manually synced all the files to the storage server. I installed Gluster 3.9.0 there, started it, created a new volume called storage, and all seems to work OK.<br>
<br>
Now I need to create my replicated volume (add a new brick on the storage2 server). Almost all the files are there. So, on the storage server I ran:<br>
<br>
* sudo gluster peer probe storage2<br>
* sudo gluster volume add-brick storage replica 2 storage2:/data/data-cluster force<br>
<br>
But there I am receiving "volume add-brick: failed: Host storage2 is not in 'Peer in Cluster' state".<br>
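<br>
For reference, this is the kind of pre-check I plan to rerun before retrying the add-brick (a rough sketch, same hostnames as above; exact output may differ by version):<br>
<br>
* sudo gluster peer status   (storage2 should show "State: Peer in Cluster (Connected)")<br>
* getent hosts storage2      (confirm the name resolves to the expected address)<br>
* sudo gluster pool list     (UUID / hostname / connection state overview)<br>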
<br>
Any idea?<span class="im HOEnZb"><br>
<br>
- Kindest regards,<br>
<br>
Milos Cuculovic<br>
IT Manager<br>
<br>
---<br>
MDPI AG<br>
Postfach, CH-4020 Basel, Switzerland<br>
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland<br>
Tel. +41 61 683 77 35<br>
Fax +41 61 302 89 18<br>
Email: <a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a><br>
Skype: milos.cuculovic.mdpi<br>
<br></span><div class="HOEnZb"><div class="h5">
On 08.12.2016 17:52, Ravishankar N wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On 12/08/2016 09:44 PM, Miloš Čučulović - MDPI wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I was able to fix the sync by rsync-ing all the directories, and then the<br>
heal started. The next problem :) : as soon as there are files on the<br>
new brick, the gluster mount also serves this brick to clients, but<br>
the new brick is not ready yet since the sync is not done, so it<br>
results in missing files on the client side. I temporarily removed the new<br>
brick; now I am running a manual rsync and will add the brick again,<br>
hoping this will work.<br>
<br>
What mechanism manages this? I guess there is something built in<br>
to make a replica brick available only once the data is<br>
completely synced.<br>
</blockquote>
This mechanism was introduced in 3.7.9 or 3.7.10<br>
(<a href="http://review.gluster.org/#/c/13806/" rel="noreferrer" target="_blank">http://review.gluster.org/#/c/13806/</a>). Before that version, you<br>
needed to manually set some xattrs on the bricks so that healing could<br>
happen in parallel while the client would still serve reads from the<br>
original brick. I can't find the link to the doc that describes the<br>
steps for setting those xattrs. :-(<br>
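<br>
On a version with that fix, the heal should kick in automatically after the add-brick; a quick way to watch its progress (a sketch, assuming the volume is named "storage"):<br>
<br>
* gluster volume heal storage info                    (files still pending heal, per brick)<br>
* gluster volume heal storage statistics heal-count   (just the pending counts)<br>
* gluster volume heal storage full                    (force a full crawl if nothing seems to heal)<br>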
<br>
Calling it a day,<br>
Ravi<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
- Kindest regards,<br>
<br>
Milos Cuculovic<br>
IT Manager<br>
<br>
---<br>
MDPI AG<br>
Postfach, CH-4020 Basel, Switzerland<br>
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland<br>
Tel. +41 61 683 77 35<br>
Fax +41 61 302 89 18<br>
Email: <a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a><br>
Skype: milos.cuculovic.mdpi<br>
<br>
On 08.12.2016 16:17, Ravishankar N wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On 12/08/2016 06:53 PM, Atin Mukherjee wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI<br>
<<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>> wrote:<br>
<br>
  Ah, damn! I found the issue. On the storage server, the storage2<br>
  IP address was wrong; I had swapped two digits in the /etc/hosts<br>
  file, sorry for that :(<br>
<br>
  I was able to add the brick now and started the heal, but still no<br>
  data transfer is visible.<br>
<br>
</blockquote>
1. Are the files getting created on the new brick though?<br>
2. Can you provide the output of `getfattr -d -m . -e hex<br>
/data/data-cluster` on both bricks?<br>
3. Is it possible to attach gdb to the self-heal daemon on the original<br>
(old) brick and get a backtrace?<br>
  `gdb -p <pid of the self-heal daemon on the original brick>`<br>
   thread apply all bt   --> share this output<br>
  quit gdb.<br>
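(If it is easier, the same backtrace can be captured non-interactively; this is just a sketch and assumes the self-heal daemon's command line contains "glustershd" and pgrep matches a single pid:)<br>
  sudo gdb -p $(pgrep -f glustershd) -batch -ex "thread apply all bt" > /tmp/shd_backtrace.txt<br>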
<br>
<br>
-Ravi<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
@Ravi/Pranith - can you help here?<br>
<br>
<br>
<br>
  By doing gluster volume status, I have<br>
<br>
  Status of volume: storage<br>
  Gluster process                            TCP Port  RDMA Port  Online  Pid<br>
  ------------------------------------------------------------------------------<br>
  Brick storage2:/data/data-cluster          49152     0          Y       23101<br>
  Brick storage:/data/data-cluster           49152     0          Y       30773<br>
  Self-heal Daemon on localhost              N/A       N/A        Y       30050<br>
  Self-heal Daemon on storage                N/A       N/A        Y       30792<br>
<br>
<br>
  Any idea?<br>
<br>
  On storage I have:<br>
  Number of Peers: 1<br>
<br>
  Hostname: 195.65.194.217<br>
  Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0<br>
  State: Peer in Cluster (Connected)<br>
<br>
<br>
  - Kindest regards,<br>
<br>
  Milos Cuculovic<br>
  IT Manager<br>
<br>
  ---<br>
  MDPI AG<br>
  Postfach, CH-4020 Basel, Switzerland<br>
  Office: St. Alban-Anlage 66, 4052 Basel, Switzerland<br>
  Tel. +41 61 683 77 35<br>
  Fax +41 61 302 89 18<br>
  Email: <a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a> <mailto:<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>><br>
  Skype: milos.cuculovic.mdpi<br>
<br>
  On 08.12.2016 13:55, Atin Mukherjee wrote:<br>
<br>
    Can you resend the attachment as a zip? I am unable to extract the<br>
    content. We shouldn't have a 0-byte info file. What does gluster peer<br>
    status output say?<br>
<br>
    On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI<br>
    <<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>> wrote:<br>
<br>
      I hope you received my last email Atin, thank you!<br>
<br>
      - Kindest regards,<br>
<br>
      Milos Cuculovic<br>
      IT Manager<br>
<br>
      ---<br>
      MDPI AG<br>
      Postfach, CH-4020 Basel, Switzerland<br>
      Office: St. Alban-Anlage 66, 4052 Basel, Switzerland<br>
      Tel. +41 61 683 77 35<br>
      Fax +41 61 302 89 18<br>
      Email: <a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a> <mailto:<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>><br>
    <mailto:<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a> <mailto:<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>>><br>
      Skype: milos.cuculovic.mdpi<br>
<br>
      On 08.12.2016 10:28, Atin Mukherjee wrote:<br>
<br>
<br>
        ---------- Forwarded message ----------<br>
        From: *Atin Mukherjee* <<a href="mailto:amukherj@redhat.com" target="_blank">amukherj@redhat.com</a>><br>
        Date: Thu, Dec 8, 2016 at 11:56 AM<br>
        Subject: Re: [Gluster-users] Replica brick not working<br>
        To: Ravishankar N <<a href="mailto:ravishankar@redhat.com" target="_blank">ravishankar@redhat.com</a>><br>
        Cc: Miloš Čučulović - MDPI <<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>>, Pranith Kumar Karampuri <<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>, gluster-users <<a href="mailto:gluster-users@gluster.org" target="_blank">gluster-users@gluster.org</a>><br>
<br>
<br>
<br>
<br>
        On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N<br>
        <<a href="mailto:ravishankar@redhat.com" target="_blank">ravishankar@redhat.com</a>> wrote:<br>
<br>
          On 12/08/2016 10:43 AM, Atin Mukherjee wrote:<br>
<br>
            From the log snippet:<br>
<br>
            [2016-12-07 09:15:35.677645] I [MSGID: 106482] [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management: Received add brick req<br>
            [2016-12-07 09:15:35.677708] I [MSGID: 106062] [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management: replica-count is 2<br>
            [2016-12-07 09:15:35.677735] E [MSGID: 106291] [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:<br>
<br>
            The last log entry indicates that we hit the code path in<br>
            gd_addbr_validate_replica_count ():<br>
<br>
                    if (replica_count == volinfo->replica_count) {<br>
                            if (!(total_bricks % volinfo->dist_leaf_count)) {<br>
                                    ret = 1;<br>
                                    goto out;<br>
                            }<br>
                    }<br>
<br>
<br>
          It seems unlikely that this snippet was hit, because we print the E<br>
          [MSGID: 106291] in the above message only if ret == -1.<br>
          gd_addbr_validate_replica_count() returns -1 without populating<br>
          err_str only when volinfo->type doesn't match any of the known<br>
          volume types, so perhaps volinfo->type is corrupted?<br>
<br>
<br>
        You are right, I missed that ret is set to 1 here in the above<br>
        snippet.<br>
<br>
        @Milos - Can you please provide us the volume info file from<br>
        /var/lib/glusterd/vols/<volname>/ from all three nodes so we can<br>
        continue the analysis?<br>
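        (For reference, a sketch of how I'd grab that file on each node,<br>
        assuming the volume is named "storage":)<br>
          cat /var/lib/glusterd/vols/storage/info<br>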
<br>
<br>
<br>
          -Ravi<br>
<br>
            @Pranith, Ravi - Milos was trying to convert a dist (1 X 1)<br>
            volume to a replicate (1 X 2) using add-brick and hit this issue<br>
            where add-brick failed. The cluster is operating with 3.7.6.<br>
            Could you help with identifying in which scenario this code path<br>
            can be hit? One straightforward issue I see here is the missing<br>
            err_str in this path.<br>
<br>
<br>
<br>
<br>
<br>
<br>
        --<br>
<br>
        ~ Atin (atinm)<br>
<br>
<br>
<br>
        --<br>
<br>
        ~ Atin (atinm)<br>
<br>
<br>
<br>
<br>
    --<br>
<br>
    ~ Atin (atinm)<br>
<br>
<br>
<br>
<br>
--<br>
<br>
~ Atin (atinm)<br>
</blockquote>
<br>
<br>
</blockquote></blockquote>
<br>
<br>
</blockquote>
</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><br></div><div>~ Atin (atinm)<br></div></div></div></div>
</div>