[Gluster-users] (3.1.5-1) "Another operation is in progress, please retry after some time"

Tomoaki Sato tsato at valinux.co.jp
Tue Aug 23 17:24:11 PDT 2011


kp,

I changed my scripts based on your comments.
The following sequence works fine in my environment.

on baz-1:
# <register baz-1-private to the DNS>
# gluster volume create baz baz-1-private:/mnt/brick
# gluster volume start baz
# <register baz-1-public to the DNS>

on baz-2 through baz-5:
# <register baz-<me>-private to the DNS>
# <wait until baz-<anterior>-public appears on the DNS>
# ssh baz-1-private gluster peer probe baz-<me>-private
# ssh baz-1-private gluster volume add-brick baz baz-<me>-private:/mnt/brick
# <register baz-<me>-public to the DNS>

<anterior> = <me> - 1
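
For reference, the per-node steps above could be scripted roughly as follows.
This is only a sketch under assumptions: the helper names (run, wait_for_dns,
join_cluster) and the DRY_RUN variable are made up for illustration, 'getent
hosts' is assumed to be available and to reflect what the DNS returns, and the
DNS registration steps are left out because they depend on the site.

```shell
#!/bin/sh
# Sketch of the join sequence for baz-2 through baz-5.
# All function names and DRY_RUN are hypothetical, not part of gluster.

# Run a command, or only print it when DRY_RUN is set (for checking the order).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Block until host name $1 resolves, polling every $2 seconds (default 5).
# Assumes 'getent hosts' consults the DNS on this system.
wait_for_dns() {
    until getent hosts "$1" >/dev/null 2>&1; do
        sleep "${2:-5}"
    done
}

# Join node number $1 (2..5) to the cluster, strictly after its predecessor:
# wait for baz-<anterior>-public, then probe and add-brick via baz-1-private.
join_cluster() {
    n=$1
    me="baz-$n-private"
    wait_for_dns "baz-$(( n - 1 ))-public"
    run ssh baz-1-private gluster peer probe "$me" &&
    run ssh baz-1-private gluster volume add-brick baz "$me:/mnt/brick"
}
```

On baz-3, for example, one would register baz-3-private, call 'join_cluster 3',
and then register baz-3-public so that baz-4 may proceed. Because each node
waits for its predecessor's public name, only one peer probe / add-brick pair
is in flight at a time.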

Thanks,
tomo

(2011/08/23 14:32), krish wrote:
> Hi Tomoaki,
>
> To avoid the issue of the 'cluster' going into an undefined state, you need to avoid issuing
> peer addition/deletion commands in tandem with volume operations (create, add-brick,
> stop, delete etc).
>
> Part of the problem is that every volume operation is performed so that all the
> peers in the cluster are kept up to date about the 'proceedings'.
> Adding new members while the volume is being 'manipulated' leaves the new peer
> in a rather special situation: it does not hold the same 'view' of the ongoing
> volume operation as the other peers. This is the summary of the problem.
>
> thanks,
> kp
>
> On 08/23/2011 10:28 AM, Tomoaki Sato wrote:
>> Hi kp,
>>
>> I look forward to the future version of gluster.
>> Do you have any recommendations for avoiding the issue for now?
>> I mentioned the workaround below, but it has turned out not to be stable.
>>
>>>> I've noticed that the following commands are stable.
>>>>
>>>> on baz-2-private through baz-5-private:
>>>> # <wait baz-1-private appears on the DNS>
>>>> # ssh baz-1-private gluster peer probe <me>
>>>> # ssh baz-1-private gluster volume add-brick baz <me>:/mnt/brick
>>>> # <register me to the DNS>
>>
>> Thanks,
>> tomo
>>
>> (2011/08/22 15:26), krish wrote:
>>> Hi Tomoaki,
>>>
>>> Issuing peer-related commands like 'peer probe' and 'peer detach' concurrently with volume operations
>>> can cause the 'cluster' to get into an undefined state. We are working on getting the glusterd cluster to handle concurrent commands robustly. See http://bugs.gluster.com/show_bug.cgi?id=3320 for updates on this issue.
>>>
>>> thanks,
>>> kp
>>>
>>> On 08/22/2011 10:39 AM, Tomoaki Sato wrote:
>>>> Hi kp,
>>>>
>>>> I've reproduced the issue in my environment.
>>>> Please find the attached taz.
>>>>
>>>> There are 5 VMs, baz-1-private through baz-5-private.
>>>> On each VM, the following commands are issued concurrently.
>>>>
>>>> on baz-1-private:
>>>> # gluster volume create baz baz-1-private:/mnt/brick
>>>> # gluster volume start baz
>>>> # <register baz-1-private to DNS>
>>>>
>>>> on baz-2-private through baz-5-private:
>>>> # <wait baz-1-private appears on the DNS>
>>>> # ssh baz-1-private gluster peer probe <me>
>>>> # gluster volume add-brick baz <me>:/mnt/brick
>>>> # <register me to the DNS>
>>>>
>>>> <me> = baz-n-private (n: 2,3,4,5)
>>>>
>>>> I've noticed that the following commands are stable.
>>>>
>>>> on baz-2-private through baz-5-private:
>>>> # <wait baz-1-private appears on the DNS>
>>>> # ssh baz-1-private gluster peer probe <me>
>>>> # ssh baz-1-private gluster volume add-brick baz <me>:/mnt/brick
>>>> # <register me to the DNS>
>>>>
>>>> thanks,
>>>>
>>>> tomo
>>>>
>>>>
>>>> (2011/08/20 14:37), krish wrote:
>>>>> Hi Tomoaki,
>>>>>
>>>>> Can you attach the glusterd log files of the peers seeing the problem?
>>>>> Restarting the glusterd(s) should solve the problem. Let me look at the log files and I will let
>>>>> you know if anything else can be done to resolve it.
>>>>>
>>>>> thanks,
>>>>> kp
>>>>>
>>>>>
>>>>> On 08/18/2011 07:37 AM, Tomoaki Sato wrote:
>>>>>> Hi,
>>>>>>
>>>>>> baz-X-private and baz-Y-private, 2 newly probed peers, each issued 'gluster volume add-brick baz baz-{X|Y}-private:/mnt/brick' within a very short period.
>>>>>> Both 'add-brick's returned without the "Add Brick successful" message.
>>>>>> Since then, 'add-brick' returns "Another operation is in progress, please retry after some time" on both peers every time.
>>>>>> How can I clear this situation?
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> tomo
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>>>>
>>>>
>>>
>>
>


