[Gluster-users] Input/output error

siga hiro hirokisiga at gmail.com
Thu Aug 25 10:48:53 UTC 2011


Thank you Pranith Kumar K.

I rebuilt the environment because the gfid values reported by the
getfattr command did not match between the two machines.

But now I get "Permission denied".

 [172.23.0.1]
  glusterfs-core-3.2.1-1
  glusterfs-fuse-3.2.1-1

  mount -t glusterfs -o tcp,soft,timeo=3 172.23.0.1:/syncdata /syncdata

 [172.23.0.2]
  glusterfs-core-3.2.1-1
  glusterfs-fuse-3.2.1-1

  mount -t glusterfs -o tcp,soft,timeo=3 172.23.0.2:/syncdata /syncdata

 [check all]
 # getfattr -d -m . /home/syncdata
 # getfattr -d -m . /home/syncdata/testdata
 # md5sum

 The results are the same on both machines.
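
 For reference, the trusted.gfid values that getfattr prints can be
 compared more easily once they are decoded. The following is only a
 minimal sketch in Python, not part of GlusterFS: the helper names
 decode_gfid and gfids_match are made up for illustration. It relies on
 getfattr marking base64 values with "0s" and hex values with "0x", and
 uses the values quoted in the 2011/8/24 message further down.

```python
import base64
import uuid


def decode_gfid(value: str) -> uuid.UUID:
    """Decode a trusted.gfid value as printed by getfattr into a UUID.

    getfattr prefixes base64-encoded values with "0s" and hex-encoded
    values with "0x"; a gfid is 16 raw bytes in either encoding.
    (decode_gfid is a hypothetical helper, not a GlusterFS tool.)
    """
    if value.startswith("0s"):
        raw = base64.b64decode(value[2:])
    elif value.startswith("0x"):
        raw = bytes.fromhex(value[2:])
    else:
        raise ValueError(f"unrecognised getfattr encoding: {value!r}")
    # uuid.UUID raises ValueError unless exactly 16 bytes are supplied.
    return uuid.UUID(bytes=raw)


def gfids_match(value_a: str, value_b: str) -> bool:
    """Return True when two bricks report the same gfid for an entry."""
    return decode_gfid(value_a) == decode_gfid(value_b)


if __name__ == "__main__":
    # Root directory of the brick, identical on both machines.
    print(decode_gfid("0sAAAAAAAAAAAAAAAAAAAAAQ=="))
    # /home/syncdata/testdata on 172.23.0.1 versus 172.23.0.2.
    print(gfids_match("0st0UDRLu7TEqt2W8wc30mCQ==",
                      "0shkqegy6JT0KgZjAlx3Db0w=="))
```

 The decoded root value has the same form as the gfid shown for the
 root file handle in nfs.log, so the decoded brick values can be
 checked against the log directly.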
 So I tried again.

  [172.23.0.2]
  umount /syncdata and stop glusterd

  update gluster
  glusterfs-core-3.2.1-1 -> 3.2.2-1
  glusterfs-fuse-3.2.1-1 -> 3.2.2-1

  start glusterd and mount over NFS
  mount -t nfs -o nolock,nfsvers=3,tcp,hard,intr 172.23.0.2:/syncdata /syncdata

  # ls -al /syncdata/testdata/
  ls: /syncdata/testdata/: Permission denied

  Because nothing was written to nfs.log, I set a gluster option
  (gluster volume set syncdata diagnostics.client-log-level DEBUG).
  nfs.log (ls -al /syncdata/testdata/):
[2011-08-25 19:39:56.86272] D [rpcsvc.c:709:nfs_rpcsvc_conn_init]
0-nfsrpc: New connection inited: sockfd: 13
[2011-08-25 19:39:56.86327] D
[rpcsvc.c:2733:nfs_rpcsvc_conn_listening_handler] 0-nfsrpc: New
Connection
[2011-08-25 19:39:56.86397] D
[rpcsvc.c:1953:nfs_rpcsvc_request_create] 0-nfsrpc: RPC XID: 44398d39,
Ver: 2, Program: 100003, ProgVers: 3, Proc: 4
[2011-08-25 19:39:56.86420] D [rpcsvc.c:1370:nfs_rpcsvc_program_actor]
0-nfsrpc: Actor found: NFS3 - ACCESS
[2011-08-25 19:39:56.86441] D
[nfs3-helpers.c:2164:nfs3_log_common_call] 0-nfs-nfsv3: XID: 44398d39,
ACCESS: args: FH: hashcount 0, exportid
b3004e82-8e15-4016-afdf-855b58034610, gfid
00000000-0000-0000-0000-000000000001
[2011-08-25 19:39:56.86714] D
[nfs3-helpers.c:2296:nfs3_log_common_res] 0-nfs-nfsv3: XID: 44398d39,
ACCESS: NFS: 0(Call completed successfully.), POSIX: 6(No such device
or address)
[2011-08-25 19:39:56.86809] D
[rpcsvc.c:1953:nfs_rpcsvc_request_create] 0-nfsrpc: RPC XID: 45398d39,
Ver: 2, Program: 100003, ProgVers: 3, Proc: 1
[2011-08-25 19:39:56.86830] D [rpcsvc.c:1370:nfs_rpcsvc_program_actor]
0-nfsrpc: Actor found: NFS3 - GETATTR
[2011-08-25 19:39:56.86846] D
[nfs3-helpers.c:2164:nfs3_log_common_call] 0-nfs-nfsv3: XID: 45398d39,
GETATTR: args: FH: hashcount 0, exportid
b3004e82-8e15-4016-afdf-855b58034610, gfid
00000000-0000-0000-0000-000000000001
[2011-08-25 19:39:56.87887] D
[nfs3-helpers.c:2296:nfs3_log_common_res] 0-nfs-nfsv3: XID: 45398d39,
GETATTR: NFS: 0(Call completed successfully.), POSIX: 107(Transport
endpoint is not connected)
[2011-08-25 19:39:56.87981] D
[rpcsvc.c:1953:nfs_rpcsvc_request_create] 0-nfsrpc: RPC XID: 46398d39,
Ver: 2, Program: 100003, ProgVers: 3, Proc: 1
[2011-08-25 19:39:56.88003] D [rpcsvc.c:1370:nfs_rpcsvc_program_actor]
0-nfsrpc: Actor found: NFS3 - GETATTR
[2011-08-25 19:39:56.88018] D
[nfs3-helpers.c:2164:nfs3_log_common_call] 0-nfs-nfsv3: XID: 46398d39,
GETATTR: args: FH: hashcount 1, exportid
b3004e82-8e15-4016-afdf-855b58034610, gfid
f3599ede-60d0-4167-a470-5a6c9036d243
[2011-08-25 19:39:56.88483] D
[nfs3-helpers.c:2296:nfs3_log_common_res] 0-nfs-nfsv3: XID: 46398d39,
GETATTR: NFS: 0(Call completed successfully.), POSIX: 0(Success)
[2011-08-25 19:39:56.88850] D
[rpcsvc.c:1953:nfs_rpcsvc_request_create] 0-nfsrpc: RPC XID: 47398d39,
Ver: 2, Program: 100003, ProgVers: 3, Proc: 1
[2011-08-25 19:39:56.88873] D [rpcsvc.c:1370:nfs_rpcsvc_program_actor]
0-nfsrpc: Actor found: NFS3 - GETATTR
[2011-08-25 19:39:56.88889] D
[nfs3-helpers.c:2164:nfs3_log_common_call] 0-nfs-nfsv3: XID: 47398d39,
GETATTR: args: FH: hashcount 1, exportid
b3004e82-8e15-4016-afdf-855b58034610, gfid
f3599ede-60d0-4167-a470-5a6c9036d243
[2011-08-25 19:39:56.89349] D
[nfs3-helpers.c:2296:nfs3_log_common_res] 0-nfs-nfsv3: XID: 47398d39,
GETATTR: NFS: 0(Call completed successfully.), POSIX: 0(Success)
[2011-08-25 19:39:56.89437] D
[rpcsvc.c:1953:nfs_rpcsvc_request_create] 0-nfsrpc: RPC XID: 48398d39,
Ver: 2, Program: 100003, ProgVers: 3, Proc: 4
[2011-08-25 19:39:56.89457] D [rpcsvc.c:1370:nfs_rpcsvc_program_actor]
0-nfsrpc: Actor found: NFS3 - ACCESS
[2011-08-25 19:39:56.89472] D
[nfs3-helpers.c:2164:nfs3_log_common_call] 0-nfs-nfsv3: XID: 48398d39,
ACCESS: args: FH: hashcount 1, exportid
b3004e82-8e15-4016-afdf-855b58034610, gfid
f3599ede-60d0-4167-a470-5a6c9036d243
[2011-08-25 19:39:56.89818] D
[nfs3-helpers.c:2296:nfs3_log_common_res] 0-nfs-nfsv3: XID: 48398d39,
ACCESS: NFS: 0(Call completed successfully.), POSIX: 0(Success)
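
The calls that failed are easy to miss in a debug log like this one,
because every reply reports "NFS: 0" and only the POSIX code differs.
The following is a rough Python sketch for listing them. It assumes
only the nfs3_log_common_res line format visible above; RESULT_RE and
failed_calls are names invented for this example, not GlusterFS tools.

```python
import re

# Matches the reply lines written by nfs3_log_common_res, for example:
#   XID: 44398d39, ACCESS: NFS: 0(...), POSIX: 6(No such device or address)
# The pattern is an assumption based on the log excerpt in this message.
RESULT_RE = re.compile(
    r"XID: (?P<xid>[0-9a-f]+), (?P<op>[A-Z]+): "
    r"NFS: (?P<nfs>\d+)\([^)]*\), "
    r"POSIX: (?P<posix>\d+)\((?P<msg>[^)]*)\)"
)


def failed_calls(log_text: str):
    """Return (xid, operation, posix_errno, message) for non-zero POSIX codes."""
    # The mail client wrapped the log entries, so collapse all whitespace
    # (including line breaks) to single spaces before matching.
    flat = " ".join(log_text.split())
    return [
        (m["xid"], m["op"], int(m["posix"]), m["msg"])
        for m in RESULT_RE.finditer(flat)
        if int(m["posix"]) != 0
    ]


if __name__ == "__main__":
    sample = """[2011-08-25 19:39:56.87887] D
[nfs3-helpers.c:2296:nfs3_log_common_res] 0-nfs-nfsv3: XID: 45398d39,
GETATTR: NFS: 0(Call completed successfully.), POSIX: 107(Transport
endpoint is not connected)"""
    for call in failed_calls(sample):
        print(call)
```

Run over the excerpt above, this kind of filter would single out the
ACCESS and GETATTR replies for the root file handle (POSIX 6 and 107),
while the calls on the testdata gfid all report success.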

Is this bug 2921?

thanks.


2011/8/25 Pranith Kumar K <pranithk at gluster.com>:
> hi siga hiro,
>       The gfids of the directory /home/syncdata/testdata/ are different on
> the two machines, most probably because of bug 2921.
> This could have happened for the following reason: you created the
> directory testdata before the volume was mounted and then accessed it in
> parallel.
> The gfid is assigned to the entry when it is first accessed. Please do
> respond back if this is not the case.
> If you think both the directories contain the same files, then you can remove
> the gfid xattr by executing setfattr -x trusted.gfid /home/syncdata/testdata
> on both the machines, and then access it from just one of the mounts.
> Then this problem goes away.
>    If you are using more than one mount to access fresh data under the
> volumes, then first mount one client and do a "find <mount-point>", and then
> mount the rest of the clients and use them.
> The find command will assign gfids that won't conflict.
>
> Pranith.
>
> On 08/24/2011 04:28 PM, siga hiro wrote:
>>
>> Thank you Pranith Kumar K.
>>
>> ------172.23.0.1-------------------------------
>> # getfattr -d -m . /home/syncdata/
>> getfattr: Removing leading '/' from absolute path names
>> # file: home/syncdata
>> trusted.afr.syncdata-client-0=0sAAAAAAAAAAAAAAAA
>> trusted.afr.syncdata-client-1=0sAAAAAAAAAAAAAAAA
>> trusted.gfid=0sAAAAAAAAAAAAAAAAAAAAAQ==
>> trusted.glusterfs.quota.dirty=0sMAA=
>> trusted.glusterfs.quota.size=0sAAAAAAAAAAA=
>> trusted.glusterfs.test="working\000"
>>
>> # getfattr -d -m . /home/syncdata/testdata/
>> getfattr: Removing leading '/' from absolute path names
>> # file: home/syncdata/testdata
>> trusted.afr.syncdata-client-0=0sAAAAAAAAAAAAAAAA
>> trusted.afr.syncdata-client-1=0sAAAAAAAAAAAAAAAA
>> trusted.gfid=0st0UDRLu7TEqt2W8wc30mCQ==
>>
>> ------172.23.0.2-------------------------------
>> # getfattr -d -m . /home/syncdata/
>> getfattr: Removing leading '/' from absolute path names
>> # file: home/syncdata
>> trusted.afr.syncdata-client-0=0sAAAAAAAAAAAAAAAA
>> trusted.afr.syncdata-client-1=0sAAAAAAAAAAAAAAAA
>> trusted.gfid=0sAAAAAAAAAAAAAAAAAAAAAQ==
>> trusted.glusterfs.quota.dirty=0sMAA=
>> trusted.glusterfs.quota.size=0sAAAAAAAAAAA=
>> trusted.glusterfs.test="working\000"
>>
>> # getfattr -d -m . /home/syncdata/testdata/
>> getfattr: Removing leading '/' from absolute path names
>> # file: home/syncdata/testdata
>> trusted.afr.syncdata-client-0=0sAAAAAAAAAAAAAAAA
>> trusted.afr.syncdata-client-1=0sAAAAAAAAAAAAAAAA
>> trusted.gfid=0shkqegy6JT0KgZjAlx3Db0w==
>>
>>
>> thanks.
>>
>> 2011/8/24 Pranith Kumar K<pranithk at gluster.com>:
>>>
>>> hi siga hiro,
>>>     Can you provide the output of:
>>> getfattr -d -m . /home/syncdata
>>> getfattr -d -m . /home/syncdata/testdata
>>>
>>> On both the machines.
>>> Pranith
>>>
>>> On 08/24/2011 02:11 PM, siga hiro wrote:
>>>>
>>>> Thank you for the quick answer.
>>>>
>>>>> 1) http://bugs.gluster.com/show_bug.cgi?id=2921 (most likely this)
>>>>
>>>> Isn't this solved in GlusterFS 3.2.3?
>>>>
>>>> I have installed GlusterFS 3.2.3 on 172.23.0.2
>>>> (downloaded from
>>>> http://download.gluster.com/pub/gluster/glusterfs/LATEST/CentOS/).
>>>> I also confirmed that the md5sums match on 172.23.0.1 and
>>>> 172.23.0.2.
>>>> # md5sum *
>>>> 8012eaf68e8ee8153d1b4f317dea385d  error_log.txt
>>>> 88f70311135f82578a69866bce0564ba  error.log
>>>>
>>>> mount 172.23.0.2
>>>>   ->    mount -t glusterfs -o tcp,soft,timeo=3 172.23.0.2:/syncdata /syncdata
>>>>
>>>> But...
>>>> [root@172.23.0.2 /]# ls -al /syncdata/testdata/
>>>> ls: reading directory /syncdata/testdata/: Input/output error
>>>>
>>>> /var/log/glusterfs/nfs.log
>>>> [2011-08-24 17:06:14.447688] I [rpc-clnt.c:1531:rpc_clnt_reconfig]
>>>> 0-syncdata-client-0: changing port to 24009 (from 0)
>>>> [2011-08-24 17:06:17.453688] I
>>>> [client-handshake.c:1082:select_server_supported_programs]
>>>> 0-syncdata-client-1: Using Program GlusterFS-3.1.0, Num (1298437),
>>>> Version (310)
>>>> [2011-08-24 17:06:17.456448] I
>>>> [client-handshake.c:913:client_setvolume_cbk] 0-syncdata-client-1:
>>>> Connected to 172.23.11.121:24009, attached to remote volume
>>>> '/home/syncdata'.
>>>> [2011-08-24 17:06:17.456517] I [afr-common.c:2611:afr_notify]
>>>> 0-syncdata-replicate-0: Subvolume 'syncdata-client-1' came back up;
>>>> going online.
>>>> [2011-08-24 17:06:17.456957] I
>>>> [client-handshake.c:1082:select_server_supported_programs]
>>>> 0-syncdata-client-0: Using Program GlusterFS-3.1.0, Num (1298437),
>>>> Version (310)
>>>> [2011-08-24 17:06:17.457937] I
>>>> [client-handshake.c:913:client_setvolume_cbk] 0-syncdata-client-0:
>>>> Connected to 172.23.3.4:24009, attached to remote volume
>>>> '/home/syncdata'.
>>>> [2011-08-24 17:06:17.458478] I [afr-common.c:912:afr_fresh_lookup_cbk]
>>>> 0-syncdata-replicate-0: added root inode
>>>> [2011-08-24 17:06:52.479588] W
>>>> [afr-common.c:656:afr_lookup_self_heal_check] 0-syncdata-replicate-0:
>>>> /fastask: gfid different on subvolume
>>>> [2011-08-24 17:06:52.480560] I
>>>> [client3_1-fops.c:411:client3_1_stat_cbk] 0-syncdata-client-0: remote
>>>> operation failed: No such file or directory
>>>> [2011-08-24 17:06:52.481555] I
>>>> [client3_1-fops.c:1099:client3_1_access_cbk] 0-syncdata-client-0:
>>>> remote operation failed: No such file or directory
>>>> [2011-08-24 17:06:52.482554] I
>>>> [client3_1-fops.c:2132:client3_1_opendir_cbk] 0-syncdata-client-0:
>>>> remote operation failed: No such file or directory
>>>> [2011-08-24 17:06:52.482577] W
>>>> [client3_1-fops.c:5136:client3_1_readdir] 0-syncdata-client-0:
>>>> (689897478): failed to get fd ctx. EBADFD
>>>> [2011-08-24 17:06:52.482592] W
>>>> [client3_1-fops.c:5201:client3_1_readdir] 0-syncdata-client-0: failed
>>>> to send the fop: File descriptor in bad state
>>>> [2011-08-24 17:06:52.482608] I
>>>> [afr-dir-read.c:120:afr_examine_dir_readdir_cbk]
>>>> 0-syncdata-replicate-0: /fastask: failed to do opendir on
>>>> syncdata-client-0
>>>> [2011-08-24 17:06:52.482811] I
>>>> [afr-dir-read.c:174:afr_examine_dir_readdir_cbk]
>>>> 0-syncdata-replicate-0:  entry self-heal triggered. path: /fastask,
>>>> reason: checksums of directory differ, forced merge option set
>>>> [2011-08-24 17:06:52.483553] I
>>>> [client3_1-fops.c:1303:client3_1_entrylk_cbk] 0-syncdata-client-0:
>>>> remote operation failed: No such file or directory
>>>> [2011-08-24 17:06:52.483642] E
>>>> [afr-self-heal-entry.c:2292:afr_sh_post_nonblocking_entry_cbk]
>>>> 0-syncdata-replicate-0: Non Blocking entrylks failed for /fastask.
>>>> [2011-08-24 17:06:52.483839] W [afr-common.c:122:afr_set_split_brain]
>>>> (-->/opt/glusterfs/3.2.3/lib64/glusterfs/3.2.3/xlator/cluster/replicate.so(afr_sh_post_nonblocking_entry_cbk+0xf5)
>>>> [0x2aaaaad137f5]
>>>> (-->/opt/glusterfs/3.2.3/lib64/glusterfs/3.2.3/xlator/cluster/replicate.so(afr_sh_entry_done+0x46)
>>>> [0x2aaaaad13646]
>>>> (-->/opt/glusterfs/3.2.3/lib64/glusterfs/3.2.3/xlator/cluster/replicate.so(afr_self_heal_completion_cbk+0x246)
>>>> [0x2aaaaad0cac6]))) 0-syncdata-replicate-0: invalid argument: inode
>>>> [2011-08-24 17:06:52.483864] E
>>>> [afr-self-heal-common.c:1554:afr_self_heal_completion_cbk]
>>>> 0-syncdata-replicate-0: background  entry entry self-heal failed on
>>>> /fastask
>>>> [2011-08-24 17:06:52.483898] W
>>>> [client3_1-fops.c:5253:client3_1_readdirp] 0-syncdata-client-0:
>>>> (689897478): failed to get fd ctx. EBADFD
>>>> [2011-08-24 17:06:52.483913] W
>>>> [client3_1-fops.c:5317:client3_1_readdirp] 0-syncdata-client-0: failed
>>>> to send the fop: File descriptor in bad state
>>>>
>>>> thanks.
>>>>
>>>>> hi siga hiro,
>>>>>    I see the following warning:
>>>>> [2011-08-24 11:36:04.695145] W
>>>>> [afr-common.c:656:afr_lookup_self_heal_check]
>>>>> 0-syncdata-replicate-0: /testdata: gfid different on subvolume
>>>>>
>>>>> I also see that you have more than one mount on the volume. Most probably
>>>>> you are running into one of the following bugs:
>>>>> 1) http://bugs.gluster.com/show_bug.cgi?id=2921 (most likely this)
>>>>> 2) http://bugs.gluster.com/show_bug.cgi?id=2745
>>>>>
>>>>> If it is not bug 2745, you can confirm it is bug 2921 if the md5sums
>>>>> of the files match on both machines, 172.23.0.1 and 172.23.0.2.
>>>>>
>>>>> pranith.
>>>>>
>>>>> On 08/24/2011 11:48 AM, siga hiro wrote:
>>>>>
>>>>> Hi, everyone.
>>>>> It's nice to meet you.
>>>>> I am poor at English....
>>>>>
>>>>> I am writing because I'd like to update GlusterFS to 3.2.2-1, and I want
>>>>> to change from a gluster mount to an NFS mount.
>>>>>
>>>>> I installed GlusterFS 3.2.1 one week ago, with replication across 2 servers.
>>>>>
>>>>> OS:CentOS5.5 64bit
>>>>> RPM:glusterfs-core-3.2.1-1
>>>>>     glusterfs-fuse-3.2.1-1
>>>>>
>>>>> command
>>>>>  gluster volume create syncdata replica 2  transport tcp
>>>>> 172.23.0.1:/home/syncdata 172.23.0.2:/home/syncdata
>>>>>
>>>>> mount command
>>>>>  172.23.0.1 ->    mount -t glusterfs -o tcp,soft,timeo=3 172.23.0.1:/syncdata /syncdata
>>>>>  172.23.0.2 ->    mount -t glusterfs -o tcp,soft,timeo=3 172.23.0.2:/syncdata /syncdata
>>>>>
>>>>> So, yesterday I updated GlusterFS to 3.2.2-1 and switched to an NFS mount.
>>>>>  172.23.0.2 ->    mount -t nfs  -o nolock,nfsvers=3,tcp,hard,intr 172.23.0.2:/syncdata /syncdata
>>>>>
>>>>> [root@172.23.0.2 /]# ls -al /syncdata/testdata/
>>>>> ls: reading directory /syncdata/testdata/: Input/output error
>>>>>
>>>>> /var/log/glusterfs/nfs.log
>>>>> [2011-08-24 11:35:16.319379] I
>>>>> [client-handshake.c:1082:select_server_supported_programs]
>>>>> 0-syncdata-client-1: Using Program GlusterFS-3.1.0, Num (1298437),
>>>>> Version (310)
>>>>> [2011-08-24 11:35:16.322126] I
>>>>> [client-handshake.c:913:client_setvolume_cbk]
>>>>> 0-syncdata-client-1: Connected to 172.23.0.2:24009, attached to remote
>>>>> volume '/home/syncdata'.
>>>>> [2011-08-24 11:35:16.322191] I [afr-common.c:2611:afr_notify]
>>>>> 0-syncdata-replicate-0: Subvolume 'syncdata-client-1' came back up;
>>>>> going online.
>>>>> [2011-08-24 11:35:16.323281] I
>>>>> [client-handshake.c:1082:select_server_supported_programs]
>>>>> 0-syncdata-client-0: Using Program GlusterFS-3.1.0, Num (1298437),
>>>>> Version (310)
>>>>> [2011-08-24 11:35:16.324274] I
>>>>> [client-handshake.c:913:client_setvolume_cbk]
>>>>> 0-syncdata-client-0: Connected to 172.23.0.1:24009, attached to remote
>>>>> volume '/home/syncdata'.
>>>>> [2011-08-24 11:35:16.324801] I [afr-common.c:912:afr_fresh_lookup_cbk]
>>>>> 0-syncdata-replicate-0: added root inode
>>>>> [2011-08-24 11:36:04.695145] W
>>>>> [afr-common.c:656:afr_lookup_self_heal_check]
>>>>> 0-syncdata-replicate-0: /testdata: gfid different on subvolume
>>>>> [2011-08-24 11:36:04.696121] I
>>>>> [client3_1-fops.c:411:client3_1_stat_cbk]
>>>>> 0-syncdata-client-0: remote operation failed: No such file or directory
>>>>> [2011-08-24 11:36:04.697121] I
>>>>> [client3_1-fops.c:1099:client3_1_access_cbk]
>>>>> 0-syncdata-client-0: remote operation failed: No such file or directory
>>>>> [2011-08-24 11:36:04.698118] I
>>>>> [client3_1-fops.c:2132:client3_1_opendir_cbk]
>>>>> 0-syncdata-client-0: remote operation failed: No such file or directory
>>>>> [2011-08-24 11:36:04.698140] W
>>>>> [client3_1-fops.c:5136:client3_1_readdir]
>>>>> 0-syncdata-client-0: (689897478): failed to get fd ctx. EBADFD
>>>>> [2011-08-24 11:36:04.698155] W
>>>>> [client3_1-fops.c:5201:client3_1_readdir]
>>>>> 0-syncdata-client-0: failed to send the fop: File descriptor in bad state
>>>>> [2011-08-24 11:36:04.698168] I
>>>>> [afr-dir-read.c:120:afr_examine_dir_readdir_cbk]
>>>>> 0-syncdata-replicate-0:
>>>>> /fastask: failed to do opendir on syncdata-client-0
>>>>>
>>>>> # gluster volume info all
>>>>>
>>>>> Volume Name: syncdata
>>>>> Type: Replicate
>>>>> Status: Started
>>>>> Number of Bricks: 2
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: 172.23.0.1:/home/syncdata
>>>>> Brick2: 172.23.0.2:/home/syncdata
>>>>>
>>>>>
>>>>> Once the 172.23.0.2 server is working normally, I want to do the same
>>>>> work on the 172.23.0.1 server.
>>>>>
>>>>> Any ideas?
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>>>>
>>>>>
>>>
>
>


