<div dir="ltr">Hi,<br><br><div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Dec 12, 2015 at 2:35 AM, Vijay Bellur <span dir="ltr"><<a href="mailto:vbellur@redhat.com" target="_blank">vbellur@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class=""><br>
<br>
> ----- Original Message -----
> > From: "Sachidananda URS" <surs@redhat.com>
> > To: "Gluster Devel" <gluster-devel@gluster.org>
> > Sent: Friday, December 11, 2015 10:26:04 AM
> > Subject: [Gluster-devel] Help needed in understanding GlusterFS logs and debugging elasticsearch failures
> >
> > Hi,
> >
> > I was trying to use GlusterFS as a backend filesystem for storing the
> > elasticsearch indices on a GlusterFS mount.
> >
> > The filesystem operations, as far as I can understand, are: the Lucene
> > engine does a lot of renames on the index files, and multiple threads
> > read from the same file concurrently.
> >
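For anyone who wants to try this outside elasticsearch, that access pattern can be approximated with a rough reproducer like the one below. The mount point /mnt/gluster2 is taken from the error messages further down; the directory, file names, sizes and counts are made up.

# Rough approximation of the Lucene pattern: write a file, rename it
# into place, then hash it from several concurrent readers.
DIR=/mnt/gluster2/repro
mkdir -p "$DIR"
for i in $(seq 1 100); do
    dd if=/dev/urandom of="$DIR/seg.$i.tmp" bs=1M count=4 2>/dev/null
    mv "$DIR/seg.$i.tmp" "$DIR/seg.$i"
    for r in 1 2 3 4; do
        md5sum "$DIR/seg.$i" &    # all readers should print the same sum
    done
    wait
done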
> > While writing the index, elasticsearch/lucene complains of index
> > corruption, the health of the cluster goes red, and all operations on
> > the index fail from then on.
> >
> > ===================
> >
> > [2015-12-10 02:43:45,614][WARN ][index.engine       ] [client-2] [logstash-2015.12.09][3] failed engine [merge failed]
> > org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=0 actual=6d811d06 (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/mnt/gluster2/rhs/nodes/0/indices/logstash-2015.12.09/3/index/_a7.cfs") [slice=_a7_Lucene50_0.doc]))
> >     at org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$1.doRun(InternalEngine.java:1233)
> >     at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >     at java.lang.Thread.run(Thread.java:745)
> > Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=0 actual=6d811d06 (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/mnt/gluster2/rhs/nodes/0/indices/logstash-2015.12.09/3/index/_a7.cfs") [slice=_a7_Lucene50_0.doc]))
> >
> > =====================
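A note on reading that error: Lucene stores a CRC32 in the segment file footer, and expected=0 with a non-zero actual appears to mean the stored checksum read back as zero, i.e. the client saw different bytes than what was written. One way to narrow that down is to compare the file as seen through the FUSE mount with the copies on the backend bricks, along these lines (the brick paths are guesses based on the vol info further down, and the file will only exist on one replica pair):

F=nodes/0/indices/logstash-2015.12.09/3/index/_a7.cfs
md5sum /mnt/gluster2/rhs/$F
for h in 10.70.47.171 10.70.47.187 10.70.47.121 10.70.47.172; do
    ssh "$h" "md5sum /gluster/brick1/rhs/$F" 2>/dev/null
done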
> >
> >
> > The server logs do not have anything. The client log is full of messages like:
> >
> >
> >
> > [2015-12-03 18:44:17.882032] I [MSGID: 109066] [dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-61881676454442626.tlog (hash=esearch-replicate-0/cache=esearch-replicate-0) => /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-311.ckp (hash=esearch-replicate-1/cache=<nul>)
> > [2015-12-03 18:45:31.276316] I [MSGID: 109066] [dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-2384654015514619399.tlog (hash=esearch-replicate-0/cache=esearch-replicate-0) => /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-312.ckp (hash=esearch-replicate-0/cache=<nul>)
> > [2015-12-03 18:45:31.587660] I [MSGID: 109066] [dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-4957943728738197940.tlog (hash=esearch-replicate-0/cache=esearch-replicate-0) => /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-312.ckp (hash=esearch-replicate-0/cache=<nul>)
> > [2015-12-03 18:46:48.424605] I [MSGID: 109066] [dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-1731620600607498012.tlog (hash=esearch-replicate-1/cache=esearch-replicate-1) => /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-313.ckp (hash=esearch-replicate-1/cache=<nul>)
> > [2015-12-03 18:46:48.466558] I [MSGID: 109066] [dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-5214949393126318982.tlog (hash=esearch-replicate-1/cache=esearch-replicate-1) => /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-313.ckp (hash=esearch-replicate-1/cache=<nul>)
> > [2015-12-03 18:48:06.314138] I [MSGID: 109066] [dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-9110755229226773921.tlog (hash=esearch-replicate-0/cache=esearch-replicate-0) => /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-314.ckp (hash=esearch-replicate-1/cache=<nul>)
> > [2015-12-03 18:48:06.332919] I [MSGID: 109066] [dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-5193443717817038271.tlog (hash=esearch-replicate-1/cache=esearch-replicate-1) => /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-314.ckp (hash=esearch-replicate-1/cache=<nul>)
> > [2015-12-03 18:49:24.694263] I [MSGID: 109066] [dht-rename.c:1410:dht_rename] 0-esearch-dht: renaming /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-2750483795035758522.tlog (hash=esearch-replicate-1/cache=esearch-replicate-1) => /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-315.ckp (hash=esearch-replicate-0/cache=<nul>)
> >
> > ==============================================================
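One thing about these messages: as far as I understand DHT, whenever the destination hashes to a different subvolume than the one holding the data (e.g. hash=esearch-replicate-1/cache=esearch-replicate-0 above), the rename leaves a zero-byte link-to file on the hashed subvolume rather than moving the data, so this rename-heavy translog workload exercises that path constantly. The link-to files are easy to spot directly on a brick, if that helps (brick path as in the vol info below):

# Run on a brick: list sticky-bit, zero-length link-to files and show
# which subvolume each one points to.
find /gluster/brick1 -type f -perm -1000 -size 0 | while read -r f; do
    getfattr -n trusted.glusterfs.dht.linkto -e text "$f" 2>/dev/null
done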
> >
> > The same setup works well on any of the local disk filesystems.
> > This is a 2 x 2 distributed-replicate setup:
> >
> > # gluster vol info
> >
> > Volume Name: esearch
> > Type: Distributed-Replicate
> > Volume ID: 4e4b205e-28ed-4f9e-9fa4-0d020428dede
> > Status: Started
> > Number of Bricks: 2 x 2 = 4
> > Transport-type: tcp,rdma
> > Bricks:
> > Brick1: 10.70.47.171:/gluster/brick1
> > Brick2: 10.70.47.187:/gluster/brick1
> > Brick3: 10.70.47.121:/gluster/brick1
> > Brick4: 10.70.47.172:/gluster/brick1
> > Options Reconfigured:
> > performance.read-ahead: off
> > performance.write-behind: off
> >
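In case the brick-to-subvolume mapping matters when reading the logs above: with replica 2, consecutive bricks in the create command form the replica pairs, so Brick1/Brick2 back esearch-replicate-0 and Brick3/Brick4 back esearch-replicate-1. A create along these lines would produce this layout (a sketch only, not necessarily the exact commands that were run):

gluster volume create esearch replica 2 transport tcp,rdma \
    10.70.47.171:/gluster/brick1 10.70.47.187:/gluster/brick1 \
    10.70.47.121:/gluster/brick1 10.70.47.172:/gluster/brick1
gluster volume set esearch performance.read-ahead off
gluster volume set esearch performance.write-behind off
gluster volume start esearch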
> >
> > I need a little bit of help in understanding the failures. Let me know
> > if you need further information on the setup, or access to the system
> > to debug further. I've attached the debug logs for further investigation.
> >
>
>
> Would it be possible to turn off all the performance translators
> (md-cache, quick-read, io-cache, etc.) and check if the same problem
> persists? Collecting an strace of the elasticsearch process that does
> I/O on gluster can also help.

I turned off all the performance xlators.

# gluster vol info

Volume Name: esearch
Type: Distributed-Replicate
Volume ID: 4e4b205e-28ed-4f9e-9fa4-0d020428dede
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp,rdma
Bricks:
Brick1: 10.70.47.171:/gluster/brick1
Brick2: 10.70.47.187:/gluster/brick1
Brick3: 10.70.47.121:/gluster/brick1
Brick4: 10.70.47.172:/gluster/brick1
Options Reconfigured:
performance.stat-prefetch: off
performance.md-cache-timeout: 0
performance.quick-read: off
performance.io-cache: off
performance.read-ahead: off
performance.write-behind: off

The problem still persists. Attaching the strace logs.

-sac
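P.S. In case anyone wants to repeat the capture: the options above can be turned off and the strace collected with something along these lines (the pgrep pattern and the output path are only examples, not necessarily what was used here):

for opt in quick-read io-cache read-ahead write-behind stat-prefetch; do
    gluster volume set esearch performance.$opt off
done
gluster volume set esearch performance.md-cache-timeout 0

# -ff writes one trace file per thread/child of the elasticsearch JVM
strace -ff -tt -T -o /var/tmp/es.strace \
    -p "$(pgrep -n -f org.elasticsearch.bootstrap.Elasticsearch)"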