[Gluster-users] While running a rebalance, what does the metric "layout" represent

Joe Landman landman at scalableinformatics.com
Thu Mar 3 16:22:35 UTC 2011


On 03/03/2011 11:13 AM, Burnash, James wrote:
> Hi Joe.
>
> Thank you - that's a nicely detailed explanation, and a sufficiently
> reasonable guess as to what the "layout" metric may mean.
>
> At the end of the day, a single lane of SATA for each box sure does
> look like the ultimate bottleneck in this setup - we were aware of
> this when we built it, but total storage was judged to be more
> important than speed, so at least we're on spec.

I guess I am confused by the 1.5Gb number.  With that, your bandwidth 
should max out around 187 MB/s.  Looks like it's more.  Do you have each 
brick on its own dedicated 1.5Gb link?  Or is it 35 disks behind a 
single 1.5Gb link?
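
For a quick back-of-the-envelope check of that ceiling (treating 1.5Gb 
as the raw lane rate; SATA's 8b/10b encoding would pull the usable 
payload rate down closer to 150 MB/s):

    # 1.5 Gb/s = 1500 Mb/s; divide by 8 bits/byte to get MB/s
    $ echo "1.5 * 1000 / 8" | bc -l
    187.50000000000000000000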

> Here is the dstat output from the two machine that are rebalancing -
> this is just about 30 seconds worth of output, but they're pretty
> constant in their load and numbers at this time:

Could you run dstat -V so I can see whether this is the 0.6.x series 
(versus the bug-fixed 0.7.x series)?  We had issues with the 0.6.x 
series doubling apparent bandwidth on RAIDs.
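
Something like the following would tell us, without interrupting the 
rebalance (the exact version string varies by release, and the rpm 
query of course assumes an RPM-based box):

    $ dstat -V | head -1    # reports the Dstat release, e.g. 0.6.x or 0.7.x
    $ rpm -q dstat          # package version, if dstat came from a package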

>
> [root@jc1letgfs13 vols]# dstat
> ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
> usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
>   0   1  99   0   0   0|2579k   32M|   0     0 |   0   0.3 |3532  4239
>   1   3  95   1   0   2|1536k  258M| 276M 4135k|   0     0 |  10k   14k
>   1   6  90   1   0   2|1304k  319M| 336M 4176k|   0     0 |  13k   16k
>   1   3  95   0   0   1|1288k  198M| 199M 3497k|   0     0 |  11k   11k
>   1   3  94   1   0   2|1288k  296M| 309M 4039k|   0     0 |  12k   15k
>   1   2  95   1   0   1|1032k  231M| 221M 2297k|   0     0 |  11k   11k
>   1   3  94   0   0   2|1296k  278M| 296M 4078k|   0     0 |  14k   15k
>   1   7  89   1   0   2|1552k  374M| 386M 5849k|   0     0 |  15k   19k
>   1   4  93   0   0   2|1024k  343M| 350M 2961k|   0     0 |  13k   17k
>   1   4  92   1   0   2|1304k  370M| 383M 4499k|   0     0 |  14k   18k
>   1   3  94   1   0   2| 784k  286M| 311M 5202k|   0     0 |  12k   15k
>   1   3  93   1   0   2|1280k  312M| 319M 3109k|   0     0 |  12k   16k
>   1   6  91   1   0   2|1296k  319M| 342M 4270k|   0     0 |  13k   16k
>

Looks like the network may be the issue more so than the disk.  I am 
still wondering about that 1.5Gb number.  Multiple lanes of 1.5Gb would 
show results like this; I am guessing a lower-end SAS link of some sort.
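
If it is easy to grab, the negotiated per-disk link rate and the HBA 
model would settle that.  Roughly (the exact dmesg/smartctl wording 
varies by kernel and drive, and /dev/sda is just a placeholder for one 
of the brick disks):

    $ dmesg | grep -i 'SATA link up'          # kernel logs the negotiated rate, e.g. "SATA link up 3.0 Gbps"
    $ smartctl -i /dev/sda | grep -i sata     # newer smartctl releases also report the SATA version/speed
    $ lspci | grep -i -e sas -e sata -e raid  # identifies the HBA/RAID controller behind the disks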


> root@jc1letgfs16:~# dstat
> ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
> usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
>   1   2  84  12   0   1| 204M   84M|   0     0 |   0    19B|  28k   24k
>   1   1  82  16   0   1| 231M 6920k|1441k  240M|   0     0 |  16k   17k
>   1   2  79  18   0   1| 328M 1208k|2441k  338M|   0     0 |  18k   19k
>   1   1  85  13   0   1| 268M  136k|2139k  280M|   0     0 |  15k   16k
>   1   2  78  18   0   1| 370M  320k|2637k  383M|   0     0 |  19k   20k
>   1   2  79  18   0   1| 290M  136k|2245k  306M|   0     0 |  16k   18k
>   1   2  79  18   0   1| 318M  280k|1770k  325M|   0     0 |  17k   18k
>   1   2  80  17   0   1| 277M  248k|2149k  292M|   0     0 |  15k   17k
>   1   2  79  18   0   1| 313M  128k|2331k  328M|   0     0 |  17k   18k
>   1   2  79  18   0   1| 323M  376k|2373k  336M|   0     0 |  18k   19k
>   1   1  79  18   0   1| 267M  136k|2070k  275M|   0     0 |  15k   17k
>   1   1  78  19   0   1| 275M  368k|1638k  289M|   0     0 |  16k   18k
>   1   2  78  19   0   1| 337M 1480k|2450k  343M|   0     0 |  18k   20k
>   2   3  74  20   0   1| 312M 1344k|2403k  330M|   0     0 |  17k   24k
>   1   1  80  17   0   1| 263M  688k|2078k  275M|   0     0 |  16k   17k
>   1   1  81  16   0   1| 292M  120k|1677k  304M|   0     0 |  16k   17k
>   1   1  78  19   0   1| 264M 4264k|2118k  271M|   0     0 |  16k   19k
>

Which 10GbE cards are they?  What are the network settings, and which 
driver are you using?  Any other tuning details would help too.
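
The usual commands for collecting that are roughly the following (eth2 
is just a placeholder for whichever interface carries the Gluster 
traffic):

    $ lspci | grep -i ethernet                     # identify the 10GbE NIC model
    $ ethtool -i eth2                              # driver name, version, firmware
    $ ethtool eth2 | grep -i speed                 # negotiated link speed
    $ ip link show eth2                            # MTU (jumbo frames?) and link state
    $ sysctl net.core.rmem_max net.core.wmem_max   # socket buffer ceilings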



-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
        http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615


