[Gluster-users] very bad performance on small files

Fri Jan 14 22:12:01 UTC 2011

On 01/14/2011 04:50 PM, Marcus Bointon wrote:
> On 14 Jan 2011, at 18:58, Jacob Shucart wrote:
>
>> This kind of thing is fine on local disks, but when you're talking
>> about a distributed filesystem the network latency starts to add up
>> since 1 request to the web server results in a bunch of file
>> requests.
>
> I think the main objection is that it takes a huge amount of network
> latency to explain a >1,500% overhead with only 2 machines.

If most of your file access times are dominated by latency (e.g. small,
seeky loads), and you are going over a gigabit connection, then yes,
your performance is going to crater on any cluster file system.

Local latency to traverse the storage stack is on the order of tens of
microseconds.  Physical latency of the disk medium is on the order of
tens of microseconds for RAMdisk, hundreds of microseconds for flash/ssd,
and thousands of microseconds (i.e. milliseconds) for spinning rust.

Now take 1 million small file writes.  Say 1024 bytes.  These million 
writes have to traverse the storage stack in the kernel to get to disk.

Now add in a network latency event on the order of 1000's of 
microseconds for the remote storage stack and network stack to respond.
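
As a rough sketch of that arithmetic (not a measurement), here is a
purely serial model in which every operation pays each latency in full
and nothing overlaps.  The function names are made up for illustration,
and the microsecond values are just single representative points picked
from the orders of magnitude above.  Real workloads queue and overlap
requests, so treat the ratios as a worst case, not a prediction.

```python
# Serial latency model: each op waits for stack + media (+ network) in turn.
# All numbers are illustrative assumptions, not benchmark results.

def serial_seconds(n_ops, stack_us, media_us, network_us=0.0):
    """Total wall-clock seconds for n_ops issued strictly one at a time."""
    return n_ops * (stack_us + media_us + network_us) / 1e6


def slowdown(stack_us, media_us, network_us):
    """Ratio of remote per-op latency to local per-op latency."""
    return (stack_us + media_us + network_us) / (stack_us + media_us)


N_OPS = 1_000_000          # one million small writes
STACK_US = 10              # local storage stack, ~tens of microseconds
NETWORK_US = 1000          # remote stack + network round trip (assumed)

# One representative media latency per class, in microseconds (assumed).
MEDIA_US = {"ramdisk": 10, "flash": 100, "spinning disk": 1000}

for name, media_us in MEDIA_US.items():
    local = serial_seconds(N_OPS, STACK_US, media_us)
    remote = serial_seconds(N_OPS, STACK_US, media_us, NETWORK_US)
    ratio = slowdown(STACK_US, media_us, NETWORK_US)
    print(f"{name:14s} local {local:8.1f} s  remote {remote:8.1f} s  x{ratio:.1f}")
```

The point of the sketch is only that the faster the local medium, the
more a fixed network latency dominates each small operation.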

I haven't measured it yet in a methodical manner, but I wouldn't be 
surprised to see IOP rates within a factor of 2 of the bare metal for a 
sufficiently fast network such as Infiniband, and within a factor of 4 
or 5 for a slow network like Gigabit.

Our own experience has generally been that you are IOP constrained
because of the stack you have to traverse.  If you add more latency into
this stack, you have more to traverse, and therefore you have to wait
longer.  This has a magnifying effect on times for small IO ops which
are seeky (stat, small writes, random ops).

>
> On 14 Jan 2011, at 15:20, Joe Landman wrote:
>
>> MB size or larger
>
> So does gluster become faster abruptly when file sizes cross some
> threshold? Or are average speeds proportional to file size? Would

It's a continuous curve, and very much specific to the user load.  The
fewer seeky operations you do, the better (true of all cluster file
systems).
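
One way to see why there is no abrupt threshold is a simple model,
assuming each file costs a fixed latency plus its size divided by the
link bandwidth.  The 1 ms latency and 100 MB/s bandwidth below are
assumed round numbers, and the function name is hypothetical; this is a
sketch of the shape of the curve, not of gluster's actual behaviour.

```python
# Effective throughput vs file size under a fixed per-file latency.
# latency_s and bandwidth_bps are assumed values, not measurements.

def effective_bytes_per_s(size_bytes, latency_s=1e-3, bandwidth_bps=100e6):
    """Bytes per second achieved when each file pays latency + transfer time."""
    transfer_s = size_bytes / bandwidth_bps
    return size_bytes / (latency_s + transfer_s)


# Sweep file sizes from 1 KiB to 64 MiB in powers of four.
size = 1024
while size <= 64 * 1024 * 1024:
    rate = effective_bytes_per_s(size)
    print(f"{size:>10d} bytes  {rate / 1e6:8.2f} MB/s")
    size *= 4
```

In this model throughput rises smoothly with file size and approaches
the link bandwidth without ever reaching it, so small files sit on the
latency-bound end of the same curve.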

> be good to see a wider spread of values on benchmarks of throughput
> vs file size for the same overall volume (like Max's data but with
> more intermediate values)

I haven't seen Max's data, so I can't comment on this.  Understand that
performance is going to be bound by many things.  One of them is the
speed of the spinning disk, if that's what you use.  Another will be
the network.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
        http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615