This post explores the question: “How can Gluster utilize SSDs?” It does so by reviewing three tests done by the Red Hat performance group. Each test used SSDs in a different configuration, and the configurations vary in cost of ownership and tunability.
Testing was done with the LSI Nytro MegaRAID 8110-4e card on systems in Red Hat’s performance labs. The card can be configured for different RAID levels on its attached disk drives, and it also carries an on-board SSD that can be configured in several ways. Other cards could have been tested; this one was chosen because it was available in the lab.
Here are the tests:
- replacing the disks with SSDs
- using SSDs as a cache at the controller level
- using SSDs as a cache at the kernel level (dm-cache)
We would expect SSDs to perform best, relative to disks, on random I/O workloads. The SSD (and the data set on it) should be larger than RAM, since anything that fits in memory is already served by the Linux buffer cache.
How can users realize the benefit of SSDs at the lowest cost?
Replacing disks with SSDs
The most obvious deployment method is to simply replace all the disks in Gluster with SSDs. This may be prohibitive from a cost perspective, but it suggests an upper bound on what is achievable.
The experiments met expectations: SSDs performed much better than disks in most cases, particularly with small files.
To compare the Nytro SSD to traditional spindles, three modes of accessing storage were tried: the SSD directly, the disks with the controller’s write-back cache enabled, and the disks in write-through mode (no controller caching).
These tests used two workload types (create and read), varying the file size and the number of threads, placing files at random into directories with a random exponential file size distribution.
| operation type | threads | file size | SSD files/sec | write-back files/sec | write-thru files/sec |
|---|---|---|---|---|---|
| create | 1 | 4 | 4325 | 2485 | |
| create | 1 | 16 | 3890 | 2662 | |
| create | 1 | 64 | 2527 | 1616 | |
| create | 4 | 4 | 12648 | 5354 | 192 |
| create | 4 | 16 | 10785 | 5077 | 189 |
| create | 4 | 64 | 6027 | 2761 | 175 |
| create | 16 | 4 | 24110 | 5803 | 492 |
| create | 16 | 16 | 18596 | 7427 | 496 |
| create | 16 | 64 | 7963 | 3852 | 461 |
| read | 1 | 4 | 9869 | 5384 | |
| read | 1 | 16 | 6641 | 4157 | |
| read | 1 | 64 | 4222 | 2144 | |
| read | 4 | 4 | 22585 | 14619 | 11841 |
| read | 4 | 16 | 19457 | 7394 | 7109 |
| read | 4 | 64 | 9437 | 5164 | 2092 |
| read | 16 | 4 | 58010 | 5260 | 4286 |
| read | 16 | 16 | 36636 | 4356 | 3837 |
| read | 16 | 64 | 10379 | 3756 | 2425 |
Note in particular the 16-thread rows with the smallest files: the SSD delivered roughly 4 times the write-back throughput for creates (24110 vs. 5803 files/sec) and roughly 11 times for reads (58010 vs. 5260 files/sec).
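To make the comparison concrete, the speedups can be computed directly from the table. The following is a small Python sketch (the numbers are copied from the 16-thread rows above) that derives the SSD-to-write-back ratios:

```python
# SSD vs. write-back results for the 16-thread rows of the table above.
# Key: (operation, threads, file size) -> (SSD files/sec, write-back files/sec)
results = {
    ("create", 16, 4):  (24110, 5803),
    ("create", 16, 16): (18596, 7427),
    ("create", 16, 64): (7963,  3852),
    ("read",   16, 4):  (58010, 5260),
    ("read",   16, 16): (36636, 4356),
    ("read",   16, 64): (10379, 3756),
}

for (op, threads, size), (ssd, writeback) in results.items():
    print(f"{op:6s} {threads:2d} threads, file size {size:2d}: "
          f"SSD is {ssd / writeback:.1f}x write-back")
```

The ratios range from roughly 2x for the largest files up to roughly 4x for creates and 11x for reads with the smallest files.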
Using SSDs as a cache at the controller level
By default, the LSI Nytro card utilizes its SSD as a cache in front of the disks. The intent of the cache is that frequently used data be served quickly from the SSD rather than from disk. The caching policies are internal to the hardware – in effect the cache is a “black box”. This experiment tries to show how well those caching policies work.
The tests appeared to show the SSD cache had some benefit, but not nearly as significant as when the disks were completely replaced. For example, the best result was a 70% improvement, whereas replacing the disks outright yielded as much as a 500% improvement in some cases.
I/O was run directly against RAID-6 volumes, without Gluster, in order to isolate the effect of the SSD. The I/O was generated using the smallfile tool, which produces a “small file” workload of random I/O operations.
Each test wrote 200,000 x 20 = 4 million files, for a total of 256 GB of data, twice the amount of RAM available on the host.
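The smallfile tool itself handles the threading, directory placement, and size distribution; the snippet below is only a minimal Python sketch of what such a workload generator does, not the actual tool. The ~64 KB mean file size is an assumption derived from the totals above (256 GB spread over 4 million files), and the paths and counts are illustrative rather than the lab’s exact parameters.

```python
import os
import random

# Minimal sketch of a smallfile-style workload generator (not the actual tool):
# files are placed into randomly chosen directories and their sizes follow a
# random exponential distribution, as described for the tests above.
TOP_DIR = "/mnt/raid6/smallfile"   # hypothetical target directory
FILES_PER_PASS = 200_000           # x 20 in the test above = 4 million files
NUM_DIRS = 1_000
MEAN_SIZE_BYTES = 64 * 1024        # assumption: 256 GB / 4 million files ~ 64 KB

def create_files(top_dir, num_files):
    for i in range(num_files):
        # random directory placement
        d = os.path.join(top_dir, "dir%04d" % random.randrange(NUM_DIRS))
        os.makedirs(d, exist_ok=True)
        # random exponential file size distribution around the mean
        size = max(1, int(random.expovariate(1.0 / MEAN_SIZE_BYTES)))
        with open(os.path.join(d, "file%07d" % i), "wb") as f:
            f.write(os.urandom(size))

if __name__ == "__main__":
    create_files(TOP_DIR, FILES_PER_PASS)
```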
Three passes of each test were run. The write tests used the swift-put operation, and the read tests used the swift-get operation.
Using SSDs as a cache at the kernel level (dm-cache)
The Nytro card’s default SSD caching configuration did not generate very impressive improvements. The 3.9 Linux kernel introduced an alternative called “dm-cache”, a dynamic block-level storage cache. It caches blocks at the device mapper level within the kernel; the cached blocks reside on a “cache device”, typically an SSD. A related project is called bcache.
The dm-cache module has a tunable policy (e.g. LRU, MFU). The file system can send hints to dm-cache to “encourage” blocks to be cached or not cached.
For the test, the Nytro card’s SSD was re-purposed to act as the caching device for the dm-cache. The test compared using dm-cache with using the normal RAID write-back cache on the Nytro controller.
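For illustration, the following is a minimal Python sketch of how such a dm-cache device might be assembled with dmsetup. The device paths, sizes, and the 256 KB cache block size are assumptions rather than the lab’s actual values, and the table string follows the kernel’s dm-cache target syntax (metadata device, cache device, origin device, block size, features, policy); check the kernel documentation for your version before using it.

```python
import subprocess

# Sketch of assembling a dm-cache device with dmsetup (illustrative values only):
# the controller's SSD serves as the cache device, and the dm-cache metadata
# lives on a separate disk partition, as noted below for this test.
ORIGIN_DEV = "/dev/sdb"          # RAID volume holding the brick data (hypothetical)
CACHE_DEV = "/dev/sdc"           # SSD exposed by the Nytro card (hypothetical)
META_DEV = "/dev/sdd1"           # small disk partition for dm-cache metadata
ORIGIN_SECTORS = 3_906_250_000   # origin device size in 512-byte sectors (assumption)
BLOCK_SIZE_SECTORS = 512         # dm-cache block size: 512 sectors = 256 KB

# dm-cache target line:
# <start> <length> cache <metadata dev> <cache dev> <origin dev> <block size>
#                        <#feature args> <feature args> <policy> <#policy args>
table = (f"0 {ORIGIN_SECTORS} cache {META_DEV} {CACHE_DEV} {ORIGIN_DEV} "
         f"{BLOCK_SIZE_SECTORS} 1 writeback default 0")

subprocess.run(["dmsetup", "create", "cached_brick", "--table", table], check=True)

# /dev/mapper/cached_brick can then be formatted (e.g. with XFS) and used as a
# gluster brick in place of the raw RAID volume.
```

Newer LVM releases integrate the same dm-cache target through lvmcache, which manages the cache pool and metadata volumes automatically.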
The results showed dm-cache performed well when there was no caching at the controller level (write-through was set on the Nytro). When write-back was set, little benefit was observed in most cases, with the exception of the small-file workload.
The test was preliminary; for example, RAID-10 rather than RAID-6 was used, and the metadata device used by dm-cache was housed on a disk.
That said, dm-cache appears promising. The Nytro’s cache helps performance, but many users prefer JBODs to expensive controllers. Those JBOD users would see worse performance without a controller cache, and they may be able to recover that performance by using dm-cache with SSDs.