highlandsun wrote: Something still doesn't make sense about all this. If the controller is really too stupid to reorder logically random writes into physically sequential writes, then caching alone wouldn't solve the problem. I.e., if you throw a long stream of random I/Os into a big cache, eventually it has to flush out and if the data span is big enough, it will still mean a lot of sparse writes. On the other hand, if the random I/Os all occur in a small region, it's possible that caching them is sufficient to make them all contiguous. I guess the question is how big was the RAID controllers' cache, relative to the data set in each test.
Typical flash memory block sizes are 64, 128, or 256 KiB. I may be getting the conversion wrong, but I'm assuming 256 KiB = 0.25 megabytes, with SLC SSDs using 64 KB blocks and MLC SSDs using 128 KB blocks.
The Intel MLC drives have a controller with a 256 KB cache, again 0.25 megabytes.
The OCZ drives apparently have a controller with a 16 KB cache, which is not only 16 times smaller than the Intel controller's cache, it's also smaller than any common flash block size.
It's basically guaranteed that any random write will cause the OCZ drive to flush the controller's cache (assuming I'm not just misunderstanding the concept).
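To make the cache-vs-block-size point concrete, here's a trivial check. The sizes are the figures quoted in this thread, not datasheet values, so treat it as a sketch of the argument:

```python
# Can the controller cache hold even one flash erase block?
# Sizes are the figures quoted in this thread, not datasheet values.
KIB = 1024
caches = {"OCZ controller": 16 * KIB, "Intel controller": 256 * KIB}
blocks = {"SLC flash block": 64 * KIB, "MLC flash block": 128 * KIB}

for cache_name, cache in caches.items():
    for block_name, block in blocks.items():
        verdict = "fits" if cache >= block else "does NOT fit"
        print(f"One {block_name} ({block // KIB} KiB) {verdict} "
              f"in the {cache_name} cache ({cache // KIB} KiB)")
```

The OCZ cache can't buffer even a single flash block of either type, while the Intel cache holds at least one of each.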
The ATTO benchmark runs from 0.5 KB to 1 MB, meaning it goes from writes so small that every random write flushes the cache, up to writes bigger than the cache, which is paradoxically good for SSD performance (on traditional hard drives the cache is far faster than the actual write mechanism, so the curve works the opposite way).
The problem is not the benchmark's write size, because the benchmarks let you vary the size and see both the good and the bad.
The problem is: what write sizes are typical for your usage patterns?
For example, there is this short list. I have no idea how old it is (possibly several years old, and I won't promise there isn't missing context):
O/S Log Block Size
======= ==============
Solaris 512 bytes
HP-UX 1024 bytes
NT 512 bytes
OpenVMS 512 bytes
Digital UNIX 1024 bytes
That's 0.5KB and 1KB blocks.
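Taking those log block sizes against the flash block sizes above, a rough worst-case calculation looks like this. It's a simplification: it assumes every random write forces a full flash-block rewrite, which is the pessimistic end of how a controller might behave:

```python
# Rough worst case: bytes physically rewritten per byte logically written,
# assuming each random write forces a full flash-block rewrite.
def write_amplification(write_bytes, block_bytes):
    return max(write_bytes, block_bytes) / write_bytes

KIB = 1024
for log_block in (512, 1024):          # the 0.5 KB and 1 KB log blocks above
    for flash_kib in (64, 128):        # SLC and MLC block sizes from earlier
        wa = write_amplification(log_block, flash_kib * KIB)
        print(f"{log_block} B log write into a {flash_kib} KiB block "
              f"-> {wa:.0f}x rewritten")
```

So a 512-byte log write into a 128 KiB MLC block can mean 256 times more data physically written than logically written, in this worst-case model.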
I'm going to guess that programmers have been using small block sizes for a long time, and it'll take a while for software to be updated to use larger ones.
Now, assuming my conversion factors are still right, we have this comparison:
On an OCZ MLC SSD, the controller has a 16 KB cache, while a typical mainstream consumer hard drive has an 8 MB cache (about 512 times the cache).
On an Intel MLC SSD, the controller has a 256 KB cache, while a performance hard drive would typically have a 16 MB cache (about 64 times larger).
I think a Memoright SSD has a 16 MB cache, versus a really high-end 7200 RPM drive's 32 MB cache (about 2 times larger).
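Sanity-checking those ratios (sizes as quoted above; this is back-of-the-envelope arithmetic, not measured figures):

```python
# Back-of-the-envelope check of the cache-size ratios quoted above.
KIB = 1024
MIB = 1024 * 1024
pairs = [
    ("OCZ MLC SSD vs mainstream consumer HDD", 16 * KIB, 8 * MIB),
    ("Intel MLC SSD vs performance HDD", 256 * KIB, 16 * MIB),
    ("Memoright SSD vs high-end 7200 RPM HDD", 16 * MIB, 32 * MIB),
]
for label, ssd_cache, hdd_cache in pairs:
    print(f"{label}: HDD cache is {hdd_cache // ssd_cache}x larger")
```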
So hopefully, when flash gets cheap enough, there will be enough money left in the bill of materials for SSDs to start catching up with traditional hard drive cache sizes.