Friday, October 10, 2008

Solid State Drives

This is a guest post by Mark Wilkins, a Senior Software Engineer on the Advantage R&D team.

Solid state drives (SSDs) present a new storage option that is becoming increasingly economically viable. Our colleagues from the SQL Anywhere team had mentioned that they had done some testing with solid state drives. Our own team was having an informal brainstorming session one day, and we were speculating about how these drives would affect the performance and characteristics of Advantage Database Server. Being the diligent and self-sacrificing developers that we are, and because we are good at running with other people's ideas, we immediately chased down some cash, bought an SSD, and tested it.

We purchased an OCZ Core Series 64GB 2.5" SATA II drive. Most of the testing was done on two different machines: a Dell Precision 380 3.2GHz desktop development PC with a 7200 rpm Seagate Barracuda SATA drive and a Dell PowerEdge 1950 1.83GHz quad core 64-bit Xeon server with a 15,000 rpm SCSI drive.

When holding the SSD in your hand, the form factor is a compelling feature. It is small, light and very desirable and is fun to carry around and show off to other developers. Of course, when it is dangling off of a SATA cable with no obvious resting place inside a desktop PC that has a bunch of fans and two noisy spinning hard disks, some of its sexiness gets lost in the mix. In the rack machine, I was able to slide it carefully into the far back reaches of the second hard drive bay and click it into place. It is currently still sitting there and will be until someone with long skinny flexible fingers can pry it out.

With minimal searching, one can find dozens of videos on the Internet purporting to show that SSDs are blazingly fast and the best thing since flying toasters. The common theme for these videos is to take two identical laptops, put an SSD in one and a "traditional" hard disk drive (HDD) in the other, point a video camera at them and have the laptops "race". A person will push a button on each laptop to simultaneously boot, load programs, eat bowls of Häagen-Dazs, etc. In the videos, the laptop with the SSD is always a clear winner. In our testing the SSD did perform very well, but its benefit does not seem to be as clear-cut as the videos would have you believe. Go figure. If the HDDs in the videos were extremely full, had minimally sized swap files and were extremely fragmented, then the videos might be realistic. To be fair, though, I don't have two identically configured laptops to try similar tests on, so I could well be wrong.

What we were really interested in, however, was how Advantage behaved. The short answer is that Advantage performs very well with an SSD; we did not uncover any surprises. In some scenarios the HDD appears to have a slight edge, and in others the SSD wins. In many situations, the I/O caching performed by Advantage Database Server equalizes the performance of the two types of drives. In general, the test results were fairly predictable. For example, the near-zero latency and seek times of an SSD make it possible to read random portions of data from the drive very quickly. Using this knowledge, one can construct tests that benefit from that behavior.
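As a rough illustration of the kind of test that favors low seek times, here is a minimal Python sketch that reads the same blocks of a file once in order and once in shuffled order and times both passes. It is not the code we used for the Advantage tests; the function names, the 4 KB block size, and the file size are all assumptions made for the example. Note also that the operating system's file cache will hide most of the difference on a small, freshly written file, so a meaningful run would need a file much larger than available memory (or a cache flush between passes).

```python
import os
import random
import tempfile
import time

BLOCK_SIZE = 4096  # bytes per read; an assumed value, not taken from our tests


def make_test_file(path, num_blocks):
    """Write num_blocks blocks of random bytes to path."""
    with open(path, "wb") as f:
        for _ in range(num_blocks):
            f.write(os.urandom(BLOCK_SIZE))


def time_reads(path, offsets):
    """Read one block at each offset and return the elapsed seconds."""
    start = time.perf_counter()
    # buffering=0 avoids Python-level read-ahead; the OS cache still applies.
    with open(path, "rb", buffering=0) as f:
        for offset in offsets:
            f.seek(offset)
            f.read(BLOCK_SIZE)
    return time.perf_counter() - start


def compare_access_patterns(num_blocks=2048, seed=0):
    """Return (sequential_seconds, random_seconds) for the same set of blocks."""
    sequential = [i * BLOCK_SIZE for i in range(num_blocks)]
    shuffled = sequential[:]
    random.Random(seed).shuffle(shuffled)  # same blocks, random order

    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        make_test_file(path, num_blocks)
        return time_reads(path, sequential), time_reads(path, shuffled)
    finally:
        os.remove(path)


if __name__ == "__main__":
    seq, rnd = compare_access_patterns()
    print(f"sequential: {seq * 1000:.1f} ms, random: {rnd * 1000:.1f} ms")
```

On a spinning disk with a cold cache, the shuffled pass should be the one that suffers from seek time; on an SSD the two passes would be expected to land much closer together.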

Some of the testing we performed was to run TPC-C transactions over some interval of time with a number of clients. These tests provide a reasonable cross section of queries with a mix of reads and writes across multiple tables. We captured a lot of different test result numbers, and there was no clear winner with the hardware we tested. For example, in one case, we ran 50 TPC-C clients against the quad core server for 5 hours. When running against the SSD, the test performed about 10% more iterations than when running against the HDD. However, when we ran the same test scenario for very short periods (e.g., 1 minute), the HDD typically outperformed the SSD by up to 10%.

In the absence of an obvious winner based on numbers, we were able to resort to trend spotting. Some situations in which the SSD seems to outperform a traditional HDD include the following.
  • Long sustained reads in natural record order.
  • Long sustained update and append operations with multiple indexes being updated.
  • The very first query against an unread (non-cached) table that involves index usage. Once indexes get cached, though, the difference disappears.
  • Reading and updating fragmented indexes.
It ultimately comes down to usage patterns and will obviously vary by application. If, though, we assume that this one SSD that we tested is comparable to other SSDs, then we can conclude that the current crop of SSDs perform very similarly to HDDs and are a valid choice for data storage for Advantage Database Server.

Some Numbers
The following are a few of the numbers obtained during the testing. The next set of simple reindex, read, and append tests were all performed on the Precision desktop workstation.

Reindex 1 million record table (2 indexes):
SSD: 2,096 ms
7200 rpm HDD: 3,573 ms


Read through 1 million record table:
SSD: 2,295 ms
7200 rpm HDD: 2,828 ms


Append 100,000 records with 3 indexes and a memo:
SSD: 27,900 ms
7200 rpm HDD: 28,100 ms


Set AOFs (filters) with 25 clients for 60 seconds:
SSD: 33,006 filters
7200 rpm HDD: 38,006 filters


The next few are some numbers from the TPC-C tests. These particular tests involved 8 tables with updates to 5 of the tables. Each "iteration" was a single transaction that involved an average of 22 queries and 22 updates/inserts. Each of the tests had 50 clients running concurrently.

60 second test on the Precision workstation:
SSD: 2,425 iterations
7200 rpm HDD: 1,986 iterations


60 second test on the PowerEdge server:
SSD: 3,763 iterations
15,000 rpm HDD: 4,176 iterations


5 hour test on the PowerEdge server:
SSD: 1,459,455 iterations
15,000 rpm HDD: 1,347,630 iterations


As you can see, the results are somewhat mixed. In general, the SSD would slightly outperform the HDD, but it was certainly not always a given. For example, in the numbers above, the 60 second test on the PowerEdge server consistently had the HDD performing more iterations, but longer test runs usually gave the nod to the SSD. The short test runs on the desktop workstation, though, would typically place the SSD as the winner. Also, the filter (AOF) test would consistently show the HDD as the winner. I don't have any good explanations. One possibility is that the tests on the workstation were not run under ideal conditions. I generally had multiple applications (documents, editors, IDEs, etc.) open while the tests were running. I did not do anything to ensure that those applications would not suddenly decide during a test run to scan for updated files or phone home to see if a critical update had just been released by their vendors. Still, that kind of environment reflects some of the real-world conditions under which Advantage is used.
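To make the mixed picture easier to see at a glance, the figures listed above can be reduced to a single "SSD advantage" percentage per test. The short Python sketch below does that arithmetic. The helper name and the choice of metric are mine, not part of the original test harness: for counts (iterations, filters) it compares SSD to HDD directly, and for elapsed times it compares HDD time to SSD time, so that a positive number always means the SSD came out ahead.

```python
# Figures copied from the results listed above.
# Each entry: (test name, SSD value, HDD value, True if a higher value is better)
RESULTS = [
    ("Reindex 1M records (ms)", 2096, 3573, False),
    ("Read 1M records (ms)", 2295, 2828, False),
    ("Append 100K records (ms)", 27900, 28100, False),
    ("AOF filters, 60 s", 33006, 38006, True),
    ("TPC-C 60 s, Precision", 2425, 1986, True),
    ("TPC-C 60 s, PowerEdge", 3763, 4176, True),
    ("TPC-C 5 h, PowerEdge", 1459455, 1347630, True),
]


def ssd_advantage_percent(ssd, hdd, higher_is_better):
    """Percent by which the SSD beat the HDD; negative means the HDD won.

    Counts are compared as ssd/hdd. Elapsed times are compared as hdd/ssd,
    i.e. as a ratio of speeds rather than a reduction in time.
    """
    if higher_is_better:
        return (ssd / hdd - 1.0) * 100.0
    return (hdd / ssd - 1.0) * 100.0


if __name__ == "__main__":
    for name, ssd, hdd, higher_is_better in RESULTS:
        pct = ssd_advantage_percent(ssd, hdd, higher_is_better)
        winner = "SSD" if pct > 0 else "HDD"
        print(f"{name:<28} {pct:+7.1f}%  ({winner})")
```

With this metric, the 5 hour PowerEdge run puts the SSD roughly 8% ahead, while the 60 second run on the same server puts it roughly 10% behind.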

If any of you have real-world results involving solid state drives, it would be interesting to hear about them.

1 comment:

doug faucette said...

Mark, that was a very interesting article! It really helps put the performance of the solid state drives in perspective in an Advantage environment. Good info.
