Evaluation of OpenStack Storage Options

As part of implementing an enterprise private cloud based on OpenStack, and working with various cloud customers, I had to examine different storage solutions, including traditional arrays and software-defined storage (SDS). The answer is not trivial, and unfortunately it still depends on the types of workload you have, which is at odds with the essence of cloud infrastructure, which by definition hosts many types of workloads.

Another key consideration is the level of support and DevOps effort you want to take on. Open-source SDS solutions may be “lower” cost, but they tend to consume a lot of IT/DevOps attention: they are harder to manage and lack some of the functionality enterprise customers are used to. So for enterprise customers, traditional solutions may still be more viable if designed in a cost-effective manner (going for mid-range offerings and placing many disks behind each controller).

The two leading solutions for OpenStack block storage are iSCSI (open-source or commercial SDS, or commercial arrays) and Red Hat’s Ceph. The diagrams below illustrate a common deployment model for each:

With iSCSI there is a dual-controller model in front of many disks. Traditional commercial controllers typically incorporate non-volatile read/write caching, sometimes tiering functionality, and integrated management. iSCSI over RDMA (iSER) is a hardware-accelerated iSCSI option which can deliver millions of IOPs and has been supported since the OpenStack Havana release. SDS options include open-source targets like tgtd and LIO, as well as a number of commercial offerings such as Zadara, Nexenta, and EMC ScaleIO.
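
To make this more tangible, here is a minimal sketch of how a tenant would consume such a backend through Cinder, using the classic python-cinderclient authentication pattern; the volume name and the "iser" volume type are assumptions for illustration, not part of any product mentioned above.

    # Minimal sketch, assuming the operator already exposed an iSER-backed
    # Cinder backend through a volume type named "iser" (hypothetical name).
    import os
    from cinderclient import client

    cinder = client.Client('2',
                           os.environ['OS_USERNAME'],
                           os.environ['OS_PASSWORD'],
                           os.environ['OS_TENANT_NAME'],
                           os.environ['OS_AUTH_URL'])

    # Request a 100 GB volume; the scheduler places it on the iSER-backed
    # backend because of the volume type.
    vol = cinder.volumes.create(100, name='db-data', volume_type='iser')
    print(vol.id, vol.status)

The point is that the transport (plain iSCSI vs. iSER) is an operator-side choice; tenants see the same Cinder API either way.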

In the Ceph model there are many OSD servers (each typically hosting some SSDs for metadata and 20-30 hard drives) and a few monitor servers. The data is replicated and striped across all the OSD nodes. Deep striping can be good for large-I/O workloads, but with many small I/Os, or with the blending effect of multiple VMs, it can lead to random I/O at the disk level; the problem is further amplified by the fact that the OSD servers typically don’t have a non-volatile write log/cache to absorb local variance. Note that there is an RDMA-accelerated Ceph transport under development, which will improve Ceph’s throughput and reduce some of its CPU overhead; it may have a lower impact on IOPs due to Ceph’s overall pipeline inefficiency.
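
To make the data path a bit more concrete, here is a minimal sketch using the python-rados and python-rbd bindings: connect to the monitors, open a pool served by the OSDs, and create an RBD image that gets striped and replicated across them. The pool name "volumes" is an assumption and must already exist on the cluster; this is essentially what Cinder’s RBD driver does when it carves volumes out of a Ceph pool.

    # Minimal sketch, assuming /etc/ceph/ceph.conf points at the cluster and
    # a pool named "volumes" (illustrative) already exists.
    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()                      # talks to the monitor quorum

    ioctx = cluster.open_ioctx('volumes')  # pool mapped onto the OSDs via CRUSH
    try:
        # 10 GiB image; Ceph stripes it into 4 MB objects by default, and each
        # object is replicated across OSDs according to the pool's rules.
        rbd.RBD().create(ioctx, 'vm-disk-0', 10 * 1024 ** 3)
    finally:
        ioctx.close()
        cluster.shutdown()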

The chart below tries to summarize the different options we looked into. The cost is calculated by taking the total hardware and license cost and dividing it by the total capacity (of SATA disks).
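
In other words, the metric is simply (hardware + licenses) / raw capacity; the tiny sketch below spells it out with placeholder numbers that are purely hypothetical and not taken from the chart.

    # Toy illustration of the chart's cost metric; all figures are hypothetical.
    def cost_per_tb(hardware_cost, license_cost, raw_capacity_tb):
        return (hardware_cost + license_cost) / raw_capacity_tb

    # e.g. a made-up building block of 60 x 4 TB SATA drives
    print(cost_per_tb(hardware_cost=80000, license_cost=20000,
                      raw_capacity_tb=60 * 4))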

Some key points that can be observed:

  • Ceph is the most comprehensive SDS option, supporting block, object, and basic file access, but it is not a good fit for high-performance workloads like databases, or for workloads that manage their own data and replication, like Hadoop. It is also viewed as “free”, but its inefficient design, which requires a CPU core per disk, leads to higher costs compared to other open-source alternatives.
  • Open-source iSCSI is a one-trick-pony SDS (block/Cinder only), but if designed efficiently, with lots of disks per controller, it can deliver the best cost and performance.
  • In the commercial solution space there is big cost and feature variance. Given that OpenStack manages the data, you can choose a mid-range solution (like NetApp E-Series, DotHill, etc.), which may land very close to the open-source solution in cost and will be much cheaper on OpEx, since it works out of the box. Another option is to use a commercial SDS offering like Zadara, Nexenta, or EMC ScaleIO, which is more expensive but typically more mature, simpler to use, and packed with more features.

Another row in the table compares the Amazon AWS total cost per year (i.e., CapEx + OpEx). It is somewhat amazing how cheap those guys are compared to all the “free” options; assuming they are not losing money, it’s an indication that even more cost-effective solutions can be built. The catch with AWS, however, is that it’s going to cost you quite a bit if you plan to transfer your data to a different location or provider.

You are probably asking yourself: so what do I recommend, right?

For now, choose a combination: Ceph block and object storage for the low-to-mid-performance workloads (VM images, data archives, etc.), and open-source or commercial iSCSI for the performance-oriented block workloads. Use iSCSI over RDMA (iSER) if you want the best latency and the highest IOPs and bandwidth.
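
To make the combination concrete, the sketch below (again using python-cinderclient, with backend names of my own choosing that must match whatever the operator configured in cinder.conf) exposes the two tiers side by side as Cinder volume types, so each workload simply picks the appropriate one.

    # Sketch only: "ceph-rbd" and "lvm-iser" are hypothetical backend names;
    # creating volume types requires admin credentials.
    import os
    from cinderclient import client

    cinder = client.Client('2',
                           os.environ['OS_USERNAME'],
                           os.environ['OS_PASSWORD'],
                           os.environ['OS_TENANT_NAME'],
                           os.environ['OS_AUTH_URL'])

    # Capacity tier on Ceph for images/archives, performance tier on iSER.
    capacity = cinder.volume_types.create('capacity')
    capacity.set_keys({'volume_backend_name': 'ceph-rbd'})

    performance = cinder.volume_types.create('performance')
    performance.set_keys({'volume_backend_name': 'lvm-iser'})

    # Workloads then pick a tier when creating volumes.
    cinder.volumes.create(500, name='archive', volume_type='capacity')
    cinder.volumes.create(100, name='oltp-db', volume_type='performance')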

If you don’t have time to mess with installing and debugging open-source code, go with a commercial SDS solution or buy a mid-range array; there is no need for the fancy/expensive ones, since OpenStack does much of the overall management.

Anyway, it seems like there is room for solutions which take the “unified”, object-based approach of Ceph but can dynamically address a broader set of workloads, rather than using silos of object storage, high-density arrays, all-flash arrays (AFA), and low-cost directly attached storage (common in big-data compute/storage clusters like Hadoop, MongoDB, and Cassandra). Modern SDS solutions will need to deliver infrastructure which can adapt to the specific application/workload requirements, in the most cost-optimal way.

More on that in the following posts…
