At A Glance
Massive array of idle discs (MAID) storage, a technology that radically cuts down on the power consumption of disk drives by spinning them down or off when not in use, has barely made a dent in the market. Despite the hype around green IT, there are several reasons for the slow adoption, in particular the requirement to trade performance for more energy efficiency. This is changing with some interesting shifts in the technology that will likely make MAID storage more appetizing for IT shops over the next 12 months.
In the five years since its introduction, MAID-based storage deployments can be measured in the hundreds, rather than thousands.
MAID is a storage technology that packs a large number of disc drives into a single array and spins only those that are being used at any given time. This significantly reduces the power consumption of the system, and according to vendors that sell these products, also prolongs the life of the drives. For customers, it means the ability to keep much more data online instead of carting it off to tape. The acronym was coined in the late 1990s at the University of Colorado in Boulder, where students were working on a research project to build high-density online storage that consumed less power. COPAN Systems, located in Longmont, Colorado, productised the research and was the first company to ship a system, in February 2004. Since then, COPAN has installed 300 or so systems worldwide, and a handful of other vendors have joined the market.
Who’s Using It And Why?
MAID storage is designed for infrequently accessed data, sometimes referred to as persistent data, where performance (IOPS or bandwidth) is not a big concern. Because it responds faster than magnetic tape, MAID storage has made some headway in the backup market, where customers are eager to increase service levels and improve the reliability of backups. If you want disc-based backup and have power or cooling constraints in the data center, MAID storage is a killer product. However, most IT shops have retained the traditional tools. Financial services companies, Web 2.0 firms, and government agencies have been early adopters, because they keep large quantities of data online and are often located in major cities short on power. Chicago Mercantile Exchange, Credit Suisse, MySpace.com, and the U.S. Department of Defense are all MAID users.
Drawbacks To Adoption
If you have even considered using MAID, chances are that you ruled it out for a handful of reasons. Limited ways to consume the technology, performance concerns, a lack of tools to identify what data to place on MAID storage, and short refresh cycles have all been major obstacles to most firms' implementations:
Limited ways to use the technology. To date, the popular way to access MAID storage has been via a VTL interface in conjunction with a backup application. This has meant a lot of work on your part to figure out how to integrate these systems into an existing backup environment. Furthermore, tape libraries and backup environments are long-term investments where change is unwelcome.
Performance concerns (and myths). There has been a lot of fear, uncertainty, and doubt (FUD) about MAID storage that has obscured some of the real concerns about the technology. Some of the myths floating around have included suggestions that power savings from drives being idle are eaten up by the additional power required to spin the drives back to life. That turns out to be nonsense. And similarly, so is the claim that hard drives fail more often when they are spun on and off. COPAN argues that its software proactively monitors and manages drive health by periodically exercising all disks and detecting potential drive failures before they occur. Better management of the drives actually prolongs the mean time between failures, the company claims.
But there are some less well understood issues around the use of MAID given its performance characteristics. For example, COPAN Systems packs 896 1TB SATA drives into a single cabinet, spinning a maximum of 25 percent of them at any one time and queuing requests as they come in.
In the event that data is requested from a logical unit number (LUN) on a RAID group that is powered down, power management software will spin down an operational LUN that is no longer in use and then power up the LUN that contains the data being requested. This means a delay in getting data on or off LUNs, depending on which drives are spinning at the time of the request. The reduction in power consumption can be as much as 85 percent compared with traditional storage arrays in which all disks are spinning, but not all applications can tolerate the latency introduced by the spin-up process.
Lack of software tools to identify appropriate data for MAID. Most organizations have little to no insight into how much or what kind of data they have. Couple this with a shortage of software to specifically identify persistent data suitable for MAID storage and it's no surprise adoption has been slow. Most storage architects know that putting transactional data on a MAID system will fail, but beyond that, they're not confident about which of their data has the right access patterns for MAID. It would be helpful if archiving software vendors identified which data within their applications belongs on which storage device and moved it there, but so far vendors have been slow to do this work.
Short refresh cycles. Tech refresh cycles have been extended, but not by much. One of the toughest problems in building a long-term archive is preserving the data in the system through multiple generations of technology. COPAN alleges it has been able to extend the life of its storage from the typical three-to-five-year upgrade cycle to seven years. It's a small step forward when you consider building an archive that will last a hundred years.
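To make the spin-up latency discussed under performance concerns more concrete, here is a minimal Python sketch of a cabinet that caps how many RAID groups may spin at once. It is a toy model, not COPAN's or any other vendor's implementation. The class name MaidCabinet, its parameters, and the choice to spin down the least recently used group are all assumptions made for illustration; the article only says that a LUN no longer in use is spun down to make room.

```python
from collections import OrderedDict


class MaidCabinet:
    """Toy model of a MAID cabinet with a cap on concurrently spinning groups.

    Hypothetical illustration only: real power management software is
    far more involved (queuing, drive health checks, RAID rebuilds).
    """

    def __init__(self, total_groups, max_active_fraction=0.25):
        if total_groups < 1:
            raise ValueError("total_groups must be at least 1")
        self.total_groups = total_groups
        # Power budget: at most this many groups may spin at once.
        self.max_active = max(1, int(total_groups * max_active_fraction))
        # Spinning groups, ordered from least to most recently used.
        self.spinning = OrderedDict()
        self.spin_ups = 0

    def read(self, group_id):
        """Serve a request for one group.

        Returns True if the group was already spinning (fast path) and
        False if it had to be spun up first (the latency penalty).
        """
        if not 0 <= group_id < self.total_groups:
            raise ValueError("unknown group_id")
        if group_id in self.spinning:
            self.spinning.move_to_end(group_id)  # mark as recently used
            return True
        if len(self.spinning) >= self.max_active:
            # Assumption: spin down the least recently used group.
            self.spinning.popitem(last=False)
        self.spinning[group_id] = True
        self.spin_ups += 1
        return False


if __name__ == "__main__":
    cabinet = MaidCabinet(total_groups=8, max_active_fraction=0.25)
    for group in [0, 1, 0, 2, 1]:
        hit = cabinet.read(group)
        print(group, "already spinning" if hit else "needed spin-up")
    print("spin-ups:", cabinet.spin_ups)
```

An archive-style workload that keeps returning to the same few groups mostly takes the fast path in this model, while a workload that touches groups at random pays the spin-up penalty on most requests, which is the reason transactional data is a poor fit.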
Vendors with MAID storage products include COPAN Systems, EMC, Fujitsu, Hitachi Data Systems (HDS), NEC, and Nexsan Technologies.
Although the lack of ways to consume MAID storage has delayed its adoption, that's set to change, as the biggest proponents of the technology, COPAN and Nexsan, have recently introduced network-attached storage (NAS) technologies to branch out beyond the backup market. COPAN has bundled Quantum's StorNext file system with its MAID arrays to enable file archiving, and Nexsan recently unveiled a NAS gateway based on Windows Storage Server. Furthermore, the combination of data deduplication and replication will mean that vaulting data offsite on disk is a possibility. MAID will enable data vaulting companies that use low-grade data center space in terms of power and cooling to offer a disk-based backup, restore, and vaulting service.
MAID is also evolving from a binary “on” or “off” mode to a more flexible power management scheme that balances energy consumption against the performance and availability needs of the application.
On the issue of energy efficiency, data center power and cooling considerations are set to become more prevalent, especially in Europe and the Far East where this is already a critical problem.
Vendors need to reduce the power consumption of all components in the system, not just the drives, according to Ethan Miller, professor of computer science at the University of California in Santa Cruz. He says the controllers and CPUs in MAID storage technology can consume 150 to 200 watts apiece and are constantly running. Miller is working on a project in archival storage called Pergamum, which aims to develop a long-term archive using low-power processors, SATA disks, and non-volatile random access memory (NVRAM) such as flash to hold metadata stores. When you consider the implications of searching an exabyte of storage, the address tables for all that data had better be on fast storage.
The author is an analyst with Forrester Research.