SAN Virtualization Guidelines
Now that SAN
"plumbing" has matured with an ample collection of
Fibre Channel products, it's time to turn our attention to
fully harnessing the storage assets at the other end of the
light beams. That takes us into the realm of SAN
virtualization.
While SAN connections widen the pipes and stretch the distance
between disks and hosts, the new plumbing alone does little to
reconcile the conflicts among servers competing for scarce
disk space. You can look at SAN virtualization products as
capacity brokers in this chaotic environment. In their
simplest form, they collect all or portions of the SAN's
physical disks into a pool, and hand out logical slices to
needy application servers without having to re-cable or rezone
the SAN.
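The capacity-broker idea can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not any vendor's
implementation: the class name StoragePool, the disk and volume names, and
the simple first-fit allocation policy are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StoragePool:
    """Toy capacity broker: pools physical disks, hands out logical slices."""
    # Physical disk name -> unallocated capacity in GB.
    free_gb: dict = field(default_factory=dict)
    # Logical volume name -> list of (disk name, GB taken from that disk).
    volumes: dict = field(default_factory=dict)

    def add_disk(self, name: str, size_gb: int) -> None:
        """Contribute a physical disk (or a portion of one) to the pool."""
        self.free_gb[name] = size_gb

    def total_free(self) -> int:
        return sum(self.free_gb.values())

    def allocate(self, volume: str, size_gb: int) -> list:
        """Carve a logical volume out of whichever disks have free space."""
        if volume in self.volumes:
            raise ValueError(f"volume {volume!r} already exists")
        if size_gb > self.total_free():
            raise ValueError("not enough free capacity in the pool")
        extents = []
        remaining = size_gb
        # First-fit: walk the disks in insertion order (an assumed policy).
        for disk, free in list(self.free_gb.items()):
            if remaining == 0:
                break
            take = min(free, remaining)
            if take > 0:
                extents.append((disk, take))
                self.free_gb[disk] = free - take
                remaining -= take
        self.volumes[volume] = extents
        return extents


pool = StoragePool()
pool.add_disk("array1_disk0", 36)   # hypothetical disk names and sizes
pool.add_disk("jbod_disk3", 73)
# The host sees one 50 GB volume; the pool decides which disks back it.
print(pool.allocate("mail_server_vol", 50))
print(pool.total_free())
```

The host asking for 50 GB never learns that its volume spans two physical
disks, which is the point of the abstraction: allocation changes in the
broker's tables, not in the cabling or zoning.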
Properly
architected, virtualization provides many benefits, such as
the ability to allocate storage resources on-demand, integrate
storage products from multiple vendors, configure selectively
for high availability and reduce the total cost of ownership.
Choosing a virtualization product is the challenge. We'll give
you some guidelines that our customers use and that,
consequently, influence our solutions.
Virtualization Schemes
At last count, five divergent approaches to sharing virtual
disk capacity have emerged in the SAN market, spanning about
10 discrete implementations. Ranging broadly in price,
performance, and utility, these virtualization solutions can
be categorized by the methods they use to translate the
physical reality to the host's logical view. The effectiveness
of each technique is essentially determined by where in the
SAN the mapping takes place and what platform is used to
deliver the services. Our taxonomy lumps the offerings into:
- Multi-host storage arrays
- Host-based LUN masking filters
- File system redirectors via outboard metadata controllers
- Specialized in-band virtualization engines
- Dedicated storage domain servers
Note: Some
virtualization engines are packaged versions of storage domain
servers.
Selection Criteria
While virtualization suppliers' claims are often
indistinguishable, there are seven criteria that determine the
success and viability of each approach:
- The degree of independence that these products provide from
a host's operating system and file system.
- The broadness of support for a mixture of storage hardware.
- The ability to protect investments in legacy storage assets.
- The ability of the security policy to share virtual resources
while adequately excluding uninvited guests.
- The effectiveness of the technology at minimizing losses due
to planned and unplanned downtime.
- The breadth of devices consolidated into a centralized
management view.
- The ability to leverage commodity hardware and storage devices
for improved performance and functionality at reasonable
cost.
Ultimately, the
best choice for virtualized SANs must provide unprecedented
levels of reliability, availability and scalability, while
serving as the basis for advanced storage services and
management.
Host Independence
This is a critical point. Several suppliers have elected to
place virtualization software on the hosts - each and every
host, that is. These vendors' engineering teams are spending a
lot of time just on porting and qualifying software to every
operating system. This process compromises focus and energy.
History has shown that this strategy is difficult to maintain
given all the version changes across many host environments.
And the fact remains that this approach requires IT staff to
install intrusive, processor-consuming software on each host
or risk problems. Host-based solutions can mean only one thing
for the system administrators: more headaches every time a
system is added or updated.
Mixed Storage Support
The usual answer from many vendors is "Don't mix. It's
hard. They don't interoperate." Translation: vendor
lock-in. While choosing products from a single vendor can
provide a certain level of near-term comfort, in the long run
you are compromising your ability to respond to change.
Fortunately, a few suppliers without allegiances to specific
storage hardware are far more liberal, willing, and most
importantly, able to put nearly anyone's hardware in their
storage pool as long as it talks Fibre Channel.
Legacy Investment Protection
How much of your current disk population is Fibre Channel
(FC)-ready? If your mix includes SCSI, EIDE or SSA drives, the SAN
virtualization choices get slim. Of course there are Fibre
Channel routers and bridges that could be worked in for
additional cost and complexity. Better instead to look for
storage pooling products that have built-in support for your
existing interfaces. Properly done, the hosts won't know the
difference between virtual devices coming from an FC drive and
a native SCSI spindle (performance of the hardware aside).
Security Concerns
Security and host independence are somewhat intertwined.
Depending on host-based software or hardware to implement the
security layer for shared access control over a SAN is
misplacing the authority. A rogue host doesn't play fair - it
can read and write to any disk in the pool, unintentionally
corrupting a neighbor's data. Steer towards outboard security
implementations that centralize access control and you'll
sleep nights. There's another benefit: with the growing
importance of personal privacy in the e-Commerce world, an
outboard security implementation simplifies the auditing of
data trails.
Resiliency to Outages
Buying devices in pairs to protect against failure is simply
not the best way to spend the IT budget, even though it may be
a common practice. The more practical (and effective) way is
to amortize redundancy across many resources in an N+1
fashion. In other words, when you need five units, buy six,
not 10, and you'll have a great combination of availability
and cost-savings. Make sure your virtualization solution
supports this capability - not all of them do, and those that
don't can cause an unanticipated increase in cost of ownership.
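The arithmetic behind the five-versus-six-versus-ten comparison can be
written out as a short Python sketch. The function names and the scheme
labels "paired" and "n_plus_1" are hypothetical, chosen only for this
illustration.

```python
def units_to_buy(needed: int, scheme: str) -> int:
    """Return how many units to purchase for a given redundancy scheme."""
    if needed < 1:
        raise ValueError("needed must be at least 1")
    if scheme == "paired":
        # Every working unit gets its own dedicated standby.
        return needed * 2
    if scheme == "n_plus_1":
        # One spare is shared across all working units.
        return needed + 1
    raise ValueError(f"unknown scheme: {scheme!r}")


def redundancy_overhead(needed: int, scheme: str) -> float:
    """Fraction of extra hardware bought purely for redundancy."""
    return (units_to_buy(needed, scheme) - needed) / needed


for scheme in ("paired", "n_plus_1"):
    print(scheme, units_to_buy(5, scheme), redundancy_overhead(5, scheme))
# paired   -> 10 units, overhead 1.0 (100 percent extra)
# n_plus_1 ->  6 units, overhead 0.2 (20 percent extra)
```

Note that the N+1 overhead shrinks as the pool grows, while pairing always
doubles the purchase. The sketch covers acquisition counts only; it says
nothing about how many simultaneous failures each scheme survives.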
Centralization
Some define centralized storage pools and storage management
as limited to disks within one box, or one vendor's line of
products. What is your definition? We believe you should look
for centralized administration that includes pooling all the
disks across a network, regardless of where, how many, or what
make or model of storage is attached.
Price-Performance Leverage of the Virtualization Platform
For reasons already discussed, we feel that the virtualization
engine should be outboard, and not a burden of the hosts. In
this case, the platform for the virtualization becomes
extremely important. Some vendors' products use proprietary or
custom hardware and software to provide virtualization and
other services. Naturally, this increases the development and
testing costs, which the end user must ultimately fund. In
addition, the performance and reliability of the system are
more of a gamble, for which the end user bears a large portion
of the risk. We feel you should look for solutions that
leverage existing, proven, high-performance technologies that
are cost-efficient, familiar, easily upgradeable, and
extensible. This includes processors, storage devices, and
operating systems. With that in mind, now you can plan on
flexibly scaling your performance, redundancy and capacity
based on your budget and business needs, rather than the other
way around.
Back to the Choices
Let's compare the SAN virtualization alternatives and see how
each one ranks against our criteria.
Multi-Host Arrays
A multi-host array (Figure 1) puts the pooling responsibility
at the storage subsystem level, usually with RAID controller
firmware. This implementation offers favorable performance, as
well as high availability configurations. Connectivity to many
flavors of hosts is supported, but you can only buy the disks
that come with the array. Perhaps the biggest drawback of this
approach is that the size and makeup of the pool are limited to
the array's monolithic enclosures. Spilling over means running
multiple pools and losing allocation freedom and
centralization. Although some vendors might offer centralized
management for multiple arrays, there are unanswered questions
about multi-vendor support.

LUN Masking
One means of enabling storage pooling is to install
specialized device drivers on each host to prevent that host
from accessing storage resources that it doesn't
"own." These LUN Masking drivers (Figure 2) are
typically configured using a central management application
that can be either host-based or outboard. Although this
method might work well for small, controlled configurations,
it introduces several complexities and costs in large data
center and enterprise SAN operations. First, the LUN masking
support must span a potentially wide spectrum of server
platforms - as we noted earlier, this presents a significant
challenge for the vendor to adequately supply and maintain.
Also, because every single host must have the LUN masking
driver, there is a performance hit to the host and therefore
the network. Plus, change management across numerous hosts is
tedious, costly and slow. Perhaps even more disconcerting is
the ability of any "rogue" host without the proper
LUN masking software to defeat the security controls of the
shared resources and corrupt others' disks in the storage
pool.
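The rogue-host weakness is easy to see in a toy model. The Python sketch
below contrasts a host-side filter, which only works if the driver is
actually installed, with an outboard check that is enforced no matter what
the host runs. All names (host_side_view, OutboardAccessControl, the host
labels and LUN numbers) are hypothetical and do not describe any real
driver or product.

```python
ALL_LUNS = [0, 1, 2, 3]                      # every LUN visible on the SAN
OWNERSHIP = {"web01": {0, 1}, "db01": {2, 3}}  # who is supposed to own what


def host_side_view(host: str, driver_installed: bool) -> list:
    """LUNs a host can reach when masking is done by a driver on the host."""
    if not driver_installed:
        # No driver, no filtering: the host sees the whole pool.
        return list(ALL_LUNS)
    return sorted(OWNERSHIP.get(host, set()))


class OutboardAccessControl:
    """Access table kept outside the hosts, consulted on every request."""

    def __init__(self):
        self._grants = {}

    def grant(self, host: str, lun: int) -> None:
        self._grants.setdefault(host, set()).add(lun)

    def is_allowed(self, host: str, lun: int) -> bool:
        # Unknown hosts have no grants, so they are refused by default.
        return lun in self._grants.get(host, set())


acl = OutboardAccessControl()
for host, luns in OWNERSHIP.items():
    for lun in luns:
        acl.grant(host, lun)

# A well-behaved host and a rogue host under host-side masking:
print(host_side_view("web01", driver_installed=True))    # [0, 1]
print(host_side_view("rogue", driver_installed=False))   # [0, 1, 2, 3]
# The same rogue host against the outboard table:
print(acl.is_allowed("rogue", 2))                         # False
```

The design difference is where the decision is made: in the first case
the host polices itself, in the second the authority sits in one place
that every request has to pass.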

File System Redirectors
A third type of pooling technique involves the use of file
system redirector software (Figure 3). Basically, file access
control travels over the LAN, but disk data I/O moves over the
high-speed SAN. Each host on the SAN requires software to
facilitate the mapping of file names to block addresses, all
brokered by an external metadata controller or file system
manager. To be fair, these products are really targeted at
offloading disk I/O traffic from LANs, rather than general
purpose virtualized storage pooling. We've included them in
our virtualization roundup since there is a level of storage
abstraction in the design. Like LUN masking software, file
system redirection is tied to specific operating environments
and components must be installed on every host. Though the
file sharing services offer value, they are not the best
solution for general SAN virtualization and storage
management. You should overlay file redirection software on a
virtualized storage pooling service to get the best of both
worlds.

Specialized In-band Virtualization Engines
These products provide virtualized storage pooling by
consolidating the storage allocation and security functions on
dedicated platforms that sit between the hosts and the
physical storage (thus "in-band"). Typically, no
additional software is required on the hosts, allowing the
engines to support the diverse range of popular open systems
servers. The virtualization engine (Figure 4) can incorporate
a wide range of components and features. At one end are the
entry-level products that strictly address simple storage
pooling needs and require the purchase of external switches
and storage devices to complete the picture. Others choose to
embed switching support in the "appliance" bundle.
Still others include disks, and appear very similar to
multi-host arrays, but potentially at lower price points with
greater configuration flexibility. The particular components
of the appliance are not necessarily measures of quality,
merely options.
You should note that there is a war raging between the
out-of-band (outside the data path) and the in-band
virtualization camps. Some argue that in-band products slow
data access down, and that the failure of the virtualization
platform could compromise availability. This is only true if
the product is carelessly designed. The successful,
intelligent storage control suppliers have proven that you can
use caching and alternate paths to achieve big performance and
availability payoffs. We've seen dramatically enhanced I/O
response firsthand from JBODs and disk arrays that were
supplemented with in-band virtualization engines sporting
advanced caching. As for survivability, configuring alternate
paths is a long-standing, proven method for continuous
availability that can be implemented for storage networks.
So, the true measure of success for these appliances lies in
their ability to confront and deal with these technical
challenges. Lesser implementations will expose themselves as
single points of failure; intelligent ones will provide
alternate paths and multi-node redundancy through classical
networking techniques proven in the LAN and WAN space. Weaker
products will experience a significant performance hit as data
travels through the appliance; successful solutions will
possess sophisticated, robust read and write caching
algorithms that actually improve the performance of the
physical disks under their control, while also leveraging the
cache already built in to the disk arrays.
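To show why a cache in the data path can reduce, rather than add to, the
work done by the physical disks, here is a minimal Python sketch. It
assumes a least-recently-used read cache with write-through writes; that
policy is our assumption for illustration, and real engines use more
sophisticated algorithms. The class name CachingEngine and the dict that
stands in for the disks are hypothetical.

```python
from collections import OrderedDict


class CachingEngine:
    """Toy in-band engine: LRU read cache in front of a slow block store."""

    def __init__(self, backend: dict, capacity: int):
        self.backend = backend        # block number -> data; stands in for disks
        self.capacity = capacity      # maximum number of cached blocks
        self.cache = OrderedDict()    # oldest entry first, newest last
        self.disk_reads = 0           # how often we had to go to the disks

    def read(self, block: int) -> bytes:
        if block in self.cache:
            self.cache.move_to_end(block)   # mark as most recently used
            return self.cache[block]
        self.disk_reads += 1                # cache miss: fetch from disk
        data = self.backend[block]
        self._remember(block, data)
        return data

    def write(self, block: int, data: bytes) -> None:
        # Write-through: the disk is updated before the call returns,
        # so losing the cache never loses data.
        self.backend[block] = data
        self._remember(block, data)

    def _remember(self, block: int, data: bytes) -> None:
        self.cache[block] = data
        self.cache.move_to_end(block)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used


disks = {0: b"boot", 1: b"data", 2: b"logs"}
engine = CachingEngine(disks, capacity=2)
engine.read(0)
engine.read(0)            # served from cache, no second disk access
print(engine.disk_reads)  # 1
```

Repeated reads of hot blocks never reach the disks, which is the effect
described above. A write-back cache would speed up writes as well, at the
price of needing protection for dirty data held in the engine.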
Storage Domain Servers
A storage domain server (Figure 5) is a commercial server
platform dedicated to the virtualization and allocation of
disk storage to the hosts. The virtualization function is
implemented in software that runs as a network storage control
layer on top of the platform's native operating system. This
allows it to leverage many of the operating system's
networking, volume management, device interoperability and
security features. Some storage domain servers are designed to
collaborate over the SAN. In this way, they distribute the
load and management chores for a large storage pool while
maintaining centralized administration. The hardware
performance and number of storage domain servers can be
optimized to site-specific requirements.

Storage domain servers are capable of adding value to the I/O
stream by optionally performing host- and storage
device-independent caching, in-band performance and load
monitoring, snapshot and remote mirroring services, to name a
few. The richer the feature set, the simpler it becomes to
institute LAN-free and server-less backups, disaster recovery
programs and decision support practices across the entire
storage pool without regard to the supplier of the physical
SAN components. The end result is a huge reduction in
acquisition, administrative and upgrade costs with high return
on investment (ROI) for nearly any type of SAN environment.
The similarities to specialized virtualization engines are no
coincidence; many specialized appliances are simply storage
domain servers with hardware and software add-ons. While they
lose some of the flexibility of a storage domain server, these
appliances bundle the necessary services in a plug-and-play
solution at targeted price points. In the end, just as the
deployment of network domain servers delivered significant
advancement for LANs, storage domain servers promise to
deliver the most compelling advantages of disk virtualization
for SANs.
Conclusions
The recent flood of storage virtualization products presents
an abundance of choices - and a fair share of confusion. To
cut through the chaos, we've tried to identify the key factors
that will influence each offering's long-term success and
viability for both the end user and the vendor. These are
summarized in Table 1 below. Ultimately, the best product for
you will provide complete freedom of choice and high
performance at a reasonable cost. That way, you - not the
vendor - have control over your storage environment.