I recently spoke with Brian
Truskowski, chief technology officer of IBM's newly formed
Storage Systems Group. In the course of the interview,
Truskowski set the record straight about the company's new
virtualization engine and Storage Tank, its relationship with
Hitachi Data Systems and DataCore, and more.
|
|
Q: How does IBM define
virtualization?
Our definition of virtualization centers on block-level I/O,
meaning better management of the physical storage that sits
behind virtualized pools. Our "virtualization
engine" takes disparate storage back-ends and pools those
together to create virtual pools of storage that are
relatively independent of the actual physical storage that
sits behind them.
|
|
This reduces complexity. You
don't have to reconfigure the application servers every time
you want to change some storage. The application servers
aren't tied directly to the physical storage on the back-end;
they connect through the intermediate, or virtualization,
layer.
|
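The indirection Truskowski describes can be illustrated with a minimal sketch. It is not IBM's implementation. The `Backend` and `VirtualVolume` classes, the extent layout, and the array names are all hypothetical, chosen only to show how a virtual block address can stay fixed while the physical storage behind it changes.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """A physical storage device, modeled as a named range of blocks."""
    name: str
    num_blocks: int


class VirtualVolume:
    """Maps virtual block numbers onto (backend, physical block) pairs."""

    def __init__(self, extents):
        # Each extent is (backend, first_physical_block, length).
        self.extents = list(extents)

    @property
    def size(self):
        return sum(length for _, _, length in self.extents)

    def resolve(self, virtual_block):
        """Translate a virtual block number to its physical location."""
        if not 0 <= virtual_block < self.size:
            raise IndexError("virtual block out of range")
        offset = virtual_block
        for backend, start, length in self.extents:
            if offset < length:
                return backend.name, start + offset
            offset -= length

    def migrate(self, index, new_backend, new_start):
        """Repoint one extent at different physical storage.

        Virtual block numbers are unchanged, so hosts need no
        reconfiguration. Copying the data itself is out of scope here.
        """
        _, _, length = self.extents[index]
        if new_start + length > new_backend.num_blocks:
            raise ValueError("extent does not fit on target backend")
        self.extents[index] = (new_backend, new_start, length)


array_a = Backend("array_a", 1000)
array_b = Backend("array_b", 500)
array_c = Backend("array_c", 2000)

# One virtual volume pooled from two different arrays.
volume = VirtualVolume([(array_a, 0, 1000), (array_b, 100, 400)])
before = volume.resolve(1200)   # falls in the second extent
volume.migrate(1, array_c, 0)   # move that extent to another array
after = volume.resolve(1200)    # same virtual address, new location
print(before, after)
```

In this sketch the application server only ever uses the virtual block number, so swapping `array_b` for `array_c` is invisible to it.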
|
Q: Today, your
virtualization capability centers on DataCore Software's
SANsymphony?
What we've announced with DataCore is a very specific set of
solutions tied to our Shark product. It's a tactical solution
... to [add functionality] to Shark. It allows us to do
distance copy and to create larger LUNs. Strategically, the
virtualization engine that we recently announced is what we
are investing in internally and is what we consider to be our
enterprise-wide virtualization solution.
|
|
Q: Why develop your own
virtualization product?
There are a lot of point products out there that address bits
and pieces of the problem, but we aren't convinced they are
ready for the enterprise environment. We don't think existing
products have the same availability [or performance]
characteristics that we're proposing.
|
|
Q: You're taking an
"in-band" approach?
Yes, it's an in-band solution based on IBM eServer xSeries and
Linux [using a fault-tolerant clustered architecture]. Each
node has 4GB of cache [up to eight nodes in pairs], so we have
read-write cache in the network, which gives us a lot of
capability for writing unique functions in the network layer
where the virtualization lives.
|
|
We're convinced that in-band
makes sense for virtualization. There is so much more we can
do function-wise by having it in-band versus out-of-band.
Also, there are ease-of-use advantages to an in-band approach.
All you have to do is drop virtualization into your network.
You don't have to make any changes to your application server,
other than point it to your virtualization layer.
|
|
Q: Does the virtualization
support heterogeneous storage?
Yes, over time. We're working with a number of vendors to get
the right device drivers. We'll probably start out with a
smaller set and extend that over time.
|
|
Q: Does Hitachi have any
role in the development of this specific virtualization
engine?
Right now, their role is as consumer. They've concluded that
our implementation makes a lot of sense. At this point, it
looks like they will use our technology.
|
|
Q: Are you co-developing
the product?
It's not really a joint development, but they will use our
technology to create their own implementation. I think they're
still determining how best to integrate the software. There
are a number of different ways we can go with them, and the
details are still being worked out.
|
|
Q: What is Storage Tank?
Storage Tank is a SAN-wide file system that is common across
all application servers. Today, every
operating system has its own file system. Users can manage the
physical assets of each of these, but these assets are still
grouped into "containers," or virtual pools, virtual
LUNs, virtual volumes, etc.
|
|
All the data in these
containers is unique to the application server. This makes
management very difficult because there isn't a single
namespace across all platforms. Every application server sees its
piece of the storage and only its piece of the
storage--or its part of the file tree. As a result, there is
no common point of management. Because every one of these
environments is different, you have to have separate policies
[e.g., backup and recovery] for every file system out there.
|
|
[But] what if you had one
file system? You still have the native file systems on all the
application servers, but instead of using those, what if all
the data was written in a common way to storage and there was
one place in the storage environment that understood all the
data and wrote it out in a common fashion? That's what Storage
Tank does. It's a file system for storage.
|
|
[In terms of hardware], it's
a metadata server cluster composed of eServer xSeries running
Linux, so it's very similar to the virtualization engine in
terms of the clustered hardware. The cluster of servers sits
off to the side of the storage network. And then there is a
protocol and pieces of software that sit on each of the
application servers.
|
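The arrangement described above (a metadata cluster off to the side, plus software on each application server) can be sketched roughly as follows. This is an illustration under stated assumptions, not Storage Tank's actual protocol. `SharedStorage`, `MetadataServer`, and `ClientAgent` are hypothetical names, and the sketch assumes that hosts fetch block locations from the metadata server and then read data directly from shared storage.

```python
class SharedStorage:
    """Block storage reachable by every application server on the SAN."""

    def __init__(self):
        self.blocks = {}

    def write(self, block_id, data):
        self.blocks[block_id] = data

    def read(self, block_id):
        return self.blocks[block_id]


class MetadataServer:
    """Holds the single namespace: file path -> list of block IDs."""

    def __init__(self):
        self.namespace = {}
        self.next_block = 0

    def allocate(self, path, num_blocks):
        block_ids = list(range(self.next_block, self.next_block + num_blocks))
        self.next_block += num_blocks
        self.namespace[path] = block_ids
        return block_ids

    def lookup(self, path):
        return self.namespace[path]

    def list_files(self):
        # One place that sees every file, whichever host wrote it.
        return sorted(self.namespace)


class ClientAgent:
    """Hypothetical software installed on one application server."""

    def __init__(self, os_name, metadata, storage):
        self.os_name = os_name
        self.metadata = metadata
        self.storage = storage

    def write_file(self, path, chunks):
        block_ids = self.metadata.allocate(path, len(chunks))
        for block_id, chunk in zip(block_ids, chunks):
            self.storage.write(block_id, chunk)

    def read_file(self, path):
        block_ids = self.metadata.lookup(path)
        return b"".join(self.storage.read(b) for b in block_ids)


storage = SharedStorage()
metadata = MetadataServer()
aix_host = ClientAgent("AIX", metadata, storage)
windows_host = ClientAgent("Windows", metadata, storage)

# A file written from one platform is visible from another.
aix_host.write_file("/projects/report.txt", [b"hello ", b"world"])
shared_view = windows_host.read_file("/projects/report.txt")
all_files = metadata.list_files()
print(shared_view, all_files)
```

Because every host resolves paths through the same metadata server, a policy such as backup could be applied once against `list_files()` rather than once per native file system.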
|
Q: How does Storage Tank
differ from the virtualization engine?
It is complementary to our virtualization strategy--whether it
is our virtualization engine or another vendor's. Storage Tank
will work with those virtualization products, but it doesn't
have to. However, we think virtualization will bring the same
value to Storage Tank as it does to environments without
Storage Tank.
|
|
Q: Does Storage Tank apply
to NAS (network-attached storage) environments?
Yes. If you think about it, what is NAS data but a file system
that's outboard of the application server? In my view, there
are three ways of thinking about file systems: as local file
systems, as NAS filers, or both. Storage Tank can be both.
|
|
It's a way to converge NAS
and local file systems into one thing, called the Storage
Tank. It provides a single file-system view, so you can see
the entire file system across all application servers at
once--you can see the whole SAN domain.
|
|
------------------------------------------------------------------
AT A GLANCE
|
|
Virtualization Engine
Description--Enterprise-class block-level
virtualization (in-band approach).
Hardware--IBM eServer xSeries running Linux in
clustered configurations; two-node minimum, scalable to eight
nodes (initially).
Support--Broad OS support, limited storage support
(initially).
Benefits--Improved storage administrator productivity,
common platform for advanced functions (e.g., disaster
recovery, point-in-time and peer-to-peer copy, data
migration), and improved capacity utilization.
Availability--General availability slated for 2003.
|
|
Storage Tank
Description--SAN-wide file system (file aggregation).
Hardware--IBM eServer xSeries running Linux.
OS support--AIX, Solaris, HP-UX, Linux, and Windows
2000/XP.
Benefits--Heterogeneous file sharing, centralized
management, and improved storage utilization. Also being
designed to provide policy-based capabilities (e.g.,
provisioning and non-disruptive data migration).
Availability--In alpha testing; general availability
slated for 2003.