IBM SVC storage



1.    Introduction

          The IBM SAN Volume Controller (SVC) is a block storage virtualization appliance that belongs to the IBM System Storage product family. SVC implements an indirection [1], or "virtualization", layer in a Fibre Channel Storage Area Network (SAN) [2].

2.   Architecture

          The IBM 2145 SAN Volume Controller (SVC) is an inline [3] virtualization or "gateway" device. It logically sits between hosts and storage arrays, presenting itself to hosts as the storage provider (target) and presenting itself to storage arrays as one big host (initiator). SVC is physically attached to any available port in one or several SAN fabrics. The virtualization approach allows for non-disruptive replacement of any part of the storage infrastructure, including the SVC devices themselves. It also aims at simplifying compatibility requirements in strongly heterogeneous [4] server and storage landscapes. All advanced functions are therefore implemented in the virtualization layer, which allows switching storage array vendors [5] without impact. Finally, spreading an SVC installation across two or more sites (stretched clustering) enables basic disaster protection paired with continuous availability.

SVC nodes are always clustered, with a minimum of 2 and a maximum of 8 nodes, and linear scalability. Each I/O group consists of 2 nodes. Each node is a 1U high rack-mounted [6] appliance leveraging IBM System x server hardware, protected by redundant power supplies and an integrated 1U high uninterruptible power supply. Note that the DH8 model is a 2U high unit with integrated battery backup. An integrated two-row display and five-button keyboard offer stand-alone configuration and monitoring options. Each node has four Fibre Channel ports and two or four 10/1 Gbit/s Ethernet ports used for FCoE, iSCSI and management. All Fibre Channel and FCoE ports on the SVC are both targets and initiators, and are also utilized for inter-cluster communication. This includes maintaining read/write cache integrity, sharing status information, and forwarding [7] reads and writes.

Write cache is protected by mirroring within a pair of SVC nodes, called an I/O group. Virtualized resources (i.e., storage volumes presented to hosts) are distributed across I/O groups to improve performance. Volumes can also be moved non-disruptively between I/O groups, e.g., when new node pairs are added or older technology is removed. Node pairs are always active, meaning both members accept simultaneous writes for each volume. In addition, all other cluster nodes accept and forward read and write requests, which are internally handled by the appropriate I/O group. Path or board failures are compensated by non-disruptive failover within each I/O group. This requires multipath drivers such as the IBM Subsystem Device Driver (SDD) or standard MPIO drivers.

3.   Terminology

·  Node - a single 1U or 2U machine.

SVC node models

[Table of SVC node models: cache [GB], FC speed [Gb/s], iSCSI speed [Gb/s], and the server hardware each model is based upon. The one surviving row lists 8 & 16 Gb/s FC and 1 Gbit/s iSCSI (10 Gbit/s optional).]

·  I/O group - a pair of nodes that duplicate each other's write commands

·  Cluster - a group of 1 to 4 I/O groups managed as a single entity.

·  Stretched cluster - a site protection configuration with 1 to 4 I/O groups, each stretched across two sites, plus a witness site

·  Cluster IP address - a single IP address of a cluster that provides administrative interfaces (SSH and HTTPS)

·  Service IP address - an IP address used to service an individual node. Each node can have a service IP configured.

·  Configuration node - a single node that holds the cluster's configuration and has the assigned cluster IP address.

·  Master Console (or SSPC) - a management GUI for SVC until release 5.1, based on WebSphere Application Server; not installed on any SVC node, but on a separate machine

·  As of SVC release 6.1, a Master Console (SSPC) is no longer used. Web-based administration is done directly on the configuration node, using an HTML5 GUI.

·  Virtual Disk (VDisk) - a unit of storage presented to the host. The release 6 GUI refers to a VDisk as a Volume.

·  Managed Disk (MDisk) - a unit of storage (a LUN) from a real, external disk array, virtualized by the SVC. An MDisk is the base to create an image mode VDisk.

·  Managed Disk Group (MDisk Group) - a group of one or more MDisks. The extents of the MDisks in an MDisk Group are the base to create a striped or sequential mode VDisk. The release 6 GUI refers to a Managed Disk Group as a Pool.

·  Extent - a discrete unit of storage; an MDisk is divided into extents; a VDisk is formed from a set of extents.
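The MDisk/extent/VDisk relationship above can be sketched in a few lines of Python. This is a simplified illustration, not SVC's actual allocator; the extent size, function names and round-robin policy are assumptions made for the example:

```python
# Simplified illustration of SVC terminology: MDisks are carved into
# fixed-size extents, and a striped VDisk is built from extents taken
# round-robin across the MDisks of a pool (Managed Disk Group).

EXTENT_MB = 1024  # one possible extent size; SVC supports several


def carve(mdisk_name, size_mb):
    """Split an MDisk into (mdisk, extent_index) tuples."""
    return [(mdisk_name, i) for i in range(size_mb // EXTENT_MB)]


def striped_vdisk(pool, vdisk_mb):
    """Allocate extents round-robin across the pool's MDisks.

    Assumes the pool holds enough free extents for the request.
    """
    needed = vdisk_mb // EXTENT_MB
    vdisk, cursor = [], 0
    while len(vdisk) < needed:
        source = pool[cursor % len(pool)]
        if source:
            vdisk.append(source.pop(0))
        cursor += 1
    return vdisk


pool = [carve("mdisk0", 4096), carve("mdisk1", 4096)]
vdisk = striped_vdisk(pool, 4096)
print(vdisk)  # [('mdisk0', 0), ('mdisk1', 0), ('mdisk0', 1), ('mdisk1', 1)]
```

The striping is why a pool of many MDisks outperforms any single backing LUN: consecutive extents of one VDisk land on different arrays.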

4.   Performance

Release 4.3 of the SVC held the Storage Performance Council (SPC) world record for the SPC-1 performance benchmark, returning nearly 275K (274,997.58) IOPS. There was no faster storage subsystem benchmarked by the SPC at that time (October 2008).[2] The SPC-2 benchmark also returned a world-leading measurement of over 7 GB/s throughput.

Release 5.1 achieved new records with 4-node and 6-node cluster benchmarks with a DS8700 as the back-end storage device. SVC broke its own record of 274,997.58 SPC-1 IOPS in March 2010, with 315,043.59 for the 4-node cluster and 380,489.30 for the 6-node cluster, records that stood until October 2011.

Release 6.2 of the SVC held the Storage Performance Council (SPC) world record for the SPC-1 performance benchmark, returning over 500K (520,043.99) IOPS (I/Os per second) using 8 SVC nodes and a Storwize V7000 as the back-end disk. There was no faster storage subsystem benchmarked by the SPC at that time (January 2012).[3] The full results and executive summaries can be reviewed at the SPC website referenced above.[note 2]

Release 7.x provides multiple enhancements, including support for additional CPUs, cache and adapters. The streamlined cache operates at 100 µs fall-through latency [4] and 60 µs cache-hit latency, enabling SVC as a front-end to IBM FlashSystem solid-state storage without significant performance penalty.

5.   Included Features

1)   Indirection or mapping from virtual LUN to physical LUN

Servers access SVC as if it were a storage controller. The SCSI LUNs they see represent virtual disks (volumes) allocated in SVC from a pool of storage made up of one or more managed disks (MDisks). A managed disk is simply a storage LUN provided by one of the storage controllers that SVC is virtualizing. The virtual capacity can be larger than the managed physical capacity, with a current maximum of 32 PB, depending on management granularity (extent size).
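The dependence of maximum capacity on extent size can be checked with simple arithmetic: addressable capacity is the extent size times the maximum number of extents the cluster can track. The 2^22 extent limit used below is an assumption for illustration; the exact limit depends on the SVC release:

```python
# Addressable capacity = extent size x maximum number of tracked extents.
# MAX_EXTENTS = 2**22 (~4.2 million) is an illustrative assumption.

MAX_EXTENTS = 2 ** 22

for extent_mib in (16, 256, 1024, 8192):
    capacity_pib = extent_mib * MAX_EXTENTS / (1024 ** 3)  # MiB -> PiB
    print(f"extent {extent_mib:5d} MiB -> max capacity {capacity_pib:g} PiB")
```

With the largest extent size the arithmetic lands on 32 PiB, matching the 32 PB order of magnitude quoted above; small extents waste less space per volume but cap the total virtualized capacity.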

2)   Data migration and pooling

SVC can move volumes from one capacity pool (MDisk group) to another whilst maintaining I/O access to the data. Write and read caching remain active. Pools can be shrunk or expanded by removing or adding hardware capacity, while maintaining I/O access to the data. Both features can be used for seamless hardware migration. Migration from an old SVC model to the most recent model is also seamless and implies no copying of data.

3)   Importing and exporting existing LUNs via Image Mode

"Image mode" is a non-virtualized pass-through representation of an MDisk (managed LUN) that contains existing client data; such an MDisk can be seamlessly imported into or removed from an SVC cluster.

4)   Fast-write cache

Writes from hosts are acknowledged once they have been committed into the SVC mirrored cache, but prior to being destaged to the underlying storage controllers. Data is protected by replication to the peer node in an I/O group (cluster node pair). Cache size depends on the SVC hardware model and installed options. Fast-write cache is especially useful to increase performance in midrange storage configurations.
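The write path just described can be sketched as: acknowledge the host only after the data sits in both nodes' caches of the I/O group, and destage to the backing controller later. This is a conceptual model only; the class and method names are invented for illustration:

```python
# Conceptual sketch of fast-write caching in an I/O group: a host write
# is acknowledged once both nodes of the pair hold it in cache;
# destaging to the backing storage controller happens asynchronously.

class Node:
    def __init__(self, name):
        self.name, self.cache = name, {}


class IOGroup:
    def __init__(self):
        self.nodes = (Node("node1"), Node("node2"))

    def write(self, lba, data):
        for node in self.nodes:       # mirror into both nodes' caches
            node.cache[lba] = data
        return "ack"                  # the host sees completion here

    def destage(self, backend):
        for lba, data in self.nodes[0].cache.items():
            backend[lba] = data       # lazy write to the controller
        for node in self.nodes:
            node.cache.clear()


backend = {}
grp = IOGroup()
assert grp.write(42, b"payload") == "ack" and backend == {}  # ack before destage
grp.destage(backend)
assert backend[42] == b"payload"
```

Because the acknowledgment does not wait for the back-end array, host write latency is decoupled from the speed of the underlying controller, which is exactly why the feature helps midrange storage most.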

5)   Auto tiering (Easy Tier)

SVC automatically selects the best storage hardware for each chunk of data, according to its access patterns. Cache-unfriendly "hot" data is dynamically moved to solid-state drives (SSDs), whereas cache-friendly "hot" data and any "cold" data is moved to economical spinning disks. Easy Tier also monitors spindle-only workloads.

6)   Solid state drive (SSD) capability

SVC can use any supported external SSD storage device or provide its own internal SSD slots, up to 32 per cluster. Easy Tiering is automatically active when mixing different media in hybrid capacity pools (Managed Disk groups).

7)   Thin Provisioning

LUN capacity is only consumed when new data is written to a LUN. All-zero data blocks are not physically allocated, unless non-zero data previously existed in them. During import or during internal migrations, all-zero data blocks are discarded (thick-to-thin migration).

In addition, thin provisioning is integrated into the FlashCopy features detailed below to provide space-efficient snapshots.
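The thick-to-thin behaviour can be illustrated with a short sketch: scan the source in fixed-size blocks and allocate physical space only for blocks containing non-zero data. The tiny block size and function name are invented for the example; real SVC accounting works on grains and extents:

```python
# Simplified thick-to-thin import: allocate physical space only for
# blocks that contain non-zero data; all-zero blocks are discarded.

BLOCK = 4  # tiny block size for illustration only


def thin_import(source: bytes) -> dict:
    allocated = {}
    for i in range(0, len(source), BLOCK):
        block = source[i:i + BLOCK]
        if block.strip(b"\x00"):          # skip all-zero blocks
            allocated[i // BLOCK] = block
    return allocated


image = b"DATA" + b"\x00" * 8 + b"TAIL"
thin = thin_import(image)
print(sorted(thin))  # [0, 3] -- only two of four blocks consume space
```

A fully zeroed source therefore consumes no physical capacity at all after import, which is what makes thick-to-thin migration attractive for over-allocated legacy LUNs.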

8)   Virtual Disk Mirroring

Provides the ability to maintain two redundant copies of a LUN, implicitly on different storage controllers.

9)   Site protection with Stretched Cluster

A geographically distributed, highly available clustered storage setup leveraging the virtual disk mirroring feature across datacenters within 300 km distance. Stretched Clusters can span 2, 3 or 4 datacenters (chain or ring topology, a 4-site cluster requiring 8 cluster nodes). Cluster consistency is ensured by a majority voting set.

From two storage devices in two datacenters, SVC presents one common logical instance. All application-oriented operations, like snapshots or resizing, are applied to the logical instance. Hardware-oriented operations like real-time compression or live migration are applied at the physical instance level.

Unlike in classical mirroring, logical LUNs are readable and writable on both sides (tandem) at the same time, removing the need for "failover", "role switch", or "site switch". The feature can be combined with Live Partition Mobility or vMotion to avoid any data transport during a metro-distance virtual server move.

All SVC cluster nodes also have read/write access to storage hardware in the mirror location, removing the need for site resynchronization in case of a simple node failure.

10)  Enhanced Stretched Cluster

A functionality optimizing data paths within a metro- or geo-distance Stretched Cluster (see above), helpful when bandwidth between sites is scarce and cross-site traffic must be minimized. SVC will attempt to use the shortest path for reads and writes. For instance, cache write destaging to storage devices is always performed by the nearest cache copy, unless its peer cache copy is down.

11)  Stretched Cluster with Golden Copy (3-site DR)

A Stretched Cluster that maintains an additional synchronous or asynchronous data copy on an independent Stretched Cluster or SVC or Storwize device at geo distances. The Golden Copy is a disaster protection against metro-scale outages impacting the Stretched Cluster as a whole. It leverages the optional Metro or Global Mirror functionality.

·  Optional features

1)    Real-Time Compression

This technology, invented by the acquired startup Storwize,[5] has been integrated into the SVC and other IBM storage systems. Originally implemented as real-time file compression, it has since been enhanced to also provide in-flight block compression. The efficiency is equal to "zip" LZW (Lempel–Ziv–Welch) with a very large dictionary. The temporal locality of the algorithm may also increase read/write performance on adequate data patterns such as uncompressed databases stored on spinning disks.

Real-time compression can be combined with Easy Tiering, Thin Provisioning and Virtual Disk Mirroring.

2)   FlashCopy (Snapshot)

This is used to create a disk snapshot for backup or application testing of a single volume. Snapshots require only the "delta" capacity unless created with fully provisioned target volumes. FlashCopy comes in three flavours: Snapshot, Clone and Backup volume. All are based on optimized copy-on-write technology, and may or may not remain linked to their source volume.

One source volume can have up to 256 simultaneous targets. Targets can be made incremental, and cascaded, tree-like dependency structures can be constructed. Targets can be re-applied to their source or any other appropriate volume, also of a different size (e.g. resetting any changes from a resize command).

Copy-on-write is based on a bitmap with a configurable grain size, as opposed to a journal.
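The bitmap mechanism can be sketched as follows: one bit per grain records whether that grain has already been copied to the target, and the first write touching an uncopied grain triggers the copy. This is illustrative only; the class, grain size and block-dictionary representation are invented for the example:

```python
# Bitmap-based copy-on-write with a configurable grain size: before a
# write lands on the source, any not-yet-copied grain it touches is
# copied to the target; the bitmap records which grains are done.

GRAIN = 8  # blocks per grain (configurable in real FlashCopy)


class FlashCopy:
    def __init__(self, source, target, nblocks):
        self.source, self.target = source, target
        self.copied = [False] * ((nblocks + GRAIN - 1) // GRAIN)

    def write_source(self, block, data):
        g = block // GRAIN
        if not self.copied[g]:        # first touch of this grain
            for b in range(g * GRAIN, (g + 1) * GRAIN):
                self.target[b] = self.source.get(b, 0)
            self.copied[g] = True
        self.source[block] = data     # now the overwrite may proceed


src = {b: b for b in range(16)}
fc = FlashCopy(src, {}, 16)
fc.write_source(3, 999)               # grain 0 is copied before overwrite
assert fc.target[3] == 3 and src[3] == 999
assert fc.copied == [True, False]     # grain 1 was never touched
```

A larger grain copies more data per first write but keeps the bitmap small; a journal, by contrast, would have to record every individual write, which is the trade-off the sentence above alludes to.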

3)   Metro Mirror - synchronous remote replication

This allows a remote disaster recovery site at a distance of up to about 300 km.

4)   Global Mirror - asynchronous remote replication

This allows a remote disaster recovery site at a distance of thousands of kilometres. Each Global Mirror relationship can be configured for high latency / low bandwidth or for high latency / high bandwidth connectivity, the latter allowing a consistent recovery point objective (RPO) below 1 s.

5)   Global Mirror over IP - remote replication over the Internet

This uses SANslide technology integrated into the SVC firmware to send mirroring data traffic across a TCP/IP link, while maximizing the bandwidth efficiency of that link. This may result in a 100x data transfer acceleration over long distances.

6.   Appendix

1)  Word Translation

·     Fibre Channel Storage Area Network
2)   Concept

·     IBM System x

·     DH8 model

3)  Related link

IBM Storage virtualization




