ZoiT: Solaris Zones on iSCSI Targets (aka NAC: Network-Attached Containers)


Introduction

Solaris Containers have a 'zonepath' ('home') which can be a directory on the root file system or on a non-root file system. Until Solaris 10 8/07 was released, a local file system was required for this directory. Containers that are on non-root file systems have used UFS, ZFS, or VxFS. All of those are local file systems - putting Containers on NAS has not been possible. With Solaris 10 8/07, that has changed: a Container can now be placed on remote storage via iSCSI.

Background

Solaris Containers (aka Zones) are Sun's operating system level virtualization technology. They allow a Solaris system (more accurately, an 'instance' of Solaris) to have multiple, independent, isolated application environments. A program running in a Container cannot detect or interact with a process in another Container.

Each Container has its own root directory. Although viewed as the root directory from within that Container, that directory is also a non-root directory in the global zone. For example, a Container's root directory might be called /zones/roots/myzone/root in the global zone.

The configuration of a Container includes something called its "zonepath." This is the directory which contains a Container's root directory (e.g. /zones/roots/myzone/root) and other directories used by Solaris. Therefore, the zonepath of myzone in the example above would be /zones/roots/myzone.

The global zone administrator can choose any directory to be a Container's zonepath. That directory could just be a directory on the root partition of Solaris, though in that case some mechanism should be used to prevent that Container from filling up the root partition. Another alternative is to use a separate partition for that Container, or one shared among multiple Containers. In the latter case, a quota should be used for each Container.

Local file systems have been used for zonepaths. However, many people have strongly expressed a desire for the ability to put Containers on remote storage. One significant advantage to placing Containers on NAS is the simplification of Container migration - moving a Container from one system to another. When using a local file system, the contents of the Container must be transmitted from the original host to the new host. For small, sparse zones this can take as little as a few seconds. For large, whole-root zones, this can take several minutes - a whole-root zone is an entire copy of Solaris, taking up as much as 3-5 GB. If remote storage can be used to store a zone, the zone's downtime can be as little as a second or two, during which time a file system is unmounted on one system and mounted on another.

Here are some significant advantages to iSCSI over SANs:

  1. the ability to use commodity Ethernet switching gear, which tends to be less expensive than SAN switching equipment
  2. the ability to manage storage bandwidth via standard, mature, commonly used IP QoS features
  3. iSCSI networks can be combined with non-iSCSI IP networks to reduce the hardware investment and consolidate network management. If that is not appropriate, the two networks can be separate but use the same type of equipment, reducing costs and types of in-house infrastrucuture management expertise.

Unfortunately, a Container cannot 'live' on an NFS server, and it's not clear if or when that limitation will be removed.

iSCSI Basics

iSCSI is simply "SCSI communication over IP." In this case, SCSI commands and responses are sent between two iSCSI-capable devices, which can be general-purpose computers (Solaris, Windows, Linux, etc.) or specific-purpose storage devices (e.g. Sun StorageTek 5210 NAS, EMC Celerra NS40, etc.). There are two endpoints to iSCSI communications: the initiator (client) and the target (server). A target publicizes its existence. An initiator binds to a target.

The industry's design for iSCSI includes a large number of features, including security. Solaris implements many of those features. Details can be found:

In Solaris, the command iscsiadm(1M) configures an initiator, and the command iscsitadm(1M) configures a target.

Steps

This section demonstrates the installation of a Container onto a remote file system that uses iSCSI for its transport.

The target system is an LDom on a T2000, and looks like this:

System Configuration:  Sun Microsystems  sun4v
Memory size: 1024 Megabytes
SUNW,Sun-Fire-T200
SunOS ldg1 5.10 Generic_127111-07 sun4v sparc SUNW,Sun-Fire-T200
Solaris 10 8/07 s10s_u4wos_12b SPARC
The initiator system is another LDom on the same T2000 - although there is no requirement that LDoms are used, or that they be on the same computer if they are used.
System Configuration:  Sun Microsystems  sun4v
Memory size: 896 Megabytes
SUNW,Sun-Fire-T200
SunOS ldg4 5.11 snv_83 sun4v sparc SUNW,Sun-Fire-T200
Solaris Nevada snv_83a SPARC
The first configuration step is the creation of the storage underlying the iSCSI target. Although UFS could be used, let's improve the robustness of the Container's contents and put the target's storage under control of ZFS. I don't have extra disk devices to give to ZFS, so I'll make some and use them for a zpool - in real life you would use disk devices here:
Target# mkfile 150m /export/home/disk0
Target# mkfile 150m /export/home/disk1
Target# zpool create myscsi mirror /export/home/disk0 /export/home/disk1
Target# zpool status
pool: myscsi
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
myscsi ONLINE 0 0 0
/export/home/disk0 ONLINE 0 0 0
/export/home/disk1 ONLINE 0 0 0
Now I can create a zvol - an emulation of a disk device:
Target# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myscsi 86K 258M 24.5K /myscsi
Target# zfs create -V 200m myscsi/jvol0
Target# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myscsi 200M 57.9M 24.5K /myscsi
myscsi/jvol0 22.5K 258M 22.5K -
Creating an iSCSI target device from a zvol is easy:
Target# iscsitadm list target
Target# zfs set shareiscsi=on myscsi/jvol0
Target# iscsitadm list target
Target: myscsi/jvol0
iSCSI Name: iqn.1986-03.com.sun:02:c8a82272-b354-c913-80f9-db9cb378a6f6
Connections: 0
Target# iscsitadm list target -v
Target: myscsi/jvol0
iSCSI Name: iqn.1986-03.com.sun:02:c8a82272-b354-c913-80f9-db9cb378a6f6
Alias: myscsi/jvol0
Connections: 0
ACL list:
TPGT list:
LUN information:
LUN: 0
GUID: 0x0
VID: SUN
PID: SOLARIS
Type: disk
Size: 200M
Backing store: /dev/zvol/rdsk/myscsi/jvol0
Status: online

Configuring the iSCSI initiator takes a little more work. There are three methods to find targets. I will use a simple one. After telling Solaris to use that method, it only needs to know what the IP address of the target is.

Note that the example below uses "iscsiadm list ..." several times, without any output. The purpose is to show the difference in output before and after the command(s) between them.

First let's look at the disks available before configuring iSCSI on the initiator:

Initiator# ls /dev/dsk
c0d0s0 c0d0s2 c0d0s4 c0d0s6 c0d1s0 c0d1s2 c0d1s4 c0d1s6
c0d0s1 c0d0s3 c0d0s5 c0d0s7 c0d1s1 c0d1s3 c0d1s5 c0d1s7
We can view the currently enabled discovery methods, and enable the one we want to use:
Initiator# iscsiadm list discovery
Discovery:
Static: disabled
Send Targets: disabled
iSNS: disabled
Initiator# iscsiadm list target
Initiator# iscsiadm modify discovery --sendtargets enable
Initiator# iscsiadm list discovery
Discovery:
Static: disabled
Send Targets: enabled
iSNS: disabled
At this point we just need to tell Solaris which IP address we want to use as a target. It takes care of all the details, finding all disk targets on the target system. In this case, there is only one disk target.
Initiator# iscsiadm list target
Initiator# iscsiadm add discovery-address 129.152.2.90
Initiator# iscsiadm list target
Target: iqn.1986-03.com.sun:02:c8a82272-b354-c913-80f9-db9cb378a6f6
Alias: myscsi/jvol0
TPGT: 1
ISID: 4000002a0000
Connections: 1
Initiator# iscsiadm list target -v
Target: iqn.1986-03.com.sun:02:c8a82272-b354-c913-80f9-db9cb378a6f6
Alias: myscsi/jvol0
TPGT: 1
ISID: 4000002a0000
Connections: 1
CID: 0
IP address (Local): 129.152.2.75:40253
IP address (Peer): 129.152.2.90:3260
Discovery Method: SendTargets
Login Parameters (Negotiated):
Data Sequence In Order: yes
Data PDU In Order: yes
Default Time To Retain: 20
Default Time To Wait: 2
Error Recovery Level: 0
First Burst Length: 65536
Immediate Data: yes
Initial Ready To Transfer (R2T): yes
Max Burst Length: 262144
Max Outstanding R2T: 1
Max Receive Data Segment Length: 8192
Max Connections: 1
Header Digest: NONE
Data Digest: NONE
The initiator automatically finds the iSCSI remote storage, but we need to turn this into a disk device. (Newer builds seem to not need this step, but it won't hurt. Looking in /devices/iscsi will help determine whether it's needed.)
Initiator# devfsadm -i iscsi
Initiator# ls /dev/dsk
c0d0s0 c0d0s3 c0d0s6 c0d1s1 c0d1s4 c0d1s7 c1t7d0s2 c1t7d0s5
c0d0s1 c0d0s4 c0d0s7 c0d1s2 c0d1s5 c1t7d0s0 c1t7d0s3 c1t7d0s6
c0d0s2 c0d0s5 c0d1s0 c0d1s3 c0d1s6 c1t7d0s1 c1t7d0s4 c1t7d0s7
Initiator# ls -l /dev/dsk/c1t7d0s0
lrwxrwxrwx 1 root root 100 Mar 28 00:40 /dev/dsk/c1t7d0s0 ->
../../devices/iscsi/disk@0000iqn.1986-03.com.sun%3A02%3Ac8a82272-b354-c913-80f9-db9cb378a6f60001,0:a

Now that the local device entry exists, we can do something useful with it. Installing a new file system requires the use of format(1M) to partition the "disk" but it is assumed that the reader knows how to do that. However, here is the first part of the format dialogue, to show that format lists the new disk device with its unique identifier - the same identifier listed in /devices/iscsi.
Initiator# format
Searching for disks...done

c1t7d0: configured with capacity of 199.98MB

AVAILABLE DISK SELECTIONS:
0. c0d0
/virtual-devices@100/channel-devices@200/disk@0
1. c0d1
/virtual-devices@100/channel-devices@200/disk@1
2. c1t7d0
/iscsi/disk@0000iqn.1986-03.com.sun%3A02%3Ac8a82272-b354-c913-80f9-db9cb378a6f60001,0
Specify disk (enter its number): 2
selecting c1t7d0
[disk formatted]
Disk not labeled. Label it now? no

Let's jump to the end of the partitioning steps, after assigning all of the available disk space to partition 0:
partition> print
Current partition table (unnamed):
Total disk cylinders available: 16382 + 2 (reserved cylinders)

Part Tag Flag Cylinders Size Blocks
0 root wm 0 - 16381 199.98MB (16382/0/0) 409550
1 unassigned wu 0 0 (0/0/0) 0
2 backup wu 0 - 16381 199.98MB (16382/0/0) 409550
3 unassigned wm 0 0 (0/0/0) 0
4 unassigned wm 0 0 (0/0/0) 0
5 unassigned wm 0 0 (0/0/0) 0
6 unassigned wm 0 0 (0/0/0) 0
7 unassigned wm 0 0 (0/0/0) 0

partition> label
Ready to label disk, continue? y

The new raw disk needs a file system.
Initiator# newfs /dev/rdsk/c1t7d0s0
newfs: construct a new file system /dev/rdsk/c1t7d0s0: (y/n)? y
/dev/rdsk/c1t7d0s0: 409550 sectors in 16382 cylinders of 5 tracks, 5 sectors
200.0MB in 1024 cyl groups (16 c/g, 0.20MB/g, 128 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 448, 864, 1280, 1696, 2112, 2528, 2944, 3232, 3648,
Initializing cylinder groups:
....................
super-block backups for last 10 cylinder groups at:
405728, 406144, 406432, 406848, 407264, 407680, 408096, 408512, 408928, 409344

Back on the target:
Target# zfs list
NAME USED AVAIL REFER MOUNTPOINT
myscsi 200M 57.9M 24.5K /myscsi
myscsi/jvol0 32.7M 225M 32.7M -
Finally, the initiator has a new file system, on which we can install a zone.
Initiator# mkdir /zones/newroots
Initiator# mount /dev/dsk/c1t7d0s0 /zones/newroots
Initiator# zonecfg -z iscuzone
iscuzone: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:iscuzone> create
zonecfg:iscuzone> set zonepath=/zones/newroots/iscuzone
zonecfg:iscuzone> add inherit-pkg-dir
zonecfg:iscuzone:inherit-pkg-dir> set dir=/opt
zonecfg:iscuzone:inherit-pkg-dir> end
zonecfg:iscuzone> exit
Initiator# zoneadm -z iscuzone install
Preparing to install zone .
Creating list of files to copy from the global zone.
Copying <2762> files to the zone.
Initializing zone product registry.
Determining zone package initialization order.
Preparing to initialize <1162> packages on the zone.
...
Initialized <1162> packages on zone.
Zone is initialized.
Installation of these packages generated warnings:
The file contains a log of the zone installation.

There it is: a Container on an iSCSI target on a ZFS zvol.

Zone Lifecycle, and Tech Support

There is more to management of Containers than creating them. When a Solaris instance is upgraded, all of its native Containers are upgraded as well. Some upgrade methods work better with certain system configurations than others. This is true for UFS, ZFS, other local file system types, and iSCSI targets that use any of them for underlying storage.

You can use Solaris Live Upgrade to patch or upgrade a system with Containers. If the Containers are on a traditional file system which uses UFS (e.g. /, /export/home) LU will automatically do the right thing. Further, if you create a UFS file system on an iSCSI target and install one or more Containers on it, the ABE will also need file space for its copy of those Containers. To mimic the layout of the original BE you could use another UFS file system on another iSCSI target. The lucreate command would look something like this:

# lucreate -m /:/dev/dsk/c0t0d0s0:ufs   -m /zones:/dev/dsk/c1t7d0s0:ufs -n newBE

Conclusion

If you want to put your Solaris Containers on NAS storage, Solaris 10 8/07 will help you get there, using iSCSI.
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值