Fun with Linux UDEV and ASM: Using UDEV to create ASM disk volumes


Because of the many discussions and the confusion around partitioning, disk alignment and its sibling issue, ASM disk management, here is an explanation of how to use UDEV. As an extra, I present a tool that manages some of this stuff for you.

The questions could be summarized as follows:

  • When do we have issues with disk alignment and why?
  • What methods are available to set alignment correctly and to verify?
  • Should we use ASMlib or are there alternatives? If so, which ones and how to manage those?

I’ve written 2 blog posts on the matter of alignment, so I am not going to repeat the details here. The only thing you need to remember is that classic “MS-DOS” disk partitioning, by default, starts the first partition on the disk at the wrong offset (wrong in terms of optimal performance). The old partitioning scheme was invented when physical spinning rust was formatted with 63 sectors of 512 bytes per disk track. Because you need some header information for the boot block and partition table, the smart guys back then thought it was a good idea to start the first block of the first data partition on track 1 (instead of track 0), which puts it at sector 63, or 32,256 bytes into the disk. These days we have completely different physical disk geometries (and sometimes even different sector sizes, another interesting topic) but we still have the legacy of the old days.

If you’re not using an Intel X86_64 based operating system then chances are you have no alignment issues at all (the only exception I know is Solaris if you use “fdisk”, similar problem). If you use newer partition methods (GPT) then the issue is gone (but many BIOSes, boot methods and other tools cannot handle GPT). As MSDOS partitioning is limited to 2 TiB (http://en.wikipedia.org/wiki/Master_boot_record) it will probably be a thing of the past in a few years but for now we have to deal with it.

Wrong alignment causes some reads and writes to be broken in 2 pieces causing extra IOPS. I don’t have hard numbers but a long time ago I was told it could be an overhead of up to 20%. So we need to get rid of it.

ASM storage configuration

ASM does not use OS file systems or volume managers but has its own way of managing volumes and files. It “eats” block devices, and these block devices need to be readable and writable for the user/group that runs the ASM instance, as well as for the user/group that runs the Oracle database processes (an open secret is that ASM is out-of-band and databases write directly to ASM data chunks). ASM does not care what the name or device numbers of a block device are, nor does it care whether it is a full disk, a partition, or some other type of device, as long as it behaves as a block device under Linux (and probably other UNIX flavors). It does not need partition tables at all but writes its own disk signatures to the volumes it gets.

[ Warning: Lengthy technical content, Rated T, parental advisory required ]

ASM detects volumes (typically after boot) by scanning a path called the ASM DISKSTRING. If the disk string is set to /dev/whatever then ASM will scan /dev/whatever/* for block devices that are readable and writable. If they contain valid ASM signatures it will try to access them and compose ASM disk groups from them. If they have no valid signatures then the volumes become “candidate” disks, i.e. they may be used to create new ASM volumes and/or disk groups.

Linux, by default, places detected disk volumes under /dev/ with root:disk as user/group (for example, /dev/sdq) and not readable or writable for anyone other than root and members of group disk. Oracle cannot work with this, as it typically does not run as root. Also, if ASM scans /dev/ it will have to scan a lot of devices, because /dev/ is home to not just all disks but also all other block (and character) devices.

So we need to present ASM volumes under a path that matches the ASM diskstring and has different user/group ownership and permissions. One way of doing this is using the “mknod” command to create references (inodes) to the disks under /dev. So for example /dev/sdq and /dev/oracleasm/disks/myvol both point to a Linux device with major ID 65 and minor ID 0 (sdq is the 17th SCSI disk, so it falls just outside major 8, which covers sda through sdp). They are essentially different names for the very same thing and can have different permissions (so /dev/sdq can have root:root and myvol can have oracle:asm as ownership while still being the same device). The problem is that this is a) a very manual process, b) error prone and c) not guaranteed to be consistent across reboots.
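The mknod approach described above can be sketched roughly as follows. This is only an illustration: the helper name alias_commands is made up, it assumes GNU stat, and it prints the commands instead of running them so you can review them first.

```shell
# Print (do not run) the commands that would create an extra device node
# for an existing block device. Hypothetical helper, for illustration only.
# Arguments: existing device, new path, owner:group
alias_commands() {
    local source_dev="$1" alias_path="$2" owner="$3"
    local major minor
    # stat reports major/minor numbers in hexadecimal; mknod wants decimal.
    major=$(( 0x$(stat -c '%t' "$source_dev") ))
    minor=$(( 0x$(stat -c '%T' "$source_dev") ))
    printf 'mkdir -p %s\n' "$(dirname "$alias_path")"
    printf 'mknod %s b %s %s\n' "$alias_path" "$major" "$minor"
    printf 'chown %s %s\n' "$owner" "$alias_path"
    printf 'chmod 0660 %s\n' "$alias_path"
}

# Example (as root, after reviewing the printed commands):
#   alias_commands /dev/sdq /dev/oracleasm/disks/myvol oracle:asm
```

Nodes created this way live in /dev, which is rebuilt at boot, and the sdX letter of a LUN can change between boots. That is exactly why this manual method is fragile.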

ASMlib

Oracle therefore created a tool called ASMLib that consisted of a custom kernel module and a set of command line tools. The kernel module (driven by the CLI) would scan all devices in /dev/* after boot and if it found the right signatures it would clone those (by creating additional device inodes using mknod) typically using diskstring /dev/oracleasm/disks. ASMLib has a few flaws:

  • Can be very slow during boot because it scans ALL devices it can for signatures
  • The kernel module needs to be recompiled against the right kernel every time (this can be done partially dynamically but it is sensitive to errors, and the module is not part of the validated kernel source code, making it tricky to maintain)
  • Oracle dropped support for SuSE/RHEL 6/CentOS 6 so customers upgrading from 5.x were stuck unless using Oracle Linux (but Oracle now seems to support Red Hat again)
  • Requires the disk to be partitioned first (i.e. /dev/sdq is not accepted by ASMlib, but /dev/sdq1 is)

The partition requirement probably caused partitioning of ASM disks to be the de facto standard these days. Some people claim that an unaware Linux sysadmin will be tempted to create a partition table and file system on any raw device he gets his hands on, and if this happens to be a disk already in use for ASM, it can ruin your whole day (and more). If the disk already has a partition, the rumor goes, the admin will think the disk is in use and not be tempted to just create a file system on the (in his eyes) empty partition 1 (I have my serious doubts about this, as a low-brain admin will destroy anything in his path anyway, but alas).

Alignment (again)

So if we use partitions then we need to make sure the partition is aligned at a multiple of the element size of our storage device (whatever that is). For EMC VMAX and VNX the disk “track” size is 8K (track quoted because these days it is a far cry from what the real track would look like on the real spinning platter – there are many indirection layers). So we need to align at least in multiples of 8K (16 sectors). As 8K is not accepted by the partition tools we need to go larger – plus, for other reasons it makes sense to go larger (think storage cache slot size, raid stripe width etc). EMC recommends 64K (128 sectors) or a multiple thereof.

DBAs prefer an offset of 1 MiB (2048 sectors) because it improves the chance that someone who overwrites a disk with a partition label does so without touching the real Oracle data (and ASM keeps copies of the disk header at multiple offsets on the disk, so wiping out 1 MiB from block zero is often recoverable, although still ugly).

How can you figure out if a disk is partitioned? (assuming all Linux here, my UNIX skills are pretty rusty these days):

# parted /dev/sdq unit s print
Number Start End Size Type File system Flags
1 2048s 155647s 153600s primary ext4

The 2048s means the first partition starts here at 2048 sectors = 1024K = 1MiB. If it shows 128, all is good, if it shows 63, trouble. Note that it only matters for disks that do lots of random IO (so I don’t care about the boot disk for example).
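The arithmetic behind this check is simple enough to script. A minimal sketch, assuming 512-byte sectors by default; the function name is_aligned is made up for this example. On a live system the starting sector of a partition can also be read from sysfs, for example /sys/block/sdb/sdb1/start.

```shell
# Report whether a partition's starting sector falls on an alignment boundary.
# Arguments: start sector, alignment in KiB (default 64), sector size in bytes (default 512)
is_aligned() {
    local start_sector="$1"
    local align_kib="${2:-64}"
    local sector_bytes="${3:-512}"
    local offset_bytes=$(( start_sector * sector_bytes ))
    if [ $(( offset_bytes % (align_kib * 1024) )) -eq 0 ]; then
        echo "aligned"
    else
        echo "misaligned"
    fi
}

# Example calls:
#   is_aligned 63          # classic DOS offset: 63 * 512 = 32256 bytes
#   is_aligned 128         # 64 KiB offset
#   is_aligned 2048 1024   # 1 MiB offset checked against a 1 MiB boundary
```

With the defaults, sector 63 is reported as misaligned while 128 and 2048 are aligned, which matches the rule of thumb above.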

Let’s create a partition on an empty disk using classic tools. I’m using CentOS 6.5 here (most recent non-beta distro 100% compatible with Red Hat):

[root@dbhost ~]# fdisk /dev/sdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x959383a2.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
switch off the mode (command 'c') and change display units to
sectors (command 'u').
Command (m for help):
n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-130, default 1): 1
Last cylinder, +cylinders or +size{K,M,G} (1-130, default 130):
Using default value 130
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
[root@dbhost ~]# parted /dev/sdb unit s print
Model: VMware, VMware Virtual S (scsi)
Disk /dev/sdb: 2097152s
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Number Start End Size Type File system Flags
1 63s 2088449s 2088387s primary

Who said modern Linux versions had this problem solved? 🙂

My blog post “Disk alignment reloaded” goes a bit deeper into this stuff. But the command to partition correctly is this:

# parted /dev/sdb mklabel msdos ; parted /dev/sdb mkpart primary 1m 100%

Note that ASMlib does not partition devices itself. The Linux admin has to do that and then use ASMlib commands (“oracleasm createdisk”) to mark the volumes for ASM usage.

Linux UDEV

Is there a better way? Yes. Linux has offered a facility called UDEV since (at least) RHEL version 5 (probably earlier). UDEV is a mechanism that manipulates the way the kernel detects all sorts of devices and presents them to the system for further usage. It has a built-in set of rules that define things like persistent device naming, ownerships and the like. The nice thing about UDEV is that you can add or override rules. Some people have already written about how to use UDEV for Oracle (ASM), for example Frits Hoogland, and many more (searching Google finds many of them).

As said, a disk volume by default is presented as /dev/sdX (where X is the next available letter from a to z, continuing with aa, ab, ... when there are more than 26 of them). Owner:group is root:disk and only the root user or members of group “disk” may read/write to them.

If we add a custom rule we can alter this presentation for a block device. Example:

[root@dbhost ~]# cat /etc/udev/rules.d/99-asm.rules

OWNER="grid", GROUP="asmdba", MODE="0660", ENV{DEVTYPE}=="disk", KERNEL=="sd*", ENV{ID_SERIAL}=="36000c29f8e2de32a6d10cf8e69d2816f", NAME="oracleasm/vol1"

Explanation: custom rules have to go in /etc/udev/rules.d and have the extension .rules. The number (99) determines the order of processing: rules files are read in lexical order, so a file starting with 99 is processed near the end.

You see assignments (single = sign) and comparisons (double == sign).

Assignments tell UDEV what to do with a device once it matches all comparisons.

So in this case, it gives the device owner “grid” and group “asmdba” with mode 0660 if the device detected is a disk, has a kernel name starting with “sd” (typically SCSI disks) and has the serial as shown. It also gives the device a new path and name (/dev/oracleasm/vol1 instead of /dev/sdsomething).

The detection is not based on the SCSI target and lun numbers or something similar (these can change). SCSI devices have unique identifiers (say, world wide names) that can be used for persistent naming. If a device matches the generic type AND has the correct identifier, it will become /dev/oracleasm/vol1 with the permissions as mentioned. It will not appear any longer as /dev/sdq (or whatever it would be named normally).
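To avoid typing such a rule by hand, a few lines of shell can print it for you. This is a sketch only: make_asm_rule is a made-up name, and the owner, group and mode defaults are simply taken from the example rule above. The serial itself can usually be looked up with "udevadm info --query=property --name=/dev/sdb" (look for ID_SERIAL).

```shell
# Print one udev rule line in the same style as the example rule.
# Arguments: ID_SERIAL value, volume name, owner (default grid), group (default asmdba)
make_asm_rule() {
    local serial="$1" volume="$2"
    local owner="${3:-grid}" group="${4:-asmdba}"
    printf 'OWNER="%s", GROUP="%s", MODE="0660", ENV{DEVTYPE}=="disk", KERNEL=="sd*", ENV{ID_SERIAL}=="%s", NAME="oracleasm/%s"\n' \
        "$owner" "$group" "$serial" "$volume"
}

# Example (append to the rules file, then reload and re-trigger udev):
#   make_asm_rule 36000c29f8e2de32a6d10cf8e69d2816f vol1 >> /etc/udev/rules.d/99-asm.rules
#   udevadm control --reload-rules
#   udevadm trigger
```

As far as I know, newer udev versions no longer let NAME= rename block device nodes, and SYMLINK+= is the usual substitute there. The rule printed here follows the RHEL 6 behaviour used throughout this post.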

So we kill 3 birds with one stone:

  • We completely bypassed the partitioning problem, Oracle gets a block device that is the whole LUN and nothing but the LUN
  • We assigned the correct permissions and ownership and moved to a place where ASM only needs to scan real ASM volumes (not 100s of other thingies)
  • We completely avoid the risk of a rookie ex-Windows administrator formatting an (in his eyes) empty volume (that actually contains precious data). An admin will not look in /dev/oracleasm/ to start formatting disks there

And the best of all: we have an astonishing extra whopping Megabyte of disk space for our tablespaces, due to not needing boot sector and partition tables! Yaaay!

But first a caveat with VMware. VMware creates such SCSI identifiers for each disk; however, by default it does not expose them to the guest OS. So if you query the scsi_id of a volume under VMware, it will come back empty. For VMware to present disk IDs to the guest OS, you need to set the config parameter “disk.enableUUID = true” in each VM’s config (VMX) file. That’s all (manually edit the VMX file, or in vSphere use the VM configuration GUI to add an entry).

Introducing asmdisks

Are we done yet? Nope. Manually maintaining the ASM rules is hard work: you need to scan for the IDs, copy-paste them into the rules file, then give the volumes names, etc. It can be done but it is not nice. Browsing all the blog posts that describe in great detail how to create the UDEV rules file, I wondered why nobody seemed to have written a tool to do all the hard work for you. As the saying goes, a good Unix administrator is lazy enough to automate repetitive tasks, but everyone still suggests manually messing around with copy/paste of SCSI ID strings and all that.

So I wrote a script called “asm” (in a moment of sheer inspiration) that mimics ASMlib commands but actually generates these rules for you. I now present you the 1.0 final version. It’s packaged as an RPM package called “asmdisks” and has man pages and a few other goodies inside. Dependencies are set correctly, so for example, the packages required to read SCSI IDs are installed automatically if you install the RPM with a decent package manager (YUM).

Short demo:

Install RPM package

[root@dbhost ~]# yum install asmdisks
…
…
Dependencies Resolved
=====================================================================================
Package Arch Version Repository Size
=====================================================================================
Installing:
asmdisks noarch 1.0-1 test 14 k
Installing for dependencies:
bc x86_64 1.06.95-1.el6 base 110 k
lsscsi x86_64 0.23-2.el6 base 38 k
parted x86_64 2.1-21.el6 base 606 k
sysstat x86_64 9.0.4-22.el6 base 230 k

Transaction Summary
=====================================================================================
Install 5 Package(s)
Total download size: 998 k
Installed size: 3.3 M
Is this ok [y/N]: y
…
Installed:
asmdisks.noarch 0:1.0-1
Dependency Installed:
bc.x86_64 0:1.06.95-1.el6 lsscsi.x86_64 0:0.23-2.el6
parted.x86_64 0:2.1-21.el6 sysstat.x86_64 0:9.0.4-22.el6

Complete!

Listing available disks and creating one for ASM

[root@dbhost ~]# asm disks
/dev/sda [2:0:0:0] 20.00 GB partitioned
/dev/sdb [2:0:1:0] 1.00 GB available
/dev/sdc [2:0:2:0] 1.00 GB available
/dev/sdd [2:0:3:0] 4.00 GB available
/dev/sde [2:0:4:0] 4.00 GB available
/dev/sdf [2:0:5:0] 2.00 GB available
/dev/sdg [2:0:6:0] 1.00 GB available
[root@dbhost ~]# asm createdisk vol1 /dev/sdb
[root@dbhost ~]# asm disks
/dev/sda [2:0:0:0] 20.00 GB partitioned
/dev/sdb [2:0:1:0] 1.00 GB configured as /dev/oracleasm/vol1
/dev/sdc [2:0:2:0] 1.00 GB available
/dev/sdd [2:0:3:0] 4.00 GB available
/dev/sde [2:0:4:0] 4.00 GB available
/dev/sdf [2:0:5:0] 2.00 GB available
/dev/sdg [2:0:6:0] 1.00 GB available

Note that the numbers between brackets are the SCSI host, channel, target and LUN IDs (the same notation as “lsscsi” output, FYI). Let’s look at /dev/sdb:

[root@dbhost ~]# ls -ald /dev/sdb
ls: cannot access /dev/sdb: No such file or directory
[root@dbhost ~]# ls -ald /dev/oracleasm/vol1
brw-rw---- 1 grid asmdba 8, 16 Jun 24 08:55 /dev/oracleasm/vol1

It’s missing from /dev (no way to mess up by admin!) and reappeared under /dev/oracleasm/vol1 with correct ownership. ASM can pick it up and make a disk group out of it.

Let’s take a look under the covers:

[root@dbhost ~]# cat /etc/asmtab
# /etc/asmtab - configuration file for asmdisks
# Definitions for IORate testing - volumes under /dev/iorate will have root:iops @ mode 0601
PATH=iorate:root:iops:0660
#
# This file keeps track of udev disk mappings for asmdisk(1)
# You should normally not have to edit this file directly
# Use asm(1) instead.
#
# On each line:
#
# label type identifier
# where
# label: diskstring/volume name (default diskstring is oracleasm and can be omitted)
# type: one of scsi, part or mapper (scsi=entire SCSI disk, part=scsi disk partition, mapper=linux disk mapper device)
# label: scsi_id, scsi_id:partition, mapper_name
#
# Ownerships and permissions can be specified for a diskstring:
# PATH=diskstring:owner:group:mode
# default is oracleasm:grid:asmdba:0660
#
# example:
# vol1 scsi 36000c29f825cd85b5fcc70a1aadebf0c # entire SCSI disk
# vol2 part 36000c298afa5c31b47fe76cbd1750937:1 # partition 1 of entire SCSI disk
# vol3 mapper mpathb # /dev/mapper/mpathb (multipath device)
# iorate/test1 mapper iops-vol1 # LV vol1 on VG iops, will be mapped as /dev/iorate/test1
# -----------------------------------------------
vol1 scsi 36000c29f8e2de32a6d10cf8e69d2816f

We see an entry with the volume name (vol1), the type (will explain that later), and the disk ID. The asm script detects the ID automatically and configures it. Not a single time do you need to manually detect the id or copy it all over the place.

[root@dbhost ~]# cat /etc/udev/rules.d/99-asm.rules
SUBSYSTEM!="block", GOTO="asmudev_end"
ENV{DEVPATH}=="*/block/sda", GOTO="asmudev_end"

OWNER="grid", GROUP="asmdba", MODE="0660", ENV{DEVTYPE}=="disk", KERNEL=="sd*", ENV{ID_SERIAL}=="36000c29f8e2de32a6d10cf8e69d2816f", NAME="oracleasm/vol1"

LABEL="asmudev_end"

The rules file looks very similar to my earlier example, with a few additions: the SUBSYSTEM check is done only once (cleans up the mess a bit) but more important: /dev/sda is EXCLUDED from any manipulation. This prevents typos that cause the boot volume to go missing from /dev/, which makes the system unbootable (no data loss but nasty to fix; believe me, I accidentally did that once and immediately created this protection against it 😉)
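To give an idea of how an asmtab-style file can be turned into such a rules file, here is a rough sketch. It is not the actual asm script: the function name asmtab_to_rules is made up, it only handles entries of type scsi in the default oracleasm diskstring, and it leaves out the /dev/sda guard.

```shell
# Read asmtab-style lines ("label type identifier") on stdin and print udev rules.
# Simplified illustration: only type "scsi" with the default ownership is handled.
asmtab_to_rules() {
    local label type ident rest
    echo 'SUBSYSTEM!="block", GOTO="asmudev_end"'
    while read -r label type ident rest; do
        case "$label" in
            ''|'#'*|PATH=*) continue ;;   # skip blank lines, comments and PATH definitions
            */*) continue ;;              # skip labels in non-default diskstrings
        esac
        if [ "$type" = "scsi" ]; then
            printf 'OWNER="grid", GROUP="asmdba", MODE="0660", ENV{DEVTYPE}=="disk", KERNEL=="sd*", ENV{ID_SERIAL}=="%s", NAME="oracleasm/%s"\n' \
                "$ident" "$label"
        fi
    done
    echo 'LABEL="asmudev_end"'
}

# Example:
#   asmtab_to_rules < /etc/asmtab
```

Anything after the third field (such as a trailing comment) ends up in the unused "rest" variable, so annotated lines like the examples in the asmtab header still parse.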

What if we want to remove the volume?

[root@dbhost ~]# asm deletedisk vol1
[root@dbhost ~]# ls -al /dev/oracleasm/vol1
ls: cannot access /dev/oracleasm/vol1: No such file or directory
[root@dbhost ~]# ls -al /dev/sdb
brw-rw---- 1 root disk 8, 16 Jun 24 09:02 /dev/sdb

‘Nuff said. Beware of doing this to disks that are in use by ASM. Makes Oracle sad.

Let’s create a few more volumes and present different ways of showing the configuration.

[root@dbhost ~]# asm createdisk myvol /dev/sdc
[root@dbhost ~]# asm createdisk yourvol /dev/sdd
[root@dbhost ~]# ls -al /dev/oracleasm/
drwxr-xr-x 2 root root 80 Jun 24 09:04 .
drwxr-xr-x 19 root root 4020 Jun 24 09:04 ..
brw-rw---- 1 grid asmdba 8, 32 Jun 24 09:04 myvol
brw-rw---- 1 grid asmdba 8, 48 Jun 24 09:04 yourvol
[root@dbhost ~]# asm list
myvol   1.00 GB [-] sdc
yourvol 4.00 GB [-] sdd
[root@dbhost ~]# asm disks
/dev/sda [2:0:0:0] 20.00 GB partitioned
/dev/sdb [2:0:1:0]  1.00 GB available
/dev/sdc [2:0:2:0]  1.00 GB configured as /dev/oracleasm/myvol
/dev/sdd [2:0:3:0]  4.00 GB configured as /dev/oracleasm/yourvol
/dev/sde [2:0:4:0]  4.00 GB available
/dev/sdf [2:0:5:0]  2.00 GB available
/dev/sdg [2:0:6:0]  1.00 GB available

Could we rename a disk? Haven’t implemented that but you can do it yourself by editing asmtab:

[root@dbhost ~]# vi /etc/asmtab
yourvol scsi 36000c2905b5f9379248f904459f8b449

Change to:

myvol2 scsi 36000c2905b5f9379248f904459f8b449

Rescan asmtab for changes:

[root@dbhost ~]# asm scandisks
[root@dbhost ~]# ls -al /dev/oracleasm/
total 0
drwxr-xr-x 2 root root 100 Jun 24 09:06 .
drwxr-xr-x 19 root root 4020 Jun 24 09:06 ..
brw-rw---- 1 grid asmdba 8, 32 Jun 24 09:06 myvol
brw-rw---- 1 grid asmdba 8, 48 Jun 24 09:06 myvol2
brw-rw---- 1 grid asmdba 8, 48 Jun 24 09:04 yourvol

You see that yourvol has not been removed and that it is the same device as myvol2 (both show major 8, minor 48). That’s an artifact of the UDEV mechanism (when doing asm deletedisk I force the delete in the script). After a reboot it will be gone. You may also manually rm /dev/oracleasm/yourvol (but be careful).
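Before removing a leftover node by hand, it does no harm to verify that it really is a duplicate. A small sketch, assuming GNU stat; same_device is a made-up helper name.

```shell
# Succeed (exit status 0) if two device nodes share the same major:minor numbers.
same_device() {
    [ "$(stat -c '%t:%T' "$1")" = "$(stat -c '%t:%T' "$2")" ]
}

# Example:
#   same_device /dev/oracleasm/myvol2 /dev/oracleasm/yourvol && echo "yourvol is a stale duplicate"
```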

Ready for some more magic? Here goes…

Say your DBA wants to use partitions instead of full volumes because…. Well just because. Legacy thinking. We can do that if they insist:

[root@dbhost ~]# asm disks
/dev/sda [2:0:0:0] 20.00 GB partitioned
/dev/sdb [2:0:1:0] 1.00 GB available
/dev/sdc [2:0:2:0] 1.00 GB configured as /dev/oracleasm/myvol
/dev/sdd [2:0:3:0] 4.00 GB configured as /dev/oracleasm/yourvol
/dev/sde [2:0:4:0] 4.00 GB available
/dev/sdf [2:0:5:0] 2.00 GB available
/dev/sdg [2:0:6:0] 1.00 GB available
[root@dbhost ~]# parted /dev/sde mklabel msdos
Information: You may need to update /etc/fstab.
[root@dbhost ~]# parted /dev/sde mkpart primary 1m 50%
Information: You may need to update /etc/fstab.
[root@dbhost ~]# parted /dev/sde mkpart primary 50% 100%
Information: You may need to update /etc/fstab.
[root@dbhost ~]# parted /dev/sde unit MiB print

Model: VMware, VMware Virtual S (scsi)
Disk /dev/sde: 4096MiB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Number Start End Size Type File system Flags
1 1.00MiB 2048MiB 2047MiB primary
2 2048MiB 4096MiB 2048MiB primary

Note that I deliberately created 2 partitions to show that it’s possible to use both on one disk as separate ASM volumes.

[root@dbhost ~]# asm createdisk sdep1 /dev/sde1
[root@dbhost ~]# asm createdisk sdep2 /dev/sde2
[root@dbhost ~]# asm disks
/dev/sda [2:0:0:0] 20.00 GB partitioned
/dev/sdb [2:0:1:0] 1.00 GB available
/dev/sdc [2:0:2:0] 1.00 GB configured as /dev/oracleasm/myvol
/dev/sdd [2:0:3:0] 4.00 GB configured as /dev/oracleasm/yourvol
/dev/sde [2:0:4:0] 4.00 GB partitioned
/dev/sdf [2:0:5:0] 2.00 GB available
/dev/sdg [2:0:6:0] 1.00 GB available
[root@dbhost ~]# asm list
myvol   1.00 GB [-] sdc
myvol2  4.00 GB [-] sdd
sdep1   1.99 GB [-] sde1
sdep2   2.00 GB [-] sde2
yourvol 4.00 GB [-] sdd

You see /dev/sde cannot be detected as a single ASM volume so it shows as “partitioned” but when listing the ASM volumes you see them both.

This might be handy when migrating from ASMlib configurations as well.

Still not done? Nope. Watch this… Say I have a VM on my laptop or a small old server at home with a few SATA disks. I would like to have many more ASM volumes than I have virtual or physical disks in the system. Is there a way?

[root@dbhost ~]# vgcreate asmvg /dev/sdf /dev/sdg
No physical volume label read from /dev/sdf
Physical volume /dev/sdf not found
No physical volume label read from /dev/sdg
Physical volume /dev/sdg not found
Physical volume "/dev/sdf" successfully created
Physical volume "/dev/sdg" successfully created
Volume group "asmvg" successfully created
[root@dbhost ~]# lvcreate -Ay -nlvol1 -L1G asmvg
Logical volume "lvol1" created
[root@dbhost ~]# lvcreate -Ay -nlvol2 -L1G asmvg
Logical volume "lvol2" created
[root@dbhost ~]# asm createdisk lvol01 /dev/asmvg/lvol1
[root@dbhost ~]# asm createdisk lvol02 /dev/asmvg/lvol2
[root@dbhost ~]# asm disks
/dev/sda [2:0:0:0] 20.00 GB partitioned
/dev/sdb [2:0:1:0] 1.00 GB available
/dev/sdc [2:0:2:0] 1.00 GB configured as /dev/oracleasm/myvol
/dev/sdd [2:0:3:0] 4.00 GB configured as /dev/oracleasm/yourvol
/dev/sde [2:0:4:0] 4.00 GB partitioned
/dev/sdf [2:0:5:0] 2.00 GB LVM Volume
/dev/sdg [2:0:6:0] 1.00 GB LVM Volume
[root@dbhost ~]# asm list
lvol01  1.00 GB [asmvg-lvol1] dm-7
lvol02  1.00 GB [asmvg-lvol2] dm-8
myvol   1.00 GB [-] sdc
myvol2  4.00 GB [-] sdd
sdep1   1.99 GB [-] sde1
sdep2   2.00 GB [-] sde2
yourvol 4.00 GB [-] sdd

Voila… A mix of raw disks, disk partitions and LVM logical volumes all under /dev/oracleasm to be used by ASM as you like. Note that for Oracle RAC you cannot use LVM volumes as they are not cluster aware. Other than that, no restrictions. Can ASMlib do that? 😉

I also made it work with multipath volumes (after installing device-mapper-multipath):

[root@dbhost ~]# asm disks
/dev/sda [2:0:0:0] 20.00 GB partitioned
/dev/sdb [2:0:1:0] 1.00 GB multipath (mpatha)
/dev/sdc [2:0:2:0] 1.00 GB configured as /dev/oracleasm/myvol
/dev/sdd [2:0:3:0] 4.00 GB configured as /dev/oracleasm/myvol2
/dev/sde [2:0:4:0] 4.00 GB multipath (mpathe)
/dev/sdf [2:0:5:0] 2.00 GB multipath (mpathf)
/dev/sdg [2:0:6:0] 1.00 GB multipath (mpathg)
/dev/dm-2 [mpatha] 2.00 GB available
/dev/dm-3 [mpathe] 8.00 GB partitioned
/dev/dm-4 [mpathf] 4.00 GB LVM Volume
/dev/dm-5 [mpathg] 2.00 GB LVM Volume

Haven’t tested PowerPath yet, will do as soon as I get the chance. But I don’t expect too many problems (might require a few script changes).

What if you want another diskstring? I thought of that for another reason: I was testing with IORate (a destructive IO load generator from EMC that overwrites the devices it has been configured with). IORate is very useful but also dangerous for that reason. Normally it has to be run as root because otherwise it cannot access the volumes. But what if we used “asm” for that?

[root@dbhost ~]# asm createdisk iorate/iops1 /dev/sdb
[root@dbhost ~]# asm createdisk iorate/iops2 /dev/sdc
[root@dbhost ~]# ls -al /dev/iorate/
/dev/iorate/:
total 0
drwxr-xr-x 2 root root 80 Jun 24 09:36 .
drwxr-xr-x 21 root root 4260 Jun 24 09:36 ..
brw-rw---- 1 root iops 8, 16 Jun 24 09:36 iops1
brw-rw---- 1 root iops 8, 32 Jun 24 09:36 iops2

Here we created two test volumes for IO stress testing, under /dev/iorate, with group “iops”. If we create a user “iorate” with group “iops”, this user can now run the IO tests without root permissions (and thus without risking severe data loss on other volumes). You can configure extra disk strings, each with its own set of permissions.

Ever used iostat to monitor ASM disks?

[root@dbhost ~]# iostat -xk
Linux 2.6.32-431.el6.x86_64 (dbhost) 06/24/2014 _x86_64_ (2 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.44 0.00 1.57 1.11 0.00 96.89

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
scd0 0.00 0.00 0.17 0.00 0.69 0.00 8.00 0.00 0.89 0.89 0.02
sdb 0.05 0.00 1.86 0.00 8.49 0.00 9.13 0.00 0.14 0.14 0.03
sda 7.33 1.49 17.91 0.76 204.05 2.53 22.13 0.04 2.10 1.42 2.65
sdd 0.00 0.00 0.93 0.00 3.71 0.00 8.00 0.00 0.13 0.13 0.01
...
...
...
dm-13 0.00 0.00 0.96 0.01 3.83 0.06 7.97 0.00 1.02 0.46 0.04
dm-14 0.00 0.00 0.92 0.01 3.42 0.04 7.39 0.00 0.24 0.24 0.02

Now how do you know which one maps to what ASM volume? Maybe this helps:

[root@dbhost ~]# asmstat -xk
Linux 2.6.32-431.el6.x86_64 (dbhost) 06/24/2014 _x86_64_ (2 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.37 0.00 1.30 0.90 0.00 97.43

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sdb    0.04 0.00 1.51 0.00 6.89 0.00 9.13 0.00 0.14 0.14 0.02
sda    5.95 1.21 14.54 0.64 165.66 2.11 22.11 0.03 2.10 1.42 2.15
myvol2 0.00 0.00 0.75 0.00 3.01 0.00 8.00 0.00 0.13 0.13 0.01
myvol  0.00 0.00 0.75 0.00 3.01 0.00 8.00 0.00 0.12 0.12 0.01
sde    0.04 0.00 4.95 0.00 20.53 0.00 8.30 0.00 0.16 0.13 0.07
sdf    0.04 0.00 1.28 0.00 5.92 0.00 9.26 0.00 0.12 0.11 0.01
sdg    0.04 0.00 0.62 0.00 3.28 0.00 10.62 0.00 0.16 0.16 0.01

Note that it only works with non-multipath full disk devices (no LVM or partitioned disks yet). This is because asmstat is just a wrapper around iostat, and translating the names does not seem straightforward for the other device types. I might work on that in a future version.
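The idea of wrapping iostat and swapping device names can be illustrated in a few lines. This is not the real asmstat, just a sketch of the principle: rename_devices is a made-up function, and the mapping is passed in by hand as kernelname=alias pairs.

```shell
# Replace kernel device names in the first column of the input with aliases.
# Arguments: one or more "kernelname=alias" pairs; the text is read from stdin.
rename_devices() {
    awk -v maps="$*" '
        BEGIN {
            n = split(maps, pairs, " ")
            for (i = 1; i <= n; i++) {
                split(pairs[i], kv, "=")
                alias[kv[1]] = kv[2]
            }
        }
        # Only the leading device name is replaced; the statistics stay untouched.
        ($1 in alias) { sub(/^[^ \t]+/, alias[$1]) }
        { print }
    '
}

# Example:
#   iostat -xk | rename_devices sdc=myvol sdd=myvol2
```

Lines whose first word is not in the mapping (headers, CPU statistics, other disks) pass through unchanged.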

# man asm

asm(1) asmdisks asm(1)

NAME
asm - tool for managing Oracle ASM devices via udev(7)

SYNOPSIS
asm

DESCRIPTION
asm is a replacement for the oracleasm command provided via Oracle ASMlib. It attempts to provide
similar functionality using a simple script and Linux UDEV rather than tweaking the kernel with an
add-on kernel module, complex configuration and binary files.
…
…

Man pages included.

Note that “asmdisks” requires RHEL or a compatible distribution (OEL/CentOS), version 6.x or later.

UDEV works differently under 5.x so the RPM refuses to install on version 5. I haven’t tested SuSE. Your mileage may vary.

For the record: the RPM package is 14 kilobytes (so it could run on a Commodore 64 😉) and the asm script itself has only 362 lines (written as a bash script).

Want to try it out? I created a Linux YUM repository from which you can download the RPM (and another one that I will cover later). See “Downloads” page.

Update:

I made slight changes to the repository and to the downloads page on this blog, which seems to have broken the links that were there. Sorry for the inconvenience. Please check the “Downloads” tab on the blog for more info.

Happy UDEVving and let me know what you think!


The post Fun with Linux UDEV and ASM: Using UDEV to create ASM disk volumes appeared first on Dirty Cache.
