Cheney.Yu


In this Document
  Goal
  Solution


 

 

Applies to:

Oracle Server - Enterprise Edition - Version: 10.1.0.2 to 11.1.0.7 - Release: 10.1 to 11.1
Information in this document applies to any platform.

Goal

This document provides guidelines for diagnosing ASMLIB problems: installation, restoring missing files, and collecting information during error diagnostics.

Solution

1. ASMLIB INSTALLATION
 
ASMLIB is distributed as Linux RPMs, which are specific to:
 
- Linux distribution (RedHat 2.1, 3.0, 4.0, 5.0 or Suse Enterprise Server 8, 9, 10)
- Linux kernel (smp, highmem, or release)
- Type of CPU: AMD64, Intel 64, Itanium 64-bit, Intel 32-bit
 
The complete reference for the RPMs  is found at  http://www.oracle.com/technetwork/server-storage/linux/whatsnew/index.html 
 
To verify which RPMs are installed, execute rpm -qa | grep oracleasm:
 
$ rpm -qa |grep oracleasm
oracleasm-support-2.0.4-1.el5
oracleasm-2.6.18-53.el5-2.0.4-1.el5
oracleasmlib-2.0.3-1.el5
 
The following query format appends the platform (architecture) to each package name:
 
$ rpm -qa --qf "%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}.rpm\n" | grep asm
oracleasm-support-2.0.4-1.el5.x86_64.rpm
oracleasm-2.6.18-53.el5-2.0.4-1.el5.x86_64.rpm
oracleasmlib-2.0.3-1.el5.x86_64.rpm
 
As mentioned before, this set of files is different for each platform. Once you have selected the correct set, the oracleasm-support and oracleasmlib packages are identical for all the possible kernels (smp, highmem, etc.).
Only the third RPM is specific to the installed kernel release, which is obtained by executing uname -a or uname -r.
 
$ uname -a
Linux jfrac1.us.oracle.com 2.6.18-53.el5 #1 SMP Sat Nov 10 19:37:22 EST 2007 x86_64
 
You must download the RPM that matches the running kernel; for this kernel it is oracleasm-2.6.18-53.el5-2.0.4-1.el5.x86_64.rpm. Installing an incorrect RPM produces various errors, either during the configuration of ASMLIB or later during the discovery of the disks. Details of the configuration process are found in note Tips On Installing and Using ASMLib on Linux (Doc ID 394953.1).
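The match between the running kernel and the installed driver package can be checked with a short script. This is a minimal sketch, assuming the kernel-specific package is always named oracleasm-<kernel release>-<driver version>, as in the example above; the function name has_matching_driver is hypothetical and not part of ASMLIB.

```shell
# has_matching_driver KERNEL_RELEASE
# Reads package names (one per line, as printed by "rpm -qa") on stdin and
# succeeds if one of them starts with "oracleasm-<KERNEL_RELEASE>-".
# Assumption: the kernel driver package follows that naming pattern.
has_matching_driver() {
    prefix="oracleasm-$1-"
    while IFS= read -r pkg; do
        case "$pkg" in
            "$prefix"*) return 0 ;;
        esac
    done
    return 1
}

# Typical use on the server:
#   rpm -qa | has_matching_driver "$(uname -r)" || echo "no oracleasm driver for this kernel"
```

The function only compares names; it does not verify that the driver actually loads.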
 
* Is /dev/oracleasm created?
 
When ASMLIB is configured, a special filesystem is created and mounted: /dev/oracleasm.   
 
$ df -ha
Filesystem            Size  Used Avail Use% Mounted on 
 
/dev/hdc2              13G   11G  1.9G  85% /
none                     0     0     0   -  /proc
none                     0     0     0   -  /dev/pts
usbdevfs                 0     0     0   -  /proc/bus/usb
/dev/hdc1             101M   14M   81M  15% /boot
none                  250M     0  250M   0% /dev/shm
/dev/sda1             8.4G  4.8G  3.2G  60% /oradata2
/dev/sde1             8.3G  6.6G  1.4G  84% /oradata3
oracleasmfs              0     0     0   -  /dev/oracleasm
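The same check can be scripted against the kernel mount table instead of reading the df output by eye. This is a sketch only: it assumes the mount table format of /proc/mounts (device, mount point, filesystem type, options), and asm_fs_mounted is a hypothetical helper name.

```shell
# asm_fs_mounted
# Reads mount table lines (the /proc/mounts format) on stdin and succeeds
# if any entry is mounted on /dev/oracleasm.
asm_fs_mounted() {
    while read -r dev mnt rest; do
        if [ "$mnt" = "/dev/oracleasm" ]; then
            return 0
        fi
    done
    return 1
}

# Typical use:
#   asm_fs_mounted < /proc/mounts && echo "/dev/oracleasm is mounted"
```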
 
When command oracleasm createdisk is executed, a block device is created under /dev/oracleasm/disks.   
This is the device discovered by ASMLIB using the string ORCL:*.   
 
brw-rw----    1 usupport dba        8,  33 Feb 23 10:54 VOL1
 
* Checking if ASMLIB was installed properly: 
 
[root@arlnx2 asm_tar]# /etc/init.d/oracleasm status 
 
Checking if ASM is loaded:                            [  OK  ]
Checking if /dev/oracleasm is mounted:                [  OK  ] 
 
If the command fails, use strace and generate a log file: 
 
 
strace -f -o asm_status.out /etc/init.d/oracleasm status 
 
Additional information to verify the installation can be found in note 269194.1
 
*  Listing the ASMLIB disks: 
 
/etc/init.d/oracleasm listdisks 
TESTX 
VOL1
 
You will find an entry for each disk under /dev/oracleasm/disks. This is the block device associated with the physical device. If the file exists, the command returns information; if it does not, please execute: 
 
strace -f -o asm_listd.out /etc/init.d/oracleasm listdisks
 
2. Error Device "/dev/sdg" is not a partition when creating the ASMLIB device
 
This problem is reported during the creation of the ASMLIB device: 
 
[root@arlnx2 asm_tar]# /etc/init.d/oracleasm createdisk mydisk /dev/sdg 
 
Marking disk "/dev/sdg" as an ASM disk: asmtool: Device "/dev/sdg" is not a partition   [FAILED]
 
The message indicates that the device is not a partition; ASMLIB requires that the disk have at least one partition. 
 
To check this, use the command fdisk -l <device name>. 
 
Example: 
 
[root@arlnx2 asm_tar]# /sbin/fdisk -l /dev/sdg 
 
Disk /dev/sdg: 9105 MB, 9105018880 bytes
64 heads, 32 sectors/track, 8683 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes 
 
Device Boot    Start       End    Blocks   Id  System
/dev/sdg1             1      2862   2930672   83  Linux
 
The command shows that disk /dev/sdg has one partition, /dev/sdg1, and this is the device that should be referenced in the oracleasm createdisk command. If you still have problems, then use strace: 
 
# strace -f -o asm_create.out /etc/init.d/oracleasm createdisk <disk name> <physical disk> 
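To pick the partition name to pass to createdisk, the fdisk listing can be filtered down to the partition devices. This is a rough sketch that assumes the fdisk -l layout shown in the example above (partition lines start with /dev/); list_partitions is a hypothetical helper name.

```shell
# list_partitions
# Reads "fdisk -l <device>" output on stdin and prints the partition device
# names, i.e. the first field of every line that starts with /dev/.
list_partitions() {
    awk '$1 ~ /^\/dev\// { print $1 }'
}

# Typical use (as root):
#   /sbin/fdisk -l /dev/sdg | list_partitions
```

An empty result suggests the disk has no partition table entries yet, which is the situation that triggers the "is not a partition" error.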
 
3. How to identify the physical disk bound to the ASMLIB disk.
 
Use /etc/init.d/oracleasm querydisk -d <NAME>, where NAME is any name under /dev/oracleasm/disks. 
 
Example: 
[root@arlnx2 asm_tar]# /etc/init.d/oracleasm querydisk -d VOL1 
Disk "VOL1" is a valid ASM disk on device [8, 33]
 
The command reports the device by its major,minor numbers, which uniquely identify each disk. The file /proc/partitions can be used to find the name of the device associated with those numbers (in this example, 8,33 corresponds to sdc1): 
 
major minor  #blocks  name     rio rmerge rsect ruse wio wmerge wsect wuse running use aveq 
 
   8     0    8891620 sda 39715 78016 941080 417000 156198 242472 3189752 214180 0 420630 631180
   8     1    8891376 sda1 39691 77970 940922 416780 156198 242472 3189752 214180 0 420410 630960
   8    16    8891620 sdb 87 250 803 740 0 0 0 0 0 740 740
   8    17    8891376 sdb1 57 193 632 480 0 0 0 0 0 480 480
   8    32   17783250 sdc 745 2993 8321 8300 0 0 0 0 0 5250 8300
   8    33     977904 sdc1 87 139 644 1040 0 0 0 0 0 1040 1040  
   8    34     977920 sdc2 35 193 456 230 0 0 0 0 0 230 230
   8    35          1 sdc3 4 0 8 40 0 0 0 0 0 40 40
   8    37     977904 sdc5 57 193 632 1240 0 0 0 0 0 1240 1240
   8    38     977904 sdc6 57 193 632 1170 0 0 0 0 0 1170 1170
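The lookup from major,minor numbers to a device name can be automated. The following is a sketch under the assumption that the querydisk output and the /proc/partitions columns look like the examples above; both function names are hypothetical.

```shell
# querydisk_majmin
# Reads "oracleasm querydisk -d" output on stdin and prints "MAJOR MINOR".
querydisk_majmin() {
    sed -n 's/.*on device \[\([0-9][0-9]*\), *\([0-9][0-9]*\)].*/\1 \2/p'
}

# device_for_majmin MAJOR MINOR
# Reads /proc/partitions-style lines on stdin and prints the device name
# (4th column) of the line whose major and minor numbers match.
device_for_majmin() {
    awk -v maj="$1" -v min="$2" 'NF >= 4 && $1 == maj && $2 == min { print $4 }'
}

# Typical use (as root):
#   set -- $(/etc/init.d/oracleasm querydisk -d VOL1 | querydisk_majmin)
#   device_for_majmin "$1" "$2" < /proc/partitions
```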
 
Also, connected as root, you can run the same command but referencing the physical device:
 
[root@arlnx2 dbs]# /etc/init.d/oracleasm querydisk /dev/sdc1
Disk "/dev/sdc1" is marked an ASM disk with the label "VOL1"
 
Any error on this command will require using strace: 
 
strace -f -o asm_query.out /etc/init.d/oracleasm querydisk <NAME>
 
4. ASM disks are not discovered when using asm_diskstring='ORCL:*' 
 
In 10gR2, if the disks are not discovered using the string ORCL:*, the alternative is using the path /dev/oracleasm/disks.  
This can be set in the asm_diskstring parameter, or the path can be used in the DDL statement when creating a diskgroup or adding a disk. This is possible because in this release Oracle can open the block device directly. 10.1 requires binding the block device to a character (raw) device such as /dev/raw/rawX. 
 
Notice the output of ls -l /dev/oracleasm/disks 
 
[root@arlnx2 asm_tar]# ls -l /dev/oracleasm/disks 
 
brw-rw----    1 usupport dba        7,   1 Feb 20 13:30 TESTX
brw-rw----    1 usupport dba        8,  33 Feb 17 09:41 VOL1
 
The b at the beginning indicates that this is a block device. When /dev/oracleasm/disks/VOL1 is referenced, ASMLIB is not used, because the block device is accessed directly. Using the underlying partition (/dev/sdc1 for VOL1 in the examples above) or /dev/oracleasm/disks/VOL1 is exactly the same. 
 
This situation can affect environments where ASM is using 10.2 but there are databases using 10.1 and 10.2. Databases using 10.2 can use the diskgroup, but 10.1 databases will fail when a file is created.  
Example: creating a tablespace fails with error ORA-600: 
 
ORA-00600: internal error code, arguments: [kfioSubmitIO07], [], [], [], [], [], [], []
 
CallStack:
kgeasnmierr kfioSubmitIO kfioRequest ksfd_osmcrt ksfd_create ksfdcre  kcfcedtf tbsafl ctsdrv1 ctsdrv.
 
kfioSubmitIO is trying to obtain the extent map of the disks associated with the diskgroup but cannot identify the disk, because 10.1 cannot directly reference the block devices under /dev/oracleasm/disks/.  
 
* How to diagnose why ORCL:*  is not working:  Use /usr/sbin/oracleasm-discover 'ORCL:*'. 
 
When ORCL:* is resolved, the following output is presented: 
 
[root@arlnx2 asm_tar]# /usr/sbin/oracleasm-discover 'ORCL:*' 
 
Using ASMLib from /opt/oracle/extapi/32/asm/orcl/1/libasm.so 
 
[ASM Library - Generic Linux, version 2.0.0 (KABI_V1)]
Discovered disk: ORCL:TESTX [819200 blocks (419430400 bytes), maxio 128]
Discovered disk: ORCL:VOL1 [1955808 blocks (1001373696 bytes), maxio 128]
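When the discovery output is long, the disk names can be extracted for comparison with the listdisks output. This is a sketch that assumes the "Discovered disk:" line format shown above; discovered_disks is a hypothetical helper name.

```shell
# discovered_disks
# Reads oracleasm-discover output on stdin and prints one ORCL:<name>
# entry per discovered disk.
discovered_disks() {
    sed -n 's/^Discovered disk: \(ORCL:[^ ]*\) .*/\1/p'
}

# Typical use (as root):
#   /usr/sbin/oracleasm-discover 'ORCL:*' | discovered_disks
```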
 
If something is wrong,  you could  get error Unable to open ASMLib.  
Use strace to debug the command: 
 
strace -f -o asm_discover.out /usr/sbin/oracleasm-discover 'ORCL:*' 
 
Check for the reference to library libasm.so.  
 
open("/opt/oracle/extapi/32/asm/orcl/1/libasm.so", O_RDONLY) = 3
 
If you get open("/opt/oracle/extapi/32/asm/orcl/1/libasm.so", O_RDONLY) = -1 EACCES (Permission denied), then that is the cause of the problem. Check that the library exists and that its permissions are correct (755). Also validate that the directories in the path have the correct permissions (755). 
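The permission check can be run over the whole directory tree in one pass. This is a minimal sketch, assuming that 755 is the expected mode for every directory and file under the path, as stated in this note; list_bad_modes is a hypothetical helper name.

```shell
# list_bad_modes DIR
# Prints every file or directory under DIR (DIR included) whose permission
# bits are not exactly 755. Symbolic links are skipped because their own
# mode bits are not meaningful.
list_bad_modes() {
    find "$1" ! -type l ! -perm 755 -print
}

# Typical use (as root):
#   list_bad_modes /opt/oracle/extapi
```

No output means every entry has mode 755; it does not prove that the parent directories above the given path are accessible.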
 
5. Starting ASM instance report errors ORA-604, ORA-15183, ORA-15180 after deleting file libasm.so 
 
When all the files under the /opt/oracle/extapi path are deleted and ASMLIB is used, the following errors are reported when mounting diskgroups:   
 
ORA-00604 : error occurred at recursive SQL level 2
ORA-15183 : ASMLIB initialization error [/opt/oracle/extapi/64/asm/orcl/1/libasm.so]
ORA-15180 : Could not open dynamic library /opt/oracle/extapi/64/asm/orcl/1/libasm.so, error
 
This set of directories and files is created during the installation of the ASMLIB rpms, particularly oracleasmlib-*. The diskgroups cannot be mounted because no ASMLIB devices are available.
 
During the installation, the following directories and files are created:
 
/opt/oracle/extapi
/opt/oracle/extapi/32
/opt/oracle/extapi/32/asm
/opt/oracle/extapi/32/asm/orcl
/opt/oracle/extapi/32/asm/orcl/1
/opt/oracle/extapi/32/asm/orcl/1/libasm.so
 
On 64-bit platforms the 32 directory is replaced by a 64 directory. You can verify the existence of these elements by executing the command find /opt/oracle/extapi
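The list of expected paths can also be checked one by one, so that only the missing elements are reported. This is a sketch based on the directory layout listed above; missing_extapi_paths is a hypothetical helper name, and the second argument is 32 or 64 depending on the platform.

```shell
# missing_extapi_paths ROOT BITS
# Prints each expected ASMLIB directory or file under ROOT that does not
# exist. BITS is 32 or 64.
missing_extapi_paths() {
    dir="$1/$2/asm/orcl/1"
    for p in "$1" "$1/$2" "$1/$2/asm" "$1/$2/asm/orcl" "$dir" "$dir/libasm.so"; do
        if [ ! -e "$p" ]; then
            echo "$p"
        fi
    done
}

# Typical use:
#   missing_extapi_paths /opt/oracle/extapi 64
```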
 
Also executing /usr/sbin/oracleasm-discover (strace -f -o asm_discover.out /usr/sbin/oracleasm-discover 'ORCL:*'), will report errors when trying to read the missing directories/files.
 
To recreate these objects, the oracleasmlib-* rpm needs to be reinstalled, after first removing the installed rpm.
 
For example, in this environment those are the rpms installed:
 
Note:  Use --force flags to reinstall the rpms if the normal flags Uvh don't work.
 
Due to Unpublished BUG:9824267 you can get this:
 
ORA-15183: ASMLIB initialization error [driver/agent not installed]
WARNING: FAILED to load library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
 
This happens if you have the ASMLIB rpms installed but not configured, even if your ASM discovery string contains no ASMLIB components.
 
6. ASM instance is not discovering disks when asm_diskstring is 'ORCL:*'
 
You have reviewed the previous five points and the disks are still not discovered using the string 'ORCL:*'; they are discovered only when using the native path '/dev/sdX', '/dev/emcpowerxy' or '/dev/oracleasm/disks/XXX'. In 10gR2 this is possible because, starting with this version, Oracle can execute I/O directly against a block device, and the devices under /dev/oracleasm/disks are block devices linked to the physical device. 
 
SOLUTION
 
We can simulate the disk discovery at the operating system level using the kfod tool ($ORACLE_HOME/bin). The execution syntax is:
 
[usupport@jfrac1 bin]$ kfod asm_diskstring='ORCL:*' disks=all
--------------------------------------------------------------------------------
Disk Size Path
================================================================================
1: 954 Mb ORCL:ASMLIB1
2: 955 Mb ORCL:ASMLIB2
--------------------------------------------------------------------------------
ORACLE_SID ORACLE_HOME
================================================================================
+ASM2 /oracle/10gR2/asm
+ASM1 /oracle/10gR2/asm
 
That is the normal output when executed by the oracle user. Sometimes the command discovers the disks only when executed as root. This indicates an access problem on one of the files or directories under /opt/oracle. Make sure the permissions are 755 for all the directories and files under /opt/oracle. If the disks are still not discovered after verifying the permissions, the cause is probably a faulty installation of the oracleasm rpms. The symptoms for this problem are: 
 
- The filesystem /dev/oracleasm is mounted.
- ASMLIB disks can be created using the /etc/init.d/oracleasm createdisk command, and the block devices exist under /dev/oracleasm/disks.
- Commands like /etc/init.d/oracleasm listdisks or /etc/init.d/oracleasm querydisk return the expected results.
- The first 5 points referenced in this note have been validated.
- The rpms are installed, at least according to what rpm -qa | grep oracleasm reports:
 
[usupport@arlnx2 admin]$ rpm -qa |grep oracleasm
oracleasm-support-2.0.0-1
oracleasmlib-2.0.0-1
oracleasm-2.4.21-EL-1.0.0-1
oracleasm-2.4.21-27.0.4.EL-1.0.4-2
 
At this point, the recommendation is reinstalling the rpms for ASMLIB. Use the --force flag.
 
To remove the rpm as root execute:  rpm -e oracleasmlib-2.0.0-1
To reinstall the rpm as root execute:  rpm -Uvh oracleasmlib-2.0.0-1 
 
7. ASMLIB disks are bound to the individual path device and not to the pseudo device created by the multipath software.
 
Multipathing allows establishing multiple I/O access paths to an individual disk, providing features like load balancing and automatic failover. A pseudo device is created under /dev and it can be referenced by ASM. Examples are /dev/emcpowerxx (EMC PowerPath), /dev/vpath (IBM SDD), /dev/md-x (Linux MD), /dev/dm (Linux DM).
 
ASMLIB disks can be created referencing these pseudo devices when /etc/init.d/oracleasm createdisk is used.
 
During the scandisks operation, however, the ASMLIB disk can end up bound to the single (individual) path and not to the pseudo device. You can verify this by running ls -l /dev/oracleasm/disks and checking the major,minor numbers of the devices.
 
[usupport@jfrac1 bin]$ ls -l /dev/oracleasm/disks
brw-rw---- 1 usupport dba 8, 17 Jul 11 17:08 ASMLIB1
brw-rw---- 1 usupport dba 8, 18 Jul 11 17:08 ASMLIB2
 
In this particular case, ASMLIB1 is bound to the disk identified by major,minor 8,17. The file /proc/partitions contains the mapping between the major,minor numbers and the name of the device.
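The major,minor pair of every ASMLIB disk can be pulled out of the listing for comparison with /proc/partitions. This is a sketch that assumes the ls -l column layout shown above (major followed by a comma in the fifth column, minor in the sixth); asm_disk_majmin is a hypothetical helper name.

```shell
# asm_disk_majmin
# Reads "ls -l /dev/oracleasm/disks" output on stdin and prints
# "NAME MAJOR MINOR" for every block device line (mode starts with "b").
asm_disk_majmin() {
    awk '/^b/ { major = $5; sub(/,$/, "", major); print $NF, major, $6 }'
}

# Typical use:
#   ls -l /dev/oracleasm/disks | asm_disk_majmin
```

A disk whose numbers match a single-path device such as sdb1 rather than the multipath pseudo device is a candidate for the scan order change described in the solution.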
 
Although some multipath vendors will trap the I/O even if the individual path is referenced, it is preferred to bind the ASMLIB disks to the pseudo device created by the multipath layer.
 
SOLUTION
 
Modify the parameter ORACLEASM_SCANORDER in the file /etc/sysconfig/oracleasm and set it to the string associated with the pseudo devices.
 
Examples are: 
 
ORACLEASM_SCANORDER=emcpower
ORACLEASM_SCANORDER=dm
ORACLEASM_SCANORDER=vpath
ORACLEASM_SCANORDER=md
 
If you are running a cluster, make sure to modify the file in all the nodes and restart ASMLIB.
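On a cluster it is useful to confirm that every node carries the same setting. The following sketch prints the current value from a configuration file; it assumes the file uses plain NAME=value lines, optionally double-quoted, and scanorder_value is a hypothetical helper name.

```shell
# scanorder_value FILE
# Prints the value assigned to ORACLEASM_SCANORDER in FILE (the last
# assignment wins), with surrounding double quotes removed.
scanorder_value() {
    sed -n 's/^ORACLEASM_SCANORDER=//p' "$1" | tail -n 1 | tr -d '"'
}

# Typical use on each node:
#   scanorder_value /etc/sysconfig/oracleasm
```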
 
8. ASMLIB Logging
 
ASMLIB provides additional logging for the following areas:
 
ENTRY /* func call entry */
EXIT /* func call exit */
DISK /* Disk information */
REQUEST /* I/O requests */
BIO /* bios backing I/O */
IOC /* asm_iocs */
ABI /* ABI entry points */
 
The settings are recorded in the file /proc/fs/oracleasm/log_mask, and these are the default values:
 
ENTRY deny
EXIT deny
DISK off
REQUEST off
BIO off
IOC off
ABI off
ERROR allow
NOTICE allow
 
There are three possible values:
 
deny
off
allow
 
To change the values, run this simple command:
 
echo "DISK allow" > /proc/fs/oracleasm/log_mask
 
This changes the value for the DISK entry without affecting the other values. It is recommended to run the command for each change individually, or you can use the following shell script:
 
log_mask="/proc/fs/oracleasm/log_mask"
echo " ***************** Current values ********************"
echo
cat $log_mask
# Areas to change, one per line
echo REQUEST > /tmp/logmask.out
echo DISK >> /tmp/logmask.out
cat /tmp/logmask.out | (
  while read bit; do
    # $1 is the new value passed to the script: "allow" or "off"
    echo "$bit $1" > $log_mask
  done
)
echo
echo "****************** New Values ***********************"
cat $log_mask
 
Save the script as x.sh and execute sh x.sh <new value>.
 
Ex: sh x.sh allow
 
 
Setting the value to off will disable the extra logging. 
 
These changes do not require restarting ASMLIB or the ASM instance.
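To confirm which areas are currently logging, the mask can be filtered for entries set to allow. This is a sketch that assumes the "NAME value" line format shown in the default values above; enabled_log_areas is a hypothetical helper name.

```shell
# enabled_log_areas
# Reads log_mask content ("NAME value" per line) on stdin and prints the
# names of the areas whose value is "allow".
enabled_log_areas() {
    awk '$2 == "allow" { print $1 }'
}

# Typical use:
#   enabled_log_areas < /proc/fs/oracleasm/log_mask
```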
Posted on 2012-11-05 17:29 by Cheney.Yu