padmin login to VIOS fails with "INIT: failed write of utmp entry" / HMC viosvrcmd fails with HSCL2970

padmin login fails with "INIT: failed write of utmp entry"

 

Troubleshooting


Problem

The PowerVM VIOS padmin user cannot log in through the network or the console. The steps for handling this follow.
 

Symptom

Attempts to log in to the VIOS over the network (ssh or telnet) and through an HMC terminal (vterm) console session fail with the following error:

     INIT: failed write of utmp entry: " cons"
 

Cause

The error is symptomatic of an I/O problem with the VIOS rootvg file systems.
 

Environment

VIOS V3.1 managed by an HMC
 

Diagnosing The Problem

 

Common causes include (but are not limited to) any of the following:

  1. Permissions issue for /etc/utmp or /var/adm/wtmp
  2. Full rootvg file systems
  3. Hardware failure related to VIOS rootvg, such as disk or adapter failure
  4. rootvg file systems are in read-only mode

 

Using HMC Enhanced GUI
> Click Resources
> All Systems
> Click the system name hosting the VIOS partition in question
> On the left pane, click "Virtual I/O Servers" under Power VM
> Select VIOS partition name
> Click Actions > Console > Open Terminal Window
 
Using HMC Classic GUI
In the navigation pane:
> Open Systems Management
> Click Servers
> Click the Managed System name hosting the VIOS partition in question
In the work pane:
> Select the VIOS partition
> Click Tasks > Console Window > Open Terminal Window
 
 
Using HMC CLI
At HMC command line, type vtmenu
> Enter Number of the Managed System name
> Enter number of VIOS partition in question
If the console window is nonresponsive, you might see something similar to the following:
 
Opening Virtual Terminal On Partition t720vio1 . . .
Open in progress
Open Completed.
IBM Virtual I/O Server
login: padmin
/dev/vty0: You must "exec" login from the lowest login shell.
INIT: failed write of utmp entry: "          cons"
INIT: failed write of utmp entry: "          cons"
INIT: failed write of utmp entry: "          cons"
INIT: Command is respawning too rapidly. Check for possible errors.

 

If you receive these errors, check whether VIOS commands can be run remotely with the HMC command viosvrcmd. There are a few ways to do this.

 

Note: For HMC V8.8.5 or later, refer to HMC viosvrcmd fails with HSCL2970

 

The following HMC commands were tested on HMC V7R7.8 and V7R7.9. They do not work on HMC V8.

 

 Method #1 - Run padmin command

 

hscroot@hmchost:~> command=`printf "ioslevel"`; viosvrcmd -m VIRT-9117-MMB-SN10F6B1R -p p7virtvios1 -c "$command"

 where ioslevel is the padmin command
     VIRT-9117-MMB-SN10F6B1R is your managed system name
     and p7virtvios1 is the VIOS partition name in question

 

2.2.4.10            <-- This is the command output
 
Method #2 - Run command via oem_setup_env/AIX root shell, as shown in the examples below
hscroot@hmchost:~> command=`printf "oem_setup_env\nls -l /etc/utmp"`; viosvrcmd -m VIRT-9117-MMB-SN10F6B1R -p p7virtvios1 -c "$command"
 

  where ls -l /etc/utmp is the VIOS command to be run

-rw-r--r--    1 root     system        35640 Jun 20 15:55 /etc/utmp     <--  This is the output

hscroot@hmchost:~> command=`printf "oem_setup_env\nls -l /var/adm/wtmp"`; viosvrcmd -m VIRT-9117-MMB-SN10F6B1R -p p7virtvios1 -c "$command"
-rw-rw-r--    1 adm      adm         3906792 Jun 20 15:55 /var/adm/wtmp
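If the VIOS can be reached through the viosvrcmd command, proceed to check the probable causes below.
 
On the other hand, if the VIOS cannot be reached through viosvrcmd, it might be in a "hung" state. To troubleshoot a VIOS hang, refer to the MustGather document for VIOS crash or VIOS hang conditions.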
 

Resolving The Problem

Read the rest of this document in full before taking any corrective action.

 

Probable Cause #1 - Permissions issue for /etc/utmp or /var/adm/wtmp

 


The output shown for viosvrcmd method #2 reflects the expected permissions for /etc/utmp and /var/adm/wtmp.
So if you get similar output, your permissions are correct.
If your permissions differ, correct them, then try to log in again.
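
The comparison against the expected permissions can be sketched as a small shell helper. This is only an illustration, meant to be run on an administrative workstation against output captured from viosvrcmd; the function name check_perms is hypothetical, and the expected values are simply the ones shown in the method #2 output above.

```shell
# check_perms LS_LINE EXPECTED_MODE EXPECTED_OWNER EXPECTED_GROUP
# Compares one line of `ls -l` output with the expected mode, owner and group.
# Prints "OK <path>" (status 0) or "MISMATCH <path>: ..." (status 1).
check_perms() {
    printf '%s\n' "$1" | awk -v mode="$2" -v owner="$3" -v group="$4" '
        {
            path = $NF                      # the file name is the last field
            if ($1 == mode && $3 == owner && $4 == group) {
                print "OK " path
            } else {
                print "MISMATCH " path ": got " $1 " " $3 ":" $4 \
                      ", expected " mode " " owner ":" group
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }'
}

# Lines taken from the method #2 output shown earlier in this document.
check_perms "-rw-r--r--    1 root     system        35640 Jun 20 15:55 /etc/utmp" \
    "-rw-r--r--" root system        # prints: OK /etc/utmp
check_perms "-rw-rw-r--    1 adm      adm         3906792 Jun 20 15:55 /var/adm/wtmp" \
    "-rw-rw-r--" adm adm            # prints: OK /var/adm/wtmp
```

A MISMATCH line shows what to correct before retrying the login.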
 
Probable Cause #2 - Full rootvg file system
 
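Check for full file systems by replacing the command in viosvrcmd with df -g. If any file system is full or nearly full, resolve that before retrying the login. For more details, see Diagnosing Full File Systems in PowerVM VIOS.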
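
As a rough aid for scanning captured df -g output, the sketch below lists mount points at or above a usage threshold. It assumes the usual AIX df -g column order (Filesystem, GB blocks, Free, %Used, Iused, %Iused, Mounted on); the function name full_filesystems, the 90% threshold, and the sample values are illustrative and not taken from a real system.

```shell
# full_filesystems THRESHOLD
# Reads `df -g` output on stdin and prints "<mount point> <%Used>" for every
# file system whose %Used (4th column, an assumption) is >= THRESHOLD.
full_filesystems() {
    awk -v limit="$1" '
        $4 ~ /^[0-9]+%$/ {              # skips the header and "-" pseudo entries
            used = $4
            sub(/%$/, "", used)
            if (used + 0 >= limit + 0) print $NF " " $4
        }'
}

# Illustrative sample input (made-up values).
sample='Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4           1.00      0.00  100%    12345    80% /
/dev/hd2           5.00      2.10   58%    45678    10% /usr
/dev/hd9var        2.00      0.12   94%     6789    15% /var
/proc                 -         -    -         -     -  /proc'

printf '%s\n' "$sample" | full_filesystems 90
# prints:
# / 100%
# /var 94%
```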
 
Probable Cause #3 - Hardware failure related to VIOS rootvg
 
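Replace the command in viosvrcmd with errlog (padmin) or errpt (if in the oem_setup_env/AIX root shell) to determine whether there are any disk or adapter errors related to rootvg.
 
If there are no hardware errors related to rootvg, check for LVM or file system errors related to rootvg.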
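
To pick hardware entries out of a long error report, a filter along these lines may help. It is a sketch that assumes the summary layout of errpt (IDENTIFIER, TIMESTAMP, T, C, RESOURCE_NAME, DESCRIPTION), where the fourth column is the error class and H marks hardware; the function name hardware_errors and the sample entries are illustrative only. Whether a reported resource actually backs rootvg still has to be confirmed separately.

```shell
# hardware_errors
# Reads errpt/errlog summary output on stdin and prints the entries whose
# class column (4th field, an assumption about the layout) is "H".
hardware_errors() {
    awk '$4 == "H" { print }'
}

# Illustrative sample input (not captured from a real VIOS).
sample='IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
B6267342   0320141223 P H hdisk0         DISK OPERATION ERROR
A6DF45AA   0320140923 I O RMCdaemon      The daemon is started.'

printf '%s\n' "$sample" | hardware_errors
# prints:
# B6267342   0320141223 P H hdisk0         DISK OPERATION ERROR
```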
 
Probable Cause #4 - rootvg file systems are in read-only mode
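This can happen if the system loses its paths to the VIOS rootvg disk. To test for this, check for ioscli.log in /home/padmin and try to copy the file using viosvrcmd (cp ioscli.log test123.log) to see whether it completes successfully or fails with an error similar to the following: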
 

  HSCL2970 The IOServer command has failed because of the following reason:

     Unable to open file: /home/ios/logs/ioscli_global.trace for append
     Error from cliCheckFile:-1
     Unable to open file: /home/ios/logs/ioscli_global.trace for append
     Error from cliCheckFile:-1
     ...
     Unable to open file: /home/ios/logs/ioscli_global.trace for append
     Error from cliCheckFile:-1
     cp: test123.out: Read-only file system     <-------
     rc=1
     ...
 
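Note: If the VIOS partition boots from a SAN disk, contact the SAN administrator to check whether the disk has been set to read-only mode. This can happen when the storage subsystem is using a disk copy utility.
 
If no hardware errors are found in the error log but you encounter the "Read-only file system" message, run the mount command using viosvrcmd method #2. The mount command should show how many file systems are read-only. Possible output includes: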
 
    node    mounted    mounted over    vfs       date        options
   ------ ----------- --------------  ------ ------------ ---------------
          /dev/hd4    /                jfs2  Mar 15 09:47 rw,log=/dev/hd8
          /dev/hd2    /usr             jfs2  Mar 15 09:47 rw,log=/dev/hd8
          ...
 
          OR
    node    mounted    mounted over    vfs       date        options
   ------ ----------- --------------  ------ ------------ ---------------
          /dev/hd4    /                jfs2  Mar 03 01:25 ro,degraded
          /dev/hd2    /usr             jfs2  Mar 03 09:47 ro,degraded
 
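If all file systems are in read-only mode, this typically happens when the file system cannot access the underlying disk and/or writes to the disk fail (essentially, I/O errors). In that case, JFS2 deliberately puts the file system in read-only mode so that no further writes are made to the disk, which avoids file system corruption caused by the I/O errors.

In this situation, the VIOS should be booted in maintenance mode to verify the integrity of the file systems by running a thorough fsck, bring them to a clean state, and make sure the disk is accessible. If the system is in this state, the boot disk is most likely at fault, but without access to the error log and snap data it is hard to triage and confirm. Because the only option at this point is to reboot the VIOS into maintenance mode and run fsck on the file systems, it might be worth investigating further while the VIOS is shut down.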
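
Counting the read-only file systems in captured mount output can be sketched as below. The helper assumes the column layout shown in the sample output above, with an empty node column for local file systems and the options in the last field; the function name readonly_mounts is hypothetical.

```shell
# readonly_mounts
# Reads AIX `mount` output on stdin and prints the mount point of every local
# file system whose options (last field) include "ro".
readonly_mounts() {
    awk '
        $1 ~ /^\// {                        # local entries start with the device
            n = split($NF, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") { print $2; break }
        }'
}

# Sample taken from the second mount listing shown above.
sample='    node    mounted    mounted over    vfs       date        options
   ------ ----------- --------------  ------ ------------ ---------------
          /dev/hd4    /                jfs2  Mar 03 01:25 ro,degraded
          /dev/hd2    /usr             jfs2  Mar 03 09:47 ro,degraded'

printf '%s\n' "$sample" | readonly_mounts
# prints:
# /
# /usr
```

Piping the result through wc -l gives the number of read-only file systems.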
 
IMPORTANT
Depending on the VIOS state, the VIOS partition might need to be booted to a maintenance mode. 
 
 

HMC viosvrcmd fails with HSCL2970

 

Troubleshooting


Problem

Attempts to use viosvrcmd to run a command in the VIOS oem_setup_env shell fail with error HSCL2970 when issued from HMC 8.8.5 and later.

 

Symptom

 

Example:

 

hscroot@<HMC_hostname>:~> command='printf "oem_setup_env\nls -l /etc/utmp"'
hscroot@<HMC_hostname>:~> viosvrcmd -m <Managed_Server_Name> -p <VIOS_Name> -c "$command"
HSCL2970 The IOServer command has failed because of the following reason: ioscli printf "oem_setup_env\nls -l /etc/utmp" contains illegal data:  printf .

 

Attempting to run the command in the padmin shell also fails (as it would if logged in to the VIOS as padmin directly):

 

hscroot@<HMC_hostname>:~> command='ls -l /etc/utmp'
hscroot@<HMC_hostname>:~> viosvrcmd -m <Managed_Server_Name> -p <VIOS_Name> -c "$command"
HSCL2970 The IOServer command has failed because of the following reason:
Not a valid command: ls -l /etc/utmp
rc=1
 

Cause

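Although commands routed to the oem_setup_env shell ran successfully in earlier versions, this was regarded as a security exposure and restrictions were imposed.

The --admin flag is available from HMC 8.5.0 onward. It allows (some) commands to be run directly on the VIOS as the root user.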

 

Environment

 

HMC V8R8.5.0.0 and later with PowerVM® partitions at any level.
 

Diagnosing The Problem

Check HMC version with "lshmc -V"

 

Resolving The Problem

 

Use a ViosAdminOp user profile on your HMC and amend the viosvrcmd command syntax.
 
Example:
:~> command=`printf "oem_setup_env\nls -l /etc/utmp"`
becomes:
:~> command="ls -l /etc/utmp"

... and the viosvrcmd command itself must have "--admin" appended.

Example:
hscroot@<HMC_hostname>:~> viosvrcmd -m <Managed_Server_Name> -p <VIOS_Name> -c "$command"
becomes:
viosadminuser@<HMC_hostname>:~> viosvrcmd -m <Managed_Server_Name> -p <VIOS_Name> -c "$command" --admin
 
 
To create the VIOS Admin task role (step 1):
 
hscroot@<HMC_hostname>:~> mkaccfg -t taskrole -i "name=VIOS_Admin,parent=hmcsuperadmin,"resources=lpar:ViosAdminOp""
(NB: This command should all be on a single line).

To create the new HMC "viosadminuser" user with a password="vios-admin" set to expire in 3 days:

hscroot@<HMC_hostname>:~> mkhmcusr -u viosadminuser -a VIOS_Admin --passwd vios-admin -M 3
 
 
Log in as the new user (step 2):

$ ssh -e T viosadminuser@<HMC_hostname>
viosadminuser@<HMC_hostname>'s password:
Warning: your password will expire in 3 days

 
Then, create and run commands with "root" authority:
viosadminuser@<HMC_hostname>:~> command="ls -l /etc/utmp"
viosadminuser@<HMC_hostname>:~> viosvrcmd -m <Managed_Server_Name> -p <VIOS_Name> -c "$command" --admin
 
Example:
viosadminuser@<HMC_hostname>:~> command="ls -l /etc/utmp"
viosadminuser@<HMC_hostname>:~> viosvrcmd -m Server-8406-70Y-SN06BFF6A -p PS700_VIOS_Epic -c "$command" --admin
-rw-r--r--    1 root     system        34992 May 11 16:25 /etc/utmp
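
The two invocation styles described in this document can be put side by side in a small wrapper. This is a sketch only: the names run_vios_root_cmd, show_args and the VIOSVRCMD override are hypothetical, and whether the HMC restricted shell allows function definitions is not covered by the source, so treat it as an illustration of the syntax difference and not as a supported tool.

```shell
# run_vios_root_cmd MODE MANAGED_SYSTEM VIOS_NAME COMMAND
#   legacy : HMC V7R7.x style, enter oem_setup_env first, then run COMMAND
#   admin  : HMC V8R8.5.0 and later, pass COMMAND directly and add --admin
# Set VIOSVRCMD to another command or function name for a dry run.
run_vios_root_cmd() {
    case "$1" in
        legacy)
            "${VIOSVRCMD:-viosvrcmd}" -m "$2" -p "$3" \
                -c "$(printf 'oem_setup_env\n%s' "$4")" ;;
        admin)
            "${VIOSVRCMD:-viosvrcmd}" -m "$2" -p "$3" -c "$4" --admin ;;
        *)
            echo "usage: run_vios_root_cmd legacy|admin SYSTEM VIOS COMMAND" >&2
            return 2 ;;
    esac
}

# Dry-run helper: prints each argument in brackets instead of contacting a VIOS.
show_args() { printf '[%s]\n' "$@"; }

VIOSVRCMD=show_args
run_vios_root_cmd admin Server-8406-70Y-SN06BFF6A PS700_VIOS_Epic "ls -l /etc/utmp"
# prints:
# [-m]
# [Server-8406-70Y-SN06BFF6A]
# [-p]
# [PS700_VIOS_Epic]
# [-c]
# [ls -l /etc/utmp]
# [--admin]
```

With VIOSVRCMD unset, the function calls viosvrcmd itself; the admin mode still requires a user whose task role includes ViosAdminOp, as created above.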
 

Related Information

https://www.ibm.com/docs/en/power9?topic=interfaces-hmc-commands
posted @ 2023-02-22 16:08 by 小明123_123