kdump fails to start on openEuler

Contents
  1. Symptom
  2. Analysis
  3. Conclusion

1. Symptom

kdump fails to start on openEuler and reports "No memory reserved for crash kernel".

[root@kvm-hkcloud01 ~]# systemctl status kdump
kdump.service - Crash recovery kernel arming
Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2024-06-24 14:50:48 CST; 26min ago
Process: 1004 ExecStart=/usr/bin/kdumpctl start (code=exited, status=1/FAILURE)
Main PID: 1004 (code=exited, status=1/FAILURE)

Jun 24 14:50:47 kvm-hkcloud01 systemd[1]: Starting Crash recovery kernel arming...
Jun 24 14:50:48 kvm-hkcloud01 kdumpctl[1015]: No memory reserved for crash kernel
Jun 24 14:50:48 kvm-hkcloud01 kdumpctl[1015]: Starting kdump: [FAILED]
Jun 24 14:50:48 kvm-hkcloud01 systemd[1]: kdump.service: Main process exited, code=exited, status=1/FAILURE
Jun 24 14:50:48 kvm-hkcloud01 systemd[1]: kdump.service: Failed with result 'exit-code'.
Jun 24 14:50:48 kvm-hkcloud01 systemd[1]: Failed to start Crash recovery kernel arming.

2. Analysis

The message suggests that memory could not be reserved, yet the output of free shows enough memory to satisfy kdump's request:

  • More than 700M of memory is available
[root@kvm-hkcloud01 ~]# free -h
total used free shared buff/cache available
Mem: 967Mi 89Mi 442Mi 23Mi 436Mi 715Mi
Swap: 0B 0B 0B
[root@kvm-hkcloud01 ~]#
  • kdump requests 512M
[root@kvm-hkcloud01 ~]# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.10.0-136.12.0.86.oe2203sp1.x86_64 root=UUID=d372d015-87b7-44ed-9860-33c9d47eedb6 ro cgroup_disable=files apparmor=0 crashkernel=512M
[root@kvm-hkcloud01 ~]#

Why can the memory not be reserved? Could it be that not enough memory is available during boot?
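Before changing anything, it can help to compare what was requested with what the kernel actually reserved. The following is a minimal sketch, not part of the original troubleshooting: the helper names crashkernel_param and reserved_crash_bytes are hypothetical, and it assumes the kernel exposes /sys/kernel/kexec_crash_size, which reports the reserved size in bytes.

```shell
# Print the value of the crashkernel= parameter found in a kernel
# command line string (prints nothing if the parameter is absent).
crashkernel_param() {
    # Put each parameter on its own line, then strip the "crashkernel=" prefix.
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^crashkernel=//p' | head -n 1
}

# Print the number of bytes the kernel reserved for the crash kernel.
# The optional first argument overrides the sysfs path (useful for testing).
reserved_crash_bytes() {
    if [ -r "${1:-/sys/kernel/kexec_crash_size}" ]; then
        cat "${1:-/sys/kernel/kexec_crash_size}"
    else
        echo unknown
    fi
}
```

On the host above, crashkernel_param "$(cat /proc/cmdline)" would print 512M. A result of 0 from reserved_crash_bytes means the request was on the command line but nothing was reserved, which matches the kdumpctl message.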

Next I changed the crash kernel memory to 128M, and kdump started successfully.

# Show the current configuration
[root@kvm-hkcloud01 ~]# cat /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="cgroup_disable=files apparmor=0 crashkernel=512M"
GRUB_DISABLE_RECOVERY="true"
# Change 512 to 128 (this sed replaces every "512" in the file; here crashkernel=512M is the only match)
[root@kvm-hkcloud01 ~]# sed -i 's/512/128/g' /etc/default/grub
# Regenerate grub.cfg
[root@kvm-hkcloud01 ~]# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.10.0-136.12.0.86.oe2203sp1.x86_64
Found initrd image: /boot/initramfs-5.10.0-136.12.0.86.oe2203sp1.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-7c27015c30d4409fb116e3795e6126c4
Found initrd image: /boot/initramfs-0-rescue-7c27015c30d4409fb116e3795e6126c4.img
Adding boot menu entry for UEFI Firmware Settings ...
done
# Reboot the host
[root@kvm-hkcloud01 ~]# reboot
# Check the service
[root@kvm-hkcloud01 ~]# systemctl status kdump
● kdump.service - Crash recovery kernel arming
Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled)
Active: active (exited) since Wed 2024-06-26 10:25:09 CST; 1min 12s ago
Process: 1002 ExecStart=/usr/bin/kdumpctl start (code=exited, status=0/SUCCESS)
Main PID: 1002 (code=exited, status=0/SUCCESS)

Jun 26 10:25:03 kvm-hkcloud01 systemd[1]: Starting Crash recovery kernel arming...
Jun 26 10:25:09 kvm-hkcloud01 kdumpctl[1009]: kexec: loaded kdump kernel
Jun 26 10:25:09 kvm-hkcloud01 kdumpctl[1009]: Starting kdump: [OK]
Jun 26 10:25:09 kvm-hkcloud01 systemd[1]: Finished Crash recovery kernel arming.
[root@kvm-hkcloud01 ~]#
# Check memory: the total is 128M lower than before (967Mi -> 839Mi), which shows that 128M has been reserved for the crash kernel.
[root@kvm-hkcloud01 ~]# free -h
total used free shared buff/cache available
Mem: 839Mi 85Mi 559Mi 6.2Mi 195Mi 623Mi
Swap: 0B 0B 0B
[root@kvm-hkcloud01 ~]#
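The sed command used above rewrites every occurrence of 512 in /etc/default/grub, which works here only because crashkernel=512M is the sole match. A more targeted edit might look like the sketch below. The function name set_crashkernel is hypothetical, GNU sed is assumed, and as a simplifying assumption only plain sizes such as 128M or 1G are accepted (the kernel also supports range and offset syntax, which this sketch rejects).

```shell
# Set the crashkernel= value in a GRUB defaults file.
# $1: new size, e.g. 128M   $2: file to edit (default /etc/default/grub)
set_crashkernel() {
    # Simplifying assumption: accept only a number followed by K, M or G.
    if ! printf '%s' "$1" | grep -Eq '^[0-9]+[KMG]$'; then
        echo "unsupported size: $1" >&2
        return 1
    fi
    # Replace only the value after crashkernel=, stopping at a space or quote.
    sed -i -E "s/crashkernel=[^ \"]+/crashkernel=$1/" "${2:-/etc/default/grub}"
}
```

After running set_crashkernel 128M, grub.cfg still has to be regenerated with grub2-mkconfig and the host rebooted, as in the session above.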

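The reservation must not be too small either. If it is, an error such as "Could not find a free area of memory of 0x1ddd000 bytes..." appears. Running makedumpfile --mem-usage /proc/kcore can help estimate how much memory is needed.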
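The size in that error message is a hexadecimal byte count, which is easier to compare with a crashkernel value once converted to MiB. A small sketch follows, with a hypothetical helper name. It presumes the figure is the size of one segment that could not be placed, so the reservation needs to be at least this large plus room for everything else that is loaded.

```shell
# Convert a byte count (decimal or 0x-prefixed hex) to MiB, rounding up.
bytes_to_mib_ceil() {
    echo $(( ($1 + 1048575) / 1048576 ))
}

bytes_to_mib_ceil 0x1ddd000   # prints 30
```

In this example the failed request was about 30 MiB for a single area, so a reservation of only a few tens of megabytes would be too small.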

Conclusion

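On a virtual machine with little memory, a large crashkernel value can leave the kernel unable to reserve memory for the crash kernel, so kdump fails to start. free reports the memory available at runtime, whereas the crashkernel region is reserved early in boot as a single contiguous block of physical memory, which is likely why 512M could not be reserved on a machine with roughly 1G in total. Lowering the crashkernel parameter resolves the problem.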