Hi, I've noticed that my ESXi host reboots at random every few weeks, and every time before it happens vmkwarning.log records the messages below.
Seems to me like a memory leak.
I've deployed 2 servers with ESXi 7.0 on the exact same hardware, but only this one is giving me grief.
Both are running the same BIOS version and release.
It's an HPE ProLiant DL20 Gen10 with an HPE Smart Array E208i-a SR Gen10 Array Controller.
Server #1 has been running fine since deployment (45 days), while this one reboots every week or so.
The last reboot was today, 12 Oct; the previous one was on 5 Oct.
Could it be bad RAM, or is ESXi leaking memory?
The ESXi build is 16324942.
Any help is appreciated.
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 259: Failed to add Non-PF mem (0x4000006000 - 0x4000006fff): Limit exceeded
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 320: 0000:00:12.0: Failed to add BAR[0] (MEM64 f=0x4 0x4000006000-0x4000007000) status: Limit exceeded, parent: \_SB_.PC00
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 184: Failed to remove Non-PF mem (0x4000006000 - 0x4000006fff): Limit exceeded
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 453: 0000:00:12.0: Unable to free BAR[0] (MEM64 f=0x4 0x4000006000-0x4000007000): Limit exceeded
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 259: Failed to add Non-PF mem (0x4000000000 - 0x4000001fff): Limit exceeded
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 320: 0000:00:14.2: Failed to add BAR[0] (MEM64 f=0x4 0x4000000000-0x4000002000) status: Limit exceeded, parent: \_SB_.PC00
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 184: Failed to remove Non-PF mem (0x4000000000 - 0x4000001fff): Limit exceeded
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 453: 0000:00:14.2: Unable to free BAR[0] (MEM64 f=0x4 0x4000000000-0x4000002000): Limit exceeded
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 184: Failed to remove Non-PF mem (0x4000005000 - 0x4000005fff): Limit exceeded
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 453: 0000:00:14.2: Unable to free BAR[2] (MEM64 f=0x4 0x4000005000-0x4000006000): Limit exceeded
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 259: Failed to add Non-PF mem (0x4000004000 - 0x4000004fff): Limit exceeded
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 320: 0000:00:16.0: Failed to add BAR[0] (MEM64 f=0x4 0x4000004000-0x4000005000) status: Limit exceeded, parent: \_SB_.PC00
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 184: Failed to remove Non-PF mem (0x4000004000 - 0x4000004fff): Limit exceeded
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 453: 0000:00:16.0: Unable to free BAR[0] (MEM64 f=0x4 0x4000004000-0x4000005000): Limit exceeded
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 259: Failed to add Non-PF mem (0x4000003000 - 0x4000003fff): Limit exceeded
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 320: 0000:00:16.4: Failed to add BAR[0] (MEM64 f=0x4 0x4000003000-0x4000004000) status: Limit exceeded, parent: \_SB_.PC00
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 184: Failed to remove Non-PF mem (0x4000003000 - 0x4000003fff): Limit exceeded
2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 453: 0000:00:16.4: Unable to free BAR[0] (MEM64 f=0x4 0x4000003000-0x4000004000): Limit exceeded
2020-10-12T05:38:48.020Z cpu0:524288)WARNING: PCI: 184: Failed to remove Non-PF mem (0x4000006000 - 0x4000006fff): Limit exceeded
2020-10-12T05:38:48.020Z cpu0:524288)WARNING: PCI: 453: 0000:00:12.0: Unable to free BAR[0] (MEM64 f=0x4 0x4000006000-0x4000007000): Limit exceeded
2020-10-12T05:38:48.020Z cpu0:524288)WARNING: PCI: 184: Failed to remove Non-PF mem (0x4000000000 - 0x4000001fff): Limit exceeded
2020-10-12T05:38:48.020Z cpu0:524288)WARNING: PCI: 453: 0000:00:14.2: Unable to free BAR[0] (MEM64 f=0x4 0x4000000000-0x4000002000): Limit exceeded
2020-10-12T05:38:48.020Z cpu0:524288)WARNING: PCI: 184: Failed to remove Non-PF mem (0x4000005000 - 0x4000005fff): Limit exceeded
2020-10-12T05:38:48.020Z cpu0:524288)WARNING: PCI: 453: 0000:00:14.2: Unable to free BAR[2] (MEM64 f=0x4 0x4000005000-0x4000006000): Limit exceeded
2020-10-12T05:38:48.020Z cpu0:524288)WARNING: PCI: 184: Failed to remove Non-PF mem (0x4000004000 - 0x4000004fff): Limit exceeded
2020-10-12T05:38:48.020Z cpu0:524288)WARNING: PCI: 453: 0000:00:16.0: Unable to free BAR[0] (MEM64 f=0x4 0x4000004000-0x4000005000): Limit exceeded
2020-10-12T05:38:48.020Z cpu0:524288)WARNING: PCI: 184: Failed to remove Non-PF mem (0x4000003000 - 0x4000003fff): Limit exceeded
2020-10-12T05:38:48.020Z cpu0:524288)WARNING: PCI: 453: 0000:00:16.4: Unable to free BAR[0] (MEM64 f=0x4 0x4000003000-0x4000004000): Limit exceeded
2020-10-12T05:38:48.021Z cpu0:524288)WARNING: PCI: 239: 0000:00:1f.5: BAR[0] (MEM f=0x0 0xfe010000-0xfe011000) registration failed (Bad address range)
2020-10-12T05:38:54.896Z cpu6:524775)WARNING: APEI: 306: Could not initialize EINJ
2020-10-12T05:38:56.352Z cpu6:524835)WARNING: WARN: smartpqi: pqisrc_display_device_info:248: added scsi BTL 1:0:1: HPE LOGICAL VOLUME RAID 1(1+0) SSDSmartPathCap- En- Exp+ qd=0
2020-10-12T05:38:56.352Z cpu6:524835)WARNING: WARN: smartpqi: pqisrc_display_device_info:248: added scsi BTL 2:1088:1: HPE E208i-a SR Gen10 RAID 0 SSDSmartPathCap- En- Exp+ qd=1014
2020-10-12T05:38:56.446Z cpu10:524692)WARNING: etherswitch: PortCfg_ModInit:1078: Skipped initializing etherswitch portcfg for VSS to use cswitch and portcfg module
2020-10-12T05:38:57.808Z cpu0:524692)WARNING: FBFT not enabled
2020-10-12T05:39:01.957Z cpu9:524692)WARNING: NMP: nmpPathClaimEnd:1393: All Helper Completed registering device 2
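
To make it easier to compare this host against the healthy one, the warnings above could be tallied with something like the rough sketch below. It assumes the log has been copied off the host as plain text and that every line follows the `timestamp cpuN:world)WARNING: message` layout seen in the excerpt; the function name `summarize` and the "(no prefix)" bucket are made up for illustration and are not part of any VMware tooling.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt above:
# 2020-10-12T05:38:48.017Z cpu0:524288)WARNING: PCI: 259: Failed to add ...
LINE_RE = re.compile(
    r"^(?P<ts>\S+Z) cpu(?P<cpu>\d+):(?P<world>\d+)\)WARNING: (?P<msg>.*)$"
)
# PCI address in segment:bus:device.function form, e.g. 0000:00:14.2
PCI_ADDR_RE = re.compile(r"\b[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]\b")


def summarize(lines):
    """Count warnings per subsystem prefix and per PCI address (hypothetical helper)."""
    by_subsystem = Counter()
    by_pci_device = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a WARNING line
        msg = m.group("msg")
        # Some drivers add an extra "WARN: " in front of their own name.
        if msg.startswith("WARN: "):
            msg = msg[len("WARN: "):]
        head, sep, _ = msg.partition(":")
        by_subsystem[head.strip() if sep else "(no prefix)"] += 1
        addr = PCI_ADDR_RE.search(msg)
        if addr:
            by_pci_device[addr.group(0)] += 1
    return by_subsystem, by_pci_device


if __name__ == "__main__":
    import sys

    # Usage: python summarize_warnings.py vmkwarning.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        subsystems, devices = summarize(fh)
    for name, count in subsystems.most_common():
        print(f"{count:5d}  {name}")
    for addr, count in devices.most_common():
        print(f"{count:5d}  {addr}")
```

Running the same script on the log from server #1 would show whether the PCI "Limit exceeded" warnings appear on both hosts or only on the one that reboots.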