Commit e00af37

MTL-2126 Switch out manual boot trim instructions for script (#5803)
* MTL-2126 Use boot order script

  This replaces the long-standing manual directions with a script invocation. The manual directions for boot trimming were from the early days when we wanted to reproduce what automation was doing. This section has lost some of its value as a manual process, it should invoke the automation script for easier use.

  (cherry picked from commit 7ffe74b)

* Update `efibootmgr` output examples

  (cherry picked from commit b9107d8)

---------

Co-authored-by: Russell Bunch <[email protected]>
1 parent c57db8f commit e00af37

1 file changed

+101
-75
lines changed


background/ncn_boot_workflow.md

+101-75
Original file line number | Diff line number | Diff line change
@@ -122,8 +122,8 @@ done
122122
* `ipmitool` can set and edit boot order; it works better for some vendors based on their BMC implementation
123123
* `efibootmgr` speaks directly to the node's UEFI; it can only be ignored by new BIOS activity
124124
125-
> **NOTE:** `cloud-init` will set boot order when it runs, but this does not always work with certain hardware vendors. An administrator can invoke the `cloud-init` script at
126-
> `/srv/cray/scripts/metal/set-efi-bbs.sh` on any NCN.
125+
> **NOTE:** `cloud-init` will set boot order and trim boot devices during its `runcmd` module, but this does not always work with certain hardware vendors. An administrator may invoke the `cloud-init` script on
126+
> any NCN or PIT by loading `/srv/cray/scripts/metal/metal-lib.sh` (this should be loaded in a sub-shell as the library has a `set -e` flag.)
127127
128128
## Setting boot order
129129
@@ -185,47 +185,24 @@ This is the end of the `Setting boot order` procedure.
185185
186186
## Trimming boot order
187187
188-
This section gives the procedure for removing unwanted entries from the boot order on NCNs and the PIT node.
188+
This procedure prunes the list of boot devices, optimizing the boot order to align with CSM's requirements.
189189
190-
This section will only advise on removing other PXE entries. There are too many
191-
vendor-specific entries beyond disks and NICs to cover in this section (e.g. BIOS entries, iLO entries, etc.).
190+
(`ncn#` or `pit#`) Load the metal tools library and invoke the boot trim function.
192191
193-
In this case, the instructions are the same regardless of node type (management, storage, or worker):
194-
195-
1. (`ncn#` or `pit#`) Make lists of the unwanted boot entries.
196-
197-
* Gigabyte Technology
198-
199-
```bash
200-
efibootmgr | grep -ivP '(pxe ipv?4.*)' | grep -iP '(adapter|connection|nvme|sata)' | tee /tmp/rbbs1
201-
efibootmgr | grep -iP '(pxe ipv?4.*)' | grep -i connection | tee /tmp/rbbs2
202-
```
203-
204-
* Hewlett-Packard Enterprise
205-
206-
> **NOTE:** This does not trim HSN Mellanox cards; these should disable their OpROMs using [the high speed network snippets](../operations/node_management/Switch_PXE_Boot_From_Onboard_NICs_to_PCIe.md#high-speed-network).
207-
208-
```bash
209-
efibootmgr | grep -vi 'pxe ipv4' | grep -i adapter |tee /tmp/rbbs1
210-
efibootmgr | grep -iP '(sata|nvme)' | tee /tmp/rbbs2
211-
```
212-
213-
* Intel Corporation
214-
215-
```bash
216-
efibootmgr | grep -vi 'ipv4' | grep -iP '(sata|nvme|uefi)' | tee /tmp/rbbs1
217-
efibootmgr | grep -i baseboard | tee /tmp/rbbs2
218-
```
219-
220-
1. (`ncn#` or `pit#`) Remove them.
192+
```bash
193+
TEMP=$(mktemp -d)
194+
efibootmgr > "${TEMP}/original.log"
195+
(
196+
. /srv/cray/scripts/metal/metal-lib.sh
197+
setup_uefi_bootorder >"${TEMP}/run.log"
198+
)
199+
```
221200
222-
```bash
223-
cat /tmp/rbbs* | awk '!x[$0]++' | sed 's/^Boot//g' | awk '{print $1}' | tr -d '*' | xargs -r -t -i efibootmgr -b {} -B
224-
```
201+
> ***NOTE*** The above snippet is `pdsh` friendly for bulk trims.
225202
226-
The boot menu should be trimmed down to contain only relevant entries.
203+
The `${TEMP}/run.log` file will show the output from each `efibootmgr` call to trim the boot order.
227204
228-
This is the end of the `Trimming boot order` procedure.
205+
The boot order has been trimmed.
229206
230207
## Example boot orders
231208
@@ -234,53 +211,102 @@ Each section shows example output of the `efibootmgr` command.
234211
* Master node (with onboard NICs enabled)
235212
236213
```text
237-
BootCurrent: 0009
238-
Timeout: 2 seconds
239-
BootOrder: 0004,0000,0007,0009,000B,000D,0012,0013,0002,0003,0001
240-
Boot0000* cray (sda1)
241-
Boot0001* UEFI: Built-in EFI Shell
242-
Boot0002* UEFI OS
243-
Boot0003* UEFI OS
244-
Boot0004* cray (sdb1)
245-
Boot0007* UEFI: PXE IP4 Intel(R) I350 Gigabit Network Connection
246-
Boot0009* UEFI: PXE IP4 Mellanox Network Adapter - B8:59:9F:34:89:62
247-
Boot000B* UEFI: PXE IP4 Mellanox Network Adapter - B8:59:9F:34:89:63
248-
Boot000D* UEFI: PXE IP4 Intel(R) I350 Gigabit Network Connection
249-
Boot0012* UEFI: PNY USB 3.1 FD PMAP
250-
Boot0013* UEFI: PNY USB 3.1 FD PMAP, Partition 2
214+
BootCurrent: 0002
215+
Timeout: 0 seconds
216+
BootOrder: 0000,0013,001A,0002,000F,0012,0014,0015
217+
Boot0000* System Utilities
218+
Boot0001 Non bootable Hotkey
219+
Boot0002* CRAY UEFI OS 0
220+
Boot0003 Intelligent Provisioning
221+
Boot0004 Embedded UEFI Shell
222+
Boot0005 Embedded iPXE
223+
Boot0006 Diagnose Error
224+
Boot0007 Boot Menu
225+
Boot0008 Network Boot
226+
Boot0009 View Integrated Management Log
227+
Boot000A View GUI mode Integrated Management Log
228+
Boot000B View BIOS Event Log
229+
Boot000C HTTP Boot
230+
Boot000D PXE Boot
231+
Boot000E Embedded Diagnostics
232+
Boot000F* CRAY UEFI OS 1
233+
Boot0010* Generic USB Boot
234+
Boot0012 SATA Drive Box 1 Bay 1 : VK000480GWTHA
235+
Boot0013* OCP Slot 10 Port 1 : Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HQCU-HC OCP3 Adapter - NIC - Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HQCU-HC OCP3 Adapter - PXE (PXE IPv4)
236+
Boot0014 SATA Drive Box 1 Bay 2 : VK000480GWTHA
237+
Boot0015 SATA Drive Box 1 Bay 3 : VK000480GWTHA
238+
Boot0016* Rear USB 1 : PNY USB 3.1 FD
239+
Boot001A* Slot 1 Port 1 : Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HLCU-HC MD2 Adapter - NIC - Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HLCU-HC MD2 Adapter - PXE (PXE IPv4)
251240
```
252241
253242
* Storage node (with onboard NICs enabled)
254243
255244
```text
256-
BootNext: 0005
257-
BootCurrent: 0006
258-
Timeout: 2 seconds
259-
BootOrder: 0007,0009,0000,0002
260-
Boot0000* cray (sda1)
261-
Boot0001* UEFI: Built-in EFI Shell
262-
Boot0002* cray (sdb1)
263-
Boot0005* UEFI: PXE IP4 Intel(R) I350 Gigabit Network Connection
264-
Boot0007* UEFI: PXE IP4 Mellanox Network Adapter - B8:59:9F:34:88:76
265-
Boot0009* UEFI: PXE IP4 Mellanox Network Adapter - B8:59:9F:34:88:77
266-
Boot000B* UEFI: PXE IP4 Intel(R) I350 Gigabit Network Connection
245+
BootCurrent: 0014
246+
Timeout: 0 seconds
247+
BootOrder: 0000,0014,0015,0016,0010,0011,0012
248+
Boot0000* System Utilities
249+
Boot0001 Non bootable Hotkey
250+
Boot0002 Intelligent Provisioning
251+
Boot0003 Embedded UEFI Shell
252+
Boot0004 Embedded iPXE
253+
Boot0005 Diagnose Error
254+
Boot0006 Boot Menu
255+
Boot0007 Network Boot
256+
Boot0008 View Integrated Management Log
257+
Boot0009 View GUI mode Integrated Management Log
258+
Boot000A View BIOS Event Log
259+
Boot000B HTTP Boot
260+
Boot000C PXE Boot
261+
Boot000D Embedded Diagnostics
262+
Boot000E* Generic USB Boot
263+
Boot000F* OCP Slot 10 Port 2 : Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HQCU-HC OCP3 Adapter - NIC - Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HQCU-HC OCP3 Adapter - PXE (PXE IPv4)
264+
Boot0010 SATA Drive Box 1 Bay 1 : VK000480GWTHA
265+
Boot0011 SATA Drive Box 1 Bay 2 : VK000480GWTHA
266+
Boot0012 SATA Drive Box 1 Bay 3 : VK001920GWTHC
267+
Boot0013 Temporary Legacy Boot Option
268+
Boot0014* OCP Slot 10 Port 1 : Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HQCU-HC OCP3 Adapter - NIC - Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HQCU-HC OCP3 Adapter - PXE (PXE IPv4)
269+
Boot0015* CRAY UEFI OS 0
270+
Boot0016* CRAY UEFI OS 1
267271
```
268272
269273
* Worker node (with onboard NICs enabled)
270274
271275
```text
272-
BootNext: 0005
273-
BootCurrent: 0008
274-
Timeout: 2 seconds
275-
BootOrder: 0007,0009,000B,0000,0002
276-
Boot0000* cray (sda1)
277-
Boot0001* UEFI: Built-in EFI Shell
278-
Boot0002* cray (sdb1)
279-
Boot0005* UEFI: PXE IP4 Intel(R) I350 Gigabit Network Connection
280-
Boot0007* UEFI: PXE IP4 Mellanox Network Adapter - 98:03:9B:AA:88:30
281-
Boot0009* UEFI: PXE IP4 Mellanox Network Adapter - B8:59:9F:34:89:2A
282-
Boot000B* UEFI: PXE IP4 Mellanox Network Adapter - B8:59:9F:34:89:2B
283-
Boot000D* UEFI: PXE IP4 Intel(R) I350 Gigabit Network Connection
276+
BootCurrent: 0019
277+
Timeout: 20 seconds
278+
BootOrder: 0000,0019,001E,001C,001D,0010,0011,0012,0013,0014,0015,0016,0017,0018,001A
279+
Boot0000* System Utilities
280+
Boot0001 Non bootable Hotkey
281+
Boot0002 Intelligent Provisioning
282+
Boot0003 Embedded UEFI Shell
283+
Boot0004 Embedded iPXE
284+
Boot0005 Diagnose Error
285+
Boot0006 Boot Menu
286+
Boot0007 Network Boot
287+
Boot0008 View Integrated Management Log
288+
Boot0009 View GUI mode Integrated Management Log
289+
Boot000A View BIOS Event Log
290+
Boot000B HTTP Boot
291+
Boot000C PXE Boot
292+
Boot000D Embedded Diagnostics
293+
Boot000E* Generic USB Boot
294+
Boot000F Temporary Legacy Boot Option
295+
Boot0010 SATA Drive Box 4 Bay 2 : VK001920GWTTC
296+
Boot0011 SATA Drive Box 4 Bay 1 : VK001920GWTTC
297+
Boot0012 SATA Drive Box 1 Bay 5 : VK001920GWTTC
298+
Boot0013 SATA Drive Box 1 Bay 6 : VK001920GWTTC
299+
Boot0014 SATA Drive Box 1 Bay 7 : VK001920GWTTC
300+
Boot0015 SATA Drive Box 1 Bay 8 : VK001920GWTTC
301+
Boot0016 SATA Drive Box 1 Bay 1 : VK000480GWTHA
302+
Boot0017 SATA Drive Box 1 Bay 2 : VK000480GWTHA
303+
Boot0018 SATA Drive Box 1 Bay 3 : VK001920GWTTC
304+
Boot0019* OCP Slot 10 Port 1 : Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HQCU-HC OCP3 Adapter - NIC - Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HQCU-HC OCP3 Adapter - PXE (PXE IPv4)
305+
Boot001A SATA Drive Box 1 Bay 4 : VK001920GWTTC
306+
Boot001B* Generic USB Boot
307+
Boot001C* CRAY UEFI OS 0
308+
Boot001D* CRAY UEFI OS 1
309+
Boot001E* Slot 1 Port 1 : Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HLCU-HC MD2 Adapter - NIC - Marvell FastLinQ 41000 Series - 2P 25GbE SFP28 QL41232HLCU-HC MD2 Adapter - PXE (PXE IPv4)
284310
```
285311
286312
## Reverting changes
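
The new `Trimming boot order` snippet in this diff saves the pre-trim `efibootmgr` listing to `${TEMP}/original.log`. A minimal sketch of how one might check which entries the trim removed is shown here. It assumes a second listing is captured after the trim into a file such as `${TEMP}/after.log` (an assumed name, not part of the commit). The helper functions `boot_ids` and `trimmed_ids` are hypothetical and do not come from `metal-lib.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for comparing two saved `efibootmgr` listings.
# These are illustrative only; they are NOT part of metal-lib.sh.

# Print the four-hex-digit ID of every BootXXXX entry in a saved listing.
# Header lines such as "BootCurrent:" and "BootOrder:" do not match.
boot_ids() {
    sed -n 's/^Boot\([0-9A-Fa-f]\{4\}\)[* ].*/\1/p' "$1" | sort
}

# Print the IDs present in the "before" listing ($1) but absent from the
# "after" listing ($2), i.e. the entries that were trimmed.
trimmed_ids() {
    comm -23 <(boot_ids "$1") <(boot_ids "$2")
}

# Example usage on a node (assumes the trim has already been run and that
# ${TEMP} still points at the directory created by the trim snippet):
#   efibootmgr > "${TEMP}/after.log"
#   trimmed_ids "${TEMP}/original.log" "${TEMP}/after.log"
```

Comparing entry IDs rather than whole lines avoids false differences when only the active flag (`*`) or the `BootOrder` line changes.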
