Open
Description
We would like to create a superset of the HPE SAT file format, adding new features such as Git tags for CFS configuration layers and hardware inventory for HSM groups.
This approach gives us a self-contained cluster definition in a single file.
Some use cases:
- Build new clusters from scratch
- Migrate clusters across different sites
- Simplify cluster management
Example:
hardware:
  - pattern: a100:4:epyc:4
configurations:
  - name: config-test-__DATE__
    layers:
      - name: test-layer
        playbook: site.yml
        git:
          url: https://api-gw-service-nmn.local/vcs/cray/test_layer.git
          branch: cscs-23.06.0
images:
  - name: image-test-__DATE__
    ref_name: test_image
    base:
      ims:
        type: image
        id: 3de9f01b-1981-4248-a7b9-c9803a6bc471
    configuration: config-test-__DATE__
    configuration_group_names:
      - Compute
      - adula
session_templates:
  - name: sessiontemplate-test-__DATE__
    image:
      image_ref: test_image
    configuration: config-test-__DATE__
    bos_parameters:
      boot_sets:
        compute:
          kernel_parameters: ip=dhcp quiet spire_join_token=${SPIRE_JOIN_TOKEN}
          node_groups:
            - adula
At the top, we can see a description of the hardware we want the cluster to have:
hardware:
  - pattern: a100:4:epyc:4
When processing this SAT file, the end goal is an HSM group containing four NVIDIA A100 GPUs and four AMD EPYC CPUs. Finding the required hardware in the CSM hardware inventory is out of scope for this task, since it is already implemented.
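To make the pattern format concrete, here is a minimal Python sketch of how `a100:4:epyc:4` could be turned into per-component counts. It assumes the pattern is a colon-separated sequence of `component:count` pairs, which is inferred from the example above rather than from a format specification. The function name `parse_hardware_pattern` is hypothetical and is not part of manta or mesa.

```python
def parse_hardware_pattern(pattern: str) -> dict[str, int]:
    """Parse a pattern such as 'a100:4:epyc:4' into {'a100': 4, 'epyc': 4}.

    Assumption: the pattern alternates component names and quantities,
    separated by colons. This layout is inferred from the example only.
    """
    fields = pattern.split(":")
    if len(fields) % 2 != 0:
        raise ValueError(f"expected component:count pairs, got {pattern!r}")

    requirements: dict[str, int] = {}
    # Walk the fields two at a time: even positions are names, odd are counts.
    for component, count in zip(fields[0::2], fields[1::2]):
        if not component:
            raise ValueError(f"empty component name in {pattern!r}")
        quantity = int(count)  # raises ValueError if the count is not numeric
        if quantity <= 0:
            raise ValueError(f"count for {component!r} must be positive")
        # Repeated components are summed rather than overwritten.
        requirements[component] = requirements.get(component, 0) + quantity
    return requirements


print(parse_hardware_pattern("a100:4:epyc:4"))
```

Rejecting malformed patterns up front keeps an invalid SAT file from reaching the node selection step.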
This task is to:
- adapt the logic in manta that reads a SAT file
- interact with mesa to get the nodes that meet the hardware requirements
- create or update the HSM group accordingly
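The last step can be sketched as a plain membership diff. The Python below assumes the node selection has already returned a list of node identifiers and that the current group membership, if any, is known. The function name `plan_hsm_group_change`, the result layout, and the identifiers in the usage example are all hypothetical. No manta, mesa, or HSM API call is shown here.

```python
from typing import Iterable, Optional


def plan_hsm_group_change(
    current_members: Optional[Iterable[str]],
    target_members: Iterable[str],
) -> dict:
    """Decide whether to create or update a group and which nodes change.

    current_members is None when the group does not exist yet.
    This is an illustrative helper, not an existing manta function.
    """
    target = set(target_members)
    if current_members is None:
        # Group is missing: create it with every selected node.
        return {"action": "create", "add": sorted(target), "remove": []}

    current = set(current_members)
    return {
        "action": "update",
        "add": sorted(target - current),     # selected nodes not yet in the group
        "remove": sorted(current - target),  # group nodes no longer selected
    }


# Hypothetical node identifiers, for illustration only.
plan = plan_hsm_group_change(["node-a", "node-b"], ["node-b", "node-c"])
print(plan)
```

Computing the plan before touching the group makes it possible to show the intended changes to the operator first.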