FEATURE: Integrate hardware components into SAT file #21

Open
@Masber

We would like to create a superset of the HPE SAT file format that adds new features, such as git tags in CFS configuration layers and hardware inventory for HSM groups.
This approach gives us a self-contained cluster definition in a single file.
Some use cases:

  • Build new clusters from scratch
  • Migrate clusters across different sites
  • Simplify cluster management

Example:

hardware:
- pattern: a100:4:epyc:4
configurations:
- name: config-test-__DATE__
  layers:
  - name: test-layer
    playbook: site.yml
    git:
      url: https://api-gw-service-nmn.local/vcs/cray/test_layer.git
      branch: cscs-23.06.0

images:
- name: image-test-__DATE__
  ref_name: test_image
  base:
    ims:
      type: image
      id: 3de9f01b-1981-4248-a7b9-c9803a6bc471
  configuration: config-test-__DATE__
  configuration_group_names:
  - Compute
  - adula

session_templates:
- name: sessiontemplate-test-__DATE__
  image:
    image_ref: test_image
  configuration: config-test-__DATE__
  bos_parameters:
    boot_sets:
      compute:
        kernel_parameters: ip=dhcp quiet spire_join_token=${SPIRE_JOIN_TOKEN}
        node_groups:
        - adula

At the top, we can see a description of the hardware we want the cluster to have:

hardware:
- pattern: a100:4:epyc:4

When this SAT file is processed, the end goal is an HSM group containing 4 Nvidia A100 GPUs and 4 AMD EPYC CPUs. Finding the required hardware in the CSM hardware inventory is out of scope, since it is already implemented.
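To make the pattern format concrete, here is a minimal Python sketch of how a string such as `a100:4:epyc:4` could be turned into a set of requirements. It assumes the pattern is a colon-separated list of alternating component names and counts, which is only inferred from the single example above. The function name `parse_hardware_pattern` is hypothetical and is not part of manta or mesa.

```python
def parse_hardware_pattern(pattern: str) -> dict[str, int]:
    """Parse a pattern such as 'a100:4:epyc:4' into {'a100': 4, 'epyc': 4}.

    Assumption: the pattern alternates <component>:<count> pairs.
    """
    fields = pattern.split(":")
    # An empty string splits into [""], so it is rejected here as well.
    if len(fields) % 2 != 0:
        raise ValueError(f"expected <component>:<count> pairs, got {pattern!r}")

    requirements: dict[str, int] = {}
    # Even positions hold component names, odd positions hold counts.
    for component, count in zip(fields[0::2], fields[1::2]):
        if not component:
            raise ValueError(f"empty component name in {pattern!r}")
        quantity = int(count)  # raises ValueError if the count is not an integer
        if quantity <= 0:
            raise ValueError(f"count for {component!r} must be positive")
        # Repeated components are summed rather than overwritten.
        requirements[component] = requirements.get(component, 0) + quantity
    return requirements


print(parse_hardware_pattern("a100:4:epyc:4"))  # {'a100': 4, 'epyc': 4}
```

Summing repeated components is a design choice made for this sketch; rejecting duplicates would be equally reasonable.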

This task is to:

  1. Adapt the logic in manta that reads a SAT file
  2. Interact with mesa to get the nodes that meet the hardware requirements
  3. Create/update the HSM group accordingly
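The three steps above could be sketched roughly as follows, in Python for brevity. This is only an illustration under several assumptions: the SAT file has already been loaded into a dictionary, the target HSM group name is passed in explicitly (the issue does not say where it comes from), and the node lookup is injected as a callback named `find_nodes`, which is a hypothetical stand-in for whatever mesa provides. The node names are made up, and whether nodes that no longer match should be removed from the group is also an assumption.

```python
from typing import Callable, Iterable


def plan_hsm_group_update(
    sat: dict,
    group_name: str,
    current_members: Iterable[str],
    find_nodes: Callable[[str], Iterable[str]],
) -> dict:
    """Compute which nodes to add to or remove from an HSM group.

    `find_nodes` is a hypothetical callback standing in for the existing
    hardware lookup; it takes a pattern string and returns node names.
    """
    # Step 1: read the 'hardware' section of the already-parsed SAT file.
    targets: set[str] = set()
    for entry in sat.get("hardware", []):
        # Step 2: ask the inventory lookup for nodes matching each pattern.
        targets.update(find_nodes(entry["pattern"]))

    # Step 3: diff against the current group membership.
    current = set(current_members)
    return {
        "group": group_name,
        "add": sorted(targets - current),
        "remove": sorted(current - targets),
    }


# Usage with a fake lookup; the node names below are invented for the example.
def fake_find_nodes(pattern: str) -> list[str]:
    return ["x1000c0s0b0n0", "x1000c0s0b1n0"]


sat = {"hardware": [{"pattern": "a100:4:epyc:4"}]}
plan = plan_hsm_group_update(
    sat, "adula", ["x1000c0s0b1n0", "x1000c0s7b0n0"], fake_find_nodes
)
print(plan)
# {'group': 'adula', 'add': ['x1000c0s0b0n0'], 'remove': ['x1000c0s7b0n0']}
```

Returning a plan instead of applying it directly keeps the sketch free of any real API calls and would make a dry-run mode straightforward.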
