Skip to content

Integrate Broadcom Devices in amd-smi framework #71

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 39 commits into
base: amd-staging
Choose a base branch
from

Conversation

mcharles-brcm
Copy link

@mcharles-brcm mcharles-brcm commented Jan 26, 2025

As part of this feature,

  1. Added BRCM devices in amd-smi framework
  2. The affected command is 'amd-smi list'
  3. Discovered and added BRCM's NIC and SWITCH devices in the list
  4. The discovery is specific to sysfs entries, which is created by BRCM driver. Hence, without BRCM driver the necessary NIC and SWITCH devices will not be listed in the list command
  5. As of now added BFD and UUID elements added as part of the individual devices

Updated (1/27)

  1. Added BRCM devices as part of amd-smi monitor command
  2. Very limited monitor attributes are added as part of this feature
  3. The affected command is 'amd-smi monitor'
  4. If hwmon sysfs entry not created for the BRCM NIC devices then those controllers are removed from the monitor command
  5. NIC's Monitor Attributes:
    i) NIC_TEMP_CURRENT
    ii) NIC_TEMP_CRIT_ALARM
    iii) NIC_TEMP_EMERGENCY_ALARM
    iv) NIC_TEMP_SHUTDOWN_ALARM
    v) NIC_TEMP_MAX_ALARM
  6. SWITCH's Monitor Attributes:
    i) CURRENT_LINK_SPEED
    ii) MAX_LINK_SPEED
    iii) CURRENT_LINK_WIDTH
    iv) MAX_LINK_WIDTH

jcnarlo and others added 10 commits January 23, 2025 11:34
1. As part of this we have added Broadcom's NIC and SWITCH devices in the amd-smi framework.
2. The affected command is 'amd-smi list'
3. Filter the NIC and SWITCH devices based on the Broadcom's vendor ids
4. Since these changes are depending on sysfs entries which is created by broadcom drivers, if no brcm driver present in the system then necessary elements will not created as part of the list command.
…MD-ROCm-Internal/amdsmi into SWDEV-504389/Synch_Comment_In_Linux_BM
NIC/Switch LIST
mcharles-brcm and others added 8 commits January 26, 2025 14:17
As part of this feature,
1. Added monitor attributes for BRCM devices - NIC and SWITCH
2. The affected amd-smi command is 'amd-smi monitor'
3. The list of monitor attributes are read from the corresponding BRCM device's sysfs path
BRCM Monitor
…imestamp from firmware. (ROCm#61)

* SWDEV-511296 - update violation_status->violation_timestamp to read values from firmware.

Signed-off-by: Greg Scaffidi <[email protected]>

* SWDEV-511296 - update violation_status->violation_timestamp to read values from firmware.

Signed-off-by: Greg Scaffidi <[email protected]>

* SWDEV-511296 - update violation_status->violation_timestamp to read values from firmware.

Signed-off-by: Greg Scaffidi <[email protected]>

---------

Signed-off-by: Greg Scaffidi <[email protected]>
rahulc-gh pushed a commit that referenced this pull request Jan 30, 2025
Fix ordering of RHEL 8 build process

Signed-off-by: Williams, Justin <[email protected]>
@dmitrii-galantsev
Copy link
Collaborator

dmitrii-galantsev commented Feb 21, 2025

ohhh that's cool, I'll ask some teammates to see if they want to take on broadcom things.

continue internally

Affected Commands:
1. amd-smi topology -nic
Display nic and gpu connectivity

2. amd-smi topology -show_numa
Display nic,gpu's numa and cpu affinity
1. Remove duplicate declaration
2. Resolve Alignment issue
@@ -994,6 +996,25 @@ typedef struct {
*
* @cond @tag{gpu_bm_linux} @tag{host} @endcond
*/
typedef struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add comment to explain the difference of those temperature

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the comment statements to the methods.

uint32_t nic_temp_input;
uint32_t nic_temp_max;
uint32_t nic_temp_shutdown;
} amdsmi_nic_temperature_metric_t;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it is Broadcom specific metrics, maybe we should use the amdsmi_brcm_nic_temperature_metric_t?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed as amdsmi_brcm_nic_temperature_metric_t

} amdsmi_nic_temperature_metric_t;

typedef struct {
char current_link_speed[AMDSMI_MAX_STRING_LENGTH];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will assume this is the integer or float, and not string?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per BRCM NIC driver implementation the complete string added in the sysfs entry and the same populated in amd-smi.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No code changes made.

@@ -2951,7 +2972,17 @@ amdsmi_status_t amdsmi_get_gpu_bdf_id(amdsmi_processor_handle processor_handle,
* @return ::amdsmi_status_t | ::AMDSMI_STATUS_SUCCESS on success, non-zero on fail
*/
amdsmi_status_t amdsmi_get_gpu_topo_numa_affinity(amdsmi_processor_handle processor_handle, int32_t *numa_node);

amdsmi_status_t amdsmi_get_gpu_topo_cpu_affinity(amdsmi_processor_handle processor_handle,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file is user facing API, we may want to reduce the amount of API. In this case, we have cpu/gpu/nic, maybe just one API with a enum parameter to specify the type we want to use?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the idea is do not want to conflict with gpu implementation.
In addition to that to identify the devices in python interface using the corresponding wrapper methods.

No code changes made.


amdsmi_status_t amdsmi_get_gpu_topo_cpu_affinity(amdsmi_processor_handle processor_handle,
unsigned int *cpu_aff_length, char *cpu_aff_data);
amdsmi_status_t amdsmi_get_nic_topo_numa_affinity(amdsmi_processor_handle processor_handle, int32_t *numa_node);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add comment for API defined in this file, so that doxygen can generate proper document.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the comment and deliver the changes.

char buffer[128];

// Open a pipe to execute the command
std::shared_ptr<FILE> pipe(popen(command.c_str(), "r"), pclose);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should try to avoid call external command if possible as this is a library. maybe you can refer to some code here:
https://github.com/pciutils/pciutils/blob/master/lspci.c

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like lspci command, we are planning to add ethtool, sg_ses commands also to populate the nic and switch devices details. In this case, we are believing this is better approach to get the necessary information.

No code changes made.

namespace amd {
namespace smi {

class AMDSmiNoDrmNIC {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain what is no drm nic? I think drm is related to GPU only? In amdsmi, some optional function needs to refer the libdrm, but we do not want to add the hard dependency to it, so we dynamically load it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the dependency "xf86drm.h" and delivered the changes. Other than GPU's base class referred as no_drm. It can be NIC and Switch.

@@ -1500,6 +1500,8 @@ rsmi_status_t rsmi_driver_status(rsmi_driver_state_t* state);
* @retval ::RSMI_STATUS_SUCCESS is returned upon successful call.
*/
rsmi_status_t rsmi_num_monitor_devices(uint32_t *num_devices);
rsmi_status_t rsmi_num_nic_monitor_devices(uint32_t *num_devices);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add comments for public facing header so that the doxygen can generate API document properly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the comments and deliver the changes.

@@ -562,38 +642,74 @@ amdsmi_get_gpu_device_bdf(amdsmi_processor_handle processor_handle, amdsmi_bdf_t
return AMDSMI_STATUS_SUCCESS;
}

amdsmi_status_t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why we need to remove this function?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This difference due to merge conflict and it was resolved.

@@ -324,6 +325,13 @@ amdsmi_status_t AMDSmiDrm::get_bdf_by_index(uint32_t gpu_index, amdsmi_bdf_t *bd
return AMDSMI_STATUS_SUCCESS;
}

amdsmi_status_t AMDSmiDrm::amdgpu_query_cpu_affinity(std::string devicePath, std::string &cpu_affinity) {
std::string cpuAffFile = "cpulistaffinity";
cpu_affinity = smi_brcm_get_value_string(devicePath, cpuAffFile);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it will never fail?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is handled in amd_smi_utils.cc. If file not present returning as "N/A".
Hence, expected return value is always success.

uint32_t socket_count = 0;

// Initialize amdsmi for AMD CPUs
ret = amdsmi_init(AMDSMI_INIT_AMD_GPUS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you only init the amdsmi to discover GPU device, then you should not be able to find out the NIC device in the following amdsmi_get_processor_handles_by_type()

if you want to have an example of code to support both NIC and GPU device, you may need to pass both as parameter.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As part of AMDSMI_INIT_AMD_GPUS only BRCM NIC and Switch flags are initialized.

* Construct the BDF for the discovered NIC devices
* Push NIC SMI Device object to nic_devices vector
*/
ret = DiscoverBRCMnicDevices();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a system does not have the NIC device, we should not fail the whole init() as the user may only care about the GPU.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled graceful exit in DiscoverBRCMnicDevices and DiscoverBRCMswitchDevices and make sure always return 0.

uint64_t bdfid;
for (auto &device : nic_devices_) {
if (ConstructBDFID(device->path(), &bdfid) != 0) {
std::cerr << "Failed to construct BDFID." << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly, any NIC device init fail should not fail the whole function as people may just want to work on GPU.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled dead code.

@@ -653,7 +780,63 @@ void RocmSMI::AddToDeviceList(std::string dev_name, uint64_t bdfid) {
LOG_DEBUG(ss);
}

void RocmSMI::AddToNICDeviceList(std::string dev_name, uint64_t bdfid, uint32_t card_indx) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of code is similar to AddToSWITCHDeviceList(), consider to remove duplicate code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since both are two different elements do not want to make function overload.
Moreover, there is a chance to have a server with any one of the BRCM device - either BRCM NIC or BRCM Switch.

@@ -691,6 +874,84 @@ static bool isAMDGPU(std::string dev_path) {
return isAmdGpu;
}

static bool isBRCMnic(std::string dev_path) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code is similar to isBRCMswitch(), consider to remove duplicate code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OEM, device id and device's sysfs paths are different and handle multiple if else will not be looks good.
No code changes made.

@@ -1096,6 +1357,142 @@ uint32_t RocmSMI::DiscoverAmdgpuDevices(void) {
return 0;
}

uint32_t RocmSMI::DiscoverBRCMnicDevices(void) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar code to DiscoverBRCMswitchDevices(), remove duplication

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since both are two different elements do not want to make function overload.
Moreover, there is a chance to have a server with any one of the BRCM device - either BRCM NIC or BRCM Switch.

amdsmi_status_t amdsmi_get_gpu_topo_cpu_affinity(amdsmi_processor_handle processor_handle,
unsigned int *cpu_aff_length, char *cpu_aff_data);
amdsmi_status_t amdsmi_get_nic_topo_numa_affinity(amdsmi_processor_handle processor_handle, int32_t *numa_node);
amdsmi_status_t amdsmi_get_nic_topo_cpu_affinity(amdsmi_processor_handle processor_handle,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For any new API defined in the amdsmi.h, we need a unit test, you can refer to the files at tests/amd_smi_test/ folders.


static const char *kDeviceNamePrefix = "card";
static const char *kNICPrefix = "e"; //changed to 'e' since SLES syspath is 'eth' and RHEL/Ubuntu is 'en'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wl for wireless LAN (e.g., wlp2s0). But it may not be applied to our case, so we can still use this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Moreover, we are filtering by specific vendor / driver supported nic devices only.

* Push SWITCH SMI Device object to switch_devices vector
*/

ret = DiscoverBRCMswitchDevices();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The below switch code is very similar to handle BRCM NIC devices. Consider removing duplications.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since both are two different elements do not want to make function overload.
Moreover, there is a chance to have a server with any one of the BRCM device - either BRCM NIC or BRCM Switch.

err_msg += kPathSwitchRoot;
err_msg += ".";
perror(err_msg.c_str());
return 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of using hardcode number, we may prefer to use the macro to make the code more readable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Followed same as DiscoverAmdgpuDevices implementation.

auto path_dir = opendir(path.c_str());
if (path_dir == nullptr) {
//move to the next index
if (numofretry == 3) continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do we come up to use the hardcode number 3 instead of other number for retry?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To handle the negative test scenario added this case - the test server had only one invalid sysfs entry and the execution failed.
Note: In the good scenario the retry count is not required.
For safer side added the retry count as 3. In general we have user editable configuration file for feeding these values, but not sure that can be applicable for amd-smi.

@@ -247,6 +251,122 @@ amdsmi_status_t AMDSmiSystem::populate_amd_gpu_devices() {
return AMDSMI_STATUS_SUCCESS;
}

amdsmi_status_t AMDSmiSystem::populate_brcm_nic_devices() {
uint32_t device_count = 0;
amdsmi_status_t amd_smi_status = no_drm_nic.init();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to check the return status?

return AMDSMI_STATUS_SUCCESS;
}

amdsmi_status_t AMDSmiSystem::populate_brcm_switch_devices() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider combining with populate_brcm_nic_devices() to remove duplicate code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since both are two different elements do not want to make function overload.
Moreover, there is a chance to have a server with any one of the BRCM device - either BRCM NIC or BRCM Switch.

}
else {
std::string line;
getline(file, line);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handle the errors as this function may throw exceptions.

else {
std::string line;
getline(file, line);
return static_cast<uint32_t>(stoi(line));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the line is not number, what will happen?

return "N/A";
}
else {
getline(file, temp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handle error

@@ -63,6 +63,11 @@ add_executable(${SMI_NODRM_EXAMPLE_EXE} "amd_smi_nodrm_example.cc")
target_link_libraries(${SMI_NODRM_EXAMPLE_EXE} amd_smi)
add_dependencies(${SMI_NODRM_EXAMPLE_EXE} amd_smi)

set(SMI_LIST_EXAMPLE_EXE "amd_smi_list")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you just want to show to use the nic device, maybe using the name amd_smi_nic.cc instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the amd_smi_list example code to print all GPU, NIC and Switch devices BDF and UUID details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants