[ROCm] Return correct AMDSMI socket_power metric (#130331)

Extends the change in https://github.com/pytorch/pytorch/pull/127729

Depending on the gcnArch, the underlying gpu_metrics expose socket power through a different field of the AMDSMI power info: either average_socket_power or current_socket_power. This PR handles both cases.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130331
Approved by: https://github.com/jeffdaily, https://github.com/eqy, https://github.com/malfet
Jack Taylor
2024-07-17 01:58:55 +00:00
committed by PyTorch MergeBot
parent 03c660468e
commit e9023d57b0

@@ -1108,7 +1108,11 @@ def _get_amdsmi_temperature(device: Optional[Union[Device, int]] = None) -> int:
 def _get_amdsmi_power_draw(device: Optional[Union[Device, int]] = None) -> int:
     handle = _get_amdsmi_handler(device)
-    return amdsmi.amdsmi_get_power_info(handle)["current_socket_power"]
+    socket_power = amdsmi.amdsmi_get_power_info(handle)["average_socket_power"]
+    if socket_power != "N/A":
+        return socket_power
+    else:
+        return amdsmi.amdsmi_get_power_info(handle)["current_socket_power"]
 def _get_amdsmi_clock_rate(device: Optional[Union[Device, int]] = None) -> int:
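
The fallback in the hunk above can be tried without ROCm hardware. The following is a minimal sketch, not the PyTorch implementation: select_socket_power is a hypothetical helper that takes a plain dict shaped like the result of amdsmi.amdsmi_get_power_info, and the readings are made-up values. Unlike the diff, it reads the dict once and treats a missing average_socket_power key as "N/A"; both are assumptions made for illustration.

```python
from typing import Any, Mapping

# Sentinel string the diff compares against when a metric is not reported.
NOT_AVAILABLE = "N/A"


def select_socket_power(power_info: Mapping[str, Any]) -> Any:
    """Hypothetical helper: pick the socket power from one power-info dict.

    Prefers "average_socket_power" and falls back to "current_socket_power"
    when the average is reported as "N/A" (or, as an assumption of this
    sketch, when the key is absent).
    """
    average = power_info.get("average_socket_power", NOT_AVAILABLE)
    if average != NOT_AVAILABLE:
        return average
    return power_info["current_socket_power"]


# Made-up readings standing in for the two cases the commit message describes.
reports_average = {"average_socket_power": 215, "current_socket_power": "N/A"}
reports_current = {"average_socket_power": "N/A", "current_socket_power": 198}

print(select_socket_power(reports_average))  # 215
print(select_socket_power(reports_current))  # 198
```

Reading the dict once avoids the second amdsmi_get_power_info call that the diff makes in its else branch; whether that matters in practice depends on the cost of the query, which this sketch does not measure.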