[Profiler] the doc of _ExperimentalConfig is incorrectly truncated by commas (#156586)

Hi team,

Please help review this trivial fix.

Without this change:

```python
>>> import torch
>>> print(torch._C._profiler._ExperimentalConfig.__init__.__doc__)
__init__(self: torch._C._profiler._ExperimentalConfig, profiler_metrics: list[str] = [], profiler_measure_per_kernel: bool = False, verbose: bool = False, performance_events: list[str] = [], enable_cuda_sync_events: bool = False, adjust_profiler_step: bool = False, disable_external_correlation: bool = False, profile_all_threads: bool = False, capture_overload_names: bool = False) -> None

    capture_overload_names (bool) : whether to include ATen overload names in the profile
```

With this change:

```python
>>> import torch
>>> print(torch._C._profiler._ExperimentalConfig.__init__.__doc__)
__init__(self: torch._C._profiler._ExperimentalConfig, profiler_metrics: list[str] = [], profiler_measure_per_kernel: bool = False, verbose: bool = False, performance_events: list[str] = [], enable_cuda_sync_events: bool = False, adjust_profiler_step: bool = False, disable_external_correlation: bool = False, profile_all_threads: bool = False, capture_overload_names: bool = False) -> None

An experimental config for Kineto features. Please note thatbackward compatibility is not guaranteed.
    profiler_metrics : a list of CUPTI profiler metrics used
       to measure GPU performance events.
       If this list contains values Kineto runs in CUPTI profiler mode
    profiler_measure_per_kernel (bool) : whether to profile metrics per kernel
       or for the entire measurement duration.
    verbose (bool) : whether the trace file has `Call stack` field or not.
    performance_events : a list of profiler events to be used for measurement.
    enable_cuda_sync_events : for CUDA profiling mode, enable adding CUDA synchronization events
       that expose CUDA device, stream and event synchronization activities. This feature is new
       and currently disabled by default.
    adjust_profiler_step (bool) : whether to adjust the profiler step to
       match the parent python event duration. This feature is new and currently disabled by default.
    disable_external_correlation (bool) : whether to disable external correlation
    profile_all_threads (bool) : whether to profile all threads
    capture_overload_names (bool) : whether to include ATen overload names in the profile

```
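
For context, the truncation follows from how adjacent string literals behave. In C++, as in Python, literals separated only by whitespace are concatenated into one string, while a comma ends the current argument and starts a new one. The sketch below mimics this in plain Python. `bind_init` is a hypothetical stand-in for the binding call, and its "last string wins" rule is an assumption chosen to match the truncated output above, not a description of the binding library's documented behavior.

```python
def bind_init(*extras):
    """Hypothetical stand-in for a binding call.

    Every str argument is treated as a docstring, and a later one replaces
    an earlier one (an assumption that matches the truncated output above).
    """
    doc = None
    for extra in extras:
        if isinstance(extra, str):
            doc = extra
    return doc


# Stray commas: each literal is a separate argument, so only the last survives.
broken = bind_init(
    "adjust_profiler_step (bool) : whether to adjust the profiler step\n",
    "profile_all_threads (bool) : whether to profile all threads\n",
    "capture_overload_names (bool) : whether to include ATen overload names in the profile\n",
)

# No commas between literals: they are concatenated into a single docstring.
fixed = bind_init(
    "adjust_profiler_step (bool) : whether to adjust the profiler step\n"
    "profile_all_threads (bool) : whether to profile all threads\n"
    "capture_overload_names (bool) : whether to include ATen overload names in the profile\n",
)

print(repr(broken))       # only the capture_overload_names line
print(fixed.count("\n"))  # 3
```
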
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156586
Approved by: https://github.com/sraikund16, https://github.com/cyyever
Author: Denghui Dong
Date: 2025-07-16 04:10:46 +00:00
Committed by: PyTorch MergeBot
Parent: 0a9d450168
Commit: e92e3eaf4e

@@ -356,10 +356,10 @@ void initPythonBindings(PyObject* module) {
 " that expose CUDA device, stream and event synchronization activities. This feature is new\n"
 " and currently disabled by default.\n"
 " adjust_profiler_step (bool) : whether to adjust the profiler step to\n"
-" match the parent python event duration. This feature is new and currently disabled by default.\n",
-" disable_external_correlation (bool) : whether to disable external correlation\n",
-" profile_all_threads (bool) : whether to profile all threads\n",
-" capture_overload_names (bool) : whether to include ATen overload names in the profile\n",
+" match the parent python event duration. This feature is new and currently disabled by default.\n"
+" disable_external_correlation (bool) : whether to disable external correlation\n"
+" profile_all_threads (bool) : whether to profile all threads\n"
+" capture_overload_names (bool) : whether to include ATen overload names in the profile\n"
 " custom_profiler_config (string) : Used to pass some configurations to the custom profiler backend.\n",
 py::arg("profiler_metrics") = std::vector<std::string>(),
 py::arg("profiler_measure_per_kernel") = false,
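
One way to guard against this kind of regression is to check that every parameter of interest has its own description line in the docstring. The following is a minimal sketch: `missing_params` is a hypothetical helper, not part of PyTorch, and the sample docstring is abbreviated from the "Without this change" output in the commit message.

```python
def missing_params(doc, names):
    """Return the names that have no description line in `doc`.

    The first line of `doc` is assumed to be the generated signature; it is
    skipped because it mentions every parameter regardless of the docs.
    """
    body = doc.split("\n", 1)[1] if "\n" in doc else ""
    # The first token of each non-empty line is taken as the described name.
    described = {line.split()[0] for line in body.splitlines() if line.strip()}
    return [name for name in names if name not in described]


names = ["adjust_profiler_step", "profile_all_threads", "capture_overload_names"]

# Abbreviated signature line followed by the truncated docstring body.
truncated = (
    "__init__(self, ...) -> None\n"
    "\n"
    "    capture_overload_names (bool) : whether to include ATen overload names in the profile\n"
)

print(missing_params(truncated, names))
# ['adjust_profiler_step', 'profile_all_threads']
```

With PyTorch installed, the same helper could be pointed at `torch._C._profiler._ExperimentalConfig.__init__.__doc__`; an empty result would indicate that no description was dropped.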