[pytorch] fix hipify_python (#70619)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70619

This diff fixes two kernel-launch parsing problems in `hipify_python`, the script that translates CUDA sources to HIP for AMD GPU builds.

Change 1:
```
if (c == "," or ind == len(kernel_string) - 1) and closure == 0:
```
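This replaces the former `elif`, so the end-of-string check also runs when the last character is a parenthesis. It is needed to handle the following case (e.g. https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/test/cuda_vectorized_test.cu#L111):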
```
kernel<<<val, func()>>>(...)
// In this case, kernel_string is "val, func()",
// so closure returns to 0 only when ind == len(kernel_string) - 1.
```
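
To illustrate Change 1, here is a minimal sketch of the argument-splitting loop. `split_launch_args` and its `use_fix` flag are hypothetical names introduced for this illustration only; the real logic lives in `add_dim3` and differs in detail.

```python
def split_launch_args(kernel_string, use_fix=True):
    """Return the first two top-level launch arguments of a <<<...>>> body.

    Hypothetical, simplified stand-in for the loop in add_dim3.
    use_fix=False mimics the old `elif` behavior.
    """
    kernel_string = kernel_string.strip()
    arg_locs = [{"start": 0}, {}]
    count = 0
    closure = 0  # current parenthesis nesting depth
    for ind, c in enumerate(kernel_string):
        if c == "(":
            closure += 1
        elif c == ")":
            closure -= 1
        at_boundary = c == "," or ind == len(kernel_string) - 1
        # Old code used `elif`, so a parenthesis character skipped this check.
        check_runs = use_fix or c not in "()"
        if check_runs and at_boundary and closure == 0:
            arg_locs[count]["end"] = ind + (c != ",")
            count += 1
            if count < 2:
                arg_locs[count]["start"] = ind + 1
            else:
                break
    return [kernel_string[a["start"]:a["end"]].strip()
            for a in arg_locs if "end" in a]


print(split_launch_args("val, func()"))                 # both arguments found
print(split_launch_args("val, func()", use_fix=False))  # second argument never closed
```

With `use_fix=False`, the final `)` is consumed by the parenthesis branch and the second argument never receives an end position, which is the failure this change addresses.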

Change 2:
```
mask_comments()
```
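`mask_comments()` replaces comments and string literals with "x" characters before kernel launches are located, so character positions stay the same. This is needed to handle "<<<" appearing in a comment or a string literal (e.g. https://github.com/pytorch/pytorch/blob/master/torch/csrc/deploy/interpreter/builtin_registry.cpp#L71):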
```
abc = "<<<XYZ>>>"
// Although this <<<XYZ>>> is unrelated to CUDA kernels,
// the script before this patch tries to hipify it and fails.
```
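
As a rough demonstration of Change 2, the snippet below copies the `mask_comments` body from this diff into a standalone function and runs it on a small input. The sample source string is made up for illustration.

```python
def mask_comments(string):
    """Replace comments and string literals with 'x', keeping positions."""
    in_comment = ''
    prev_c = ''
    new_string = ''
    for c in string:
        if in_comment == '':
            # Outside comments
            if c == '/' and prev_c == '/':
                in_comment = '//'
            elif c == '*' and prev_c == '/':
                in_comment = '/*'
            elif c == '"' and prev_c != '\\' and prev_c != "'":
                in_comment = '"'
        elif in_comment == '//':
            # In // xxx
            if c == '\r' or c == '\n':
                in_comment = ''
        elif in_comment == '/*':
            # In /* xxx */
            if c == '/' and prev_c == '*':
                in_comment = ''
        elif in_comment == '"':
            # In ""
            if c == '"' and prev_c != '\\':
                in_comment = ''
        prev_c = c
        if in_comment == '':
            new_string += c
        else:
            new_string += 'x'
    return new_string


# Made-up input: one string literal, one real launch, one trailing comment.
src = 'abc = "<<<XYZ>>>";\nkernel<<<grid, block>>>(x);  // not <<<this>>>\n'
masked = mask_comments(src)
print(masked)
```

Because the masked string has the same length as the input, positions found in it can be applied directly to the original source, which is how the patched `processKernelLaunches` uses it.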

Test Plan:
This patch fixes the errors I encountered when running
```
python3 tools/amd_build/build_amd.py
```

I confirmed, with Linux `diff`, that this patch does not change HIP code that was generated successfully with the original script.
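
A comparable check can be sketched in Python. This is not the procedure used for this patch (that was Linux `diff`); `changed_files` and the directory names in the usage comment are hypothetical.

```python
import filecmp
import os


def changed_files(old_dir, new_dir):
    """Return relative paths that differ or exist in only one of the trees."""
    rel_paths = set()
    for root in (old_dir, new_dir):
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                full = os.path.join(dirpath, name)
                rel_paths.add(os.path.relpath(full, root))
    changed = []
    for rel in sorted(rel_paths):
        old_path = os.path.join(old_dir, rel)
        new_path = os.path.join(new_dir, rel)
        if not (os.path.isfile(old_path) and os.path.isfile(new_path)):
            changed.append(rel)  # present on one side only
        elif not filecmp.cmp(old_path, new_path, shallow=False):
            changed.append(rel)  # byte contents differ
    return changed


# Hypothetical usage, with output trees saved before and after the patch:
# print(changed_files("hip_before", "hip_after"))
```

An empty result for files that the original script already handled would match the claim above.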

Reviewed By: hyuen

Differential Revision: D33407743

fbshipit-source-id: bec822e040a154be4cda1c294536792ca8d596ae
Shintaro Iwasaki
2022-01-06 13:24:07 -08:00
committed by Facebook GitHub Bot
parent 9c455d7086
commit 4fa70a2483

@@ -247,7 +247,7 @@ def add_dim3(kernel_string, cuda_kernel):
             closure += 1
         elif c == ")":
             closure -= 1
-        elif (c == "," or ind == len(kernel_string) - 1) and closure == 0:
+        if (c == "," or ind == len(kernel_string) - 1) and closure == 0:
             arg_locs[count]['end'] = ind + (c != ",")
             count += 1
             if count < 2:
@@ -356,8 +356,43 @@ def processKernelLaunches(string, stats):

         return kernel_positions

+    # Replace comments and string literals from the code so that find_kernel_bounds does not
+    # wrongly capture kernels in comments and string literals.
+    # This function replaces them with "x" to keep positions.
+    def mask_comments(string):
+        in_comment = ''
+        prev_c = ''
+        new_string = ''
+        for c in string:
+            if in_comment == '':
+                # Outside comments
+                if c == '/' and prev_c == '/':
+                    in_comment = '//'
+                elif c == '*' and prev_c == '/':
+                    in_comment = '/*'
+                elif c == '"' and prev_c != '\\' and prev_c != "'":
+                    in_comment = '"'
+            elif in_comment == '//':
+                # In // xxx
+                if c == '\r' or c == '\n':
+                    in_comment = ''
+            elif in_comment == '/*':
+                # In /* xxx */
+                if c == '/' and prev_c == '*':
+                    in_comment = ''
+            elif in_comment == '"':
+                # In ""
+                if c == '"' and prev_c != '\\':
+                    in_comment = ''
+            prev_c = c
+            if in_comment == '':
+                new_string += c
+            else:
+                new_string += 'x'
+        return new_string
+
     # Grab positional ranges of all kernel launches
-    get_kernel_positions = list(find_kernel_bounds(string))
+    get_kernel_positions = list(find_kernel_bounds(mask_comments(string)))
     output_string = string

     # Replace each CUDA kernel with a HIP kernel.