Commit Graph

17 Commits

Author SHA1 Message Date
2c5ed6e7c0 Revert "[2/N] Fix clang-tidy readability checks (#164652)"
This reverts commit 3c5ca685d6f5b6f3971c0cd20a054aa355610419.

Reverted https://github.com/pytorch/pytorch/pull/164652 on behalf of https://github.com/izaitsevfb due to need to revert due to a conflict with revert of https://github.com/pytorch/pytorch/pull/162659 ([comment](https://github.com/pytorch/pytorch/pull/164652#issuecomment-3369346707))
2025-10-05 21:36:57 +00:00
3c5ca685d6 [2/N] Fix clang-tidy readability checks (#164652)
This PR applies clang-tidy readability checks to jit sources and all headers in the code base.
`readability-redundant-inline-specifier` is suppressed because it incurs too many changes. `readability-redundant-inline-specifier` is used to detect redundant inline specifiers on function and variable declarations. There are many in-class method definitions that are marked inline.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164652
Approved by: https://github.com/Skylion007
2025-10-05 07:05:11 +00:00
cyy
2f30473fba [19/N] Fix clang-tidy warnings in jit (#133067)
Follows  #132963
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133067
Approved by: https://github.com/Skylion007
2024-08-13 15:59:43 +00:00
cyy
8967d55b01 [18/N] Fix clang-tidy warnings in jit (#132963)
Follows #132753

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132963
Approved by: https://github.com/Skylion007
2024-08-09 01:27:32 +00:00
cyy
cd99cdc3af fix std::move warnings from gcc (#105780)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105780
Approved by: https://github.com/Skylion007
2023-09-22 05:55:21 +00:00
a229e78544 [BE] Enforce sign-compare (#96723)
Number of OSS PR were reverted, because new signed-unsigned comparison warnings, which are treated as errors in some internal builds.
Not sure how those selective rules are applied, but this PR removes `-Wno-sign-compare` from PyTorch codebase.

The only tricky part in this PR, as making sure that non-ASCII character detection works for both signed and unsigned chars  here:
6e3d51b08a/torch/csrc/jit/serialization/python_print.cpp (L926)

Exclude several files from sign-compare if flash attention is used, due to the violation in cutlass, to be fixed by https://github.com/NVIDIA/cutlass/pull/869
Do not try to fix sign compare violations in caffe2 codebase
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96723
Approved by: https://github.com/albanD
2023-03-15 06:04:20 +00:00
a38e43e936 [perf][1/5] Replace IValue::toString()->string() with IValue::toStringRef() (#85437)
Summary: `IValue::toString()` creates a `new c10::intrusive_ptr` (like `std::shared_ptr`) and `->string()` immediately accesses it, creating an atomic reference increment/decrement. We can skip both of these operations by calling `IValue::toStringRef()`.

Test Plan: CI

Reviewed By: jaybean-dev

Differential Revision: D39605242

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85437
Approved by: https://github.com/jfix71
2022-09-23 23:36:57 +00:00
edea59acf8 [Easy][PyTorchEdge] Fix unused variable build error (#74117)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74117

Fix unused variable "code" build error
ghstack-source-id: 152356508

Test Plan: CI

Reviewed By: cccclai

Differential Revision: D34825166

fbshipit-source-id: 35e5924a356399ab80a0c9501920070251658991
(cherry picked from commit 55d302c0ceaf46c33f31657c8725b13e61c1d306)
2022-03-28 21:56:24 +00:00
103fc5f9a5 Remove unused variable (#70261)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70261

ghstack-source-id: 146310591

Test Plan:
```
buck test  fbsource//xplat/caffe2:for_each_prod_ptl_model_test
```

{gif:p014gzft}

Reviewed By: iseeyuan

Differential Revision: D33265656

fbshipit-source-id: 6e303ee304064a61383ba2ae34f2e21077ec9db3
2021-12-28 22:21:29 -08:00
d459e79500 [jit][edge] Remove usage of shared_ptr<mobile::Code>. (#68037)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68037

Right now mobile::Code doesn't outlive its enclosing Function, and all accesses to Code happens inside interpreter loop which doesn't outlive the module, so we don't need to use std::shared_ptr here. This also should saves us 1-2 KB for binary size, because shared_ptr seems to bloat on arm64 android.
ghstack-source-id: 145818696

Test Plan: eyes.

Reviewed By: qihqi, tugsbayasgalan

Differential Revision: D32264616

fbshipit-source-id: d83f538d6604cf75fd7728a25127b4849ce7ab2a
2021-12-16 13:11:46 -08:00
408283319a [Operator Versioning][Edge] Change OP to CALL when there is a valid upgrader (#67731)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67731

1. Register upgrader function at loading stage
2. Change OP to CALL when there operator_version from model is smaller than current runtime version and there exists a valid upgrader

The interpreter log is :
```
RUNNING 0 STOREN 1 3
RUNNING 1 DROPR 1
RUNNING 2 LOAD 2
RUNNING 3 LOAD 3
RUNNING 4 CALL 0
RUNNING 0 STOREN 1 2
RUNNING 1 LOAD 1
RUNNING 2 OP 0, aten::is_floating_point
RUNNING 3 JF 3
RUNNING 4 LOADC 1
RUNNING 5 JMP 3
RUNNING 8 STORE 3
RUNNING 9 MOVE 3
RUNNING 10 JF 5
RUNNING 11 LOAD 1
RUNNING 12 LOAD 2
RUNNING 13 OP 1, aten::div.Tensor
RUNNING 14 JMP 5
RUNNING 19 STORE 4
RUNNING 20 DROPR 2
RUNNING 21 DROPR 1
RUNNING 22 MOVE 4
RUNNING 23 RET
RUNNING 5 LOAD 2
RUNNING 6 LOAD 3
RUNNING 7 CALL 0
RUNNING 0 STOREN 1 2
RUNNING 1 LOAD 1
RUNNING 2 OP 0, aten::is_floating_point
RUNNING 3 JF 3
RUNNING 4 LOADC 1
RUNNING 5 JMP 3
RUNNING 8 STORE 3
RUNNING 9 MOVE 3
RUNNING 10 JF 5
RUNNING 11 LOAD 1
RUNNING 12 LOAD 2
RUNNING 13 OP 1, aten::div.Tensor
RUNNING 14 JMP 5
RUNNING 19 STORE 4
RUNNING 20 DROPR 2
RUNNING 21 DROPR 1
RUNNING 22 MOVE 4
RUNNING 23 RET
RUNNING 8 MOVE 2
RUNNING 9 MOVE 3
RUNNING 10 CALL 0
RUNNING 0 STOREN 1 2
RUNNING 1 LOAD 1
RUNNING 2 OP 0, aten::is_floating_point
RUNNING 3 JF 3
RUNNING 4 LOADC 1
RUNNING 5 JMP 3
RUNNING 8 STORE 3
RUNNING 9 MOVE 3
RUNNING 10 JF 5
RUNNING 11 LOAD 1
RUNNING 12 LOAD 2
RUNNING 13 OP 1, aten::div.Tensor
RUNNING 14 JMP 5
RUNNING 19 STORE 4
RUNNING 20 DROPR 2
RUNNING 21 DROPR 1
RUNNING 22 MOVE 4
RUNNING 23 RET
RUNNING 11 TUPLE_CONSTRUCT 3
RUNNING 12 RET
```

The upgrader bytecode is:
```
(STOREN, 1, 2)
(LOAD, 1, 0)
(OP, 0, 0)
(JF, 3, 0)
(LOADC, 1, 0)
(JMP, 3, 0)
(LOAD, 2, 0)
(OP, 0, 0)
(STORE, 3, 0)
(MOVE, 3, 0)
(JF, 5, 0)
(LOAD, 1, 0)
(LOAD, 2, 0)
(OP, 1, 0)
(JMP, 5, 0)
(LOAD, 1, 0)
(LOAD, 2, 0)
(LOADC, 0, 0)
(OP, 2, 0)
(STORE, 4, 0)
(DROPR, 2, 0)
(DROPR, 1, 0)
(MOVE, 4, 0)
(RET, 0, 0)
```
ghstack-source-id: 145635622

Test Plan: describe in summary and CI

Reviewed By: iseeyuan

Differential Revision: D32092517

fbshipit-source-id: 0314b4bda5d2578cdd4e7cfbfd1e3c07fbccf8a3
2021-12-14 19:13:12 -08:00
82f7f8d471 [PyTorch] Adopt IValue::toTupleRef() where obvious (#65505)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65505

Generated with

`fastmod -m 'toTuple\(\)(\s*)->' 'toTupleRef()${1}.'`

, followed by

`fastmod '(std::move\(.*)toTupleRef\(\).' '${1}toTuple()->'`

to unbreak 2 callsites.
ghstack-source-id: 142065835

Test Plan: CI

Reviewed By: gchanan

Differential Revision: D31131025

fbshipit-source-id: 54457ae5bbeb38db9c7f196d469b98521c3d3f34
2021-11-02 10:22:18 -07:00
5f58764d1d [PyTorch Edge][type] Add type support for NamedTuple custom class (import) (#63130)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63130

Extend `type_parser` to handle `NamedTuple` type. It can be extended to handle other types when needed. The custom type will follow the following format:
```
"qualified_named[
    NamedTuple, [
        [filed_name_1, field_type_1],
        [filed_name_2, field_type_2]
    ]
]"
```
For example:
```
"__torch__.base_models.sparse_nn.pytorch_preproc_types.PreprocOutputType[
    NamedTuple, [
        [float_features, Tensor],
        [id_list_features, List[Tensor]],
        [label,  Tensor],
        [weight, Tensor],
        ]
    ]"
```

For nested types, the order of type lists from type table should be:
```
std::string type_1 = “__torch__.C [
    NamedTuple, [
        [field_name_c_1, Tensor],
        [field_name_c_2, Tuple[Tensor, Tensor]],
    ]
]”

std::string type_2 = “__torch__.B [
   NamedTuple, [
       [field_name_b, __torch__.C ]
   ]
]”

std::string type_3 = “__torch__.A[
   NamedTuple, [
       [field_name_a, __torch__.B]
   ]
]”
std::vector<std::string> type_strs = {type_str_1, type_str_2, type_3};
std::vector<TypePtr> type_ptrs =  c10::parseType(type_strs);
```

namedtuple from both `collection` and `typing` are supported
```

from typing import NamedTuple
from collections import namedtuple
```

This change only adds the parser and now new runtime can read the above format.
ghstack-source-id: 141293658

Test Plan:
```
buck test mode/dev //caffe2/test/cpp/jit:jit -- --exact 'caffe2/test/cpp/jit:jit - LiteInterpreterTest.CompatiblePrimitiveType'
buck test mode/dev //caffe2/test/cpp/jit:jit -- --exact 'caffe2/test/cpp/jit:jit - LiteInterpreterTest.CompatibleCustomType'

buck test mode/dev //caffe2/test/cpp/jit:jit -- --exact 'caffe2/test/cpp/jit:jit - LiteInterpreterTest.InCompatiblePrimitiveType'
buck test mode/dev //caffe2/test/cpp/jit:jit -- --exact 'caffe2/test/cpp/jit:jit - LiteInterpreterTest.InCompatibleCustomType'
```

Reviewed By: iseeyuan

Differential Revision: D30261547

fbshipit-source-id: 68a9974338464e320b39a5c613dc048f6c5adeb5
2021-10-22 00:40:57 -07:00
e88d1c4f10 [PyTorch] Add tuple inline storage (#64066)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64066

I noticed a bunch of time being spent heap-allocating Tuples
in the unpickler. 1-, 2-, and 3-element Tuples are apparently common
enough that they get their own bytecode instructions, so I decided to
try also giving them their own representation. We store up to 3
IValues inline in `Tuple` rather than doing a second heap allocation
for a `std::vector<IValue>`.
ghstack-source-id: 140695395

Test Plan:
Added automated tests for TupleElements.

Pixel 3 before: https://www.internalfb.com/intern/aibench/details/761596366576284
Pixel 3 after: https://www.internalfb.com/intern/aibench/details/591414145082422
We went from 347 ms to 302 ms.

Reviewed By: dhruvbird

Differential Revision: D30592622

fbshipit-source-id: 93625c54c9dca5f765ef6d5c191944179cb281a8
2021-10-15 12:16:51 -07:00
176d3c6fb4 [PyTorch] Fix many Tuple::elements() callsites (#64065)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64065

It is only safe to mutate Tuple elements if you are the sole owner
of the tuple. The most efficient way to do this, then, is
`std::move(*std::move(tupleIValue).toTuple()).elements()` (the
innermost move allows `IValue::toTuple()` to avoid a refcount bump and
the outermost move allows the element vector to be moved out of the
tuple), but many callsites write simply
`tupleIValue.toTuple().elements()`, which incurs many extra refcount
bumps.

ghstack-source-id: 139468088

Test Plan: CI

Reviewed By: ezyang

Differential Revision: D30592621

fbshipit-source-id: e8312de866de09b9ea2a62e5128cbf403ee16f09
2021-10-01 11:36:05 -07:00
5d4efed83e [PyTorch] Add OpCode cache in ByteCodeDeserializer (#64110)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64110

As the code comment says, we can exploit pickler string interning to accelerate OpCode parsing. No more strcmp!
ghstack-source-id: 137978946

Test Plan:
Pixel 3 before: https://www.internalfb.com/intern/aibench/details/591414145082422
Pixel 3 after: https://www.internalfb.com/intern/aibench/details/484557404703261

new mean is 292 ms, down from 302 ms.

Reviewed By: dhruvbird

Differential Revision: D30615052

fbshipit-source-id: 9707625e778388a7920ab72704d71ad57ddaac17
2021-09-14 14:22:10 -07:00
30a7c768d7 [RFC] Modularize functions of parsing bytecode (#61862)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61862

Modularize functions of parsing bytecode tables so that they can be used as needed in situations other than mobile lite interpreter.
* The decoupled functions are re-used by current lite interpreter loader.
* The bytecode can be serialized/deserialized from other formats.
* The decoupled functions have minimum dependencies on other PyTorch components.

Next:
Build a driver binary to include the parser and interpreter, but only has necessary dependency on other PyTorch components.
ghstack-source-id: 137867287

Test Plan:
As an example, a simple bytecode is parsed to a mobile function, and directly run in the added unit test, `RunTimeTest:ParseBytecode`. It contains basic control flow (if, else) and basic data orchestration (list construction).
CI

Reviewed By: larryliu0820

Differential Revision: D29798382

Pulled By: iseeyuan

fbshipit-source-id: 1c173a5f5d37097e3a97baec3f3e48e1eea1400f
2021-09-11 22:24:05 -07:00