pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
Xuehai Pan	b020971e78	[BE] fix typos in torchgen/ (#156083) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156083 Approved by: https://github.com/jingsh ghstack dependencies: #156079, #156082	2025-06-17 19:25:50 +00:00
cyy	55250b324d	[1/N] Apply py39 ruff fixes (#138578) Pull Request resolved: https://github.com/pytorch/pytorch/pull/138578 Approved by: https://github.com/Skylion007	2024-12-02 21:46:18 +00:00
Xuehai Pan	9120992c72	[BE][Easy] enable postponed annotations in `torchgen` (#129376) Pull Request resolved: https://github.com/pytorch/pytorch/pull/129376 Approved by: https://github.com/ezyang ghstack dependencies: #129375	2024-06-29 09:23:39 +00:00
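For context on the entry above: "postponed annotations" refers to `from __future__ import annotations` (PEP 563), which stores annotations as strings instead of evaluating them when a function or class is defined. The sketch below is illustrative only; the `Node` class and its `prepend` method are hypothetical and are not taken from `torchgen`.

```python
from __future__ import annotations  # PEP 563: annotations are stored as strings


class Node:
    """Hypothetical linked-list node, used only to illustrate the import."""

    def __init__(self, value: int, next_node: Node | None = None) -> None:
        # `Node | None` is not evaluated at definition time, so the class can
        # refer to itself here without quoting the name.
        self.value = value
        self.next_node = next_node

    def prepend(self, value: int) -> Node:
        """Return a new head node that points at this one."""
        return Node(value, self)


head = Node(1).prepend(2)
print(head.value, head.next_node.value)
# The stored annotation is the source text, not the class object.
print(repr(Node.prepend.__annotations__["return"]))
```

Without the `__future__` import, the unquoted `Node` in the signatures would be looked up while the class body is still executing and would fail with a `NameError`.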
PyTorch MergeBot	6063bb9d45	Revert "[BE][Easy] enable postponed annotations in `torchgen` (#129376)" This reverts commit 494057d6d4e9b40daf81a6a4d7a8c839b7424b14. Reverted https://github.com/pytorch/pytorch/pull/129376 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I need to revert to cleanly revert https://github.com/pytorch/pytorch/pull/129374, please do a rebase and reland this ([comment](https://github.com/pytorch/pytorch/pull/129375#issuecomment-2197800541))	2024-06-29 00:44:25 +00:00
Xuehai Pan	494057d6d4	[BE][Easy] enable postponed annotations in `torchgen` (#129376) Pull Request resolved: https://github.com/pytorch/pytorch/pull/129376 Approved by: https://github.com/ezyang ghstack dependencies: #129375	2024-06-28 15:37:57 +00:00
Xuehai Pan	b697808056	[BE][Easy] eliminate relative import in `torchgen` (#128872) Fix generated by: ```bash ruff check --config 'lint.flake8-tidy-imports.ban-relative-imports="all"' --fix --select=TID $(fd '.pyi?$' torchgen) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128872 Approved by: https://github.com/zou3519	2024-06-21 14:11:46 +00:00
Jane Xu	601c5e085d	Add _foreach_max (#127187) This PR adds _foreach_max support, the second reduction foreach op we have :D I did have to change the autogen slightly for foreach. I can promise that the existing foreach ops' derivative behavior has not changed as I've added a skip list for the harder requirement I am setting (that the arg list should match in length). I needed to add this requirement as there is another wrong max (the one that does take in a dim for reduction) that keeps getting matched first. Caveats! - We do not fast path if the shapes, dtypes, device, the regular shebang for foreach are not met. We fall back to slowpath! - MORE IMPORTANTLY, we also do not fast path for int8 and int16 and bool, but that's really a skill issue on my end as I've hardcoded -INFINITY into the CUDA kernels, and -INFINITY is not defined for small ints. It'd be nice to know how to do this properly, but that work can also come later. - This does NOT support empty Tensors in the list, because the original max op also does not support empty Tensors. ~I think this should be allowed though, and this PR may come later.~ I understand why this is not allowed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127187 Approved by: https://github.com/albanD	2024-05-29 19:08:58 +00:00
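The `-INFINITY` caveat in the entry above comes down to the identity element of a max reduction: floating-point types have a negative infinity, while fixed-width integers do not. A minimal pure-Python sketch of one common workaround, seeding the reduction with the type's lowest representable value, follows. It is an illustration under stated assumptions and not what the PR's CUDA kernels do; `LOWEST` and `max_reduce` are hypothetical names.

```python
import math

# Identity element for max: the smallest value the element type can hold.
# Floats have -inf; fixed-width signed integers only have a lowest finite value.
LOWEST = {
    "float32": -math.inf,
    "int8": -(2**7),
    "int16": -(2**15),
}


def max_reduce(values, dtype):
    """Max-reduce a non-empty list, seeded with the dtype's identity element."""
    if not values:
        # Mirrors the commit's note that empty tensors are not supported.
        raise ValueError("max of an empty sequence is undefined")
    result = LOWEST[dtype]
    for v in values:
        if v > result:
            result = v
    return result


print(max_reduce([-128, -5, -77], "int8"))
print(max_reduce([-3.5, -math.inf], "float32"))
```

Seeding with the lowest value keeps the result inside the element type's range, which an infinity seed cannot do for integer types.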
Jane Xu	deec2380c7	Add 0dim Tensor overload for _foreach_div (#113688) This PR is ALMOST basically just following the steps from #106677 EXCEPT! We do add one feature. Similar to fused_adam(w), for the CUDA dispatches: when the scalar tensor is on CPU, we .item and redispatch to the normal scalar overload. Otherwise, the cuda kernel will complain about mismatch in devices between the scalar and the tensors. Why do we add this feature? Our optimizers want to allow lr as a tensor, and lr could be a CPU tensor. lr is used with foreach_div_ in Adam, so our CI will break otherwise. After this PR, `_foreach_mul` and `_foreach_div` will accept either a CPU or a GPU tensor for the scalar tensor (vs only a GPU tensor). They join the ranks of `fused_adam(w)` in this characteristic. I did not yet do the same thing for foreach_add (the only other foreach op with a .Tensor overload) because there is no use case and will be more involved. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113688 Approved by: https://github.com/mlazos, https://github.com/albanD	2023-11-15 20:59:32 +00:00
Jane Xu	ca7d084ff9	Add ScalarTensor or 0dim overload for _foreach_add (#111079) Adding a Tensor overload will allow us to: - optimize in more cases than before - increase coverage for scalarTensor instead of just scalars in our foreach APIs The main complication in this PR was that add.Tensor has a scalar overload, so I've now built out support for that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111079 Approved by: https://github.com/albanD	2023-10-20 01:34:07 +00:00
Masaki Kozuki	5814380e7b	Revert "Revert "Reland "Add forward mode AD to out-place foreach functions (#102409) (#106043)""" (#106320) Fixed a typo specifying the number of tensors and elements in the test having failed in slow gradcheck Pull Request resolved: https://github.com/pytorch/pytorch/pull/106320 Approved by: https://github.com/soulitzer	2023-08-18 23:01:42 +00:00
PyTorch MergeBot	2b427ae3a7	Revert "Reland "Add forward mode AD to out-place foreach functions (#102409) (#106043)" This reverts commit e773f28ee307e2a246a4b765f3a51117661b45ba. Reverted https://github.com/pytorch/pytorch/pull/106043 on behalf of https://github.com/DanilBaibak due to Break slow tests ([comment](https://github.com/pytorch/pytorch/pull/106043#issuecomment-1658642734))	2023-07-31 15:50:36 +00:00
Masaki Kozuki	e773f28ee3	Reland "Add forward mode AD to out-place foreach functions (#102409) (#106043) forward-mode AD of out-of-place foreach functions, finally. rel: - #102409 - #105504 - #58833 - #100695 --- # Generated Foreach ```c++ ::std::vector<at::Tensor> _foreach_sinh(c10::DispatchKeySet ks, at::TensorList self) { auto self_ = unpack(self, "self", 0); [[maybe_unused]] auto _any_requires_grad = compute_requires_grad( self ); std::vector<bool> _any_has_forward_grad_result(self.size()); for (const auto& i : c10::irange(self.size())) { _any_has_forward_grad_result[i] = isFwGradDefined(self[i]); } std::shared_ptr<ForeachSinhBackward0> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<ForeachSinhBackward0>(new ForeachSinhBackward0(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->self_ = make_saved_variable_list(self); grad_fn->self_size_ = self.size(); } #ifndef NDEBUG std::vector<c10::optional<Storage>> self__storage_saved(self_.size()); for (const Tensor& tensor : self_) self__storage_saved.push_back( tensor.has_storage() ?
c10::optional<Storage>(tensor.storage()) : c10::nullopt); std::vector<c10::intrusive_ptr<TensorImpl>> self__impl_saved(self_.size()); for (size_t i=0; i<self_.size(); i++) if (self_[i].defined()) self__impl_saved[i] = self_[i].getIntrusivePtr(); #endif auto _tmp = ([&]() { at::AutoDispatchBelowADInplaceOrView guard; return at::redispatch::_foreach_sinh(ks & c10::after_autograd_keyset, self_); })(); auto result = std::move(_tmp); #ifndef NDEBUG for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__storage_saved[i].has_value() && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__storage_saved[i].value().is_alias_of(self_[i].storage())); } for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__impl_saved[i] && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__impl_saved[i] == self_[i].getIntrusivePtr()); } #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } std::vector<c10::optional<at::Tensor>> result_new_fw_grad_opts(self.size(), c10::nullopt); for (const auto& i : c10::irange(result_new_fw_grad_opts.size())) { if (_any_has_forward_grad_result[i]) { auto self_t_raw = toNonOptFwGrad(self[i]); auto self_tensor = toNonOptTensor(self[i]); auto self_t = (self_t_raw.defined() || !self_tensor.defined()) ? self_t_raw : at::_efficientzerotensor(self_tensor.sizes(), self_tensor.options()); auto self_p = toNonOptPrimal(self[i]); result_new_fw_grad_opts[i] = (self_t.conj() * self_p.cosh().conj()).conj(); } } for (const auto& i : c10::irange(result_new_fw_grad_opts.size())) { auto& result_new_fw_grad_opt = result_new_fw_grad_opts[i]; if (result_new_fw_grad_opt.has_value() && result_new_fw_grad_opt.value().defined() && result[i].defined()) { // The hardcoded 0 here will need to be updated once we support multiple levels.
result[i]._set_fw_grad(result_new_fw_grad_opt.value(), /* level */ 0, /* is_inplace_op */ false); } } return result; } ::std::vector<at::Tensor> _foreach_norm_Scalar(c10::DispatchKeySet ks, at::TensorList self, const at::Scalar & ord) { auto self_ = unpack(self, "self", 0); [[maybe_unused]] auto _any_requires_grad = compute_requires_grad( self ); std::vector<bool> _any_has_forward_grad_result(self.size()); for (const auto& i : c10::irange(self.size())) { _any_has_forward_grad_result[i] = isFwGradDefined(self[i]); } std::shared_ptr<ForeachNormBackward0> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<ForeachNormBackward0>(new ForeachNormBackward0(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->ord = ord; grad_fn->self_ = make_saved_variable_list(self); grad_fn->self_size_ = self.size(); } #ifndef NDEBUG std::vector<c10::optional<Storage>> self__storage_saved(self_.size()); for (const Tensor& tensor : self_) self__storage_saved.push_back( tensor.has_storage() ?
c10::optional<Storage>(tensor.storage()) : c10::nullopt); std::vector<c10::intrusive_ptr<TensorImpl>> self__impl_saved(self_.size()); for (size_t i=0; i<self_.size(); i++) if (self_[i].defined()) self__impl_saved[i] = self_[i].getIntrusivePtr(); #endif auto _tmp = ([&]() { at::AutoDispatchBelowADInplaceOrView guard; return at::redispatch::_foreach_norm(ks & c10::after_autograd_keyset, self_, ord); })(); auto result = std::move(_tmp); #ifndef NDEBUG for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__storage_saved[i].has_value() && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__storage_saved[i].value().is_alias_of(self_[i].storage())); } for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__impl_saved[i] && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__impl_saved[i] == self_[i].getIntrusivePtr()); } #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } std::vector<c10::optional<at::Tensor>> result_new_fw_grad_opts(self.size(), c10::nullopt); for (const auto& i : c10::irange(result_new_fw_grad_opts.size())) { if (_any_has_forward_grad_result[i]) { auto self_t_raw = toNonOptFwGrad(self[i]); auto self_tensor = toNonOptTensor(self[i]); auto self_t = (self_t_raw.defined() || !self_tensor.defined()) ? self_t_raw : at::_efficientzerotensor(self_tensor.sizes(), self_tensor.options()); auto self_p = toNonOptPrimal(self[i]); result_new_fw_grad_opts[i] = norm_jvp(self_p, self_t, ord, result[i]); } } for (const auto& i : c10::irange(result_new_fw_grad_opts.size())) { auto& result_new_fw_grad_opt = result_new_fw_grad_opts[i]; if (result_new_fw_grad_opt.has_value() && result_new_fw_grad_opt.value().defined() && result[i].defined()) { // The hardcoded 0 here will need to be updated once we support multiple levels.
result[i]._set_fw_grad(result_new_fw_grad_opt.value(), /* level */ 0, /* is_inplace_op */ false); } } if (grad_fn) { grad_fn->result = result; } return result; } ``` # Reference ```c++ at::Tensor sinh(c10::DispatchKeySet ks, const at::Tensor & self) { auto& self_ = unpack(self, "self", 0); [[maybe_unused]] auto _any_requires_grad = compute_requires_grad( self ); [[maybe_unused]] auto _any_has_forward_grad_result = (isFwGradDefined(self)); std::shared_ptr<SinhBackward0> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<SinhBackward0>(new SinhBackward0(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->self_ = SavedVariable(self, false); } #ifndef NDEBUG c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt; c10::intrusive_ptr<TensorImpl> self__impl_saved; if (self_.defined()) self__impl_saved = self_.getIntrusivePtr(); #endif auto _tmp = ([&]() { at::AutoDispatchBelowADInplaceOrView guard; return at::redispatch::sinh(ks & c10::after_autograd_keyset, self_); })(); auto result = std::move(_tmp); #ifndef NDEBUG if (self__storage_saved.has_value() && !at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__storage_saved.value().is_alias_of(self_.storage())); if (self__impl_saved && !at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__impl_saved == self_.getIntrusivePtr()); if (result.has_storage() && !at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(result)) { TORCH_INTERNAL_ASSERT(result.storage().use_count() == 1, "function: sinh"); } if (!at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(result)) TORCH_INTERNAL_ASSERT(result.use_count() <= 1, "function: sinh"); #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } c10::optional<at::Tensor> result_new_fw_grad_opt = c10::nullopt; if (_any_has_forward_grad_result
&& (result.defined())) { auto self_t_raw = toNonOptFwGrad(self); auto self_tensor = toNonOptTensor(self); auto self_t = (self_t_raw.defined() || !self_tensor.defined()) ? self_t_raw : at::_efficientzerotensor(self_tensor.sizes(), self_tensor.options()); auto self_p = toNonOptPrimal(self); result_new_fw_grad_opt = (self_t.conj() * self_p.cosh().conj()).conj(); } if (result_new_fw_grad_opt.has_value() && result_new_fw_grad_opt.value().defined() && result.defined()) { // The hardcoded 0 here will need to be updated once we support multiple levels. result._set_fw_grad(result_new_fw_grad_opt.value(), /* level */ 0, /* is_inplace_op */ false); } return result; } at::Tensor norm_Scalar(c10::DispatchKeySet ks, const at::Tensor & self, const at::Scalar & p) { auto& self_ = unpack(self, "self", 0); [[maybe_unused]] auto _any_requires_grad = compute_requires_grad( self ); [[maybe_unused]] auto _any_has_forward_grad_result = (isFwGradDefined(self)); std::shared_ptr<NormBackward0> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<NormBackward0>(new NormBackward0(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->p = p; grad_fn->self_ = SavedVariable(self, false); } #ifndef NDEBUG c10::optional<Storage> self__storage_saved = self_.has_storage() ?
c10::optional<Storage>(self_.storage()) : c10::nullopt; c10::intrusive_ptr<TensorImpl> self__impl_saved; if (self_.defined()) self__impl_saved = self_.getIntrusivePtr(); #endif auto _tmp = ([&]() { at::AutoDispatchBelowADInplaceOrView guard; return at::redispatch::norm(ks & c10::after_autograd_keyset, self_, p); })(); auto result = std::move(_tmp); #ifndef NDEBUG if (self__storage_saved.has_value() && !at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__storage_saved.value().is_alias_of(self_.storage())); if (self__impl_saved && !at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__impl_saved == self_.getIntrusivePtr()); if (result.has_storage() && !at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(result)) { TORCH_INTERNAL_ASSERT(result.storage().use_count() == 1, "function: norm_Scalar"); } if (!at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(result)) TORCH_INTERNAL_ASSERT(result.use_count() <= 1, "function: norm_Scalar"); #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } throw_error_for_complex_autograd(result, "norm"); c10::optional<at::Tensor> result_new_fw_grad_opt = c10::nullopt; if (_any_has_forward_grad_result && (result.defined())) { auto self_t_raw = toNonOptFwGrad(self); auto self_tensor = toNonOptTensor(self); auto self_t = (self_t_raw.defined() || !self_tensor.defined()) ? self_t_raw : at::_efficientzerotensor(self_tensor.sizes(), self_tensor.options()); auto self_p = toNonOptPrimal(self); result_new_fw_grad_opt = norm_jvp(self_p, self_t, p, result); } if (result_new_fw_grad_opt.has_value() && result_new_fw_grad_opt.value().defined() && result.defined()) { // The hardcoded 0 here will need to be updated once we support multiple levels.
result._set_fw_grad(result_new_fw_grad_opt.value(), /* level */ 0, /* is_inplace_op */ false); } if (grad_fn) { grad_fn->result_ = SavedVariable(result, true); } return result; } ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/106043 Approved by: https://github.com/soulitzer	2023-07-27 03:13:24 +00:00
Justin Chu	4cc1745b13	[BE] f-stringify torch/ and scripts (#105538) This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`. - https://docs.python.org/3/reference/lexical_analysis.html#f-strings - https://pypi.org/project/flynt/ Command used: ``` flynt torch/ -ll 120 flynt scripts/ -ll 120 flynt tools/ -ll 120 ``` and excluded `collect_env.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538 Approved by: https://github.com/ezyang, https://github.com/malfet	2023-07-21 19:35:24 +00:00
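As a small illustration of the kind of rewrite the entry above describes, the sketch below puts a `%`-format call and a `str.format` call next to the equivalent f-string. The variable names are made up for this example and do not come from the PyTorch sources.

```python
name = "torchgen"
count = 3

# Older formatting styles of the kind an f-string conversion replaces.
percent_style = "%s has %d files" % (name, count)
format_style = "{} has {} files".format(name, count)

# Equivalent f-string: the expressions sit inline in the literal.
fstring_style = f"{name} has {count} files"

print(fstring_style)
```

All three expressions build the same string; the f-string simply keeps each value next to the place where it is rendered.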
PyTorch MergeBot	8958f041be	Revert "Add forward mode AD to out-place foreach functions (#102409)" This reverts commit e2ec0ba404f9fbd3c215cad4cabd7383c692cb33. Reverted https://github.com/pytorch/pytorch/pull/102409 on behalf of https://github.com/huydhn due to Sorry for reverting your PR but it is failing some tests in trunk `e799f565eb` ([comment](https://github.com/pytorch/pytorch/pull/102409#issuecomment-1615254393))	2023-06-30 22:46:57 +00:00
Masaki Kozuki	e2ec0ba404	Add forward mode AD to out-place foreach functions (#102409) The major difference from in-place support is that some out-place functions have their derivatives spelled out in derivatives.yaml, which requires some changes in `load_derivatives.py` and some handlings in various places due to the others whose derivatives are generated by `torchgen`. rel: - #58833 - #100695 --- # Generated Foreach ```c++ ::std::vector<at::Tensor> _foreach_sinh(c10::DispatchKeySet ks, at::TensorList self) { auto self_ = unpack(self, "self", 0); [[maybe_unused]] auto _any_requires_grad = compute_requires_grad( self ); std::vector<bool> _any_has_forward_grad_result(self.size()); for (const auto& i : c10::irange(self.size())) { _any_has_forward_grad_result[i] = isFwGradDefined(self[i]); } std::shared_ptr<ForeachSinhBackward0> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<ForeachSinhBackward0>(new ForeachSinhBackward0(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->self_ = make_saved_variable_list(self); grad_fn->self_size_ = self.size(); } #ifndef NDEBUG std::vector<c10::optional<Storage>> self__storage_saved(self_.size()); for (const Tensor& tensor : self_) self__storage_saved.push_back( tensor.has_storage() ?
c10::optional<Storage>(tensor.storage()) : c10::nullopt); std::vector<c10::intrusive_ptr<TensorImpl>> self__impl_saved(self_.size()); for (size_t i=0; i<self_.size(); i++) if (self_[i].defined()) self__impl_saved[i] = self_[i].getIntrusivePtr(); #endif auto _tmp = ([&]() { at::AutoDispatchBelowADInplaceOrView guard; return at::redispatch::_foreach_sinh(ks & c10::after_autograd_keyset, self_); })(); auto result = std::move(_tmp); #ifndef NDEBUG for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__storage_saved[i].has_value() && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__storage_saved[i].value().is_alias_of(self_[i].storage())); } for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__impl_saved[i] && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__impl_saved[i] == self_[i].getIntrusivePtr()); } #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } std::vector<c10::optional<at::Tensor>> result_new_fw_grad_opts(self.size(), c10::nullopt); for (const auto& i : c10::irange(result_new_fw_grad_opts.size())) { if (_any_has_forward_grad_result[i]) { auto self_t_raw = toNonOptFwGrad(self[i]); auto self_tensor = toNonOptTensor(self[i]); auto self_t = (self_t_raw.defined() || !self_tensor.defined()) ? self_t_raw : at::_efficientzerotensor(self_tensor.sizes(), self_tensor.options()); auto self_p = toNonOptPrimal(self[i]); result_new_fw_grad_opts[i] = (self_t.conj() * self_p.cosh().conj()).conj(); } } for (const auto& i : c10::irange(result_new_fw_grad_opts.size())) { auto& result_new_fw_grad_opt = result_new_fw_grad_opts[i]; if (result_new_fw_grad_opt.has_value() && result_new_fw_grad_opt.value().defined() && result[i].defined()) { // The hardcoded 0 here will need to be updated once we support multiple levels.
result[i]._set_fw_grad(result_new_fw_grad_opt.value(), /* level */ 0, /* is_inplace_op */ false); } } return result; } ::std::vector<at::Tensor> _foreach_norm_Scalar(c10::DispatchKeySet ks, at::TensorList self, const at::Scalar & ord) { auto self_ = unpack(self, "self", 0); [[maybe_unused]] auto _any_requires_grad = compute_requires_grad( self ); std::vector<bool> _any_has_forward_grad_result(self.size()); for (const auto& i : c10::irange(self.size())) { _any_has_forward_grad_result[i] = isFwGradDefined(self[i]); } std::shared_ptr<ForeachNormBackward0> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<ForeachNormBackward0>(new ForeachNormBackward0(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->ord = ord; grad_fn->self_ = make_saved_variable_list(self); grad_fn->self_size_ = self.size(); } #ifndef NDEBUG std::vector<c10::optional<Storage>> self__storage_saved(self_.size()); for (const Tensor& tensor : self_) self__storage_saved.push_back( tensor.has_storage() ?
c10::optional<Storage>(tensor.storage()) : c10::nullopt); std::vector<c10::intrusive_ptr<TensorImpl>> self__impl_saved(self_.size()); for (size_t i=0; i<self_.size(); i++) if (self_[i].defined()) self__impl_saved[i] = self_[i].getIntrusivePtr(); #endif auto _tmp = ([&]() { at::AutoDispatchBelowADInplaceOrView guard; return at::redispatch::_foreach_norm(ks & c10::after_autograd_keyset, self_, ord); })(); auto result = std::move(_tmp); #ifndef NDEBUG for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__storage_saved[i].has_value() && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__storage_saved[i].value().is_alias_of(self_[i].storage())); } for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__impl_saved[i] && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__impl_saved[i] == self_[i].getIntrusivePtr()); } #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } std::vector<c10::optional<at::Tensor>> result_new_fw_grad_opts(self.size(), c10::nullopt); for (const auto& i : c10::irange(result_new_fw_grad_opts.size())) { if (_any_has_forward_grad_result[i]) { auto self_t_raw = toNonOptFwGrad(self[i]); auto self_tensor = toNonOptTensor(self[i]); auto self_t = (self_t_raw.defined() || !self_tensor.defined()) ? self_t_raw : at::_efficientzerotensor(self_tensor.sizes(), self_tensor.options()); auto self_p = toNonOptPrimal(self[i]); result_new_fw_grad_opts[i] = norm_jvp(self_p, self_t, ord, result[i]); } } for (const auto& i : c10::irange(result_new_fw_grad_opts.size())) { auto& result_new_fw_grad_opt = result_new_fw_grad_opts[i]; if (result_new_fw_grad_opt.has_value() && result_new_fw_grad_opt.value().defined() && result[i].defined()) { // The hardcoded 0 here will need to be updated once we support multiple levels.
result[i]._set_fw_grad(result_new_fw_grad_opt.value(), /* level */ 0, /* is_inplace_op */ false); } } if (grad_fn) { grad_fn->result = result; } return result; } ``` # Reference ```c++ at::Tensor sinh(c10::DispatchKeySet ks, const at::Tensor & self) { auto& self_ = unpack(self, "self", 0); [[maybe_unused]] auto _any_requires_grad = compute_requires_grad( self ); [[maybe_unused]] auto _any_has_forward_grad_result = (isFwGradDefined(self)); std::shared_ptr<SinhBackward0> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<SinhBackward0>(new SinhBackward0(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->self_ = SavedVariable(self, false); } #ifndef NDEBUG c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt; c10::intrusive_ptr<TensorImpl> self__impl_saved; if (self_.defined()) self__impl_saved = self_.getIntrusivePtr(); #endif auto _tmp = ([&]() { at::AutoDispatchBelowADInplaceOrView guard; return at::redispatch::sinh(ks & c10::after_autograd_keyset, self_); })(); auto result = std::move(_tmp); #ifndef NDEBUG if (self__storage_saved.has_value() && !at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__storage_saved.value().is_alias_of(self_.storage())); if (self__impl_saved && !at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__impl_saved == self_.getIntrusivePtr()); if (result.has_storage() && !at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(result)) { TORCH_INTERNAL_ASSERT(result.storage().use_count() == 1, "function: sinh"); } if (!at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(result)) TORCH_INTERNAL_ASSERT(result.use_count() <= 1, "function: sinh"); #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } c10::optional<at::Tensor> result_new_fw_grad_opt = c10::nullopt; if (_any_has_forward_grad_result
&& (result.defined())) { auto self_t_raw = toNonOptFwGrad(self); auto self_tensor = toNonOptTensor(self); auto self_t = (self_t_raw.defined() || !self_tensor.defined()) ? self_t_raw : at::_efficientzerotensor(self_tensor.sizes(), self_tensor.options()); auto self_p = toNonOptPrimal(self); result_new_fw_grad_opt = (self_t.conj() * self_p.cosh().conj()).conj(); } if (result_new_fw_grad_opt.has_value() && result_new_fw_grad_opt.value().defined() && result.defined()) { // The hardcoded 0 here will need to be updated once we support multiple levels. result._set_fw_grad(result_new_fw_grad_opt.value(), /* level */ 0, /* is_inplace_op */ false); } return result; } at::Tensor norm_Scalar(c10::DispatchKeySet ks, const at::Tensor & self, const at::Scalar & p) { auto& self_ = unpack(self, "self", 0); [[maybe_unused]] auto _any_requires_grad = compute_requires_grad( self ); [[maybe_unused]] auto _any_has_forward_grad_result = (isFwGradDefined(self)); std::shared_ptr<NormBackward0> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<NormBackward0>(new NormBackward0(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->p = p; grad_fn->self_ = SavedVariable(self, false); } #ifndef NDEBUG c10::optional<Storage> self__storage_saved = self_.has_storage() ?
c10::optional<Storage>(self_.storage()) : c10::nullopt; c10::intrusive_ptr<TensorImpl> self__impl_saved; if (self_.defined()) self__impl_saved = self_.getIntrusivePtr(); #endif auto _tmp = ([&]() { at::AutoDispatchBelowADInplaceOrView guard; return at::redispatch::norm(ks & c10::after_autograd_keyset, self_, p); })(); auto result = std::move(_tmp); #ifndef NDEBUG if (self__storage_saved.has_value() && !at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__storage_saved.value().is_alias_of(self_.storage())); if (self__impl_saved && !at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__impl_saved == self_.getIntrusivePtr()); if (result.has_storage() && !at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(result)) { TORCH_INTERNAL_ASSERT(result.storage().use_count() == 1, "function: norm_Scalar"); } if (!at::impl::dispatch_mode_enabled() && !at::impl::tensor_has_dispatch(result)) TORCH_INTERNAL_ASSERT(result.use_count() <= 1, "function: norm_Scalar"); #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } throw_error_for_complex_autograd(result, "norm"); c10::optional<at::Tensor> result_new_fw_grad_opt = c10::nullopt; if (_any_has_forward_grad_result && (result.defined())) { auto self_t_raw = toNonOptFwGrad(self); auto self_tensor = toNonOptTensor(self); auto self_t = (self_t_raw.defined() || !self_tensor.defined()) ? self_t_raw : at::_efficientzerotensor(self_tensor.sizes(), self_tensor.options()); auto self_p = toNonOptPrimal(self); result_new_fw_grad_opt = norm_jvp(self_p, self_t, p, result); } if (result_new_fw_grad_opt.has_value() && result_new_fw_grad_opt.value().defined() && result.defined()) { // The hardcoded 0 here will need to be updated once we support multiple levels.
result._set_fw_grad(result_new_fw_grad_opt.value(), /* level */ 0, /* is_inplace_op */ false); } if (grad_fn) { grad_fn->result_ = SavedVariable(result, true); } return result; } ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/102409 Approved by: https://github.com/soulitzer	2023-06-30 04:51:43 +00:00
Masaki Kozuki	ba2bc7df8f	Enable `backward` on `_foreach_zero_` (#101149) Currently torchgen cannot find an appropriate `DifferentiabilityInfo` for `_foreach_zero_` because `gen_foreach_derivativeinfo` doesn't correctly make use of `functional_info_by_signature` and `differentiability_infos`, and `is_reference_for_foreach` is a bit too strict for `_foreach_zero_`. Generated code in `VariableType` ```c++ void _foreach_zero_(c10::DispatchKeySet ks, at::TensorList self) { auto self_ = unpack(self, "self", 0); [[maybe_unused]] auto _any_requires_grad = compute_requires_grad( self ); std::vector<c10::optional<at::Tensor>> original_selfs(self.size()); std::vector<std::shared_ptr<ZeroBackward0>> grad_fns; if (_any_requires_grad) { for (const auto& i : c10::irange( self.size() )) { const auto ith_requires_grad = compute_requires_grad(self[i]); check_inplace(self[i], ith_requires_grad); grad_fns.push_back([&]() -> std::shared_ptr<ZeroBackward0> { if (!ith_requires_grad) { return nullptr; } else { auto grad_fn = std::shared_ptr<ZeroBackward0>(new ZeroBackward0(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self[i] )); return grad_fn; } }()); } } #ifndef NDEBUG std::vector<c10::optional<Storage>> self__storage_saved(self_.size()); for (const Tensor& tensor : self_) self__storage_saved.push_back( tensor.has_storage() ?
c10::optional<Storage>(tensor.storage()) : c10::nullopt); std::vector<c10::intrusive_ptr<TensorImpl>> self__impl_saved(self_.size()); for (size_t i=0; i<self_.size(); i++) if (self_[i].defined()) self__impl_saved[i] = self_[i].getIntrusivePtr(); #endif { at::AutoDispatchBelowAutograd guard; at::redispatch::_foreach_zero_(ks & c10::after_autograd_keyset, self_); } #ifndef NDEBUG for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__storage_saved[i].has_value() && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__storage_saved[i].value().is_alias_of(self_[i].storage())); } for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__impl_saved[i] && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__impl_saved[i] == self_[i].getIntrusivePtr()); } #endif if (!grad_fns.empty()) { auto differentiable_outputs = flatten_tensor_args( self ); TORCH_INTERNAL_ASSERT(differentiable_outputs.size() == grad_fns.size()); for (const auto& i : c10::irange(grad_fns.size())) { auto grad_fn = grad_fns[i]; if (grad_fn != nullptr) { rebase_history(differentiable_outputs[i], grad_fns[i]); } } } } ``` Rel: - #58833 - #96405 Pull Request resolved: https://github.com/pytorch/pytorch/pull/101149 Approved by: https://github.com/soulitzer	2023-05-17 03:10:13 +00:00
Masaki Kozuki	6c934a89a7	Skip invalid grads in outplace foreachs' backward (#100256 ) Fixes #100248 Pull Request resolved: https://github.com/pytorch/pytorch/pull/100256 Approved by: https://github.com/soulitzer, https://github.com/albanD	2023-04-29 22:45:26 +00:00
Masaki Kozuki	674018903d	per-Tensor `grad_fn` for in-place foreach functions (#96405 ) Generate a `grad_fn` for each (tuple of) `Tensor`(s) of the same index for `_foreach_foo_` and each `grad_fn` is `FooBackward`. The current status of foreach functions' backward support for the record: - out-place: Implemented, but no optimized implementations like their forward path - in-place: not implemented. I think this check `7eaaefafb3/torchgen/api/autograd.py (L309-L311)` is partly responsible but the difference of signature between out-place and in-place (see https://github.com/pytorch/pytorch/pull/96405#discussion_r1154690940) would prevent in-place from using out-place versions (the logic is around `7eaaefafb3/torchgen/api/autograd.py (L495-L500)`) ```c++ void _foreach_abs_(c10::DispatchKeySet ks, at::TensorList self) { auto self_ = unpack(self, "self", 0); #ifndef NDEBUG std::vector<c10::optional<Storage>> self__storage_saved(self_.size()); for (const Tensor& tensor : self_) self__storage_saved.push_back( tensor.has_storage() ? 
c10::optional<Storage>(tensor.storage()) : c10::nullopt); std::vector<c10::intrusive_ptr<TensorImpl>> self__impl_saved(self_.size()); for (size_t i=0; i<self_.size(); i++) if (self_[i].defined()) self__impl_saved[i] = self_[i].getIntrusivePtr(); #endif { at::AutoDispatchBelowAutograd guard; at::redispatch::_foreach_abs_(ks & c10::after_autograd_keyset, self_); } #ifndef NDEBUG for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__storage_saved[i].has_value() && !at::impl::tensorlist_has_dispatch(self_)) AT_ASSERT(self__storage_saved[i].value().is_alias_of(self_[i].storage())); } for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__impl_saved[i] && !at::impl::tensorlist_has_dispatch(self_)) AT_ASSERT(self__impl_saved[i] == self_[i].getIntrusivePtr()); } #endif } ``` Related: - #95431 - #95765 for multiple `grad_fn`s logic --- Examples: outputs of `_foreach_add_.List`, `_foreach_addcmul_.ScalarList`, and `_foreach_exp` ```c++ void _foreach_addcmul__ScalarList(c10::DispatchKeySet ks, at::TensorList self, at::TensorList tensor1, at::TensorList tensor2, at::ArrayRef<at::Scalar> scalars) { auto self_ = unpack(self, "self", 0); auto tensor1_ = unpack(tensor1, "tensor1", 1); auto tensor2_ = unpack(tensor2, "tensor2", 2); auto _any_requires_grad = compute_requires_grad( self, tensor1, tensor2 ); (void)_any_requires_grad; std::vector<c10::optional<at::Tensor>> original_selfs(self.size()); std::vector<std::shared_ptr<AddcmulBackward0>> grad_fns; if (_any_requires_grad) { for (const auto& i : c10::irange( self.size() )) { const auto ith_requires_grad = compute_requires_grad(self[i], tensor1[i], tensor2[i]); check_inplace(self[i], ith_requires_grad); grad_fns.push_back([&]() -> std::shared_ptr<AddcmulBackward0> { if (!ith_requires_grad) { return nullptr; } else { auto grad_fn = std::shared_ptr<AddcmulBackward0>(new AddcmulBackward0(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self[i], 
tensor1[i], tensor2[i] )); return grad_fn; } }()); } if (!grad_fns.empty()) { for (const auto& i : c10::irange(grad_fns.size())) { auto grad_fn = grad_fns[i]; if (grad_fn != nullptr) { grad_fn->self_scalar_type = self[i].scalar_type(); grad_fn->tensor1_scalar_type = tensor1[i].scalar_type(); if (grad_fn->should_compute_output(1)) { grad_fn->tensor2_ = SavedVariable(tensor2[i], false); } grad_fn->value = scalars[i]; if (grad_fn->should_compute_output(2)) { grad_fn->tensor1_ = SavedVariable(tensor1[i], false); } grad_fn->tensor2_scalar_type = tensor2[i].scalar_type(); } } } } #ifndef NDEBUG std::vector<c10::optional<Storage>> self__storage_saved(self_.size()); for (const Tensor& tensor : self_) self__storage_saved.push_back( tensor.has_storage() ? c10::optional<Storage>(tensor.storage()) : c10::nullopt); std::vector<c10::intrusive_ptr<TensorImpl>> self__impl_saved(self_.size()); for (size_t i=0; i<self_.size(); i++) if (self_[i].defined()) self__impl_saved[i] = self_[i].getIntrusivePtr(); std::vector<c10::optional<Storage>> tensor1__storage_saved(tensor1_.size()); for (const Tensor& tensor : tensor1_) tensor1__storage_saved.push_back( tensor.has_storage() ? c10::optional<Storage>(tensor.storage()) : c10::nullopt); std::vector<c10::intrusive_ptr<TensorImpl>> tensor1__impl_saved(tensor1_.size()); for (size_t i=0; i<tensor1_.size(); i++) if (tensor1_[i].defined()) tensor1__impl_saved[i] = tensor1_[i].getIntrusivePtr(); std::vector<c10::optional<Storage>> tensor2__storage_saved(tensor2_.size()); for (const Tensor& tensor : tensor2_) tensor2__storage_saved.push_back( tensor.has_storage() ? 
c10::optional<Storage>(tensor.storage()) : c10::nullopt); std::vector<c10::intrusive_ptr<TensorImpl>> tensor2__impl_saved(tensor2_.size()); for (size_t i=0; i<tensor2_.size(); i++) if (tensor2_[i].defined()) tensor2__impl_saved[i] = tensor2_[i].getIntrusivePtr(); #endif { at::AutoDispatchBelowAutograd guard; at::redispatch::_foreach_addcmul_(ks & c10::after_autograd_keyset, self_, tensor1_, tensor2_, scalars); } #ifndef NDEBUG for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__storage_saved[i].has_value() && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__storage_saved[i].value().is_alias_of(self_[i].storage())); } for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__impl_saved[i] && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__impl_saved[i] == self_[i].getIntrusivePtr()); } for (size_t i=0; i<tensor1_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (tensor1__storage_saved[i].has_value() && !at::impl::tensorlist_has_dispatch(tensor1_)) TORCH_INTERNAL_ASSERT(tensor1__storage_saved[i].value().is_alias_of(tensor1_[i].storage())); } for (size_t i=0; i<tensor1_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (tensor1__impl_saved[i] && !at::impl::tensorlist_has_dispatch(tensor1_)) TORCH_INTERNAL_ASSERT(tensor1__impl_saved[i] == tensor1_[i].getIntrusivePtr()); } for (size_t i=0; i<tensor2_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (tensor2__storage_saved[i].has_value() && !at::impl::tensorlist_has_dispatch(tensor2_)) TORCH_INTERNAL_ASSERT(tensor2__storage_saved[i].value().is_alias_of(tensor2_[i].storage())); } for (size_t i=0; i<tensor2_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (tensor2__impl_saved[i] && !at::impl::tensorlist_has_dispatch(tensor2_)) TORCH_INTERNAL_ASSERT(tensor2__impl_saved[i] == tensor2_[i].getIntrusivePtr()); } #endif if (!grad_fns.empty()) { auto differentiable_outputs = 
flatten_tensor_args( self ); TORCH_INTERNAL_ASSERT(differentiable_outputs.size() == grad_fns.size()); for (const auto& i : c10::irange(grad_fns.size())) { auto grad_fn = grad_fns[i]; if (grad_fn != nullptr) { rebase_history(differentiable_outputs[i], grad_fns[i]); } } } } ``` ```c++ void _foreach_add__List(c10::DispatchKeySet ks, at::TensorList self, at::TensorList other, const at::Scalar & alpha) { auto self_ = unpack(self, "self", 0); auto other_ = unpack(other, "other", 1); auto _any_requires_grad = compute_requires_grad( self, other ); (void)_any_requires_grad; std::vector<c10::optional<at::Tensor>> original_selfs(self.size()); std::vector<std::shared_ptr<AddBackward0>> grad_fns; if (_any_requires_grad) { for (const auto& i : c10::irange( self.size() )) { const auto ith_requires_grad = compute_requires_grad(self[i], other[i]); check_inplace(self[i], ith_requires_grad); grad_fns.push_back([&]() -> std::shared_ptr<AddBackward0> { if (!ith_requires_grad) { return nullptr; } else { auto grad_fn = std::shared_ptr<AddBackward0>(new AddBackward0(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self[i], other[i] )); return grad_fn; } }()); } if (!grad_fns.empty()) { for (const auto& i : c10::irange(grad_fns.size())) { auto grad_fn = grad_fns[i]; if (grad_fn != nullptr) { grad_fn->other_scalar_type = other[i].scalar_type(); grad_fn->alpha = alpha; grad_fn->self_scalar_type = self[i].scalar_type(); } } } } #ifndef NDEBUG std::vector<c10::optional<Storage>> self__storage_saved(self_.size()); for (const Tensor& tensor : self_) self__storage_saved.push_back( tensor.has_storage() ? 
c10::optional<Storage>(tensor.storage()) : c10::nullopt); std::vector<c10::intrusive_ptr<TensorImpl>> self__impl_saved(self_.size()); for (size_t i=0; i<self_.size(); i++) if (self_[i].defined()) self__impl_saved[i] = self_[i].getIntrusivePtr(); std::vector<c10::optional<Storage>> other__storage_saved(other_.size()); for (const Tensor& tensor : other_) other__storage_saved.push_back( tensor.has_storage() ? c10::optional<Storage>(tensor.storage()) : c10::nullopt); std::vector<c10::intrusive_ptr<TensorImpl>> other__impl_saved(other_.size()); for (size_t i=0; i<other_.size(); i++) if (other_[i].defined()) other__impl_saved[i] = other_[i].getIntrusivePtr(); #endif { at::AutoDispatchBelowAutograd guard; at::redispatch::_foreach_add_(ks & c10::after_autograd_keyset, self_, other_, alpha); } #ifndef NDEBUG for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__storage_saved[i].has_value() && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__storage_saved[i].value().is_alias_of(self_[i].storage())); } for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__impl_saved[i] && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__impl_saved[i] == self_[i].getIntrusivePtr()); } for (size_t i=0; i<other_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (other__storage_saved[i].has_value() && !at::impl::tensorlist_has_dispatch(other_)) TORCH_INTERNAL_ASSERT(other__storage_saved[i].value().is_alias_of(other_[i].storage())); } for (size_t i=0; i<other_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (other__impl_saved[i] && !at::impl::tensorlist_has_dispatch(other_)) TORCH_INTERNAL_ASSERT(other__impl_saved[i] == other_[i].getIntrusivePtr()); } #endif if (!grad_fns.empty()) { auto differentiable_outputs = flatten_tensor_args( self ); TORCH_INTERNAL_ASSERT(differentiable_outputs.size() == grad_fns.size()); for (const auto& i : c10::irange(grad_fns.size())) { auto 
grad_fn = grad_fns[i]; if (grad_fn != nullptr) { rebase_history(differentiable_outputs[i], grad_fns[i]); } } } } ... void _foreach_exp_(c10::DispatchKeySet ks, at::TensorList self) { auto self_ = unpack(self, "self", 0); auto _any_requires_grad = compute_requires_grad( self ); (void)_any_requires_grad; std::vector<c10::optional<at::Tensor>> original_selfs(self.size()); std::vector<std::shared_ptr<ExpBackward0>> grad_fns; if (_any_requires_grad) { for (const auto& i : c10::irange( self.size() )) { const auto ith_requires_grad = compute_requires_grad(self[i]); check_inplace(self[i], ith_requires_grad); grad_fns.push_back([&]() -> std::shared_ptr<ExpBackward0> { if (!ith_requires_grad) { return nullptr; } else { auto grad_fn = std::shared_ptr<ExpBackward0>(new ExpBackward0(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self[i] )); return grad_fn; } }()); } } #ifndef NDEBUG std::vector<c10::optional<Storage>> self__storage_saved(self_.size()); for (const Tensor& tensor : self_) self__storage_saved.push_back( tensor.has_storage() ? 
c10::optional<Storage>(tensor.storage()) : c10::nullopt); std::vector<c10::intrusive_ptr<TensorImpl>> self__impl_saved(self_.size()); for (size_t i=0; i<self_.size(); i++) if (self_[i].defined()) self__impl_saved[i] = self_[i].getIntrusivePtr(); #endif { at::AutoDispatchBelowAutograd guard; at::redispatch::_foreach_exp_(ks & c10::after_autograd_keyset, self_); } #ifndef NDEBUG for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__storage_saved[i].has_value() && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__storage_saved[i].value().is_alias_of(self_[i].storage())); } for (size_t i=0; i<self_.size() && !at::impl::dispatch_mode_enabled(); i++) { if (self__impl_saved[i] && !at::impl::tensorlist_has_dispatch(self_)) TORCH_INTERNAL_ASSERT(self__impl_saved[i] == self_[i].getIntrusivePtr()); } #endif if (!grad_fns.empty()) { auto differentiable_outputs = flatten_tensor_args( self ); TORCH_INTERNAL_ASSERT(differentiable_outputs.size() == grad_fns.size()); for (const auto& i : c10::irange(grad_fns.size())) { auto grad_fn = grad_fns[i]; if (grad_fn != nullptr) { rebase_history(differentiable_outputs[i], grad_fns[i]); } } } if (!grad_fns.empty()) { for (const auto& i : c10::irange(grad_fns.size())) { auto grad_fn = grad_fns[i]; if (grad_fn != nullptr) { grad_fn->result_ = SavedVariable(self[i], true, self[i].is_view()); } } } } ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/96405 Approved by: https://github.com/soulitzer	2023-04-28 00:55:04 +00:00
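The per-index pattern in the generated code above (one `grad_fn` per position in the tensor list, `nullptr` where nothing at that position requires grad, then `rebase_history` on the non-null entries) can be sketched in plain Python. This is only an illustration: `ToyTensor`, `build_grad_fns`, and the string node labels are hypothetical stand-ins, not PyTorch APIs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyTensor:
    """Hypothetical stand-in for at::Tensor; tracks autograd metadata only."""
    requires_grad: bool
    grad_fn: Optional[str] = None


def build_grad_fns(node_name: str, *tensor_lists: List[ToyTensor]) -> List[Optional[str]]:
    """Return one entry per index i; None where no input at index i requires grad."""
    any_requires_grad = any(t.requires_grad for lst in tensor_lists for t in lst)
    grad_fns: List[Optional[str]] = []
    if any_requires_grad:
        for i in range(len(tensor_lists[0])):
            ith_requires_grad = any(lst[i].requires_grad for lst in tensor_lists)
            grad_fns.append(f"{node_name}#{i}" if ith_requires_grad else None)
    return grad_fns


def rebase_history(outputs: List[ToyTensor], grad_fns: List[Optional[str]]) -> None:
    """Attach each non-null node to the output tensor at the same index."""
    for out, fn in zip(outputs, grad_fns):
        if fn is not None:
            out.grad_fn = fn


# Mimics _foreach_add_(self, other): index 1 has no input requiring grad.
self_list = [ToyTensor(True), ToyTensor(False), ToyTensor(False)]
other_list = [ToyTensor(False), ToyTensor(False), ToyTensor(True)]
grad_fns = build_grad_fns("AddBackward0", self_list, other_list)
rebase_history(self_list, grad_fns)
```

When no tensor in any list requires grad, `build_grad_fns` returns an empty list, which corresponds to the `if (!grad_fns.empty())` guards in the generated C++.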
Masaki Kozuki	49f6849f58	Fix codegen logic for foreach derivatives (#95263 ) follow-up https://github.com/pytorch/pytorch/pull/93901. Unexpected numerical mismatches observed in some foreach functions' backward result seemed to be caused by the wrong order of `IndexRangeGenerator::range` call. This pr has `args_with_derivatives` have the same or similar order of `foreach_native_function.func.arguments.flat_non_out` --- what the current master generates for `_foreach_mul.List`: ```cpp variable_list ForeachMulBackward0List::apply(variable_list&& grads) { std::lock_guard<std::mutex> lock(mutex_); TORCH_CHECK(!other_released_, ERR_BACKWARD_TWICE); TORCH_CHECK(!self_released_, ERR_BACKWARD_TWICE); IndexRangeGenerator gen; auto other_ix = gen.range(other_size_); auto self_ix = gen.range(self_size_); variable_list grad_inputs(gen.size()); auto other = unpack_list(other_); auto self = unpack_list(self_); if (task_should_compute_output({ other_ix })) { std::vector<Tensor> grad_result; grad_result.reserve(grads.size()); for (const auto & i : c10::irange(grads.size())) { grad_result.emplace_back(mul_tensor_backward(grads[i], self[i], other[i].scalar_type())); } copy_range(grad_inputs, other_ix, grad_result); } if (task_should_compute_output({ self_ix })) { std::vector<Tensor> grad_result; grad_result.reserve(grads.size()); for (const auto & i : c10::irange(grads.size())) { grad_result.emplace_back(mul_tensor_backward(grads[i], other[i], self[i].scalar_type())); } copy_range(grad_inputs, self_ix, grad_result); } return grad_inputs; } ``` with this PR the generated backward is ```cpp variable_list ForeachMulBackward0List::apply(variable_list&& grads) { std::lock_guard<std::mutex> lock(mutex_); TORCH_CHECK(!self_released_, ERR_BACKWARD_TWICE); TORCH_CHECK(!other_released_, ERR_BACKWARD_TWICE); IndexRangeGenerator gen; auto self_ix = gen.range(self_size_); <----- diff auto other_ix = gen.range(other_size_); <----- diff variable_list grad_inputs(gen.size()); auto self = 
unpack_list(self_); auto other = unpack_list(other_); if (task_should_compute_output({ other_ix })) { std::vector<Tensor> grad_result; grad_result.reserve(grads.size()); for (const auto & i : c10::irange(grads.size())) { grad_result.emplace_back(mul_tensor_backward(grads[i], self[i], other[i].scalar_type())); } copy_range(grad_inputs, other_ix, grad_result); } if (task_should_compute_output({ self_ix })) { std::vector<Tensor> grad_result; grad_result.reserve(grads.size()); for (const auto & i : c10::irange(grads.size())) { grad_result.emplace_back(mul_tensor_backward(grads[i], other[i], self[i].scalar_type())); } copy_range(grad_inputs, self_ix, grad_result); } return grad_inputs; } ``` The change is to fix the order of `self_ix` and `other_ix`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95263 Approved by: https://github.com/soulitzer	2023-03-04 20:03:54 +00:00
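Why the order of the `gen.range(...)` calls matters can be seen with a small Python model. It assumes that `IndexRangeGenerator` hands out consecutive half-open index ranges in call order, which is how the two listings above use it. The class below is a hypothetical re-implementation for illustration, not the C++ helper itself.

```python
class IndexRangeGenerator:
    """Toy model: each range() call reserves the next `size` slots of grad_inputs."""

    def __init__(self) -> None:
        self._next = 0

    def range(self, size: int) -> tuple:
        start = self._next
        self._next += size
        return (start, self._next)

    def size(self) -> int:
        return self._next


def allocate(args_in_call_order):
    """Map argument name -> (start, end) slot range, in the order range() is called."""
    gen = IndexRangeGenerator()
    return {name: gen.range(n) for name, n in args_in_call_order}


# Order after the fix: matches the forward signature (self, other).
schema_order = allocate([("self", 3), ("other", 3)])
# Order before the fix: `other` is allocated first, so the slots are swapped.
swapped_order = allocate([("other", 3), ("self", 3)])
```

With three tensors per list, the fixed order places the gradients for `self` in slots 0 to 2, while the earlier order placed them in slots 3 to 5. If the consumer of `grad_inputs` expects the forward-argument order, the swapped layout would be consistent with the numerical mismatches the commit describes.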
Masaki Kozuki	f54233e273	[foreach] bump tensor's version and define backward via torchgen (as possible) (#93901 ) ## Summary - Increment tensor versions in in-place foreach functions - Add logic to handle `ArrayRef<Scalar>` Rel: https://github.com/pytorch/pytorch/issues/58833, https://github.com/pytorch/pytorch/pull/89591 Pull Request resolved: https://github.com/pytorch/pytorch/pull/93901 Approved by: https://github.com/albanD	2023-02-20 23:18:07 +00:00
Masaki Kozuki	30876229a7	[mta] Backward of unary foreach functions (#89591 ) As per the title, this PR defines the backward of unary foreach functions. It does not implement forward-mode automatic differentiation, because [the current codegen](`a747326423/tools/autograd/gen_variable_type.py (L1513)`) doesn't seem to handle `ArrayRef<Tensor>`. Rel: - https://github.com/pytorch/pytorch/issues/53796 - https://github.com/pytorch/pytorch/issues/58833 Pull Request resolved: https://github.com/pytorch/pytorch/pull/89591 Approved by: https://github.com/albanD	2023-01-23 08:28:06 +00:00
Edward Z. Yang	1c46a32b67	Minor typing improvements (#91068 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/91068 Approved by: https://github.com/Skylion007, https://github.com/soumith	2022-12-20 23:43:11 +00:00
Edward Z. Yang	ad44670fa1	Back out "Revert D38984222: Don't introduce new overload for SymInt (#83628 )" (#84173 ) Also Back out "Revert D39075159: [acc_tensor] Use SymIntArrayRef for overloaded empty.memory_format's signature" Original commit changeset: dab4a9dba4fa Original commit changeset: dcaf16c037a9 Original Phabricator Diff: D38984222 Original Phabricator Diff: D39075159 Also update Metal registrations for C++ registration changes. Also update NNPI registration to account for tightened schema checking Differential Revision: [D39084762](https://our.internmc.facebook.com/intern/diff/D39084762/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39084762/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/84173 Approved by: https://github.com/Krovatkin	2022-08-29 18:01:07 +00:00
PyTorch MergeBot	c7edcd6968	Revert "Don't introduce new overload for SymInt (#83628 )" This reverts commit 9790d90e4b0288796ab44a6b4979db0a67580ba8. Reverted https://github.com/pytorch/pytorch/pull/83628 on behalf of https://github.com/malfet due to Breaks internal builds, see D39076487	2022-08-27 01:23:17 +00:00
Edward Z. Yang	9790d90e4b	Don't introduce new overload for SymInt (#83628 ) Previously, we introduced new SymInt overloads for every function we wanted. This led to a lot of boilerplate, and also a lot of confusion about how the overloads needed to be implemented. This PR takes a simpler but more risky approach: just take the original function and change its ints to SymInts. This is BC-breaking in the following ways: * The C++ API for registering implementations for aten operators will change from int64_t to SymInt whenever you make this change. Code generated registrations in PyTorch do not change as codegen handles the translation automatically, but manual registrations will need to follow the change. Typically, if you now accept a SymInt where you previously only took int64_t, you have to convert it back manually. This will definitely break XLA, see companion PR https://github.com/pytorch/xla/pull/3914 Note that not all dispatch keys get the automatic translation; all the composite keys and Meta keys are modified to take SymInt directly (because they should handle them directly), and so there are adjustments for this. This is not BC-breaking in the following ways: * The user facing C++ API remains compatible. Even if a function changes from int to SymInt, the default C++ binding still takes only ints (e.g., at::empty(IntArrayRef, ...)). To call with SymInts, you must call at::empty_symint instead. This involved adding two more signatures to CppSignatureGroup; in many cases I refactored code to iterate over all signatures in the group instead of hard-coding the two that previously existed. * This is TorchScript compatible; internally we treat SymInts as ints so there is no change to what happens at runtime in TorchScript. In particular, it's OK to reference an empty schema by its old type (using int types), as long as you're not doing string equality (which you shouldn't be); these parse to the same underlying type. 
Structure of the PR: * The general strategy of this PR is that, even when you write `SymInt` inside `native_functions.yaml`, sometimes, we will treat it as if it were an `int`. This idea pervades the codegen changes, where we have a translation from SymInt to c10::SymInt or int64_t, and this is controlled by a symint kwarg which I added and then audited all call sites to decide which I wanted. Here are some of the major places where we pick one or the other: * The C++ FunctionSchema representation represents `SymInt` as `int`. There are a few places we do need to know that we actually have a SymInt and we consult `real_type()` to get the real type in this case. In particular: * When we do schema validation of C++ operator registration, we must compare against true schema (as the C++ API will provide `c10::SymInt`, and this will only be accepted if the schema is `SymInt`. This is handled with cloneWithRealTypes before we check for schema differences. * In `toIValue` argument parsing, we parse against the true schema value. For backwards compatibility reasons, I do still accept ints in many places where Layout/SymInt/etc were expected. (Well, accepting int where SymInt is expected is not BC, it's just the right logic!) * In particular, because SymInt never shows up as type() in FunctionSchema, this means that we no longer need a dedicated Tag::SymInt. This is good, because SymInts never show up in mobile anyway. * Changes to functorch/aten are mostly about tracking changes to the C++ API registration convention. Additionally, since SymInt overloads no longer exist, registrations for SymInt implementations are deleted. In many cases, the old implementations did not properly support SymInts; I did not add any new functionality with this PR, but I did try to annotate with TODOs where this is work to do. 
Finally, because the signature of the `native::` API changed from int to SymInt, I needed to find alternative APIs for people who were calling these functions directly. Typically, I insert a new dispatch call when perf doesn't matter, or use the `at::compositeexplicitautograd` namespace to handle other cases. * The change to `make_boxed_from_unboxed_functor.h` is so that we accept a plain IntList IValue anywhere a SymIntList is expected; these are read-only arguments so covariant typing is OK. * I change how unboxing logic works slightly. Previously, we interpreted the C++ type for Layout/etc directly as IntType JIT type, which worked well because the incoming IValue is tagged as an integer. Now, we interpret the C++ type for Layout as its true type, e.g., LayoutType (change to `jit_type.h`), but then we accept an int IValue for it anyway. This makes it symmetric with SymInt, where we interpret the C++ type as SymIntType, and then accept SymInt and int IValues for it. * I renamed the `empty.names` overload to `empty_names` to make it less confusing (I kept mixing it up with the real empty overload) * I deleted the `empty.SymInt` overload, which ended up killing a pile of functions. (This was originally a separate PR but the profiler expect test was giving me grief so I folded it in.) * I deleted the LazyDynamicOpsTest tests. These were failing after these changes, and I couldn't figure out why they used to be passing: they make use of `narrow_copy` which didn't actually support SymInts; they were immediately converted to ints. * I bashed LTC into working. The patches made here are not the end of the story. The big problem is that SymInt translates into Value, but what if you have a list of SymInt? This cannot be conveniently represented in the IR today, since variadic Values are not supported. 
To work around this, I translate SymInt[] into plain int[] (this is fine for tests because LTC dynamic shapes never actually worked); but this will need to be fixed for proper LTC SymInt support. The LTC codegen also looked somewhat questionable; I added comments based on my code reading. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83628 Approved by: https://github.com/albanD, https://github.com/bdhirsh	2022-08-26 01:35:40 +00:00
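The `symint` keyword described in the commit message above selects, per call site, whether a schema-level `SymInt` is rendered as `c10::SymInt` or as plain `int64_t`. A minimal sketch of that idea follows. The lookup table and both helper names are hypothetical and do not reproduce torchgen's actual functions; the `SymInt[]` row in particular is an assumption for illustration.

```python
# Schema type -> (C++ type when symint=False, C++ type when symint=True).
# Hypothetical table for illustration only.
_CPP_TYPES = {
    "int": ("int64_t", "int64_t"),
    "SymInt": ("int64_t", "c10::SymInt"),
    "SymInt[]": ("at::IntArrayRef", "c10::SymIntArrayRef"),
}


def cpp_type(schema_type: str, *, symint: bool) -> str:
    """Translate a schema type to a C++ type, honoring the symint switch."""
    plain, symbolic = _CPP_TYPES[schema_type]
    return symbolic if symint else plain


def binding_name(op: str, *, symint: bool) -> str:
    """The default binding keeps the old name; the SymInt one gets a _symint suffix."""
    return f"at::{op}_symint" if symint else f"at::{op}"


default_sig = f"{binding_name('empty', symint=False)}({cpp_type('SymInt[]', symint=False)} size)"
symint_sig = f"{binding_name('empty', symint=True)}({cpp_type('SymInt[]', symint=True)} size)"
```

This mirrors the compatibility claim in the message: existing callers of `at::empty` keep passing ints, while SymInt-aware callers opt in through `at::empty_symint`.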
PyTorch MergeBot	a7edf71360	Revert "Don't introduce new overload for SymInt (#83628 )" This reverts commit 8fae7027b399e65e6071d335aa874497682c84d0. Reverted https://github.com/pytorch/pytorch/pull/83628 on behalf of https://github.com/malfet due to breaking internal builds, see https://www.internalfb.com/diff/D38984222	2022-08-25 00:49:40 +00:00
Sergii Dymchenko	591222f5d9	Fix use-dict-literal lint (#83718 ) Fix use-dict-literal pylint suggestions by changing `dict()` to `{}`. This PR should do the change for every Python file except test/jit/test_list_dict.py, where I think the intent is to test the constructor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83718 Approved by: https://github.com/albanD	2022-08-24 00:26:46 +00:00
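For reference, the rewrite that the `use-dict-literal` lint asks for looks like the following minimal illustration. The variable names are made up for the example.

```python
# Flagged form: calling the dict() constructor.
options_call = dict()
point_call = dict(x=1, y=2)

# Preferred form: dict literals, which build the same objects.
options_literal = {}
point_literal = {"x": 1, "y": 2}
```

Both spellings produce equal dictionaries, so the change is behavior-preserving except where code deliberately exercises the constructor, which is why the commit leaves test/jit/test_list_dict.py alone.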
Edward Z. Yang	8fae7027b3	Don't introduce new overload for SymInt (#83628 ) Previously, we introduced new SymInt overloads for every function we wanted. This led to a lot of boilerplate, and also a lot of confusion about how the overloads needed to be implemented. This PR takes a simpler but more risky approach: just take the original function and change its ints to SymInts. This is BC-breaking in the following ways: * The C++ API for registering implementations for aten operators will change from int64_t to SymInt whenever you make this change. Code generated registrations in PyTorch do not change as codegen handles the translation automatically, but manual registrations will need to follow the change. Typically, if you now accept a SymInt where you previously only took int64_t, you have to convert it back manually. This will definitely break XLA, see companion PR https://github.com/pytorch/xla/pull/3914 Note that not all dispatch keys get the automatic translation; all the composite keys and Meta keys are modified to take SymInt directly (because they should handle them directly), and so there are adjustments for this. This is not BC-breaking in the following ways: * The user facing C++ API remains compatible. Even if a function changes from int to SymInt, the default C++ binding still takes only ints (e.g., at::empty(IntArrayRef, ...)). To call with SymInts, you must call at::empty_symint instead. This involved adding two more signatures to CppSignatureGroup; in many cases I refactored code to iterate over all signatures in the group instead of hard-coding the two that previously existed. * This is TorchScript compatible; internally we treat SymInts as ints so there is no change to what happens at runtime in TorchScript. In particular, it's OK to reference an empty schema by its old type (using int types), as long as you're not doing string equality (which you shouldn't be); these parse to the same underlying type. 
Structure of the PR: * The general strategy of this PR is that, even when you write `SymInt` inside `native_functions.yaml`, sometimes, we will treat it as if it were an `int`. This idea pervades the codegen changes, where we have a translation from SymInt to c10::SymInt or int64_t, and this is controlled by a symint kwarg which I added and then audited all call sites to decide which I wanted. Here are some of the major places where we pick one or the other: * The C++ FunctionSchema representation represents `SymInt` as `int`. There are a few places we do need to know that we actually have a SymInt and we consult `real_type()` to get the real type in this case. In particular: * When we do schema validation of C++ operator registration, we must compare against true schema (as the C++ API will provide `c10::SymInt`, and this will only be accepted if the schema is `SymInt`. This is handled with cloneWithRealTypes before we check for schema differences. * In `toIValue` argument parsing, we parse against the true schema value. For backwards compatibility reasons, I do still accept ints in many places where Layout/SymInt/etc were expected. (Well, accepting int where SymInt is expected is not BC, it's just the right logic!) * In particular, because SymInt never shows up as type() in FunctionSchema, this means that we no longer need a dedicated Tag::SymInt. This is good, because SymInts never show up in mobile anyway. * Changes to functorch/aten are mostly about tracking changes to the C++ API registration convention. Additionally, since SymInt overloads no longer exist, registrations for SymInt implementations are deleted. In many cases, the old implementations did not properly support SymInts; I did not add any new functionality with this PR, but I did try to annotate with TODOs where this is work to do. 
Finally, because the signature of the `native::` API changed from int to SymInt, I needed to find alternative APIs for people who were calling these functions directly. Typically, I insert a new dispatch call when perf doesn't matter, or use the `at::compositeexplicitautograd` namespace to handle other cases. * The change to `make_boxed_from_unboxed_functor.h` is so that we accept a plain IntList IValue anywhere a SymIntList is expected; these are read-only arguments so covariant typing is OK. * I change how unboxing logic works slightly. Previously, we interpreted the C++ type for Layout/etc directly as IntType JIT type, which worked well because the incoming IValue is tagged as an integer. Now, we interpret the C++ type for Layout as its true type, e.g., LayoutType (change to `jit_type.h`), but then we accept an int IValue for it anyway. This makes it symmetric with SymInt, where we interpret the C++ type as SymIntType, and then accept SymInt and int IValues for it. * I renamed the `empty.names` overload to `empty_names` to make it less confusing (I kept mixing it up with the real empty overload) * I deleted the `empty.SymInt` overload, which ended up killing a pile of functions. (This was originally a separate PR but the profiler expect test was giving me grief so I folded it in.) * I deleted the LazyDynamicOpsTest tests. These were failing after these changes, and I couldn't figure out why they used to be passing: they make use of `narrow_copy` which didn't actually support SymInts; they were immediately converted to ints. * I bashed LTC into working. The patches made here are not the end of the story. The big problem is that SymInt translates into Value, but what if you have a list of SymInt? This cannot be conveniently represented in the IR today, since variadic Values are not supported. 
To work around this, I translate SymInt[] into plain int[] (this is fine for tests because LTC dynamic shapes never actually worked); but this will need to be fixed for proper LTC SymInt support. The LTC codegen also looked somewhat questionable; I added comments based on my code reading. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83628 Approved by: https://github.com/albanD, https://github.com/bdhirsh	2022-08-23 22:04:07 +00:00
Mengwei Liu	d0d6b1f222	[torchgen] Generate out variant for functional operator (#81437 ) Summary: Previously we did not generate an out variant (both schema and kernel) for an operator with only a functional variant. This adds support for that and adds a test. ## Changes on `native_function_generation.py` We now generate an out variant for every functional variant where possible. This PR introduces a lot of newly generated out variants, and `native_functions.yaml` needs to incorporate the changes by adding `autogen` keywords. The logic for determining which operators we should generate an out variant for is the following: 1. No existing out variant for this `NativeFunction` 2. Contains an existing in-place, mutable or functional variant 3. Contains at least one tensor-like return. Operators matching the first two conditions but failing the third are listed in `FUNCTIONAL_OPS_THAT_CANNOT_GET_AN_OUT_VARIANT`. ## Special handling The following operators satisfy all 3 criteria above, but we chose not to autogen them, for the following reasons. * `mkldnn_adaptive_avg_pool2d`: the generated out variant `mkldnn_adaptive_avg_pool2d.out` collides with the `mkldnn_adaptive_avg_pool2d_out` kernel in the `adaptive_avg_pool2d.out` operator. I manually created `mkldnn_adaptive_avg_pool2d.out` and renamed `mkldnn_adaptive_avg_pool2d_out` to `mkldnn_adaptive_avg_pool2d_out_stub`. * `min`, `max` and `mean`: `min.out`, `max.out` and `mean.out` already exist, but they have different semantics from the functional ones. I manually created `min.unary_out`, `max.unary_out` and `mean.dtype_out` to disambiguate. ## Autograd Changes We introduced logic to avoid matching derivative info in `derivatives.yaml` to out variants, since we are generating `NOT_IMPLEMENTED` kernels for those out variants anyway. The issue we are seeing with the original logic is that it does not handle `TensorOption` arguments well. For example, we have these two operators: * `_to_copy(Tensor self, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, bool non_blocking=False, MemoryFormat? memory_format=None) -> Tensor` * `_to_copy.out(Tensor self, *, bool non_blocking=False, MemoryFormat? memory_format=None, Tensor(a!) out) -> Tensor(a!)` If we use the `_to_copy` derivative info, there will be a compilation error, since `dtype` is missing from the `_to_copy.out` signature. Test Plan: Rely on unit test Differential Revision: D37832342 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81437 Approved by: https://github.com/iseeyuan, https://github.com/bdhirsh	2022-08-13 05:44:53 +00:00
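The three eligibility conditions in the #81437 message can be read as a simple predicate. The sketch below is only an illustration of that reading: `OpVariants`, `is_tensor_like` and `should_autogen_out` are hypothetical names, not torchgen types or functions, and treating any return type that starts with `Tensor` as tensor-like is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OpVariants:
    """Hypothetical summary of one operator's variants (not a torchgen class)."""

    name: str
    has_out: bool = False
    has_inplace: bool = False
    has_mutable: bool = False
    has_functional: bool = False
    return_types: list[str] = field(default_factory=list)


def is_tensor_like(type_name: str) -> bool:
    # Assumption: "Tensor", "Tensor[]", "Tensor?" and "Tensor(a!)" all count.
    return type_name.startswith("Tensor")


def should_autogen_out(op: OpVariants) -> bool:
    # 1. No existing out variant.
    if op.has_out:
        return False
    # 2. At least one in-place, mutable or functional variant exists.
    if not (op.has_inplace or op.has_mutable or op.has_functional):
        return False
    # 3. At least one tensor-like return. Per the commit message, operators
    #    that fail only this check are listed in
    #    FUNCTIONAL_OPS_THAT_CANNOT_GET_AN_OUT_VARIANT.
    return any(is_tensor_like(t) for t in op.return_types)
```

The special cases named in the message (`mkldnn_adaptive_avg_pool2d`, `min`, `max`, `mean`) pass all three checks and were excluded by hand, so a predicate like this would not capture them.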
Mikayla Gawarecki	e3e33cfae0	Enable codegen of per-dispatch key derivative formulas in derivatives.yaml (#82801 ) `derivatives.yaml` can now take a `dispatch` entry which registers per-autograd dispatch key derivatives such as
```
name: foo(Tensor self, Tensor y) -> Tensor
dispatch:
  Default:
    x: grad
    y: grad.expand(y.sizes())
  AutogradNestedTensor:
    x: grad
    y: NestedTensor_foo_backward(grad, y)
output_differentiability: [True]
```
The old schema, with no `dispatch` entry, is still supported. I would greatly appreciate feedback on how to improve the testing strategy of this PR. I have currently registered an aten test op in TestOps.cpp with dummy gradients in derivatives.yaml and have some tests in test_autograd.py:TestAutogradMultipleDispatch, but I am not sure whether these are sufficiently rigorous. Additionally, this PR makes the assumption that sets like [VIEW_FUNCTIONS](`ff5399e528/tools/autograd/gen_inplace_or_view_type.py (L60)`) are per-native-function and not per-native-function-and-dispatch-key. I'm not sure whether this is necessarily the case: would there ever be a situation where, e.g., a nested_tensor op is a view op but the aten function is not, or vice versa? * __->__ #82801 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82801 Approved by: https://github.com/bhosmer, https://github.com/albanD	2022-08-10 19:26:29 +00:00
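One way to picture how a loader could accept both the new `dispatch` schema and the old flat schema from #82801 is to normalize every entry to a mapping keyed by dispatch key. This is a rough sketch, not the torchgen loader: `formulas_by_dispatch_key` and `NON_FORMULA_KEYS` are hypothetical names, and filing old-schema formulas under `Default` is an assumption made for illustration.

```python
# Keys of a derivatives.yaml entry that are not per-argument formulas.
# Assumption: this set is illustrative and not exhaustive.
NON_FORMULA_KEYS = {"name", "dispatch", "output_differentiability"}


def formulas_by_dispatch_key(entry: dict) -> dict:
    """Return {dispatch_key: {arg_name: formula}} for one parsed entry."""
    if "dispatch" in entry:
        # New schema: formulas are already grouped per autograd dispatch key.
        return {key: dict(formulas) for key, formulas in entry["dispatch"].items()}
    # Old schema: formulas sit at the top level; file them under "Default".
    return {"Default": {k: v for k, v in entry.items() if k not in NON_FORMULA_KEYS}}


new_style = {
    "name": "foo(Tensor self, Tensor y) -> Tensor",
    "dispatch": {
        "Default": {"x": "grad", "y": "grad.expand(y.sizes())"},
        "AutogradNestedTensor": {"x": "grad", "y": "NestedTensor_foo_backward(grad, y)"},
    },
    "output_differentiability": [True],
}
old_style = {"name": "foo(Tensor self, Tensor y) -> Tensor", "x": "grad"}

print(formulas_by_dispatch_key(new_style)["AutogradNestedTensor"]["y"])
print(formulas_by_dispatch_key(old_style))
```

Normalizing at load time means later code only has to handle one shape of entry, which is one plausible reason to keep the old schema working.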
soulitzer	b55f9047e1	Add forward AD support for elu_, celu_, selu_ (#83080 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83080 Approved by: https://github.com/albanD	2022-08-09 20:15:44 +00:00
Huy Do	a4647cc1fa	Apply ufmt linter to all py files under torchgen (#81570 ) Previous batches: * https://github.com/pytorch/pytorch/pull/81285 * https://github.com/pytorch/pytorch/pull/81335 The change is split into multiple batches to minimize merge conflicts and ease review. Once everything has been formatted by ufmt (black+usort), the current black linter will be removed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81570 Approved by: https://github.com/ezyang	2022-07-16 03:52:25 +00:00
Brian Hirsh	0161e9eb00	[test] attempt to functionalize ops with mutable positional-only args Pull Request resolved: https://github.com/pytorch/pytorch/pull/76320 Approved by: https://github.com/ezyang	2022-05-19 18:50:34 +00:00
Edward Yang	36420b5e8c	Rename tools/codegen to torchgen (#76275 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76275 In preparation for addressing https://github.com/pytorch/pytorch/issues/73212 Diff was generated with:
```
git mv tools/codegen torchgen
git grep -l 'tools.codegen' | xargs sed -i 's/tools.codegen/torchgen/g'
sed -i "s/\${TOOLS_PATH}\/codegen/\${TORCH_ROOT}\/torchgen/g" caffe2/CMakeLists.txt
```
and manual edits to: * tools/test/test_gen_backend_stubs.py * torchgen/build.bzl * torchgen/gen_backend_stubs.py aka this diff:
```
diff --git a/tools/test/test_gen_backend_stubs.py b/tools/test/test_gen_backend_stubs.py
index 3dc26c6d2d..104054575e 100644
--- a/tools/test/test_gen_backend_stubs.py
+++ b/tools/test/test_gen_backend_stubs.py
@@ -9,7 +9,7 @@ from torchgen.gen_backend_stubs import run
 from torchgen.gen import _GLOBAL_PARSE_NATIVE_YAML_CACHE  # noqa: F401
 
 path = os.path.dirname(os.path.realpath(__file__))
-gen_backend_stubs_path = os.path.join(path, '../torchgen/gen_backend_stubs.py')
+gen_backend_stubs_path = os.path.join(path, '../../torchgen/gen_backend_stubs.py')
 
 # gen_backend_stubs.py is an integration point that is called directly by external backends.
 # The tests here are to confirm that badly formed inputs result in reasonable error messages.
diff --git a/torchgen/build.bzl b/torchgen/build.bzl
index ed04e35a43..d00078a3cf 100644
--- a/torchgen/build.bzl
+++ b/torchgen/build.bzl
@@ -1,6 +1,6 @@
 def define_targets(rules):
     rules.py_library(
-        name = "codegen",
+        name = "torchgen",
         srcs = rules.glob(["**/*.py"]),
         deps = [
             rules.requirement("PyYAML"),
@@ -11,6 +11,6 @@ def define_targets(rules):
 
     rules.py_binary(
         name = "gen",
-        srcs = [":codegen"],
+        srcs = [":torchgen"],
         visibility = ["//visibility:public"],
     )
diff --git a/torchgen/gen_backend_stubs.py b/torchgen/gen_backend_stubs.py
index c1a672a655..beee7a15e0 100644
--- a/torchgen/gen_backend_stubs.py
+++ b/torchgen/gen_backend_stubs.py
@@ -474,7 +474,7 @@ def run(
 ) -> None:
 
     # Assumes that this file lives at PYTORCH_ROOT/torchgen/gen_backend_stubs.py
-    pytorch_root = pathlib.Path(__file__).parent.parent.parent.absolute()
+    pytorch_root = pathlib.Path(__file__).parent.parent.absolute()
     template_dir = os.path.join(pytorch_root, "aten/src/ATen/templates")
 
     def make_file_manager(install_dir: str) -> FileManager:
```
run_all_fbandroid_tests Test Plan: sandcastle Reviewed By: albanD, ngimel Differential Revision: D35770317 fbshipit-source-id: 153ac4a7fef15b1e750812a90bfafdbc8f1ebcdf (cherry picked from commit c6d485d1d4648fa1c8a4c14c5bf3d8e899b9b4dd)	2022-04-25 01:38:06 +00:00
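The `pytorch_root` change in #76275 follows from the directory move: the script now sits one level closer to the repository root, so one fewer `.parent` hop is needed. A small sketch with `pathlib` makes this concrete; the `/pytorch` checkout location is a made-up example, and the old file location is inferred from the `git mv tools/codegen torchgen` command in the message.

```python
from pathlib import PurePosixPath

# Hypothetical checkout location; only the relative layout matters here.
old_file = PurePosixPath("/pytorch/tools/codegen/gen_backend_stubs.py")
new_file = PurePosixPath("/pytorch/torchgen/gen_backend_stubs.py")

# Before the rename: file -> codegen/ -> tools/ -> repository root (three hops).
old_root = old_file.parent.parent.parent

# After the rename: file -> torchgen/ -> repository root (two hops).
new_root = new_file.parent.parent

print(old_root, new_root)
```

The same reasoning explains the test change in the other direction: `tools/test/` stayed where it was while the target moved to the top level, so the relative path gained a `../`.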

34 Commits