* C++ Average Pool Module (#25800)
Summary:
This PR adds the Average Pool module to the C++ front-end.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25800
Differential Revision: D17318094
Pulled By: yf225
fbshipit-source-id: c914c0e802bbe5f1d1f0a21a669c28bc956899db
* Better error messages in C2 ONNX backend (#25809)
Summary:
Just a tiny fix to make debugging easier (output errors to stderr and include them in the exception message).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25809
Reviewed By: zrphercule
Differential Revision: D17329957
Pulled By: houseroad
fbshipit-source-id: 0d73dd9f62c735fbc5096e6a7c0e5f58e4cd90ae
* Add new API for Fully Connected and Convolution Operators in QNNPACK (#25862)
Summary:
This change adds a new prepack and run function for FC and Convolution operators in QNNPACK.
The new functions added are `PackBMatrix`, `qnnpackLinear`, `PrePackConvWeights` and `qnnpackConv`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25862
Test Plan:
QNNPACK unit tests
fully-connected-test
convolution-test
Differential Revision: D17299260
Pulled By: supriyar
fbshipit-source-id: fdc4e2d5f1232675acd153f3efb9d17ed8628a54
* Enable more mGPU tests (#26055)
Summary:
Enable mGPU tests that pass on ROCm as of 2.7.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26055
Differential Revision: D17331484
Pulled By: bddppq
fbshipit-source-id: 51f956a84a6c14a1a41473d322950994fa29c25c
* remove verbose in pytorch_ci hypothesis profile (#26075)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26075
As titled, remove the verbose argument to reduce noise in the logs
Test Plan:
ci
Imported from OSS
Differential Revision: D17335935
fbshipit-source-id: 2e4289e838bf4489dcad8d5533353eebcff0d481
* TorchScript Serialization for dynamic LSTM module
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25877
Test Plan: Imported from OSS
Reviewed By: jianyuh
Differential Revision: D17275746
Pulled By: jamesr66a
fbshipit-source-id: db2f38ddd99f02ccb4fb754fa1c1e6cad4425fa8
* Upgrade the naming for fbgemm quantized op (#26064)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26064
Just changing the names after https://github.com/pytorch/pytorch/pull/25678.
ghstack-source-id: 89944542
Test Plan: CI
Differential Revision: D17332068
fbshipit-source-id: 5e9febed7a2fcd10d44273e55643b277d33a3ad7
* Use BytesIO instead of tempfile (#25976)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25976
As recommended in https://github.com/pytorch/pytorch/pull/25877/files#r322956051:
> We should move more of these toward using BytesIO. Using files in tests is generally considered bad practice because it introduces syscalls and dependencies on the execution environment, and thus can cause test flakiness/instability.
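The pattern the comment recommends is straightforward; a minimal sketch of the in-memory round-trip, using `pickle` here as a stand-in for `torch.save`/`torch.load`, which accept the same kind of file-like object:

```python
import io
import pickle

# Serialize into an in-memory buffer instead of a temp file
buf = io.BytesIO()
pickle.dump({"weight": [1.0, 2.0]}, buf)

# Rewind and deserialize; no filesystem syscalls involved
buf.seek(0)
state = pickle.load(buf)
print(state["weight"])  # [1.0, 2.0]
```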
ghstack-source-id: 89929947
Test Plan: CI
Differential Revision: D17310441
fbshipit-source-id: ba97cce4224225df45ff44062f1bc8ebefb25922
* Revert "TorchScript Serialization for dynamic LSTM module" (#26079)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26079
This reverts commit e3039612d851d0fbd337546c8debc27ec7cfc4e4.
Test Plan: Imported from OSS
Differential Revision: D17337585
Pulled By: jamesr66a
fbshipit-source-id: 4b93a4c5ca2fe491d609da889a42d22be8e52889
* Add Runtime flag for quantized backend. (#25680)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25680
Add a runtime flag to choose between FBGEMM and QNNPACK when compiled with both.
The flag can be set by using `torch.backends.quantized.engine = torch.fbgemm`/`torch.qnnpack` or `ctx::setPreferredQuantizedEngine(at::QEngine)`.
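A rough sketch of the runtime-flag dispatch pattern (pure Python; names are illustrative, not the actual ATen implementation):

```python
SUPPORTED_ENGINES = ("fbgemm", "qnnpack")
_preferred_engine = "fbgemm"

def set_engine(name):
    # runtime switch between backends compiled into the binary
    if name not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported quantized engine: {name}")
    global _preferred_engine
    _preferred_engine = name

def quantized_linear(x):
    # dispatch to whichever backend the flag currently selects
    return _preferred_engine, x

set_engine("qnnpack")
print(quantized_linear([1, 2])[0])  # qnnpack
```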
ghstack-source-id: 89935643
Test Plan: Verified torch.backends.quantized.engine works
Differential Revision: D17198233
fbshipit-source-id: e5449d06f4136385e0e6d18bd4237f8654a61672
* Dynamic registration of RPC backends (#25734)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25734
[pytorch] Dynamic registration of RPC backends
Allow non-process-group RPC backends to be plugged in as a backend.
ghstack-source-id: 89938296
Differential Revision: D17183789
fbshipit-source-id: 885fed12d80b82b60f9a125f78302a161e708089
* Make regular softmax warp size aware (#25956)
Summary:
Enable one unit test that passes now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25956
Differential Revision: D17298150
Pulled By: bddppq
fbshipit-source-id: 8763e71ad7ef80be915fe93a3471b29f27f3f0a4
* Move NamedTensorMetaInterface definitions to TensorImpl.h (#26030)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26030
Test Plan:
- [namedtensor ci]
Differential Revision: D17322383
Pulled By: zou3519
fbshipit-source-id: d5b914d646b48a6f4e0104aceb435e694b72bd96
* Experimental warning for named tensors (#26050)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26050
Throws a warning once when someone attempts to attach names to a tensor.
This is guaranteed to happen at the callsite `set_named_tensor_meta`.
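The warn-once pattern can be sketched in a few lines (function and flag names here are illustrative, not the C++ implementation):

```python
import warnings

_warned_experimental = False

def set_named_tensor_meta(names):
    # warn only on the first attempt to attach names, as described above
    global _warned_experimental
    if not _warned_experimental:
        warnings.warn("Named tensors are an experimental feature", UserWarning)
        _warned_experimental = True
    return names
```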
Test Plan: - run tests [namedtensor ci]
Differential Revision: D17331634
Pulled By: zou3519
fbshipit-source-id: 44f5e5c95acd9c7ba543c1210a3b1314aab348f0
* print source code when a function is executed (#25868)
Summary:
While this isn't ideal, as it might print out the same source every time a function is run, it's still easier to go tweak Python code to reduce loop counts than to insert `std::cout` and recompile C++ code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25868
Differential Revision: D17318386
Pulled By: Krovatkin
fbshipit-source-id: 928ba6543204042924ab41a724635594709630de
* Disable test_cuda.test_stream_event_nogil on ROCm (#26087)
Summary:
Was recently enabled in https://github.com/pytorch/pytorch/pull/26055; it's flaky on master:
https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang7-rocmdeb-ubuntu16.04-test/37575
https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang7-rocmdeb-ubuntu16.04-test/37577
```
05:39:35 test_stream_event_nogil (__main__.TestCuda) ... Exception in thread Thread-3:
05:39:40 Traceback (most recent call last):
05:39:40 File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
05:39:40 self.run()
05:39:40 File "/usr/lib/python2.7/threading.py", line 754, in run
05:39:40 self.__target(*self.__args, **self.__kwargs)
05:39:40 File "test_cuda.py", line 1894, in _test_stream_event_nogil
05:39:40 c2p.put(sync_func(self, TestCuda.FIFTY_MIL_CYCLES))
05:39:40 File "test_cuda.py", line 1882, in _event_wait
05:39:40 self.assertTrue(s1.query())
05:39:40 File "/usr/lib/python2.7/unittest/case.py", line 422, in assertTrue
05:39:40 raise self.failureException(msg)
05:39:40 AssertionError: False is not true
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26087
Differential Revision: D17340891
Pulled By: bddppq
fbshipit-source-id: b2b70beb1b068db53197a5f9f6a80cb046e66ebd
* TorchScript Serialization for dynamic LSTM
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26084
Test Plan: Imported from OSS
Differential Revision: D17339315
Pulled By: jamesr66a
fbshipit-source-id: 03a2674edcf779becfe3b8ec96f1bae23c74b11c
* Automatic update of fbcode/onnx to 7988d8360b11e6003560076e9b1d4aa426db3244 (#25959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25959
Previous import was 28ca699b69b5a31892619defca2391044a9a6052
Included changes:
- **[7988d836](https://github.com/onnx/onnx/commit/7988d836)**: Supporting negative axes for all existing onnx ops (#2281)
- **[5ca0a09e](https://github.com/onnx/onnx/commit/5ca0a09e)**: Update managingexperimentalops.md (#1981)
- **[bc0495c1](https://github.com/onnx/onnx/commit/bc0495c1)**: Fix link to community docs in readme (#2261)
- **[2fdb3ef6](https://github.com/onnx/onnx/commit/2fdb3ef6)**: move map and sequence types to onnx domain, (#2244)
- **[568b65aa](https://github.com/onnx/onnx/commit/568b65aa)**: Improve compatiblity with proto3 and enable reading attributes (#2288)
- **[1f350f2c](https://github.com/onnx/onnx/commit/1f350f2c)**: Remove type info for loop variadic input in Loop op used to compose the Range op (#2287)
- **[eb139446](https://github.com/onnx/onnx/commit/eb139446)**: Add Foundation WG to working-groups.md (#2276)
- **[4eabc4b3](https://github.com/onnx/onnx/commit/4eabc4b3)**: Fix testdata model for CumSum. Add exclusive attribute. (#2271)
- **[1a62afdb](https://github.com/onnx/onnx/commit/1a62afdb)**: Support GatherND operator in ONNX (#2106)
- **[0e330e9d](https://github.com/onnx/onnx/commit/0e330e9d)**: Support ScatterND operator in ONNX (#2220)
- **[733f7a6a](https://github.com/onnx/onnx/commit/733f7a6a)**: Add Det to ONNX (#2233)
- **[52187738](https://github.com/onnx/onnx/commit/52187738)**: Update the description of nearest_mode of resize op (#2257)
- **[64b4b686](https://github.com/onnx/onnx/commit/64b4b686)**: Adding sparse tensor to ONNX (#2019)
- **[c8a8b7cc](https://github.com/onnx/onnx/commit/c8a8b7cc)**: Support Range operator in ONNX (#2242)
- **[44b0d6d5](https://github.com/onnx/onnx/commit/44b0d6d5)**: Update resize op (#2057)
- **[7d907964](https://github.com/onnx/onnx/commit/7d907964)**: Add function to fuse dynamic quantization graph into 1 node (#2187)
- **[36f8e6d9](https://github.com/onnx/onnx/commit/36f8e6d9)**: Update logo_request.md (#2231)
- **[4eb737c8](https://github.com/onnx/onnx/commit/4eb737c8)**: Update Clip in opset 11 to support min/max as inputs instead of attributes (#2096)
- **[a25e1388](https://github.com/onnx/onnx/commit/a25e1388)**: Fix segfault in tile shape inference (#2221)
- **[2dc273c7](https://github.com/onnx/onnx/commit/2dc273c7)**: update onehot shape inference to reflect the spec for depth input (#2224)
- **[665211c1](https://github.com/onnx/onnx/commit/665211c1)**: Add GatherElements Op and Rename ScatterElements (#2143)
- **[3ba2e31a](https://github.com/onnx/onnx/commit/3ba2e31a)**: Unique (#2141)
- **[5a5588ad](https://github.com/onnx/onnx/commit/5a5588ad)**: Clarify dimension variable scoping (#2211)
- **[fabe39d5](https://github.com/onnx/onnx/commit/fabe39d5)**: Liqun/topk sort (#2126)
- **[453aa644](https://github.com/onnx/onnx/commit/453aa644)**: Update document for NMS (#2193)
- **[34e28ec2](https://github.com/onnx/onnx/commit/34e28ec2)**: Handle negative 'axis' value in Split type and shape inferencing (#2177)
- **[28ec4583](https://github.com/onnx/onnx/commit/28ec4583)**: depth to space shuffle order (#2163)
- **[98f72629](https://github.com/onnx/onnx/commit/98f72629)**: minor updates to fix links in readme (#2189)
- **[321d1467](https://github.com/onnx/onnx/commit/321d1467)**: Add check to disallow squeezing input axes which are not 1 (#2204)
- **[573f0dc9](https://github.com/onnx/onnx/commit/573f0dc9)**: fix a bug in fun shape inference (#2188)
- **[36dc7110](https://github.com/onnx/onnx/commit/36dc7110)**: Clarify ambiguity in gather spec regarding indices expectation (#2202)
- **[a2449673](https://github.com/onnx/onnx/commit/a2449673)**: Fix some minor issues in IR.md and Versioning.md (#2108)
- **[349aff69](https://github.com/onnx/onnx/commit/349aff69)**: Skip install typing package for python >=3.5 (#2199)
Test Plan: ci
Reviewed By: bddppq, benoitsteiner
Differential Revision: D17296390
fbshipit-source-id: 9f9f5ce85d9694128008d756c2ea393bd4e0cb71
* Skip test_triangular_solve_batched (#26108)
Summary:
cc: gchanan zou3519
I will look into why this is failing spuriously.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26108
Differential Revision: D17348399
Pulled By: zou3519
fbshipit-source-id: aed4ccfc3f106692d4e32acc029740309570b0c3
* Exposing Fused8BitRowwiseQuantizedToFloat in PyTorch (#26080)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26080
Will be used in the C2 ctr_mbl_feed model to PyTorch conversion
Test Plan: Unit test
Reviewed By: yinghai
Differential Revision: D17337604
fbshipit-source-id: a90d9f5dc38301608d1562c6f2418e7f4616e753
* make sure all out stringstreams start out empty in jit_log.hpp
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25863
Differential Revision: D17347386
Pulled By: Krovatkin
fbshipit-source-id: a42cf56680a27bc3e50fd945ab372a409225b875
* tracing with an opt-in by file name (#25895)
Summary:
This basically works as a simple filter, as you suggested, ZolotukhinM.
`export PYTORCH_JIT_LOG_LEVEL=guard_elimination` will print all `GRAPH_DUMP` and `GRAPH_UPDATE` statements in `guard_elimination.cpp`.
`export PYTORCH_JIT_LOG_LEVEL=>guard_elimination:>alias_analysis` will print all `GRAPH_DUMP`, `GRAPH_UPDATE` **and** `GRAPH_DEBUG` statements in `guard_elimination.cpp` **and** in `alias_analysis.cpp`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25895
Differential Revision: D17309090
Pulled By: Krovatkin
fbshipit-source-id: 8fa9e67cc9af566b084d66cc15223633fda08444
* Stop re-ordering TH(C)Blas arguments. (#25606)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25606
This just complicates the codegen for no benefit.
Test Plan: Imported from OSS
Differential Revision: D17172498
Pulled By: gchanan
fbshipit-source-id: d2f50e45400ac0336792422518e03dbae3a1bedc
* Kill TH(C)Blas kwarg_only declarations. (#25607)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25607
Since we don't generate these as end-user bindings, and we no longer reorder based on this property, we can just get rid of the property.
Test Plan: Imported from OSS
Differential Revision: D17172500
Pulled By: gchanan
fbshipit-source-id: f84fd8bb2b13598501897f56871b21339585d844
* simplify build_android_gradle.sh (#25897)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25897
It doesn't hurt to set all variables unconditionally.
And we can create a link to the lib directory instead of to specific files; this
way it's easier to switch between dynamic and static library names.
Test Plan:
- check android gradle CI;
- use stack diff to check all 4 architectures on PR;
Differential Revision: D17307240
Pulled By: ljk53
fbshipit-source-id: c975085ddda852ef7da1c29935c2f6a28d797e5a
* change gradle build to use static libtorch + gc-sections (#25984)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25984
Link static libtorch libraries into pytorch.so (API library for android)
with "-Wl,--gc-sections" flag to remove unused symbols in libtorch.
Test Plan:
- full gradle CI with stacked PR;
- will check final artifacts.tgz size change;
Differential Revision: D17312859
Pulled By: ljk53
fbshipit-source-id: 99584d15922867a7b3c3d661ba238a6f99f43db5
* remove "build_deps" arg from setup.py command in (#26113)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26113
After https://github.com/pytorch/pytorch/pull/16914, passing in an extra
argument such as "build_deps" (i.e. `python setup.py build_deps develop`) no
longer works, since it gets picked up as an unrecognized argument.
ghstack-source-id: 90003508
Test Plan:
Before, this script would execute "python setup.py build_deps
develop", which errored. Now it executes "python setup.py develop" without an
error. Verified by successfully running the script on devgpu. In setup.py,
there is already a `RUN_BUILD_DEPS = True` flag.
Differential Revision: D17350359
fbshipit-source-id: 91278c3e9d9f7c7ed8dea62380f18ba5887ab081
* Stop reordering TH random function arguments.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25608
Test Plan: Imported from OSS
Differential Revision: D17172494
Pulled By: gchanan
fbshipit-source-id: 5a46889cc040297231e2473ae5b2879b39f8d60a
* fix base_lr overridden in cyclic lr (#26105)
Summary:
The base_lr parameter was being overridden by the super `__init__`, see https://github.com/pytorch/pytorch/issues/21965.
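The bug is the classic pattern of a base-class `__init__` clobbering a field the subclass cares about (class names below are illustrative, not the actual scheduler code):

```python
class BaseScheduler:
    def __init__(self):
        self.lr = 0.01  # base class unconditionally (re)initializes the field

class CyclicScheduler(BaseScheduler):
    def __init__(self, base_lr):
        super().__init__()
        # fix: assign base_lr *after* super().__init__(), so the base
        # class can no longer overwrite it
        self.lr = base_lr

print(CyclicScheduler(0.1).lr)  # 0.1
```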
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26105
Reviewed By: yf225
Differential Revision: D17346724
Pulled By: vincentqb
fbshipit-source-id: 4b146bd64f4f385c0a9c4f4df8eb8991312fb15c
* Skip inserting duplicate observers (#25504)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25504
Skip inserting duplicate observers for values observed
in the forward method of a child module or in other methods of
the current module.
Test Plan:
python test/test_jit.py -- 'TestJit.insert_observers'
python test/test_jit.py -- 'TestJit.insert_observers_child_qconfig'
python test/test_jit.py -- 'TestJit.insert_observers_skip_values'
Imported from OSS
Differential Revision: D17208888
fbshipit-source-id: e04f1c22ab1c4f410933a17a3ef31acf5f217323
* Implementation of ConstantThenLinearWarmupLRPolicy and CompositeCyclicalLRPolicy (#25970)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25970
ConstantThenLinearWarmupLRPolicy:
* first use a constant warm up
* then ramp up to the fixed learning rate linearly
CompositeCyclicalLRPolicy:
* first use a constant warm up
* then ramp up to the fixed learning rate linearly
* then use cyclical learning rates for the rest of time
Pull Request resolved: https://our.intern.facebook.com/intern/opensource/shipit/preview/D17302632/
Test Plan:
* buck test
* https://our.intern.facebook.com/intern/testinfra/testconsole/testrun/5910974518377039/
* https://our.intern.facebook.com/intern/testinfra/testrun/1407375027118303
* checked the consistency of learning rates w.r.t. iterations with offline simulations n143987
Reviewed By: swatirallapalli
Differential Revision: D17302632
fbshipit-source-id: 1098d4dd9109a48932b76e36d78239e49f8077a1
* Fix build warning in vec256_qint.h
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26121
Test Plan: Imported from OSS
Differential Revision: D17351960
Pulled By: jamesr66a
fbshipit-source-id: 12389729fe5fb8d863cf47288920ea375a3e74ab
* Kill kwarg_only declarations in Declarations.cwrap. (#25609)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25609
They don't do anything anymore.
Test Plan: Imported from OSS
Differential Revision: D17172497
Pulled By: gchanan
fbshipit-source-id: 5cf7fdcf7d2da0054ac1bd7d8d2b70a2264b8c93
* Support quantizing any methods called (#25505)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25505
Support for quantizing all the methods called by the forward method, including
child module methods and other methods in the current module.
It relies on module-level constant prop; we need to figure out a way to do constant prop
for these methods as well. We can either do constant prop at the module level or do constant
prop in the quantization function, but this will need some discussion.
Test Plan:
python test/test_jit.py 'TestJit.insert_quant_dequant'
python test/test_quantizer.py
Imported from OSS
Differential Revision: D17208887
fbshipit-source-id: 21749457b21b00a6edada290c26324e2fb210b10
* C++ unregister_module function for Module (#26088)
Summary:
This PR adds `unregister_module` to `nn::Module` and an `erase` function to `OrderedDict`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26088
Differential Revision: D17360058
Pulled By: yf225
fbshipit-source-id: f1f375b4751317da85b8da1458e092fe2405ceec
* Port fuse_linear from pytorch/tvm (#25623)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25623
Port over the fuse_linear pass from the pytorch/tvm project; we'll need this
in the backend-specific quantization pass to match `aten::linear` and swap
it with the quantized linear op.
Test Plan:
python test/test_jit.py 'TestJit.test_fuse_linear'
Imported from OSS
Differential Revision: D17208890
fbshipit-source-id: f4ff3889ae4525797d3b986f46ae37e50ea49116
* Add device check before accessing data_ptr in PackLayer (#26056)
Summary:
fixes https://github.com/pytorch/xla/issues/927
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26056
Differential Revision: D17331859
Pulled By: ailzhang
fbshipit-source-id: bdc334f03c8dcbb4ef4f5e059a63ef188a0b8b61
* Create TensorBoard test classes in all cases (#26005)
Summary:
To give better signal to the user, we will now always create the TensorBoard test classes and just disable tests if TensorBoard is not installed.
cc lanpa sanekmelnikov natalialunova pietern
[test macos]
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26005
Reviewed By: sanekmelnikov
Differential Revision: D17352430
Pulled By: orionr
fbshipit-source-id: 87a592064f4768ffded76a3d666a8e508a1ef164
* Automatic update of fbcode/onnx to 95252c2adec185e305e34486c6756ece9aa8f57f (#26137)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26137
Previous import was 7988d8360b11e6003560076e9b1d4aa426db3244
Included changes:
- **[95252c2a](https://github.com/onnx/onnx/commit/95252c2a)**: Fix shapeinference function (#2296)
- **[414285bb](https://github.com/onnx/onnx/commit/414285bb)**: fix the buffer overflow problem in shape inference logic of Squeeze op
- **[797cdd0f](https://github.com/onnx/onnx/commit/797cdd0f)**: Support for negative indices in 'Gather', 'GatherElements', 'ScatterElements', 'OneHot' (#2260)
- **[7636978d](https://github.com/onnx/onnx/commit/7636978d)**: Fix collect_snippets warnings (#2277)
- **[fa70c33b](https://github.com/onnx/onnx/commit/fa70c33b)**: Update printable_graph in helper.py to output details of initializers that do not have matching graph inputs. (#2135)
- **[428d09b0](https://github.com/onnx/onnx/commit/428d09b0)**: test int64 input type for 'where' op (#2253)
Test Plan: ci
Reviewed By: bddppq
Differential Revision: D17353795
fbshipit-source-id: 6d4f39754863a30f427f4512c7b228e45d3ce84f
* Add fusion for quantized linear (#25624)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25624
First fuse the split ops into `aten::linear`, then fuse
`dequant - aten::linear - quant` into the quantized linear op.
Test Plan:
python test/test_jit.py 'TestJit.quant_fusion'
Imported from OSS
Differential Revision: D17208891
fbshipit-source-id: 864b19fabab2e8e6f8f8ad35eb3dbbf2d5fdb8c4
* Implement tensor.refine_names (#25842)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25842
`tensor.refine_names(*names)` takes `tensor` and attempts to name its
dimensions `names` out-of-place. If a dimension `i` already had a name,
then it cannot be changed (so tensor.names[i] must equal names[i]);
if the original dimension did not have a name, then the new name
(names[i]) can be anything.
`tensor.refine_names(*names)` also accepts a glob '*' that greedily selects
names from `tensor`. Here are some examples:
- `Tensor[None].refine_names('N') -> Tensor[N]`
- `Tensor[N].refine_names('N') -> Tensor[N]`
- `Tensor[N].refine_names('D') -> Error!`
- `Tensor[N].refine_names(None) -> Error!`
- `Tensor[None, None].refine_names('*', D) -> Tensor[None, D]`
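The core rule can be sketched in plain Python (glob handling omitted; this models the semantics above, not the C++ implementation):

```python
def refine_names(old_names, new_names):
    # a dim that already has a name must keep it; unnamed (None) dims
    # may take any name
    if len(old_names) != len(new_names):
        raise RuntimeError("number of names must match number of dims")
    for old, new in zip(old_names, new_names):
        if old is not None and old != new:
            raise RuntimeError(f"cannot refine name {old!r} to {new!r}")
    return list(new_names)

print(refine_names([None, "N"], ["C", "N"]))  # ['C', 'N']
```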
Test Plan: - new tests [namedtensor ci]
Differential Revision: D17255548
Pulled By: zou3519
fbshipit-source-id: fdbdb3a12f24fbe37ce1e53ed09dc8a42589d928
* Implement tensor.align_as(other), change tensor.align_to(names) (#25843)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25843
`tensor.align_to(*names)` permutes the dimensions of `tensor` and adds
additional 1-sized dimensions such that the output tensor has dimensions
in the same order as `names`. All dimensions of `tensor` must be
present in `names`, in addition, this function requires that all dims of
`tensor` be named.
`tensor.align_as(other)` is equivalent to
`tensor.align_to(*other.names)`.
I'm planning on changing `torch.align_tensors(*tensors)` to align closer
to these semantics because there didn't seem to be a clear use case for the old
semantics that preserve unnamed dimensions. That will come in a future
change.
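A minimal model of the resulting shape (pure Python, just to illustrate the semantics described above; not the actual implementation):

```python
def align_to(names, sizes, target):
    # every dim of the tensor must be named and appear in the target order
    assert all(n is not None for n in names), "all dims must be named"
    assert set(names) <= set(target), "all dims must be present in target"
    # permute to target order, inserting 1-sized dims for missing names
    return [sizes[names.index(t)] if t in names else 1 for t in target]

print(align_to(["C", "N"], [3, 5], ["N", "C", "H"]))  # [5, 3, 1]
```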
Test Plan: - new tests [namedtensor ci]
Differential Revision: D17255549
Pulled By: zou3519
fbshipit-source-id: 1e437ad81e9359b4d5bd0e7e64c3a1be441fc3e3
* C++ API parity: at::Tensor::data
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26008
Test Plan: Imported from OSS
Differential Revision: D17343488
Pulled By: pbelevich
fbshipit-source-id: b9ba5e26cad621a428a14292446d7fb5a6e5535d
* Fix bug with named tensors and (no) tracer support (#26106)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26106
Previously, in the named tensors build, an operator is marked as
non-traceable if ANY of its overloads are named tensor overloads. This
breaks the tracer for things like torch.full (has a names= overload for
named tensor) and tensor.sum (has a Dimname overload for named tensor).
This PR fixes the problem by putting the "no tracer support" logic into
the location where the tracer attempts to construct a graph by adding a
Dimname/DimnameList argument to a node.
Test Plan:
- new test in test_jit.py to check if torch.full is traceable
- new test in test_namedtensor.py to check what happens when someone
tries to trace a function that uses named tensor APIs.
- [namedtensor ci]
Differential Revision: D17353452
Pulled By: zou3519
fbshipit-source-id: b0b843c8357ffe54baee6e8df86db914f0b1ece4
* Add data field to Tensor pyi. (#26093)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26093
Signed-off-by: Edward Z. Yang
Test Plan: Imported from OSS
Reviewed By: vsiles
Differential Revision: D17366320
Pulled By: ezyang
fbshipit-source-id: 025f1c3d75d294fc1b51ddc540e542a05dc72b6a
* Change schedulers to chainable form (#24352)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24352
Enable chainable schedulers as requested in #13022 by implementing the changes mentioned below from [comment](https://github.com/pytorch/pytorch/pull/21800#issuecomment-513370208).
* Changing the behavior of schedulers to the chainable formula when available
* Using the closed form, with a deprecation warning, whenever `epoch` is different from None, until the next release
* Making `get_computed_values` the supported way of obtaining the last computed learning rate by the scheduler (see [comment](https://github.com/pytorch/pytorch/pull/21800#issuecomment-513940729) for new syntax)
* Returning a deprecation warning when invoking the undocumented get_lr function (see [comment](https://github.com/pytorch/pytorch/pull/21800#discussion_r294305485)) referring to `get_computed_values`, and deprecating it in the next release.
* `CosineAnnealingWarmRestart` still takes an epoch parameter as it is the only one with a mechanism relying on fractional epochs
* `MultiplicativeLR` consumes a function providing the multiplicative factor at each epoch. It mimics `LambdaLR` in its syntax.
# #20527
### Before
The user calls scheduler with a constant epoch either across loops or in the same loop.
```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3, 3, 3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)

# Scheduler with sometimes-constant epoch number
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    lr_scheduler.step(epoch)
    print(optimizer.param_groups[0]['lr'])
```
### After
If the user wants to step only when the epoch number changes:
```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3, 3, 3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)

last_epoch = -1
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    # Check if epoch number has changed manually
    if epoch - last_epoch > 0:
        lr_scheduler.step()
    last_epoch = epoch
    print(epoch, lr_scheduler.get_computed_values())
```
# #22107
### Before
```
import torch
from torchvision.models import resnet18

net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)

for i in range(10):
    # Scheduler computes and returns new learning rate, leading to unexpected behavior
    print(i, scheduler.get_lr())
    scheduler.step()
```
### After
```
import torch
from torchvision.models import resnet18

net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)

for i in range(10):
    # Returns last computed learning rate by scheduler
    print(i, lr_scheduler.get_computed_values())
    lr_scheduler.step()
```
Test Plan: Imported from OSS
Differential Revision: D17349760
Pulled By: vincentqb
fbshipit-source-id: 0a6ac01e2a6b45000bc6f9df732033dd81f0d89f
* Run PyTorch macOS CPU-only build/test on all PRs
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26096
Test Plan: Imported from OSS
Differential Revision: D17366419
Pulled By: pietern
fbshipit-source-id: 138659dae346aad3cde52d488cd1780614e7692f
* Use CircleCI commands for brew update/install (#26159)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26159
The snippets for working with Homebrew were duplicated across binary
builds, macOS builds, and iOS builds. In #25336, the CircleCI
configuration version was updated to version 2.1, which supports
parameterized commands. This means we no longer have to use YAML
tricks to duplicate stanzas and instead can natively define a series
of reusable steps.
Motivation for doing this is that the macOS binary builds were still
using the slow `brew update` instead of `git fetch` (see #25988).
[test macos]
[test wheel]
Test Plan: Imported from OSS
Differential Revision: D17366538
Pulled By: pietern
fbshipit-source-id: 194c0f37c1dc999705f3ba97fdabf4ff18728d93
* Turn should_run_job into command
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26160
Test Plan: Imported from OSS
Differential Revision: D17366539
Pulled By: pietern
fbshipit-source-id: a870d6da21925764986c6c748ad291440b78e6fd
* Turn setup_linux_system_environment into command
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26162
Test Plan: Imported from OSS
Differential Revision: D17366537
Pulled By: pietern
fbshipit-source-id: 98413daa344812f06578c3373d8516292d2f21f5
* Turn setup_ci_environment into command
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26163
Test Plan: Imported from OSS
Differential Revision: D17366536
Pulled By: pietern
fbshipit-source-id: 07181a77aaeba5457aa716ceac9cc404aacefe5f
* Kill most defaults in Declarations.cwrap. (#25610)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25610
They don't do anything anymore, since this isn't the end-user interface.
Test Plan: Imported from OSS
Differential Revision: D17172495
Pulled By: gchanan
fbshipit-source-id: a380d970f0836ed85eb9ac2aa42eb73655d775aa
* Get rid of more defaults in Declarations.cwrap.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25611
Test Plan: Imported from OSS
Differential Revision: D17172493
Pulled By: gchanan
fbshipit-source-id: 0f4319f8024ac4eca62576231214227b341f56c4
* Kill remaining defaults in Declarations.cwrap.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25612
Test Plan: Imported from OSS
Differential Revision: D17172499
Pulled By: gchanan
fbshipit-source-id: f99e813a4a90e8576541da317027e6f8ae76079b
* Remove requests as dependency (#26083)
Summary:
local build is slow... test in CI...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26083
Differential Revision: D17346949
Pulled By: ailzhang
fbshipit-source-id: f552d1a4be55ad4e2bd915af7c5a2c1b6667c446
* Fix 'in' return true incorrectly (#24156)
Summary:
Because `__contains__` returned `NotImplemented`, `in` incorrectly returned True when the element is not a number:
`bool(NotImplemented) == True`
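A minimal reproduction of the pattern and its fix (illustrative class, not the actual tensor code):

```python
class NumberRange:
    def __init__(self, items):
        self.items = items

    def __contains__(self, x):
        # returning NotImplemented here would be truthy, so `in` would
        # incorrectly report True for non-numbers; return False instead
        if not isinstance(x, (int, float)):
            return False
        return x in self.items

print("a" in NumberRange([1, 2]))  # False
print(1 in NumberRange([1, 2]))    # True
```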
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24156
Differential Revision: D16829895
Pulled By: zou3519
fbshipit-source-id: 9d3d58025b2b78b33a26fdfcfa6029d0d049f11f
* guard dyndep with a lock (#26153)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26153
I suspect that our multithreaded test system causes an issue with dyndep if two places try to call InitOpsLibrary concurrently, so we guard it with a lock. This is just a guess-fix, as it is impossible to repro.
Test Plan: sandcastle
Reviewed By: bddppq
Differential Revision: D17361310
fbshipit-source-id: 596634a2098b18881abbd26a5a727a5ba0d03b6e
* Add documentation to logging
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26175
Differential Revision: D17371085
Pulled By: Krovatkin
fbshipit-source-id: ea06f4e16fc320940a299e8e1d4f4d7c76f5950a
* Fold quantize op into module (#25625)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25625
We want to fold the quantize op for weights/bias into module to avoid quantizing weights on the fly.
Test Plan:
python test/test_jit.py
Imported from OSS
Differential Revision: D17208889
fbshipit-source-id: 1854b8953b065855d210bc1166533c08ca264354
* Revert D17349760: Change schedulers to chainable form
Test Plan: revert-hammer
Differential Revision:
D17349760
Original commit changeset: 0a6ac01e2a6b
fbshipit-source-id: 41c2c136215dabc26cad5098a08eff2a2a29b715
* Use torch::from_blob instead of shareExternalPointer, nits (#25973)
Summary:
The main change is to switch at::Tensor creation from `torch::empty(torch::IntArrayRef(...))->ShareExternalPointer(...)` to `torch::from_blob(...)`.
Removed the explicit `device CPU` setting, as `at::TensorOptions` defaults to device CPU.
Also renamed local variables, removing the `input` prefix to make them shorter.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25973
Differential Revision: D17356837
Pulled By: IvanKobzarev
fbshipit-source-id: 679e099b8aebd787dbf8ed422dae07a81243e18f
* Make schema part of RegisterOperators::Options (#26114)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26114
With this diff, the operator schema or name can be specified as part of the options objects:
```
static auto registry = torch::RegisterOperators()
.op(torch::RegisterOperators::options().schema("my_op").kernel(&kernel))
.op(...);
```
This does not break backwards compatibility, all old APIs are kept as shorthands.
This (a) makes the API more consistent, accumulating all options into the options object and no longer treating the schema specially, and (b) is required for allowing the c10 dispatcher to forward registration calls to ATenDispatch for ops that are still on that dispatcher; see the plan in https://github.com/pytorch/pytorch/issues/24132
ghstack-source-id: 90049402
Test Plan: unit tests
Differential Revision: D17350383
fbshipit-source-id: cbb8f33a52dccb2a4522753e7b5ac8ba35b908fd
* Allow overwriting catch-all kernels (#25947)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25947
Previously, the c10 dispatcher didn't allow having a catch-all kernel and backend-specific kernels at the same time.
That is also the long-term goal. But to make the current XLA implementation work, we need to allow XLA to overwrite these ops with its own variants.
This diff changes that so that ops can have both catch-all and backend-specific kernels, and the catch-all kernel is called if no more specific kernel is registered.
This is also the current behavior of globalATenDispatch.
ghstack-source-id: 90049398
Test Plan: unit tests
Differential Revision: D17293036
fbshipit-source-id: f2d5928e904c1dc9b6b89e9bb468debe48a4056c
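The dispatch rule described above, prefer a backend-specific kernel and fall back to the catch-all, can be modeled with a plain dictionary (a conceptual sketch, not the dispatcher's real data structures):

```python
CATCH_ALL = "catch-all"

class OpTable:
    def __init__(self):
        self.kernels = {}  # backend name (or CATCH_ALL) -> kernel

    def register(self, backend, kernel):
        # After this diff: a catch-all and backend kernels may coexist.
        self.kernels[backend] = kernel

    def dispatch(self, backend, *args):
        # Prefer the backend-specific kernel; fall back to the catch-all.
        kernel = self.kernels.get(backend) or self.kernels.get(CATCH_ALL)
        if kernel is None:
            raise RuntimeError(f"no kernel registered for backend {backend}")
        return kernel(*args)

op = OpTable()
op.register(CATCH_ALL, lambda x: ("generic", x))
op.register("XLA", lambda x: ("xla", x))
assert op.dispatch("XLA", 1) == ("xla", 1)      # backend-specific wins
assert op.dispatch("CPU", 1) == ("generic", 1)  # falls back to catch-all
```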
* Register ATen ops with c10 (#26131)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26131
Changes in this PR:
- For each operator with use_c10_dispatcher: True, additionally generate a c10 registration line in TypeDefault.cpp, CPUType.cpp, and other backend files.
- This doesn't change globalATenDispatch yet, the c10 registration is purely additional and the operator calling path doesn't change. A diff further up the stack will change these things.
- Enable the use_c10_dispatcher: True flag for ~70% of operators
- This also changes the c10->jit operator export because ATen ops are already exported to JIT directly and we don't want to export the registered c10 ops because they would clash
- For this, we need a way to recognize if a certain operator is already moved from ATen to c10, this is done by generating a OpsAlreadyMovedToC10.cpp file with the list. A diff further up in the stack will also need this file to make sure we don't break the backend extension API for these ops.
Reasons for some ops to be excluded (i.e. not have the `use_c10_dispatcher` flag set to true):
- `Tensor?(a!)` (i.e. optional tensor with annotations) is not supported by the C++ function schema parser yet
- `-> void` in native_functions.yaml vs `-> ()` expected by the function schema parser
- out functions have a different argument order in C++ than in the JIT schema
- `Tensor?` (i.e. optional tensor) doesn't work nicely because it is sometimes represented as an undefined tensor and sometimes as None
- fixed-size arrays like `int[3]` are not supported in c10 yet
These will be fixed in separate diffs, and then the exclusion tag will be removed.
ghstack-source-id: 90060748
Test Plan: a diff stacked on top uses these registrations to call these ops from ATen
Differential Revision: D16603131
fbshipit-source-id: 315eb83d0b567eb0cd49973060b44ee1d6d64bfb
* Updating submodules
Summary:
GitHub commits:
https://github.com/facebook/rocksdb/commit/83a6a614e9bf5f3f06abc265b736e868acee498b
https://github.com/pytorch/fbgemm/commit/c8cac64995d8d8af871e461affbf505ac7fce4d8
Test Plan: n/a
Reviewed By: 2d2d2d2d2d
fbshipit-source-id: 1f5bc1e065fe13d89eeb42539f21a8ab0ab8b8a1
* Nightly build for iOS (#26074)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26074
### Summary
This PR creates a nightly job for iOS builds. The job will generate a couple of static libraries that contain three architectures (x86, arm64, armv7s) and upload them to AWS S3.
### Note
The test phase in this job is missing right now, meaning that if there is a linking error, we won't know about it. To add test jobs, we have to put a dummy test app in the repo and manually link the libraries to the app after each build finishes. This will be done in following PRs.
Test Plan: Imported from OSS
Differential Revision: D17363066
Pulled By: xta0
fbshipit-source-id: 5beeb4263af5722f0a852297023f37aaea9ba4b1
* Change the source link in podspec (#26089)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26089
### Summary
A couple of changes
1. Replace the source link with the new nightly build address
2. Remove module support for Swift and Objective-C
3. Expose all static libraries instead of archiving them into one single library. This is because those static libraries might contain object files that have the same name, e.g. `init.c.o` in both `libcpuinfo.a` and `libqnnpack.a`. If we archive them into one using the `libtool -static` command, by default it only picks one object file and discards the others, which could result in undefined symbols when linking the executable. The change here is to expose all the static libraries and let the linker decide which ones to use.
### Test Plan
- pod spec lint succeed
- `pod spec lint --verbose --allow-warnings --no-clean --use-libraries --skip-import-validation`
Test Plan: Imported from OSS
Differential Revision: D17363037
Pulled By: xta0
fbshipit-source-id: ba77b0001b58e6e2353d8379d932db598166d37d
* Updating submodules
Summary:
GitHub commits:
https://github.com/facebook/rocksdb/commit/97631357aa274d06a7ab09b3cde7b909262cc4dd
https://github.com/pytorch/fbgemm/commit/2f1477dfee9465c1e2dbdf21722970b3fa1baf86
Test Plan: n/a
Reviewed By: 2d2d2d2d2d
fbshipit-source-id: 33029d2e8c6a3664a35823829670f6ed9dfc3b44
* Tensor renaming to dtype, shape; support long, double (#26183)
Summary:
Applying dzhulgakov's review comments to org.pytorch.Tensor:
- dims renamed to shape
- typeCode renamed to dtype
- numElements renamed to numel
- newFloatTensor, newIntTensor, ... replaced with newTensor(...)
Added support for dtype=long and double.
Reordered in code as byte, int, float, long, double; for if conditions the order is float, int, byte, long, double, as I expect the float and int branches to be taken most often.
Tensor.toString() does not include the data, only numel (the data buffer capacity).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26183
Differential Revision: D17374332
Pulled By: IvanKobzarev
fbshipit-source-id: ee93977d9c43c400b6c054b6286080321ccb81bc
* use whitelist for selecting observed values (#25974)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25974
Previously we observed all Tensor values, but what we actually want is to observe
only the ones that can be quantized.
Test Plan:
python test/test_jit.py
python test/test_quantizer.py
Imported from OSS
Differential Revision: D17348986
fbshipit-source-id: 55be0d73862a0e7eb1e7fd882d16e0d830618b63
* fix circle CI
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26225
Test Plan: Imported from OSS
Differential Revision: D17379899
Pulled By: xta0
fbshipit-source-id: 4077aa0149b23560f3a9e29531ca9bc612a2c09c
* Add histogram observer (#23959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23959
Add histogram observer that records the running histogram of tensor values along with min/max values.
ghstack-source-id: 90076996
Test Plan:
Added a test test_histogram_observer
buck test mode/dev caffe2/test:quantization -- 'test_histogram_observer'
buck test mode/dev caffe2/test:quantization -- 'test_observer_scriptable'
Differential Revision: D16692835
fbshipit-source-id: 0f047d3349cb9770fad4a2b6cb346c51d9e99cd4
* Add isBackwardCompatibleWith for Argument and FunctionSchema (#23409)
Summary:
we intend to be conservative, and will relax the checks in future if necessary.
So far, we consider the following three conditions as backward compatible:
1) the two schemas are equal
2) the two schemas have the same number of arguments, and this schema's
arguments are backward compatible with the corresponding ones in the
argument list of old_schema.
3) this schema has m arguments, old_schema has n arguments, m > n,
the first n arguments of this schema are backward compatible with
the corresponding arguments of old_schema, and the remaining arguments
must be either OptionalType or provide default values.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23409
ghstack-source-id: 90111021
Test Plan: buck test //caffe2/test:function_schema
Reviewed By: hl475
Differential Revision: D16505203
fbshipit-source-id: e4099537776a60e8945e5c3cd57fa861f3598a9b
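The three conditions above can be sketched as a standalone check (a simplified model: arguments are represented as (name, has_default, is_optional) tuples rather than real FunctionSchema objects, and per-argument compatibility is reduced to equality):

```python
def arg_is_bc(new_arg, old_arg):
    # Simplified stand-in for Argument::isBackwardCompatibleWith.
    return new_arg == old_arg

def schema_is_bc(new_args, old_args):
    """new_args/old_args: lists of (name, has_default, is_optional)."""
    if len(new_args) < len(old_args):
        return False  # dropping arguments is never backward compatible
    # Conditions 1/2: the first n arguments must be pairwise compatible.
    if not all(arg_is_bc(n, o) for n, o in zip(new_args, old_args)):
        return False
    # Condition 3: extra trailing arguments must be optional or defaulted.
    return all(has_default or is_optional
               for _, has_default, is_optional in new_args[len(old_args):])

old = [("self", False, False)]
new = [("self", False, False), ("alpha", True, False)]
assert schema_is_bc(new, old)      # extra arg has a default -> BC
assert not schema_is_bc(old, new)  # dropping an arg is not BC
```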
* Creates generic device type testing framework (#25967)
Summary:
This PR addresses https://github.com/pytorch/pytorch/issues/24851 by:
1. letting device types easily register themselves for testing
2. letting tests be written to run on multiple devices and with multiple dtypes
3. providing a mechanism to instantiate those tests so they are discoverable and filterable by unittest and pytest
It refactors three tests from test_torch.py to demonstrate how to use it.
`test_diagonal` is the simplest example. Most tests just need to be modified to accept 'device' as an argument. The framework will then instantiate `test_diagonal_cpu` and `test_diagonal_cuda` (when CUDA is available) which call `test_diagonal` with the appropriate 'device' argument.
`test_neg` also has dtype variants. It accepts both 'device' and 'dtype' as arguments, and the dtypes it runs with are specified with the 'dtypes' decorator. Dtypes can be specified for all device types and particular device types. The framework instantiates tests like `test_neg_cpu_torch.float`.
`test_inverse` has device-specific dependencies. These dependencies are expressed with the sugary 'skipCUDAIfNoMagma' and 'skipCPUIfNoLapack' decorators. These decorators are device-specific, so CPU testing is not skipped if Magma is not installed, and their conditions may be checked before or after the test case has been initialized. This means that skipCUDAIfNoMagma does not initialize CUDA; in fact, CUDA is only initialized if a CUDA test is run.
These instantiated tests may be run as usual and with pytest filtering it's easy to run one test on all device types, run all the tests for a particular device type, or run a device type and dtype combination.
See the note "Generic Device-Type Testing" for more detail.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25967
Differential Revision: D17381987
Pulled By: mruberry
fbshipit-source-id: 4a639641130f0a59d22da0efe0951b24b5bc4bfb
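The instantiation mechanism can be approximated in a few lines: for each registered device type, the framework clones the generic test into a uniquely named, discoverable method (a sketch with made-up helper names, not the actual framework code):

```python
import unittest

def instantiate_device_type_tests(generic_cls, device_types):
    """Create test_<name>_<device> methods from test_<name>(self, device)."""
    for name in list(vars(generic_cls)):
        if not name.startswith("test_"):
            continue
        generic_test = getattr(generic_cls, name)
        for device in device_types:
            # Bind the generic test and the device via default arguments.
            def concrete(self, _test=generic_test, _device=device):
                return _test(self, _device)
            setattr(generic_cls, f"{name}_{device}", concrete)
        # Only the instantiated variants remain discoverable.
        delattr(generic_cls, name)
    return generic_cls

class TestOps(unittest.TestCase):
    def test_diagonal(self, device):
        assert device in ("cpu", "cuda")

instantiate_device_type_tests(TestOps, ["cpu", "cuda"])
assert hasattr(TestOps, "test_diagonal_cpu")
assert hasattr(TestOps, "test_diagonal_cuda")
assert not hasattr(TestOps, "test_diagonal")
```

unittest and pytest then discover `test_diagonal_cpu` and `test_diagonal_cuda` like any other test methods, so filtering by device suffix works out of the box.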
* adds sync to flaky test_events_multi_gpu_query (#26231)
Summary:
This test can sometimes fail in CI.
I suspect this flakiness is because the test asks a CUDA stream to record an event, fails to synchronize the CPU with that stream, then checks if the event is recorded on the CPU. There is no guarantee this will have happened.
This one-line change preserves the intent of the test while ensuring the GPU has recorded the event before the CPU queries it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26231
Differential Revision: D17382110
Pulled By: mruberry
fbshipit-source-id: 35b701f87f41c24b208aafde48bf10e1a54de059
* Added possible out of shared memory error message (#25730)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/5040
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25730
Differential Revision: D17226214
Pulled By: pbelevich
fbshipit-source-id: 92278272aab74e6690f14fc9597acfd1a98854b7
* Remove armv7s build from iOS (#26222)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26222
### Summary
The last generation of armv7s devices is the iPhone 5C. As discussed with David offline, we decided not to support iOS armv7s devices.
### Test plan
- CI finishes successfully
- Builds can be run only on X86_64 and arm64 devices
Test Plan: Imported from OSS
Differential Revision: D17385308
Pulled By: xta0
fbshipit-source-id: f883999aed18224ea3386b1f016964a33270fa34
* Back out "[quant][observer] Add histogram observer" (#26236)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26236
Original diff broke oss CI. Reverting.
Original commit changeset: 0f047d3349cb
ghstack-source-id: 90125990
Test Plan: testinprod
Reviewed By: hx89
Differential Revision: D17385490
fbshipit-source-id: 4258502bbc0e3a6dd6852c8ce01ed05eee618b1a
* Ports most of test_torch.py to generic device type framework (#26232)
Summary:
This PR moves many tests in test_torch.py to the generic device type framework. This means that many CUDA tests now run in test_torch.py and there is greater consistency in how tests for many device types are written.
One change is that all MAGMA tests are run on the default stream due to intermittent instability running MAGMA on the non-default stream. This is a known issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26232
Test Plan:
While this PR edits the tests itself, it was validated using two independent methods:
(1) The code was reviewed and it was verified that all deleted functions were actually moved.
(2) The output of the TestTorch CI was reviewed and test outputs were matched before and after this PR.
Differential Revision: D17386370
Pulled By: mruberry
fbshipit-source-id: 843d14911bbd52e8aac6861c0d9bc3d0d9418219
* Add type hint for cuda.set_rng_state (#26200)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/26199
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26200
Differential Revision: D17386885
Pulled By: soumith
fbshipit-source-id: 9da03aae29281b2ed691cbfdd7b85fde55e5b7ef
* Add a wrapper for inspect in JIT to produce better error message (#25415)
Summary:
If source code is not available due to packaging (e.g. sources are compiled to .pyc), TorchScript produces a very obscure error message. This change makes it nicer and allows the message to be customized by overriding _utils_internal.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25415
Test Plan: Really hard to unittest properly. Did one off testing by compiling to .pyc and checking the message.
Differential Revision: D17118238
Pulled By: dzhulgakov
fbshipit-source-id: 3cbfee0abddc8613000680548bfe0b8ed52a36b0
* Use MIOpen for transpose convolutions (#26172)
Summary:
Provides significant performance uplift where used.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26172
Differential Revision: D17374862
Pulled By: bddppq
fbshipit-source-id: 85d2df3c67b8935bc54f3a81a912a25c0102743a
* Call aten ops through c10 dispatcher (#23668)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23668
- The eager mode frontend now calls operators that are defined in native_functions.yaml with `use_c10_dispatcher: True` through the c10 dispatcher, and no longer through globalATenDispatch().
- These operators aren't registered with globalATenDispatch anymore, only with c10 now.
- Backend extensions calling globalATenDispatch().registerOp() to add their own kernels still work; this function will forward the registration to the c10 dispatcher for them.
ghstack-source-id: 90130455
Test Plan: benchmarks at https://docs.google.com/document/d/1gpzKZcFf1JJameY1vKxF7Cloul9s6D8HKIK2_Pp1hFo/edit#
Differential Revision: D16603133
fbshipit-source-id: 991f17b355e9c78c5e86fee4fa381df7ab98ac82
* Remove unboxedAutogradKernel from c10 (#26130)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26130
Since we now just use TensorTypeId::VariableTensorId, there's no need to treat autograd kernels any differently.
ghstack-source-id: 90130457
Test Plan: unit tests
Differential Revision: D17353873
fbshipit-source-id: d4468506a5366bc5e7429144b090b3e78af9de62
* Refines test_torch.py generic device testing (#26244)
Summary:
- Adds SkipCUDAIfRocm and skipCPUIfNoMkl decorators, ports corresponding tests
- Changes "SkipIf" input semantics for consistency
- Removes torchtest, which has been replaced with this new generic framework
- Refactors some common parts out of CUDA tests to TestTorchDeviceType
- Ensures all MAGMA tests run on default stream by putting the skipCUDANonDefaultStreamIf in the skipCUDAIfNoMagma decorator.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26244
Differential Revision: D17389060
Pulled By: mruberry
fbshipit-source-id: 1375774f24c2266049e6d4b899e7300ddf32eac8
* Fix Windows build (#26246)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26246
Broken due to https://github.com/pytorch/pytorch/issues/12117. Try fixing it.
ghstack-source-id: 90137033
Test Plan: waitforsandcastle
Reviewed By: zou3519
Differential Revision: D17387317
fbshipit-source-id: 705998c0b1608668d510b47f4fe20cecf5057c5f
* Fix CI (#26250)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26250
Exclude some ops from the c10 dispatcher that don't work with it yet.
ghstack-source-id: 90138046
Test Plan: waitforsandcastle
Reviewed By: zou3519
Differential Revision: D17390117
fbshipit-source-id: a87fb3048aeba2c3293b95d610ddb8e94369f8fe
* Back out "[pytorch][PR] Refines test_torch.py generic device testing" (#26252)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26252
Original commit changeset: 1375774f24c2
Testing to see if this is somehow the source of hangs on ROCm builds.
Test Plan: Change is to tests themselves. This diff is for testing the ROCm hang, however.
Differential Revision: D17390575
fbshipit-source-id: a6ffd5eb1df3971b99b6d42271a8d3d501ac79c6
* Fix namedtensor ci (#26257)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26257
In native_functions.yaml, all overloads must have unique overload names.
This PR fixes `flatten` to have unique names for the overloads.
Test Plan: - tested locally, but also [namedtensor ci]
Differential Revision: D17391243
Pulled By: zou3519
fbshipit-source-id: aaef654953b4275c43b9d7bd949c46bd011f6c73
* Switch to the new profiler infrastructure (#26174)
Summary:
The ones supported going forward are rocprofiler and roctracer.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26174
Differential Revision: D17387538
Pulled By: bddppq
fbshipit-source-id: 19d9828d9d07b5073ab5fa288e24fd65a8b18b52
* Fix binary size of OpsAlreadyMovedToC10.cpp (#26237)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26237
Calling a lot of `std::string` constructors is horrible for binary size, see t53997334.
Using `const char*` instead should make the binary size much smaller.
ghstack-source-id: 90145501
Test Plan: size checks on the diff
Differential Revision: D17386002
fbshipit-source-id: c5420adf225e535396e806a0df92419a7e2ad3e8
* Fix no auto batching bugs: cannot bulk load; not work with namedtuple (#26065)
Summary:
see title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26065
Differential Revision: D17392851
Pulled By: soumith
fbshipit-source-id: 468cd41c8e03d689ff2e0261d948e28daad6bfaf
* Upgrade MKLDNN to v0.20.5 (#25757)
Summary:
1. Fix issues exposed by the reports below:
https://github.com/pytorch/pytorch/issues/25242
https://github.com/pytorch/pytorch/issues/25101
https://github.com/pytorch/pytorch/issues/23825
2. Fix RNN support issue in mkldnn-bridge
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25757
Differential Revision: D17367948
Pulled By: VitalyFedyunin
fbshipit-source-id: d8430d3909ecbf853afa0ce3d968735f86f1da31
* fix hypothesis timeout (#26280)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26280
ghstack-source-id: 90160270
Test Plan: testinprod
Differential Revision: D17396861
fbshipit-source-id: ee2348ffa7f6092e2c5647a42d0e17879dcfacd0
* Migrate away from using Variable( in test_nn.py (#26077)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26077
As per #26071, we would like to get rid of the calls to Variable(
where possible. This diff removes the calls in the test file test_nn.py. The
unit tests should all still pass as expected.
ghstack-source-id: 90086624
Test Plan: tests in `test_nn.py` should all pass.
Differential Revision: D17336484
fbshipit-source-id: 43fc7bd0b0be835ae89d06162ce1cbe4e0056d91
* Enabled conv methods for bfloat16
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26167
Differential Revision: D17367728
Pulled By: izdeby
fbshipit-source-id: 0a7bd9a6dbc15815af195d644c9372af2135e93a
* Move the CUDA implementation of round to ATen. (#25041)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25041
Fix #24617
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25041
Test Plan: Imported from OSS
Differential Revision: D17114368
Pulled By: VitalyFedyunin
fbshipit-source-id: 6ec6ef99b4451acd7e93491fd4b44fca9ce1809d
* Whitelist and fusion support for quantized::linear - addmm (#26208)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26208
Supporting `addmm` -> `quantized::linear` quant fusion
Test Plan:
python test/test_jit.py 'TestJit.test_quant_fusion'
Imported from OSS
Differential Revision: D17380074
fbshipit-source-id: fae88f118f85663d777648695768b0504ed7ccf9
* Whitelist and fusion support for quantized::linear - matmul (without bias) (#26209)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26209
Support quant fusion for `matmul`(without bias) -> `quantized::linear`
Test Plan:
python test/test_jit.py 'TestJit.test_quant_fusion'
Imported from OSS
Differential Revision: D17380075
fbshipit-source-id: 290caee7f7bcf94d2731c0ee9bd40054f0fb9b07
* Updating submodules
Summary:
GitHub commits:
https://github.com/facebook/mcrouter/commit/653434b898ea35810d7369d0911e3bdab9a1c3ac
https://github.com/facebook/proxygen/commit/b74fbefc1a69de78989f540d9d0d312945aeadeb
https://github.com/facebook/rocksdb/commit/9bd5fce6e89fcb294a1d193f32f3e4bb2e41d994
https://github.com/facebookincubator/mvfst/commit/6efcef720fac04011708840b89d1f174d3f290d0
https://github.com/facebookresearch/pytorch-biggraph/commit/cb7830b6b30d2d24b591178705eaf9e8209ecd09
https://github.com/pytorch/fbgemm/commit/53f0c0d175ae4283609a5b251052f9c6598b8aee
Test Plan: n/a
Reviewed By: yns88
fbshipit-source-id: 78d0e24f5601aa990391a2404ae9d23b325de93f
* Add ProcessGroupGloo::createDefaultDevice (#26166)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26166
There were two variants to create a new device: one based on the
name of a network interface, and one based on a hostname or
address. In the latter, if the address was not specified, it would
look up the local hostname and try to resolve that. If that failed, the
process would crash.
In this default path, we now try to look up and use the local hostname,
and if that fails we fall back to using the loopback address.
If the local hostname doesn't resolve to an address that we can bind
to, it is very likely that this process won't join other processes
over the network, and that the user is trying to run a local test.
If this assumption is wrong, the user can override the default
interface selection by setting the environment variable
`GLOO_SOCKET_IFNAME` to the name of the external network interface.
I tested this by changing the local hostname to a bogus name and
confirmed that default initialization works as expected.
Closes #26049.
Test Plan: Imported from OSS
Differential Revision: D17397898
Pulled By: pietern
fbshipit-source-id: 95a2467761d89df87b520d6e5837b92184b0dc12
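The fallback logic can be sketched with Python's socket module (a conceptual analogue; the real implementation is C++ inside ProcessGroupGloo):

```python
import socket

def resolve_default_address():
    """Try to resolve the local hostname first; fall back to the loopback
    address instead of crashing when the hostname doesn't resolve
    (mirrors the described default-device behavior)."""
    try:
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        # Likely a local test setup with a bogus hostname; loopback is the
        # safe default. A real multi-host run can override via
        # GLOO_SOCKET_IFNAME.
        return "127.0.0.1"

addr = resolve_default_address()
# Whatever we got must at least be a non-empty address string.
assert isinstance(addr, str) and addr
```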
* Disable broken unit tests (#26301)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26301
-
ghstack-source-id: 90176419
Test Plan: waitforsandcastle
Differential Revision: D17400971
fbshipit-source-id: b6f9cb27fe955b0200d62591300c70ba79a90e5f
* Kill defaults in nn.yaml. (#26282)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26282
Since this isn't the end-user API anymore, we shouldn't have defaults.
Test Plan: Imported from OSS
Differential Revision: D17397153
Pulled By: gchanan
fbshipit-source-id: d44040bec0ee9c70734a53ebcc10a96f12226a29
* Upgrade Caffe2 docker images to 306 to include roctracer and rocprofiler
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26260
Differential Revision: D17391902
Pulled By: bddppq
fbshipit-source-id: 89ab3dedf05ba398acb7300fac95f03cfb31f0ba
* Whitelist and fusion support for quantized::linear - matmul (with bias) (#26204)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26204
Support quant fusion for `matmul` with bias to `quantized::linear`.
Test Plan:
python test/test_jit.py 'TestJit.test_quant_fusion'
Imported from OSS
Differential Revision: D17380073
fbshipit-source-id: 00014469a852cc5d5b66469fc4b8d05eafba1e3e
* Add __s390x__ compiler define for s390 builds. (#26233)
Summary:
PyTorch builds fail on the s390 architecture because
the ifdef macros in simd.h default to an x86 asm instruction.
This patch adds an ifdef __s390x__ to enable building on s390.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26233
Differential Revision: D17392714
Pulled By: soumith
fbshipit-source-id: 037672bfea64fc5e52da2390d93b973534137c12
* Clarified ambiguous docstring in NegativeBinomial
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25923
Differential Revision: D17392848
Pulled By: soumith
fbshipit-source-id: 2833e72fe449c74dfd8273a7b1eb46c05c63d999
* Dynamic quantization for bias. (#26057)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26057
bias is now unquantized (i.e. floating type) for qconv and qlinear. It is dynamically quantized by fbgemm.
TODO: Add some performance numbers.
Tests:
test:quantization
```
Summary (total time 8.41s):
PASS: 24
FAIL: 0
SKIP: 0
FATAL: 0
TIMEOUT: 0
OMIT: 0
More details at https://our.intern.facebook.com/intern/buck/build/74d5f6f7-55c9-4350-a618-2013042fffd8
```
test:quantized
```
Summary (total time 13.21s):
PASS: 43
FAIL: 0
SKIP: 5
caffe2/test:quantized - test_qnnpack_maxpool2d (test_quantized.TestQNNPackOps)
caffe2/test:quantized - test_compare_tensor_scalar (test_quantized.TestComparatorOps)
caffe2/test:quantized - test_qnnpack_linear (test_quantized.TestQNNPackOps)
caffe2/test:quantized - test_qnnpack_relu (test_quantized.TestQNNPackOps)
caffe2/test:quantized - test_qnnpack_add (test_quantized.TestQNNPackOps)
FATAL: 0
TIMEOUT: 0
OMIT: 0
```
ghstack-source-id: 90166254
Test Plan:
buck test mode/dev caffe2/test:quantization
buck test mode/dev caffe2/test:quantized
Differential Revision: D17328028
fbshipit-source-id: d4a163d730d0f4a03e8e0faf7420710cf36eec09
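The scheme described above, keeping the bias in float and quantizing it at run time with scale input_scale * weight_scale, can be illustrated numerically (a simplified sketch of what FBGEMM does internally, with a hypothetical helper name):

```python
def dynamically_quantize_bias(bias_fp32, input_scale, weight_scale):
    """Quantize a float bias to int32 at op run time (sketch).
    The bias scale is tied to input_scale * weight_scale so the quantized
    bias can be added directly to the int32 accumulator of the matmul."""
    bias_scale = input_scale * weight_scale
    return [round(b / bias_scale) for b in bias_fp32]

bias = [0.5, -1.25, 2.0]
q_bias = dynamically_quantize_bias(bias, input_scale=0.1, weight_scale=0.05)
# bias_scale = 0.005 -> quantized values [100, -250, 400]
assert q_bias == [100, -250, 400]
```

Because the bias scale depends on the input scale, which is only known at run time for dynamic quantization, the bias cannot be pre-quantized the way the weights can.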
* Use expected_wrapper only if CMAKE_{C,CXX}_COMPILER is not set by user (#26306)
Summary:
This will honor the user's preference.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26306
Differential Revision: D17408030
Pulled By: soumith
fbshipit-source-id: 6841b805603d40cd7caf78dbb42405a0c931f052
* Add derivative of cholesky_solve (#26185)
Summary:
Changelog:
- Add derivative of cholesky_solve. The equations are derived akin to the derivative of solve methods using the technique detailed [here](https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=2ahUKEwiXrOjIyM7kAhWstlkKHRxqCDgQFjAAegQIAhAC&url=https%3A%2F%2Fpeople.maths.ox.ac.uk%2Fgilesm%2Ffiles%2FNA-08-01.pdf&usg=AOvVaw0BNISOvM_I9KjPrl0xv1R_)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26185
Test Plan:
- Added tests for cholesky_solve in test_autograd.py
Closes half of https://github.com/pytorch/pytorch/issues/4669.
Differential Revision: D17408123
Pulled By: soumith
fbshipit-source-id: f9668c8d4d758c0dc658941a8b730a17683091aa
* Kill 'default_init', which isn't needed anymore.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26281
Test Plan: Imported from OSS
Differential Revision: D17397097
Pulled By: gchanan
fbshipit-source-id: fb53e90637a3dfb2300fca78f414abe2d82832f3
* Export round (#26126)
Summary:
Added round export in opset 11
Pull Request resolved: https:…