* [mpscnn] MPSCNNChannelShuffle
As titled.
* [Easy] Adding tags as an argument to the functional layer
Without it "tags" would be added as an argument to the operator.
The change here is based on the assumption that there is no operator that takes "tags" as an argument.
* Fix locally_connected_op schema check.
* [C2] Add TypeAndShape inference for a few more operators
As desc
* [c2] Shape inference should support 0 as dimension
Tensors can have 0 as one of their dimensions.
* Make MockHiveReader loop over and support max_examples
Replace DatasetReader with RandomDatasetReader so that MockHiveReader can simulate a large data input using a small sample file as the source.
* Utility function to wipe cache between benchmark runs
Caffe2 benchmark does not wipe out the cache between runs, and this potentially creates an unrealistically optimistic picture of performance. This diff adds a utility function to wipe out the cache.
* Allow caffe2 GlobalInit to be invoked multiple times
Allow caffe2 GlobalInit to be invoked multiple times. Will re-parse gflags and update logging levels on successive invocations, but will not re-run init functions or perform other one-time initialization.
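For illustration, a minimal sketch of the relaxed behavior from the Python bindings, assuming the usual `caffe2.python.workspace` wrapper (the flag values are only examples):
```
from caffe2.python import workspace

# First call performs the one-time initialization and parses the flags.
workspace.GlobalInit(['caffe2', '--caffe2_log_level=0'])

# A later call no longer fails: it re-parses the flags and updates the
# logging level, but does not re-run registered init functions.
workspace.GlobalInit(['caffe2', '--caffe2_log_level=-1'])
```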
* Add Caffe2 GlobalInitIsCalledGuard to base net and operator classes
Warn if caffe2's GlobalInit function has not been invoked before creating an operator or net object. This is based on discussion here: https://fb.quip.com/kqGIAbmK7vNG
* Rethrow current exception on failure
Rethrow current exception instead of copy constructing a new one on op failure.
* Make `clone()` return subclass of List/Struct
`clone()` does not work correctly when we subclass these classes.
* Wipe the cache before the net run
The util function is copied from D7409424; will rebase once D7409424 is landed.
* [Caffe2] [Mobile] Support utils/cast.h::GetCastDataType with LITE_PROTO builds
* Correct includes
async_polling include -> async_base include
* Prepare execution flags for executor migration
Making async_scheduling aware of underlying net type to prepare for executor
migration
* Add operator level observers into async executor
Adding operator level observers into RunAsync operators' calls
* Cleanup TEST_Benchmark
Remove duplicate code and provide default implementation in NetBase
* [C2] Fix type and shape inference for binary comparison ops
As desc.
* Add GlobalInit to predictor to ensure initialization is always done before prediction
FACEBOOK:
Redo D7651453 the correct way.
Now use a static variable for the arguments passed to GLog
* Remove spammy log message
This method is currently used in various places inside Caffe itself.
* Disable events for operators inside a chain
We don't need to use events in operators within a chain because the chain is
always scheduled on a single stream; we keep only the first and last events for
scheduling purposes.
* Ensure correct finish run order
In rare cases we might call finishRun and trigger the net's destruction while
another worker is still holding a shared_ptr to a thread pool; this can cause
the thread pool to be destroyed from within a worker thread if no other nets are
using the pool. This diff fixes the order of calling finishRun and also changes
pool() to return a raw pointer, keeping ownership of the pool within the net.
* Reduce unnecessary polling
Make sure we don't waste CPU by polling operators that we can set efficient
callbacks on.
* Squash commit of syncing 9506eeb from github to fbcode
Patch xplat buck fix
add virtual destructor to OptimizationPass
build fixes for sync
* Fix net tracing
Fix net tracing from async_scheduling
* Fix logging
* Fix handling of empty batches in SumReduceDimsOp
As titled
* Deferrable async_scheduling finishRun fix
Proper order of finishing run operations in deferrable_async_scheduling net
* Simplify exception handling in async_scheduling
Simplify exception handling, no need to busy wait, thread that processes the
last task can finish the run
* [C2] worker_coordinator_memorize_worker_ids
As titled. This is related to T28689868, where the number of blobs we want to create is equal to the number of worker ids
* Add unit test for nets with no type set
* Ignore total length argument in symbolic_pad_packed_sequence
1- There was a mistake in the code: total_length was added to the wrong symbolic function (pack_padded_sequence) instead of pad_packed_sequence.
2- No need to throw an exception if total_length is given, since it is only used to enable data_parallel training on multi-GPUs and doesn't have anything to do with ONNX export, so just ignore it. https://fburl.com/tk4gciqp
* Add support for MKLDNN to async_scheduling
Just add MKLDNN as a possible CPU option to async_scheduling's pool function
* [AuFL][ensemble] support branch output for prediction
This diff supports using predictions from different branches and thus enables model ensembling (not fully independent).
* Fix a bug in add_loss in layer_model_helper
As titled.
* Support lradaption for adam
1. lr adaption operator
2. apply to dense adam
* Perf tweaks for async_scheduling
Restore single pool option + remove unnecessary (no-ops) calls
* add quantization to SparseSimdAdagradOp
add a bunch of quantization signatures to SparseSimdAdagradOp, implementations to come next
* [sr] [codemod] Change all SR callsites to use new API
@allow-large-files
This diff refactors all callsites of SR to use the slightly changed API introduced in the diff below. Really what this means is that you need to include the correct header. Also if you were using `ClientFactory::newFactory` you need to not prefix it with `ClientFactory::`.
```
cd ~/fbsource/fbcode
find ./ -type f -exec sed -i -e 's:#include "servicerouter/client/cpp2/ClientFactory.h":#include "servicerouter/client/cpp2/ServiceRouter.h":' -e 's:#include <servicerouter/client/cpp2/ClientFactory.h>:#include <servicerouter/client/cpp2/ServiceRouter.h>:' -e 's/ClientFactory::newFactory(/newFactory(/g' {} \;
```
Also manually fixed spots that couldn't be done automatically (or broke because they depended on transitive includes).
* Back out "Fix handling of empty batches in SumReduceDimsOp"
Original commit changeset: 282da1730cc2. This commit is blocking the
GitHub->fbcode sync, which really needs to get merged ASAP. D7881937, which this
diff depends on, will be reverted in the sync D7990948, which causes this to
break. The sync diff cannot be patched with this reversion because it must be
landed against base revision 5c8c099, and D7881937 must not be included in the
sync diff because it is breaking GPU tests that are not available in sandcastle:
https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-cuda8.0-cudnn6-ubuntu16.04-test/3638/console
for one example.
* Add the flow to support operator benchmark
1) generate model with the operator
2) upload to everstore
3) generate model spec into json file
4) start running the benchmark
* [tum][gpu] Connect DPM trainer with flow and unit tests
This diff:
- Fix some small bugs in Yiming's recent changes to the parallelizer, so it suits real use cases.
- Add correct tags to the TUM code, so we can do the data parallel transform.
- Pass extra info at instantiation.
- Add a unit test for using DPM in the TUM model.
After this diff, we can do simple box, multi-gpu fully-sync trainer for TUM in Fblearner workflow, but may still need to do speed benchmarking.
* w/o normalized lradaption for adam dense only
The previous lr adaption includes a normalization step when performing the dot product operation. This is not exactly the same as what is proposed in the paper. I add normalization as an option: without it, the operator performs exactly what the paper proposes; with it, the normalization step is added.
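For intuition only, a small numpy sketch of the dot-product step in its two variants (the names `grad` and `effective_grad` are illustrative, not the operator's actual input names):
```
import numpy as np

def lr_correlation(grad, effective_grad, normalized):
    """Dot product between two gradient tensors, optionally normalized.

    normalized=False matches the plain dot product from the paper;
    normalized=True divides by the product of the norms (a cosine similarity).
    """
    dot = float(np.dot(grad.ravel(), effective_grad.ravel()))
    if not normalized:
        return dot
    denom = float(np.linalg.norm(grad) * np.linalg.norm(effective_grad)) + 1e-12
    return dot / denom

g = np.array([0.1, -0.2, 0.3])
m = np.array([0.05, -0.1, 0.2])
print(lr_correlation(g, m, normalized=False), lr_correlation(g, m, normalized=True))
```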
* [fb] Use SharedPromise in DeferrableAsyncSchedulingNet
This code is to simplify DeferrableAsyncSchedulingNet by removing condition
variable + small fixes
* [tum] implement cuda sparseLengthsMean and LengthsMean
As titled.
* Adding an optional parameter to allow use of protobufs in InferShapesAndTypes function.
* Move feature_to_index to FeatureSpec.feature_to_index
Move feature_to_index to FeatureSpec.feature_to_index to avoid overriding other fields.
* [Caffe2] Rename bytes_moved to bytes_written
Just a rename in preparation for supporting bytes_read.
* [c2] fix ReduceFrontSumOp for empty case by setting 0
Otherwise, it may reuse the results from the last iteration when the batch is empty.
* [Caffe2] [Int8] Improve Intel CPU performance
* [Easy] Improve PrependDim op logging
as titled
* DBFileReader expand db_path using os.path.expanduser(..)
Since there are a lot of possible use cases of `DBFileReader` reading from a user home path, like `~/local/sample.db`, I want to save people the trouble of calling `os.path.expanduser(db_path)` themselves.
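In other words, roughly the following inside the reader (a sketch only; the real constructor takes more arguments):
```
import os

class DBFileReader(object):  # simplified stand-in, not the real class
    def __init__(self, db_path, db_type):
        # '~/local/sample.db' -> '/home/<user>/local/sample.db'
        self.db_path = os.path.expanduser(db_path)
        self.db_type = db_type

reader = DBFileReader('~/local/sample.db', 'minidb')
print(reader.db_path)
```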
* [Caffe2] Add bytes_read to cost structure
We're adding analytical read bytes to cost functions. This extends the structure accordingly for all CostInference defined operators.
Additionally, some small bug fixes were performed:
1) Cost functions now extract type information of operands instead of assuming float
* Fix sleef on aarch64 for hhvm
@bypass-lint
Rename flag
* Remove duplicated part in caffe2/ideep/operators/conv_op.cc
This should be a sync error.
* Rename test helper function test_adagrad_sparse_helper to adagrad_sparse_test_helper to avoid confusing pytest
The schema.Scalar class makes pretty strict assumptions (via its docstring)
on the spec of the shape of its underlying object. Because of idiosyncrasies
of numpy indexing and the use of np.dtype, those assumptions are broken on an
edge case (dtype = (scalar_type, 1)). This corrects the behavior of this
edge case to conform to the spec.
Summary: As titled. This is similar to the Python pprint utility for nested JSON data structures. It can be useful for inspecting a schema during debugging.
Reviewed By: kittipatv
Differential Revision: D6710767
fbshipit-source-id: e450aa5477fa1ad4f93c4573f8108a2f49956da8
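The rough idea is a recursive/indented dump of a nested schema, along the lines of the sketch below (the helper name and output format are hypothetical; only the concept of printing nested fields with indentation comes from the diff above):
```
import numpy as np
from caffe2.python import schema

def print_schema(record):
    # Hypothetical pretty-printer: one flattened field per line, with the
    # nesting depth inferred from the ':' separators in the field names.
    for name, ftype in zip(record.field_names(), record.field_types()):
        depth = name.count(':')
        print('  ' * depth + name.split(':')[-1] + ': ' + str(ftype))

record = schema.Struct(
    ('label', schema.Scalar(np.float32)),
    ('features', schema.Struct(('id', schema.Scalar(np.int64)))),
)
print_schema(record)
```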
Summary: Make LastNWindowCollector optionally thread-safe. The main benefit is that the mutex can then be used to lock the buffer later, avoiding the need to copy the data.
Reviewed By: chocjy
Differential Revision: D5858335
fbshipit-source-id: 209b4374544661936af597f741726510355f7d8e
Summary: Currently, it's not easy to track down which tensor is missing type and shape info. Print it out for easier debugging.
Reviewed By: volkhin, xianjiec
Differential Revision: D5695223
fbshipit-source-id: 7f0be0be777a35bb5a71b3799b29b91f0763c159
Summary:
Currently, `from_column_list` throws an error if the input col_names=[]. To solve this issue, we fix the get_field function so that it creates an empty Struct when an empty col_names is given.
Reviewed By: kittipatv
Differential Revision: D5543865
fbshipit-source-id: f6dfa25326e355f8ec24e5542761851a276beeb9
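Roughly, the expected behavior after the fix above (a sketch; exact semantics may differ):
```
from caffe2.python import schema

# Previously this raised an error; with the fix an empty column list should
# simply produce an empty Struct.
empty_record = schema.from_column_list([])
assert isinstance(empty_record, schema.Struct)
assert empty_record.field_names() == []
```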
Summary:
This is for the ease of removing the common fields of a struct from another.
For example,
s1 = Struct(
    ('a', Scalar()),
    ('b', Scalar()),
)
s2 = Struct(('a', Scalar()))
s1 - s2 == Struct(('b', Scalar()))
More examples are provided in the code comments.
Differential Revision: D5299277
fbshipit-source-id: 7008586ffdc8e24e1eccc8757da70330c4d90370
Summary:
As described in T19378176 by kittipatv, in this diff, we fix the issue of __getitem__() of schema.List.
For example, given Map(int32, float) (Map is a special List), field_names() will return "lengths", "values:keys", and "values:values". "values:keys" and "values:values" are not accessible via __getitem__(): __getitem__() bypasses the values prefix and directly accesses the fields in the map. Other APIs (e.g., _SchemaNode and dataset_ops) expect "values:keys" and "values:values", as this simplifies traversal logic. Therefore, we should keep field_names() as is and fix __getitem__().
Reviewed By: kittipatv
Differential Revision: D5251657
fbshipit-source-id: 1acfb8d6e53e286eb866cf5ddab01d2dce97e1d2
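A sketch of the intended consistency for the diff above (return values and the exact accepted names are illustrative):
```
import numpy as np
from caffe2.python import schema

m = schema.Map(schema.Scalar(np.int32), schema.Scalar(np.float32))

# field_names() keeps the fully qualified names:
#   ['lengths', 'values:keys', 'values:values']
print(m.field_names())

# After the fix, __getitem__() should accept those same qualified names
# instead of silently dropping the 'values' prefix.
keys_field = m['values:keys']
print(keys_field)
```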
Summary:
The current version of schema.py has a Metadata class with three fields, but its default is set to four Nones. This changes that to three Nones so that the number of default values matches the number of actual fields.
Reviewed By: kennyhorror
Differential Revision: D5250463
fbshipit-source-id: 42e5650d270f5f63662614d8445b4819ed370dec
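Schematically, the fix above amounts to making the defaults tuple match the field count (the field names shown are illustrative):
```
from collections import namedtuple

# Three fields ...
Metadata = namedtuple(
    'Metadata', ['categorical_limit', 'expected_value', 'feature_specs'])
# ... so the defaults tuple must also contain three Nones, not four.
Metadata.__new__.__defaults__ = (None, None, None)

print(Metadata())  # all three fields default to None
```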
Summary: Previous implementation relied on the order of fields for some reason.
Reviewed By: azzolini
Differential Revision: D5164478
fbshipit-source-id: 12717310860584e18ce4ca67d0bd5048354cdc0a
Summary:
Fixing the missing `future` package issue.
Recently we found that some of our users do not have `future` module support, so we might need a try/except wrapper around every `past` import.
Reviewed By: Yangqing
Differential Revision: D5183547
fbshipit-source-id: 262fdf2940ee1be4454bf0b0abb9e6a0f1a0ee82
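The kind of guard the diff above refers to, sketched with a common fallback (the actual wrapper may differ):
```
try:
    # 'past' is provided by the 'future' package and is not always installed.
    from past.builtins import basestring
except ImportError:
    basestring = str
```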
Summary:
Split the Caffe2 memory-based model into two parts:
- Dimension reduction MLP
- DNN with concatenation of memory and obj feature
Currently only the simple mean is implemented.
Differential Revision: D4866825
fbshipit-source-id: d2f6813402513ec9af30dbe29a50593e2d3cdb3b
Summary: This diff is one step towards enabling the Python 3 build by making it more diligent in its handling of strings.
Reviewed By: salexspb
Differential Revision: D4893083
fbshipit-source-id: 28b8adf3280e8d1f0a7dc9b0fee5ad53f2fada57
Summary: The code snippet in the added unit test is invalid, but it may or may not cause an exception. Disable the syntax so people don't accidentally use it.
Reviewed By: dzhulgakov
Differential Revision: D4985030
fbshipit-source-id: ffa2b26f7b29128b196aba1b1001a97c87e381cf
Summary: I ran into this earlier and the debug messages were not helpful enough.
Reviewed By: kennyhorror
Differential Revision: D4985754
fbshipit-source-id: b3d12b5e2cfa1b54fca9126768c84c902664ef28
Summary: Calling `set()` or `set_value()` on Scalar is dangerous as something might be holding a reference to it. This is especially true with `LayerModel`, where instantiation is delayed. The code may still run but it will produce unexpected results, i.e., values may be written to the wrong blob.
Reviewed By: kennyhorror
Differential Revision: D4955366
fbshipit-source-id: f5e8694a9a411ee319ca9f39a0fed632d180b8a5
Summary:
This diff contains the following changes:
- implementing __repr__ on Field types; this makes it a little easier to see what broke in the unit tests
- preserve the shape of ndarray input to schema; previously, empty and scalar arrays lost their shape, while others kept the shape.
- type-checking ndarray input; this ensures the basic integrity of the schema
Reviewed By: xianjiec
Differential Revision: D4913030
fbshipit-source-id: bd0f6b8722d95bfe800edf98ba05029c5b99d2af
Summary: `not field` calls `__len__()`, causing the field to appear to be missing even when it's not
Differential Revision: D4910587
fbshipit-source-id: bc2b2fadab96571ae43c4af97b30e50c084437af
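To make that failure mode concrete (the explicit presence check is an illustrative fix, not necessarily the exact code change):
```
class EmptyishField(object):
    """Stand-in for a schema field whose __len__() can be zero."""
    def __len__(self):
        return 0

field = EmptyishField()

# Buggy check: len(field) == 0 makes `not field` true, so an existing
# field is wrongly treated as missing.
print('missing' if not field else 'present')      # prints 'missing'

# Fixed check: test for presence explicitly instead of via truthiness.
print('missing' if field is None else 'present')  # prints 'present'
```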
Summary:
as desc.
small fix in the feature_proc layer for the case when we only have one preproc type
Reviewed By: chocjy
Differential Revision: D4908933
fbshipit-source-id: 1338048fc395f85c3724721a9996ad1ee51f0f20
Summary:
Add distributed training to dper2 and keep dper1 working.
* Created a ModelDelegator to wrap ModelHelper and LayerModelHelper to mitigate the difference.
* To get the average length for sparse features, I extracted some information in feature_processor. There should be some better way to do it after we have the new compute_meta.
* The metric right now only runs on the first trainer.
* The model is saved correctly for evaluation. But I'm still not sure how to handle the weights for adagrad.
Reviewed By: kennyhorror
Differential Revision: D4767745
fbshipit-source-id: 0559d264827a7fd9327071e8367d1e84a936bea9
Summary:
D4690225 added support for nested field name lookup in nested
`schema.Struct`s. It would throw a KeyError if trying to access a nested
`List`'s field. Writing the lookup recursively avoids the need to enumerate
all complex field types in the lookup.
Differential Revision: D4719755
fbshipit-source-id: 37c87a32d730f0f45f72fb20894da3e32f820999
Summary:
1. migrate the basic mtml model to dper 2
2. test dper 2 mtml model
3. test all optimizers
Reviewed By: kittipatv
Differential Revision: D4680215
fbshipit-source-id: 7aac5c59bdac22fcad8ed869b98e9e62dca1d337
Summary:
We are having more and more nested Struct schemas. There is an increasing need to get/add a field by nested name, e.g., for the following nested Struct schema:
st = Struct(
    ('a', Scalar()),
    ('b', Struct(
        ('c', Scalar()),
    )),
)
We may want to get the field "b:c" and/or insert a new field "b:x". The immediate need is for dper2 metrics.
This diff is to achieve this.
Reviewed By: kittipatv
Differential Revision: D4690225
fbshipit-source-id: 71d4a74b36bd1228a2fefd901db2f200602152b7
Summary: When debugging using LayerModelHelper, adding Print to the model will trigger this assert.
Reviewed By: xianjiec
Differential Revision: D4687859
fbshipit-source-id: 6932e38f8dd17ba0b80da18a20943ecdb2e8af0a
Summary:
This diff is trying to address one of the concerns that Xianjie has had: the requirement to create a layer for every operator and to pass shapes and other info around.
The basic idea of the diff:
1. Try to create a layer with the given name, but if it's not available, fall back to an operator with that name (which is expected to have no parameters).
2. For all operators that we're adding through this functional style of creation, try to use the C2 Shape/Type inference logic to get the output type. If we fail to get it, just return an untyped record and expect the user to annotate it when it's really needed.
Reviewed By: xianjiec
Differential Revision: D4408771
fbshipit-source-id: aced7487571940d726424269970df0eb62670c39
Summary: Do I understand correctly? It must be of size 1 for sigrid
Reviewed By: kennyhorror
Differential Revision: D4576541
fbshipit-source-id: 92fa8dc62e36ff095e14cceeb80b03c0028f5695
Summary:
Remove the use of `NextName` in the layer model helper, so that the same function returns a `model_helper` that should construct an identical `Net` when under the same NameScope.
`NextScopedBlob` should only take effect when there is a real name conflict; otherwise it returns a ScopedBlobReference.
This is critical for parameter blobs. In the long run, we need to be able to specify parameter blobs more explicitly (kennyhorror is working on this). This solution works in the short term for, e.g., two-tower sparse NN models.
Reviewed By: kennyhorror
Differential Revision: D4555423
fbshipit-source-id: 2c4b99a61392e5d51aa878f7346466a8f14be187
Summary:
We want to train models with user sequence data for mobile-side ranking.
The operators are for preprocessing the sequence-based data. They read in a sequence with a batch and convert the examples with different methods.
I also add a new loader for connecting the operator to existing trainers
Differential Revision: D4485411
fbshipit-source-id: 0cf17206704995f2ce079e1594607bea70b1ed0c
Summary:
Ievgen ran into this bug with his dper work - we didn't preserve metadata on the lengths field.
Also, we didn't take keep_blobs into account for List's main field. Now fixed.
Also, reformat the file to be nice.
Differential Revision: D4357859
fbshipit-source-id: 1c26c533a10d38afab13b46ccbcb541f5fa9074a
Summary: As titled. Part of the effort to unify loader configuration.
Differential Revision: D4342147
fbshipit-source-id: bb021112f61d4838b0ccc7a5a8bcaf272cb35cd8
Summary:
We want to implement a request-only net, and to do this we decided to split the work into two parts. The first part will propagate the required metadata and the second part will cut the nets properly.
This diff is to propagate request_only metadata across the layers.
A few notes about implementation:
- Each layer contains a field request_only which can be set based on the input_record. If all the scalars from the input_record are marked request_only, we mark the layer as request_only;
- Sparse-To-Dense layer sets request_only metadata;
- SigridTransformation and SparseLookup layers propagate request_only status;
- As for now we join request_only and other sparse features together in input_record, but ideally we may want to separate this, because request_only should be served separately;
Reviewed By: xianjiec
Differential Revision: D4259505
fbshipit-source-id: db8a30ef92cba84f1a843981b9dde3a8b9633608