## MOTIVATION
Generalize the distributed checkpoint test cases so that they also run on non-CUDA devices.
## CHANGES
Updated 18 test files in test/distributed/checkpoint/ with minimal device abstraction changes:
- Use device_type from DTensorTestBase wherever appropriate
- Replace hard-coded device names with torch.accelerator.current_accelerator()
- Extend the multi-GPU decorator to other devices
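
The pattern behind these changes can be sketched roughly as follows. This is an illustration, not the code from the PR: the helper names `resolve_device_type`, `available_device_count`, and `requires_devices` are hypothetical, and the sketch assumes a PyTorch build recent enough to ship `torch.accelerator`, falling back to "cpu" otherwise.

```python
import unittest


def resolve_device_type(default="cpu"):
    """Return the current accelerator's type string, or `default` if none is usable."""
    try:
        import torch
    except ImportError:
        # Assumption: treat a missing torch install as "no accelerator".
        return default
    accelerator = getattr(torch, "accelerator", None)
    if accelerator is not None and accelerator.is_available():
        device = accelerator.current_accelerator()
        if device is not None:
            return device.type  # e.g. "cuda" or "xpu"
    return default


def available_device_count():
    """Return the number of usable accelerator devices (0 if none)."""
    try:
        import torch
    except ImportError:
        return 0
    accelerator = getattr(torch, "accelerator", None)
    if accelerator is not None and accelerator.is_available():
        return accelerator.device_count()
    return 0


def requires_devices(n):
    """Hypothetical decorator: skip a test unless at least `n` devices exist."""
    return unittest.skipIf(
        available_device_count() < n,
        f"requires at least {n} accelerator devices",
    )


# Use the resolved type instead of a hard-coded "cuda" string.
DEVICE_TYPE = resolve_device_type()


class ExampleTest(unittest.TestCase):
    @requires_devices(2)
    def test_multi_device(self):
        self.assertNotEqual(DEVICE_TYPE, "cpu")
```

Resolving the device type once and passing it around keeps each test free of backend-specific strings, so the same test body can run on any accelerator that PyTorch reports.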
test/distributed/checkpoint/test_state_dict_stager.py has a large diff because I renamed cuda_obj to gpu_obj. The functional change is minimal.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159242
Approved by: https://github.com/guangyey, https://github.com/d4l3k