Previously, we imported from bitsandbytes eagerly if the package was
installed. This caused two major issues:
- Slow loading time of PEFT (~4 sec)
- Errors with multiprocessing because bitsandbytes initializes CUDA on import
This commit fixes both issues by importing bitsandbytes lazily. PEFT
import time is now reduced to ~2 sec.
Notes
Implementation-wise, I use a combination of local imports and
module-level __getattr__ (PEP 562). The latter was introduced in
Python 3.7 and should therefore be safe to use.
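
For illustration, here is a minimal sketch of the pattern, not the actual
PEFT code. The module name lazy_pkg is hypothetical, and the stdlib
fractions module stands in for a heavy optional dependency such as
bitsandbytes. In a real package, the function would simply be defined as
__getattr__ at the top level of the module file.

```python
import types

# Hypothetical stand-in for a package's __init__.py module.
lazy_pkg = types.ModuleType("lazy_pkg")


def _module_getattr(name):
    # Per PEP 562, this is called only when normal attribute lookup
    # on the module fails, so nothing is imported until first access.
    if name == "Fraction":
        # Local import: deferred until the attribute is requested.
        # "fractions" is an assumed placeholder for the heavy dependency.
        from fractions import Fraction

        return Fraction
    raise AttributeError(f"module 'lazy_pkg' has no attribute {name!r}")


# Equivalent to writing `def __getattr__(name): ...` inside the module file.
lazy_pkg.__getattr__ = _module_getattr

# The import of "fractions" is triggered here, not at module creation.
half = lazy_pkg.Fraction(1, 2)
```

Code that never touches the lazily exposed name never pays the import
cost, and unknown names still raise AttributeError as usual.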