Models that train successfully in FP32 may run into trouble converging when trained in FP16. Fixing these convergence problems requires extra care in both model preparation and choice of hardware architecture.
FP16 offers less precision than FP32 and, importantly, a much narrower dynamic range, so it cannot express very small or very large values. As a result, very small FP32 values become zeroes when cast to FP16, which can break model convergence during FP16 training.
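A minimal sketch of this underflow (and the matching overflow) using NumPy's `float16` type; the specific values are illustrative:

```python
import numpy as np

# FP16 represents normal values down to ~6.1e-5 (subnormals down to ~6.0e-8)
# and finite values up to 65504. FP32 values outside that range are lost.
small = np.float32(1e-8)     # perfectly representable in FP32
print(np.float16(small))     # -> 0.0: the value underflows to zero in FP16

large = np.float32(1e6)      # also fine in FP32
print(np.float16(large))     # -> inf: overflows FP16's max of 65504
```

The underflow case is the one that typically hurts training: activation and gradient values are often far smaller than FP16's smallest representable number.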
To fix the convergence problem, one can:
- Use FP32 for accumulation (but FP16 for multiplies). The combination of these two precision formats is referred to as “mixed precision training”.
- Tweak the training loop to artificially scale up the loss, so that the resulting gradients don’t become zeroes in FP16, then unscale them before the weight update. NVIDIA and Baidu describe this “loss scaling” technique in more detail.
- Keep a master copy of the weights in full FP32 precision for applying weight updates, while using a reduced-precision FP16 copy of those weights for forward and back-propagation.
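The last two points can be combined into one training step. The NumPy sketch below is a hypothetical illustration (the model, data, and `loss_scale` value are invented for the example): FP16 forward and backward passes, a scaled gradient, and an FP32 master weight that receives the unscaled update.

```python
import numpy as np

def sgd_step_mixed_precision(master_w, x, y, lr=0.1, loss_scale=1024.0):
    """One SGD step for a toy 1-D linear model y ~ w*x, illustrating:
    FP16 compute, loss scaling, and an FP32 master copy of the weight."""
    # Reduced-precision FP16 copies for the forward/backward passes.
    w16 = np.float16(master_w)
    x16 = x.astype(np.float16)
    y16 = y.astype(np.float16)

    # Forward pass in FP16.
    err = w16 * x16 - y16

    # Backward pass in FP16: scale the gradient of the squared error so
    # small values survive in FP16 instead of flushing to zero.
    grad16 = (2.0 * err * x16) * np.float16(loss_scale)

    # Unscale in FP32 and update the FP32 master weight.
    grad32 = grad16.astype(np.float32) / np.float32(loss_scale)
    return np.float32(master_w - lr * grad32.mean())

# Usage: fit w toward 2.0 on data generated by y = 2*x.
w = np.float32(0.0)
x = np.array([1.0, 2.0], dtype=np.float32)
y = np.array([2.0, 4.0], dtype=np.float32)
for _ in range(50):
    w = sgd_step_mixed_precision(w, x, y)
print(w)  # close to 2.0
```

Real implementations also skip or shrink the update when the scaled gradient overflows to `inf`/`NaN` (dynamic loss scaling); that check is omitted here for brevity.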