AdaFamily optimizer
AdaFamily: A family of Adam-like adaptive gradient methods for training neural networks
This repository contains the Python code for AdaFamily, a novel method for training deep neural networks. AdaFamily is a family of adaptive gradient methods and can be interpreted as a blend of the optimization algorithms Adam, AdaBelief and AdaMomentum. It outperforms these methods on standard image-classification datasets, as shown in the paper at https://arxiv.org/pdf/2203.01603.pdf.
The repository provides a wrapper that exposes several state-of-the-art adaptive-gradient optimizers (Adam / AdamW / EAdam / AdaBelief / AdaMomentum / AdaFamily) through a single API, including my novel 'AdaFamily' algorithm. For details on 'AdaFamily', see the arXiv preprint at https://arxiv.org/abs/2203.01603 (submitted to the ISPR 2022 conference). According to the experiments in the preprint, setting 'myu' to either 0.25 or 0.75 is likely the best choice.
The main class to use is 'OmniOptimizer' (file 'omni_optimizer.py').
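As a minimal usage sketch: the snippet below shows how such a wrapper would typically be plugged into a PyTorch training step. The argument names 'optimizer_name' and 'myu' are assumptions based on the description above, not the confirmed signature; check 'omni_optimizer.py' for the actual API.

```python
# Minimal usage sketch. 'optimizer_name' and 'myu' are hypothetical
# argument names; see omni_optimizer.py for the actual signature.
import torch
import torch.nn as nn

from omni_optimizer import OmniOptimizer

model = nn.Linear(10, 2)

# Select the AdaFamily algorithm; myu = 0.25 or 0.75 performed best
# in the preprint's experiments.
optimizer = OmniOptimizer(model.parameters(), optimizer_name="adafamily",
                          lr=1e-3, myu=0.25)

# Standard PyTorch training step.
inputs, targets = torch.randn(4, 10), torch.randint(0, 2, (4,))
loss = nn.functional.cross_entropy(model(inputs), targets)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```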
See Link for an application where AdaFamily is used for NLP fine-tuning (and achieves better performance than fine-tuning with 'Adam').