Parameter-Efficient Fine-Tuning

Within the landscape of large-scale models, conventional compression techniques may prove insufficient to address the challenges posed by their sheer size. In this context, Parameter-Efficient Fine-Tuning (PEFT) emerges as a more aggressive strategy that reduces the number of trainable weights, thereby mitigating communication overhead.

PEFT adjusts only a small proportion of model parameters, significantly reducing computational complexity. The smaller set of trainable weights also shrinks the communication payload, which is particularly beneficial in Decentralized AI (DeAI) settings, as the sketch below illustrates.
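
As a concrete illustration, the following is a minimal PyTorch sketch of the core idea (the model, layer sizes, and the choice to unfreeze only the final head are illustrative assumptions): freezing most weights leaves only a small fraction requiring gradients, and only that fraction needs to be trained and exchanged between peers.

```python
import torch.nn as nn

# Toy stand-in for a large pretrained network (sizes are illustrative).
model = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 10))

# Freeze everything, then unfreeze only the final head: only that small
# subset is trained locally and exchanged between DeAI peers.
for p in model.parameters():
    p.requires_grad = False
for p in model[2].parameters():
    p.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable:,} / {total:,} ({100 * trainable / total:.1f}%)")
# trainable: 7,690 / 598,282 (1.3%)
```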

PEFT can be categorized into several approaches \cite{han2024parameter}:

  • Additive modules modify the model architecture by injecting additional trainable modules or layers \cite{houlsby2019parameter, pfeiffer2020adapterfusion}; a bottleneck-adapter sketch follows this list.

  • Soft prompts optimize trainable continuous embeddings, rather than discrete text prompts, to refine model performance. Notable examples include Prefix-Tuning, which learns prefix vectors that are retained for inference after fine-tuning \cite{li2021prefix,liu2021p}; a simplified soft-prompt sketch follows this list.

  • Selective fine-tuning selects a small subset of existing parameters and makes them tunable while keeping the remaining weights frozen. Diff Pruning \cite{guo2020parameter,vucetic2022efficient} applies parameter-pruning techniques to achieve such efficiency gains.

  • Reparameterized PEFT employs low-rank parameterization techniques to construct more efficient representations, which are then transformed back into the original weights for inference \cite{hu2021lora}; a minimal LoRA sketch follows this list.
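
For the additive approach, below is a minimal sketch of a bottleneck adapter in the spirit of \cite{houlsby2019parameter}; the dimensions, placement, and initialization are illustrative assumptions rather than the exact published configuration.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual.

    Inserted after a frozen sublayer; only the two small projections train.
    """
    def __init__(self, d_model: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        nn.init.zeros_(self.up.weight)  # start as a near-identity module,
        nn.init.zeros_(self.up.bias)    # so insertion does not perturb the model

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

adapter = Adapter()
print(sum(p.numel() for p in adapter.parameters()))  # ~99k per adapter
```

Because the up-projection starts at zero, the adapter initially acts as the identity, so inserting it into a pretrained network does not change its outputs before training.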
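
For soft prompts, here is a simplified sketch that prepends trainable continuous embeddings to the input sequence; Prefix-Tuning additionally injects prefixes into every layer's attention, and the token count and dimension below are assumptions.

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Trainable continuous prompt prepended to frozen input embeddings."""
    def __init__(self, n_tokens: int = 20, d_model: int = 768):
        super().__init__()
        # Only these n_tokens * d_model values are trained and communicated.
        self.prompt = nn.Parameter(torch.randn(n_tokens, d_model) * 0.02)

    def forward(self, embedded):  # embedded: (batch, seq_len, d_model)
        prefix = self.prompt.unsqueeze(0).expand(embedded.size(0), -1, -1)
        return torch.cat([prefix, embedded], dim=1)

sp = SoftPrompt()
x = torch.randn(4, 128, 768)  # a batch of embedded sequences
print(sp(x).shape)            # torch.Size([4, 148, 768])
```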
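
For reparameterized PEFT, the following sketches the low-rank update behind LoRA \cite{hu2021lora}; the rank, scaling, and layer size are illustrative assumptions. Only the factors A and B are trained, and after fine-tuning their product can be merged back into the frozen weight for inference.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update.

    Effective weight: W + (alpha / r) * B @ A. Only A and B, i.e.
    r * (d_in + d_out) values, are trained and communicated; after
    fine-tuning the product B @ A can be merged back into W.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # random init
        self.B = nn.Parameter(torch.zeros(d_out, r))        # zero init: update starts at 0
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(4096, 4096), r=8)
full = sum(p.numel() for p in layer.base.parameters())
lora = layer.A.numel() + layer.B.numel()
print(f"communicated (LoRA): {lora:,} vs. full layer: {full:,}")
# communicated (LoRA): 65,536 vs. full layer: 16,781,312
```

In a DeAI round, peers would exchange only A and B, roughly 0.4% of this layer's parameters.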

The emergence of large-scale models has thus catalyzed a diverse set of techniques for reducing the computational complexity and communication overhead these models impose in the realm of DeAI.
