Leverage pre-trained models and transfer learning

Applicable Role: Provider

Description

Training AI and ML models from scratch requires significant compute, data, and time, leading to high energy consumption and carbon emissions. In many cases, models can be initialized from pre-trained versions and adapted to specific tasks through fine-tuning.

Leveraging pre-trained models avoids redundant training effort and reduces the overall resource footprint of model development.

Solution

Select pre-trained models that are relevant to the target task
Fine-tune models instead of training from scratch where possible
Reuse existing model weights and representations to reduce training effort
Evaluate whether full training is necessary before starting new model development
Use domain-adapted or task-specific pre-trained models when available
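
The reuse-and-fine-tune steps above can be sketched with PyTorch as follows. This is a minimal illustration, not a prescribed implementation: the small `backbone` is a randomly initialised stand-in for a real pre-trained model (whose weights would normally be loaded from a checkpoint), and the layer sizes, learning rate, and synthetic data are all assumptions chosen for brevity. It shows the core idea of freezing existing weights and training only a new task-specific head.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in for a pre-trained backbone (assumption: real weights would be
# loaded from a checkpoint; here they are random, for illustration only).
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
head = nn.Linear(32, 2)  # new task-specific layer, trained from scratch

# Freeze the backbone so its weights receive no gradients and no updates.
for param in backbone.parameters():
    param.requires_grad_(False)
backbone.eval()

model = nn.Sequential(backbone, head)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.SGD(head.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Synthetic data standing in for the target-task dataset.
inputs = torch.randn(64, 16)
labels = (inputs[:, 0] > 0).long()

# Keep a copy of the frozen weights to confirm they stay unchanged.
frozen_before = [p.clone() for p in backbone.parameters()]

for _ in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()
    optimizer.step()

print(f"trainable parameters: {trainable} of {total}")
# prints "trainable parameters: 66 of 610"
```

Freezing the backbone means gradients are computed and applied only for the head, which is where the compute saving comes from. Unfreezing some or all backbone layers usually costs more compute and may or may not be needed to reach the required accuracy.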

SCI Impact

SCI = (E × I) + M per R

E (Energy): Avoiding full training significantly reduces compute and energy consumption.

M (Embodied Carbon): Reduced infrastructure usage lowers embodied emissions associated with training.

R (Functional Unit): For providers using per FLOP or per training token as the functional unit, transfer learning sharply reduces the total FLOPs and tokens required, which lowers total carbon (C). Because R shrinks along with the work performed, the per-unit score improves only to the extent that carbon falls faster than R; the saving is clearest in total emissions.
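
To make the formula concrete, the following sketch computes SCI for two training runs. All figures are hypothetical and chosen only to show the arithmetic. As an assumption for this example, R is set to one delivered model so that the two runs are directly comparable; a provider reporting per FLOP or per training token would pass that count instead.

```python
def sci(energy_kwh, intensity_g_per_kwh, embodied_g, functional_units):
    """Return SCI = ((E * I) + M) / R in gCO2e per functional unit."""
    if functional_units <= 0:
        raise ValueError("functional_units must be positive")
    operational_g = energy_kwh * intensity_g_per_kwh  # E * I
    return (operational_g + embodied_g) / functional_units


# Hypothetical inputs (not measurements): energy in kWh, grid intensity in
# gCO2e/kWh, embodied carbon in gCO2e, R = one delivered model.
from_scratch = sci(energy_kwh=10_000, intensity_g_per_kwh=400,
                   embodied_g=500_000, functional_units=1)
fine_tuned = sci(energy_kwh=200, intensity_g_per_kwh=400,
                 embodied_g=10_000, functional_units=1)

print(from_scratch)  # 4500000.0
print(fine_tuned)    # 90000.0
```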

Cost Impact

Training costs: Dramatically reduced by avoiding full model training
Compute time: Significantly lower for fine-tuning vs. training from scratch
Pre-trained model licensing: Potential licensing costs for commercial model access
Data costs: May be lower if transfer learning requires less training data
Trade-off: Pre-trained model licensing may offset training cost savings

Assumptions

Suitable pre-trained models are available for the target use case
Fine-tuning can achieve the required performance

Considerations

Pre-trained models may introduce biases or limitations from their original training data
Fine-tuning large foundation models can still require substantial compute, in some cases comparable to training a smaller model from scratch; evaluate the cost-benefit of fine-tuning versus full training for your use case
Licensing and usage restrictions of pre-trained models must be evaluated
Model suitability should be validated for the specific domain
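
One rough way to compare fine-tuning with full training before committing resources is a back-of-the-envelope compute estimate. The sketch below uses the common rule of thumb of about 6 FLOPs per parameter per training token for dense transformer models. That rule is an approximation, and the model size, token counts, sustained throughput, and power draw are all hypothetical placeholders to be replaced with your own figures.

```python
def training_flops(parameters, tokens):
    """Approximate training compute: ~6 FLOPs per parameter per token."""
    return 6 * parameters * tokens


def energy_kwh(flops, sustained_flops_per_second, power_kw):
    """Convert a FLOP budget to energy, given assumed throughput and power."""
    seconds = flops / sustained_flops_per_second
    return seconds / 3600 * power_kw


# Hypothetical scenario: a 7-billion-parameter model.
PARAMETERS = 7e9
THROUGHPUT = 1e14   # assumed sustained FLOP/s across the training hardware
POWER_KW = 0.7      # assumed average power draw in kW

scratch_kwh = energy_kwh(training_flops(PARAMETERS, 1e12), THROUGHPUT, POWER_KW)
tune_kwh = energy_kwh(training_flops(PARAMETERS, 1e8), THROUGHPUT, POWER_KW)
ratio = scratch_kwh / tune_kwh

print(f"from scratch: {scratch_kwh:,.0f} kWh")
print(f"fine-tuning:  {tune_kwh:,.2f} kWh")
print(f"ratio:        {ratio:,.0f}x")
```

The estimate assumes every weight is updated. Freezing layers or using parameter-efficient methods may lower the fine-tuning figure further, while a large fine-tuning dataset narrows the gap.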
