Adopt serverless architecture for AI/ML workload processes
Building an ML model takes significant computing resources that need to be optimized for efficient utilization.
Building an ML model takes significant computing resources that need to be optimized for efficient utilization.
Fine-tune existing pre-trained models instead of training from scratch to dramatically reduce the compute, energy, and time required for model development.
Design agentic AI workflows to minimise redundant model invocations and unnecessary compute through caching, conditional logic, and efficient orchestration patterns.
Use efficient storage formats, compression, and indexing strategies for AI datasets and embeddings to reduce storage footprint, data transfer, and retrieval compute.
Deploy AI inference on edge devices or local infrastructure to reduce data transfer, network energy use, and reliance on centralised cloud compute.
Training an AI model implies a significant carbon footprint. The underlying framework used for the development, training, and deployment of AI/ML needs to be evaluated and considered to ensure the process is as energy efficient as possible.
Match AI workloads to the most energy-efficient hardware accelerator or instance type to improve utilisation and reduce energy consumption per inference or training run.
Choose ML frameworks and inference runtimes that best match your hardware and workload to reduce compute overhead and improve energy efficiency across training and production inference.
Reduce the carbon impact of AI workloads by running them in cloud regions with lower grid carbon intensity and scheduling deferrable jobs during periods of high renewable energy availability.
Evaluate and use alternative, more energy efficient, models that provide similar functionality.
Trigger AI and agent workloads only when needed using serverless or event-driven platforms to eliminate idle compute and reduce unnecessary energy consumption.
Select and optimize AI models that are appropriately sized for the task to reduce compute, memory, and energy consumption during training and inference.