Green Software Patterns

Containerize your workloads

Containerizing workloads enables better resource utilisation and bin packing, reducing unnecessary compute allocation and embodied carbon compared to running full virtual machines.
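
The bin-packing benefit can be illustrated with a minimal sketch. It assumes each workload declares a CPU request in cores and uses a first-fit-decreasing heuristic; the request values, the 4-core node size, and the function name are hypothetical, and real container schedulers weigh many more constraints than this.

```python
def first_fit_decreasing(requests, node_capacity):
    """Pack CPU requests (in cores) onto nodes using first-fit decreasing."""
    nodes = []  # each entry is the list of requests placed on that node
    for req in sorted(requests, reverse=True):
        if req > node_capacity:
            raise ValueError(f"request {req} exceeds node capacity {node_capacity}")
        for node in nodes:
            if sum(node) + req <= node_capacity:
                node.append(req)  # fits on an existing node
                break
        else:
            nodes.append([req])  # no existing node had room, so add a node
    return nodes


# Hypothetical workloads that might otherwise each run on a dedicated VM.
container_requests = [0.5, 1.5, 2.0, 0.25, 1.0, 0.75]
packed = first_fit_decreasing(container_requests, node_capacity=4.0)
print(len(container_requests), "workloads packed onto", len(packed), "nodes")
```

Under these assumed numbers, six workloads share two nodes instead of occupying six separate machines.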

Service state refers to the in-memory or on-disk data required by a service to function. State includes the data structures and member variables that the service reads and writes. Depending on how the service is architected, the state might also include files or other resources stored on disk. Applications that hold large amounts of in-memory or on-disk data require larger VM sizes; in cloud computing in particular, this means larger VM SKUs that support high RAM capacity and multiple data disks.

Leverage pre-trained models and transfer learning

Fine-tune existing pre-trained models instead of training from scratch to dramatically reduce the compute, energy, and time required for model development.

Match utilization requirements of virtual machines (VMs)

It's better to have one VM running at a higher utilization than two running at low utilization rates, not only in terms of energy proportionality but also in terms of embodied carbon. Two servers running at low utilization rates will consume more energy than one running at a high utilization rate. In addition, the unused capacity on the underutilized server could be more efficiently used for another task or process.
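
A small worked example makes the energy-proportionality argument concrete. It assumes a simple linear power model, in which a server draws a fixed idle power plus a share proportional to utilization; the 100 W idle and 200 W maximum figures are illustrative assumptions, not measurements.

```python
def server_power_watts(utilization, idle_watts=100.0, max_watts=200.0):
    """Assumed linear model: idle draw plus a share proportional to utilization."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_watts + (max_watts - idle_watts) * utilization


two_low = 2 * server_power_watts(0.25)  # two servers at 25% utilization each
one_high = server_power_watts(0.50)     # the same work consolidated on one server

print(f"two servers at 25%: {two_low:.0f} W, one server at 50%: {one_high:.0f} W")
```

Because the idle draw is paid once per running server, consolidating the same work onto one machine uses less power under this model, and the second machine never needs to be provisioned.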

Match utilization requirements with pre-configured servers

It's better to have one VM running at a higher utilization than two running at low utilization rates, not only in terms of energy proportionality but also in terms of embodied carbon. Two servers running at low utilization rates will consume more energy than one running at a high utilization rate. In addition, the unused capacity on the underutilized server could be more efficiently used for another task or process.

Minimize main thread work

Long-running JavaScript on the browser's main thread underutilises multi-core CPUs; offloading heavy computations to Web Workers or server-side implementations reduces energy consumption and improves efficiency.

Optimize agent orchestration to reduce unnecessary model calls

Design agentic AI workflows to minimise redundant model invocations and unnecessary compute through caching, conditional logic, and efficient orchestration patterns.
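
As one possible illustration of the caching idea, the sketch below wraps a model call in a prompt-keyed cache so that a repeated request does not trigger a second invocation. `CachedModelClient` and `fake_model` are hypothetical names introduced here, and the approach assumes that identical prompts can safely reuse an earlier response.

```python
import hashlib


class CachedModelClient:
    """Wraps a model-calling function and reuses responses for repeated prompts."""

    def __init__(self, call_model):
        self._call_model = call_model  # hypothetical callable: prompt -> response
        self._cache = {}
        self.calls_made = 0

    def complete(self, prompt):
        # Key on a hash of the prompt with surrounding whitespace stripped.
        key = hashlib.sha256(prompt.strip().encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.calls_made += 1
            self._cache[key] = self._call_model(prompt)
        return self._cache[key]


def fake_model(prompt):
    """Stand-in for a real model call, used only for this illustration."""
    return f"answer to: {prompt}"


client = CachedModelClient(fake_model)
first = client.complete("Summarise the incident report")
second = client.complete("Summarise the incident report")  # served from the cache
print(client.calls_made)
```

A production cache would also need an eviction policy and a rule for when cached answers become stale.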

Optimize average CPU utilization

CPU usage and utilization vary throughout the day, sometimes wildly, depending on computational requirements. The larger the gap between average and peak CPU utilization, the more resources must be provisioned on standby to absorb spikes in traffic.

Optimize peak CPU utilization

CPU usage and utilization vary throughout the day, sometimes wildly, depending on computational requirements. The larger the gap between average and peak CPU utilization, the more resources must be provisioned on standby to absorb spikes in traffic.
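
One way to quantify the gap between average and peak utilization is a peak-to-average ratio. The sketch below computes it from a series of utilization samples; the sample values are made up for illustration.

```python
from statistics import mean


def peak_to_average_ratio(samples):
    """Return peak utilization divided by mean utilization."""
    if not samples:
        raise ValueError("at least one sample is required")
    average = mean(samples)
    if average == 0:
        raise ValueError("average utilization is zero")
    return max(samples) / average


# Hypothetical CPU utilization samples, in percent.
cpu_percent = [20, 20, 30, 80, 30, 20]
ratio = peak_to_average_ratio(cpu_percent)
print(f"peak-to-average ratio: {ratio:.1f}")
```

A ratio close to 1 suggests a flat load profile; a higher ratio means more capacity sits idle most of the time to cover the peak.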

Reduce network traversal between VMs

Placing VMs in the same region or availability zone minimises the physical distance data must travel between instances, reducing the energy consumed by network traversal.

Run AI models at the edge

Deploy AI inference on edge devices or local infrastructure to reduce data transfer, network energy use, and reliance on centralised cloud compute.

Scale infrastructure with user load

Demand for resources depends on user load at any given time. However, most applications run without taking this into consideration. As a result, resources are underused and run inefficiently.
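
A proportional scaling rule is one way to tie capacity to load. The sketch below is a simplified illustration that resembles the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler; the function name, the percentage inputs, and the replica bounds are assumptions chosen for the example.

```python
import math


def desired_replicas(current_replicas, current_percent, target_percent,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count in proportion to observed versus target utilization."""
    if target_percent <= 0:
        raise ValueError("target_percent must be positive")
    raw = math.ceil(current_replicas * current_percent / target_percent)
    # Clamp to the configured bounds so the service never scales to zero or runs away.
    return max(min_replicas, min(max_replicas, raw))


scale_up = desired_replicas(4, current_percent=90, target_percent=60)
scale_down = desired_replicas(4, current_percent=30, target_percent=60)
print(scale_up, scale_down)
```

With these inputs, four replicas running at 90% against a 60% target grow to six, and the same four replicas at 30% shrink to two.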

Scale logical components independently

Decomposing applications into independently scalable microservices allows each component to be right-sized for its own demand, reducing overall compute resource consumption and embodied carbon.

Scan for vulnerabilities

Many attacks on cloud infrastructure seek to misuse deployed resources, which leads to an unnecessary spike in usage and cost.

Select efficient accelerators and instance types for AI workloads

Match AI workloads to the most energy-efficient hardware accelerator or instance type to improve utilisation and reduce energy consumption per inference or training run.

Select efficient ML frameworks and inference runtimes

Choose ML frameworks and inference runtimes that best match your hardware and workload to reduce compute overhead and improve energy efficiency across training and production inference.

Shed lower priority traffic

When resources are constrained during high-traffic events or when carbon intensity is high, more carbon emissions will be generated from your system. Adding more resources to support increased traffic requirements introduces more embodied carbon and more demand for electricity. Continuing to handle all requests during high carbon intensity will increase overall emissions for your system. Shedding traffic that is lower priority during these scenarios will save on resources and carbon emissions. This approach requires an understanding of your traffic, including which call requests are critical and which can best withstand retry attempts and failures.
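
The shedding decision described above can be sketched as a small predicate. The priority scheme, the load limit, and the carbon-intensity limit (taken here to be in gCO2e/kWh) are all hypothetical values; a real system would derive them from its own traffic analysis and grid data.

```python
from dataclasses import dataclass


@dataclass
class Request:
    path: str
    priority: int  # 0 = critical; larger numbers = lower priority


def should_shed(request, load, carbon_intensity,
                load_limit=0.8, carbon_limit=400.0):
    """Shed non-critical requests when load or grid carbon intensity is high."""
    constrained = load >= load_limit or carbon_intensity >= carbon_limit
    return constrained and request.priority > 0


checkout = Request("/checkout", priority=0)
recommendations = Request("/recommendations", priority=2)

print(should_shed(checkout, load=0.9, carbon_intensity=450.0))
print(should_shed(recommendations, load=0.9, carbon_intensity=450.0))
```

Critical requests are always served in this sketch; shed requests should be ones that clients can retry later or do without.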

Terminate TLS at border gateway

Transport Layer Security (TLS) ensures that all data passed between the web server and web browsers remain private and encrypted. However, terminating and re-establishing TLS increases CPU usage and might be unnecessary in certain architectures.

Use carbon-aware scheduling and region selection for AI workloads

Reduce the carbon impact of AI workloads by running them in cloud regions with lower grid carbon intensity and scheduling deferrable jobs during periods of high renewable energy availability.
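
Both halves of this pattern, choosing a region and choosing a time, reduce to picking a minimum over carbon-intensity data. The sketch below assumes that data is already available as plain numbers; the region names, intensity values, and forecast are invented, and obtaining real figures is outside its scope.

```python
def pick_region(intensity_by_region):
    """Return the region with the lowest grid carbon intensity."""
    return min(intensity_by_region, key=intensity_by_region.get)


def pick_start_hour(forecast, duration_hours):
    """Return the start index of the window with the lowest total forecast intensity."""
    if duration_hours <= 0 or duration_hours > len(forecast):
        raise ValueError("duration must fit inside the forecast")
    starts = range(len(forecast) - duration_hours + 1)
    return min(starts, key=lambda s: sum(forecast[s:s + duration_hours]))


# Hypothetical carbon intensities in gCO2e/kWh.
regions = {"region-a": 450, "region-b": 120, "region-c": 300}
hourly_forecast = [400, 380, 200, 150, 180, 390]

best_region = pick_region(regions)
best_start = pick_start_hour(hourly_forecast, duration_hours=2)
print(best_region, best_start)
```

For a two-hour deferrable job, the sketch selects the window starting at index 3, where the summed forecast intensity is lowest.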

Use circuit breaker patterns

Modern applications need to communicate with other applications on a regular basis. Since these other applications have their own deployment schedules, downtimes, and availability, the network connection to them might have problems. If another application is not reachable, every network request to it will fail, and further requests are futile until it recovers.
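
A minimal circuit breaker might look like the sketch below. After a configurable number of consecutive failures it stops calling the remote application for a cool-down period, then allows a trial call through. The class names, the thresholds, and the injectable `clock` parameter are illustrative choices, not a reference implementation.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Still cooling down: fail fast without touching the network.
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Callers wrap each outbound request, for example `breaker.call(fetch_orders)` with a hypothetical `fetch_orders` function, and treat `CircuitOpenError` as a signal to use a fallback rather than spend compute and network energy on requests that are likely to fail.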

Use cloud native network security tools and controls

Network and web application firewalls protect against most common attacks and shed load from bad bots. These tools help remove unnecessary data transmission and reduce the burden on cloud infrastructure, lowering both bandwidth use and the amount of infrastructure required.

24 docs tagged with "compute"