Use on-demand execution for AI and agent workloads

Applicable Role: Consumer

Description

AI systems increasingly operate as dynamic, multi-step workflows, especially in agentic architectures where models interact with tools, data sources, and other models.

Keeping compute resources or agent workflows active when not required leads to unnecessary energy consumption. This includes idle infrastructure, continuously running agents, and long-lived orchestration processes.

On-demand execution provisions compute and triggers workflows only when they are needed, which reduces idle time and improves overall efficiency.

Solution

Use serverless or event-driven platforms to execute workloads only when triggered
Design agent workflows to run only when required and terminate after task completion
Avoid long-running or always-on agent processes unless continuously needed
Trigger model calls and tool usage conditionally rather than continuously
Use orchestration frameworks that support event-driven execution and efficient workflow management
Scale resources dynamically based on demand and workload intensity
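
The points above can be illustrated with a minimal sketch of an event-driven agent handler. It assumes a hypothetical `Event` payload delivered by some trigger (a queue message, webhook, or scheduler) and a hypothetical `call_model` callable standing in for a real model client. None of these names come from a specific platform or framework. The handler does work only for the event it receives, calls the model conditionally, and returns as soon as the task is done.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Event:
    """Hypothetical trigger payload, e.g. a queue message or webhook body."""
    kind: str
    payload: str


@dataclass
class RunResult:
    handled: bool
    model_calls: int
    output: Optional[str] = None


def needs_model(event: Event) -> bool:
    # Assumed routing rule for illustration: only non-empty "question"
    # events justify a model call.
    return event.kind == "question" and bool(event.payload.strip())


def handle_event(event: Event, call_model: Callable[[str], str]) -> RunResult:
    """Run one short-lived agent task for one event, then return."""
    if not needs_model(event):
        # Conditional trigger: skip model and tool work entirely.
        return RunResult(handled=False, model_calls=0)
    answer = call_model(event.payload)
    # Returning here ends the run; nothing stays resident between events.
    return RunResult(handled=True, model_calls=1, output=answer)


if __name__ == "__main__":
    def fake_model(prompt: str) -> str:
        # Stand-in for a real model client.
        return f"echo: {prompt}"

    print(handle_event(Event("heartbeat", ""), fake_model))
    print(handle_event(Event("question", "status?"), fake_model))
```

On a serverless platform, a function shaped like `handle_event` would typically be registered as the entry point, so the platform can scale instances to zero between events.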

SCI Impact

SCI = (E × I) + M per R

E (Energy): Reduces energy consumption by eliminating idle compute and unnecessary agent execution.

I (Carbon Intensity): On-demand execution can be combined with carbon-aware scheduling (Pattern 5) to trigger workloads during low-carbon periods.

M (Embodied Carbon): Improved utilization of shared infrastructure reduces overall hardware demand.
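
As a rough illustration of how the terms combine, the sketch below evaluates the SCI formula for an always-on deployment and an on-demand deployment serving the same number of requests. All input values are hypothetical and chosen only to make the arithmetic easy to follow. They are not measurements.

```python
def sci(energy_kwh: float, intensity_g_per_kwh: float,
        embodied_g: float, functional_units: float) -> float:
    """Return SCI = ((E * I) + M) per R, in gCO2e per functional unit."""
    if functional_units <= 0:
        raise ValueError("functional_units (R) must be positive")
    return (energy_kwh * intensity_g_per_kwh + embodied_g) / functional_units


if __name__ == "__main__":
    # Hypothetical daily figures for 1,000 requests (R = 1000).
    always_on = sci(energy_kwh=2.4, intensity_g_per_kwh=400,
                    embodied_g=240, functional_units=1000)
    # Assumed effect of on-demand execution: less energy (E) and a smaller
    # share of embodied carbon (M) attributed to the workload.
    on_demand = sci(energy_kwh=0.6, intensity_g_per_kwh=400,
                    embodied_g=60, functional_units=1000)
    print(f"always-on: {always_on:.2f} gCO2e per request")
    print(f"on-demand: {on_demand:.2f} gCO2e per request")
```

In this sketch carbon intensity (I) is held constant. Combining on-demand execution with carbon-aware scheduling would lower I as well.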

Cost Impact

Compute costs: Reduced by eliminating idle infrastructure and always-on processes
Cold start overhead: Serverless platforms add initialization work to cold invocations and, at sustained load, may cost more per invocation than reserved instances
Provisioned concurrency: Can mitigate cold starts but adds baseline cost
State management: Stateless design may require additional storage or messaging infrastructure
Trade-off: Compare per-invocation serverless pricing with the reserved instance baseline and evaluate the break-even point for the expected request volume
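
The break-even point can be estimated with simple arithmetic. The sketch below assumes a flat monthly reserved cost and a flat per-invocation price. Both figures are hypothetical, and real pricing usually also depends on duration, memory, and free tiers.

```python
def break_even_invocations(reserved_monthly_cost: float,
                           cost_per_invocation: float) -> float:
    """Monthly invocations at which on-demand cost equals the reserved cost."""
    if cost_per_invocation <= 0:
        raise ValueError("cost_per_invocation must be positive")
    return reserved_monthly_cost / cost_per_invocation


def cheaper_option(monthly_invocations: int, reserved_monthly_cost: float,
                   cost_per_invocation: float) -> str:
    """Return which pricing model costs less at the given volume."""
    on_demand_cost = monthly_invocations * cost_per_invocation
    return "on-demand" if on_demand_cost < reserved_monthly_cost else "reserved"


if __name__ == "__main__":
    # Hypothetical prices for illustration only.
    reserved = 50.0        # currency units per month
    per_call = 0.0002      # currency units per invocation
    print(round(break_even_invocations(reserved, per_call)))
    print(cheaper_option(100_000, reserved, per_call))
    print(cheaper_option(1_000_000, reserved, per_call))
```

Below the break-even volume the on-demand model is cheaper. Above it, a reserved baseline may be more economical, although the carbon trade-off should be evaluated separately.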

Assumptions

Workloads and agent workflows can be structured as event-driven processes
Execution environments support dynamic scaling and orchestration
Workloads can be safely interrupted and resumed without losing state or requiring expensive recomputation
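
One way to satisfy the interrupt-and-resume assumption is to checkpoint progress after each step, so that a later invocation continues where the previous one stopped. The sketch below is a minimal illustration that uses a local JSON file as the checkpoint store. In a serverless setting this would more likely be an object store or database. The function name `run_steps` and the `stop_after` parameter are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, List, Optional


def run_steps(steps: List[Callable[[], int]], checkpoint: Path,
              stop_after: Optional[int] = None) -> dict:
    """Run steps in order, saving progress after each completed step.

    Step results must be JSON-serializable. `stop_after` simulates an
    interruption after that many steps in this invocation.
    """
    if checkpoint.exists():
        state = json.loads(checkpoint.read_text())
    else:
        state = {"next": 0, "results": []}
    executed = 0
    while state["next"] < len(steps):
        if stop_after is not None and executed >= stop_after:
            break  # simulated interruption; saved progress stays on disk
        state["results"].append(steps[state["next"]]())
        state["next"] += 1
        checkpoint.write_text(json.dumps(state))
        executed += 1
    return state


if __name__ == "__main__":
    path = Path(tempfile.mkdtemp()) / "checkpoint.json"
    steps = [lambda: 1, lambda: 2, lambda: 3]
    print(run_steps(steps, path, stop_after=2))  # first invocation, interrupted
    print(run_steps(steps, path))                # second invocation resumes
```

Because completed steps are read from the checkpoint rather than executed again, an interruption costs at most one step of recomputation.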

Considerations

Cold start latency may impact responsiveness
Complex workflows may require careful orchestration design
Not all workloads are suitable for on-demand execution
Inefficient agent design can still lead to excessive compute even in serverless environments
Trade-offs between responsiveness, cost, and carbon should be evaluated
