MIT and Microsoft's 'Murakkab' cuts the cost and energy of running AI agents

Researchers from MIT and Microsoft Azure have detailed Murakkab, a system designed to make agentic AI workflows dramatically cheaper and less energy-hungry to run. Described in an MIT News writeup on June 25, the work targets a problem that has grown alongside the agent boom: the elaborate, multi-model pipelines behind modern agents are expensive and wasteful, partly because developers hand-wire which models, tools, and hardware each step uses and rarely revisit those choices.

Describe the goal, not the plumbing

Murakkab's central idea is to let a developer describe what an agentic application should accomplish in plain language, rather than specifying how the components fit together. The system then automatically selects the best available models and tools, decides which steps can run in parallel versus in sequence, and maps the workload onto hardware — continuously reconfiguring at runtime to honor the developer's stated priorities, whether that is minimizing cost, maximizing speed, or reducing energy use.

A profile-guided optimizer and an adaptive runtime jointly manage that full stack, a kind of cross-layer optimization the researchers say existing agent frameworks and cloud schedulers cannot do on their own because each only sees part of the picture.

The efficiency numbers

In the team's tests, the gains were substantial. Murakkab completed comparable work using about 35 percent of the compute, roughly 27 percent of the energy, and around 25 percent of the cost of more conventional approaches. In one case the system achieved better than an order-of-magnitude reduction in energy with only about a 2 percent drop in accuracy — the kind of trade-off that becomes attractive at scale.

The project is credited to MIT's Gohar Chaudhry and associate professor Adam Belay, working with Microsoft Azure technical fellow Ricardo Bianchini and collaborators, and is slated for presentation at the USENIX Symposium on Operating Systems Design and Implementation (OSDI).

The timing is notable. As enterprises move agents from pilots into production, the recurring bill for inference — not the one-time cost of building an agent — is becoming the dominant expense, and data-center energy draw is under growing scrutiny. Work like Murakkab reframes agent reliability and economics as a systems problem, suggesting much of the waste in today's agent stacks comes not from the models themselves but from how clumsily they are wired together.

AI-assisted reporting, overseen by the AgentsAI team. Spotted an error? Let us know.

Agents AI

MIT and Microsoft's 'Murakkab' cuts the cost and energy of running AI agents

Describe the goal, not the plumbing

The efficiency numbers

Sources

More ai news

OpenAI and Broadcom unveil 'Jalapeño,' a custom chip built only for LLM inference

US partially lifts Anthropic's Mythos 5 export ban, Fable 5 still blocked

OpenAI agrees to stagger GPT-5.6 release as US government screens early access

Transformer co-inventor Noam Shazeer leaves Google for OpenAI