Agent Learning: Execution-Aware Policy Optimization for LLM Tool Systems
Mar 2026
Agentic Workflow
Policy Optimization
Formalizes tool orchestration in LLM systems as a learnable policy, optimized to minimize execution cost and latency while satisfying constraints.