
Session-Level vs Node-Level Metrics: What Each Reveals About Agent Quality
Evaluating AI agents requires more than a single score. Real systems involve multi-turn interactions, tool use, retrieval, and branching decisions. A more reliable approach is to measure quality at two layers: the session level and the node level. Session-level metrics summarize the outcome and user experience of a complete interaction. Node-level metrics