Human-in-the-loop (HITL)
Human-in-the-loop (HITL) is a design pattern in which AI does the work and a person reviews and approves at defined checkpoints before consequences attach. The system extracts, scores, drafts, or recommends; a human confirms before an award is made, a payment released, or a commitment sent to a supplier. Checkpoint placement, not model quality alone, determines how much risk an AI workflow carries.
Examples
Confidence routing: Quote extraction auto-accepts line items above 98% confidence and queues the rest. Reviewers see 11% of lines instead of 100%, and the two real errors that month surface in the queue, not in a mispriced PO.
Sample audit: A monthly audit pulls 50 auto-accepted lines at random. After two quarters with zero errors found, the team widens auto-accept from 98% to 96% confidence, shrinking the review queue further.
Award checkpoint: An agent assembles a recommended 60/40 split award with its reasoning attached. The category manager overrides to 70/30 to keep tooling consolidated, approves, and the system records both the recommendation and the override for the next audit.
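The first two examples can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular product: `LineItem`, `route`, `sample_for_audit`, and `widened_threshold` are hypothetical names, and the step size, floor, and "six clean monthly audits" rule are assumptions chosen to mirror the 98% to 96% change above.

```python
import random
from dataclasses import dataclass


@dataclass
class LineItem:
    line_id: str
    confidence: float  # model confidence in [0, 1]


def route(items, threshold=0.98):
    """Split extracted lines into auto-accepted and queued-for-review."""
    accepted = [i for i in items if i.confidence > threshold]
    queued = [i for i in items if i.confidence <= threshold]
    return accepted, queued


def sample_for_audit(accepted, k=50, seed=None):
    """Draw up to k auto-accepted lines at random for a human audit."""
    rng = random.Random(seed)
    return rng.sample(accepted, min(k, len(accepted)))


def widened_threshold(current, errors_per_audit, clean_audits_required=6,
                      step=0.02, floor=0.90):
    """Lower the auto-accept threshold only after a run of clean audits.

    Assumption: six monthly audits stand in for "two quarters"; the step
    and floor are illustrative values, not recommendations.
    """
    recent = errors_per_audit[-clean_audits_required:]
    if len(recent) == clean_audits_required and sum(recent) == 0:
        return max(floor, round(current - step, 2))
    return current


items = [LineItem("a", 0.99), LineItem("b", 0.98),
         LineItem("c", 0.50), LineItem("d", 0.995)]
accepted, queued = route(items)
audit_batch = sample_for_audit(accepted, k=50, seed=0)
```

The threshold only moves on audit evidence, so the review queue shrinks as a consequence of measured error rates and not of reviewer fatigue.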
Definition
HITL exists because AI output is probabilistic. A system that is right 96% of the time is a great assistant and a poor unsupervised actor, because the other 4% lands somewhere expensive. Routing consequential actions through review converts model accuracy into an acceptable operating posture without giving up the speed gains on everything upstream of the checkpoint.
Where the loop belongs: on actions that are irreversible, external, or contractual. That means supplier awards, purchase orders and payments, contract terms, and anything a supplier will reasonably read as a commitment. This is also the boundary that keeps agentic systems and autonomous negotiation inside governance: machines propose and operate within bounds, people own commitments. The approval trail doubles as compliance evidence: who approved what, when, on what information.
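One way to picture the checkpoint and its approval trail is a gate on action types plus an immutable record per decision. The sketch below is an assumption-laden illustration: `GATED_ACTIONS`, `requires_approval`, `ApprovalRecord`, and every field name are hypothetical, and the sample values reuse the 60/40 to 70/30 override from the examples.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical action kinds that must pass through a human checkpoint.
GATED_ACTIONS = {"supplier_award", "purchase_order", "payment", "contract_terms"}


def requires_approval(action_kind: str) -> bool:
    """True when the action is one a person must approve first."""
    return action_kind in GATED_ACTIONS


@dataclass(frozen=True)  # frozen: a record's fields are not reassigned after creation
class ApprovalRecord:
    action_kind: str        # what was approved
    recommended: dict       # what the system proposed
    approved: dict          # what the person actually signed off on
    approver: str           # who approved
    rationale: str          # why, especially for an override
    evidence_refs: tuple    # what information the approver saw
    approved_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)  # when
    )

    @property
    def overridden(self) -> bool:
        """True when the approved outcome differs from the recommendation."""
        return self.recommended != self.approved


record = ApprovalRecord(
    action_kind="supplier_award",
    recommended={"Supplier A": 60, "Supplier B": 40},
    approved={"Supplier A": 70, "Supplier B": 30},
    approver="category.manager",
    rationale="keep tooling consolidated",
    evidence_refs=("quote-comparison",),
)
```

Storing the recommendation next to the approved outcome is what lets a later audit see both what the system proposed and what the person changed.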
The failure mode is review theater. If approvers wave through 99.8% of items untouched, the checkpoint is dead weight and attention has already left the room. Better: widen auto-accept by lowering the confidence threshold where audit results support it, review only exceptions, and sample-audit the rest. Platforms like LightSource are built around this division of labor, with AI handling extraction and comparison and buyers approving what leaves the building.
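Review theater can be watched for with a single metric: the share of queued items approved without any change. The following is a rough sketch under stated assumptions. The outcome labels and the function names are hypothetical, and the 0.998 limit simply echoes the 99.8% figure above; it is not a recommended cutoff.

```python
from collections import Counter


def untouched_rate(outcomes):
    """Fraction of reviewed items approved with no edits.

    `outcomes` is an iterable of labels such as "approved_unchanged",
    "edited", or "rejected" (hypothetical label names).
    """
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return counts["approved_unchanged"] / total


def looks_like_review_theater(outcomes, limit=0.998):
    """Flag a checkpoint whose reviewers almost never change anything."""
    return untouched_rate(outcomes) >= limit


rubber_stamped = ["approved_unchanged"] * 999 + ["edited"]
engaged = ["approved_unchanged"] * 90 + ["edited"] * 8 + ["rejected"] * 2
```

A flagged checkpoint is a prompt to change what reaches reviewers, not a reason to remove the checkpoint from consequential actions.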