Natural language processing (NLP)

Natural language processing (NLP) is the set of computational techniques for working with human language: extracting fields from documents, classifying text, matching similar items, and searching by meaning rather than exact keywords. In procurement, NLP parses contracts and quotes, normalizes free-text line items and part descriptions, and reads supplier emails so their contents become usable data instead of inbox sediment.

Examples

Line-item normalization: Three plants buy the same O-ring under 14 description variants. Matching collapses them into one item with 480,000 units of combined annual volume, and exposes that one plant pays $0.064 while another pays $0.081 for the identical part.
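A minimal sketch of the roll-up that follows matching, assuming the description variants have already been mapped to a single item ID. The item ID, the plant labels, the per-plant volumes, and the third plant's price are all hypothetical. Only the 480,000-unit total and the $0.064 and $0.081 prices come from the example above.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical purchase records, already matched to one item ID.
# Per-plant volumes and plant C's price are invented for illustration.
records = [
    {"plant": "A", "item_id": "ORING-001", "units": 200_000, "unit_price": Decimal("0.064")},
    {"plant": "B", "item_id": "ORING-001", "units": 160_000, "unit_price": Decimal("0.081")},
    {"plant": "C", "item_id": "ORING-001", "units": 120_000, "unit_price": Decimal("0.072")},
]

def summarize(rows):
    """Group records by item and report combined volume and price spread."""
    by_item = defaultdict(list)
    for row in rows:
        by_item[row["item_id"]].append(row)

    summary = {}
    for item_id, item_rows in by_item.items():
        prices = [r["unit_price"] for r in item_rows]
        summary[item_id] = {
            "total_units": sum(r["units"] for r in item_rows),
            "min_price": min(prices),
            "max_price": max(prices),
            "spread": max(prices) - min(prices),
        }
    return summary

summary = summarize(records)
```

Decimal is used instead of float so that sub-cent price differences compare exactly.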

Clause extraction: Extraction pulls termination notice periods from 120 supply agreements in an afternoon. Nine contracts carry 30-day terms the team believed were 90, which changes the risk math on two single-sourced components.
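One way to picture the extraction step is a pattern match over contract text, sketched below under simplifying assumptions. The regular expression, the function name `extract_notice_days`, and the sample sentences are illustrative. Real agreements phrase notice periods in many more ways, which is why production systems tend to rely on trained models or LLMs and send misses to a reviewer.

```python
import re
from typing import Optional

# Illustrative pattern: a number of days followed by "notice", allowing
# forms like "30 days' prior written notice" or "ninety (90) days written notice".
NOTICE_RE = re.compile(
    r"(\d{1,3})\)?\s*(?:calendar\s+|business\s+)?days?['’]?\s+"
    r"(?:prior\s+)?(?:written\s+)?notice",
    re.IGNORECASE,
)

def extract_notice_days(text: str) -> Optional[int]:
    """Return the first notice period found in days, or None if nothing matches."""
    match = NOTICE_RE.search(text)
    return int(match.group(1)) if match else None

def differs_from_belief(text: str, believed_days: int) -> bool:
    """True when an extracted period contradicts what the team assumed."""
    found = extract_notice_days(text)
    return found is not None and found != believed_days
```

A contract where `extract_notice_days` returns None has not been shown to lack a notice clause. It only means this pattern did not find one, so it belongs in a manual review queue.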

Email triage: Supplier emails auto-classify as quote, order confirmation, or delay notice. Delay notices route to planners within the hour instead of after the weekly inbox sweep, buying back two to three days of reaction time.
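The triage flow can be sketched with a keyword scorer standing in for a trained classifier or an LLM. The keyword lists, the label names, and the queue names other than the planner route are assumptions made for this example.

```python
# Hypothetical keyword lists; a deployed system would learn these from labeled email.
KEYWORDS = {
    "delay_notice": ("delay", "late shipment", "backorder", "push out"),
    "order_confirmation": ("order confirmation", "order acknowledged", "po confirmed"),
    "quote": ("quotation", "quote", "pricing", "rfq"),
}

# Delay notices go to planners, as in the example above; other queues are invented.
ROUTES = {
    "delay_notice": "planners",
    "order_confirmation": "order_desk",
    "quote": "buyers",
    "needs_review": "shared_inbox",
}

def classify(subject: str, body: str) -> str:
    """Pick the label with the most keyword hits, or defer when nothing matches."""
    text = f"{subject} {body}".lower()
    scores = {
        label: sum(text.count(keyword) for keyword in keywords)
        for label, keywords in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)  # ties resolve to the first label listed
    return best if scores[best] > 0 else "needs_review"

def route(subject: str, body: str) -> str:
    """Return the queue an email should land in."""
    return ROUTES[classify(subject, body)]
```

Listing `delay_notice` first means ties resolve toward the time-sensitive category, on the reasoning that a missed delay costs more than a misrouted quote.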

Definition

The canonical procurement problem NLP solves: "HEX BOLT M8X40 ZN" and "Bolt, hex head, M8 x 40 mm, zinc plated" are the same part to a human and different strings to a database. Matching, extraction, and classification close that gap, which is why NLP sits underneath spend classification, supplier-record deduplication, and contract clause work. Historically, each task needed its own statistical model or rule set: one extractor for dates, another for payment terms. Large language models now cover many of these tasks with a single general model, though purpose-trained classifiers survive where volume and cost dominate, such as coding millions of invoice lines.
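To make the bolt example concrete, here is a rule-based sketch that reduces both descriptions to the same key. The abbreviation map and the filler-word list are assumptions that fit only this pair of strings. Treating "head" or "plated" as filler would be wrong for many catalogs, and real matching typically combines rules like these with learned similarity.

```python
import re

# Assumed abbreviation map and filler words; both are illustrative only.
ABBREVIATIONS = {"zn": "zinc"}
FILLER = {"head", "mm", "plated"}

def normalize(description: str) -> frozenset:
    """Reduce a free-text part description to an order-independent set of tokens."""
    text = description.lower()
    # Rejoin metric sizes written with spaces: "m8 x 40" -> "m8x40".
    text = re.sub(r"\b(m\d+)\s*x\s*(\d+)", r"\1x\2", text)
    tokens = re.findall(r"[a-z0-9]+", text)
    expanded = (ABBREVIATIONS.get(token, token) for token in tokens)
    return frozenset(token for token in expanded if token not in FILLER)

same_part = normalize("HEX BOLT M8X40 ZN") == normalize(
    "Bolt, hex head, M8 x 40 mm, zinc plated"
)
```

Using a set rather than a string makes word order irrelevant, so "hex bolt" and "bolt, hex" compare equal while a different size such as M10X40 still produces a different key.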

What separates working deployments from demos is measurement: precision and recall per task, on your own documents. Quote PDFs from a 40-person machine shop look nothing like clean sample contracts, and messy real-world documents are where accuracy claims collapse. Good practice is to hold out a labeled test set, measure against it, and route low-confidence outputs to human review rather than forcing an answer.
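The measurement loop described above can be sketched in a few lines. The function names, the label strings, and the 0.85 confidence threshold are assumptions. The threshold in particular should be tuned against the held-out set rather than taken from this example.

```python
def precision_recall(gold, predicted, label):
    """Precision and recall for one label over a labeled test set."""
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == label and p == label)
    false_pos = sum(1 for g, p in pairs if g != label and p == label)
    false_neg = sum(1 for g, p in pairs if g == label and p != label)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

def accept_or_review(prediction, confidence, threshold=0.85):
    """Keep confident outputs; send the rest to a person instead of forcing an answer."""
    return prediction if confidence >= threshold else "human_review"

# Tiny hand-made test set, for illustration only.
gold = ["quote", "delay", "delay", "quote"]
predicted = ["quote", "delay", "quote", "quote"]
delay_precision, delay_recall = precision_recall(gold, predicted, "delay")
```

Here the classifier is never wrong when it says "delay" (precision 1.0) but misses half of the real delays (recall 0.5), a trade-off that a single accuracy figure would hide.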

Related Terms

Large language model (LLM)

Unstructured data

Spend classification

Machine learning (ML)

Multi-sourcing

Nearshoring
