Data cleansing

Data cleansing in procurement is the work of detecting and correcting errors in supplier, item, and transaction records: deduplicating supplier entries, normalizing names and units of measure, fixing miscoded categories, and reconciling records across systems. Clean data is the prerequisite for trustworthy spend analysis, supplier consolidation, and price comparison, because every duplicate or misspelled record fragments the picture of what a company actually buys.

Examples

Supplier dedup: A cleansing pass on 6,800 vendor records finds 1,150 duplicates and collapses the file to 5,650 unique suppliers. One connector supplier consolidates from four records into a $2.7M relationship, enough to qualify for the next volume-discount tier at renegotiation.

Unit normalization: Resin purchases logged in pounds at one plant and kilograms at another make price-per-unit comparison meaningless. Converting everything to per-kilogram pricing reveals a 7 percent gap between plants on identical material.
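A minimal sketch of that conversion, assuming each purchase line carries a unit price and a unit-of-measure code. The prices, the `TO_KG` table, and the function name are hypothetical and chosen only to illustrate a gap of roughly 7 percent; the pound-to-kilogram factor is the standard definition.

```python
# Kilograms per pound, exact by definition of the international pound.
KG_PER_LB = 0.45359237

# Hypothetical lookup: how many kilograms one unit of each UoM represents.
TO_KG = {"kg": 1.0, "lb": KG_PER_LB}


def price_per_kg(unit_price: float, uom: str) -> float:
    """Convert a price quoted per `uom` into a price per kilogram."""
    return unit_price / TO_KG[uom.lower()]


# Illustrative prices only, not taken from any real dataset.
plant_a = price_per_kg(1.00, "lb")   # quoted per pound
plant_b = price_per_kg(2.06, "kg")   # already quoted per kilogram

# Relative gap between the two plants once both are on the same basis.
gap = (plant_a - plant_b) / plant_b
print(f"Plant A {plant_a:.2f}/kg, Plant B {plant_b:.2f}/kg, gap {gap:.1%}")
```

An unknown unit code raises a `KeyError` here rather than passing through silently, which is usually the safer default for a cleansing pass.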

Category repair: 2,300 transactions coded to "miscellaneous" are re-mapped using line descriptions, moving $3.4M into sourceable categories like machined components and labels.
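One simple way to do this kind of re-mapping is keyword matching on the line description. The sketch below assumes a hand-built keyword table; the keywords, sample descriptions, and function name are hypothetical, and a production pass would likely need a larger rule set or a trained classifier plus manual review.

```python
# Hypothetical keyword rules: category name -> substrings that suggest it.
CATEGORY_KEYWORDS = {
    "machined components": ("machined", "cnc", "milled"),
    "labels": ("label", "sticker"),
}


def recategorize(description: str, current: str = "miscellaneous") -> str:
    """Return the first category whose keyword appears in the description.

    Falls back to the current category when no rule matches, so unmatched
    lines stay visible for manual review instead of being guessed at.
    """
    text = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return current


# Illustrative transaction descriptions, all currently coded "miscellaneous".
descriptions = [
    "CNC machined bracket, aluminum",
    "Thermal LABEL roll 4x6",
    "Office party catering",
]
remapped = [recategorize(d) for d in descriptions]
```

Rules are checked in dictionary order, so a description that matches two categories takes the first one listed. That ordering is a design choice worth making explicit in any real rule table.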

Definition

Procurement data rots in predictable ways. "Acme Industries," "ACME IND," and "Acme Industries LLC" enter the vendor master as three suppliers, usually because three plants onboarded the same vendor separately. Units of measure flip between pieces and thousands. The same part carries different item numbers in two ERP instances after an acquisition. None of this blocks transactions, which is exactly why it accumulates: the PO still clears, the invoice still pays, and the damage only appears when someone tries to analyze the data.
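The vendor-name variants above can be grouped with a normalized match key. This is a minimal sketch, not a full matching method: the suffix and abbreviation lists are hypothetical and deliberately tiny, and real deduplication typically also compares addresses, tax IDs, or bank details before merging anything.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately short lists; a real pass needs fuller ones.
LEGAL_SUFFIXES = {"llc", "inc", "corp", "co", "ltd"}
ABBREVIATIONS = {"ind": "industries"}


def match_key(name: str) -> str:
    """Build a comparison key: lowercase, strip punctuation,
    expand known abbreviations, and drop legal-form suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    tokens = [ABBREVIATIONS.get(t, t) for t in tokens if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def group_duplicates(names):
    """Group raw vendor names that share the same match key."""
    groups = defaultdict(list)
    for name in names:
        groups[match_key(name)].append(name)
    return dict(groups)


vendors = ["Acme Industries", "ACME IND", "Acme Industries LLC", "Bolt Supply Co."]
groups = group_duplicates(vendors)

# Keys with more than one raw name are merge candidates for review.
candidates = {key: names for key, names in groups.items() if len(names) > 1}
```

Exact matching on a normalized key catches only the predictable variants; misspellings would need a fuzzy comparison on top of it.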

The damage is concrete. A supplier that looks like three separate $400K vendors is actually a single $1.2M relationship negotiated as three small ones. Two plants buying the same bracket under different item numbers never see the price gap between them. Spend analysis on dirty data produces precise-looking answers to the wrong questions.

Cleansing is the corrective pass: match and merge duplicates, standardize naming and units, repair categories. It differs from data enrichment, which appends new external attributes rather than fixing existing ones, and from master data management, the ongoing governance that keeps records from drifting again. Cleansing without that governance is mopping the floor with the tap running.

Related Terms

Vendor master data

Data enrichment

Spend analysis

Master data management (MDM)
