You have your product catalog. Competitors have theirs. Before you can compare prices, track MAP violations, or analyze assortment gaps — you need to connect them.
Which of their products is the same as yours?
This is product matching. It sounds simple. It's not. And it's where most competitive intelligence efforts break down.
That's the problem in one image. And it plays out every day across e-commerce teams.
The pricing analyst who gets automated match suggestions — Score: 94%. Score: 87%. Score: 91% — and approves them because what else can she do? Two weeks later, her boss asks why the competitive analysis shows their prices 15% above market. She checks. Three of the "matches" were wrong. Different products entirely. The analysis is garbage. She starts over.
The brand manager who knows retailers are violating MAP — he can see the prices are wrong — but can't prove it. His catalog says "SKU-4892." The retailer says "Designer Bag - Brown." Are they the same product? He's 80% sure. 80% isn't enough to send a violation notice.
This is Match Failure. Matching is the invisible prerequisite that determines whether your competitive intelligence works — or doesn't.
Everyone assumes matching is simple. You have product names. They have product names. Just connect them.
Here's where that breaks down.
The Spreadsheet Dead End
VLOOKUP requires exact matches. Product names are never exact. "Nike Air Max 90 White Men's Running Shoe" won't match "Nike AM90 White." Fuzzy matching add-ins help — until they match "Nike Air Max 90" to "Nike Air Max 95" because the text similarity is high enough.
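To make that failure concrete, here's a minimal sketch using Python's standard-library difflib (an assumption: spreadsheet add-ins use their own scoring, but the failure mode is the same). The wrong pair outscores the right one.

```python
from difflib import SequenceMatcher

def name_similarity(a: str, b: str) -> float:
    """Plain text similarity between two lowercased product names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Correct match: the same shoe under two naming conventions -- scores low.
print(name_similarity("Nike Air Max 90 White Men's Running Shoe", "Nike AM90 White"))

# Wrong match: a different shoe with a nearly identical name -- scores high.
print(name_similarity("Nike Air Max 90", "Nike Air Max 95"))
```

Any threshold loose enough to accept the correct pair also accepts the wrong one, which is exactly how "Air Max 90" gets matched to "Air Max 95."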
One category manager described it: "I export competitor prices to Excel. I export my catalog. Then I spend the next 6 hours trying to connect them. By Thursday, I have Monday's analysis ready."
When Identifiers Don't Help
Barcodes, UPCs, GTINs — when they exist and match, the problem is solved. But they rarely do.
Competitors often don't display barcodes on product pages. Private label products have no shared identifiers. Bundles and multipacks create new identifiers that don't connect to anything. Different regions use different coding systems. And marketplaces like Amazon and eBay use internal IDs (ASIN, MPN) that don't map to yours.
A workwear brand monitoring 400 retailers found: eBay uses MPN, Amazon uses ASIN, individual retailer sites use internal codes. None of them match the brand's own SKUs. So what do you do?
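Here's a hypothetical illustration of what that looks like in data (the product, IDs, and field names below are made up): three records for the same jacket, with no key you can join on.

```python
# The same work jacket, as it appears in three different systems.
brand_catalog  = {"sku": "WJ-1042", "name": "Flex Canvas Work Jacket - Navy, L"}
amazon_listing = {"asin": "B0ABCD1234", "title": "Flex Canvas Jacket Navy Large"}
ebay_listing   = {"mpn": "FLXCNV-NVY-L", "title": "canvas work jacket navy sz L"}

# No identifier is shared across the three records, so a join on IDs returns
# nothing -- you're back to matching on names and images.
```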
Why Most Tools Don't Solve This
Most matching tools use text-only comparison. They look at product names, descriptions, maybe some attributes, and calculate similarity scores. The problem: text matching only works when both sides have rich, consistent, standardized text.
When a retailer lists "IM Oskan Moon Shoulder Bag - Camel" and the brand catalog says "OSKAN MOON shoulder bag, studded suede calfskin leather baguette" — how does a text algorithm know if that's the same product? The words overlap, but is it the same product? The color is different. The material is different. The hardware might be different. Text algorithms alone can’t know.
A few tools attempt image matching. But image-only matching has its own problems. We tested image similarity algorithms on a luxury fashion dataset. A correct match scored 0.89. A wrong match (same model, different color) scored 0.88. The gap is too small to set a reliable threshold.
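For context, image matching typically works by embedding each product photo with a vision model and comparing the vectors. A minimal sketch, where the encoder, vector size, and threshold are illustrative:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two image-embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# In a real pipeline these come from a vision encoder (a CLIP-style model,
# for example); random vectors here just keep the sketch self-contained.
emb_catalog_photo = np.random.rand(512)
emb_retailer_photo = np.random.rand(512)

score = cosine_similarity(emb_catalog_photo, emb_retailer_photo)
is_match = score >= 0.885  # in our test, a correct match scored 0.89 and a
                           # wrong one 0.88 -- no cutoff cleanly separates them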
Text alone doesn't work. Images alone don't work. Basic algorithms can't distinguish between a 90-score match that's correct and a 90-score match that's wrong.
We tested multiple matching approaches on a luxury fashion catalog: 2,068 retailer products matched against a brand's 3,300-product catalog. The results:
The challenge isn't just finding matches — it's correctly rejecting near-matches. Of the 3,795 pairs we evaluated closely, 644 scored between 85% and 95% similarity. High enough that basic automation would approve them. But they were different products — same brand, same model name, different variant. Our system correctly rejected them.
A 50% match is obviously wrong. Anyone catches that. A 92% match that's actually wrong? That's where most tools fail. That's where accuracy matters.
Remember those 644 pairs that scored 85-95%? High enough that most tools would approve them automatically.
Every single one was wrong.
Same brand. Same model name. Different product. Here's how that happens.
See the pattern? These aren't random errors. They're systematic failures that compound. If 5% of your matches are wrong and you have 10,000 SKUs, that's 500 wrong comparisons. Every single refresh cycle. Week after week.
And each wrong match has a cost:
Decision cost: Wrong match → wrong price move → margin loss. You drop prices to "match" a competitor who isn't actually selling the same product.
Compliance cost: Wrong match → false MAP enforcement → channel conflict. You accuse a retailer of a violation that doesn't exist. The relationship takes months to repair.
Trust cost: Leadership finds errors in your reports → they stop believing the system → the project dies. All that investment in competitive intelligence, abandoned because nobody trusts the data.
If your team spends even 4 hours a week cleaning up matches, that's 200+ hours a year — and that's just the visible cost. The invisible cost is every decision made on data that looked right but wasn't.
So you can't trust automated matching. What's the alternative?
Manual verification. And here's what that actually looks like.
We matched a luxury retailer's designer brand listings against the brand's own US website — 2,068 products against a 3,300-product catalog. Same brand on both sides. Should be straightforward, right?
For each product: search the brand catalog, compare images side by side, check attributes (color, material, size), decide if it's an exact match, a variant, or no match at all.
And this was the easy case — same brand on both sides, high-quality images, detailed product descriptions. Matching across different brands, or retailers with poor product data, takes longer.
The result: 814 matches found (632 exact + 182 variants). The other 1,254 products? They simply don't exist in the brand's own catalog — retailer-exclusive items, discontinued products, or regional variations. That's not a matching failure; that's reality. The system correctly identified what matched and what didn't.
When we audited the final outputs: 99% accuracy. Not on a test set — on the actual deliverable, after human review on low-confidence cases.
Now imagine doing this across 10 retailers. Or 50. Or maintaining it weekly as new products appear and old ones change.
The math doesn't work. Manual matching doesn't scale.
At 100 SKUs, you can match manually. Open both spreadsheets, compare line by line, connect what you can. It takes a day. It's manageable.
At 2,068 SKUs matched against 3,300 products — the case we just described — manual matching would take 86 hours at roughly 2.5 minutes per product. That's more than two weeks of full-time work.
So you turn to automation. And here's where it gets interesting.
In that same project, 644 pairs scored between 85% and 95% similarity. High enough that text-based tools would approve them as matches. Same brand, same model name, similar descriptions.
All 644 were wrong.
Different color. Different material. Different variant. We caught them. A text-only tool wouldn't.
A 50% match is obviously wrong — anyone catches that. A 92% match that's wrong? Text-only tools approve it. It corrupts your data.
The scale problem has three layers:
Detection: You can't manually verify thousands of matches. So errors hide until they cause visible damage — a pricing decision that doesn't make sense, a MAP notice that gets disputed, an executive who spots a mismatch in your report.
Trust: Once you've found errors, you stop trusting the data. But you still have to use it. So you add caveats, hedge recommendations, spend time defending methodology instead of driving decisions.
Compounding: Bad matches don't just affect one report. They feed into pricing algorithms, trend analysis, competitive positioning. One wrong match can ripple through multiple decisions before anyone notices.
Here's the thing we kept seeing: teams would try text matching. It would fail. They'd try image matching. It would fail. They'd try AI-powered tools. Same result.
The breakthrough wasn't finding a better algorithm. It was realizing that no single method can solve this alone.
Text catches name similarities. But it can't see that "camel brown" and "taupe" are different colors. Images catch visual differences. But they can't read that one product is leather and one is suede. Attributes catch variant details. But they miss new products that haven't been categorized.
What actually works is layering them — and knowing when to escalate to human review:
1. Text extraction and normalization. Extract model names, brand, and key attributes from titles and descriptions. Normalize variations ("OSKAN" vs "Oskan" vs "oskan-moon").
2. Attribute comparison. Compare structured attributes: color families (is "camel" close to "tan"?), sizes, materials, hardware. Not just text similarity — semantic understanding.
3. Image comparison. Visual comparison catches what text misses. Same model name but different design? Images reveal it. But images alone aren't enough — they need text context.
4. Confidence scoring with reasons. Not just a number — why the match was made. "Matched on brand, model name, color family, and visual similarity" vs "Matched on text only, images differ."
5. Human review routing. High-confidence matches auto-approve. Low-confidence matches route to review. You see the reasoning; you make the final call on ambiguous cases.
The key difference: no single method works alone. Text catches name similarities. Attributes catch variant differences. Images catch visual mismatches. Human review catches the edge cases that algorithms can't resolve. Each layer filters what the previous one missed.
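Here's a simplified sketch of how those layers can be combined. The weights, thresholds, and field names are illustrative, not our production values:

```python
from dataclasses import dataclass

@dataclass
class Product:
    brand: str
    model: str            # normalized: "oskan moon", not "OSKAN" or "oskan-moon"
    color_family: str     # e.g. "brown" covers camel, tan, cognac
    material: str
    image_embedding: list[float]

def text_score(a: Product, b: Product) -> float:
    """Layer 1: do brand and normalized model name agree?"""
    return 1.0 if (a.brand == b.brand and a.model == b.model) else 0.0

def attribute_score(a: Product, b: Product) -> float:
    """Layer 2: semantic attributes -- color family and material."""
    checks = [a.color_family == b.color_family, a.material == b.material]
    return sum(checks) / len(checks)

def image_score(a: Product, b: Product) -> float:
    """Layer 3: visual similarity (cosine similarity of image embeddings)."""
    dot = sum(x * y for x, y in zip(a.image_embedding, b.image_embedding))
    na = sum(x * x for x in a.image_embedding) ** 0.5
    nb = sum(x * x for x in b.image_embedding) ** 0.5
    return dot / (na * nb)

def match(a: Product, b: Product) -> dict:
    """Combine the layers, keep the reasoning, route ambiguous cases to a human."""
    scores = {
        "text": text_score(a, b),
        "attributes": attribute_score(a, b),
        "image": image_score(a, b),
    }
    confidence = 0.4 * scores["text"] + 0.3 * scores["attributes"] + 0.3 * scores["image"]
    if confidence >= 0.90:
        decision = "auto-approve"
    elif confidence >= 0.60:
        decision = "human review"   # the dangerous 85-95% near-matches land here
    else:
        decision = "reject"
    return {"decision": decision, "confidence": confidence, "reasons": scores}
```

The point isn't the particular weights. It's that the text, attribute, and image signals stay separate in the output, so a reviewer can see why a pair was approved or flagged.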
This is why most tools fail — they rely on one or two methods and call it "AI-powered matching." The 90-score match that's wrong? They can't catch it. The variant that looks identical in text but differs in images? They miss it.
We ran this exact process on a luxury fashion matching project. Real products, real stakes.
Challenge: Match 2,068 retailer products against a 3,300-product brand catalog. Products have color, material, and hardware variations — the Oskan Moon bag alone comes in 6+ variants. Text-only matching couldn't distinguish them.
Complication: 644 product pairs scored 85-95% similarity — high enough for basic tools to approve. All were wrong matches (same model name, different variant). Missing these would corrupt pricing comparisons.
How it works: Text analysis identifies brand and model. Image comparison catches color and material differences. Attribute extraction distinguishes variants (suede vs leather, whipstitch vs studs).
The outcome: 814 correct matches (632 exact + 182 variants), 644 near-matches correctly rejected, and 1,254 products confirmed as not in the brand catalog. Final audited accuracy: 99% — on real products, not a test set.
99% accuracy sounds like a marketing claim. Here's what it actually means — and what it doesn't.
What We Handle
Where the 1% Comes From
You get confidence scores and reasons for every match. You can audit the logic. You can override decisions. You're not locked into a black box.
Here's what the output actually looks like:
| Your Product | Matched To | Type | Score | Reasons |
|---|---|---|---|---|
| Oskan Moon Bag - Camel Suede | OSKAN MOON shoulder bag | Exact | 98% | Brand ✓ Model ✓ Color ✓ Material ✓ Image match |
| Dalby Boots - Brown Leather | Dalby Boots - Dark Brown | Variant | 91% | Brand ✓ Model ✓ Color differs (brown vs dark brown) |
| Lisia Dress - Floral Print | No match found | None | — | Product not in brand catalog (retailer exclusive) |
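The same rows, expressed as the kind of structured records a pricing system or MAP monitor could consume (the field names are illustrative):

```python
matches = [
    {
        "your_product": "Oskan Moon Bag - Camel Suede",
        "matched_to": "OSKAN MOON shoulder bag",
        "match_type": "exact",
        "confidence": 0.98,
        "reasons": ["brand", "model", "color", "material", "image match"],
    },
    {
        "your_product": "Dalby Boots - Brown Leather",
        "matched_to": "Dalby Boots - Dark Brown",
        "match_type": "variant",
        "confidence": 0.91,
        "reasons": ["brand", "model", "color differs (brown vs dark brown)"],
    },
    {
        "your_product": "Lisia Dress - Floral Print",
        "matched_to": None,
        "match_type": "none",
        "confidence": None,
        "reasons": ["product not in brand catalog (retailer exclusive)"],
    },
]
```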
Product matching is where competitive intelligence breaks. Every pricing decision, every MAP alert, every competitive benchmark depends on the matches being right.
So why is it so hard?
Because matching isn't an algorithm problem. It's a judgment problem. Text alone can't tell that “camel brown suede” and “taupe leather” are different. Images alone can't read that one is a bundle and one is a single unit. Scores alone can't distinguish a correct 92% from a wrong 92%.
Most tools try to solve this with one method. They fail.
If you're spending hours trying to connect competitor data to your catalog — matching is the blocker.
If you know you have a MAP problem but can't prove it — matching is the blocker.
If your team gave up on competitor monitoring because “it's too complicated” — matching is probably why.
We've been solving this for over 20 years. The answer isn't a better algorithm. It's combining text, images, attributes, and human judgment — because no single method works alone.
If matching is your blocker, let's talk.