Table of Contents

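300,000 Pages. 10% Complete. 90% Outdated.
48-Site Proof in ~48 Hours
Use Case 1: Assortment Intelligence (Weekly)
The Complexity Behind Weekly Delivery
The Result: "You Have 29 Gucci. Average is 258."
Use Case 2: Cross-Border D2C Pricing
The Complexity of Cross-Border Pricing
Two Years, Two Use Cases, One Relationship
Who This Helps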

The Company: A global luxury fashion marketplace working with 150 in-scope partner seller sites. These sellers list products on the platform and also sell those same products on their own websites.

The commercial team faced a problem they couldn't quantify. Products were missing from their marketplace — items sellers had on their own sites but hadn't listed on the platform. Every missing item was missing GMV.

But no one could tell sellers which products were missing. Or how many. Or for which brands and categories. Account managers walked into negotiations with hunches instead of numbers. Sellers had no reason to change anything.

300,000 Pages. 10% Complete. 90% Outdated.

Before working with us, account sales teams collected assortment data manually. At this scale — 150 sellers, hundreds of brands, multiple categories — that's roughly 300,000 pages to check every week. Coverage reached about 10%.

Across 20 account managers, more than 120 hours per week of senior commercial time went to data collection that still couldn't be trusted, not to strategy. The data that did exist was 3-4 weeks old by the time it reached decision-makers. In luxury fashion, where inventory turns weekly, that meant negotiating with information that was already several inventory cycles out of date.

48-Site Proof in ~48 Hours

For the proof of concept, we scraped 48 of their third-party (3P) seller sites. Within 48 hours, structured data showed product counts by brand, by category, and by seller. Accuracy was 98% within that proof scope.

The commercial leadership team could immediately see where gaps existed — and more importantly, could show sellers the exact same data.

Use Case 1: Assortment Intelligence (Weekly)

Every week, structured data arrives across all 150 in-scope seller sites:

Field	Example
Store	Partner Boutique (UK)
Country	United Kingdom
Category	Men
Subcategory	Clothing
Brand	Gucci
Total Products	66
Out of Stock	0

These fields make assortment planning actionable. Account managers walk into negotiations knowing precisely what to ask for.

The Complexity Behind Weekly Delivery

Luxury e-commerce sites are notoriously difficult to scrape. Here's what weekly delivery actually requires:

Anti-bot protection: Many luxury retailer sites actively resist automated collection with CAPTCHA, rate limiting, and behavioral analysis. We use multiple approaches to maintain scheduled delivery reliability across all in-scope sites.

Site navigation: Every site structures its catalog differently: infinite scroll, pagination, "load more" buttons, nested categories. Complete product discovery requires handling all of these patterns.

Constant changes: Dozens of those 150 sites require maintenance in any given week due to HTML or URL changes. Detection and fixes happen before scheduled delivery.

Cross-region normalization: Some seller sites are in Italian, French, or Spanish. Naming and mapping normalization ensures results are comparable across regions.
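To make the cross-region normalization point concrete, here is a minimal sketch of mapping localized category labels to one canonical name. The mapping table, the function names, and the accent-stripping approach are illustrative assumptions, not a description of the production pipeline.

```python
import unicodedata
from typing import Optional

# Hypothetical lookup from localized labels to one canonical category name.
# A real mapping would be far larger and maintained per site.
CANONICAL_CATEGORIES = {
    "men": "Men", "uomo": "Men", "homme": "Men", "hombre": "Men",
    "women": "Women", "donna": "Women", "femme": "Women", "mujer": "Women",
    "clothing": "Clothing", "abbigliamento": "Clothing",
    "vetements": "Clothing", "ropa": "Clothing",
}


def normalize_label(label: str) -> str:
    """Lowercase, trim, and strip accents so 'Vêtements' matches 'vetements'."""
    decomposed = unicodedata.normalize("NFKD", label.strip().lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_canonical(label: str) -> Optional[str]:
    """Return the canonical category, or None so unmapped labels can be reviewed."""
    return CANONICAL_CATEGORIES.get(normalize_label(label))


if __name__ == "__main__":
    for raw in ["Vêtements", " UOMO ", "Calzature"]:
        print(repr(raw), "->", to_canonical(raw))
```

Returning None for unknown labels, rather than guessing, keeps unmapped categories visible for manual review instead of silently merging them into the wrong bucket.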

The Result: "You Have 29 Gucci. Average is 258."

That's the kind of conversation account teams can now have.

Before, negotiations were vague: "We think you're missing some products." Now they're specific: "For Gucci menswear, you have 29 products listed. The average across our sellers is 258. Here's the gap by category."
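The comparison behind that conversation is simple arithmetic. The sketch below assumes the average is taken over every seller's product count for one brand and category, including the seller being compared. The seller names and the other sellers' counts are made-up illustration values, chosen only so that the average works out to 258.

```python
from statistics import mean


def assortment_gap(counts_by_seller: dict, seller: str) -> dict:
    """Compare one seller's listed product count with the cross-seller average."""
    average = mean(counts_by_seller.values())
    listed = counts_by_seller[seller]
    return {"listed": listed, "average": average, "gap": average - listed}


if __name__ == "__main__":
    # Hypothetical counts for one brand and category; not real client data.
    gucci_menswear = {"seller_a": 29, "seller_b": 310, "seller_c": 435}
    print(assortment_gap(gucci_menswear, "seller_a"))
```

Running the same calculation per brand and category is what turns a single headline number into a list of specific gaps to raise with each seller.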

Assortment completion went from 50% to 90-98%. Sellers started adding products they hadn't realized were missing — or had kept exclusive to their own sites.

120+ hours per week returned to the commercial team. Twenty account managers, no longer doing manual data collection.
Negotiation conversations changed. When you have exact numbers by brand and category, sellers respond differently than when you're estimating.

Use Case 2: Cross-Border D2C Pricing

Once the assortment team had weekly, trusted data, the pricing team explored the same delivery model for a different challenge.

Two years into the engagement, the Global Senior Director of Operations and Senior Pricing Lead reached out. After COVID, luxury brands had shifted focus to their direct-to-consumer channels. Several top luxury brands were running sales on their own sites — with no visibility into that pricing from the marketplace side.

Without D2C pricing data, there was risk of either leaving money on the table (pricing below market) or losing customers (pricing above competitors). In luxury fashion, even a 5% pricing gap drives customers to D2C checkout.

Very reliable — data we don't have to second-guess.

Global Senior Director of Operations, recommending us to the pricing team

That internal recommendation is why they came to us instead of starting over with a new vendor.

This engagement was larger: 300 brands across 15 countries — roughly 4,500 brand-country combinations tracked every two weeks.

Field	Example
Product ID	1575825
Product Name	Example product name (redacted)
Category	Hats
Full Price	$176.00
Sale Price	$123.20
Discount	30%
Status	Active
Sizes	One Size

These fields feed directly into the data warehouse. Analysts match scraped D2C prices against their own catalog by brand ID, build BI dashboards for commercial and catalog teams, and inform pricing decisions based on real market data — not assumptions.
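As a rough illustration of how these fields can be checked and compared, the sketch below recomputes the discount from the full and sale prices in the sample row and measures the gap between a marketplace price and the scraped D2C price. The function names and the marketplace price of 129.36 are hypothetical; only the 176.00 and 123.20 figures come from the sample row.

```python
from decimal import Decimal, ROUND_HALF_UP


def discount_percent(full_price: Decimal, sale_price: Decimal) -> Decimal:
    """Discount as a whole-number percentage of the full price."""
    if full_price <= 0:
        raise ValueError("full_price must be positive")
    pct = (full_price - sale_price) / full_price * 100
    return pct.quantize(Decimal("1"), rounding=ROUND_HALF_UP)


def price_gap_percent(marketplace_price: Decimal, d2c_price: Decimal) -> Decimal:
    """How far the marketplace price sits above (+) or below (-) the D2C price."""
    if d2c_price <= 0:
        raise ValueError("d2c_price must be positive")
    gap = (marketplace_price - d2c_price) / d2c_price * 100
    return gap.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # Values from the sample row above.
    print(discount_percent(Decimal("176.00"), Decimal("123.20")))
    # Hypothetical marketplace price for the same product.
    print(price_gap_percent(Decimal("129.36"), Decimal("123.20")))
```

Decimal is used instead of float so that currency amounts compare exactly. Recomputing the discount is also a cheap consistency check on scraped rows.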

The Complexity of Cross-Border Pricing

Tracking 300 brands across 15 countries creates challenges most tools can't handle:

Currency and format handling: Each country displays prices differently: 1,005.00 USD vs 1.000,55 EUR vs 1'000.00 SGD. Everything is normalized into comparable formats.

Regional URL variations: Sometimes the URL stays the same but the country and price change. Detecting the region and scraping the correct market are handled automatically.

Heavy anti-bot protection: Luxury D2C sites use aggressive anti-scraping measures. These defenses are handled to maintain access across regions.

Product ID extraction: Product IDs can appear in different parts of the page. We extract the identifier that maps to their internal catalog.

Scale maintenance: When tracking 4,500 brand-country combinations, dozens change their structure in any given period. Detection and fixes happen before scheduled delivery.
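The currency and format handling item above can be sketched as a small parser. This version is only an illustration and rests on one stated assumption: a "." or "," followed by exactly two trailing digits is the decimal separator, and every other separator (including the apostrophe) groups thousands. Currencies with zero or three decimal places would need per-currency rules that are not shown here.

```python
import re
from decimal import Decimal


def parse_price(text: str) -> Decimal:
    """Parse a displayed price such as '1.000,55 EUR' into a Decimal amount."""
    # Keep only digits and the separators seen in the examples above.
    cleaned = re.sub(r"[^\d.,']", "", text)
    # An apostrophe is only ever a grouping separator.
    cleaned = cleaned.replace("'", "")
    last_sep = max(cleaned.rfind("."), cleaned.rfind(","))
    # Assumption: exactly two digits after the last separator means decimals.
    if last_sep != -1 and len(cleaned) - last_sep - 1 == 2:
        integer_part = re.sub(r"[.,]", "", cleaned[:last_sep])
        return Decimal(f"{integer_part}.{cleaned[last_sep + 1:]}")
    # Otherwise treat every separator as a thousands separator.
    return Decimal(re.sub(r"[.,]", "", cleaned))


if __name__ == "__main__":
    for raw in ["1,005.00 USD", "1.000,55 EUR", "1'000.00 SGD"]:
        print(raw, "->", parse_price(raw))
```

Input with no digits raises decimal.InvalidOperation, which is deliberate in this sketch: an unparseable price should surface as an error rather than be stored as zero.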

Two Years, Two Use Cases, One Relationship

This has been a 2+ year customer relationship. It started with 48 sites and 100 brands for assortment tracking. Now two parallel data programs run continuously:

Program	Scope	Frequency
Assortment Intelligence	150 sites × 200 brands × 10+ categories	Weekly
D2C Pricing Intelligence	300 brands × 15 countries	Every two weeks

Both data streams feed directly into their data warehouse, in tables accessible to analysts, commercial teams, and catalog managers. No manual transformation. No CSV cleanup.

The combined impact:

Assortment completion: 50% → 90-98%
Commercial team time saved: 120+ hours/week
Pricing algorithms had fresher D2C inputs across regions (less reliance on assumptions)
Parity decisions were made with D2C visibility across 15 countries
Brand pricing strategies visible across all tracked markets

Who This Helps

This story resonates with fashion and luxury teams facing similar challenges:

Marketplaces that work with third-party sellers and need assortment visibility
Brands or retailers tracking D2C competitors across multiple regions
Commercial teams negotiating without reliable data
Pricing teams building algorithms that need accurate market inputs

See What This Looks Like for Your Catalog

We'll scrape your actual products from your actual competitors or partners. You'll see real data for an agreed proof scope within 48 hours.

Request a Sample Delivery

No commitment. No setup on your end.

50% → 90–98% Assortment Completion Within ~90 Days