Normalising Product Data at Scale: Developing a Bespoke AI-Powered App to Solve Our Multi-Manufacturer Challenge

This post is part of our real-world AI use cases: non-gimmicky, time-saving business applications that help reduce the “boring work”. There’s a lot of bluster around AI, so these posts aim to offer practical insights.

Managing product data from 70+ manufacturers is a challenge. Each formats its specifications differently. For example, Hisense might call it “washing capacity” while Bosch uses “wash load.” Measurements vary between millimetres and centimetres. Ambiguous terms like “large,” “compact,” or “standard” traditionally require human review to determine actual values. Even with sophisticated PIM systems, hardcoded mappings for every manufacturer variation become unmanageable at scale, especially when working with smaller suppliers who don’t have sophisticated data management setups.
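To make the scaling problem concrete, here is a minimal sketch of what the hardcoded approach looks like. The alias table, the `wash_capacity_kg` field name, and both helper functions are hypothetical illustrations, not taken from any real PIM system.

```python
from typing import Optional

# Hypothetical hardcoded alias table: every manufacturer-specific label
# has to be listed by hand against a standard field name.
FIELD_ALIASES = {
    "washing capacity": "wash_capacity_kg",  # e.g. Hisense wording
    "wash load": "wash_capacity_kg",         # e.g. Bosch wording
}


def normalise_label(raw_label: str) -> Optional[str]:
    """Return the standard field name, or None if the label is unknown."""
    return FIELD_ALIASES.get(raw_label.strip().lower())


def to_cm(value: float, unit: str) -> float:
    """Convert a length given in mm or cm to centimetres (1 decimal place)."""
    factors = {"mm": 0.1, "cm": 1.0}
    return round(value * factors[unit.lower()], 1)
```

Any label missing from the table comes back as `None` and falls to a human reviewer, and the table needs a new entry for every new wording from every supplier, which is where this approach stops scaling.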

The Problem with Manual Data Management

No matter your data warehouse or PIM system, the normalisation bottleneck remains. For a small team, manually standardising thousands of products isn’t feasible. The output can only be as good as the input, and in many cases we do not receive well-organised product information. As of 23rd May, Appliance World has 150,000 pieces of additional data on active products (thousands are inactive or discontinued). That figure doesn’t include descriptions, titles, weights, key features, or manuals.

Our Technical Solution

In November 2024, we built an AI-powered data parser using Claude 3.5 Sonnet that automatically normalises product specifications to applianceworldonline.com’s Shopify standards.

Here’s how it works:

  • Dual Input: Any piece of product data can be pasted into the window or scraped from a URL. Either way, the LLM’s natural language processing parses it into the raw data we need.
  • Dynamic Schema Generation: The system pulls product specifications from our Supabase database and generates validation schemas on the fly. Each product type (washing machines, ovens, fridges) has specific field mappings and accepted values. These are mapped 1:1 with our store.
  • Intelligent Data Mapping: Claude analyses raw manufacturer data and maps it to our standardised fields. It converts measurements (mm to cm), standardises terminology, and formats everything consistently. It also creates useful bullet-pointed descriptions, formats image URLs, and keeps titles consistent. For example, the template “Brand Model – Colour Product Type – Key Specs – Energy Rating” would output “Bosch WAN28259GB – White Freestanding 9KG Washing Machine – 1400 RPM – A energy”.
  • Multi-Input Support: Whether data is entered manually or scraped directly from a product URL, the tool processes everything through the same normalisation pipeline.
  • Quality Control: The system generates unique product descriptions, ensures proper HTML formatting, and validates all technical specifications before creating Shopify products with metafields.
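
The schema, validation, and title steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than our production code: the `FieldSpec` structure, the sample washing machine schema, the A to G energy scale, and the helper names are all hypothetical. In the real tool the schema rows come from Supabase and the candidate values come from Claude's output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    """One hypothetical schema row for a product type."""
    name: str
    accepted: Optional[frozenset] = None  # None means any non-empty value
    required: bool = True


# Assumed example schema; a real one would be generated from database rows.
WASHING_MACHINE_SCHEMA = [
    FieldSpec("brand"),
    FieldSpec("model"),
    FieldSpec("colour"),
    FieldSpec("installation", accepted=frozenset({"Freestanding", "Integrated"})),
    FieldSpec("capacity_kg"),
    FieldSpec("spin_speed_rpm"),
    FieldSpec("energy_rating", accepted=frozenset("ABCDEFG")),
]


def validate(record: dict, schema: list) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for spec in schema:
        value = record.get(spec.name)
        if value in (None, ""):
            if spec.required:
                problems.append(f"missing field: {spec.name}")
            continue
        if spec.accepted is not None and value not in spec.accepted:
            problems.append(f"unexpected value for {spec.name}: {value!r}")
    return problems


def build_title(r: dict) -> str:
    """Brand Model – Colour Product Type – Key Specs – Energy Rating."""
    return (
        f"{r['brand']} {r['model']} – {r['colour']} {r['installation']} "
        f"{r['capacity_kg']}KG Washing Machine – "
        f"{r['spin_speed_rpm']} RPM – {r['energy_rating']} energy"
    )


# Example record, as it might look after the model has mapped the raw data.
example = {
    "brand": "Bosch", "model": "WAN28259GB", "colour": "White",
    "installation": "Freestanding", "capacity_kg": 9,
    "spin_speed_rpm": 1400, "energy_rating": "A",
}
```

Calling `validate(example, WASHING_MACHINE_SCHEMA)` returns an empty list, and `build_title(example)` reproduces the title format shown in the mapping step. Validating in plain code after the model responds means a malformed or out-of-range value is caught before anything is written to the store.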

Real Impact

Since November, we’ve added 2,500+ products using this system. Each entry includes standardised specifications, quality bullet-pointed descriptions, and proper image formatting. While legacy products still need updating, this represents significant progress for a team of our size. The system handles the complexity automatically: no more manual mapping of “wash load” vs “washing capacity” or converting between measurement units. It’s a practical solution to a common e-commerce problem, and one that scales with our growing manufacturer network.