Help Center

Find answers to your questions about Feedmaster

Product Deduplication Strategies

Deduplication keeps product feeds clean and efficient. This guide explains how deduplication works, walks through common strategies, and shows how to troubleshoot unexpected results.

Why Deduplication Matters

  • Marketplace Compliance: Many platforms reject feeds with duplicate products
  • Better User Experience: Customers see each product only once
  • Cost Efficiency: Reduce advertising spend on duplicate listings
  • Inventory Accuracy: Prevent overselling due to duplicate entries

How Deduplication Works

  1. Identify Duplicates: Products are grouped by the "Match Field" you specify.
  2. Compare Priority: Within each group, products are ranked by the "Priority Field".
  3. Keep Best Match: Only the product with the best priority value is retained.
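The three steps above can be sketched in a few lines of Python. This illustrates the general technique and is not Feedmaster's actual implementation; the function name `deduplicate`, its parameters, and the sample offers are assumptions made up for the example. Ties are resolved here by keeping the first product seen, which may differ from the product's behavior.

```python
from collections import defaultdict

def deduplicate(products, match_field, priority_field, direction="lowest"):
    """Keep one product per match-field value (hypothetical helper)."""
    # Step 1: group products by the match field.
    groups = defaultdict(list)
    for product in products:
        groups[product[match_field]].append(product)

    # Steps 2 and 3: rank each group by the priority field and keep the best.
    pick = min if direction == "lowest" else max
    return [pick(group, key=lambda p: p[priority_field]) for group in groups.values()]

# Made-up sample data: two sellers offer the same GTIN.
offers = [
    {"gtin": "0001", "seller": "A", "price": 19.99},
    {"gtin": "0001", "seller": "B", "price": 17.49},
    {"gtin": "0002", "seller": "A", "price": 5.00},
]
kept = deduplicate(offers, "gtin", "price", direction="lowest")
# kept holds seller B's offer for GTIN 0001 and seller A's offer for GTIN 0002
```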

Common Deduplication Strategies

Strategy 1: Price-Based Deduplication

Scenario: Multiple sellers offer the same product

  • Match Field: gtin (or mpn)
  • Priority Field: price
  • Priority Direction: lowest

Result: Keep only the cheapest offer for each unique product

Strategy 2: Stock-Based Deduplication

Scenario: Same product in multiple warehouses

  • Match Field: sku
  • Priority Field: quantity
  • Priority Direction: highest

Result: Keep only the listing from the warehouse with the most stock

Strategy 3: Quality-Based Deduplication

Scenario: Products with varying data quality

  • Match Field: title
  • Priority Field: description_length
  • Priority Direction: highest

Result: Keep the product with the longest (and usually most detailed) description
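If a feed has no ready-made description_length field, it would need to be derived before it can act as a priority field. The sketch below assumes the source text lives in a field called `description` and that length means character count; both are assumptions for illustration, as is the sample data.

```python
products = [
    {"title": "Oak Desk", "description": "Desk."},
    {"title": "Oak Desk", "description": "Solid oak desk, 140 x 70 cm, with cable tray."},
    {"title": "Pine Shelf", "description": "Five-tier pine shelf."},
]

# Derive the priority field (assumed here to be the character count).
for product in products:
    product["description_length"] = len(product["description"])

# Keep the product with the longest description for each title.
best = {}
for product in products:
    current = best.get(product["title"])
    if current is None or product["description_length"] > current["description_length"]:
        best[product["title"]] = product

kept = list(best.values())
# kept holds the detailed "Oak Desk" entry and the single "Pine Shelf" entry
```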

Strategy 4: Variant Consolidation

Scenario: Show only one variant per product group

  • Match Field: parent_id
  • Priority Field: is_default
  • Priority Direction: highest

Result: Display only the default variant

Advanced Deduplication Techniques

Multi-Stage Deduplication

Apply multiple deduplication rules in sequence for complex scenarios:

  1. Stage 1: Remove exact SKU duplicates (keep highest stock)
  2. Stage 2: Remove GTIN duplicates (keep lowest price)
  3. Stage 3: Remove title duplicates (keep best rated)
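The staged approach can be pictured as running the same keep-the-best step several times, each pass working on the output of the previous one. This is a minimal sketch under assumptions: the `dedupe` helper, the `STAGES` list, the `rating` field name, and the sample feed are all hypothetical.

```python
def dedupe(products, match_field, priority_field, keep):
    """Keep one product per match value; `keep` is "lowest" or "highest"."""
    best = {}
    for p in products:
        key = p[match_field]
        current = best.get(key)
        better = (
            current is None
            or (keep == "lowest" and p[priority_field] < current[priority_field])
            or (keep == "highest" and p[priority_field] > current[priority_field])
        )
        if better:
            best[key] = p
    return list(best.values())

# Each stage: (match field, priority field, which value to keep).
STAGES = [
    ("sku", "quantity", "highest"),
    ("gtin", "price", "lowest"),
    ("title", "rating", "highest"),
]

feed = [
    {"sku": "A1", "gtin": "111", "title": "Blue Mug", "quantity": 3, "price": 9.0, "rating": 4.1},
    {"sku": "A1", "gtin": "111", "title": "Blue Mug", "quantity": 8, "price": 9.5, "rating": 4.1},
    {"sku": "B7", "gtin": "111", "title": "Blue Mug 350ml", "quantity": 2, "price": 8.0, "rating": 4.6},
    {"sku": "C3", "gtin": "222", "title": "Blue Mug 350ml", "quantity": 5, "price": 7.0, "rating": 3.9},
]

for match_field, priority_field, keep in STAGES:
    feed = dedupe(feed, match_field, priority_field, keep)
# Only the B7 product survives all three stages.
```

Stage order matters: a product removed in an early stage cannot win a later one, so running the same stages in a different order can leave a different product.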

Conditional Deduplication

Combine deduplication with conditional rules to apply different strategies to different products:

    IF Category = "Electronics" AND Brand = "Samsung"
    THEN Deduplicate by model_number keeping lowest price
    ELSE Deduplicate by title keeping highest margin
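One way to read the rule above is: split the feed by the condition, then deduplicate each part with its own strategy. The sketch below follows that reading; the helper names, the lowercase field names, and the sample products are assumptions, not Feedmaster's rule syntax.

```python
def keep_best(products, match_field, priority_field, keep):
    """Keep the lowest- or highest-priority product per match value."""
    pick = min if keep == "lowest" else max
    groups = {}
    for p in products:
        groups.setdefault(p[match_field], []).append(p)
    return [pick(g, key=lambda p: p[priority_field]) for g in groups.values()]

def is_samsung_electronics(p):
    # IF Category = "Electronics" AND Brand = "Samsung"
    return p["category"] == "Electronics" and p["brand"] == "Samsung"

def conditional_dedupe(products):
    matched = [p for p in products if is_samsung_electronics(p)]
    other = [p for p in products if not is_samsung_electronics(p)]
    # THEN deduplicate by model_number keeping lowest price,
    # ELSE deduplicate by title keeping highest margin.
    return (keep_best(matched, "model_number", "price", "lowest")
            + keep_best(other, "title", "margin", "highest"))

# Made-up sample data.
feed = [
    {"category": "Electronics", "brand": "Samsung", "model_number": "TV-55X",
     "title": "55in TV", "price": 699.0, "margin": 0.10},
    {"category": "Electronics", "brand": "Samsung", "model_number": "TV-55X",
     "title": "55in TV (black)", "price": 649.0, "margin": 0.08},
    {"category": "Home", "brand": "Acme", "model_number": "K1",
     "title": "Kettle", "price": 25.0, "margin": 0.30},
    {"category": "Home", "brand": "Acme", "model_number": "K2",
     "title": "Kettle", "price": 27.0, "margin": 0.35},
]
result = conditional_dedupe(feed)
# result holds the 649.0 TV offer and the K2 kettle
```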

Important Considerations

Things to Watch Out For

  • Case Sensitivity: Match field values are compared case-insensitively, so "ABC-1" and "abc-1" are treated as the same product
  • Empty Values: Products with empty match fields are skipped
  • Processing Order: Deduplication happens after all other rules
  • Performance: Large feeds may take longer with complex deduplication
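The first two points can be illustrated with a small sketch of how a grouping key might be built. It assumes that "skipped" means a product with an empty match field is left out of deduplication and stays in the feed, and it also trims surrounding whitespace; neither detail is confirmed here, and the function names and sample data are hypothetical.

```python
def match_key(product, match_field):
    """Return a case-insensitive grouping key, or None if the field is empty."""
    value = product.get(match_field)
    if value is None or not str(value).strip():
        return None
    return str(value).strip().casefold()

def dedupe_lowest(products, match_field, priority_field):
    best, passthrough = {}, []
    for p in products:
        key = match_key(p, match_field)
        if key is None:
            passthrough.append(p)  # assumption: skipped products stay in the feed
        elif key not in best or p[priority_field] < best[key][priority_field]:
            best[key] = p
    return list(best.values()) + passthrough

feed = [
    {"mpn": "ABC-1", "price": 12.0},
    {"mpn": "abc-1", "price": 10.0},  # same product, different casing
    {"mpn": "", "price": 4.0},        # empty match field
]
kept = dedupe_lowest(feed, "mpn", "price")
# kept holds the 10.0 offer for "abc-1" plus the untouched empty-mpn product
```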

Measuring Success

Track these metrics to ensure effective deduplication:

  • Reduction in total products (typically 10-30%)
  • Improved feed acceptance rates
  • Higher click-through rates (less customer confusion)
  • Better conversion rates (showing best options)
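The first metric is simple arithmetic on the product counts before and after deduplication. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def reduction_percent(count_before, count_after):
    """Percentage of products removed by deduplication."""
    if count_before == 0:
        return 0.0
    return 100.0 * (count_before - count_after) / count_before

pct = reduction_percent(12_500, 10_000)  # 20.0
```

A result far outside the typical range is a cue to review the match field, as described under Troubleshooting.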

Troubleshooting

Too many products being removed?

Check if your match field is too broad. For example, matching by "category" groups every product in a category together and keeps only one of them, removing many unique products.
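A quick way to spot a match field that is too broad is to count how many products share each value in an export of your feed. The sketch below assumes the feed is available as a list of dictionaries; the function name and sample data are hypothetical.

```python
from collections import Counter

def largest_groups(products, match_field, top=3):
    """Return the most common match-field values with their product counts."""
    counts = Counter(p[match_field] for p in products if p.get(match_field))
    return counts.most_common(top)

feed = [
    {"sku": "A1", "category": "Mugs"},
    {"sku": "A2", "category": "Mugs"},
    {"sku": "A3", "category": "Mugs"},
    {"sku": "B1", "category": "Plates"},
]
by_category = largest_groups(feed, "category")  # [("Mugs", 3), ("Plates", 1)]
by_sku = largest_groups(feed, "sku")            # every SKU appears once
```

Large groups for a field such as category mean most of those products would be removed; a field such as sku, where each value appears once, would remove nothing.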

Wrong product being kept?

Verify your priority field contains the expected values and the sort direction is correct.

Deduplication not working?

Ensure the match field exists and has values. Check the "Excluded Products" tab for details.