Enrich a Magento Catalog Without Breaking Its Indexer
Many Magento users face the same problem. You have thousands of products with missing attributes, thin descriptions, or no translations.
Finding data sources is easy. The hard part is putting that data into your catalog without crashing your store.
The common mistake is using a simple loop to save products one by one.
If you use the product repository save method in a loop, you trigger a full lifecycle for every single item. You run validation, observers, and reindexing triggers thousands of times. This makes scripts run for hours and slows your admin panel to a crawl.
The save path is built for humans editing one product. It is the wrong tool for bulk updates.
Follow these steps to update your catalog safely:
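Use mass attribute updates. Instead of saving the whole product model, use Magento\Catalog\Model\Product\Action. Its updateAttributes method writes the attribute values you pass straight to the database tables, without loading and saving each full product. One call applies the same values to every product ID in it, so group products that share a value and send them in batches of 1,000 to 2,000 IDs at a time.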
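A minimal sketch of the batching step, assuming $action is an injected Magento\Catalog\Model\Product\Action. The function name updateInBatches is hypothetical, and the parameter is typed as object only so the sketch can also run against a stub outside Magento.

```php
<?php
declare(strict_types=1);

/**
 * Apply the same attribute values to many products in fixed-size batches.
 *
 * $action is assumed to be a Magento\Catalog\Model\Product\Action instance
 * (or any object exposing updateAttributes($ids, $attrData, $storeId)).
 *
 * @param int[] $productIds
 * @param array<string, mixed> $attrData attribute code => value
 * @return int number of batches written
 */
function updateInBatches(
    object $action,
    array $productIds,
    array $attrData,
    int $storeId,
    int $batchSize = 1000
): int {
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $batches = 0;
    foreach (array_chunk($productIds, $batchSize) as $batch) {
        // One call writes $attrData for every product ID in this batch.
        $action->updateAttributes($batch, $attrData, $storeId);
        $batches++;
    }

    return $batches;
}
```

Because every ID in a call receives the same values, this fits attributes shared by a group of products. Text that is unique per product would still need one call per product, which still avoids the full save path.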
Change your indexer settings. Set your indexers to Update by Schedule before you start. If you use Update on Save, every write triggers a synchronous reindex. On a schedule, writes go to a changelog and the cron job handles the work.
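The usual way to switch modes is the CLI command bin/magento indexer:set-mode schedule. If the switch has to happen inside an import script, it could look roughly like the sketch below. It assumes each indexer object behaves like Magento\Framework\Indexer\IndexerInterface (getId, isScheduled, setScheduled); the function name switchToSchedule is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Switch each indexer to "Update by Schedule" and return the previous modes
 * so they can be restored once the import is done.
 *
 * Each element is assumed to expose getId(), isScheduled() and
 * setScheduled(bool), as Magento\Framework\Indexer\IndexerInterface does.
 *
 * @param iterable<object> $indexers
 * @return array<string, bool> indexer ID => whether it was already scheduled
 */
function switchToSchedule(iterable $indexers): array
{
    $previous = [];
    foreach ($indexers as $indexer) {
        $wasScheduled = (bool) $indexer->isScheduled();
        $previous[$indexer->getId()] = $wasScheduled;

        if (!$wasScheduled) {
            // Writes now land in the changelog instead of reindexing inline.
            $indexer->setScheduled(true);
        }
    }

    return $previous;
}
```

Returning the previous modes is a design choice in this sketch: it lets the script put any indexer that was on Update on Save back the way it found it.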
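Manage translations correctly. A translation is an attribute value for a specific store view. Pass that store view's ID as the store ID argument of the updateAttributes method. Store ID 0 is the global default scope, so writing a translation there overwrites the default value that other store views fall back to. Do not overwrite your global default values when adding local languages.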
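One way to make the scope rule hard to break is a small guard around the write. This is a sketch under the same assumption as before: $action stands in for Magento\Catalog\Model\Product\Action, and writeTranslation is a hypothetical helper. Store ID 0 is Magento's default scope.

```php
<?php
declare(strict_types=1);

// Store ID 0 is the global default scope in Magento.
const DEFAULT_STORE_ID = 0;

/**
 * Write translated attribute values for one store view only.
 *
 * @param int[] $productIds
 * @param array<string, mixed> $attrData attribute code => translated value
 */
function writeTranslation(
    object $action,
    array $productIds,
    array $attrData,
    int $storeId
): void {
    if ($storeId === DEFAULT_STORE_ID) {
        // A translation written here would replace the global default value.
        throw new InvalidArgumentException(
            'Refusing to write a translation to the global default scope.'
        );
    }

    $action->updateAttributes($productIds, $attrData, $storeId);
}
```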
Handle AI content with care. LLMs write great copy but often hallucinate facts. They might say a shirt is cotton when it is polyester.
- Write AI content into a staging field or a disabled scope first.
- Review a small sample before you go live.
- Keep technical specs like dimensions and materials sourced from verified data.
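The staging idea can be sketched as a small mapping step that runs before any write. Everything named here is hypothetical: the ai_draft_ prefix, the list of protected attribute codes, and the buildStagedData function. The staging attributes themselves would have to exist in the catalog first.

```php
<?php
declare(strict_types=1);

/**
 * Map AI-generated fields to staging attributes and drop protected specs.
 *
 * @param array<string, mixed> $aiFields attribute code => generated value
 * @param string[] $protected codes that must come from verified data
 * @return array<string, mixed> staging attribute code => generated value
 */
function buildStagedData(
    array $aiFields,
    array $protected = ['material', 'dimensions'],
    string $prefix = 'ai_draft_'
): array {
    $staged = [];
    foreach ($aiFields as $code => $value) {
        if (in_array($code, $protected, true)) {
            // Technical specs are never taken from generated copy.
            continue;
        }
        $staged[$prefix . $code] = $value;
    }

    return $staged;
}
```

The result can be passed as the attribute data to updateAttributes, so generated copy lands in the draft attributes and the live description stays untouched until a reviewer promotes it.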
Summary for bulk enrichment:
- Set indexers to scheduled mode.
- Use a staging field for new data.
- Apply updates in batches of 1,000 to 2,000 IDs.
- Avoid the full product save path.
- Reindex the changes, or let the cron job process the changelog.
- Test a sample of your product pages.
Data sources are the easy part. Managing a live catalog requires a different approach.
Source: https://dev.to/iamrobindhiman/enriching-a-large-magento-catalog-without-melting-the-indexer-3mk9
