This article is the final part of a four-part series exploring how to extend Coveo’s powerful search platform with vector embeddings from Amazon Bedrock to enable advanced visual search experiences for ecommerce.
This blog series describes customer-managed implementation patterns for teams that need to experiment now. These are not the default or recommended Coveo architectures. Where Coveo offers or is introducing native capabilities, those should be evaluated first. The examples here are intended to explain design tradeoffs, orchestration patterns, and governance considerations, not to imply product commitments or turnkey support.
The Metadata Challenge
Product metadata is the foundation of effective search. Rich, accurate metadata enables:
- Better relevance: Users find what they’re looking for
- Faceted navigation: Filter by color, material, style
- Personalization: Recommend based on attributes
- SEO optimization: Structured data for search engines
But manual tagging is expensive, slow, and inconsistent. With global ecommerce projected to reach $6.8 trillion by 2025, the scale of product catalogs makes manual enrichment impractical.
Why Amazon Bedrock Nova?
Amazon Bedrock Nova Lite is a multimodal model that understands both images and text. For metadata extraction:
| Capability | Benefit |
| Multimodal understanding | Analyzes image content directly |
| Structured output | Returns clean JSON |
| Fast inference | ~1-2 seconds per image |
| Cost-effective | ~$0.0001 per image |
Architecture
Our AI metadata pipeline is designed for batch processing. It downloads images, extracts metadata using Amazon Bedrock Nova, and then indexes the enriched products into Coveo.

Pipeline Architecture

The pipeline processes images in batch, extracting metadata and preparing data for Coveo indexing.
Implementation

Complete code is in scraper/ in our GitHub repository.
The Extraction Prompt
Prompt engineering is critical for consistent output:
Python

Key principles:
- Constrained values: “MUST be one of” ensures consistent facets
- Examples: Guide the model toward expected format
- Clear instructions: “Return ONLY valid JSON” prevents extra text
Calling Bedrock Nova Lite
Python

Low temperature (0.1) ensures consistent, deterministic output.
Processing Pipeline
Python

Indexing to Coveo
Field Configuration
For the purpose of this example, we use simple custom field names (category, color, material). In a production e-commerce implementation, it is best practice to map these extracted values to the standardized Coveo Commerce Fields (e.g., ec_category, ec_color) either at the indexing stage or using field mapping rules in the Coveo Administration Console. This ensures full compatibility with features like Coveo ML’s Automatic Relevance Tuning (ART).
Before indexing, configure Coveo fields for faceting:
Python

Push API Integration
Python

Integration with Coveo ML
AI-extracted metadata is foundational for Coveo’s machine learning features to perform optimally:
- ART (Automatic Relevance Tuning):ART learns from user behavior (clicks, purchases) linked to specific queries and facets. Consistent, high quality metadata ensures that the underlying taxonomy and filtering options are reliable. This consistency is what allows ART to accurately identify user intent and optimize relevance across the catalog.
- Query Suggestions: Suggestions are context aware and rely on the availability of robust, standardized facets. The enriched metadata ensures a clean, unified set of values, which directly leads to more accurate and valuable query suggestions.
- Faceted Search: Standardized values enable reliable filtering and navigation, which is the primary driver of high-quality search experiences.
The combination of AI-extracted metadata and Coveo ML creates a virtuous cycle—better metadata leads to better user interactions, which improves ML model accuracy.
Running the Pipeline
Bash

Cost Analysis
For 150 images:
| Service | Cost |
| Bedrock Nova Lite | ~$0.02 |
| S3 Storage | ~$0.001 |
| Total | ~$0.02 |
At scale (10,000 images): ~$1.50
Production Considerations for AI Enrichment
The simplified extraction prompt in this article is for illustrative purposes only. For a production-grade catalog enrichment pipeline, additional steps and governance are essential to ensure data quality and user trust:
* Validation and Confidence Thresholds: Implement a process to check the generated JSON structure and validate that values adhere to your product schema. You may discard or flag metadata where the model’s confidence score is too low.
* Human Override Workflow: For high-impact fields (e.g., product category), an administrative override layer is critical. This ensures that manually curated data takes precedence over any AI-generated value when necessary.
* Schema Constraints: Utilize more advanced prompt engineering techniques to enforce a robust schema. This includes not just constrained values (like ‘MUST be one of…’) but also regular expressions or a more complex Pydantic-style output format.
* Durability and Governance: The design principles in this article—deciding where enrichment is orchestrated, validating generated values before they affect ranking, and keeping a human override path—remain relevant even as native product capabilities evolve.
Series Recap
Over these four articles, we built a complete visual search solution:
- Vector Embeddings: OpenSearch k-NN with Bedrock Titan
- IPE + Lambda: Automatic embedding generation
- Headless UI: React interface with Coveo Headless
- AI Metadata: Automated product enrichment
The result: users can search by text, image, or both—with rich faceted navigation powered by AI-extracted metadata and relevance optimized by Coveo ML.
Resources:
Coveo Commerce Fields: Create Additional Fields
Amazon Bedrock Documentation
Coveo Push API
GitHub Repository
Map Commerce Fields to Custom Fields (Catalog Source)
Coveo Commerce Fields Payload Example

