Periodic re-evaluation and conclusion
Selecting the optimal embedding model is a significant achievement, but it's not a one-time decision. The AI landscape evolves rapidly, and your application requirements change over time, making periodic re-evaluation essential for maintaining peak performance.
Why re-evaluation matters
The embedding model ecosystem is highly dynamic:
Rapid model innovation: New models are released regularly, often offering substantial improvements in performance, efficiency, or capabilities.
Evolving requirements: Your application's data distribution, supported languages, user base, and performance requirements naturally evolve.
Integration learnings: Real-world deployment often reveals performance characteristics that weren't apparent during initial evaluation.
Establishing a re-evaluation framework
Monitor external developments
Benchmark leaderboards: Regularly check resources like MTEB to identify promising new models that significantly outperform your current choice.
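As a concrete starting point, here is a minimal sketch of re-running an MTEB task locally against your current model so published leaderboard scores can be compared on the same footing. It assumes the `mteb` and `sentence-transformers` packages; the model and task names are placeholders, and the exact `mteb` API varies between library versions:

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Stand-in for your current production model.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Pick leaderboard tasks close to your domain, then run them locally
# so new models' published scores have a fair point of comparison.
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(model, output_folder="results/all-MiniLM-L6-v2")
```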
Model releases: Follow announcements from major AI companies and research institutions for breakthrough models in your domain.
Community insights: Engage with relevant communities and forums where practitioners share real-world model performance experiences.
Track internal performance
Application metrics: Monitor your system's key performance indicators (a minimal latency-tracking sketch follows this list):
- Query response times and throughput
- User satisfaction and engagement metrics
- Retrieval accuracy in production
- System resource utilization
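For example, a minimal latency-tracking sketch in Python; `search` is a hypothetical stand-in for your own retrieval call:

```python
import time
from collections import deque

class QueryMetrics:
    """Keeps a rolling window of query latencies for lightweight monitoring."""

    def __init__(self, window: int = 1000):
        self.latencies = deque(maxlen=window)

    def record(self, seconds: float) -> None:
        self.latencies.append(seconds)

    def p95_ms(self) -> float:
        """95th-percentile latency in milliseconds over the window."""
        if not self.latencies:
            return 0.0
        ordered = sorted(self.latencies)
        return ordered[int(0.95 * (len(ordered) - 1))] * 1000.0

def search(query: str) -> list:
    return []  # placeholder for your actual retrieval call

metrics = QueryMetrics()

def timed_search(query: str) -> list:
    start = time.perf_counter()
    results = search(query)
    metrics.record(time.perf_counter() - start)
    return results
```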
Data drift detection: Watch for changes in the following; a simple drift check is sketched after the list:
- Query patterns and complexity
- Document types and sources
- Language distribution
- Domain-specific terminology evolution
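One lightweight drift check is to compare the centroid of recent query embeddings against a reference sample from when the model was selected. A minimal sketch, assuming `sentence-transformers` (the model name, example queries, and threshold are placeholders to calibrate on your own traffic):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for your current model

def centroid(texts: list[str]) -> np.ndarray:
    """Unit-length mean embedding of a batch of texts."""
    emb = model.encode(texts, normalize_embeddings=True)
    c = emb.mean(axis=0)
    return c / np.linalg.norm(c)

# Reference sample from the initial evaluation period vs. a recent sample.
reference = ["how do I reset my password?", "track my order status"]
current = ["configure SSO with Okta", "rotate our API keys"]

# Cosine similarity between the two centroids (both are unit vectors).
similarity = float(centroid(reference) @ centroid(current))
DRIFT_THRESHOLD = 0.90  # assumed; calibrate on your own data
if similarity < DRIFT_THRESHOLD:
    print(f"Query distribution shifted (centroid similarity {similarity:.2f})")
```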
Performance degradation signals: Establish alerts for the following; a minimal alerting sketch follows the list:
- Declining retrieval quality scores
- Increased user feedback about poor results
- Growing latency or resource consumption
- Higher error rates in downstream applications
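A minimal alerting sketch: compare a rolling mean of per-query quality scores against the baseline you recorded at selection time. The window size and tolerance are assumptions to tune for your traffic volume:

```python
from collections import deque

class DegradationAlert:
    """Flags when the rolling mean of a quality score falls below a
    fraction of the baseline established at selection time."""

    def __init__(self, baseline: float, window: int = 200, tolerance: float = 0.95):
        self.baseline = baseline    # e.g., retrieval quality from initial evaluation
        self.tolerance = tolerance  # assumed: alert at 5% degradation
        self.scores = deque(maxlen=window)

    def observe(self, score: float) -> bool:
        """Record a per-query score; returns True when an alert should fire."""
        self.scores.append(score)
        if len(self.scores) < self.scores.maxlen:
            return False  # wait until the window fills
        mean = sum(self.scores) / len(self.scores)
        return mean < self.baseline * self.tolerance
```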
Review requirement evolution
Business changes: Regular assessment of:
- New markets or user segments
- Additional languages or regions
- Changed compliance or privacy requirements
- Budget or infrastructure constraints
Technical evolution: Consider impacts from:
- Scale changes (data volume, query load)
- New application features or use cases
- Infrastructure updates or migrations
- Integration with new systems or models
Re-evaluation triggers
Establish clear criteria that trigger re-evaluation:
Performance thresholds: Define specific metrics that, when crossed, initiate review (a threshold check is sketched after this list):
- NDCG scores dropping below baseline thresholds
- Latency exceeding acceptable limits
- User satisfaction scores declining
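For instance, a threshold check using scikit-learn's `ndcg_score`; the relevance data and baseline value below are illustrative:

```python
import numpy as np
from sklearn.metrics import ndcg_score

# Graded relevance labels for each query's candidate documents, plus the
# scores your retrieval system assigned to those candidates (toy data).
true_relevance = np.asarray([[3, 2, 0, 1], [2, 0, 3, 1]])
system_scores = np.asarray([[0.9, 0.8, 0.1, 0.3], [0.7, 0.2, 0.9, 0.4]])

NDCG_BASELINE = 0.85  # assumed baseline from your initial evaluation
score = ndcg_score(true_relevance, system_scores, k=4)
if score < NDCG_BASELINE:
    print(f"NDCG@4 {score:.3f} crossed baseline {NDCG_BASELINE}: start a review")
```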
Time-based reviews: Schedule regular evaluations:
- Quarterly reviews for rapidly evolving applications
- Annual reviews for stable, mature systems
- Event-driven reviews for major business changes
Model landscape changes: Trigger reviews when:
- New models achieve significantly better benchmark scores
- Models become available that better match your requirements
- Pricing or availability changes for current models
Re-evaluation process
When a trigger fires, apply the same systematic approach you used for the initial selection:
1. Reassess requirements
Update your requirements document to reflect current needs:
- Changed data characteristics
- Evolved performance requirements
- New operational constraints
- Updated business priorities
2. Screen new candidates
Apply your screening heuristics to identify new candidates:
- Recently released models
- Models with improved benchmark performance
- Options that better address current pain points
3. Comparative evaluation
Run focused benchmarks comparing the following; a side-by-side comparison sketch follows the list:
- Current model performance
- New candidate models
- Previous evaluation results for trend analysis
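A minimal side-by-side sketch, assuming `sentence-transformers`; the model names and the tiny dataset are placeholders for your current model, a candidate, and your custom benchmark:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

def recall_at_k(model_name: str, queries, docs, relevant_idx, k: int = 3) -> float:
    """Fraction of queries whose relevant document appears in the top k."""
    model = SentenceTransformer(model_name)
    q = model.encode(queries, normalize_embeddings=True)
    d = model.encode(docs, normalize_embeddings=True)
    hits = 0
    for i, rel in enumerate(relevant_idx):
        top_k = np.argsort(q[i] @ d.T)[::-1][:k]  # rank docs by cosine similarity
        hits += int(rel in top_k)
    return hits / len(queries)

# Tiny illustrative dataset; in practice, reuse your custom benchmark.
queries = ["how do I reset my password?"]
docs = ["Password reset instructions", "Shipping policy", "Refund policy"]
relevant = [0]  # index of the relevant document for each query

for name in ["all-MiniLM-L6-v2", "BAAI/bge-small-en-v1.5"]:  # current vs. candidate
    print(name, recall_at_k(name, queries, docs, relevant))
```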
4. Migration planning
If a new model proves superior:
- Plan transition strategy and timeline
- Estimate migration costs and risks
- Prepare rollback procedures
- Design A/B testing for production validation (a simple traffic-split sketch follows)
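A minimal traffic-split sketch; the model identifiers and split percentage are placeholders:

```python
import hashlib

CANDIDATE_TRAFFIC = 0.10  # assumed: send 10% of users to the candidate model

def model_for_user(user_id: str) -> str:
    """Deterministically buckets a user, so they always see the same model."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "candidate-model" if bucket < CANDIDATE_TRAFFIC * 100 else "current-model"

# Example: route a few users and log the assignment for later analysis.
for uid in ["user-1", "user-2", "user-3"]:
    print(uid, model_for_user(uid))
```

Hashing the user ID rather than sampling randomly keeps each user's experience consistent across sessions and makes assignments reproducible for offline analysis.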
Best practices for ongoing evaluation
Maintain evaluation infrastructure: Keep your custom benchmark framework updated and ready for quick deployment.
Document decisions: Record why models were selected or rejected to avoid repeating evaluations unnecessarily.
Version control: Track model versions, evaluation datasets, and performance metrics over time; a minimal record format is sketched after this list.
Gradual transitions: When switching models, implement careful rollouts with monitoring and rollback capabilities.
Cost-benefit analysis: Balance potential improvements against migration effort and operational disruption.
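As one possible format for that version-controlled history, a minimal sketch (field names and values are illustrative):

```python
import datetime
import json
from dataclasses import asdict, dataclass

@dataclass
class EvaluationRecord:
    """One row in an append-only evaluation log kept under version control."""
    model_name: str
    model_version: str
    dataset_version: str
    ndcg_at_10: float
    p95_latency_ms: float
    evaluated_at: str

record = EvaluationRecord(
    model_name="all-MiniLM-L6-v2",  # illustrative values throughout
    model_version="2.2.0",
    dataset_version="custom-benchmark-v3",
    ndcg_at_10=0.87,
    p95_latency_ms=42.0,
    evaluated_at=datetime.datetime.now(datetime.timezone.utc).isoformat(),
)

# Append to a log that lives alongside your benchmark code.
with open("eval_history.jsonl", "a") as f:
    f.write(json.dumps(asdict(record)) + "\n")
```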
Building a sustainable process
Automation where possible: Automate benchmark running, performance monitoring, and alert generation.
Team responsibility: Assign clear ownership for monitoring model performance and conducting re-evaluations.
Integration with development cycles: Align model evaluation with regular development and deployment cycles.
Knowledge sharing: Document lessons learned and share insights across teams working with embedding models.
The long-term perspective
Treating model selection as an ongoing process rather than a fixed decision provides several advantages:
Continuous optimization: Stay current with the best available technology for your use case.
Risk mitigation: Avoid performance degradation as requirements evolve or models become outdated.
Competitive advantage: Leverage improvements in AI technology faster than competitors who treat model selection as static.
Operational excellence: Build organizational capabilities in model evaluation and management that benefit all AI initiatives.
By establishing systematic re-evaluation processes, you ensure your embedding model choices continue serving your application effectively as both your needs and the available technology evolve.
Course conclusion
You now have a comprehensive framework for embedding model evaluation and selection:
- Systematic requirements analysis across data, performance, operational, and business dimensions
- Efficient candidate screening using proven heuristics
- Thorough evaluation methodology with both standard and custom benchmarks
- Ongoing re-evaluation processes to maintain optimal performance
This systematic approach helps you navigate the complex embedding model landscape confidently, making informed decisions that balance performance, cost, and operational requirements for your specific use case.