Weaviate Academy

Now that you've seen how search types differ, let's build a framework for selecting the right one for your use case.

Selection framework

Ask these questions to guide your choice:

1. What type of queries will be used?

Natural language descriptions → Vector or Hybrid
Controlled/validated input → Keyword or Hybrid
Exact names/titles/IDs → Keyword or Hybrid
Mixed/unpredictable (e.g. user-typed queries) → Hybrid

2. What's your data like?

Single modality (text only) → Any type works
Multi-modal (text + images) → Vector (required)
Multi-lingual → Vector or Hybrid
Domain-specific terminology → Keyword

3. What's your priority?

Must find exact matches → Keyword or Hybrid
Need semantic understanding → Vector or Hybrid
Balance both → Hybrid
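
One way to make the three questions concrete is to treat each answer as a set of acceptable search types and intersect them. The sketch below is only an illustration of that idea: the dictionary keys and the `candidate_search_types` helper are hypothetical names, not part of Weaviate, and the mappings simply restate the lists above.

```python
# Hypothetical helper: every name here is illustrative, not a Weaviate API.
ALL_TYPES = {"vector", "keyword", "hybrid"}

# Question 1: what type of queries will be used?
QUERY_STYLE = {
    "natural_language": {"vector", "hybrid"},
    "controlled": {"keyword", "hybrid"},
    "exact": {"keyword", "hybrid"},
    "mixed": {"hybrid"},
}

# Question 2: what's your data like?
DATA = {
    "single_modality": ALL_TYPES,
    "multi_modal": {"vector"},
    "multi_lingual": {"vector", "hybrid"},
    "domain_specific": {"keyword"},
}

# Question 3: what's your priority?
PRIORITY = {
    "exact_matches": {"keyword", "hybrid"},
    "semantic": {"vector", "hybrid"},
    "balance": {"hybrid"},
}


def candidate_search_types(query_style, data, priority):
    """Return the search types that satisfy all three answers.

    An empty set means the answers conflict and the requirements
    need to be revisited.
    """
    return QUERY_STYLE[query_style] & DATA[data] & PRIORITY[priority]


print(candidate_search_types("mixed", "multi_lingual", "balance"))       # {'hybrid'}
print(candidate_search_types("exact", "domain_specific", "exact_matches"))  # {'keyword'}
```

If more than one type survives the intersection, treat the result as a shortlist to test against your own data rather than a final answer.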

Real-world scenarios

Scenario 1: E-commerce product search

Context:

Users search with natural language ("red running shoes")
Some search by SKU or exact product name
Multi-lingual customer base
User typos common

Recommendation: Hybrid search (alpha=0.6-0.8)

Configuration:

products = client.collections.use("Products")

response = products.query.hybrid(
    query=user_query,
    alpha=0.75,  # 75% vector, 25% keyword
    query_properties=["title^2", "description", "sku^3"]  # Boost exact SKU matches
)

Why:

Handles both natural language AND exact SKU searches
Tolerates typos while rewarding exact matches
Property boosting helps exact SKU and title matches rank high
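
To build intuition for what `alpha` does, here is a deliberately simplified sketch of a weighted blend of two scores. It assumes both scores are already normalized to the 0 to 1 range, and the scores and product names are made up for illustration. Weaviate's actual fusion algorithms differ in their details, so treat `blend_scores` and `rank` as hypothetical teaching helpers rather than the engine's implementation.

```python
def blend_scores(vector_score, keyword_score, alpha):
    """Weight two normalized scores: alpha=1 is pure vector, alpha=0 is pure keyword."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * vector_score + (1.0 - alpha) * keyword_score


# Made-up normalized scores for a query that contains an exact SKU.
candidates = {
    "exact SKU match": {"vector": 0.70, "keyword": 1.00},
    "similar-looking shoe": {"vector": 0.90, "keyword": 0.10},
}


def rank(alpha):
    """Return candidate names ordered from highest to lowest blended score."""
    return sorted(
        candidates,
        key=lambda name: blend_scores(
            candidates[name]["vector"], candidates[name]["keyword"], alpha
        ),
        reverse=True,
    )


print(rank(0.75))  # exact SKU match ranks first
print(rank(1.0))   # pure vector: the similar-looking shoe ranks first
```

With these example numbers, the keyword share at `alpha=0.75` is enough to lift the exact SKU match above a product that is only semantically close, while a pure vector search would rank them the other way round.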

Scenario 2: Legal document search

Context:

Lawyers search for specific case citations
Exact legal terminology critical
Document language is formal and consistent
Predictable query patterns

Recommendation: Keyword search (BM25)

Configuration:

from weaviate.classes.config import Property, DataType

client.collections.create(
    name="LegalDocs",
    properties=[
        Property(
            name="citation",
            data_type=DataType.TEXT,
        ),
        Property(
            name="summary",
            data_type=DataType.TEXT,
        ),
        # Additional settings not shown
    ]
)

documents = client.collections.use("LegalDocs")

response = documents.query.bm25(
    query="Diamond v. Diehr",
    query_properties=["citation^3", "summary"]
)

Why:

Exact matching is critical in legal context
Keyword search is predictable and precise
Typically faster than vector search for exact lookups
Transparency in why results match

Scenario 3: Content recommendation

Context:

Find articles similar to what the user is reading
No explicit user query
Content in multiple languages
Semantic similarity needed

Recommendation: Vector search (near_object or near_text)

Configuration:

articles = client.collections.use("Articles")

response = articles.query.near_object(
    near_object=current_article_uuid,
    limit=10
)

Why:

Object-based similarity (no query text)
Vector search is the only appropriate choice
Works across languages
Captures semantic relationships
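
Conceptually, an object-based similarity search takes the vector already stored for the current article and ranks the other articles by how close their vectors are. The brute-force sketch below shows that idea with cosine similarity on toy three-dimensional vectors. It is an illustration only: the vectors, IDs, and the `most_similar` helper are hypothetical, and a vector database would normally use an approximate nearest-neighbor index instead of comparing against every object.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def most_similar(current_id, vectors, limit=10):
    """Rank every other object by similarity to the current object's vector."""
    current = vectors[current_id]
    scored = [
        (other_id, cosine_similarity(current, vector))
        for other_id, vector in vectors.items()
        if other_id != current_id  # skip the article being read
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy vectors standing in for real article embeddings.
article_vectors = {
    "a1": [1.0, 0.0, 0.0],
    "a2": [0.9, 0.1, 0.0],
    "a3": [0.0, 1.0, 0.0],
}

for article_id, score in most_similar("a1", article_vectors, limit=2):
    print(article_id, round(score, 3))
```

Note that the sketch filters out the source article itself. In a real query you may need to handle that step separately, for example by dropping the current article from the results before showing recommendations.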

Quick reference

Use Case	Search Type	Key Reason
Multi-modal search	Vector	Only option for cross-modal
Domain-specific docs (legal, medical)	Keyword	Precision critical
E-commerce / general search	Hybrid	Mixed query types
Content recommendations	Vector	Similarity-based
Multi-lingual content	Vector/Hybrid	Cross-lingual capability
Controlled form inputs	Keyword	Predictable queries
User-facing search (typos likely)	Hybrid	Typo tolerance + flexibility

Practice scenarios

Consider these use cases and decide which search type to use:

Customer support FAQ - Users search for help articles
Academic paper search - Researchers search by topic or citation
Code repository search - Developers search for functions and files
Social media posts - Users search their own posts
Music recommendation - Find songs similar to current track

What's next?

You now have a comprehensive understanding of search strategies in Weaviate.

Keep in mind, though, that there is no universal "best" configuration. The right search strategy depends on:

Your data characteristics
Your users' query patterns
Your quality vs. performance requirements

Use this course as a foundation, then experiment and iterate with your specific use case.

Ready to build?

Try implementing these strategies in a small test project first. Start with the basics (search type selection, default tokenization), get it working, then optimize based on actual results. Good luck! 🚀
