Hybrid search & filtering

Keyword search optimization

You can tune keyword search in two main ways: by choosing and weighting the properties that are searched, and by adjusting the BM25 parameters.

Property selection & boosting

You can choose which properties a keyword search covers, and boost some of them so that a match in a boosted property contributes more to the relevance score than a match elsewhere.

Example: E-commerce product search

In a product search, you might boost the name and category of a product so that matches in those properties count for more than matches in the description.

products = client.collections.use("Products")

# With boosting: name counts 4x, category 2x, description is the baseline
response = products.query.bm25(
    query=user_query,
    limit=10,
    query_properties=["name^4", "description", "category^2"]
)

Syntax: property_name^boost_factor

  • name^4 - Matches in name count 4x more
  • category^2 - Matches in category count 2x more
  • description - Baseline (equivalent to description^1)
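
To make the boost syntax concrete, here is a minimal sketch of how boost factors could weight per-property scores. It is an illustration only, not Weaviate's internal implementation: the helper names `parse_boosted_property` and `weighted_keyword_score`, and the per-property scores, are hypothetical.

```python
def parse_boosted_property(spec: str) -> tuple[str, float]:
    """Split 'name^4' into ('name', 4.0); a bare 'description' gets weight 1.0."""
    name, sep, boost = spec.partition("^")
    return name, (float(boost) if sep else 1.0)


def weighted_keyword_score(per_property_scores: dict[str, float],
                           query_properties: list[str]) -> float:
    """Sum the per-property scores, each multiplied by its boost factor."""
    total = 0.0
    for spec in query_properties:
        name, weight = parse_boosted_property(spec)
        total += weight * per_property_scores.get(name, 0.0)
    return total


query_properties = ["name^4", "description", "category^2"]

# Hypothetical scores: one product matches in its name, the other in its description
name_match = {"name": 1.0, "description": 0.0, "category": 0.0}
description_match = {"name": 0.0, "description": 1.0, "category": 0.0}

name_score = weighted_keyword_score(name_match, query_properties)
description_score = weighted_keyword_score(description_match, query_properties)
```

With equal raw scores, the product that matches in `name` ends up with four times the weighted score of the product that matches only in `description`.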

BM25 parameter tuning

Keyword search is done in Weaviate using the BM25 algorithm. You can tune the BM25 parameters k1 and b to improve search quality.

BM25 parameters:

  • k1 (default: 1.2, typical range: 0-2): Controls term frequency saturation. Higher values let repeated terms keep increasing the score; lower values make the score level off sooner, and k1=0 ignores how often a term occurs.
  • b (default: 0.75, range: 0-1): Controls document length normalization. Higher values penalize longer documents more; b=0 disables length normalization.
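
The effect of both parameters is easier to see in the term-frequency part of the standard BM25 formula. The sketch below follows the textbook formulation and leaves out the inverse document frequency factor; Weaviate's exact scoring may differ in detail, and the function name `bm25_term_weight` is ours.

```python
def bm25_term_weight(tf: float, doc_len: float, avg_doc_len: float,
                     k1: float = 1.2, b: float = 0.75) -> float:
    """Term-frequency component of textbook BM25 (IDF factor omitted)."""
    # b scales how strongly the document length (relative to the average) matters
    length_norm = 1 - b + b * (doc_len / avg_doc_len)
    # k1 sets how quickly repeated occurrences of a term stop adding to the score
    return tf * (k1 + 1) / (tf + k1 * length_norm)


# A term occurring 5 times in an average-length document
low_k1 = bm25_term_weight(tf=5, doc_len=100, avg_doc_len=100, k1=1.2)
high_k1 = bm25_term_weight(tf=5, doc_len=100, avg_doc_len=100, k1=2.0)

# The same term count in a long and a short document
long_doc = bm25_term_weight(tf=2, doc_len=200, avg_doc_len=100, b=0.75)
short_doc = bm25_term_weight(tf=2, doc_len=50, avg_doc_len=100, b=0.75)

# With b=0 the document length drops out of the calculation
long_doc_b0 = bm25_term_weight(tf=2, doc_len=200, avg_doc_len=100, b=0.0)
short_doc_b0 = bm25_term_weight(tf=2, doc_len=50, avg_doc_len=100, b=0.0)
```

Here `high_k1` is larger than `low_k1` because repetition is rewarded for longer, `short_doc` outscores `long_doc` because of length normalization, and the two `b=0` values are identical.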

When to tune:

  • Increase k1 when term repetition indicates relevance (e.g., product catalogs)
  • Decrease b when document length shouldn't matter (e.g., tweets vs articles)

from weaviate.classes.config import Configure, DataType, Property

client.collections.create(
    name="Articles",
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
    ],
    # Set BM25 parameters
    inverted_index_config=Configure.inverted_index(
        bm25_k1=1.4,
        bm25_b=0.8,
    )
)

Hybrid search optimization

Hybrid search combines vector and keyword search. Let's explore how to tune this combination for optimal results.

The alpha parameter

The alpha parameter sets how much weight vector search and keyword search each receive in a hybrid query, and therefore which results rank highest.

Scale:

  • alpha=1.0 - Pure vector search (100% semantic)
  • alpha=0.75 - Mostly vector, some keyword
  • alpha=0.5 - Balanced (default)
  • alpha=0.25 - Mostly keyword, some vector
  • alpha=0.0 - Pure keyword search (100% BM25)

Alpha can be provided at query time.

movies = client.collections.use("Movies")

# Balanced hybrid search (default)
response = movies.query.hybrid(
    query="action adventure",
    alpha=0.5,  # Balanced
    limit=10
)

# Semantic-heavy hybrid search
response = movies.query.hybrid(
    query="movies about overcoming challenges",
    alpha=0.75,  # Favor semantic understanding
    limit=10
)

# Keyword-heavy hybrid search
response = movies.query.hybrid(
    query="The Matrix",  # Exact title
    alpha=0.25,  # Favor exact matching
    limit=10
)

Choose the alpha that balances the importance of semantic and keyword search for your use case.
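
To see why alpha changes the ordering, consider a simplified weighted blend of two scores that are already on the same 0-1 scale. This is a sketch for intuition rather than Weaviate's exact computation; the `blend` and `rank` helpers and the candidate scores are hypothetical.

```python
def blend(vector_score: float, keyword_score: float, alpha: float) -> float:
    """Weight the vector score by alpha and the keyword score by (1 - alpha)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * vector_score + (1 - alpha) * keyword_score


# Hypothetical, already-normalized scores for two candidate results
candidates = {
    "exact_title_match": {"vector": 0.4, "keyword": 1.0},
    "thematically_similar": {"vector": 0.9, "keyword": 0.2},
}


def rank(alpha: float) -> list[str]:
    """Order the candidates by blended score, best first."""
    return sorted(
        candidates,
        key=lambda name: blend(candidates[name]["vector"],
                               candidates[name]["keyword"], alpha),
        reverse=True,
    )


keyword_heavy = rank(0.25)
semantic_heavy = rank(0.75)
```

At alpha=0.25 the exact title match ranks first; at alpha=0.75 the thematically similar result overtakes it, even though neither underlying score has changed.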

Fusion algorithms

Fusion algorithms determine how vector and keyword results are combined into a single ranked list.

Relative score fusion (default)

Combines results based on normalized scores from each search.

How it works:

  1. Vector search produces scores (normalized)
  2. Keyword search produces scores (normalized)
  3. Scores are combined using alpha weighting
  4. Results are merged by combined normalized score

from weaviate.classes.query import HybridFusion

movies = client.collections.use("Movies")

response = movies.query.hybrid(
    query="space adventure",
    alpha=0.5,
    fusion_type=HybridFusion.RELATIVE_SCORE,
    limit=10
)

Best for: When score magnitudes between search types vary significantly
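
The four steps of relative score fusion can be sketched in plain Python. This sketch assumes min-max normalization (the best score in each result set becomes 1, the worst becomes 0) and treats a document missing from one result set as scoring 0 there; the function names and example scores are illustrative, not Weaviate's internals.

```python
def min_max_normalize(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores so the highest becomes 1.0 and the lowest 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal: treat every result as a top result
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def relative_score_fusion(vector_scores: dict[str, float],
                          keyword_scores: dict[str, float],
                          alpha: float = 0.5) -> list[tuple[str, float]]:
    """Normalize both score sets, weight them by alpha, and merge."""
    v = min_max_normalize(vector_scores)
    k = min_max_normalize(keyword_scores)
    fused = {
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * k.get(doc_id, 0.0)
        for doc_id in set(v) | set(k)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores on very different scales
vector_scores = {"a": 0.82, "b": 0.80, "c": 0.60}
keyword_scores = {"b": 12.0, "c": 3.0, "d": 7.5}

fused_order = [doc_id for doc_id, _ in relative_score_fusion(vector_scores, keyword_scores)]
```

Because normalization preserves the gaps between scores, document "b", which is near the top of both searches, comes out clearly ahead of "a", which appears only in the vector results.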

Ranked fusion

Combines results based on their rank position in each search.

How it works:

  1. Vector search produces ranked list
  2. Keyword search produces ranked list
  3. Scores are calculated based on rank positions
  4. Results are merged by combined score

from weaviate.classes.query import HybridFusion

response = movies.query.hybrid(
    query="space adventure",
    alpha=0.5,
    fusion_type=HybridFusion.RANKED,  # Rank-based fusion
    limit=10
)

Best for: Most use cases - stable and predictable
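
Rank-based fusion can be sketched in a similar way. The version below assumes a reciprocal-rank style score of weight / (rank + 60); the constant 60 comes from the common reciprocal rank fusion formulation and is an assumption here, as are the function name and the example rankings.

```python
def ranked_fusion(vector_ranking: list[str],
                  keyword_ranking: list[str],
                  alpha: float = 0.5,
                  k: int = 60) -> list[tuple[str, float]]:
    """Score each document from its position in each list, then merge."""
    fused: dict[str, float] = {}
    for weight, ranking in ((alpha, vector_ranking), (1 - alpha, keyword_ranking)):
        for rank, doc_id in enumerate(ranking):  # rank 0 is the best result
            fused[doc_id] = fused.get(doc_id, 0.0) + weight / (rank + k)
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists, best match first
vector_ranking = ["a", "b", "c"]
keyword_ranking = ["b", "d", "c"]

ranked_order = [doc_id for doc_id, _ in ranked_fusion(vector_ranking, keyword_ranking)]
```

Only positions matter in this scheme: a document that wins one search by a wide margin gets the same credit as one that wins narrowly, which is why documents that appear in both lists ("b" and "c") rank above those that appear in only one.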

Re-rankers

In some cases you can design a multi-stage search pipeline to improve the search quality, using "re-rankers". Typically, re-rankers make use of "cross-encoder" models.

Cross-encoder models typically rank results more accurately than embedding models, because they score the query and a document together. Their big downside is that nothing can be computed in advance: every query-document pair must be run through the model at query time. As a result, these models are usually applied only to "re-rank" a small set of candidates retrieved by a first-stage search.

Re-ranking can be added to any search type, including vector, keyword, and hybrid, and is often an effective way to improve the quality of the retrieved results. Weaviate integrates with a number of re-ranker models, such as those from Cohere, JinaAI, NVIDIA, and more.

How it works:

  1. Initial search retrieves candidates (e.g., top 100)
  2. Re-ranker model scores each candidate against the query
  3. Top n re-ranked results are returned

Trade-off: Better quality for top results, but adds latency
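
The two-stage flow can be sketched without any model at all. In the example below, `token_overlap` is a deliberately crude stand-in for a cross-encoder, used only so the sketch runs on its own; the `rerank` helper and the candidate documents are hypothetical.

```python
from typing import Callable


def rerank(query: str, candidates: list[str],
           score_fn: Callable[[str, str], float], top_n: int) -> list[str]:
    """Score every first-stage candidate against the query and keep the best top_n."""
    scored = [(score_fn(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_n]]


def token_overlap(query: str, document: str) -> float:
    """Stand-in scorer: the fraction of query tokens that appear in the document."""
    query_tokens = set(query.lower().split())
    doc_tokens = set(document.lower().split())
    return len(query_tokens & doc_tokens) / len(query_tokens) if query_tokens else 0.0


# Hypothetical candidates returned by a broad first-stage search
first_stage = [
    "history of the steam engine",
    "applications of artificial intelligence in medicine",
    "artificial sweeteners and health",
]

top_results = rerank("artificial intelligence applications", first_stage,
                     token_overlap, top_n=2)
```

The scorer runs once per candidate, so latency grows with the size of the first-stage result set. That is the reason to keep the candidate set small and return only the top n.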

Configuration:

from weaviate.classes.config import Configure, DataType, Property

client.collections.create(
    name="Articles",
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
    ],
    vector_config=Configure.Vectors.text2vec_weaviate(
        source_properties=["title", "content"]
    ),
    # Add a re-ranker module
    reranker_config=Configure.Reranker.cohere(
        model="rerank-english-v2.0"
    )
)

Using a re-ranker in queries:

from weaviate.classes.query import Rerank

articles = client.collections.use("Articles")

response = articles.query.near_text(
    query="artificial intelligence",
    limit=10,  # Final number of results
    rerank=Rerank(
        prop="content",  # Property to re-rank on
        query="artificial intelligence applications"  # Optional; default is the search query
    )
)

for obj in response.objects:
    print(obj.properties["title"])

Try re-rankers when:

  • Top-k quality is critical
  • You can accept additional latency
  • Initial retrieval is broad (high recall)

Trade-off: Better quality for top results, but adds latency and additional cost.

What's next?

Next, we'll put it all together and look at how to select the right search type for your use case.
