Weaviate Academy

These example patterns show how architectural decisions combine to address specific industry needs. Use these as starting points for your own design decisions.

E-commerce product search

Context: Product catalog with diverse categories, user reviews, and dynamic inventory

Typical scale: 1M-1B+ products, high read-to-write ratio

Architecture pattern:

Collections: Products (single collection with categories as properties)
Vectors: 3 vectors per product
  - "description" vector (text embedder for semantic search)
  - "categories" vector (text embedder for semantic search)
  - "visual" vector (multi-modal embedder for image similarity)
Schema: Denormalized with category, brand, price, availability
Deployment: Serverless Cloud → Enterprise Cloud as traffic grows
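
To make the denormalized schema concrete, here is a minimal plain-Python sketch of flattening a product before import. It makes no Weaviate client calls; the record shapes, field names (`brand_id`, `category_id`, `stock`), and the `denormalize_product` helper are all hypothetical and only illustrate the idea of copying brand and category values onto each product object.

```python
from typing import Any

# Hypothetical normalized source records, e.g. rows from a relational catalog.
BRANDS = {"b1": {"name": "Acme"}}
CATEGORIES = {"c1": {"name": "Bathroom accessories"}}


def denormalize_product(
    product: dict[str, Any],
    brands: dict[str, dict[str, Any]],
    categories: dict[str, dict[str, Any]],
) -> dict[str, Any]:
    """Return a flat object with brand and category copied in as plain properties."""
    return {
        "title": product["title"],
        "description": product["description"],
        "brand": brands[product["brand_id"]]["name"],
        "category": categories[product["category_id"]]["name"],
        "price": product["price"],
        # Availability is derived once at import time so it can be filtered directly.
        "available": product["stock"] > 0,
    }


product = {
    "title": "Soap tray",
    "description": "Ceramic soap tray with drainage holes",
    "brand_id": "b1",
    "category_id": "c1",
    "price": 12.5,
    "stock": 40,
}
flat = denormalize_product(product, BRANDS, CATEGORIES)
print(flat)
```

The trade-off is that a brand rename means updating every affected product object, which is usually acceptable when reads far outnumber writes.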

Key decisions:

Single collection enables cross-category search (a "bathroom accessory" query should find both "bluetooth speakers" and "soap tray")
Denormalized brand/category data avoids slow cross-references during filtering
Multiple vectors support both text search and visual similarity
Quantization enabled to manage memory costs at scale
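
A rough back-of-the-envelope calculation shows why quantization matters with three vectors per product. The figures below are assumptions for illustration only: 10 million products, 768 dimensions per vector, and a 1-bit-per-dimension encoding as the compressed case. The estimate covers raw vector storage and ignores index graph and other overhead.

```python
def raw_vector_bytes(
    num_objects: int,
    vectors_per_object: int,
    dimensions: int,
    bits_per_dimension: int,
) -> int:
    """Estimate bytes needed to hold the vectors alone (no index overhead)."""
    total_bits = num_objects * vectors_per_object * dimensions * bits_per_dimension
    return total_bits // 8


GIB = 1024 ** 3

# Illustrative assumptions, not figures from the lesson.
num_products = 10_000_000
vectors_per_product = 3
dimensions = 768

full = raw_vector_bytes(num_products, vectors_per_product, dimensions, 32)  # float32
binary = raw_vector_bytes(num_products, vectors_per_product, dimensions, 1)  # 1 bit per dimension

print(f"float32 vectors: {full / GIB:.1f} GiB")
print(f"1-bit vectors:   {binary / GIB:.1f} GiB")
print(f"reduction:       {full // binary}x")
```

Actual savings depend on the quantization method chosen and on how much recall loss the application can tolerate.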

Common evolution path: Start with Serverless Cloud, then migrate to Enterprise Cloud when hitting tenant limits or needing custom networking

Document management & knowledge base

Context: Internal company documents, PDFs, wikis with access control needs

Typical scale: 10K-10M+ documents, multiple departments/teams

Architecture pattern:

Collections: Documents (single-tenant to allow company-wide searches)
Vectors: Single vector optimized for semantic document search
Schema: Document metadata + chunked content
Deployment: Enterprise Cloud (for security/compliance requirements)
Index: HNSW with higher ef values (better recall at the cost of slower queries)
Access control: RBAC for identity-based access control

Key decisions:

Document chunks stored as separate objects, each referencing its parent document
Higher HNSW ef values favor recall over query latency, which suits knowledge work
Enterprise deployment meets security and compliance needs
RBAC provides granular access control
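
The chunk-with-parent-reference idea can be sketched in plain Python as follows. This is one possible approach, not the lesson's prescribed method: the word-based splitting, the `chunk_size` and `overlap` defaults, and the property names `parent_document_id` and `chunk_index` are all assumptions chosen for illustration.

```python
def chunk_document(
    doc_id: str, text: str, chunk_size: int = 200, overlap: int = 20
) -> list[dict]:
    """Split text into overlapping word windows, each tagged with its parent document."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for index, start in enumerate(range(0, len(words), step)):
        window = words[start:start + chunk_size]
        chunks.append(
            {
                "parent_document_id": doc_id,  # lets a search hit be traced to its document
                "chunk_index": index,          # preserves reading order within the document
                "content": " ".join(window),
            }
        )
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample_text = " ".join(f"w{i}" for i in range(10))
sample_chunks = chunk_document("doc-1", sample_text, chunk_size=4, overlap=1)
for chunk in sample_chunks:
    print(chunk)
```

Each resulting dictionary would become one stored object, so a search hit on a chunk can be mapped back to its document through the parent identifier.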

Access pattern: Heavy search, light ingestion → optimize for query performance

Online journaling SaaS

Context: Personal journal entries with AI-powered search and insights, strict user data isolation

Typical scale: 1K-100K users, 10K-1M entries per active user

Architecture pattern:

Collections: JournalEntries (multi-tenant by user)
Vectors: Single vector optimized for personal semantic search (could include image search)
Schema: Entry text, date, mood tags, location metadata
Deployment: Serverless Cloud (managed scaling for variable usage)
Index: Dynamic index (tenants with few entries use a flat index, very active tenants switch to HNSW)
Access control: API keys (app-level auth, tenant isolation via multi-tenancy)
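
The effect of a dynamic index across tenants can be pictured with a small plain-Python sketch. It only models the decision, not the database behavior: the `index_type_for` helper, the tenant names, and the threshold of 10,000 objects are assumptions for illustration.

```python
def index_type_for(object_count: int, threshold: int = 10_000) -> str:
    """Model the dynamic-index choice: flat below the threshold, HNSW above it."""
    return "hnsw" if object_count > threshold else "flat"


# Hypothetical entry counts per tenant (one tenant per user).
entries_per_tenant = {"user-a": 120, "user-b": 250_000}

plan = {tenant: index_type_for(count) for tenant, count in entries_per_tenant.items()}
print(plan)
```

With most users well below the threshold, the majority of tenants avoid the memory cost of an HNSW graph.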

Key decisions:

Multi-tenancy ensures complete user data isolation
Dynamic indexing handles varying user activity levels efficiently
Single semantic vector optimized for personal writing patterns
App-level authentication with API keys; users never access Weaviate directly
Tenant state management (INACTIVE for dormant users to save resources)
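
Tenant state management could be driven by a periodic job along these lines. This is a sketch under stated assumptions: the 30-day dormancy cutoff, the `plan_tenant_states` helper, and the shape of the activity data are hypothetical, and the step that applies the planned states to the database is deliberately left out.

```python
from datetime import datetime, timedelta


def plan_tenant_states(
    last_active: dict[str, datetime],
    now: datetime,
    dormant_after: timedelta = timedelta(days=30),
) -> dict[str, str]:
    """Mark tenants with no recent activity as INACTIVE and the rest as ACTIVE."""
    return {
        tenant: "INACTIVE" if now - seen > dormant_after else "ACTIVE"
        for tenant, seen in last_active.items()
    }


# Hypothetical activity log: tenant name -> time of last read or write.
now = datetime(2024, 6, 1)
last_active = {
    "user-a": datetime(2024, 5, 30),  # wrote an entry two days ago
    "user-b": datetime(2024, 1, 15),  # dormant for months
}

states = plan_tenant_states(last_active, now)
print(states)
```

A dormant user's tenant would need to be reactivated before their next search or write, so the application should account for that step at login.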

Access pattern: Bursty writes, occasional searches → optimize for cost efficiency and data privacy

What's next?

Let's wrap up with a quick reference guide and common anti-patterns to avoid.
