Industry patterns & use cases
These example patterns show how architectural decisions combine to address specific industry needs. Use them as starting points for your own designs.
E-commerce product search
Context: Product catalog with diverse categories, user reviews, and dynamic inventory
Typical scale: 1M-1B+ products, high read-to-write ratio
Architecture pattern:
Collections: Products (single collection with categories as properties)
Vectors: 3 vectors per product
- "description" vector (text embedder for semantic search)
- "categories" vector (text embedder for semantic search)
- "visual" vector (multi-modal embedder for image similarity)
Schema: Denormalized with category, brand, price, availability
Deployment: Serverless Cloud → Enterprise Cloud as traffic grows
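To make the denormalized schema concrete, here is a minimal sketch in plain Python (no Weaviate client involved). The field names, the `Product` class, and the `filter_products` helper are illustrative assumptions, not a prescribed schema. The point is that every filterable field lives on the product object itself, so a filter never has to follow a reference to another object.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Product:
    # Field names are illustrative assumptions, not a fixed schema.
    name: str
    description: str
    category: str    # copied onto the product, not a reference to a Category object
    brand: str       # likewise stored inline
    price: float
    available: bool


def filter_products(products: List[Product], brand: Optional[str] = None,
                    max_price: Optional[float] = None,
                    available_only: bool = False) -> List[Product]:
    """Filter using only fields stored on each product (no lookups elsewhere)."""
    result = []
    for p in products:
        if brand is not None and p.brand != brand:
            continue
        if max_price is not None and p.price > max_price:
            continue
        if available_only and not p.available:
            continue
        result.append(p)
    return result


catalog = [
    Product("Shower speaker", "Waterproof bluetooth speaker", "Electronics", "Acme", 39.0, True),
    Product("Soap tray", "Ceramic soap tray", "Bathroom", "Acme", 12.5, True),
    Product("Towel rack", "Steel towel rack", "Bathroom", "Bolt", 25.0, False),
]
hits = filter_products(catalog, brand="Acme", max_price=40.0, available_only=True)
```

The trade-off is that a brand or category rename has to be written to every affected product, which is usually acceptable for a read-heavy catalog.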
Key decisions:
- Single collection enables cross-category search (a query for "bathroom accessory" should find "bluetooth speakers" as well as "soap tray")
- Denormalized brand/category data avoids slow cross-references during filtering
- Multiple vectors support both text search and visual similarity
- Quantization enabled to manage memory costs at scale
Common evolution path: Start with Serverless → migrate to Enterprise Cloud when hitting tenant limits or needing custom networking
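A rough back-of-the-envelope calculation shows why quantization matters with three vectors per product. The sketch below assumes 10 million products and vector sizes of 768, 768, and 512 dimensions; these numbers are hypothetical and depend on the embedders you choose. It counts raw vector storage only and ignores index overhead such as the HNSW graph, so treat the results as a lower bound.

```python
def vector_memory_gib(num_objects: int, dims_per_vector: list, bytes_per_dim: float) -> float:
    """Raw vector storage in GiB, ignoring index structures and metadata."""
    total_bytes = num_objects * sum(dims_per_vector) * bytes_per_dim
    return total_bytes / 2**30


NUM_PRODUCTS = 10_000_000      # assumed catalog size
DIMS = [768, 768, 512]         # assumed sizes of the description, categories, visual vectors

uncompressed = vector_memory_gib(NUM_PRODUCTS, DIMS, 4)     # float32: 4 bytes per dimension
scalar_8bit = vector_memory_gib(NUM_PRODUCTS, DIMS, 1)      # 8-bit scalar quantization
binary_1bit = vector_memory_gib(NUM_PRODUCTS, DIMS, 1 / 8)  # 1 bit per dimension

print(f"float32: {uncompressed:.1f} GiB")
print(f"8-bit:   {scalar_8bit:.1f} GiB")
print(f"1-bit:   {binary_1bit:.1f} GiB")
```

Under these assumptions the uncompressed vectors need roughly 76 GiB, and 8-bit quantization cuts that by a factor of four. Quantized setups may still keep the original vectors on disk for rescoring, so the savings apply mainly to memory rather than to total storage.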
Document management & knowledge base
Context: Internal company documents, PDFs, wikis with access control needs
Typical scale: 10K-10M+ documents, multiple departments/teams
Architecture pattern:
Collections: Documents (single-tenant to allow company-wide searches)
Vectors: Single vector optimized for semantic document search
Schema: Document metadata + chunked content
Deployment: Enterprise Cloud (for security/compliance requirements)
Index: HNSW with higher ef values (accuracy over speed)
Access control: RBAC (role-based access control) tied to user identity
Key decisions:
- Document chunks are stored as separate objects, each with a reference to its parent document
- Higher HNSW ef values prioritize accuracy for knowledge work
- Enterprise deployment meets security and compliance needs
- RBAC provides granular access control
Access pattern: Heavy search, light ingestion → optimize for query performance
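The chunk-with-parent-reference idea can be sketched in a few lines of plain Python. The `Chunk` fields, the word-based splitting, and the `max_words` and `overlap` values are assumptions chosen for illustration; real pipelines often split on sentences or tokens instead.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    parent_id: str     # points back to the source document
    chunk_index: int   # position of the chunk within that document
    text: str


def chunk_document(doc_id: str, text: str, max_words: int = 200,
                   overlap: int = 20) -> List[Chunk]:
    """Split text into overlapping word windows, each tagged with its parent."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks: List[Chunk] = []
    for index, start in enumerate(range(0, len(words), step)):
        window = words[start:start + max_words]
        chunks.append(Chunk(doc_id, index, " ".join(window)))
        if start + max_words >= len(words):
            break  # the final window already reached the end of the text
    return chunks


sample_text = " ".join(f"w{i}" for i in range(450))
sample_chunks = chunk_document("doc-1", sample_text)
```

Each chunk would become its own searchable object, and `parent_id` lets the application fetch document-level metadata (title, department, permissions) once a chunk matches.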
Online journaling SaaS
Context: Personal journal entries with AI-powered search and insights, strict user data isolation
Typical scale: 1K-100K users, 10K-1M entries per active user
Architecture pattern:
Collections: JournalEntries (multi-tenant by user)
Vectors: Single vector optimized for personal semantic search (could include image search)
Schema: Entry text, date, mood tags, location metadata
Deployment: Serverless Cloud (managed scaling for variable usage)
Index: Dynamic index (users with few entries get a flat index, very active users get HNSW)
Access control: API keys (app-level auth, tenant isolation via multi-tenancy)
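The dynamic index behavior can be modeled as a simple per-tenant rule: stay on a flat (brute-force) index while the tenant is small, and switch to HNSW once the object count passes a threshold. The sketch below is a plain-Python model, not Weaviate's implementation; the `TenantIndex` class is hypothetical, and the threshold of 10,000 objects and the one-way upgrade are assumptions to check against the Weaviate documentation for your version.

```python
class TenantIndex:
    """Toy model of a per-tenant index that upgrades from flat to HNSW."""

    def __init__(self, threshold: int = 10_000):  # threshold value is an assumption
        self.threshold = threshold
        self.object_count = 0
        self.index_type = "flat"

    def add(self, n: int = 1) -> None:
        self.object_count += n
        # One-way upgrade: once a tenant is on HNSW it stays there in this model.
        if self.index_type == "flat" and self.object_count > self.threshold:
            self.index_type = "hnsw"


casual_user = TenantIndex()
casual_user.add(350)        # a few hundred entries: brute-force search is cheap

heavy_user = TenantIndex()
heavy_user.add(25_000)      # large tenant: graph index pays off
```

The design rationale is that a flat index has almost no memory overhead, which matters when most of the tenants are small, while HNSW keeps query latency low for the few tenants with many entries.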
Key decisions:
- Multi-tenancy ensures complete user data isolation
- Dynamic indexing handles varying user activity levels efficiently
- Single semantic vector optimized for personal writing patterns
- App-level authentication with API keys; users never access Weaviate directly
- Tenant state management (INACTIVE for dormant users to save resources)
Access pattern: Bursty writes, occasional searches → optimize for cost efficiency and data privacy
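Tenant state management usually comes down to a periodic job that decides which tenants should be deactivated. The sketch below shows only that decision logic in plain Python. The `desired_tenant_states` helper and the 30-day cutoff are hypothetical, and applying the resulting states through the Weaviate client is intentionally left out.

```python
from datetime import datetime, timedelta
from typing import Dict


def desired_tenant_states(last_active: Dict[str, datetime], now: datetime,
                          dormant_after: timedelta = timedelta(days=30)) -> Dict[str, str]:
    """Map each tenant to ACTIVE or INACTIVE based on its last activity time."""
    return {
        tenant: "INACTIVE" if now - seen > dormant_after else "ACTIVE"
        for tenant, seen in last_active.items()
    }


now = datetime(2025, 6, 1)
last_active = {
    "user-a": datetime(2025, 5, 30),   # wrote an entry two days ago
    "user-b": datetime(2025, 1, 15),   # dormant for months
}
states = desired_tenant_states(last_active, now)
```

The application would then reactivate a tenant when its user logs in again, so the cost of keeping dormant journals loaded is paid only for users who are actually present.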
Let's wrap up with a quick reference guide and common anti-patterns to avoid.