Prerequisites
- Understanding of Context Arithmetic
- Data to contextualize
- Familiarity with the Alchemyst SDKs and APIs.
Introduction
Context Arithmetic is a simple but powerful tool. With it, you can derive context spaces, merge them, segregate them, or apply access controls on top of them, all while maintaining the tractability and other benefits that Alchemyst offers. In an agentic environment where non-determinism is a foundational principle rather than a byproduct, it offers verifiability across large systems. This gives enterprise teams control over their data usage, governance patterns, and scopes. This section explores patterns you can use to build robust context-aware AI systems with context arithmetic.
Pattern 1: Hierarchical groupName Design
The Problem
Flat groupName structures become unmanageable at scale. You end up with searches that are either too broad (returning 1000+ documents) or too narrow (returning nothing).
The Solution: Think in Layers
Design your groupName hierarchy like a file system: broad categories at the top, specific tags as you go deeper.
Three-Layer Strategy
Layer 1: Organization/Domain (Required for all documents)
- Examples: "engineering", "marketing", "sales", "support"
- Purpose: Top-level access control
Layer 2
- Examples: "q4_2024", "product_alpha", "codebase", "customer_tickets"
- Purpose: Logical grouping and temporal filtering
Layer 3
- Examples: "auth", "api", "campaign", "billing"
- Purpose: Fine-grained filtering
Example: Engineering Team
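A minimal sketch of how an engineering team's documents might be tagged with one groupName per layer. The `make_document` helper, the file names, and the document shape (a `content` string plus `metadata` holding `fileName` and `groupName`) are assumptions for illustration, not the Alchemyst SDK's actual types; the tag values are taken from the layer examples.

```python
from typing import Dict, List


def make_document(content: str, file_name: str, domain: str,
                  grouping: str, tag: str) -> Dict:
    """Build a document dict tagged with one groupName per layer (hypothetical shape)."""
    return {
        "content": content,
        "metadata": {
            "fileName": file_name,
            # Ordered broad to specific: domain, logical grouping, fine-grained tag.
            "groupName": [domain, grouping, tag],
        },
    }


engineering_docs: List[Dict] = [
    make_document("How OAuth tokens are refreshed.", "auth_flow.md",
                  "engineering", "codebase", "auth"),
    make_document("Conventions for REST endpoints.", "api_guidelines.md",
                  "engineering", "codebase", "api"),
    make_document("Billing work planned for Product Alpha.", "billing_plan.md",
                  "engineering", "product_alpha", "billing"),
]
```

Every document carries the Layer 1 tag, so a single top-level tag is enough to scope access to the whole team's content.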
Query Patterns
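The effect of layered tags on query breadth can be simulated locally. The sketch below is an in-memory filter, not an Alchemyst API call; it assumes a query matches only documents that carry every requested tag, which may differ from how the service combines groupNames.

```python
from typing import Dict, Iterable, List

# Hypothetical tagged documents, reduced to the fields the filter needs.
DOCS: List[Dict] = [
    {"fileName": "auth_flow.md", "groupName": ["engineering", "codebase", "auth"]},
    {"fileName": "api_guidelines.md", "groupName": ["engineering", "codebase", "api"]},
    {"fileName": "billing_plan.md", "groupName": ["engineering", "product_alpha", "billing"]},
    {"fileName": "launch_brief.md", "groupName": ["marketing", "q4_2024", "campaign"]},
]


def filter_by_groups(docs: Iterable[Dict], required: Iterable[str]) -> List[str]:
    """Return file names of documents that carry every tag in `required`."""
    wanted = set(required)
    return [d["fileName"] for d in docs if wanted.issubset(d["groupName"])]


broad = filter_by_groups(DOCS, ["engineering"])                      # Layer 1 only
scoped = filter_by_groups(DOCS, ["engineering", "codebase"])         # Layers 1 and 2
narrow = filter_by_groups(DOCS, ["engineering", "codebase", "auth"])  # all three layers
empty = filter_by_groups(DOCS, ["marketing", "auth"])                # mismatched layers
```

Adding a layer narrows the result set, while mixing tags from unrelated branches returns nothing, which is the "too narrow" failure described earlier.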
Pattern 2: Handling Document Updates
The Deduplication Gotcha
Alchemyst uses metadata.fileName as the deduplication key. This means:
- Same fileName = Alchemyst treats it as an update attempt
- Update attempts without delete = 409 Conflict error
- This is by design to prevent accidental duplicates
Three Update Strategies
Strategy A: Delete-Then-Add (Recommended for true updates)
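The conflict and the delete-then-add fix can be illustrated with a local stand-in. `InMemoryStore`, `ConflictError`, and `replace_document` are hypothetical names invented for this sketch; they imitate the fileName deduplication rule and are not part of the Alchemyst SDK.

```python
from typing import Dict


class ConflictError(Exception):
    """Stands in for a 409 Conflict response."""


class InMemoryStore:
    """Hypothetical local stand-in for a context store keyed by fileName."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def add(self, file_name: str, content: str) -> None:
        # Re-adding an existing fileName without a prior delete is rejected.
        if file_name in self._docs:
            raise ConflictError(f"409 Conflict: {file_name} already exists")
        self._docs[file_name] = content

    def delete(self, file_name: str) -> None:
        # Deleting a missing document is treated as a no-op in this sketch.
        self._docs.pop(file_name, None)

    def get(self, file_name: str) -> str:
        return self._docs[file_name]


def replace_document(store: InMemoryStore, file_name: str, content: str) -> None:
    """Delete-then-add: remove the old copy first so the add cannot conflict."""
    store.delete(file_name)
    store.add(file_name, content)


store = InMemoryStore()
store.add("pricing.md", "v1 pricing")
try:
    store.add("pricing.md", "v2 pricing")  # naive re-add hits the conflict
    conflicted = False
except ConflictError:
    conflicted = True

replace_document(store, "pricing.md", "v2 pricing")
```

Between the delete and the add the document is briefly absent, so readers querying in that window may miss it; that trade-off is worth keeping in mind for frequently read documents.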
Strategy B: Versioned FileNames (Keep history)
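One way to keep history is to encode a version number in the fileName so each revision has a distinct deduplication key. The `_vN` suffix convention and both helper names below are assumptions chosen for this sketch, not a naming scheme Alchemyst prescribes.

```python
import os
import re
from typing import Iterable


def versioned_file_name(file_name: str, version: int) -> str:
    """Insert a _v<version> suffix before the extension."""
    stem, ext = os.path.splitext(file_name)
    return f"{stem}_v{version}{ext}"


def next_version(existing: Iterable[str], file_name: str) -> int:
    """Return one more than the highest version already stored for file_name."""
    stem, ext = os.path.splitext(file_name)
    pattern = re.compile(rf"^{re.escape(stem)}_v(\d+){re.escape(ext)}$")
    versions = [int(m.group(1)) for name in existing if (m := pattern.match(name))]
    return max(versions, default=0) + 1


stored = ["api_spec_v1.md", "api_spec_v2.md", "readme_v7.md"]
new_name = versioned_file_name("api_spec.md", next_version(stored, "api_spec.md"))
```

Because older versions remain searchable, queries may need an extra filter (for example on a version field) to prefer the latest revision.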
Strategy C: Conversational Updates (Incremental context)
Decision Tree: Which Strategy?
Advanced: Batch Updates
Pattern 3: Composable Context Strategies
The Dilemma
Should you store one big document or many small ones? This affects retrieval quality, performance, and maintenance.
Rule of Thumb: Split by Access Pattern
When to COMBINE contexts:
Scenario 1: Tightly Coupled Information
When to SPLIT contexts:
Scenario 1: Different Access Patterns
The Goldilocks Size: 500-2000 words per document
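A rough way to apply the 500-2000 word target is to measure documents before ingesting them and split oversized ones on paragraph boundaries. The helpers below are a sketch with hypothetical names; they count whitespace-separated words, which is only an approximation of how any tokenizer would measure size.

```python
from typing import List


def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def in_goldilocks_range(text: str, low: int = 500, high: int = 2000) -> bool:
    """True when the document falls inside the suggested size range."""
    return low <= word_count(text) <= high


def split_paragraphs(text: str, max_words: int = 2000) -> List[str]:
    """Greedily pack whole paragraphs into chunks of at most max_words words.

    A single paragraph longer than max_words is kept intact as its own chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_words = 0
    for para in text.split("\n\n"):
        n = word_count(para)
        if n == 0:
            continue  # skip blank paragraphs
        if current and current_words + n > max_words:
            chunks.append("\n\n".join(current))
            current, current_words = [], 0
        current.append(para)
        current_words += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks


paragraph = " ".join(["word"] * 900)
long_text = "\n\n".join([paragraph] * 3)  # 2700 words in three paragraphs
chunks = split_paragraphs(long_text)
```

Splitting on paragraph boundaries keeps related sentences together, which tends to matter more for retrieval than hitting an exact word count.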
Advanced: Cross-References
Pattern 4: Bulk Operations at Scale
The Performance Cliff
Optimal Batch Sizes
Based on production usage patterns:
| Documents | Batches | Time | Recommendation |
|---|---|---|---|
| 100 | 1 batch | ~0.5s | ✅ Single call |
| 1,000 | 1 batch | ~3s | ✅ Single call |
| 10,000 | 10 batches | ~35s | ✅ Optimal |
| 10,000 | 100 batches | ~90s | ⚠️ Too fragmented |
| 100,000 | 100 batches | ~6min | ✅ Good with progress tracking |
The 1000-Document Sweet Spot
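Chunking a large ingest into batches of 1000 documents can be done with a small generator. This is a generic sketch; `batched` is a helper defined here, and sending each batch to Alchemyst is left out because the exact SDK call is not shown in this guide.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int = 1000) -> Iterator[List[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


documents = [f"doc-{i}" for i in range(2500)]
batch_sizes = [len(batch) for batch in batched(documents)]
```

With the default size, 10,000 documents produce 10 batches, matching the row marked optimal in the table.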
Advanced: Parallel Processing with Limits
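Batches can be sent concurrently while capping how many are in flight at once. The sketch below uses Python's standard `ThreadPoolExecutor`; `upload_batch` is a hypothetical stand-in for whatever call actually sends a batch, and the limit of 4 is an arbitrary example value rather than a documented Alchemyst rate limit.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def upload_batch(batch: Sequence[str]) -> int:
    """Hypothetical uploader: pretend to send the batch, return how many were sent."""
    return len(batch)


def upload_all(batches: Sequence[Sequence[str]],
               upload: Callable[[Sequence[str]], int] = upload_batch,
               max_concurrency: int = 4) -> List[int]:
    """Upload batches in parallel with at most max_concurrency running at once."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map returns results in the same order as the input batches.
        return list(pool.map(upload, batches))


batches = [[f"doc-{b}-{i}" for i in range(1000)] for b in range(5)]
sent_per_batch = upload_all(batches)
```

Capping the worker count keeps a large ingest from flooding the service; the right limit depends on the rate limits that apply to your account.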
Error Handling in Bulk Operations
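A common approach is to retry each failed batch a few times with exponential backoff and record the batches that never succeed, so one bad batch does not abort the whole ingest. The function and the simulated `flaky_upload` below are hypothetical and run entirely locally; real code would catch the specific exceptions the SDK raises rather than a bare `Exception`.

```python
import time
from typing import Callable, Dict, List, Sequence, Tuple


def ingest_with_retries(batches: Sequence[Sequence[str]],
                        upload: Callable[[Sequence[str]], None],
                        max_attempts: int = 3,
                        base_delay: float = 0.5) -> Tuple[List[int], List[Tuple[int, str]]]:
    """Upload every batch; return (indexes that succeeded, (index, error) for failures)."""
    succeeded: List[int] = []
    failed: List[Tuple[int, str]] = []
    for index, batch in enumerate(batches):
        for attempt in range(1, max_attempts + 1):
            try:
                upload(batch)
                succeeded.append(index)
                break
            except Exception as exc:  # narrow this to the SDK's error types in real code
                if attempt == max_attempts:
                    failed.append((index, str(exc)))
                else:
                    # Exponential backoff: base_delay, 2x, 4x, ...
                    time.sleep(base_delay * 2 ** (attempt - 1))
    return succeeded, failed


calls: Dict[str, int] = {}


def flaky_upload(batch: Sequence[str]) -> None:
    """Simulated uploader: one batch fails once, another always fails."""
    key = batch[0]
    calls[key] = calls.get(key, 0) + 1
    if key == "always_fails":
        raise RuntimeError("simulated outage")
    if key == "fails_once" and calls[key] == 1:
        raise RuntimeError("simulated timeout")


ok, bad = ingest_with_retries(
    [["fine"], ["fails_once"], ["always_fails"]], flaky_upload, base_delay=0.0
)
```

The failure list can be written to a log or fed back into a later run, so only the problem batches are retried.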
Pro Tip: Progress Tracking for Large Ingests
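For long-running ingests, reporting progress after each batch makes stalls visible early. This sketch takes the uploader and the reporting function as parameters; both are placeholders defined for illustration, and `report` defaults to `print`.

```python
from typing import Callable, List, Sequence


def ingest_with_progress(batches: Sequence[Sequence[str]],
                         upload: Callable[[Sequence[str]], None],
                         report: Callable[[str], None] = print) -> int:
    """Upload batches in order, reporting cumulative progress after each one."""
    total = sum(len(batch) for batch in batches)
    done = 0
    for number, batch in enumerate(batches, start=1):
        upload(batch)
        done += len(batch)
        fraction = done / total if total else 1.0  # avoid dividing by zero
        report(f"batch {number}/{len(batches)}: {done}/{total} documents ({fraction:.0%})")
    return done


messages: List[str] = []
demo_batches = [["d"] * 1000, ["d"] * 1000, ["d"] * 500]
uploaded = ingest_with_progress(demo_batches, upload=lambda batch: None,
                                report=messages.append)
```

Passing `report=messages.append` collects the lines instead of printing them, which is convenient for tests or for forwarding progress to a logger.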
Anti-Pattern 1: Over-segmentation
The Problem
Creating too many small, fragmented contexts that should be combined.
Real Example: Support Ticket System
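As an illustration, suppose each message in a support ticket were stored as its own tiny document. The sketch below merges such fragments into one document per ticket; the fragment fields (`ticketId`, `role`, `text`) and the output shape are hypothetical and chosen only to show the idea.

```python
from collections import defaultdict
from typing import Dict, List

# Over-segmented input: one fragment per message (hypothetical data).
fragments: List[Dict[str, str]] = [
    {"ticketId": "T-1", "role": "customer", "text": "I was charged twice this month."},
    {"ticketId": "T-1", "role": "agent", "text": "Refund issued for the duplicate charge."},
    {"ticketId": "T-2", "role": "customer", "text": "Password reset email never arrives."},
]


def combine_by_ticket(items: List[Dict[str, str]]) -> List[Dict]:
    """Merge message fragments into one document per ticket, preserving order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for item in items:
        grouped[item["ticketId"]].append(f'{item["role"]}: {item["text"]}')
    return [
        {"content": "\n".join(lines), "metadata": {"fileName": f"ticket_{ticket_id}.txt"}}
        for ticket_id, lines in grouped.items()
    ]


ticket_docs = combine_by_ticket(fragments)
```

A search for the double charge now retrieves the complaint together with its resolution, instead of one fragment without the other.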
The Acid Test: Query Simulation
Ask yourself: "What will my users search for?"
Rule: The Paragraph Test
If your document content is less than a paragraph (< 100 words), you're probably over-segmenting. Exception: structured data with high retrieval value.
Anti-Pattern 2: Metadata Bloat
The Problem
Storing too much or redundant information in metadata, slowing down queries and wasting storage.
Real Example: Product Catalog
Decision Framework: Metadata vs Content
Store in METADATA if:
- You filter or sort by it (price, inStock, category)
- You need exact matching (productId, sku)
- It's used for access control (department, classification)
- It changes frequently and independently (stock, price)
Store in CONTENT if:
- It's descriptive text (description, features)
- It's rarely filtered (dimensions, weight)
- It's only needed when the document is retrieved (warranty, origin)
- It's part of the semantic meaning (reviews, specifications)
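The framework above can be sketched as a small transform that keeps filterable fields in metadata and folds everything else into the document text. The product record, the `FILTERABLE_FIELDS` choice, and the output shape are assumptions for illustration; pick the fields that match your own query patterns.

```python
from typing import Any, Dict

# Fields this hypothetical catalog filters, sorts, or matches exactly on.
FILTERABLE_FIELDS = ("productId", "sku", "price", "inStock", "category")


def to_document(product: Dict[str, Any]) -> Dict[str, Any]:
    """Split a product record into lean metadata and searchable content."""
    metadata = {key: product[key] for key in FILTERABLE_FIELDS if key in product}
    # Everything not used for filtering becomes part of the text.
    content_lines = [
        f"{key}: {value}" for key, value in product.items()
        if key not in FILTERABLE_FIELDS
    ]
    return {"content": "\n".join(content_lines), "metadata": metadata}


product = {
    "productId": "P-100",
    "sku": "KB-87-BLK",
    "price": 89.0,
    "inStock": True,
    "category": "keyboards",
    "description": "Compact mechanical keyboard with hot-swappable switches.",
    "warranty": "2 years",
    "weight": "780 g",
}
doc = to_document(product)
```

Descriptive fields such as `description` and `warranty` stay retrievable through the content, while the metadata stays small enough to filter on cheaply.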
Pro Tip: The 5-Field Rule
Start with a maximum of 5 metadata fields. Add more only when you have a specific query pattern that requires it.
Real Impact: Before & After
Before optimization (metadata bloat):
- Storage: 2.3GB for 10,000 products
- Query time: 450ms average
- Index time: 12 minutes
- Maintenance: 3 fields out of sync per 100 updates
After optimization:
- Storage: 0.8GB for 10,000 products (65% reduction)
- Query time: 180ms average (60% faster)
- Index time: 4 minutes (67% faster)
- Maintenance: Consistency issues eliminated

