Your engineers ask for a bigger vector database. "We need to index more documents," they say. "Then retrieval will be better." So you double the index size. Query time gets slower. Results aren't noticeably better. You're still having the same problems. This is the wrong fix for the wrong problem.
The issue isn't data volume. It's data assembly.
An agent with access to 100 high-quality, perfectly assembled context items will outperform an agent with access to 10,000 items plus noise. The agent with 10,000 items has to filter, interpret, and prioritize constantly. It spends more tokens. It incurs more latency. It gets distracted by irrelevant information.
Your vector database doesn't assemble context. It returns candidate results, ranked by mathematical distance. It doesn't understand that those five results contradict each other, or that one is outdated, or that they only make sense in a specific order.
This is why larger indexes often perform worse. You're adding more noise. You're making it harder for your retrieval system to distinguish signal from noise. You're asking your agent to assemble meaningful context from candidates a vector database chose based on embedding distance, which has nothing to do with what your agent actually needs.
Context assembly is different from retrieval. Retrieval finds candidates. Assembly shapes those candidates into something an agent can actually use.
Assembly means understanding relationships between results. If your agent is planning a workflow, you can't just return five similar documents ranked by cosine similarity. You need to return them in the order that respects dependencies and constraints.
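A minimal sketch of dependency-respecting assembly, assuming each retrieved candidate carries dependency metadata (the document names and `depends_on` field here are hypothetical). The standard-library `graphlib` module handles the topological ordering:

```python
from graphlib import TopologicalSorter

# Hypothetical retrieved candidates: each names the docs it depends on.
docs = {
    "deploy_guide":   {"depends_on": ["build_setup", "auth_config"]},
    "build_setup":    {"depends_on": []},
    "auth_config":    {"depends_on": ["build_setup"]},
    "rollback_notes": {"depends_on": ["deploy_guide"]},
}

def assemble_in_dependency_order(docs):
    """Order candidates so prerequisites appear before the docs that need them."""
    graph = {name: set(meta["depends_on"]) for name, meta in docs.items()}
    return list(TopologicalSorter(graph).static_order())
```

Cosine similarity might rank `deploy_guide` first; the dependency order puts `build_setup` first, because that is the order in which the agent can actually use them.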
Assembly means filtering for recency. A vector database has no concept of time. Six of ten results might be from when the API worked differently. You need temporal awareness.
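One way to add that temporal awareness, assuming each candidate carries an `updated` timestamp attached at index time (the document IDs and field names are illustrative):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical candidates: the stale doc actually has the higher similarity score.
candidates = [
    {"id": "api_v2_guide", "updated": datetime(2024, 11, 1, tzinfo=timezone.utc), "score": 0.82},
    {"id": "api_v1_guide", "updated": datetime(2021, 3, 15, tzinfo=timezone.utc), "score": 0.91},
]

def filter_stale(candidates, max_age_days=365, now=None):
    """Drop results older than the cutoff, regardless of similarity score."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [c for c in candidates if c["updated"] >= cutoff]
```

Note that the highest-scoring result is the one that gets dropped: embedding distance and staleness are independent axes, and only assembly sees both.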
Assembly means understanding user context. Same query from a senior engineer versus a new team member? They need different context. Same query from a user who's failed this operation five times versus someone doing it for the first time? Different assembly. Similarity search returns the same results for everyone.
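A sketch of user-aware assembly over the same candidate list. The user fields (`failed_attempts`, `seniority`) and document `kind` tags are assumptions about what your system tracks, not a prescribed schema:

```python
def assemble_for_user(candidates, user):
    """Reorder the same retrieval results based on who is asking."""
    if user.get("failed_attempts", 0) >= 3:
        # A struggling user gets troubleshooting material first.
        # (False sorts before True, so troubleshooting docs lead.)
        candidates = sorted(candidates, key=lambda c: c["kind"] != "troubleshooting")
    if user.get("seniority") == "new":
        # A new team member gets overview context a senior engineer would skip.
        candidates = ([c for c in candidates if c["kind"] == "overview"]
                      + [c for c in candidates if c["kind"] != "overview"])
    return candidates
```

Two users issue the identical query, the retriever returns identical candidates, and the assembled context still differs, which is exactly what similarity search alone cannot do.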
When you treat vector databases as truth, you accept their limitations. You keep tuning embeddings and raising top-K. You're optimizing retrieval when the problem is assembly.
Your agent doesn't need more sources. It needs a system that understands how to assemble the sources you have into coherent context. It needs assembly that respects task structure, temporal constraints, user state, and entity relationships. That's not retrieval. That's something deeper.
Stop asking "do we have enough data?" Start asking "are we assembling context correctly?" The second question is where the real improvements come from.
FAQ
Should we try a better embedding model? Maybe, but better embeddings just optimize ranking at retrieval. They don't fix assembly. You're still starting from the wrong layer.
What if we add more reranking? Reranking ranks candidates better, but it doesn't assemble context. You're still getting back a flat list of documents.
How do I know if this is my bottleneck? Run the same agent on a small curated context set versus the same data through your vector database. Better performance on curated data? This is your problem.
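The experiment above can be sketched as a small harness. `run_agent` and `context_for` are stand-ins for your agent call and your two context sources; only the comparison structure is the point:

```python
def success_rate(run_agent, tasks, context_for):
    """Fraction of tasks the agent completes with a given context source."""
    wins = sum(1 for task in tasks if run_agent(task, context_for(task)))
    return wins / len(tasks)

def compare(run_agent, tasks, curated_context, retrieved_context):
    """Gap between hand-curated context and vector-database context."""
    return (success_rate(run_agent, tasks, curated_context)
            - success_rate(run_agent, tasks, retrieved_context))
```

A large positive gap on the same tasks and the same underlying data says the bottleneck is how context is assembled, not how much data you have.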
Conclusion
Every dollar spent optimizing retrieval while ignoring assembly is a dollar not spent on the thing that actually matters. Your data is probably fine. Your indices are probably big enough. Your embedding model is probably good enough. The bottleneck is assembly.
The real constraint isn't similarity—it's context. Build systems that understand the difference between finding candidates and assembling meaning.
Your agent is waiting for context that actually makes sense.