What we learned building AI for investment teams
Most investment teams have moved past the “should we use AI” conversation. The harder question now is how to make it work inside investment workflows, with real data, at a level of quality that analysts and PMs can trust. Our CTO Craig Marvelley recently sat down with Blake Fischer, Director of Technology at UVIMCO, to talk through what that looks like in practice. Here are the main themes from their discussion.
Data readiness matters more than model sophistication
The teams getting the best results from AI had their data in order before they started. Structured metadata, disciplined tagging, clean workflows. These are the foundations that allow AI to return accurate, verifiable answers. Teams that skipped this step found that the outputs were unreliable, and the root cause almost always traced back to data quality. The principle is simple: garbage in, garbage out, regardless of how capable the LLM is.
Context loss is this year’s problem
A year ago, everyone was talking about hallucinations. That’s been largely addressed through citation grounding and Retrieval-Augmented Generation (RAG). The problem practitioners are solving for now is subtler. AI answers a question authoritatively but leaves out the substance that actually matters. Fixing this requires layered context: glossaries that teach the system your internal nomenclature, output templates that define what an IC memo should look like, and enough awareness of the user’s role and intent to prioritize the right information when the context window gets tight.
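The layered-context idea can be sketched as a simple prompt assembler that packs layers in priority order and drops the lowest-priority material when the token budget runs out. This is a minimal illustration, not a real system: the layer texts, the word-count token estimator, and the `assemble_context` helper are all hypothetical.

```python
# Sketch: assembling layered context for an LLM prompt, with priority-based
# dropping when the context window gets tight. All names are illustrative.

def assemble_context(layers, max_tokens, count_tokens=lambda s: len(s.split())):
    """Pack context layers into a token budget, highest priority first.

    layers: list of (priority, text) pairs; lower number = higher priority.
    Layers that don't fit are dropped whole rather than truncated mid-text.
    """
    remaining = max_tokens
    kept = []
    for priority, text in sorted(layers, key=lambda pair: pair[0]):
        cost = count_tokens(text)
        if cost <= remaining:
            kept.append(text)
            remaining -= cost
    return "\n\n".join(kept)

layers = [
    (0, "Glossary: 'IC' means Investment Committee; 'PM' means Portfolio Manager."),
    (1, "Template: an IC memo has sections Thesis, Risks, Valuation, Recommendation."),
    (2, "User role: senior analyst preparing a memo for Monday's IC meeting."),
    (3, "Background notes: ..."),  # lowest priority, dropped first under pressure
]
prompt_context = assemble_context(layers, max_tokens=32)
```

In practice the glossary and template layers would be retrieved per firm, and a real tokenizer would replace the word-count stand-in.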
The decisions that matter most are invisible
Some of the highest-impact work in AI development happens well below the surface. This includes:
- Data access controls that extend to every LLM query
- Cost instrumentation that lets you project per-query spend across your entire client base
- Testing harnesses built to evaluate non-deterministic outputs, including LLM-as-a-judge approaches that can catch regressions even when responses aren’t identical word for word
None of this is visible to the end user, but all of it determines whether a product holds up at scale or breaks the moment you move past a handful of beta testers.
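The LLM-as-a-judge approach above can be sketched as a harness that scores answers against a rubric instead of diffing exact strings, which is what makes it usable for non-deterministic outputs. Everything here is illustrative: `generate` and `judge` are stubs, and in a real harness the judge would be a separate LLM call with a scoring prompt.

```python
# Sketch of a regression harness for non-deterministic outputs. Instead of
# exact string matching, a "judge" scores each answer against a rubric and
# the harness flags anything that falls below a threshold.

def run_regression(cases, generate, judge, threshold=0.8):
    """Return the cases whose judged score falls below threshold."""
    failures = []
    for case in cases:
        answer = generate(case["question"])
        score = judge(case["question"], answer, case["rubric"])
        if score < threshold:
            failures.append((case["question"], score))
    return failures

# Stub components for illustration only.
def generate(question):
    return "NAV is reported quarterly with a one-quarter lag."

def judge(question, answer, rubric):
    # A real judge would prompt an LLM: "Does this answer satisfy the
    # rubric? Score 0-1." Keyword matching is a placeholder for that call.
    hits = sum(1 for keyword in rubric if keyword.lower() in answer.lower())
    return hits / len(rubric)

cases = [
    {"question": "How often is NAV reported?",
     "rubric": ["quarterly", "lag"]},
]
failures = run_regression(cases, generate, judge)
```

Because the judge scores meaning rather than exact wording, the same harness keeps working after a model swap, which is exactly when regressions are most likely.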
Security is table stakes. Governance is where it gets interesting.
Data privacy and model training opt-outs are effectively solved at this point, with every serious vendor handling them. The more pressing conversation now is what happens as AI moves into agentic territory.
This shift raises new questions:
- How do you audit an autonomous process running in the background?
- How do you regression test a system that makes its own decisions when the underlying model gets deprecated every few months?
Model deprecation cycles on Bedrock are a concrete example: earlier model versions are already being sunset, and the window between a model’s release and its retirement keeps shrinking. For teams building on top of these models, that accelerated pace introduces real governance risk.
Plan in short cycles
This theme kept coming up throughout the conversation. The pace of change in AI is extraordinary, and it makes long-range planning unreliable. A two-year roadmap built around today’s models could be irrelevant by this summer.
The practical response is to build modular systems that can absorb model changes without re-engineering the whole stack, favor short development cycles, and stay flexible on providers. At Bipsync, this means using microservices and lambdas specifically so AI components can be swapped out without impacting the core platform.
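One way to read “swap AI components without impacting the core platform” is a thin provider interface: application code depends on a small contract rather than a vendor SDK, so a deprecated model can be replaced behind it. This is a hedged sketch; the `Completion` protocol and the provider classes are invented for illustration and stand in for real SDK wrappers.

```python
# Sketch: isolating model providers behind a minimal interface so a
# deprecated model can be swapped without touching application code.

from typing import Protocol

class Completion(Protocol):
    def complete(self, prompt: str) -> str: ...

class ProviderA:
    # In practice this would wrap one vendor's SDK call.
    def complete(self, prompt: str) -> str:
        return f"[provider-a] {prompt}"

class ProviderB:
    # A second vendor, conforming to the same contract.
    def complete(self, prompt: str) -> str:
        return f"[provider-b] {prompt}"

def summarize(doc: str, model: Completion) -> str:
    # Application code depends only on the interface, never the vendor SDK.
    return model.complete(f"Summarize: {doc}")

print(summarize("Q3 letter", ProviderA()))
print(summarize("Q3 letter", ProviderB()))
```

Swapping providers then becomes a configuration change at the call site, which is the property that makes short planning cycles survivable.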
Watch the full conversation
There’s more in the full session, including a rapid-fire segment on AI fatigue, open source vs. closed source, regulation, and the most overhyped trend in AI right now.