AI in Investment Management
Blake: Alright, well, welcome everyone, and thank you for joining us today for this AI and investment discussion. I'm joined today by Craig Marvelley of Bipsync, and I want to thank Bipsync for inviting me to participate as a moderator. I enjoy long-form discussions like these; I don't think we have enough of them out there, and they can give better understanding and context for a lot of these issues. My name is Blake Fischer. I'm the Director of Technology for the University of Virginia's investment management company, where we manage the university's endowment investments. I joined the UVIMCO team about six years ago, and prior to that I ran my own solution design consulting practice. I've been working in various technology management roles for roughly the last twenty years. And I'll let Craig briefly introduce himself.

Craig: Hello, and thank you, Blake. Yes, I'm Craig Marvelley, the CTO of Bipsync. I joined Bipsync roughly twelve years ago, and since then I've worked on many different aspects of the platform: initially our mobile offering, then infrastructure, our RMS app, back-end systems, and most recently our AI tools. I'm really excited to talk with you today about AI in general, how we both see it developing at the moment, and where it could go in the future.

Blake: Great, I'm excited as well. Let's jump into it. Throughout building Bipsync AI, you've worked with a lot of institutional investors. When you look across the investment teams experimenting with AI today, what differences stand out between the teams where it's used in a genuinely useful way versus those where AI struggles to gain traction?

Craig: We've been lucky to work with numerous companies now, a lot of them in depth through the beta process.
Something that was very apparent during that process was that the teams whose data and workflows were already set up in a way that enabled them to take advantage of AI just naturally fell into it. The companies that weren't really set up to start with are the ones that took longer to gain traction, and they didn't always gain it. Ultimately, if the data isn't in the right format initially, or can't be gotten into the right format quickly, it's a case of garbage in, garbage out. We saw, for example, as we went through building our chatbot, that users expect a certain quality of result to come back. But to achieve that, you need to be able to make rational decisions about the data and provide it to an LLM in a format that it can understand. If the data isn't set up for that in the first place, if the metadata isn't there, if it isn't structured properly, that gets really hard.

Blake: That's interesting. We've experienced some of that as well. I'm curious about your thoughts on the role that data structure and workflow context play in whether investment teams end up trusting the outputs. Maybe we could double-click on that and get more of your perspective on the topic.

Craig: Yeah, absolutely. If you're an end user, say an analyst, and you're asking questions of your data, you're potentially looking back over ten-plus years of information. An LLM is trying to search through all of that and give you back a response, and a short response at that given the volume of data, in a very short space of time. Really, what you want is an answer that's accurate, but also an answer that you can verify easily.
I think it's important, therefore, that we surface not just the response from the system, but also the citations and the metadata that informed it, to give the end user comfort and confidence that it was sourced accurately and is a rational response at the end of the day. That goes back to what I was saying about teams that have made sure their processes are aligned so that metadata is in place, whether it comes from external systems or is part of the input mechanism for an end user, through some compliance step or similar. If that happens, it allows us to feed off it and provide it back to the end user. If it's not there to start with, the user is left trying to trust a response with hardly any information to verify it against, and that's where it gets hard.

Blake: That makes sense. What do you think is the least visible but most important work required to make AI reliable in those environments?

Craig: There are a few things. From our perspective, we put a lot of effort into the underlying technology to give ourselves confidence that we're building something reliable. That can be something as simple as making sure that the sharing model applied to all data, which dictates what a user can and can't see, is extended to the LLM, so no data comes back from or goes into the LLM that the user shouldn't be accessing. That's obviously very important, and building it in from the start was a massive thing for us, because if we can trust the data, we can pass that trust on to the end users too. We've also put a lot of time into making sure our testing infrastructure is adequate to cope with what is a very difficult thing to test.
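The sharing-model idea Craig describes can be pictured as a filter that runs before any retrieval reaches the model. The sketch below is hypothetical (the `Document` shape, `allowed_groups` field, and function names are illustrative, not Bipsync's actual schema), but it shows the principle: entitlement checks happen upstream of the LLM, so excluded content can never leak into a prompt.

```python
# Hypothetical sketch: enforce the user's sharing model *before* retrieval,
# so no document the user can't see is ever sent to the LLM.
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    allowed_groups: set = field(default_factory=set)  # groups entitled to read it


def visible_to(user_groups, documents):
    """Return only the documents whose ACL intersects the user's groups."""
    return [d for d in documents if d.allowed_groups & user_groups]


def build_llm_context(user_groups, documents, query):
    """Assemble an LLM prompt from permitted documents only."""
    docs = visible_to(user_groups, documents)
    sources = "\n".join(f"[{d.doc_id}] {d.text}" for d in docs)
    return f"Answer using only the sources below.\n{sources}\n\nQuestion: {query}"
```

The design point is that the permission check is not a prompt instruction ("don't reveal X") but a hard data boundary: content the user can't see never enters the context window at all.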
I'm sure you can appreciate that systems using LLMs are unpredictable; they'll give you a different response every time you try them. So actually verifying that the system is working accurately and giving the right responses is a complicated thing. On our end, we have very complex testing tools, often just as complex as the systems themselves, that do things like use an LLM as a judge to capture responses, compare them to previous ones, and verify that they're still accurate even if they aren't a one-for-one match. But we also make that information available to the end user so they can feed back to us: was this response good or not, and why not? We can then feed that back into the process, continue to refine the tools, and understand where they may be falling down in accuracy.

Blake: That has to be especially difficult given how broad your client base is. Double-clicking on something you alluded to with respect to architectural decisions and this idea of trust and hallucination: I'll put forward the notion that hallucination is maybe last year's problem. I'm not saying it's gone, but it has largely been resolved through various mechanisms of grounding, citations, and so on, which you mentioned. If there's a new problem this year, it's context loss, errors of omission: AI that answers a question authoritatively but gives you eighty-five percent, leaving out the fifteen percent of the substance that you may actually need or that is relevant. So the real question is, how do you implement AI that is secure, reliable, scalable, and embedded in investment workflows while minimizing that context loss, those errors of omission?

Craig: It's a tough problem.
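The LLM-as-a-judge testing pattern Craig mentions can be sketched in a few lines. This is a simplified illustration, not Bipsync's harness: `generate` stands in for the system under test and `judge` for a second model call that scores semantic equivalence against an approved baseline, so a reworded but correct answer still passes.

```python
# Hypothetical regression harness using an LLM as a judge. Instead of an
# exact string match, a judge scores whether a new answer is semantically
# equivalent to an approved baseline. `generate` and `judge` are injected,
# so they can be real model calls in production and stubs in tests.
def evaluate(cases, generate, judge, threshold=0.8):
    """Run each case; collect those the judge scores below the threshold."""
    failures = []
    for case in cases:
        answer = generate(case["question"])
        score = judge(case["question"], case["baseline"], answer)  # 0.0 - 1.0
        if score < threshold:
            failures.append({"question": case["question"], "score": score})
    return failures
```

In practice the judge's own reliability has to be checked too, which is part of why Craig notes that these test tools can grow as complex as the systems they verify.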
Over the last year, as you've said, the principles that have come to the fore have reached a point where they're pretty much commonplace. RAG, for example, is now a well-acknowledged solution for doing what we and many other companies are doing: providing search across people's content, harnessing an LLM while still providing that extra layer of domain-specific and up-to-date knowledge. Where do you go from there is the question, and as you said, context really is the key. Users ultimately want not just an answer to their question, but an answer that understands their reason for asking it in the first place. That's difficult when you're limited; you can't download an entire human's understanding of their role and their organization's approach into an LLM. So trying to provide that to the system at runtime is really the challenge now. Some of the inroads into that involve providing a glossary to the system so it better understands the domain and the nomenclature used in the organization. That's really important: a user can talk to the system using the words and terms they're familiar with, and it still understands what they're trying to express. Another challenge is understanding the outputs a user wants to see. If they ask for, say, an IC memo, the system should know what that looks like, the format they're looking for, and the types of data that are important in that context. I think that then helps to address the other issue you mentioned, which is that context does get lost a lot of the time. You're constantly dealing with a narrow window in which to fit everything, and that will change over time, undoubtedly.
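The layering Craig describes, a firm glossary plus retrieved content fitted into a bounded context window, can be sketched as a simple prompt-assembly step. Everything here is illustrative (the function name, the chunk fields, the character budget standing in for a token budget); it just shows the shape of the technique.

```python
# Hypothetical sketch: layer a firm glossary and retrieved chunks into one
# grounded prompt, trimming to a budget so the most relevant content wins.
def assemble_prompt(query, glossary, chunks, max_chars=4000):
    """Build a prompt from glossary terms plus as many chunks as fit."""
    terms = "\n".join(f"- {term}: {meaning}" for term, meaning in glossary.items())
    context, used = [], 0
    for chunk in chunks:  # chunks assumed pre-sorted, most relevant first
        if used + len(chunk["text"]) > max_chars:
            break  # budget exhausted; less relevant chunks are dropped
        context.append(f"[{chunk['source']}] {chunk['text']}")
        used += len(chunk["text"])
    return (
        "Firm glossary:\n" + terms
        + "\n\nContext (cite sources by id):\n" + "\n".join(context)
        + "\n\nQuestion: " + query
    )
```

Because the chunks are consumed in relevance order, whatever "slips through the cracks" when the budget runs out is, by construction, the material ranked least relevant, which is exactly the prioritization battle described above.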
There's always going to be a limit to it, and we're always going to fight that battle to some extent, because the wider that window gets, the more we're going to try to push into it anyway. So it becomes a question of focusing on the most important things: building layers into the system around who the user is, what the organization is, what they're trying to understand, what the most important aspects of their questions are, and what matters to them, so we can prioritize those things and make sure they get through first and foremost. Things do slip through the cracks; hopefully, it's the things that are less relevant to the end result. But it's a battle, and I don't think it's one we're going to solve in a short space of time.

Blake: It's almost like you have to develop for tomorrow's model capabilities today. When you design AI for investment teams, what architectural decisions do you think tend to matter most six or twelve months later, even if they're invisible at launch?

Craig: I mentioned sharing and provenance, making sure that users can only see what they should see. You don't want to find out six months in that people are being told things they weren't supposed to know. With some of the firms we work with, there are clear operational barriers that need to be put in place to prevent people from accessing knowledge they absolutely shouldn't have access to. So we bet heavily on that. Then there's cost analysis: making sure the systems we're using now are scalable in the long term as you roll features out. Something that works quite easily on a single developer's machine as they build it, making a few calls back and forth to the LLM service, is another matter at scale.
When you scale that up to hundreds of thousands of users, maybe across hundreds of companies, it can get really expensive. So you've got to put things in place early on to track exactly what you're spending and project that across the data you know you have access to. For example, we know, on average at least, how many documents and files our clients have. We can work backwards from there to determine, for a given query, how many of those documents might be referenced and what the likely cost of the query will be. That allows us to project our overall cost and usage of the LLM services. That was built in really early, and I don't think you can turn it on once you're in production only to find you're actually spending thousands and thousands of dollars more than you thought you would. I mentioned testing earlier, and I think that's still a really important one: getting the harnesses in place so you understand the outputs as well as you possibly can. Don't rely on your users finding those problems for you. This technology is so new and, from a user perspective, little understood compared to the paradigms we've had for the last twenty or thirty years in computing. You don't want users to lose faith in the technology early on because you haven't done the homework of making sure the responses are good. So a really key thing for us was making sure the testing is in place and we feel confident about the results.
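The working-backwards cost projection Craig outlines reduces to simple arithmetic: documents touched per query, tokens per document, and per-token prices roll up into a monthly estimate. The function below is a back-of-the-envelope sketch with made-up parameter names and numbers, not a real pricing model; actual per-1K-token prices vary by model and vendor.

```python
# Hypothetical back-of-the-envelope projection of monthly LLM spend,
# working backwards from corpus statistics as described above.
def projected_monthly_cost(users, queries_per_user_per_day,
                           docs_per_query, tokens_per_doc, output_tokens,
                           price_in_per_1k, price_out_per_1k,
                           working_days=22):
    """Estimate monthly spend from per-query document and token counts."""
    input_tokens = docs_per_query * tokens_per_doc
    per_query = (input_tokens / 1000) * price_in_per_1k \
              + (output_tokens / 1000) * price_out_per_1k
    return per_query * queries_per_user_per_day * users * working_days
```

Even a crude model like this makes the scaling risk visible early: cost grows linearly in users, query volume, and documents referenced per query, so a retrieval change that doubles `docs_per_query` doubles the bill.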
Blake: Each of these has implications for roadmap design, which gets into a question I discuss with a lot of peers, and one that everyone experiences to some extent: buy versus build. In your conversations with clients, how do investment teams approach buy-versus-build decisions around AI? What tends to influence those decisions most? And how do you decide what to build to support them, given that the pace of change is so rapid you could come up with a plan and find it outdated and in need of editing two weeks later?

Craig: I've been working at this company for a long time now, and this isn't a problem that's new to AI; it's a question all companies ask themselves, us included. I often see a software product and think, I could probably build that myself and save us ten dollars a month. But it isn't always wise to do it, as much as it's often an itch you'd love to scratch. The wise thing is to consider the time to deployment and what you lose in the time it would take you to build it yourself. The main case for building something yourself is when you have something unique, or a very clear vision for what you want, and you know there's not much out there that can do it, or for whatever reason you can't afford to buy it. That's unlikely, but there is a clear route to building something yourself, and it usually comes when you have a very clear vision and you think it's attainable fairly easily.
Typically, and again I go back to what we use here for development and all our other processes, the time to deployment is so much quicker when you're buying something off the shelf, especially when you lean into services used by lots of people. Take Jira, for example, for product management: I'm not a massive fan of the product, but it's so ubiquitous and has all the features you could ever possibly want that it's almost a no-brainer for us to use it. I think that's probably the case for a lot of our clients as well. There are enough products out there now, like ours, that a lot of the requirements are already covered, and in some cases you can build on top of those to provide the ones that aren't. So clients end up either going with a buy, or a buy with a build on top to provide that layer of proprietary functionality. That usually works, and it seems to be the way I've seen it pan out best. A lot of people who haven't gone down that route eventually do, when they realize the problem they're trying to solve is a lot harder than they imagined. That's very true for AI, because one of the fallacies I see with LLMs is that you can talk to them and get a response back so quickly that it looks like magic, and you assume the entire process is magic. But there's a lot of behind-the-scenes work, some of which I've mentioned already, like cost control, data access, and testing, that is complicated and needs to be there as a foundation. Otherwise the end result is never as accurate as you need it to be, and that's a bit of a hidden cost.

Blake: Shifting a little into security, governance, and risk. This AI wave, let's call it, has some parallels for me if you look back to the move to the cloud.
It's just happening at such a ridiculous pace compared to that transition. As with any shift in the underlying technology or infrastructure that underpins something, there are renewed concerns around security, risk, and governance, and I'm curious what concerns come up most often when teams are considering deploying AI inside core investment workflows. To name one: there are different usage modalities. We've all become comfortable with the interactive chatbot, which is safe enough if your data isn't training the model. But now there's this emergent agentic consumption modality, where the same kind of technology is used inside a loop that allows it to actually execute tasks, control a browser, and so on. What kinds of new security, governance, and risk concerns come up as a result of deploying agentic AI, and in general when you consider deploying AI inside these core investment workflows?

Craig: That's a big question. I'll preface it by saying that we don't currently offer an agentic solution, although it's one we're interested in and working towards. There's clear value in those tools. I use them every day in software development; much of the team, and I think a lot of developers in general, are starting to harness the capabilities of agentic coding solutions. Naturally, as we use them ourselves, we can't help picturing how this would apply in our product and for our users, and there's a very clear use case for it. But as you said, it does throw up questions about how to do it properly and safely. Security, I don't think, is too much of a concern, at least compared to what AI is doing already, provided you put the right guardrails in place.
You make sure you put the right barriers around it so the data doesn't go where it shouldn't, and, as you mentioned, you opt out of training and so on to make sure data doesn't stay in the LLM afterward. I think people have made peace with that and understand it, and the LLM vendors know they're not going to get anywhere if they don't allow people to opt out, and stay opted out. So that, for me, is a largely solved problem at this point. There are some nuances around safety and security, things like prompt injection. But much like SQL injection, I think that's a problem that will become natural to solve; I'd imagine that as frameworks continue to spring up around all this, they'll take care of it for you a lot of the time, and it will become almost impossible to get that bit wrong. So security, I think, is okay. Where these automated agents are probably going to pose a challenge is around governance; for example, the audit trail. How would you audit something that is doing work in the background, potentially when you're not even paying attention to it? You could be asleep while it's off working through your investments. I think we're a long way off that happening, but that's the kind of challenge I see: without a human in the loop, we have to be very careful about the operations these tools are performing. Perhaps some will be permanently off limits. But you could totally see somebody, and if it isn't happening already it probably is, in quant shops, wanting a tool that trades automatically for them.
There are lots of ways this technology could be incorporated into businesses like yours and our other clients', and some are going to be safer than others. Going back to what I was saying earlier about evaluation: building in audit trails so that people are always aware of what's happening and when, and making sure there are opportunities to break out of the loop and refer to human decision-making where necessary, is obviously a must. There are smaller things too, like being able to review how these tools operate when a model upgrade happens. What we're seeing at the moment is that model upgrades happen so often, and deprecations happen so quickly. Amazon released the latest versions of Claude to Bedrock over the last few months, and the previous models have already been deprecated; they're almost already gone. You're given very little time to turn things around. That's a challenge when you're building features on top of these models. Something predictable, like a summarization tool, is easy to regression test so you can confirm your functionality still behaves as you'd expect. But when you're building an agent that has carte blanche to make decisions, it's hard to regression test a system like that when you upgrade to a new model, and you might not find a problem until days or weeks afterwards, when you're maybe not even paying attention because you've already grown confident in it. So that's a very real risk: the pace of AI is going to challenge us, because it's going to be hard to keep an eye on these systems as they change while they're operating on our instructions.

Blake: Yes, and everything's moving from a deterministic to an inferential mode of operation.
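The audit-trail and human-in-the-loop safeguards Craig describes can be pictured as a gate that every proposed agent action passes through. The sketch below is hypothetical (the action names and `AUTO_APPROVED` allow-list are invented for illustration): everything is logged, and anything outside the allow-list is held for human approval instead of executing unattended.

```python
# Hypothetical sketch of an agent action gate: every proposed action is
# written to an audit trail, and only allow-listed actions run unattended;
# anything else is parked for human review.
import datetime

AUTO_APPROVED = {"read_document", "search", "summarize"}  # illustrative only


def gate(action, audit_log):
    """Log the proposed action; return True only if it may run unattended."""
    allowed = action["name"] in AUTO_APPROVED
    audit_log.append({
        "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "action": action["name"],
        "args": action.get("args", {}),
        "auto_approved": allowed,
    })
    return allowed
```

The point of logging before deciding is that the trail records what the agent attempted, not just what it was allowed to do, which is what an after-the-fact audit of overnight activity would need.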
I'm curious, there's a lot to be excited about. For me, this has to be the most exciting thing that's happened since the Internet; it seems like there's something new every week, almost every day. What are you most excited about looking at the year ahead in the context of AI?

Craig: This will sound like a really boring start to an answer, but I think increased governance around standards is a huge thing. Over the last year or so we've seen standards start to take shape, MCP for example, but I don't think they've yet evolved to the point where there's enough consensus to feel confident about how things will pan out in the long term. Going back to when the AI explosion really happened, a year or two ago: we had been looking at AI before that for quite a while, exploring models and seeing how we could leverage them in our product, and there just wasn't enough consensus and standardization around the process for us to be confident about it. Then OpenAI and ChatGPT happened, and Claude on the back of it, and it became viable for us to say, okay, this is something we can bet on for the near term; the next two to three years look solid, at least as far as using LLMs directly goes, information in, response back. I think the next phase will be: how do these things talk to each other? How do you integrate these systems into larger systems? That's really what everybody wants to do now. I've got this information over here, I've got that information over there; how do I get them talking to each other? Because the question I want to ask, and the answer I want back, doesn't just rely on having access to one pool of information. It requires everything.
You can help by putting that information in one place. You can do that with something like a data lake, or you can do it like Bipsync, where we tie lots of different systems together, or you can do it with a shared drive. But ultimately, for ease of use and easy connectivity, you want that to be a programmatic process. MCP is getting there, but I think lots of clients don't yet understand what it is they want. MCP in and of itself doesn't give you the answers; it just gives you the technology and a way to hook things up, and every vendor seems to be taking a different approach to how they want to manage it. I'd like to see that become a bit more condensed and realized, so we can really jump in on it, make use of it, and then pass that opportunity on to our clients. I think that will happen over the coming months; I'd be really surprised if we're not there by the end of the year, the way things are going. So I'd say that, and it's a bit of a boring answer. But I'm also really excited, as I mentioned earlier, about agentic coding. The gains we're able to make now in terms of turning around new features or addressing bugs are absolutely incredible, genuinely game-changing. I built my first program with these tools roughly a year ago; we had to build a component for our Toro service to do text extraction from files. It took maybe half a day to a day, and that was a year ago; it would probably be even quicker now. I needed to use Java because the library I wanted was only available in Java, and I hadn't really used Java in a number of years, so I was rusty to say the least. But within a day I had something working.
It was achievable without these tools, but it would've taken me at least twice or three times as long. We're seeing that gain replicated across everything we're doing now, and it really excites me about what we can do with the product. We've got such a long roadmap, a long backlog of things we want to build, and we can't do everything, obviously; we've got a lean team, which I'm happy with, I like working lean. This is the next level of that: being able to leverage what humans are good at and pass a lot of the heavy lifting on to machines could mean a massive evolution of our product. So that also thrills me, and I can't wait to see what we can do with it next.

Blake: Yes, and it's wild how they keep getting better and better; you think they're going to hit some sort of ceiling, and they just haven't yet. It's amazing.

Craig: Absolutely. Well said.

Blake: If you would indulge me, maybe we could do a quick rapid-fire question session. I'm going to shoot five questions at you, play a little bit of a game; respond in fifteen or twenty seconds max each, okay?

Craig: No promises.

Blake: Number one: AI fatigue. Real problem or just noise?

Craig: Partly a problem, I'd say. It's everywhere; it's constantly in your thoughts, at work and out of work. The nature of our roles is changing. As I mentioned, using AI to program now, I'm constantly reading outputs from machines, and that in itself is tiring. So yes, there's a bit of fatigue.

Blake: Will this wave of AI replace jobs or create new ones?

Craig: It'll have to create jobs; that's beyond a doubt. We need people to manage this stuff and to put guardrails around all the problems we haven't had to worry about before, and that's somebody's role to do. We may lose jobs.
Jobs that LLMs or AI-powered tools can handle easily might well change, but hopefully the productivity we gain will eventually lead to replacement positions. Fingers crossed.

Blake: Open source versus closed source: who wins?

Craig: I love open source, but right now I think closed source is well ahead in terms of functionality and applicability. In the long run, open source should catch up; there are enough players in this space that I think it's inevitable. And when it does, the opportunities for us to use that technology for very specific, targeted purposes are huge. That's where I think closed source will always struggle, because it's aimed at a very broad functionality set.

Blake: Regulation: too much, too little, or too early to tell?

Craig: We don't want lots of regulation; it depends on the use case, I suppose. If what you're doing is just passing on text, you don't want to have to jump through hoops and sign forms to extract dates from a document or whatever. But there are undeniably uses of this technology that worry me as a parent, things like deepfakes, that whole aspect of whether you can trust what you see on a screen. It worries me personally, but it also worries me for our children's future, because it's hard to understand the world if you can't trust it.

Blake: It's complicated. What is the most overhyped AI trend right now?

Craig: I'm going to say automated agents.

Blake: OpenClaw.

Craig: OpenClaw, yeah. Not that I don't have a lot of enthusiasm and respect for what's gone into making that possible. The chap who developed OpenClaw, Peter Steinberger, previously ran a company that did PDF extraction; we use it in our product, it's fantastic, and he's a very clever guy. I think that technology will absolutely go somewhere.
But right now, if you asked me to open up my emails and my banking application to an automated agent, I would run a mile. I don't think we're ready for that just yet; the risks of it going wrong far outweigh the benefits for me.

Blake: Somewhere between really interesting and very dangerous.

Craig: Absolutely. One day, yes, but right now I wouldn't touch it.

Blake: Are we moving too fast or not fast enough?

Craig: Not fast enough. Well, just going back to that last answer, I suppose: in those respects I think we're moving really quickly, maybe a little bit too quickly. I'm happy there are people out there on the frontier trying it and getting confident with it, because they push us forward, technology-wise, and we'll get there. Speaking purely from a software engineer's perspective, I think we're moving fast enough; what has happened in the last year has been incredible, and I think that's amazing. I haven't seen too many downsides apart from the fatigue we talked about. It can be difficult working with these tools day in, day out, because it just doesn't stop; I often find myself at ten o'clock at night jumping back on to see what's happened with the task I set off an hour before. But aside from that, I think things are proceeding at a good pace, and as long as we let regulation creep in where it needs to, and put processes in place to protect ourselves where it's not regulated, I think we're moving pretty well.

Blake: Well said. We'll leave it on that. Craig, thank you; it's been a pleasure to chat with you today, and with that, we'll close out.

Craig: Thank you very much, Blake. That was excellent. Really appreciate it. Thank you.
Every firm is exploring AI in some capacity, but only some are turning that exploration into measurable value. In this session, Craig Marvelley, Bipsync CTO, and Blake Fischer, Director of Technology at UVIMCO, sit down to unpack the driving forces behind AI implementation, adoption, and long‑term impact based on our experience working directly with institutional investment teams to deploy AI into research and investment workflows.
In this fireside chat, you’ll learn:
- Practical implementation considerations and the security fundamentals for institutional environments
- How to approach buy‑vs‑build decisions from both a technical and operational perspective
- Ways investment teams are navigating emerging challenges such as scale, context, and workflow integration
- The considerations that may matter most as the industry enters the next phase of technology innovation
Transform how you extract value from your research with Bipsync AI