Any modern RMS needs to allow users to establish relationships between items of content in the system. Whether it’s funds and their managers, companies and their employees, or even ESG engagements for prospective investments, the ability to describe how two entities are associated with one another is a critical part of the research process.
Bipsync’s relationship features have grown steadily over the last decade, beginning with simple tagging of entities against notes and then evolving into the ability to associate any entity with another. Recently we reviewed our implementation of these features and realized that if we were starting afresh we’d likely do things differently, as is often the case when appraising code that has grown organically over time. We saw opportunities to simplify our approach and create code that’s easier to understand and modify, and more performant too.
What came before
Prior to this change, relationships between entities in Bipsync could have been described as ‘situationships’, to use the vernacular — i.e. relationships that are not considered to be formal or established. This was because our highly dynamic schema had no explicit definition of relationships within it. Instead we used the field definitions for each entity to derive implicit relationships between them. This is best explained with an example.
Imagine two entity types: Fund Managers and Funds. A Fund Manager can be associated with many Funds, but a Fund can only be associated with one Manager — a classic “one to many” relationship. In order to establish this relationship in our database, we’d create a field definition for the Fund entity to manage the mapping. This field would be configured as a “lookup” type: an autocomplete text field which is able to suggest and only accept Fund Managers.
With this done, we have the ability to say, for a given Fund, what the associated Manager is. This relationship can be expressed when we display the Fund’s data, such as on a dashboard within the RMS, or on a report or in an email. This was our first step towards relationship management.
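To make the implicit nature of this arrangement concrete, here is a minimal Python sketch. The schema layout, the key names (`type`, `target`) and the `implicit_relationships` helper are assumptions made for illustration, not Bipsync’s actual schema format. The point is that nothing in the schema says "relationship": it has to be inferred by scanning field definitions for lookup fields.

```python
# Hypothetical field definitions for the Fund entity type.
fund_schema = {
    "entity": "Fund",
    "fields": [
        {"name": "Name", "type": "text"},
        # The lookup field is the only trace of the Fund -> Manager link.
        {"name": "Manager", "type": "lookup", "target": "Fund Manager"},
    ],
}


def implicit_relationships(schema):
    """Derive (source entity, field name, target entity) tuples from lookup fields."""
    return [
        (schema["entity"], field["name"], field["target"])
        for field in schema["fields"]
        if field["type"] == "lookup"
    ]


relationships = implicit_relationships(fund_schema)
print(relationships)
```

Because the relationship only exists as a by-product of a field definition, any code that needs to know how two entity types are connected has to repeat this kind of derivation at runtime.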
Building on the relationship
With a relationship established in the RMS, another of Bipsync’s features can come into play: data lookup fields. These fields essentially use lookups within the database (think SQL joins) to fetch data from related entities. There are a few different types of data lookup fields; some fetch specific properties, while others can perform more complex tasks, like the aggregation of values. Continuing our example from earlier, a common use case would be to display a value from a Manager on the related Fund’s dashboard, such as the city where the Manager is located or the date they were last contacted.
Data lookup fields are used extensively, and aren’t just confined to dashboards: they appear in grids, reports, exports, emails, API responses… pretty much anywhere you can view data via the platform. As they’ve become more prevalent in our clients’ configurations, we noticed that performance wasn’t always what we wanted it to be. On investigation, the problem was quickly apparent.
Each data lookup field’s value is determined via a database query. When fetching these values for display, our code simply iterated over the fields and resolved each one in turn.
It’s a straightforward algorithm, and it works. But it’s not efficient. It executes a single query for each field, which would make sense if each field was referencing data located in an unconnected collection in MongoDB. But in reality, we’re often fetching data from the same collection each time: if we have 10 lookup fields all referencing a property on a Manager, we should be fetching those 10 properties in one single query to the Manager’s collection — not 10 queries. It was apparent that a smarter approach would bring faster response times through reduced database load and less code execution, a significant improvement.
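The per-field pattern can be sketched roughly as follows. This is an illustrative Python sketch rather than Bipsync’s actual code: `FakeDb`, `LookupField` and `resolve_naive` are hypothetical names, and an in-memory dictionary stands in for MongoDB so that the number of queries can be counted.

```python
from dataclasses import dataclass


@dataclass
class LookupField:
    name: str        # label shown on the dashboard
    collection: str  # collection holding the related entity
    prop: str        # property to read from the related entity


class FakeDb:
    """In-memory stand-in for the database that counts queries."""

    def __init__(self, collections):
        self.collections = collections
        self.query_count = 0

    def find_one(self, collection, entity_id, props):
        self.query_count += 1
        doc = self.collections[collection][entity_id]
        return {p: doc.get(p) for p in props}


def resolve_naive(db, related_id, fields):
    """Resolve every lookup field with its own query."""
    values = {}
    for field in fields:
        doc = db.find_one(field.collection, related_id, [field.prop])
        values[field.name] = doc[field.prop]
    return values


db = FakeDb({"managers": {"m1": {"city": "London", "last_contacted": "2023-01-10"}}})
fields = [
    LookupField("Manager city", "managers", "city"),
    LookupField("Last contacted", "managers", "last_contacted"),
]
values = resolve_naive(db, "m1", fields)
print(values, db.query_count)
```

In this sketch both fields read from the same Manager document, yet the query counter still increases once per field.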
To take this forward, the first thing that needed to change was the way relationships were loosely defined in the system. We realized that by defining them as configuration within the database, rather than resolving them dynamically at runtime, we could leverage those definitions to not only make queries more performant, but also construct an object graph which could then be navigated by anyone consuming the data. This was straightforward to do: we added a relationship management UI to our schema editor.
We were then able to reference these relationships in the configuration of data lookup fields.
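Next, we used this configuration to improve those inefficient queries. The new algorithm groups the simpler property lookups by the relationship they reference and fetches each group in a single query, while the remaining field types are still processed individually.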
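As a rough illustration of the grouped approach, here is a minimal Python sketch. All names (`Relationship`, `LookupField`, `FakeDb`, `resolve_batched`, the `kind` values) are assumptions for the example, and the in-memory `FakeDb` again stands in for MongoDB; the real implementation may differ considerably.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    name: str               # e.g. "fund_manager"
    source_collection: str  # collection of the entity holding the reference
    source_key: str         # property on the source entity with the related id
    target_collection: str  # collection of the related entity


@dataclass
class LookupField:
    name: str
    relationship: Relationship
    prop: str = ""
    kind: str = "property"  # "property" fields are batched; others are not


class FakeDb:
    """In-memory stand-in for the database that counts queries."""

    def __init__(self, collections):
        self.collections = collections
        self.query_count = 0

    def find_one(self, collection, entity_id, props):
        self.query_count += 1
        doc = self.collections[collection][entity_id]
        return {p: doc.get(p) for p in props}

    def count(self, collection, key, value):
        self.query_count += 1
        return sum(1 for d in self.collections[collection].values() if d.get(key) == value)


def resolve_individually(db, entity, field):
    """Stand-in for a complex field type: counts entities sharing the same related id."""
    rel = field.relationship
    return db.count(rel.source_collection, rel.source_key, entity[rel.source_key])


def resolve_batched(db, entity, fields):
    values = {}
    grouped = defaultdict(list)
    for field in fields:
        if field.kind == "property":
            grouped[field.relationship].append(field)
        else:
            # Complex field types still need their own query.
            values[field.name] = resolve_individually(db, entity, field)
    # One query per relationship, fetching every requested property at once.
    for rel, group in grouped.items():
        props = sorted({f.prop for f in group})
        doc = db.find_one(rel.target_collection, entity[rel.source_key], props)
        for f in group:
            values[f.name] = doc[f.prop]
    return values


db = FakeDb({
    "managers": {"m1": {"city": "London", "last_contacted": "2023-01-10"}},
    "funds": {"f1": {"name": "Alpha", "manager_id": "m1"},
              "f2": {"name": "Beta", "manager_id": "m1"}},
})
rel = Relationship("fund_manager", "funds", "manager_id", "managers")
fields = [
    LookupField("Manager city", rel, "city"),
    LookupField("Last contacted", rel, "last_contacted"),
    LookupField("Funds under manager", rel, kind="count"),
]
values = resolve_batched(db, db.collections["funds"]["f1"], fields)
print(values, db.query_count)
```

Grouping by the relationship object is only possible because the relationship is now an explicit, named piece of configuration that several fields can share.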
This approach doesn’t totally eradicate the need to process some fields in isolation. Some data lookup field types need to be run independently because of their nature: often a complex aggregated query is involved. Even so, a large swathe of simpler fields now have their values calculated in one go, and these field types do tend to be more commonly used. Indeed, after switching one client to use this new formal relationship schema we reduced the number of queries required to load one of their dashboards by 97%, which gives some indication of how many of these related property field types they were previously using.
Making it public
Once we had this feature in place we were keen to employ it. As of January 2023, all new clients have been configured to use the new relationship-enhanced fields by default. To address existing clients’ installations we wrote a tool which can be run on demand to convert applicable fields from the old version to the new one. We have a rollout plan in place and expect to complete the upgrade of all remaining client installations within the year.
One of Bipsync’s strengths is its flexible schema. True to our ethos, these new relationship definitions are completely customizable from client to client. But by adding a dash of formality to our object graph we’ve improved performance substantially, and we expect further improvements in the near future as a result of this work.