March 30, 2017
By: Dan Langevin
Data Distribution – Use Cases for API vs Flat Files
**Ideon is the company formerly known as Vericred. Vericred began operating as Ideon on May 18, 2022.**
For the past 10-15 years, APIs have been considered the “modern” way for two software systems to interact. But an API isn’t the solution to every problem.
At Vericred, we provide large volumes of health and benefit data to partners for integration into their platform or their product, and when we were developing our integration points, we were faced with a decision: do we go API-only or do we support other methods of data transfer? Ultimately, we landed on a hybrid approach where we provide product feature-level access to our data via API, and platform integration via a set of flat files. This approach has proved to be flexible for us, and has allowed us to develop deep integrations with minimal friction and start-up costs for our customers while minimizing bloat within our codebase.
Is this the right approach for you? Below I offer four considerations to keep in mind when deciding how you’ll integrate your data.
1. Usage of the Data (Platform vs. Product Feature)
Is the data going to be used to build a product feature or to power a platform? This speaks to the flexibility that our customers will need when using the data. For example, our provider-network search data answers a few simple but commonly asked questions. The primary question is “Is Dr. X in-network for this plan?” This lends itself nicely to an API endpoint (and, in fact, we only offer this data via API). We would consider this a product feature: it solves a very specific need. While it’s an important part of the user experience, its functionality doesn’t bleed over into too many other user journeys in our customers’ apps.
Conversely, for many customers, our health plan data is core to their platform. It’s displayed in multiple user journeys throughout their apps. This lends itself well to a bulk data transfer process. Our customers would rather have this data in their own database.
2. Volume and Frequency of Update
How frequently is this data updated? The cost, in terms of development and operations time, of pulling a large data set into their database is considerable for our customers. If the data set is updated very frequently, and each update touches a large number of records, that cost is magnified.
In the previous example, our provider-network data has hundreds of millions of records and changes very frequently. We see churn as high as 8% per month in certain networks. The volume of the data set, combined with that rate of change, is an indicator that a flat file might not be an optimal solution.
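To get a feel for the scale involved, here is a rough back-of-the-envelope sketch. The record count of 200 million is an assumption chosen only to stand in for "hundreds of millions", and the function name is hypothetical; the 8% figure is the upper bound mentioned above.

```python
def monthly_changed_records(total_records: int, churn_percent: int) -> int:
    """Estimate how many records change in a month at a given churn rate.

    Integer arithmetic keeps the result exact for whole-number percentages.
    """
    return total_records * churn_percent // 100


# Assumed data set size (illustrative only) at the 8% monthly churn cited above.
changed = monthly_changed_records(200_000_000, 8)
print(changed)  # 16000000
```

Under those assumptions, a customer keeping a local copy in sync would have to absorb changes on the order of tens of millions of records each month.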
Conversely, plan data is updated once a year and rate data is updated once per year in the individual market and once per quarter in the small group market. While in practice the data changes quite a bit more than that due to corrections, new data becoming available, and other factors, the volume and frequency of updates are far lower than provider-network data. This makes it a candidate for the flat file approach.
3. Relationships Between Entities in the Data
One of the key design principles of a REST API is that it is entity-based. While this has the advantage of a predictable location for each entity (e.g., Plan 123 always lives at /plans/123), it has the disadvantage of making it a bit more difficult to string together many related entities.
In the above example, if Plan 123 happens to cost $X in zip code 12345 and $Y in zip code 23456, and it also happens to be available in 12345 and 23456, but not in 34567, the customer would need to make additional API requests to determine all of that information. When the object graph is fairly large and the customer needs to access the entire object graph to persist it to their database, flat files tend to be a better choice than API-only.
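The difference can be sketched roughly as follows. The flat-file layout, the column names, and the premium amounts are invented placeholders for illustration (they stand in for $X and $Y), and the request count assumes a hypothetical entity-based API that needs one request per plan plus one per plan and zip code pair. This is not a description of Vericred's actual files or endpoints.

```python
import csv
import io

# Hypothetical flat-file layout: one row per (plan, zip code) pair.
# Premium values are made-up placeholders.
FLAT_FILE = """plan_id,zip_code,available,monthly_premium
123,12345,true,310.00
123,23456,true,342.50
123,34567,false,
"""


def load_plan_graph(flat_file_text):
    """Build {plan_id: {zip_code: premium or None}} in a single pass over the file."""
    graph = {}
    for row in csv.DictReader(io.StringIO(flat_file_text)):
        # None marks a zip code where the plan is not available.
        premium = float(row["monthly_premium"]) if row["available"] == "true" else None
        graph.setdefault(row["plan_id"], {})[row["zip_code"]] = premium
    return graph


def api_requests_needed(graph):
    """Count requests under the assumed API: one per plan, plus one per plan/zip pair."""
    return sum(1 + len(zips) for zips in graph.values())


graph = load_plan_graph(FLAT_FILE)
print(graph["123"]["12345"])       # 310.0
print(api_requests_needed(graph))  # 4
```

With one plan and three zip codes the gap is small, but the request count grows with every plan and zip code combination, while the flat file is still read in a single pass.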
4. Format Requirements
Many of Vericred’s customers have vastly different schemata, and in order to reduce friction and increase adoption of our data platform, we made the decision to offer customized formats to those customers. An API is, ideally, a single consistent representation of a set of resources. Maintaining multiple formats or schemata in a single API is complex, and will often accrue technical debt within a codebase.
We made the decision to push this complexity out of the API and further down the chain. The “standard” set of flat files we offer is generated from our API directly, but any customizations are post-processes that operate on our standard set of files. This allows us to build out features in our core API while still meeting the needs of customers who have specific format requirements.
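As a minimal sketch of what such a post-process might look like, the snippet below reshapes a standard file into a customer-specific one by selecting and renaming columns. The column names, the sample row, the mapping, and the `customize` function are all hypothetical; the real customizations are not described in this post and may be considerably more involved.

```python
import csv
import io

# Hypothetical "standard" file; columns and values are illustrative only.
STANDARD_FILE = """plan_id,plan_name,carrier
123,Example Silver,Example Carrier
"""

# Hypothetical per-customer mapping: standard column -> customer column.
# Columns left out of the mapping are dropped from the output.
CUSTOMER_A_COLUMNS = {"plan_id": "PlanCode", "plan_name": "DisplayName"}


def customize(standard_text, column_map):
    """Rewrite a standard CSV into a customer format without touching the source data."""
    reader = csv.DictReader(io.StringIO(standard_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(column_map.values()), lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow({new: row[old] for old, new in column_map.items()})
    return out.getvalue()


custom = customize(STANDARD_FILE, CUSTOMER_A_COLUMNS)
print(custom)
```

Because the customer-specific logic lives entirely in the mapping and the post-process step, the standard file and the API that generates it stay unchanged.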
As a data services company, we’ve learned through working with customers over the past 2½ years that there isn’t a one-size-fits-all solution to data transfer. APIs are a great solution for many use cases, but they are not the only solution. There are several cases where transfer via flat files has proven very useful. It has allowed us to separate the general and client-specific pieces of our architecture, which gives us substantial flexibility in working with our customers.