APIs as Data Sources

Author: Ben Turner
Published on: 12th of November, 2020

One of the great things about the internet is the freedom of information. APIs represent a fantastic opportunity to leverage that freedom, so long as everyone is playing nicely, and things stay stable.

Back in 2018, during development of the website APIBlocks, I found that playing nicely wasn't actually that common, and that you can't rely on an API to stick around. Getting information out of a free public API was often blocked by account requirements, rate limiting, access credentials and novel access methods. Even with private or paid APIs, you still had the issue of changing endpoints and disappearing providers.

This is fine for a project that uses one or two APIs. When an endpoint changes, you're likely already on top of it and the transition goes smoothly. But when your value proposition is to pull arbitrary information from dozens of APIs daily, that fragility quickly becomes unworkable.

What I set out to build with APIBlocks was a site that let a relatively savvy user build a dashboard of any information they had access to via the web, without the use of scrapers or the need to write any code. Perhaps a user needed a quick overview of the weather, their stock tickers, or some currency exchange rates. They could easily build that with the site.

For APIBlocks I used JSON Path, a way to select and store a "pointer" to an arbitrary node in a JSON result. The long-term goal was to allow users to build up pipelines of data that could be manipulated and then shown in different display blocks on the dashboard, like graphs or live tickers.
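To make the "pointer" idea concrete, here is a minimal sketch in Python. It is not the APIBlocks implementation and it is not full JSON Path: it assumes a simplified subset with only dotted keys and numeric indexes (for example `$.rates[0].value`). The function names `parse_path` and `resolve`, and the sample exchange-rate document, are made up for illustration.

```python
import re
from typing import Any, List, Union

# Matches either a dotted key (.name) or a bracketed list index ([0]).
# Assumption: this is a simplified subset of JSON Path, not the full syntax.
_TOKEN = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def parse_path(path: str) -> List[Union[str, int]]:
    """Split a simplified path such as '$.rates[0].value' into steps."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    steps: List[Union[str, int]] = []
    pos = 1  # skip the leading '$'
    while pos < len(path):
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported syntax at position {pos}")
        key, index = match.groups()
        steps.append(key if key is not None else int(index))
        pos = match.end()
    return steps


def resolve(document: Any, path: str) -> Any:
    """Follow a stored pointer into a decoded JSON document.

    Returns None when the node cannot be found, which is what happens
    when a provider changes the shape of its response.
    """
    node = document
    for step in parse_path(path):
        if isinstance(step, str) and isinstance(node, dict) and step in node:
            node = node[step]
        elif isinstance(step, int) and isinstance(node, list) and step < len(node):
            node = node[step]
        else:
            return None
    return node


# Hypothetical API response, invented for this example.
sample = {"rates": [{"currency": "EUR", "value": 0.85}]}
print(resolve(sample, "$.rates[0].value"))  # 0.85
```

The appeal of storing the path as a plain string is that it can be saved alongside the endpoint URL and replayed on every refresh. The weakness is the same: if the provider wraps the response in a new outer object or renames a key, the stored pointer silently resolves to nothing.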

Although the site is still up, and I still believe in the idea, I do feel that the nature of the web makes a project like this unmaintainable in the long term. Even if you offload the work of managing the API endpoints to users, their dashboards would break constantly as endpoints disappear or change, and that's just not a good experience. Even the demo dashboard I maintained broke every month or so, and it used just four APIs.

The finding of this adventure seems to be that my wish to use the public web as a stable set of data providers just didn't line up with reality. Even if the providers are there, you simply can't rely on that many of them continuing to work.