Putting the API first
The Timeseries API handles 500 million calls a day. Let’s find out why they put their API first, how it benefits them - and how it could benefit you.
Imagine you wanted to call a friend to ask for help with a project you were working on. Instead of just pressing the call button on their contact card, you had to figure out which phone they had, who their cell phone provider was, and what technology that provider used.
Would you bother making the call if it wasn’t an important one? In a way, this was the situation for our time series data: time-stamped measurements coming from devices such as sensors in the field.
“Until now, everyone who wanted to access time series data from our industry had to maneuver a lot of different systems and software - depending on the company, operator or even the installation. A key part of making this work easier is the Timeseries API.”
Geir Arne Rødde
This isn’t just any old API with a few calls here and there - the Timeseries API handles more than 500 million calls daily. Yeah, you read that right!
That’s a staggering amount of information flowing both ways, which gives the team some challenges to face and solve. Let’s find out how they got it done.
What is time series data?
Think of it as VQT: value, quality and timestamp.
Value is the sensor data, quality determines how trustworthy the value is and timestamp gives us the time data was recorded.
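As a minimal sketch (the field names and example values here are invented for illustration, not the API's actual schema), a single VQT datapoint can be modelled like this:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class DataPoint:
    """One VQT sample: the measured value, a quality flag, and when it was taken."""
    value: float         # V: the sensor reading
    quality: int         # Q: how trustworthy the value is (encoding is illustrative)
    timestamp: datetime  # T: when the value was recorded (UTC)

# A hypothetical pressure reading from one sensor
point = DataPoint(value=87.4, quality=192,
                  timestamp=datetime(2022, 3, 1, 12, 0, tzinfo=timezone.utc))
print(point.value, point.quality)  # → 87.4 192
```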
Focusing on safety
The team started working in 2018, and Erik Aareskjold Idland was one of the main developers on the Timeseries API. While some might simply assign a bunch of people and tell them to “just get to work”, this team took another approach entirely.
“We sat down, thought about and planned everything we wanted to do - before getting down to business and quickly writing the code. I think that’s why we were able to succeed like we have.”
Erik Aareskjold Idland
Naturally, one of the first issues they had to solve was security - we don’t want everyone to be able to access all this data straight from our databases. What they did want, however, was to make sure the API was designed for external users right from the beginning.
“Our main strategy is to “close everything”, but then to open it up for those who need it, with Azure providing access and authorization. No one connects to our databases directly,” Erik says.
The use of Azure AD in Equinor also meant it was considerably easier to implement granular security, fine-tuning what level of access users would have.
“We can give access to data from individual sensors and tags if needed. When that was requested, we solved it almost instantly because of the way Azure authentication works and we had planned for this flexibility in our backend,” Erik explains.
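In practice, the pattern Erik describes means every client attaches an Azure AD bearer token to its requests, and the API applies authorization server-side. A hedged sketch of what a client request might look like - the base URL, path, and query parameter names below are invented placeholders, not the documented Timeseries API contract:

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_datapoints_request(token: str, ts_id: str, start: str, end: str) -> Request:
    """Build an authenticated GET for one timeseries' datapoints.

    The endpoint and parameter names are illustrative only; the real
    paths and scopes come from the API's own documentation.
    """
    base = "https://api.example.com/timeseries/v1"
    query = urlencode({"startTime": start, "endTime": end})
    req = Request(f"{base}/{ts_id}/data?{query}")
    # Azure AD issues the bearer token (e.g. via the client-credentials flow);
    # the API validates it and enforces per-tag access on the server side.
    req.add_header("Authorization", f"Bearer {token}")
    return req

req = build_datapoints_request("eyJ...", "tag-123",
                               "2022-01-01T00:00:00Z", "2022-01-02T00:00:00Z")
print(req.get_header("Authorization"))  # → Bearer eyJ...
```

No consumer ever sees a database connection string - only a URL and a token.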
Timeseries API at a glance
- Access to sensor data from all Equinor operated plants, onshore and offshore.
- Designed for both internal (Equinor) and external (vendors, partners) usage.
- Swagger / OpenAPI
- Technology agnostic, so that backend technology stack may be changed without affecting consumers.
- Security in focus, granular access control at different levels.
- Consumers can create and share new timeseries, so that new insight can be gained.
- Optimized for high performance both for ingest and egress.
- API specification initially developed as part of collaboration with Cognite
Staying flexible means staying fast
Flexibility is a key part of why they view their API as their main product - even though they maintain the backend streams and databases that feed the API with data.
“There’s no guarantee that the database we have today will run on the same technology in the future. Things are always changing, but because of the API we can change everything when we need and users won’t notice a thing,” Erik says.
If they let people connect directly, it would make them a little “stuck” if they want to start using a different database or technology, developer Jan Henrik Endsjø Høiland explains:
“Chances are that a different database will behave differently, have a different interface, and not support the same features. That could mean unnecessary downtime for users and extra work for us. But because we put the API first, we can handle the differences in the API and the users should hopefully not be able to tell the difference.”
Jan Henrik Endsjø Høiland
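The insulation Jan Henrik describes is essentially an adapter: the API layer depends on one stable contract, and each backend implements it. A minimal sketch, with invented class and method names:

```python
from typing import Protocol

class TimeseriesStore(Protocol):
    """The stable contract the API layer codes against."""
    def read(self, tag: str) -> list[float]: ...

class CurrentStore:
    """Today's backend technology."""
    def read(self, tag: str) -> list[float]:
        return [1.0, 2.0, 3.0]  # stand-in for a query against the current database

class ReplacementStore:
    """A future backend: different technology, same contract."""
    def read(self, tag: str) -> list[float]:
        return [1.0, 2.0, 3.0]  # stand-in for a query against a replacement database

def get_datapoints(store: TimeseriesStore, tag: str) -> list[float]:
    # The API handler only knows the contract, so swapping the backend
    # behind it never changes what consumers see.
    return store.read(tag)

assert get_datapoints(CurrentStore(), "tag-1") == get_datapoints(ReplacementStore(), "tag-1")
```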
Principles of API first
API First is one of our core API design principles and has two key elements:
- Define the API using a standard specification language before any line of code is written.
- Get feedback on the API definition from team members and client developers.
With this approach we can achieve:
- Evolving the API and learning about its usage efficiently - without having to write any code.
- Decoupling of API design and development. The API definition becomes a contract that teams can work on without having to wait for implementation to be completed. And implementation can be changed/replaced without impacting clients.
- Specifying APIs with a standard specification language facilitates the use of tools to generate documentation, mock code, automatic quality checks, API management tools, etc.
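As a concrete example of such a contract, here is a fragment of an OpenAPI definition that could be written and reviewed before any implementation exists. The path, parameters, and descriptions are illustrative only, not the actual Timeseries API specification:

```yaml
openapi: "3.0.3"
info:
  title: Example Timeseries API   # illustrative, not the real specification
  version: "1.0"
paths:
  /timeseries/{id}/data:
    get:
      summary: Read datapoints for one timeseries
      parameters:
        - name: id
          in: path
          required: true
          schema: { type: string }
        - name: startTime
          in: query
          schema: { type: string, format: date-time }
      responses:
        "200":
          description: A list of value/quality/timestamp datapoints
```

A fragment like this is enough for client developers to give feedback, and for tooling to generate documentation and mock servers, long before any backend code runs.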
A peek behind the curtain
Today, the database that gathers and contains all the data is an open source time series database.
Collecting the data is no small-scale operation. If we look behind the Timeseries curtain, there’s an exporter running that connects to more than 30 systems, gathering data through product-specific APIs for those source systems.
“The exporter continuously exports time series datapoints from different sources, such as process historians, into our ingestion pipeline, and the data then ends up in the database,” Jan Henrik says.
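The flow Jan Henrik describes - a continuous stream of datapoints grouped for the ingestion pipeline - can be sketched in miniature. The shapes below are purely illustrative; the real exporter speaks each source system's product-specific API:

```python
import itertools
from typing import Iterable, Iterator

def export_batches(datapoints: Iterable[tuple[str, float]],
                   batch_size: int) -> Iterator[list[tuple[str, float]]]:
    """Group a continuous stream of (tag, value) datapoints into fixed-size
    batches for hand-off to an ingestion pipeline."""
    it = iter(datapoints)
    # Pull up to batch_size items at a time until the stream is exhausted
    while batch := list(itertools.islice(it, batch_size)):
        yield batch

stream = [("tag-a", 1.0), ("tag-b", 2.0), ("tag-a", 3.0)]
print(list(export_batches(stream, batch_size=2)))
# → [[('tag-a', 1.0), ('tag-b', 2.0)], [('tag-a', 3.0)]]
```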
Have you ever clicked a link and it took forever to load the actual website? To put it simply, ain’t nobody got time for that.
That’s why optimizing throughput and keeping latency down - the delay from a sensor producing a value until it’s available to customers - is a priority.
“Since we don’t connect directly to the platforms, the more of a gap we can close the better. Optimizing for throughput and latency every step of the way, in the exporter, pipeline and API, opens up for use cases one couldn’t have if there was too much delay.”
Jan Henrik Endsjø Høiland
“It can be a challenge to get the best performance out of the database in every use case since we’re not the ones who develop the database itself,” he adds.
Making data accessible
The most recent addition to the Timeseries team is Runar Ask Johannessen. While he might be fresh out of school, he was thrown into the deep end of the pool right away - with more than capable lifeguards close by. Runar specialized in machine learning during his studies and enjoys seeing a different side of things.
“To me, it’s quite meaningful to be providing the building blocks needed for machine learning through building these pipelines and making the data accessible."
Runar Ask Johannessen
“While the data isn’t that big in size, it’s an incredible amount of data steadily coming from every sensor. The sheer amount adds complexity by itself and means you need to find new ways of solving problems,” he adds.
One example was when they had to change timeseries data stored in the Data Lake, an add-on to the API, from yearly files to monthly files.
“It sounds really simple, but it took a lot of time to split these files, since we have an incredible number of tags and sensors that provide data. I had to learn new technologies and use them on a much bigger scale than I’ve been used to,” Runar says.
With the main issue being the number of tags to handle and not the complexity of the processing itself, the selected solution was to create Spark jobs that would be run on a Kubernetes cluster. With this, the relatively simple job of reading, sorting and writing data for each tag could easily be scaled out and processed in parallel.
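The reshaping those Spark jobs performed per tag can be illustrated in miniature. This pure-Python sketch only shows the partitioning idea; the actual solution ran it as Spark jobs on a Kubernetes cluster, in parallel across the full set of tags:

```python
from collections import defaultdict
from datetime import datetime

def split_by_month(datapoints: list[tuple[datetime, float]]) -> dict[str, list[tuple[datetime, float]]]:
    """Partition one tag's yearly datapoints into per-month buckets,
    keyed like "2021-03"."""
    monthly: dict[str, list[tuple[datetime, float]]] = defaultdict(list)
    for ts, value in sorted(datapoints):  # order by timestamp, then bucket
        monthly[f"{ts.year}-{ts.month:02d}"].append((ts, value))
    return dict(monthly)

# One tag's (timestamp, value) pairs for a year, arriving unordered
year = [(datetime(2021, 3, 2), 1.5), (datetime(2021, 1, 9), 0.7), (datetime(2021, 3, 1), 2.2)]
buckets = split_by_month(year)
print(sorted(buckets))  # → ['2021-01', '2021-03']
```

Because each tag can be split independently, the work is embarrassingly parallel - which is exactly what makes it a good fit for scaling out with Spark.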
“I was able to learn some cloud-based setups and create these jobs in a way they could easily scale up and go into the world. Being able to take tech I’ve used a little bit before, learn more and then take it to a new level is really exciting."
Runar Ask Johannessen
Equinor’s API strategy
The goal of our API strategy is to deliver several operational and strategic benefits to Equinor:
- Increased efficiency in software development: By being able to reuse existing APIs providing data and processing capabilities, development teams can develop new applications faster.
- Increased agility in software architecture: By building our applications on top of APIs and applying microservice architecture principles we can create a more agile software architecture, making our software systems more adaptable to change.
- Revitalize legacy applications: By building modern APIs on top of legacy systems, we can extract more value by making their abilities broadly and easily available.
- Enabler for innovation: By combining data in new ways, we can potentially gain new insights and build new services that bring added value to the company.
- New business opportunities: By strengthening APIs we can help build better relations with new or existing partners, expand business areas or provide APIs where users pay for consumption.
Read more about our API strategy on Github.
How to get valuable feedback
So, your API is up and running and ready to provide your users with the data they need and want - time to pat yourself on the back for a job well done and head home? Far from it! What happens outside of writing the code is just as important in an API-first mindset, Steffan Sørenes explains:
“Good products are products that people can easily understand and easily use. For us, the API is the main product - not just a cog in the machine. That’s why we put a lot of effort into making it as easy to use as possible.”
Steffan Sørenes, Leading Advisor IT Architecture
Steffan is our leading advisor in IT architecture and a part of the Timeseries API team, where he’s a “part time salesman”, as the team puts it. Pitching the product and getting hold of “customers” isn’t a part of the job you should take lightly - just like you should give new users a proper welcome.
“We run a 30-minute onboarding session for every new client, where we walk them through documentation and how to use the API. But we also tell them to give us any feedback they may have, both what needs improving and what works well,” Steffan says.
“Their feedback together with constant monitoring and logging gives us insight into how our users actually use the API - or what they don’t use. It lets us be proactive in how we improve the product,” he adds.
Straight from the user's mouth
The perhaps biggest user of their API is Equinor’s own Integrated Operations Center (IOC), which monitors equipment daily and assists in longer-term support for production optimization and energy efficiency.
IOC developers Harald Kjøde and Jo Lyshoel explain that the Timeseries API makes their work a whole lot easier:
“We don’t have to deal with every single data source and can retrieve data from one standardized API. It greatly simplifies both accessing and handling historical time series data, which makes it easier for us to focus on creating value out of the data it provides.”
Harald Kjøde, Equinor Integrated Operations Center developer
“Not only can we easily add more data sources through their API, but it also makes collaborating with external partners much easier,” Jo Lyshoel adds.
One of our external partners is 4subsea. Their industrial IoT data storage and analytics platform 4insight® gathers data from sensors and other systems and adds analytics to create insight. 4insight includes data connectors configured to communicate with various external APIs, making it possible to adapt to different clients’ infrastructure.
By connecting the APIs, they can share data and insights, helping to empower Equinor’s engineers to make informed, critical decisions.
“When our partners' APIs provide easy access to data in a reliable and standardised way, it is easier for us to deliver a reliable service and quickly get set up to share data and insights. The Timeseries API is a great example of an innovative and future-oriented operator – and is how we envisage working with our clients.”
Marie Austenaa, Chief Commercial Officer, 4subsea
Enabling a future of putting data to use
But the work is far from over: the team is currently working on getting data from our wind farms into the service, as well as looking into ways of expanding and improving the API.
While Equinor’s strategy is to always think API first, the Timeseries developers believe this is something everyone should consider - especially if there’s a chance that more than one person or frontend could make use of the data. This approach makes it much easier to create value out of what you’re building, Jan Henrik explains.
“There’s plenty of times where I’ve come across data and looked for an API that would let me access it but didn’t find it - this means both we and others are missing out on potential added value from our data."
Jan Henrik Endsjø Høiland
So, whether your software project is big or small, APIs are the way to go - especially if you want to make life easier for your users.
With that, our little venture into the world of timeseries data and APIs is over. But make sure you subscribe to our Loop newsletter, and you’ll get a notification swooshing into your inbox as soon as we publish a new story.
Until next time, stay safe and take care!
Erik Aareskjold Idland
Nils-Helge G. Hegvik
Jan Henrik Endsjø Høiland
Geir Arne Rødde
Runar Ask Johannessen