How Sumo searches for potential
How can we release the true potential in petabytes of data? With a cloud and a search engine, of course!
The subsurface is a key player both in extracting oil and gas and in storing captured carbon. Superman isn’t currently an Equinor employee, so we don’t have anyone with x-ray vision and superpowers to tell us what secrets lie beneath the Earth’s surface. What we do have, however, is a whole lot of data to help us know more.
We produce this data by running advanced calculations in a system called Fast Model Update, FMU for short. These calculations produce vast amounts of data in the shape of model results. With one press of a button, our FMU users can generate hundreds of thousands of files.
Some of the model results are used in further work, but most of them end up on temporary disk space in folder structures until they're overwritten by new data. The sheer volume of data and its cumbersome structure have made it challenging to make more use of it. Now, that’s about to change – and a key part of the work is Sumo:
“Sumo is a tool that allows us to upload large numbers of data objects into the cloud and make the objects searchable via their metadata. This makes the data easier to find and navigate and provides a uniform mechanism for users and applications to access the data.”
Raymond Wiker, software developer
Raymond is part of the five-developer team building Sumo. While Sumo can be used for almost any type of data, we’re going to stick to the world of reservoir modelling for this leg of the journey.
Eager Loop readers may remember the story about ERT, the Ensemble Reservoir Tool, which is one of the tools used to run FMU workflows. Sumo’s product owner, Per Olav Eide Svendsen, explains that Sumo can do more than just make data available:
"Data availability coupled with cloud technology enables us to both use and manage these data in better ways. It can improve and change not just the way we work, but the way we think. An example of this is how Sumo can fuel services and how data can be shared with others in more effective ways."
Per Olav Eide Svendsen, Sumo product owner
Splitting to increase traffic
In the early days of Sumo, many different technologies and ways of solving problems were tried, but none were a perfect fit – until software developers came on board with an idea.
The first leg of the journey is enabling FMU to push data into Sumo automatically. As there are hundreds of thousands of data outputs from an FMU workflow, including manual steps is simply not on the table.
“When I first heard of what Sumo wanted to do, a search engine approach seemed like the obvious way of doing it. While they used to require massive resources before, now they’re packaged technology you can implement almost anywhere,” Raymond explains.
Together with Øistein Haaland, Raymond is one of the original Sumo software developers. If you’re going to search through petabytes of data, you need metadata to help the search engine find the results you're after.
"We started with a proof of concept where each FMU run would be a root object with corresponding output files and metadata. Then, we defined a system to create a search index and implement a REST API before we began feeding it with data.”
Øistein Haaland, software developer
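The proof-of-concept structure Øistein describes can be sketched roughly as follows. This is a minimal illustration, not Sumo's actual data model: the class names, the `field_name` and `data_type` metadata keys, and the in-memory inverted index are all assumptions standing in for the real search index.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """One output file from a run, described by its metadata."""
    object_id: str
    metadata: dict


@dataclass
class Run:
    """Root object for one FMU run; its output files hang off it as children."""
    run_id: str
    children: list = field(default_factory=list)


class MetadataIndex:
    """Tiny inverted index mapping (field, value) pairs to sets of object ids."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, run, obj):
        """Attach the object to its run and index every metadata field."""
        run.children.append(obj)
        self._postings[("run_id", run.run_id)].add(obj.object_id)
        for key, value in obj.metadata.items():
            self._postings[(key, value)].add(obj.object_id)

    def search(self, **criteria):
        """Return the ids of objects that match every field=value pair."""
        matches = [self._postings.get(pair, set()) for pair in criteria.items()]
        return set.intersection(*matches) if matches else set()


# Hypothetical run and metadata values, for illustration only.
index = MetadataIndex()
run = Run("run-001")
index.add(run, DataObject("obj-1", {"field_name": "EXAMPLE", "data_type": "surface"}))
index.add(run, DataObject("obj-2", {"field_name": "EXAMPLE", "data_type": "table"}))

print(index.search(run_id="run-001", data_type="surface"))  # {'obj-1'}
```

The point of the sketch is that nothing is found by walking folders: every lookup goes through the metadata, which is why each object needs metadata before it is uploaded.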
This simple and straightforward setup is still there, but naturally there have been incremental changes along the way. One of these changes is the number of virtual machine nodes handling the workload. Sumo started with one, but today there are three, Raymond explains:
“Sharing the volume between three nodes means we split the network traffic and can push more objects at the same time. We have seen peak rates of 80 binary objects and metadata uploaded per second, and this may have been limited by the output rate of the FMU run.”
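To make the idea of splitting traffic concrete, here is a small sketch that deals objects out to three nodes in turn and estimates how long an upload would take at the quoted peak rate. The article does not say how Sumo actually balances load, so the round-robin strategy, the node names, and the 200,000-object run size are assumptions.

```python
from itertools import cycle

NODES = ["node-a", "node-b", "node-c"]  # hypothetical names for the three nodes


def assign_round_robin(object_ids, nodes=NODES):
    """Deal objects out to nodes in turn so each carries a similar share."""
    plan = {node: [] for node in nodes}
    for object_id, node in zip(object_ids, cycle(nodes)):
        plan[node].append(object_id)
    return plan


def upload_minutes(object_count, objects_per_second=80):
    """Minutes needed at a steady rate; 80/s is the quoted peak, not a guarantee."""
    return object_count / objects_per_second / 60


plan = assign_round_robin([f"obj-{i}" for i in range(10)])
print({node: len(ids) for node, ids in plan.items()})
# {'node-a': 4, 'node-b': 3, 'node-c': 3}

# Assumed run size of 200,000 objects, sustained at the peak rate.
print(round(upload_minutes(200_000), 1))  # 41.7
```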
Creating new possibilities
Data is uploaded through Sumo together with its metadata, such as which field the data belongs to, but the two don’t end up in the same place. The data heads to Azure Blob Storage, and the metadata heads off to Elasticsearch, the team’s search engine of choice. Sumo has a dedicated, private link to Elasticsearch, which helps in several ways.
“This direct connection means we don’t have to route anything across nodes on the Internet and makes it a lot faster. It also means it’s more secure and not accessible for every random Internet user,” Raymond explains.
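The split between data and metadata can be sketched as below. The two classes are in-memory stand-ins, not Azure Blob Storage or Elasticsearch clients, and linking the two halves by a content hash is an assumption made for the example; the article does not say how Sumo generates object ids.

```python
import hashlib


class InMemoryBlobStore:
    """Stand-in for a blob store: maps an object id to raw bytes."""

    def __init__(self):
        self.blobs = {}

    def put(self, object_id, payload):
        self.blobs[object_id] = payload


class InMemoryMetadataIndex:
    """Stand-in for a search index: maps an object id to a metadata document."""

    def __init__(self):
        self.documents = {}

    def put(self, object_id, document):
        self.documents[object_id] = document


def upload(payload, metadata, blob_store, metadata_index):
    """Store bytes and metadata separately, linked by a shared object id."""
    object_id = hashlib.sha256(payload).hexdigest()  # assumption: content hash as id
    blob_store.put(object_id, payload)
    metadata_index.put(object_id, {**metadata, "size_bytes": len(payload)})
    return object_id


blob_store = InMemoryBlobStore()
metadata_index = InMemoryMetadataIndex()
object_id = upload(
    b"surface bytes",
    {"field_name": "EXAMPLE", "data_type": "surface"},  # hypothetical metadata
    blob_store,
    metadata_index,
)
print(metadata_index.documents[object_id]["size_bytes"])  # 13
```

A search only ever touches the small metadata documents; the shared id is then used to fetch the large binary object when it is actually needed.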
“Being able to search through and look at your data using Elasticsearch opens up a world of opportunities. You can find the data again more easily, filter through it with ease and see the totality of your data in a whole new way.”
Øistein Haaland, software developer
To search through the data, you can either use the Sumo frontend interface (a web application) or use the REST API directly. If you use Python, there is a library available that hides some of the complexity of the REST API.
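As a rough idea of what a metadata search can look like underneath, the sketch below builds a query body in Elasticsearch's Query DSL, using a `bool` query with `term` filters. How Sumo's REST API or its Python library exposes searching is not described here, so the helper name and the `field_name` and `data_type` fields are hypothetical, and no request is sent.

```python
import json


def build_metadata_query(filters, size=10):
    """Build a bool/filter query body with one exact-match term clause per field."""
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": [{"term": {name: value}} for name, value in filters.items()]
            }
        },
    }


# Hypothetical metadata fields; real field names depend on the metadata specification.
body = build_metadata_query({"field_name": "EXAMPLE", "data_type": "surface"})
print(json.dumps(body, indent=2))
```

Filters are used rather than scored matches because metadata lookups of this kind are yes/no questions: an object either belongs to the field and data type or it does not.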
“We’re mainly using Sumo for FMU workflows, but it's been intentionally configured to work with almost any other data. Sumo can show you upload frequency, how much data it is and the size of the data, depending on what you want to see,” Øistein explains.
Designing a dashboard
Ådne Aarthun Jacobsen joined the Sumo team in August of 2021 and has been working on the frontend, where improvements have been made and more are on the way. The frontend mainly runs on React components and consists of elements from the Equinor Design System (EDS).
The challenge of designing a frontend for Sumo is that it must be easy to use and understand, preferably without any training. Consulting with UX designers and a lot of trial and error lead the way to the best possible solution.
“We now have a new filtering and search component, and we’re planning to make a metrics dashboard to display various information and graphs. Experimenting with the different components and just trying different approaches has really helped piece it together, especially with help from a UX designer.”
Ådne Aarthun Jacobsen, software developer
He’s been designing in Figma, where you can sketch different designs and prototypes quickly. Testing them proved to be a bit of a challenge, as the app had no framework for running integration tests.
“I didn’t have much experience setting that up previously, but it was a nice challenge to learn from,” Ådne says.
Currently, data management for our FMU data happens through Unix terminals and command lines. With Sumo, an easy-to-use dashboard web interface meets the user – simplifying their work.
Calculating a distribution of surfaces
With all this data accessible through Sumo, it becomes possible to build services that make more use of it. One example is the aggregation service.
The service uses data uploaded to Sumo to run a wide range of calculations on reservoir model predictions in the form of surfaces, one of many data types generated by FMU.
"We have to think of this as a distribution of surfaces, similar to how we think of distributions of numbers,” Daniel Garip says.
This means describing that distribution in terms of statistical representations such as the mean, max/min, P10, P90 etc.
“Previously, these statistics have been calculated for all output as part of the model run. With model results available in Sumo, however, we can create a service that produces what we need, when we need it,” Daniel says.
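Treating surfaces as a distribution means computing the statistics cell by cell across all realizations. The sketch below does this in plain Python for surfaces stored as flat lists of equal length. It is an illustration only, not the aggregation service's code: the function names are made up, and it uses ordinary ascending percentiles, whereas conventions differ on whether P10 denotes the low or the high end of a distribution.

```python
import statistics


def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in 0..100) of an ascending list."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def aggregate_surfaces(surfaces):
    """Per-cell statistics across equally shaped surfaces given as flat lists."""
    result = {"mean": [], "min": [], "max": [], "p10": [], "p90": []}
    for cell_values in zip(*surfaces):  # one tuple of values per grid cell
        ordered = sorted(cell_values)
        result["mean"].append(statistics.fmean(ordered))
        result["min"].append(ordered[0])
        result["max"].append(ordered[-1])
        result["p10"].append(percentile(ordered, 10))
        result["p90"].append(percentile(ordered, 90))
    return result


# Eleven made-up realizations of a surface with two cells each.
surfaces = [[float(i), 2.0 * i] for i in range(11)]
stats = aggregate_surfaces(surfaces)
print(stats["mean"])  # [5.0, 10.0]
print(stats["p10"])   # [1.0, 2.0]
print(stats["p90"])   # [9.0, 18.0]
```

Because the inputs are fetched on request, a calculation like this can run only for the surfaces and statistics someone actually asks for, rather than for every output of every model run.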
Daniel has been working on the surface aggregation service together with Dafferianto Trinugroho, and building the aggregation service required a whole lot of testing, Daniel explains:
“I’ve tested the service in different environments like Radix and Azure, and it was quite the experiment to find the right balance between performance and costs when deciding where to host it,” Daniel says and adds:
“What’s been interesting about it is that the aggregation service really shows what Sumo can do and what new possibilities microservices can open up.”
Daniel Garip, software developer
After testing and experimenting, the choice fell on Radix, Equinor's own PaaS. Øistein had previously written an example of the aggregation service in Python, but as the team wanted to improve performance and minimize the resources used, Daniel and Dafferianto decided to experiment with Golang.
“Instead of having to copy data in memory, we can use the pointer feature in Golang, which points to the data instead of copying it and saves a lot of time,” Daniel says.
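The difference between copying data and referring to it can be shown with a small Python analogue. This is not the team's Go code: it uses a `memoryview`, which references an existing buffer much as a pointer or slice does, and the tiny buffer stands in for a large surface.

```python
surface = bytearray(range(8))      # pretend this is a large surface buffer

copied = bytes(surface[2:6])       # slicing a bytearray copies the selected bytes
view = memoryview(surface)[2:6]    # slicing a memoryview references the same memory

surface[2] = 255                   # change the original buffer

print(copied[0])  # 2: the copy still holds the old value
print(view[0])    # 255: the view sees the change because nothing was copied
```

The copy costs time and memory in proportion to the size of the data, while the view costs the same no matter how large the buffer is, which is the saving Daniel describes.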
Increasing speed by more than 50%
While Golang is a relatively new programming language, it has proven to be a solid choice for the aggregation service because it can run tasks concurrently without sacrificing speed, thanks to goroutines.
“We’ve really put an emphasis on maintaining a high speed for the aggregation service. Adding a step of data validation could add milliseconds, which could really slow down the result. It’s been a really interesting focus to have,” Daniel says.
During development, Daniel and Dafferianto would each work on different tasks, but they did code review as pair programming.
“Pair programming let us see the bottlenecks better. We managed to increase the speed of the calculations by more than 50% simply by doing it together. An extra set of eyes really helped in identifying issues or finding new ways to solve a problem.”
“It might seem like a waste of time and money to have two people looking at the same thing simultaneously, but it’s worth it,” Dafferianto adds.
Understanding the bigger picture
Understanding how reservoir models work, what metadata they need and what possibilities the data provide is no easy task. Luckily, the Sumo team has experts ready to answer their questions and help when needed. One of them is the product owner, Per Olav Eide Svendsen.
“Per Olav has done a great job in many areas. He created a specification for what metadata FMU needed to apply to the data, and he knows what the users need and want – and if he doesn’t know, he’ll find out. This helps ensure we won’t work on something that the users don’t want or need,” Raymond explains.
Developing Sumo has also meant learning a great deal about the world of reservoirs and the subsurface.
“We have to know more about the domain to get a better understanding of how Sumo can help its users. Learning more about how we produce oil and gas, how we create and use reservoir models and how we find new fields has been incredibly interesting."
Now, Sumo can enable data from FMU to be integrated into Equinor services like the Reservoir Experience Platform and Webviz. They both use Sumo to visualize different reservoir model results, but the team keeps close contact with several other services to create other use cases.
Rebuilding roads while driving
Work on Sumo also means that the FMU community is facing quite a change. Not only do they have to deal with the sheer amount of data, but they also need to include metadata and refer to master and reference data for every single data output. That is a significant undertaking, especially considering that FMU is a complex system in active use, every day, on more than 40 assets in Equinor.
"We are reconstructing the roads while a lot of people are driving on them, which is quite challenging,” says Jan C. Rivenæs.
Jan is heading the implementation of metadata functionality for FMU results and, together with his colleagues, is key to Sumo's success.
"The FMU community is both our primary customers, and our most important collaborators. This is not about building and delivering a component. It’s about collaboration and continuously improving so that we can utilize advances in technology and so that we remain competitive," Per Olav says.
Creating value for FMU is essential, and ensuring that FMU keeps its flexibility has been a key requirement for the development of Sumo.
“Currently, our main goal is to learn about the problem we are solving by getting Sumo into production on more assets. We’re also looking into finding other use cases where Sumo could provide uploading, storage and help others make more out of their data.”
Elsa Mäyrä Irgens
While only Superman truly knows what the future holds for Sumo, we do know that if you want to stay up to date with all things software development in Equinor, you should sign up for our newsletter below!
Then, we’ll let you know as soon as we share a new story about what our software developers are up to.
Until then, stay safe and take care!
Stay in the Loop
Ådne Aarthun Jacobsen
Elsa Mäyrä Irgens
Per Olav Eide Svendsen