If you've tried to perform nuanced analytics on data, you're familiar with spending 90% of your time getting the original material into a format that makes sense. Sadly, you can't Google '10 years of well-by-well oil production' and just dive right in.
Now, though, we're proud to offer clean, clearly structured curations of a wide array of datasets at Enigma. Thanks to these curations, our lead time to answer 'Is the North Dakota oil boom over?' was cut dramatically: the data resides in a single, publicly explorable table containing millions of rows on the country's oil and gas.
Getting our data to be this clean was not a simple task.
States and regions offer open oil and gas data in dozens of different formats, from Microsoft spreadsheets and HTML tables to ASP.NET servers and decades-old COBOL databases. Our first step was writing regularly updating parsers for oil and gas data from 12 different states (and counting). Then we developed a common schema for these datasets and mapped every well’s oil and gas production (and its water usage) to every month for which it has recorded data, going back decades. We also included geocoded latitudes and longitudes for over 90% of the wells.
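To make the idea of a common schema concrete, here is a minimal sketch in Python. It is an illustration only: the field names, the two input layouts, and the labels "A" and "B" are hypothetical stand-ins, not Enigma's actual schema or any state's real format. It shows two differently shaped source rows being mapped onto one record per well per month.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MonthlyProduction:
    """One well, one month. All field names here are assumptions."""
    state: str
    well_id: str
    month: str                      # normalized to "YYYY-MM"
    oil_bbl: Optional[float]
    gas_mcf: Optional[float]
    water_bbl: Optional[float]
    latitude: Optional[float] = None
    longitude: Optional[float] = None


def _num(value):
    """Parse a number that may be blank, None, or use thousands separators."""
    if value is None:
        return None
    text = str(value).replace(",", "").strip()
    return float(text) if text else None


def from_state_a(row):
    """Hypothetical layout A: dates as 'MM/YYYY', numbers as strings."""
    month, year = row["ReportDate"].split("/")
    return MonthlyProduction(
        state="A",
        well_id=row["WellID"],
        month=f"{year}-{int(month):02d}",
        oil_bbl=_num(row["Oil"]),
        gas_mcf=_num(row["Gas"]),
        water_bbl=_num(row["Wtr"]),
    )


def from_state_b(row):
    """Hypothetical layout B: ISO dates, different column names."""
    return MonthlyProduction(
        state="B",
        well_id=row["well_no"],
        month=row["period"][:7],    # "2014-10-01" -> "2014-10"
        oil_bbl=_num(row["oil_bbls"]),
        gas_mcf=_num(row["gas_mcf"]),
        water_bbl=_num(row["water_bbls"]),
    )


# Made-up sample rows, one per layout.
a = from_state_a({"WellID": "A-001", "ReportDate": "10/2014",
                  "Oil": "1,250", "Gas": "900", "Wtr": ""})
b = from_state_b({"well_no": "B-77", "period": "2014-10-01",
                  "oil_bbls": 310, "gas_mcf": None, "water_bbls": "42"})
print(a)
print(b)
```

Missing values are kept as `None` rather than zero in this sketch, so a well that did not report is not confused with a well that produced nothing.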
With this work behind us, it’s now almost trivial to query for questions like ‘How many barrels did McKenzie County, North Dakota, pull out of the ground last October?’ or ‘Where exactly are all of the horizontal wells in Pennsylvania (and how close are they to water sources)?’ or even ‘How much water are Southern California’s oil derricks using every summer?’
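As a rough sketch of what the first of those questions could look like once everything sits in one table, the Python below runs a sum over an in-memory SQLite table. The table name `production`, its columns, the sample rows, and the month used are all made up for illustration; the real table's layout may differ.

```python
import sqlite3


def county_oil_total(conn, state, county, month):
    """Sum oil (in barrels) for one county in one month; 0 if no rows match."""
    row = conn.execute(
        "SELECT COALESCE(SUM(oil_bbl), 0) FROM production "
        "WHERE state = ? AND county = ? AND month = ?",
        (state, county, month),
    ).fetchone()
    return row[0]


# Hypothetical table and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE production ("
    "state TEXT, county TEXT, well_id TEXT, month TEXT, oil_bbl REAL)"
)
conn.executemany(
    "INSERT INTO production VALUES (?, ?, ?, ?, ?)",
    [
        ("ND", "McKenzie", "W1", "2014-10", 1200.0),
        ("ND", "McKenzie", "W2", "2014-10", 800.0),
        ("ND", "McKenzie", "W1", "2014-09", 1100.0),
        ("ND", "Williams", "W3", "2014-10", 500.0),
    ],
)

total = county_oil_total(conn, "ND", "McKenzie", "2014-10")
print(total)
```

Because every well-month is a row with the same columns, the other questions become variations on the same query with a different filter or a different summed column.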
We chose to address the story of North Dakota’s oil boom, a subject that has been given plenty of careful thought and reportage, in our own precise way.
There have been doomsday warning signs about the North Dakota oil boom and bust cycle for basically as long as there’s been a North Dakota oil boom. With production data for each well, you don’t need to speculate; you can simply ask literal questions: Are North Dakota fracking wells still pulling the same amounts from the earth as five years ago? Are they pulling more? Is the unemployment rate still among the lowest in the country? Are wells still making the same amount of money?
The scale here is substantial: ten years of monthly data on 15,696 individual oil wells. This is only a subset of the total North Dakota data, but it still amounts to nearly a million rows. And the writing on the wall is very clear: financially, the bust has arrived. Profits are plummeting and unemployment is up, all while companies pull more oil from the ground than ever.
We invite you to visit the microsite around this project and to ‘dive’ into the data. We include not only an overview of ten years of estimated profit and production data, but also the ability to click on a well to view its individual history over the same period. And of course, all the underlying data is available for you to draw your own conclusions.