Open Petroleum Data Lake
For as long as the oil industry has used computers, its approach to data management has typically involved importing and storing data in a proprietary system whose punitive contract termination clauses make it very difficult to move from one system to the next. In many cases, oil and gas companies feel held to ransom if they want their data back at the end of a contract, because the data is often stored in an internal format that only the provider can unravel. If oil companies want their data back or want to move to a new system, the cost of retrieving the data in a usable format is often prohibitive.
If you supply a great product or service, and take care of your customers, then you should never fear customers moving. If customers do decide to move, then either you are not doing your job well enough, or someone has out-innovated you. By always keeping the client's wants in mind, you should be able to stay on that innovation curve and address your client's needs.
An Open Petroleum Data Lake (OPDL) is an enterprise storage platform that leverages the scalability and cost-effectiveness of the public cloud, without the poor access, slow speed and lack of control that comparable private cloud solutions offer. The objective of the data lake is to consolidate and preserve oil and gas data sets (for example, seismic, well or production data) along the full length of the data lifecycle, so that data can be consumed strategically by applying suitable structural and permission definitions, and transformations as needed, within the cloud itself.
The Tape Ark Petroleum Data Lake helps consolidate data both from the past and from tomorrow's new data acquisition projects. It contrasts with a typical data warehouse by standardising the schema definitions and allowing data to be transported and transacted upon in an open system. The consolidation and "opening-up" of storage as a service enables businesses to define and execute any data insight strategy they require (i.e. descriptive, diagnostic, predictive or prescriptive), knowing that data can be put to effective use when required, without actually moving the data from its current location or from one system to the next.
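The idea of using data in place, governed by structural and permission definitions, can be illustrated with a minimal sketch. This is not the Tape Ark implementation: the `DatasetRecord` class, its field names and the `lake://` location are all hypothetical, chosen only to show that sharing can mean adding a permission to a single stored copy rather than duplicating the data.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """One catalogue entry. All field names are hypothetical."""
    dataset_id: str
    data_type: str        # e.g. "seismic", "well" or "production"
    file_format: str      # an open format such as "SEG-Y"
    location: str         # where the single stored copy lives
    readers: set[str] = field(default_factory=set)

    def grant_read(self, party: str) -> None:
        # Sharing adds a permission; the stored data is not copied or moved.
        self.readers.add(party)

    def can_read(self, party: str) -> bool:
        return party in self.readers


record = DatasetRecord(
    dataset_id="survey-001",
    data_type="seismic",
    file_format="SEG-Y",
    location="lake://example/survey-001",  # placeholder, not a real URI scheme
    readers={"Oil Company A"},
)

record.grant_read("JV Partner B")
print(record.can_read("JV Partner B"))  # True
print(record.location)                  # unchanged by the share
```

In this sketch the record's location never changes when access is granted, which mirrors the claim that data can be put to use without moving it from one system to the next.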
The Tape Ark Petroleum Data Lake
The Tape Ark Petroleum Data Lake is a common-use, open-standard data management system for the oil and gas industry that allows data to come and go as needed by the client. There are no proprietary formats and no restrictions on incoming or outgoing data. At any time the client can choose to pick up and move their data, or keep it in place and attach a new front end to the data lake, without needing to move or reformat the data.
Typical Work Flow Comparison
The data lake allows oil companies to use cloud-based automation tool sets to access the repository and short-circuit the usual workloads of a non-cloud-based model.
As an example:
Oil Company A places a 20-year-old legacy seismic data set into an offsite vault. Some time later they decide to do a reprocessing job on that survey.
The typical work flow for that would be:
- Order and pick the tapes from offsite storage
- Data is then copied to higher density tapes and reformatted to SEG-Y for reprocessing
- Tapes are then shipped to a processing house with navigation and obs logs
- Tapes are then read into the processing house's disk based system for processing
- Output products from the reprocessing are created and written to tape and sent to Oil Company A
- Final products are written to tape and passed to Oil Company A
- Oil Company A reads all of the tapes to internal disk and loads the final output products into their internal systems for interpretation
- Once complete, Oil Company A writes the project data to tape and puts the tapes back into offsite storage along with all of the intermediate data sets, final products and original tapes of the field data
- If there are joint venture partners, the data has to be replicated on tapes and shipped to each JV partner that has an interest – creating multiple copies of the same thing for each party.
In this example, 6 sets of data are written to tape, and if any of this data needs to be used again, the tapes have to be read again. Every few years, tapes need to be migrated to new tapes at great expense, and the client is in a never-ending loop of moving data from platform to platform – and this is all replicated for each JV partner in the acreage.
Using the Tape Ark Petroleum Data Lake:
The 20-year-old data is transferred into a cloud-based system; Tape Ark will do this for free. The work flow is then dramatically different and far more cost-effective than in the example above.
- Select data to be reprocessed off the GIS map attached to your lake
- Share that data (via cloud – no tapes created) with the reprocessing company with a few clicks
- Reprocessing company can use map data for navigation and load the data direct into processing systems on the cloud
- Output data sets from all stages of processing can be sent back to the data repository via the cloud for automated entry onto the map, and into deep archive once complete (no tapes created)
- Data can be loaded into a cloud based interpretation system without making copies
- Sharing of data with JV partners can be done via the cloud. No costs, just share a link to the files and the map service. No tapes created
- The data is stored in standard file formats, with no internal formats that prevent the data from being moved. Clients can move anything they wish, any time they like, or attach their own front end if they wish
- This data, along with gathers and intermediate processing products, remains accessible via the cloud archive for future processing. All available within hours instead of weeks or even months for some data
- The Tape Ark Petroleum Data Lake becomes the final resting place for data. You never need to make more tapes or high-grade media. Data can be moved between low cost latent storage and higher cost active storage as required. Expiry and deletion of data can be automated as required. Sharing is a click of a button, and as the lake grows, we also expect governments to eventually accept data via cloud transfer.
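The movement of data between low cost latent storage and higher cost active storage, together with automated expiry, can be sketched as a simple rule. This is an illustration under stated assumptions, not a description of how the Tape Ark service decides: the `choose_action` function, the 90-day `ACTIVE_WINDOW` and the three action names are all hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical threshold, for illustration only.
ACTIVE_WINDOW = timedelta(days=90)


def choose_action(last_accessed: date, expires_on: Optional[date], today: date) -> str:
    """Return "delete", "active" or "archive" for one stored object."""
    if expires_on is not None and today >= expires_on:
        return "delete"   # automated expiry
    if today - last_accessed <= ACTIVE_WINDOW:
        return "active"   # recently used: keep on higher cost active storage
    return "archive"      # idle: move to low cost latent storage


today = date(2024, 6, 1)
print(choose_action(date(2024, 5, 20), None, today))              # active
print(choose_action(date(2021, 1, 10), None, today))              # archive
print(choose_action(date(2021, 1, 10), date(2024, 1, 1), today))  # delete
```

A rule of this kind would typically be run on a schedule against the catalogue, so that tiering and expiry need no manual tape handling.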
Perhaps one of the best parts of the Tape Ark Petroleum Data Lake is that it can be introduced at the data acquisition stage of exploration. Seismic acquisition companies can write direct to your account, meaning the data in your lake comes direct from the boat or field crew. No more tapes, no more waiting for couriers or helicopters to ship your data to you. The data is accessible and ready for use within hours, instead of months.