Prepare FMU workflow to produce rich metadata
In order to start using fmu-dataio and produce valid metadata for FMU output, some preparations are necessary to your workflow. Expected time consumption is less than an hour.
You will do the following:
- Find and enter some key model metadata into global_variables.yml
- Include an ERT workflow for establishing case metadata
- Include one script for data export
You may also find it helpful to look at the Drogon tutorial project for this. This is in the category “really easy when you know how to do it” so don’t hesitate to ask for help!
Insert key model metadata into global_variables.yml
In fmuconfig/input/, create _masterdata.yml. The content of this file shall be
references to master data. We get our master data from SMDA, so you need to do some
lookups there to find your masterdata references. In the FMU metadata, we currently use
4 master data entries: country, discovery, field, coordinate_system and stratigraphic_column.
Note
Master data are “data about the business entities that provide context for business transactions” (Wikipedia). In other words: Definitions of things that we refer to across processes and entities. E.g. if two software, in two parts of the company, are referring to the same thing, we need to agree on definitions of that specific thing and we need to record those definitions in a way that makes us certain that we are, in fact, referring to the same thing. An example is the country of Norway. Simply saying “Norway” is not enough. We can also refer to Norway as “Norge”, “Noreg”, “Norga”, “Vuodna” or “Nöörje”. So, we define a universally unique identifier for the entity of Norway, and we refer to this instead. And all that various names are properties on this commonly defined entity. These definitions, we store as master data because no single applications shall own this definition.
This is the content of _masterdata.yml from the Drogon example. Adjust to your needs:
smda:
country:
- identifier: Norway
uuid: ad214d85-8a1d-19da-e053-c918a4889309
discovery:
- short_identifier: DROGON
uuid: ad214d85-8a1d-19da-e053-c918a4889309
field:
- identifier: DROGON
uuid: 00000000-0000-0000-0000-000000000000
coordinate_system:
identifier: ST_WGS84_UTM37N_P32637
uuid: ad214d85-dac7-19da-e053-c918a4889309
stratigraphic_column:
identifier: DROGON_HAS_NO_STRATCOLUMN
uuid: 00000000-0000-0000-0000-000000000000
Note that country, discovery and field are lists. Most of us will only need one
entry in the list, but in some cases, more will be required. E.g. if a model is covering
more than one field, or more than one country.
To find the unique identifiers, go to https://smda.equinor.com/ -> viewer. You can usually
use the identifier (often just the name) to identify the correct entity.
Next, establish _access.yml. In this file, you will enter some information related
to how FMU results from your workflow are to be governed when it comes to access rights.
Example from Drogon:
asset:
name: Drogon
ssdl:
access_level: internal
rep_include: true
Under asset.name you will put the name of your asset. This is only relevant if you plan
to upload data to Sumo, and in that case, you will be told by the Sumo team what asset
should be.
Note
“I cannot find asset in SMDA, and why does asset not have a unique ID”?
Currently, “asset” is not covered by master data/SMDA. However, it is a vitally important piece of information that governs both ownership and access to data when stored in the cloud. Often, asset is identical to “field” but not always.
Under ssdl, you will enter some defaults regarding data sharing with the Subsurface Data Lake.
In the Drogon example, data are by default available to SSDL, but you may want to do differently.
Note that you can override this default setting at any point when exporting data, and also note that no data will be lifted to the lake without explicit action by the data owner.
Finally, establish _stratigraphy.yml. This is a bit more heavy, and relates to the
stratigraphic_column referred to earlier. In short, when applicable, stratigraphic intervals
used in the model setup must be mapped to their respective references in the stratigraphic column.
We do this, by listing the names used inside the model (usually the names reflected in RMS).
_stratigraphy.yml contains a dictionary per stratigraphic entity in the model, with keys
reflecting different properties of that stratigraphic level.
The key of each entry is identical to the name used in RMS. There are two required
values: name (the official name as listed in the stratigraphic column) and
stratigraphic (True if stratigraphic level is listed in the stratigraphic columns, False if not).
In example below, observe that “TopVolantis” is a home-made name for VOLANTIS GP. Top
and is in the stratigraphic column, while “Seabed” is not.
In addition, you may want to use some of the optional values: - alias is a list of known aliases for this stratigraphic entity. - stratigraphic_alias is a list of valid stratigraphic aliases for this entry, e.g. when a specific horizon is the top of both a formation and a group, or similar.
From the Drogon tutorial:
# HORIZONS
Seabed:
stratigraphic: False
name: Seabed
TopVolantis:
stratigraphic: True
name: VOLANTIS GP. Top
alias:
- TopVOLANTIS
- TOP_VOLANTIS
stratigraphic_alias:
- TopValysar
- Valysar Fm. Top
TopTherys:
stratigraphic: True
name: Therys Fm. Top
# ZONES/INTERVALS
Above:
stratigraphic: False
name: Above
Valysar:
stratigraphic: True
name: Valysar Fm.
Finally, in global_variables.yml we will do 2 things. First, we will enter a model
block which contains some information about the model setup. Then, we will include the
3 files made above. Example from Drogon:
[...]
(rest of global_variables.yml)
#===================================================================================
# Elements pertaining to metadata
#===================================================================================
model:
name: ff
revision: 22.1.0.dev
masterdata: !include _masterdata.yml
access: !include _access.yml
stratigraphy: !include _stratigraphy.yml
You are done with the first part! This is to a large degree a one-off thing, and you should not expect to have to do this again and again.
Workflow for creating case metadata
For each FMU case, a set of metadata is generated and temporarily stored on /scratch/<case_directory>/share/metadata/fmu_results.yml. The case metadata are read by individual export jobs, and, if you opt to upload data into Sumo, the case metadata are used to register the case.
Case metadata are made by a hooked ERT workflow running PRE_SIMULATION.
To make this, first create the workflow file in ert/bin/workflows/xhook_create_case_metadata.
Note
The “xhook” prefix is convention, but not mandatory. As all workflows will be included in the ERT GUI dropdown, the “hook” prefix signals that the workflow is not intended to be run manually. Further, the “x” makes it go to the bottom of the (alphabetically) sorted dropdown. If you have many workflows, this makes things a little bit more tidy.
The workflow calls a pre-installed workflow job: WF_CREATE_CASE_METADATA. Example script
from the Drogon workflow:
-- Create case metadata
-- ert_caseroot ert_configpath ert_casename ert_username
WF_CREATE_CASE_METADATA <SCRATCH>/<USER>/<CASE_DIR> <CONFIG_PATH> <CASE_DIR> <USER>
-- This workflow is intended to be ran as a HOOK workflow.
-- Arguments:
-- ert_caseroot (Path): The absolute path to the root of the case on /scratch
-- ert_configpath (Path): The absolute path to the ERT config
-- ert_casename (str): The name of the case
-- ert_user (str): The username used in ERT
-- Optional arguments:
-- --sumo: If passed, case will be registered on Sumo. Use this is intention to upload data.
-- --sumo_env (str): Specify Sumo environment. Default: prod
-- --global_variables_path (str): Path to global variables relative to CONFIG path
-- --verbosity (str): Python logging level to use
--
-- NOTE! If using optional arguments, note that the "--" annotation will be interpreted
-- as comments by ERT if not wrapped in quotes. This is the syntax to use:
-- (existing arguments) "--sumo" "--sumo_env" dev "--verbosity" DEBUG
Note
Note that there are references to Sumo in the script above. You don’t have to worry about that for now, but we will return to this if applicable.
Now, load this workflow in your ERT config file and make it a HOOK workflow:
-- Hook workflow for creating case metadata and (optional) registering case on Sumo
LOAD_WORKFLOW ../../bin/workflows/xhook_create_case_metadata
HOOK_WORKFLOW xhook_create_case_metadata PRE_SIMULATION
Note
In the Drogon example, you will notice that the loading is done in the install_custom_jobs.ert
include file, while the HOOK_WORKFLOW call is in the main config file.
You can now start ERT to verify that the workflow is loading and working. You should see
the workflow appear in the workflows dropdown, and when you run a case, you should see
case metadata appear in scratch/<field>/<casedir>/share/metadata/fmu_results.yml.
Include a data export job
To verify that data export now works, add one job to your workflow. Pick something simple, such as depth surfaces from the structural model or similar. Use one of the examples on the next page to get going, and/or have a look at the Drogon tutorial project.
What about Sumo Odds are that you are implementing rich metadata export so that you can start utilizing Sumo. Producing metadata with exported data is a pre-requisite for using Sumo. When you have undertaken the steps above, you are good to go! Head to the Sumo documentation to get going 👍