Examples
This is a collection of examples showing how fmu-dataio can be used in different contexts and for different data types. Each example typically shows a Python script together with the corresponding metadata file that would be produced when running the script.
If working inside RMS, we often retrieve RMS data from the project itself. In the examples the syntax for that is commented out, but it is still shown so you can uncomment it in your own code.
The global variables used here
This is a snippet of the global_variables.yml file which holds the static metadata described in the previous section. In real cases this file will be much longer.
# Autogenerated from global configuration.
# DO NOT EDIT THIS FILE MANUALLY!
# Machine st-linrgsxx.st.statoil.no by user nn, at 2021-06-23 07:35:54.428896, using fmu.config ver. 1.0.6.dev7+g6aee329
masterdata:
  smda:
    country:
    - identifier: Norway
      uuid: ad214d85-8a1d-19da-e053-c918a4889309
    discovery:
    - short_identifier: DROGON
      uuid: ad214d85-8a1d-19da-e053-c918a4889309
    field:
    - identifier: DROGON
      uuid: 00000000-0000-0000-0000-000000000000
    coordinate_system:
      identifier: ST_WGS84_UTM37N_P32637
      uuid: ad214d85-dac7-19da-e053-c918a4889309
    stratigraphic_column:
      identifier: DROGON_2020
      uuid: some-unique-id-to-be-provided-by-smda
access:
  asset:
    name: Drogon
  ssdl:
    access_level: internal
    rep_include: true
model:
  name: ff
  revision: 21.1.0.dev
stratigraphy:
  MSL:
    stratigraphic: false
    name: MSL
  Seabase:
    stratigraphic: false
    name: Seabase
  TopVolantis:
    stratigraphic: true
    name: VOLANTIS GP. Top
    alias:
    - TopVOLANTIS
    - TOP_VOLANTIS
    stratigraphic_alias:
    - TopValysar
    - Valysar Fm. Top
  TopTherys:
    stratigraphic: true
    name: Therys Fm. Top
  TopVolon:
    stratigraphic: true
    name: Volon Fm. Top
  BaseVolon:
    stratigraphic: true
    name: Volon Fm. Base
  BaseVolantis:
    stratigraphic: true
    name: VOLANTIS GP. Base
  Mantle:
    stratigraphic: false
    name: Mantle
  Above:
    stratigraphic: false
    name: Above
  Valysar:
    stratigraphic: true
    name: Valysar Fm.
  Therys:
    stratigraphic: true
    name: Therys Fm.
  Volon:
    stratigraphic: true
    name: Volon Fm.
  Below:
    stratigraphic: false
    name: Below
global:
  GLOBAL_VARS_EXAMPLE: 99
  OTHER: skipped
rms:
  horizons:
    TOP_RES:
    - TopVolantis
    - TopTherys
    - TopVolon
    - BaseVolantis
  zones:
    ZONE_RES:
    - Valysar
    - Therys
    - Volon
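As a minimal sketch of how the scripts below read this configuration, the snippet copies a small part of it into a plain Python dict (the examples load the real file with yaml_load from fmu.config) and looks up the horizon list and the stratigraphic names. The helper official_name is hypothetical and only illustrates the mapping from an RMS name to the name in the stratigraphy block; it is not meant to show how fmu-dataio resolves names internally.

```python
# A hand-copied subset of global_variables.yml; the real file is loaded from disk.
CFG = {
    "stratigraphy": {
        "TopVolantis": {"stratigraphic": True, "name": "VOLANTIS GP. Top"},
        "TopTherys": {"stratigraphic": True, "name": "Therys Fm. Top"},
        "MSL": {"stratigraphic": False, "name": "MSL"},
    },
    "rms": {"horizons": {"TOP_RES": ["TopVolantis", "TopTherys"]}},
}


def official_name(cfg, rms_name):
    """Hypothetical helper: map an RMS name to the name in the stratigraphy block."""
    entry = cfg["stratigraphy"].get(rms_name)
    # Fall back to the RMS name when the stratigraphy block has no entry for it
    return entry["name"] if entry else rms_name


horizon_names = CFG["rms"]["horizons"]["TOP_RES"]
resolved = {name: official_name(CFG, name) for name in horizon_names}
for rms_name, strat_name in resolved.items():
    print(f"{rms_name} -> {strat_name}")
```

The same nested lookup, CFG["rms"]["horizons"]["TOP_RES"], is what the fault polygon script uses to decide which horizons to export.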
Exporting fault polygons
Python script
"""Export faultpolygons via dataio with metadata."""
from pathlib import Path

import xtgeo
from fmu.config import utilities as utils

import fmu.dataio as dataio

CFG = utils.yaml_load("../../fmuconfig/output/global_variables.yml")

HORISONNAMES = CFG["rms"]["horizons"]["TOP_RES"]

# if inside RMS
RMS_POL_CATEGORY = "GL_faultlines_extract_postprocess"

# if running outside RMS using files that are stored e.g. on rms/output
FILEROOT = Path("../output/polygons")


def export_faultlines():
    """Export faultlines as both csv (dataframe) and irap text (xyz) files."""

    ed = dataio.ExportData(
        config=CFG,
        content="depth",
        unit="m",
        vertical_domain={"depth": "msl"},
        timedata=None,
        is_prediction=True,
        is_observation=False,
        tagname="faultlines",
        verbosity="INFO",
        workflow="rms structural model",
    )

    for hname in HORISONNAMES:
        # RMS version for reading polygons from a project:
        # poly = xtgeo.polygons_from_roxar(project, hname, RMS_POL_CATEGORY)

        # File version:
        poly = xtgeo.polygons_from_file((FILEROOT / hname.lower()).with_suffix(".pol"))

        poly.name = hname

        # Export both csv (keeping xtgeo column names) and irap text format
        # The difference between "csv" and "csv|xtgeo" is that the latter will keep
        # xtgeo column names as-is while "csv" will force column names to "X Y Z ID"
        for fmt in ["csv|xtgeo", "irap_ascii"]:
            ed.polygons_fformat = fmt
            ed.export(poly, verbosity="WARNING")


if __name__ == "__main__":
    export_faultlines()
Generated YAML file
$schema: https://main-fmu-schemas-dev.radix.equinor.com/schemas/0.8.0/fmu_results.json
version: 0.8.0
source: fmu
tracklog:
- datetime: '2023-08-08T13:00:33.263738'
  user:
    id: docs
  event: created
class: polygons
fmu:
  model:
    name: ff
    revision: 21.1.0.dev
  context:
    stage: realization
  workflow:
    reference: rms structural model
  case:
    name: xcase
    user:
      id: nn
    description:
    - Generated by nn at 2021-06-08T13:17:06.485667
    uuid: 6fe60348-b386-4a68-9a0b-e8a4596af911
  iteration:
    id: 0
    uuid: 8483547b-2a22-2d07-5406-13493e216649
    name: iter-0
  realization:
    id: 0
    uuid: 709a05f8-a58f-ed00-9774-b536541b3d86
    name: realization-0
    parameters:
      SENSNAME: rms_seed
      SENSCASE: p10_p90
      RMS_SEED: 1000
      THIS_FILE_IS_JUST_A_STUB: 1
      SOME:
        OTHER: 2
file:
  relative_path: realization-0/iter-0/share/results/polygons/volantis_gp_top--faultlines.pol
  absolute_path: /home/docs/checkouts/readthedocs.org/user_builds/fmu-dataio/checkouts/update-rtd-config/examples/s/d/nn/xcase/realization-0/iter-0/share/results/polygons/volantis_gp_top--faultlines.pol
  checksum_md5: 7b85a0113a83a450eb945be37d75b537
data:
  name: VOLANTIS GP. Top
  stratigraphic: true
  alias:
  - TopVOLANTIS
  - TOP_VOLANTIS
  - TopVolantis
  - TopVolantis
  content: depth
  tagname: faultlines
  format: irap_ascii
  layout: unset
  unit: m
  vertical_domain: depth
  depth_reference: msl
  spec:
    npolys: 6
  bbox:
    xmin: 459480.7
    xmax: 465790.141
    ymin: 5929993.362
    ymax: 5937737.446
    zmin: 1557.431
    zmax: 1825.486
  undef_is_zero: false
  is_prediction: true
  is_observation: false
display:
  name: ''
access:
  asset:
    name: Drogon
  ssdl:
    access_level: internal
    rep_include: true
  classification: internal
masterdata:
  smda:
    country:
    - identifier: Norway
      uuid: ad214d85-8a1d-19da-e053-c918a4889309
    discovery:
    - short_identifier: DROGON
      uuid: ad214d85-8a1d-19da-e053-c918a4889309
    field:
    - identifier: DROGON
      uuid: 00000000-0000-0000-0000-000000000000
    coordinate_system:
      identifier: ST_WGS84_UTM37N_P32637
      uuid: ad214d85-dac7-19da-e053-c918a4889309
    stratigraphic_column:
      identifier: DROGON_2020
      uuid: some-unique-id-to-be-provided-by-smda
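The checksum_md5 entry under file in the metadata can be used to check that a data file still matches its metadata. The sketch below assumes that the checksum is the MD5 hex digest of the file's raw bytes, which the key name suggests but which is an assumption here. The function names md5sum and matches_metadata, the temporary file and the hand-made metadata dict are illustrative only and not part of fmu-dataio.

```python
import hashlib
import tempfile
from pathlib import Path


def md5sum(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_metadata(path, metadata):
    """Compare a file on disk with the checksum stored in its metadata dict."""
    return md5sum(path) == metadata["file"]["checksum_md5"]


# Demonstrate with a throwaway file and a hand-made metadata dict
with tempfile.TemporaryDirectory() as tmpdir:
    datafile = Path(tmpdir) / "example.pol"
    datafile.write_bytes(b"459480.7 5929993.362 1557.431\n")
    metadata = {"file": {"checksum_md5": md5sum(datafile)}}
    unchanged = matches_metadata(datafile, metadata)

    # Any later change to the data file makes the stored checksum stale
    datafile.write_bytes(b"changed content\n")
    changed = matches_metadata(datafile, metadata)

print(f"before edit: {unchanged}, after edit: {changed}")
```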
Exporting average maps from grid properties
Python script
"""Export maps that hold certain average gridmodel properties.

The files on disk are:

facies_fraction_channels_volon.gri
klogh_average_valysar.gri
phit_average_therys.gri

We want to use the file names here to extract some data (like name of formation,
e.g. Therys).
"""
from pathlib import Path

import xtgeo
from fmu.config import utilities as ut

import fmu.dataio as dataio

CFG = ut.yaml_load("../../fmuconfig/output/global_variables.yml")

# property attributes, the key is "pattern" and the value is generic name to be used:
TRANSLATE = {
    "facies_fraction": "facies_fraction",
    "phit": "porosity",
    "klog": "permeability",
}

# name attributes, the key is "pattern" and the value is name to be used:
NAMETRANSLATE = {
    "valysar": "Valysar",
    "therys": "Therys",
    "volon": "Volon",
}

INPUT_FOLDER = Path("../output/maps/grid_averages")

dataio.ExportData._inside_rms = True


def main():
    """Exporting maps from clipboard"""

    files = INPUT_FOLDER.glob("*.gri")

    for file in files:
        surf = xtgeo.surface_from_file(file)

        # derive the generic property attribute from the file name
        attribute = "unset"
        for pattern, attr in TRANSLATE.items():
            if pattern in str(file).lower():
                attribute = attr

        # derive the zone name from the file name
        name = "unset"
        for pattern, attr in NAMETRANSLATE.items():
            if pattern in str(file).lower():
                name = attr

        ed = dataio.ExportData(
            config=CFG,
            name=name,
            unit="fraction",
            content={"property": {"attribute": attribute, "is_discrete": False}},
            vertical_domain={"depth": "msl"},
            timedata=None,
            is_prediction=True,
            is_observation=False,
            tagname="average_" + attribute,
            verbosity="INFO",
            workflow="rms property model",
        )
        fname = ed.export(surf)
        print(f"File name is {fname}")


if __name__ == "__main__":
    main()
    print("That's all")
Generated YAML file for metadata
$schema: https://main-fmu-schemas-dev.radix.equinor.com/schemas/0.8.0/fmu_results.json
version: 0.8.0
source: fmu
tracklog:
- datetime: '2023-08-08T13:00:34.842502'
  user:
    id: docs
  event: created
class: surface
fmu:
  model:
    name: ff
    revision: 21.1.0.dev
  context:
    stage: realization
  workflow:
    reference: rms property model
  case:
    name: xcase
    user:
      id: nn
    description:
    - Generated by nn at 2021-06-08T13:17:06.485667
    uuid: 6fe60348-b386-4a68-9a0b-e8a4596af911
  iteration:
    id: 0
    uuid: 8483547b-2a22-2d07-5406-13493e216649
    name: iter-0
  realization:
    id: 0
    uuid: 709a05f8-a58f-ed00-9774-b536541b3d86
    name: realization-0
    parameters:
      SENSNAME: rms_seed
      SENSCASE: p10_p90
      RMS_SEED: 1000
      THIS_FILE_IS_JUST_A_STUB: 1
      SOME:
        OTHER: 2
file:
  relative_path: realization-0/iter-0/share/results/maps/therys--average_porosity.gri
  absolute_path: /home/docs/checkouts/readthedocs.org/user_builds/fmu-dataio/checkouts/update-rtd-config/examples/s/d/nn/xcase/realization-0/iter-0/share/results/maps/therys--average_porosity.gri
  checksum_md5: 7b8d677044c2701d88f1a95e85aec142
data:
  name: Therys Fm.
  stratigraphic: true
  alias:
  - Therys
  content: property
  property:
    attribute: porosity
    is_discrete: false
  tagname: average_porosity
  format: irap_binary
  layout: regular
  unit: fraction
  vertical_domain: depth
  depth_reference: msl
  spec:
    ncol: 280
    nrow: 440
    xori: 461500.0
    yori: 5926500.0
    xinc: 25.0
    yinc: 25.0
    yflip: 1
    rotation: 30.0
    undef: 1.0e+30
  bbox:
    xmin: 456012.5
    xmax: 467540.52719139645
    ymin: 5926500.0
    ymax: 5939492.128806534
    zmin: 0.0
    zmax: 0.38215649127960205
  undef_is_zero: false
  is_prediction: true
  is_observation: false
display:
  name: Therys
access:
  asset:
    name: Drogon
  ssdl:
    access_level: internal
    rep_include: true
  classification: internal
masterdata:
  smda:
    country:
    - identifier: Norway
      uuid: ad214d85-8a1d-19da-e053-c918a4889309
    discovery:
    - short_identifier: DROGON
      uuid: ad214d85-8a1d-19da-e053-c918a4889309
    field:
    - identifier: DROGON
      uuid: 00000000-0000-0000-0000-000000000000
    coordinate_system:
      identifier: ST_WGS84_UTM37N_P32637
      uuid: ad214d85-dac7-19da-e053-c918a4889309
    stratigraphic_column:
      identifier: DROGON_2020
      uuid: some-unique-id-to-be-provided-by-smda
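The file-name matching used in the average maps script can be tried on its own, without xtgeo or fmu-dataio. The sketch below reuses the TRANSLATE and NAMETRANSLATE tables and the three file names listed in the script's docstring. The helper first_match is hypothetical, and it differs slightly from the script: it returns on the first matching pattern and looks only at the base name, whereas the script keeps the last match and searches the full path.

```python
from pathlib import Path

TRANSLATE = {
    "facies_fraction": "facies_fraction",
    "phit": "porosity",
    "klog": "permeability",
}
NAMETRANSLATE = {"valysar": "Valysar", "therys": "Therys", "volon": "Volon"}


def first_match(filename, patterns, default="unset"):
    """Return the value of the first pattern found in the lower-cased file name."""
    lowered = Path(filename).name.lower()
    for pattern, value in patterns.items():
        if pattern in lowered:
            return value
    return default


results = {}
for fname in [
    "facies_fraction_channels_volon.gri",
    "klogh_average_valysar.gri",
    "phit_average_therys.gri",
]:
    attribute = first_match(fname, TRANSLATE)
    name = first_match(fname, NAMETRANSLATE)
    # name and tagname are the two values passed on to ExportData in the script
    results[fname] = (name, "average_" + attribute)
    print(fname, "->", results[fname])
```

For phit_average_therys.gri this gives the name Therys and the tagname average_porosity, which agrees with the file name therys--average_porosity.gri in the metadata above.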
Exporting 3D grids with properties
Python script
"""Export 3D griddata with properties."""
import pathlib

import xtgeo
from fmu.config import utilities as ut

import fmu.dataio as dataio

CFG = ut.yaml_load("../../fmuconfig/output/global_variables.yml")

FOLDER = pathlib.Path("../output/grids")
GFILE = "gg"
GNAME = "geogrid"
PROPS_SEISMIC = ["phit", "sw"]
PROPS_OTHER = ["klogh", "facies"]
VERBOSITY = "WARNING"


def export_geogrid_geometry():
    """Export the geogrid geometry."""
    filename = (FOLDER / GFILE).with_suffix(".roff")
    grd = xtgeo.grid_from_file(filename)

    ed = dataio.ExportData(
        config=CFG,
        name=GNAME,
        content="depth",
        unit="m",
        vertical_domain={"depth": "msl"},
        timedata=None,
        is_prediction=True,
        is_observation=False,
        tagname="",
        verbosity=VERBOSITY,
        workflow="rms structural model",
    )

    out = ed.export(grd)
    print(f"Stored grid as {out}")


def export_geogrid_parameters():
    """Export geogrid associated parameters based on user defined lists"""

    props = PROPS_SEISMIC + PROPS_OTHER

    print("Write grid properties...")
    for propname in props:
        filename = (FOLDER / (GFILE + "_" + propname)).with_suffix(".roff")
        prop = xtgeo.gridproperty_from_file(filename)
        ed = dataio.ExportData(
            name=propname,
            # parent={"name": GNAME},
            config=CFG,
            content="depth",
            unit="m",
            vertical_domain={"depth": "msl"},
            timedata=None,
            is_prediction=True,
            is_observation=False,
            verbosity=VERBOSITY,
            workflow="rms property model",
        )

        out = ed.export(prop)
        print(f"Stored {propname} as {out}")


if __name__ == "__main__":
    export_geogrid_geometry()
    export_geogrid_parameters()
    print("Done.")
Generated YAML files for metadata
$schema: https://main-fmu-schemas-dev.radix.equinor.com/schemas/0.8.0/fmu_results.json
version: 0.8.0
source: fmu
tracklog:
- datetime: '2023-08-08T13:00:36.298515'
  user:
    id: docs
  event: created
class: cpgrid
fmu:
  model:
    name: ff
    revision: 21.1.0.dev
  context:
    stage: realization
  workflow:
    reference: rms structural model
  case:
    name: xcase
    user:
      id: nn
    description:
    - Generated by nn at 2021-06-08T13:17:06.485667
    uuid: 6fe60348-b386-4a68-9a0b-e8a4596af911
  iteration:
    id: 0
    uuid: 8483547b-2a22-2d07-5406-13493e216649
    name: iter-0
  realization:
    id: 0
    uuid: 709a05f8-a58f-ed00-9774-b536541b3d86
    name: realization-0
    parameters:
      SENSNAME: rms_seed
      SENSCASE: p10_p90
      RMS_SEED: 1000
      THIS_FILE_IS_JUST_A_STUB: 1
      SOME:
        OTHER: 2
file:
  relative_path: realization-0/iter-0/share/results/grids/geogrid.roff
  absolute_path: /home/docs/checkouts/readthedocs.org/user_builds/fmu-dataio/checkouts/update-rtd-config/examples/s/d/nn/xcase/realization-0/iter-0/share/results/grids/geogrid.roff
  checksum_md5: d8e0b3589ccc5ff6cd416aa2d5e73c7a
data:
  name: geogrid
  stratigraphic: false
  content: depth
  tagname: ''
  format: roff
  layout: cornerpoint
  unit: m
  vertical_domain: depth
  depth_reference: msl
  spec:
    ncol: 92
    nrow: 146
    nlay: 65
    xshift: 0.0
    yshift: 0.0
    zshift: 0.0
    xscale: 1.0
    yscale: 1.0
    zscale: 1.0
    subgrids:
      subgrid_0: 19
      subgrid_1: 29
      subgrid_2: 14
  bbox:
    xmin: 456063.6875
    xmax: 467489.3438
    ymin: 5926551.0
    ymax: 5939441.0
    zmin: 1554.2631
    zmax: 2001.8425
  undef_is_zero: false
  is_prediction: true
  is_observation: false
display:
  name: geogrid
access:
  asset:
    name: Drogon
  ssdl:
    access_level: internal
    rep_include: true
  classification: internal
masterdata:
  smda:
    country:
    - identifier: Norway
      uuid: ad214d85-8a1d-19da-e053-c918a4889309
    discovery:
    - short_identifier: DROGON
      uuid: ad214d85-8a1d-19da-e053-c918a4889309
    field:
    - identifier: DROGON
      uuid: 00000000-0000-0000-0000-000000000000
    coordinate_system:
      identifier: ST_WGS84_UTM37N_P32637
      uuid: ad214d85-dac7-19da-e053-c918a4889309
    stratigraphic_column:
      identifier: DROGON_2020
      uuid: some-unique-id-to-be-provided-by-smda
$schema: https://main-fmu-schemas-dev.radix.equinor.com/schemas/0.8.0/fmu_results.json
version: 0.8.0
source: fmu
tracklog:
- datetime: '2023-08-08T13:00:36.772342'
  user:
    id: docs
  event: created
class: cpgrid_property
fmu:
  model:
    name: ff
    revision: 21.1.0.dev
  context:
    stage: realization
  workflow:
    reference: rms property model
  case:
    name: xcase
    user:
      id: nn
    description:
    - Generated by nn at 2021-06-08T13:17:06.485667
    uuid: 6fe60348-b386-4a68-9a0b-e8a4596af911
  iteration:
    id: 0
    uuid: 8483547b-2a22-2d07-5406-13493e216649
    name: iter-0
  realization:
    id: 0
    uuid: 709a05f8-a58f-ed00-9774-b536541b3d86
    name: realization-0
    parameters:
      SENSNAME: rms_seed
      SENSCASE: p10_p90
      RMS_SEED: 1000
      THIS_FILE_IS_JUST_A_STUB: 1
      SOME:
        OTHER: 2
file:
  relative_path: realization-0/iter-0/share/results/grids/facies.roff
  absolute_path: /home/docs/checkouts/readthedocs.org/user_builds/fmu-dataio/checkouts/update-rtd-config/examples/s/d/nn/xcase/realization-0/iter-0/share/results/grids/facies.roff
  checksum_md5: cae58c52b75d69a2385b7ff0f5a27edb
data:
  name: facies
  stratigraphic: false
  content: depth
  tagname: ''
  format: roff
  layout: cornerpoint
  unit: m
  vertical_domain: depth
  depth_reference: msl
  spec:
    ncol: 92
    nrow: 146
    nlay: 65
  undef_is_zero: false
  is_prediction: true
  is_observation: false
display:
  name: facies
access:
  asset:
    name: Drogon
  ssdl:
    access_level: internal
    rep_include: true
  classification: internal
masterdata:
  smda:
    country:
    - identifier: Norway
      uuid: ad214d85-8a1d-19da-e053-c918a4889309
    discovery:
    - short_identifier: DROGON
      uuid: ad214d85-8a1d-19da-e053-c918a4889309
    field:
    - identifier: DROGON
      uuid: 00000000-0000-0000-0000-000000000000
    coordinate_system:
      identifier: ST_WGS84_UTM37N_P32637
      uuid: ad214d85-dac7-19da-e053-c918a4889309
    stratigraphic_column:
      identifier: DROGON_2020
      uuid: some-unique-id-to-be-provided-by-smda
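The grid script builds its input file names from FOLDER, GFILE and the property names. As a small sketch using only the standard library, the snippet below reproduces that construction with PurePosixPath (chosen here so the printed paths look the same on every platform; the script itself uses pathlib.Path). The name "gg_v1.2" in the last lines is a made-up example to show a pitfall of with_suffix.

```python
from pathlib import PurePosixPath

FOLDER = PurePosixPath("../output/grids")
GFILE = "gg"
PROPS = ["phit", "sw", "klogh", "facies"]

# The geometry file is <FOLDER>/<GFILE>.roff
geometry_file = (FOLDER / GFILE).with_suffix(".roff")

# Each property file is <FOLDER>/<GFILE>_<property>.roff
property_files = {
    prop: (FOLDER / f"{GFILE}_{prop}").with_suffix(".roff") for prop in PROPS
}

print(geometry_file)
for prop, path in property_files.items():
    print(prop, "->", path)

# Pitfall: with_suffix replaces whatever follows the last dot in the name,
# so a root name that already contains a dot loses that part.
truncated = PurePosixPath("gg_v1.2").with_suffix(".roff")
print(truncated)
```

If file root names may contain dots, appending the extension as a string (for example f"{name}.roff") avoids the truncation.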
Exporting volume tables from RMS or file
Python script
"""Read volume table from RMS or file and export to CSV for SUMO.

In this example there is a switch, IN_ROXAR, which is set to True if using it inside
RMS (to demonstrate how volume tables can be fetched via Roxar API).

For the file case, CSV files are read from disk. The dataio function is the same.
"""
import pathlib

import pandas as pd

import fmu.dataio
from fmu.config import utilities as ut

CFG = ut.yaml_load("../../fmuconfig/output/global_variables.yml")

IN_ROXAR = False

PRJ = None
if IN_ROXAR:
    PRJ = project  # type: ignore # noqa # pylint: disable=undefined-variable
    VTABLES = ["geogrid_volumes", "simgrid_volumes"]
else:
    VFOLDER = "../output/volumes/"
    VFILES = ["geogrid_vol.csv", "simgrid_vol.csv"]

TAGNAME = "volumes"
VERBOSITY = "WARNING"

# renaming columns from RMS to FMU standard
RENAMING = {
    "Proj. real.": "REALIZATION",
    "Zone": "ZONE",
    "Segment": "REGION",
    "BulkOil": "BULK_OIL",
    "PoreOil": "PORV_OIL",
    "HCPVOil": "HCPV_OIL",
    "STOIIP": "STOIIP_OIL",
    "AssociatedGas": "ASSOCIATEDGAS_OIL",
    "BulkGas": "BULK_GAS",
    "PoreGas": "PORV_GAS",
    "HCPVGas": "HCPV_GAS",
    "GIIP": "GIIP_GAS",
    "AssociatedLiquid": "ASSOCIATEDOIL_GAS",
    "Bulk": "BULK_TOTAL",
    "Pore": "PORV_TOTAL",
}


def volume_as_dataframe_files(vfile):
    """Read volume (CSV files) and return dataframe."""

    # "geogrid_vol.csv" --> "geogrid" etc
    gridname = vfile.replace("_vol.csv", "")

    fname = pathlib.Path(VFOLDER) / vfile
    dframe = pd.read_csv(fname)
    return dframe, gridname


def volume_as_dataframe_rms(vtable):
    """Read volume table in RMS and return dataframe."""

    # "geogrid_volumes" --> "geogrid" etc
    gridname = vtable.replace("_volumes", "")

    table = PRJ.volumetric_tables[vtable]
    dtdict = table.get_data_table().to_dict()
    dframe = pd.DataFrame.from_dict(dtdict)
    dframe.rename(columns=RENAMING, inplace=True)

    # skip REALIZATION
    dframe.drop("REALIZATION", axis=1, inplace=True)
    return dframe, gridname


def export_dataio(df, gridname):
    """Get the dataframe and export to SUMO via dataio."""

    exp = fmu.dataio.ExportData(
        name=gridname,
        config=CFG,
        content="volumetrics",
        unit="m",
        is_prediction=True,
        is_observation=False,
        tagname=TAGNAME,
        verbosity=VERBOSITY,
        workflow="Volume calculation",
    )

    out = exp.export(df)
    print(f"Exported volume table for {gridname} to {out}")


if __name__ == "__main__":
    if IN_ROXAR:
        for vtable in VTABLES:
            export_dataio(*volume_as_dataframe_rms(vtable))
    else:
        for vfile in VFILES:
            export_dataio(*volume_as_dataframe_files(vfile))
$schema: https://main-fmu-schemas-dev.radix.equinor.com/schemas/0.8.0/fmu_results.json
version: 0.8.0
source: fmu
tracklog:
- datetime: '2023-08-08T13:00:38.159326'
  user:
    id: docs
  event: created
class: table
fmu:
  model:
    name: ff
    revision: 21.1.0.dev
  context:
    stage: realization
  workflow:
    reference: Volume calculation
  case:
    name: xcase
    user:
      id: nn
    description:
    - Generated by nn at 2021-06-08T13:17:06.485667
    uuid: 6fe60348-b386-4a68-9a0b-e8a4596af911
  iteration:
    id: 0
    uuid: 8483547b-2a22-2d07-5406-13493e216649
    name: iter-0
  realization:
    id: 0
    uuid: 709a05f8-a58f-ed00-9774-b536541b3d86
    name: realization-0
    parameters:
      SENSNAME: rms_seed
      SENSCASE: p10_p90
      RMS_SEED: 1000
      THIS_FILE_IS_JUST_A_STUB: 1
      SOME:
        OTHER: 2
file:
  relative_path: realization-0/iter-0/share/results/tables/geogrid--volumes.csv
  absolute_path: /home/docs/checkouts/readthedocs.org/user_builds/fmu-dataio/checkouts/update-rtd-config/examples/s/d/nn/xcase/realization-0/iter-0/share/results/tables/geogrid--volumes.csv
  checksum_md5: f22adf8dca797f165c1c38fcc535e77c
data:
  name: geogrid
  stratigraphic: false
  content: volumetrics
  tagname: volumes
  format: csv
  layout: table
  unit: m
  vertical_domain: depth
  depth_reference: msl
  spec:
    columns:
    - 'Unnamed: 0'
    - ZONE
    - REGION
    - BULK_OIL
    - PORV_OIL
    - HCPV_OIL
    - STOIIP_OIL
    - ASSOCIATEDGAS_OIL
    - BULK_GAS
    - PORV_GAS
    - HCPV_GAS
    - GIIP_GAS
    - ASSOCIATEDOIL_GAS
    - BULK_TOTAL
    - PORV_TOTAL
    size: 315
  table_index:
  - ZONE
  - REGION
  table_index_values:
    ZONE: '[''Valysar'' ''Therys'' ''Volon'']'
    REGION: "['WestLowland' 'CentralSouth' 'CentralNorth' 'NorthHorst' 'CentralRamp'\n\
      \ 'CentralHorst' 'EastLowland']"
  undef_is_zero: false
  is_prediction: true
  is_observation: false
display:
  name: geogrid
access:
  asset:
    name: Drogon
  ssdl:
    access_level: internal
    rep_include: true
  classification: internal
masterdata:
  smda:
    country:
    - identifier: Norway
      uuid: ad214d85-8a1d-19da-e053-c918a4889309
    discovery:
    - short_identifier: DROGON
      uuid: ad214d85-8a1d-19da-e053-c918a4889309
    field:
    - identifier: DROGON
      uuid: 00000000-0000-0000-0000-000000000000
    coordinate_system:
      identifier: ST_WGS84_UTM37N_P32637
      uuid: ad214d85-dac7-19da-e053-c918a4889309
    stratigraphic_column:
      identifier: DROGON_2020
      uuid: some-unique-id-to-be-provided-by-smda
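The renaming from RMS column names to FMU standard names in the volume script can be tried without RMS or pandas. The sketch below applies a shortened copy of the RENAMING mapping to rows held as plain dicts and then drops REALIZATION, mirroring the rename and drop steps in volume_as_dataframe_rms. The helper to_fmu_columns is hypothetical, and the numbers are invented for illustration only.

```python
# Shortened copy of the RENAMING mapping from the script
RENAMING = {
    "Proj. real.": "REALIZATION",
    "Zone": "ZONE",
    "Segment": "REGION",
    "BulkOil": "BULK_OIL",
    "STOIIP": "STOIIP_OIL",
}


def to_fmu_columns(rows, renaming, drop=("REALIZATION",)):
    """Rename the keys of each row, then leave out the columns listed in drop."""
    converted = []
    for row in rows:
        # Keys without an entry in the mapping keep their original name
        renamed = {renaming.get(key, key): value for key, value in row.items()}
        converted.append({k: v for k, v in renamed.items() if k not in drop})
    return converted


# Invented numbers, only to show the shape of the data
rms_rows = [
    {"Proj. real.": 1, "Zone": "Valysar", "Segment": "WestLowland",
     "BulkOil": 100.0, "STOIIP": 10.0},
    {"Proj. real.": 1, "Zone": "Therys", "Segment": "WestLowland",
     "BulkOil": 200.0, "STOIIP": 25.0},
]

fmu_rows = to_fmu_columns(rms_rows, RENAMING)
for row in fmu_rows:
    print(row)
```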
Using fmu-dataio for post-processed data
The example below shows how fmu-dataio can be used in a post-processing context, here in a surface aggregation example.
When using ensemble-based methods for probabilistic modelling, the result is represented by the distribution of the realizations, not by the individual realizations themselves. In such a context, easy access to statistical representations of the ensemble is important. For surfaces, this typically includes point-wise mean, std, min/max, p10/p90 and others.
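To make "point-wise" concrete, here is a minimal standard-library sketch in which each realization is a flat list of node values and each statistic is computed node by node across the realizations. The numbers are invented, the aggregate function is illustrative and not part of fmu-dataio, and the population standard deviation is an assumption chosen to match NumPy's default for np.std.

```python
import statistics

# Three made-up realizations of the same surface, flattened to four nodes each
realizations = [
    [1.0, 10.0, 5.0, 2.0],
    [2.0, 20.0, 5.0, 4.0],
    [3.0, 30.0, 5.0, 6.0],
]

OPERATIONS = {
    "mean": statistics.fmean,
    "min": min,
    "max": max,
    "std": statistics.pstdev,  # population standard deviation
}


def aggregate(surfaces, operation):
    """Apply one operation node by node across all realizations."""
    func = OPERATIONS[operation]
    # zip(*surfaces) yields, for each node, the values from every realization
    return [func(values) for values in zip(*surfaces)]


for operation in OPERATIONS:
    print(operation, aggregate(realizations, operation))
```

Each result has the same number of nodes as the input surfaces, so it can be stored as a surface again.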
Aggregation in an FMU context is usually done by standalone Python scripts, but cloud services are also in the making (Sumo). The example below shows how fmu-dataio can be used to simplify an existing aggregation service, and to make decentralized methods more robust by centralizing the definitions and handling of metadata.
Note
It is common that surfaces exported from RMS or other sources have undefined areas. For some surfaces, typically various thickness surfaces (e.g. HCPV thickness from RMS volumetric jobs), undefined values shall be treated as zero (0.0) when included in statistical calculations. Therefore, when exporting surfaces of this type, set the undef_is_zero flag to True. This tells later consumers of the surface that they should handle UNDEF as zero.
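The effect of the flag on a consumer can be sketched as follows. This is an illustration of the two interpretations only, using NaN to mark undefined nodes and invented thickness values; mean_per_node is a hypothetical function, not something fmu-dataio provides.

```python
import math


def mean_per_node(surfaces, undef_is_zero=False):
    """Node-wise mean across realizations, with NaN marking undefined nodes."""
    result = []
    for values in zip(*surfaces):
        if undef_is_zero:
            # Undefined means "nothing there", so it counts as 0.0
            values = [0.0 if math.isnan(v) else v for v in values]
        else:
            # Undefined means "unknown", so it is left out of the statistics
            values = [v for v in values if not math.isnan(v)]
        result.append(sum(values) / len(values) if values else math.nan)
    return result


nan = math.nan
# Three realizations with two nodes each; the second node is never defined
thickness = [[2.0, nan], [4.0, nan], [nan, nan]]

skipped = mean_per_node(thickness)
zeroed = mean_per_node(thickness, undef_is_zero=True)
print("undefined skipped:", skipped)
print("undefined as zero:", zeroed)
```

For a thickness surface the second interpretation is normally the intended one, since leaving out the undefined nodes would overstate the mean thickness.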
Python script
"""Use fmu-dataio for aggregated surfaces created by an aggregation service."""

from pathlib import Path
import logging

import yaml
import numpy as np

import xtgeo
import fmu.dataio


def main():
    """Aggregate one surface across X realizations from the example case and store the
    results. In this example, we emulate that fmu-dataio is called by another service.

    Two contexts are demonstrated:

    1) We are in a classical FMU setting, running this aggregation as a stand-alone
       Python script (directly, or wrapped in an ERT workflow). In this context,
       we want the resulting files to be stored to disk within the existing case. In
       this context, fmu-dataio is responsible for storing the results.
    2) We are in a cloud service. In this context, we don't want to store anything
       on disk, as the results are to be pushed to other storage, i.e. Sumo. Hence,
       we want fmu-dataio to provide us with the generated metadata only. In this
       context, the service itself is responsible for storing the results.

    Note that this example is showing the usage of fmu-dataio, it is not to be seen as
    an example for the actual aggregation. The aggregation service shown here is
    simplistic and its sole purpose is to facilitate the fmu-dataio example.
    """

    # First we get the input data (the individual surfaces from each realization), which
    # we assume are stored in classical FMU style on /scratch disk folder structure.

    # IRL, these variables would typically be arguments to the aggregation script.
    casepath = Path("../xcase/").resolve()
    iter_name = "iter-0"
    relative_path = (
        "share/results/maps/topvolantis--ds_extract_geogrid.gri"  # exists in all reals
    )
    realization_ids = _get_realization_ids(casepath)

    # gather source surfaces and their associated metadata
    source_surfaces, source_metadata = _get_source_surfaces_from_disk(
        casepath=casepath,
        iter_name=iter_name,
        realization_ids=realization_ids,
        relative_path=relative_path,
    )

    # These are the operations we want to do
    operations = ["mean", "min", "max", "std"]

    # This is the ID we assign to this set of aggregations
    aggregation_id = "something_very_unique"  # IRL this will usually be a uuid

    # We aggregate these source surfaces and collect results in list of dictionaries
    aggregations = []

    # Initialize an AggregatedData object for this set of aggregations
    exp = fmu.dataio.AggregatedData(
        source_metadata=source_metadata,
        aggregation_id=aggregation_id,
        casepath=casepath,
    )

    for operation in operations:
        print(f"Running aggregation: {operation}")

        # Call the aggregation machine and create an aggregated surface
        # Note that this is not part of fmu-dataio - it is merely a mock-up
        # aggregation service for the sake of this example.
        aggregated_surface = _aggregate(source_surfaces, operation)

        # ==============================================================================
        # Example 1: We want fmu-dataio to export the file + metadata to disk
        saved_filename = exp.export(aggregated_surface, operation=operation)
        print(f"Example 1: File saved to {saved_filename}")

        # ==============================================================================
        # Example 2: We only want the metadata (e.g. we are in a cloud service)
        metadata = exp.generate_metadata(aggregated_surface, operation=operation)
        print("Example 2: Metadata generated")

        # At this point, we have the surface, the operation and the metadata
        # These can be collected into e.g. a list or a dictionary for further usage,
        # or we can upload to Sumo as part of the loop.


# ======================================================================================
# This concludes the main examples. Below are utility functions used by the example.
# ======================================================================================
def _aggregate(source_surfaces, operation):
    """Aggregate a set of surfaces, return the result.

    This is a very simplistic and minimalistic aggregation method meant to power this
    example only. Do not use in production setting.
    """

    if operation == "mean":
        return source_surfaces.apply(np.nanmean, axis=0)
    if operation == "min":
        return source_surfaces.apply(np.min, axis=0)
    if operation == "max":
        return source_surfaces.apply(np.max, axis=0)
    if operation == "std":
        return source_surfaces.apply(np.std, axis=0)

    # In a real aggregation service, more options would of course be supported. However,
    # in this example, we do not include anything beyond the basics.
    raise NotImplementedError(
        f"Aggregation method {operation} is not implemented in this example."
    )


def _parse_yaml(fname):
    """Parse the yaml-file, return dict.

    Args:
        fname (Path): Absolute path to yaml file.
    Returns:
        dict
    """

    with open(fname, "r") as stream:
        data = yaml.safe_load(stream)
    return data


def _metadata_filename(fname):
    """From a regular filename, derive the corresponding metadata filename.

    FMU standard: metadata filename = /path/.<stem.ext>.yml
    """
    return Path(fname.parent, "." + fname.name + ".yml")


def _get_realization_ids(casepath):
    """Given a path to a case on the disk, get the individual realizations."""

    # In reality we would traverse the disk to find out, use Sumo search, or preferably
    # ask ERT for it. For the sake of the example we just hardcode it here.

    return [0, 1, 9]


def _get_source_surfaces_from_disk(
    casepath: Path, iter_name: str, realization_ids: list, relative_path: Path
):
    """Collect surfaces and metadata from disk.

    This method collects the source surfaces from disk. Source surfaces are the
    individual realization surfaces which shall be aggregated.

    A similar method will exist for other sources than /scratch, e.g. Sumo.

    Args:
        casepath (Path): Absolute path to the case root.
        iter_name (str): Name of the iteration (folder), e.g. "iter-0"
        realization_ids (list of ids): List of realization-ids, e.g. 0,1,2,3
        relative_path (Path): Relative path below ERT RUNPATH to the surface

    By combining the casepath and the relative path, parse the surfaces and metadata
    and return surfaces as an xtgeo.Surfaces object and the metadata as a list of dicts.
    """

    collected_data = []
    for real in realization_ids:
        surfacepath = casepath / f"realization-{real}" / iter_name / relative_path
        surface = xtgeo.surface_from_file(surfacepath)
        metadata = fmu.dataio.read_metadata(surfacepath)

        # this example is minimalistic and super non-robust. In reality, there will
        # be realizations missing which needs to be handled etc.
        collected_data.append((surface, metadata))

    source_surfaces = xtgeo.Surfaces([surface for surface, _ in collected_data])
    source_metadata = [metadata for _, metadata in collected_data]

    return source_surfaces, source_metadata


def _get_source_surfaces_from_sumo(
    case_uuid: str, iter_name: str, realization_ids: list, relative_path: Path
):
    """Collect surfaces and metadata from Sumo.

    Placeholder for a method getting surfaces and metadata from Sumo, complementing
    the similar method for getting this from the disk. Only included since it is
    referenced in the comments in main().

    Not implemented.
    """
    raise NotImplementedError()


if __name__ == "__main__":
    main()
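The naming rule stated in _metadata_filename above (the metadata file sits next to the data file, with a leading dot and .yml appended) can be checked in isolation. The sketch below is a standalone variant that accepts a string and uses PurePosixPath so the result is the same on every platform; the data file path is taken from the average maps example.

```python
from pathlib import PurePosixPath


def metadata_filename(fname):
    """Return the path of the metadata file that belongs to a data file."""
    fname = PurePosixPath(fname)
    # Same folder, hidden file, original name kept, ".yml" appended
    return fname.parent / f".{fname.name}.yml"


datafile = "realization-0/iter-0/share/results/maps/therys--average_porosity.gri"
metafile = metadata_filename(datafile)
print(metafile)
```

Because the original extension is kept in the metadata name, two data files that differ only in format (for example .csv and .pol) get separate metadata files.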