Usage

Custom exports require an instance of the ExportData class. When creating an instance of this class, some information must be provided, and this information is in part dependent upon the type of data you are exporting.

The basic usage pattern is then:

  1. Create an ExportData instance with relevant input values

  2. Use the export(data) method to export data with it

The ExportData class can be imported like so:

from fmu.dataio import ExportData

or,

from fmu import dataio
# You can use dataio.ExportData directly

The following are the currently supported input values when creating an instance of the ExportData class. They are ordered first by whether or not they are required to create valid metadata, and also by how frequently they are used.

After the ExportData instance has been created with its initial values, those values cannot and should not be changed. This means that whenever data that requires different values is being exported, a new instance of ExportData must be created with those different values.
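Because each instance is tied to the values it was created with, one way to organise a script is to keep the shared settings in one place and build a fresh set of keyword arguments for every object. The following is only a minimal sketch of that pattern: the helper name `build_export_kwargs` and the setting values are illustrative assumptions, and the actual `ExportData` call is left as a comment so the snippet runs without fmu-dataio installed.

```python
# Settings shared by every export in this (hypothetical) script.
COMMON = {
    "content": "depth",
    "unit": "m",
    "vertical_domain": "depth",
    "domain_reference": "msl",
}


def build_export_kwargs(name, **overrides):
    """Return a fresh dict of keyword arguments for one ExportData instance.

    A new dict is built on every call, so changing the values for one
    export never affects another export.
    """
    return {**COMMON, "name": name, **overrides}


all_kwargs = [build_export_kwargs(n) for n in ["TopOne", "TopTwo"]]
time_kwargs = build_export_kwargs(
    "TopOne", content="time", vertical_domain="time", unit="s"
)

# With fmu-dataio available, each dict would feed its own new instance:
#     ed = dataio.ExportData(config=CFG, **kwargs)
#     ed.export(obj)
```

Values that differ between exports become overrides on a new argument set rather than changes to an existing instance.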

Some data types (also referred to as content types) place a requirement on otherwise optional fields.

class ExportData

This class provides context for the metadata generated when data is exported.

Here is a complete example of how it is used:

for name in ["TopOne", "TopTwo", "TopThree"]:
    poly = xtgeo.polygons_from_roxar(project, name, POL_FOLDER)

    ed = dataio.ExportData(
        config=CFG,
        content="depth",
        unit="m",
        vertical_domain="depth",
        domain_reference="msl",
        timedata=None,
        is_observation=False,
        tagname="faultlines",
        workflow="rms structural model",
        name=name
    )
    out = ed.export(poly)

In general, fmu-dataio tries to take care of exporting data automatically to conventional and standard locations. In the documentation below you might find references to the following terms.

pwd

The present working directory. This is the directory a script or application is started from.

rootpath

The directory that relative file names are relative to. This is auto-detected by fmu-dataio.

casepath

The path where the FMU case originates from (is started from). This should be equivalent to the rootpath in most circumstances.

Examples:

/project/foo/resmod/ff/2022.1.0/rms/model                   # pwd
/project/foo/resmod/ff/2022.1.0/                            # rootpath

A file:

/project/foo/resmod/ff/2022.1.0/share/results/maps/xx.gri   # example absolute
                                share/results/maps/xx.gri   # example relative

When running an Ert forward job using a normal Ert job (e.g. a script):

/scratch/nn/case/realization-44/iter-2                      # pwd
/scratch/nn/case                                            # rootpath

A file:

/scratch/nn/case/realization-44/iter-2/share/results/maps/xx.gri  # absolute
                 realization-44/iter-2/share/results/maps/xx.gri  # relative

When running an Ert forward job that is executed from RMS:

/scratch/nn/case/realization-44/iter-2/rms/model            # pwd
/scratch/nn/case                                            # rootpath

A file:

/scratch/nn/case/realization-44/iter-2/share/results/maps/xx.gri  # absolute
                 realization-44/iter-2/share/results/maps/xx.gri  # relative
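The relationship between the absolute and relative file names in the examples above can be reproduced with the standard library. This is a sketch of the path arithmetic only. It does not show how fmu-dataio detects the rootpath.

```python
from pathlib import PurePosixPath

rootpath = PurePosixPath("/scratch/nn/case")
pwd = PurePosixPath("/scratch/nn/case/realization-44/iter-2/rms/model")
absolute = PurePosixPath(
    "/scratch/nn/case/realization-44/iter-2/share/results/maps/xx.gri"
)

# A relative file name is the absolute path with the rootpath stripped off.
relative = absolute.relative_to(rootpath)
# relative is realization-44/iter-2/share/results/maps/xx.gri

# The pwd does not enter the calculation; here it simply lies below the rootpath.
pwd_is_below_root = rootpath in pwd.parents
```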

config: dict[str, Any] | GlobalConfiguration

Required in order to produce valid metadata.

This global config must be provided either as an input value here or through an environment variable.

This value should be a dictionary with static settings. In the standard case this is read from FMU global variables produced by fmuconfig. The dictionary must contain some predefined main level keys to work with fmu-dataio.

Note

If missing or empty, an export() may still be done, but without any metadata produced.

content: str | dict | None = None

A required string describing the content of the data, e.g. "volumes".

Warning

Using the content argument as a dict to set both the content and the content metadata will be deprecated. Set the content argument to a valid content string, and provide the extra information through the content_metadata argument instead.

Some content types, like "seismic", require additional information. This should be provided through the content_metadata argument described below.

The list of content types that can be provided is controlled and input values are validated against a current list of them. In the following enumeration you would use only the string values of the content type.

class Content

The content type of a given data object.

Content.depth = 'depth'

A data object representing depth values.

Typically provided as an xtgeo.RegularSurface or xtgeo.Grid for export.

Content.facies_thickness = 'facies_thickness'

Thickness map representing facies thickness, derived from a 3D grid.

Typically provided as an xtgeo.RegularSurface for export.

Content.fault_lines = 'fault_lines'

Intersections between fault planes and horizons.

Typically provided as an xtgeo.Polygons for export.

Content.fault_surface = 'fault_surface'

A surface representing a fault plane.

Typically provided either as an RMS FaultRoom GeoJSON surface or an fmu-dataio TSurfData for export.

Content.fault_properties = 'fault_properties'

Properties, such as permeability and porosity, on a fault.

Typically provided as a GeoJSON file derived from RMS FaultRoom for export.

Content.field_outline = 'field_outline'

Polygons representing the outline of a field, initial (static) conditions.

Typically provided as an xtgeo.Polygons for export.

Content.field_region = 'field_region'

Delineated or named region within a field.

Typically provided as an xtgeo.Polygons for export.

Content.fluid_contact = 'fluid_contact'

Depth surface representing a fluid contact used per realization.

Typically provided as an xtgeo.RegularSurface for export.

Content.khproduct = 'khproduct'

The product of permeability (k) and reservoir thickness (h).

Typically provided as an xtgeo.RegularSurface for export.

Content.lift_curves = 'lift_curves'

Table representing the relationship between production rates and pressures.

Typically provided as a Pandas Dataframe for export.

Content.mapping = 'mapping'

Tabular cross-references used to translate between different naming conventions or identifiers.

Acts as a bridge to align data across different domains, such as:

  * Official stratigraphy to model zonation.

  * Static reservoir regions/zones to simulator-specific identifiers (e.g., FIPGRP).

  * Unique Well Identifiers (UWI) to simulation well names.

Typically provided as a Pandas Dataframe for export.

Content.named_area = 'named_area'

A named area within a field that is _not_ a region.

Typically provided as an xtgeo.Polygons for export.

Content.observations = 'observations'

ERT observations generated for the ensemble.

Typically provided as a Pandas Dataframe for export.

Tip

You should not export this manually. This is done automatically by the CREATE_CASE_METADATA ERT workflow.

Content.production_network = 'production_network'

Tabular data representing the production group structure.

Typically provided as a Pandas Dataframe.

Tip

You should not export this manually. Use SIM2SUMO.

Content.pinchout = 'pinchout'

Polygons designating a pinchout.

Typically provided as an xtgeo.Polygons for export.

Content.property = 'property'

A property, like permeability or porosity, belonging to a 3D grid.

Typically provided as an xtgeo.GridProperty.

Tip

This content type requires additional input in the content_metadata field.

Grid property data handling is still immature. More comprehensive data categorization will come in the future.

Content.pvt = 'pvt'

Tabular pressure-volume-temperature data.

Typically provided as a Pandas Dataframe for export.

Tip

You should not export this manually. Use SIM2SUMO.

Content.regions = 'regions'

Distinct areas within the field that have different characteristics.

Examples may be volume regions or contact regions.

Typically provided as an xtgeo.Polygons or xtgeo.GridProperty.

Content.relperm = 'relperm'

Tabular relative permeability data.

Typically provided as a Pandas Dataframe for export.

Tip

You should not export this manually. Use SIM2SUMO.

Content.rft = 'rft'

Tabular reservoir formation test data.

Tip

You should not export this manually. Use SIM2SUMO.

Content.seismic = 'seismic'

Data that is seismic in nature, including seismic cubes and surface data derived from seismic cubes.

Typically provided as an xtgeo.Cube, xtgeo.RegularSurface, or other.

Tip

This content type requires additional input in the content_metadata field.

Seismic data handling is still immature. More comprehensive data categorization will come in the future.

Content.simulationtimeseries = 'simulationtimeseries'

Time-series data generated by a reservoir simulator like OPM Flow or Eclipse.

For example, a summary file parsed into a Pandas Dataframe by res2df.

Tip

You should not export this manually. Use SIM2SUMO.

Content.subcrop = 'subcrop'

Surface or polygon representing a subcrop area.

Typically provided as an xtgeo.RegularSurface or xtgeo.Polygons for export.

Content.thickness = 'thickness'

A thickness map.

Typically provided as an xtgeo.RegularSurface for export.

Content.time = 'time'

A seismic time surface or seismic cube in time domain.

Typically provided as an xtgeo.RegularSurface or xtgeo.Cube.

Content.transmissibilities = 'transmissibilities'

Tabular data containing transmissibilities (neighbour and non-neighbour connections).

Typically provided as a Pandas Dataframe.

Tip

You should not export this manually. Use SIM2SUMO.

Content.velocity = 'velocity'

A seismic velocity map represented as a regular surface or a cube.

Typically provided as an xtgeo.RegularSurface or xtgeo.Cube for export.

Content.volumes = 'volumes'

Tabulated inplace volumes per grid, initial (static) conditions.

Typically provided as a Pandas Dataframe.

Content.well_completions = 'well_completions'

Tabular data representing well completions.

Typically provided as a Pandas Dataframe.

Tip

You should not export this manually. Use SIM2SUMO.

Content.wellpicks = 'wellpicks'

Tabular data representing wellpicks.

Typically provided as a Pandas Dataframe.

content_metadata: dict | None = None

Optional. Dictionary with additional information about the provided content. Only required for some content types, e.g. "seismic".

Example:

content_metadata={"attribute": "amplitude", "calculation": "mean"},
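The rule that some content types need content_metadata can be pictured as a small pre-flight check in your own script. This is a hedged sketch and not fmu-dataio's validation code: the function name `check_content_args` is hypothetical, and the set of content types is taken from the tips in this section ("seismic" and "property") rather than from the library itself.

```python
# Content types that this section says need extra information.
# This set is an assumption for illustration, not the library's own list.
NEEDS_CONTENT_METADATA = {"seismic", "property"}


def check_content_args(content, content_metadata=None):
    """Raise ValueError if content_metadata is missing where it is needed."""
    if content in NEEDS_CONTENT_METADATA and not content_metadata:
        raise ValueError(f"content={content!r} requires content_metadata")
    return {"content": content, "content_metadata": content_metadata}


ok = check_content_args(
    "seismic",
    content_metadata={"attribute": "amplitude", "calculation": "mean"},
)
plain = check_content_args("depth")

try:
    check_content_args("seismic")
    missing_rejected = False
except ValueError:
    missing_rejected = True
```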

classification: str | None = None

Optional. Security classification level of the data object.

If present it will override the default found in the config.

The list of classification types that can be provided is controlled and input values are validated against a current list of them. In the following enumeration you would use only the string values of the classification type.

class Classification

The security classification for a given data object.

Classification.internal = 'internal'

Grants access to all users with READ access to the asset.

The READ role is an access role defined by the asset’s Unix and Sumo groups. This is the default for most data.

Classification.restricted = 'restricted'

Grants access to all users with WRITE access to the asset.

The WRITE role is an access role defined by the asset’s Unix and Sumo groups. This is the default for some sensitive data, like volumes, but in general must be explicitly set when restricted access is desired.

domain_reference: str = 'msl'

Optional. Reference to the vertical scale of the data.

The list of domain references that can be provided is controlled and input values are validated against a current list of them. In the following enumeration you would use only the string values of the domain reference.

class DomainReference

DomainReference.msl = 'msl'

In reference to Mean Sea Level.

DomainReference.sb = 'sb'

In reference to Sea Bottom.

DomainReference.rkb = 'rkb'

In reference to Rotary Kelly Bushing (RKB).

Note

Use the vertical_domain key to set the domain (depth or time).

vertical_domain: str | dict = 'depth'

Optional. The vertical domain of the data.

The list of vertical domains that can be provided is controlled and input values are validated against a current list of them. In the following enumeration you would use only the string values of the vertical domain.

class VerticalDomain

VerticalDomain.depth = 'depth'

In the domain of depth.

VerticalDomain.time = 'time'

In the domain of time.

A reference for the vertical scale can be provided with the domain_reference value.

Note

If the content is "depth" or "time" this value will be set accordingly.

Warning

Providing a dictionary as a value is deprecated.
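The interaction between content and vertical_domain described in the note can be sketched as follows. The helper `resolve_vertical_domain` is hypothetical and only mirrors the behaviour stated in this section. It is not the library's implementation.

```python
VERTICAL_DOMAINS = ("depth", "time")


def resolve_vertical_domain(content, vertical_domain="depth"):
    """Return the vertical domain, letting "depth"/"time" content decide it."""
    if content in VERTICAL_DOMAINS:
        # Content "depth" or "time" sets the vertical domain accordingly.
        return content
    if vertical_domain not in VERTICAL_DOMAINS:
        raise ValueError(f"unknown vertical_domain: {vertical_domain!r}")
    return vertical_domain


from_content = resolve_vertical_domain("time")
explicit = resolve_vertical_domain("seismic", vertical_domain="time")
default = resolve_vertical_domain("volumes")

try:
    resolve_vertical_domain("seismic", vertical_domain="fault_lines")
    invalid_rejected = False
except ValueError:
    invalid_rejected = True
```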

geometry: str | None = None

Optional. For grid properties only which need a reference to the 3D grid geometry object.

The value must point to an existing file which has already been exported with fmu-dataio, and hence has an associated metadata file. The grid name will be derived from the grid metadata, if present, and applied as part of the grid property file name.

Note

This value may replace the usage of both the parent value and the grid_model value in the near future.

is_observation: bool = False

If True then data will be exported to the share/observations/ directory.

By default this is False which will export results to the share/results/ directory.

However, if preprocessed is True, then the export directory will be set to share/preprocessed/ irrespective of the value of is_observation.
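The precedence between is_observation and preprocessed can be summarised in a few lines. This is a sketch of the rule as stated here, using a hypothetical helper name. It ignores forcefolder and any other setting that may influence the final location.

```python
def share_subdir(is_observation=False, preprocessed=False):
    """Return the share/ subdirectory implied by the two boolean flags."""
    if preprocessed:
        # preprocessed wins, whatever is_observation is set to.
        return "share/preprocessed"
    if is_observation:
        return "share/observations"
    return "share/results"


default_dir = share_subdir()
observation_dir = share_subdir(is_observation=True)
both_set_dir = share_subdir(is_observation=True, preprocessed=True)
```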

is_prediction: bool = True

Indicates if the exported data is model prediction data.

timedata: list[str] | list[list[str]] | None = None

Optional. List of dates, where the dates are strings of the form "YYYYMMDD".

timedata=["20200101"],
timedata=["20200101", "20180101"],

A maximum of two dates can be input. The oldest date will be set as t0 in the metadata and the latest date will be t1.

Note

It is also possible to provide a label to each date by using a list of lists, e.g. [["20200101", "monitor"], ["20180101", "base"]].
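To make the ordering rule concrete, the sketch below normalises both accepted input shapes and sorts the dates oldest first, so the first entry corresponds to t0 and the second to t1. The function `order_timedata` is hypothetical and does not reproduce the structure fmu-dataio writes to the metadata.

```python
from datetime import datetime


def order_timedata(timedata):
    """Return [(date, label), ...] sorted oldest first; at most two entries."""
    if timedata is None:
        return []
    if len(timedata) > 2:
        raise ValueError("a maximum of two dates can be given")
    entries = []
    for item in timedata:
        # Accept both "YYYYMMDD" and ["YYYYMMDD", "label"].
        if isinstance(item, str):
            date, label = item, None
        else:
            date = item[0]
            label = item[1] if len(item) > 1 else None
        datetime.strptime(date, "%Y%m%d")  # raises ValueError if malformed
        entries.append((date, label))
    # "YYYYMMDD" strings sort chronologically when sorted as text.
    return sorted(entries, key=lambda entry: entry[0])


plain = order_timedata(["20200101", "20180101"])
labelled = order_timedata([["20200101", "monitor"], ["20180101", "base"]])
```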

unit: str | None = ''

Optional. The measurement unit relevant to the exported data.

For example, "m" would be set if the measurement unit is meters.

Caution

This value is not currently controlled by a known list but will be in the future.

table_index: list[str] | None = None

Optional. A list of strings indicating the index columns for tabular data.

This value should be set for tabular data like Pandas data frames only.

Example:

table_index=["ZONE", "REGION"],

This can also be applied to points or polygons objects that are exported in table format to specify attributes that should act as index columns.

Tip

Index columns in tabular data refer to one or more columns that uniquely identify each row in the dataset. They serve as a reference point for data retrieval and manipulation, enabling simple and efficient access to specific rows.
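Whether a set of columns is a suitable table_index can be checked before exporting. The sketch below uses plain dictionaries in place of a data frame so that it runs with the standard library only. The rows are made-up toy data and `index_is_unique` is a hypothetical helper.

```python
def index_is_unique(rows, table_index):
    """Return True if the index columns identify every row uniquely."""
    keys = [tuple(row[col] for col in table_index) for row in rows]
    return len(keys) == len(set(keys))


# Toy rows for illustration only.
rows = [
    {"ZONE": "Upper", "REGION": "West", "VALUE": 1.2},
    {"ZONE": "Upper", "REGION": "East", "VALUE": 0.8},
    {"ZONE": "Lower", "REGION": "West", "VALUE": 2.1},
]

zone_and_region = index_is_unique(rows, ["ZONE", "REGION"])
zone_only = index_is_unique(rows, ["ZONE"])
```

Here "ZONE" alone repeats across rows, so it does not identify a row on its own, while the pair "ZONE" and "REGION" does.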

preprocessed: bool = False

If True, data is exported to the "share/preprocessed/" directory.

This metadata can be partially re-used in an Ert model run using the ExportPreprocessedData class.

Note

Most data are not preprocessed data, and as such this key shouldn’t often be used. An example of preprocessed data is seismic data.

description: str | list[str] = ''

Optional. A multi-line description of the data either as a string or a list of strings.

Tip

You do not need to set this.

display_name: str | None = None

Optional. Set a display name for clients to use when visualizing.

Tip

You do not need to set this.

name: str = ''

Optional. The name of the data object being exported.

If not set, fmu-dataio infers it from the object's data type. If the name is found in the stratigraphy static metadata list, the official stratigraphic name will be used.

For example, if "TopValysar" is the model name and the actual name is "Valysar Top Fm.", the latter name will be used.

Tip

You do not need to set this.

tagname: str = ''

Optional. A short tag description which will be a part of the file name.

As an example, if exporting a fault polygon from a horizon named "TopVolantis",

tagname="faultlines",

The exported filename will be volantis_gp_top--faultlines.csv

Tip

You do not need to set this, but it may be useful for local workflows.

workflow: str | dict[str, str] | None = None

Optional. Short string description of workflow.

Warning

Providing a dictionary as a value is deprecated.

Tip

You do not need to set this.

forcefolder: str = ''

Optional. This value allows exporting to a non-standard directory relative to the casepath/rootpath.

Warning

Using this option is generally not recommended.

This option is dependent upon the FMU context (case or realization) and the is_observation boolean value.

Example:

forcefolder="seismic",

This will replace the cubes/ standard directory for xtgeo.Cube output with seismic/.

Caution

Use with care and avoid if possible!

parent: str = ''

Optional. This value is required for datatype xtgeo.GridProperty, unless the geometry value is given.

“Parent” refers to the name of the grid geometry. It will only be added to the filename, and not as a genuine metadata entry.

Warning

This value is a candidate for deprecation. Use geometry instead.

If both parent and geometry are given, the grid name derived from the geometry object will have precedence.

casepath: str | Path | None = None

Optional. Path to a case directory that contains valid case metadata (fmu_case.yml) in the folder <CASE_DIR>/share/metadata/.

Tip

You typically do not need to set this.

Exporting Data

After creating the ExportData instance, you can then use the export() method to export data with it.

ExportData.export(obj, **kwargs)

Export supported data objects with metadata.

This function exports data without changing the content of the data. The file format of the data may be determined by values set in the class.

A file containing metadata will be exported next to it. It will have the same name as the data file, but prefixed with a dot (.) and with .yml appended, as in the example below. The leading dot means the metadata file is not shown by a standard ls command. The metadata is stored as YAML.

top_volantis--depth.gri
.top_volantis--depth.gri.yml
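Given the path returned by export(), the location of the accompanying metadata file follows from the naming rule shown in the example above. A minimal sketch, assuming the rule is exactly "leading dot plus .yml suffix"; the helper `metadata_path_for` is hypothetical and not part of fmu-dataio.

```python
from pathlib import PurePosixPath


def metadata_path_for(data_path):
    """Return the hidden metadata file path that sits next to a data file."""
    data_path = PurePosixPath(data_path)
    return data_path.with_name(f".{data_path.name}.yml")


meta = metadata_path_for("share/results/maps/top_volantis--depth.gri")
# meta is share/results/maps/.top_volantis--depth.gri.yml
```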

Parameters:

obj (Annotated[Cube | GridProperty | Grid | Points | Polygons | RegularSurface | DataFrame | FaultRoomSurface | TriangulatedSurface | MutableMapping | Table | Path | str]) – An xtgeo object, Pandas dataframe, or other supported object. A full list of supported data types can be found in the documentation.

Returns:

The full path to the exported item.

Return type:

str

Note

Providing **kwargs is deprecated and will be removed in a later version.

Supported Data Objects

fmu-dataio supports exporting most fundamental data types and objects used in reservoir modelling.

The following Python objects are supported by fmu-dataio. This means they can be passed to the export() method on an appropriately configured instance of ExportData.

xtgeo

The following xtgeo objects can be exported by fmu-dataio. Currently, this is all xtgeo types except for wells. These objects are documented in the xtgeo documentation.

Pandas Dataframes

Pandas dataframes representing tabular/csv data can be exported. This is the most common way to export tabular data.

Dataframes are exported as .csv files by default.

PyArrow Tables

PyArrow tables representing tabular/csv data can be exported.

PyArrow tables are exported as .parquet files.

Python Dictionaries

Python dictionaries containing structured data can be exported as well. This should be a last resort, i.e. used only when other pre-defined data types do not meet your needs.

Python dictionaries are exported as JSON files.

FaultRoom Surfaces

FaultRoom is an RMS plugin used in some FMU workflows. FaultRoom surfaces are GeoJSON files that can be created and exported by the FaultRoom plugin and have a particular format that is understood by fmu-dataio.

FaultRoom surfaces are exported as JSON files.

Something Missing?

If you have a particular data type you would like to export with fmu-dataio, but it is not supported, please reach out via:

Examples

Proceed to the Examples section to see some complete scripts which use ExportData to export custom results.