dataio package
Top-level package for fmu-dataio.
- exception DeprecationError
Bases: ValueError
Raised when deprecated argument usage is invalid.
- class ExportData
Bases: object
This class provides context for the metadata generated when data is exported.
Here is a complete example of how it is used:
import xtgeo
from fmu import dataio

# project, CFG and POL_FOLDER are assumed to be defined elsewhere (e.g. in RMS).
for name in ["TopOne", "TopTwo", "TopThree"]:
    poly = xtgeo.polygons_from_roxar(project, name, POL_FOLDER)
    ed = dataio.ExportData(
        config=CFG,
        content="depth",
        unit="m",
        vertical_domain="depth",
        domain_reference="msl",
        timedata=None,
        is_observation=False,
        tagname="faultlines",
        workflow="rms structural model",
        name=name,
    )
    out = ed.export(poly)
In general, fmu-dataio tries to take care of exporting data automatically to conventional and standard locations. In the documentation below you might find references to the following terms.
pwd: The present working directory. This is the directory a script or application is started from.
rootpath: The directory that relative file names are resolved against. This is auto-detected by fmu-dataio.
casepath: The path where the FMU case originates from (is started from). This should be equivalent to the rootpath in most circumstances.
Examples:
/project/foo/resmod/ff/2022.1.0/rms/model  # pwd
/project/foo/resmod/ff/2022.1.0/  # rootpath
A file:
/project/foo/resmod/ff/2022.1.0/share/results/maps/xx.gri  # example absolute
share/results/maps/xx.gri  # example relative
When running an Ert forward job using a normal Ert job (e.g. a script):
/scratch/nn/case/realization-44/iter-2  # pwd
/scratch/nn/case  # rootpath
A file:
/scratch/nn/case/realization-44/iter-2/share/results/maps/xx.gri  # absolute
realization-44/iter-2/share/results/maps/xx.gri  # relative
When running an Ert forward job but here executed from RMS:
/scratch/nn/case/realization-44/iter-2/rms/model  # pwd
/scratch/nn/case  # rootpath
A file:
/scratch/nn/case/realization-44/iter-2/share/results/maps/xx.gri  # absolute
realization-44/iter-2/share/results/maps/xx.gri  # relative
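The relationship between an absolute path, the rootpath and the resulting relative path can be illustrated with the standard library alone. The following is a minimal sketch, not fmu-dataio's own implementation; the helper name relative_to_rootpath is hypothetical, and the rootpath is passed in explicitly rather than auto-detected.

```python
from pathlib import PurePosixPath


def relative_to_rootpath(absolute: str, rootpath: str) -> str:
    """Express an absolute file path relative to a given rootpath.

    Hypothetical helper for illustration only; fmu-dataio auto-detects
    the rootpath, whereas here it is supplied by the caller.
    """
    return str(PurePosixPath(absolute).relative_to(PurePosixPath(rootpath)))


rel = relative_to_rootpath(
    "/scratch/nn/case/realization-44/iter-2/share/results/maps/xx.gri",
    "/scratch/nn/case",
)
print(rel)  # realization-44/iter-2/share/results/maps/xx.gri
```

The same call with the pwd in place of the rootpath would give a different relative path, which is why the two terms are kept apart in the examples above.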
- config: dict[str, Any] | GlobalConfiguration
Required in order to produce valid metadata.
This global config must be provided either as an input value here or through an environment variable.
This value should be a dictionary with static settings. In the standard case this is read from FMU global variables produced by fmuconfig. The dictionary must contain some predefined main level keys to work with fmu-dataio.
Note
If missing or empty, an export() may still be done, but without any metadata produced.
- content: str | dict | None = None
A required string describing the content of the data, e.g. "volumes".
Warning
Using the content argument as a dict to set both the content and the content metadata will be deprecated. Set the content argument to a valid content string, and provide the extra information through the content_metadata argument instead.
Some content types, like "seismic", require additional information. This should be provided through the content_metadata argument described below.
The list of content types that can be provided is controlled and input values are validated against a current list of them. In the following enumeration you would use only the string values of the content type.
- class Content
The content type of a given data object.
- Content.depth = 'depth'
A data object representing depth values.
Typically provided as an xtgeo.RegularSurface or xtgeo.Grid for export.
- Content.facies_thickness = 'facies_thickness'
Thickness map representing facies thickness, derived from a 3D grid.
Typically provided as an xtgeo.RegularSurface for export.
- Content.fault_lines = 'fault_lines'
Intersections between fault planes and horizons.
Typically provided as an xtgeo.Polygons for export.
- Content.fault_surface = 'fault_surface'
A surface representing a fault plane.
Typically provided either as an RMS FaultRoom GeoJSON surface or an fmu-dataio TSurfData for export.
- Content.fault_properties = 'fault_properties'
Properties, such as permeability and porosity, on a fault.
Typically provided as a GeoJSON file derived from RMS FaultRoom for export.
- Content.field_outline = 'field_outline'
Polygons representing the outline of a field, initial (static) conditions.
Typically provided as an xtgeo.Polygons for export.
- Content.field_region = 'field_region'
Delineated or named region within a field.
Typically provided as an xtgeo.Polygons for export.
- Content.fluid_contact = 'fluid_contact'
Depth surface representing a fluid contact used per realization.
Typically provided as an xtgeo.RegularSurface for export.
- Content.khproduct = 'khproduct'
The product of permeability (k) and reservoir thickness (h).
Typically provided as an xtgeo.RegularSurface for export.
- Content.lift_curves = 'lift_curves'
Table representing the relationship between production rates and pressures.
Typically provided as a Pandas DataFrame for export.
- Content.mapping = 'mapping'
Tabular cross-references used to translate between different naming conventions or identifiers.
Acts as a bridge to align data across different domains, such as:
* Official stratigraphy to model zonation.
* Static reservoir regions/zones to simulator-specific identifiers (e.g., FIPGRP).
* Unique Well Identifiers (UWI) to simulation well names.
Typically provided as a Pandas DataFrame for export.
- Content.named_area = 'named_area'
A named area within a field that is not a region.
Typically provided as an xtgeo.Polygons for export.
- Content.observations = 'observations'
ERT observations generated for the ensemble.
Typically provided as a Pandas DataFrame for export.
Tip
You should not export this manually. This is done automatically by the CREATE_CASE_METADATA ERT workflow.
- Content.production_network = 'production_network'
Tabular data representing the production group structure.
Typically provided as a Pandas DataFrame.
Tip
You should not export this manually. Use SIM2SUMO.
- Content.pinchout = 'pinchout'
Polygons designating a pinchout.
Typically provided as an xtgeo.Polygons for export.
- Content.property = 'property'
A property, like permeability or porosity, belonging to a 3D grid.
Typically provided as an xtgeo.GridProperty.
Tip
This content type requires additional input in the content_metadata field.
Grid property data handling is still immature. More comprehensive data categorization will come in the future.
- Content.pvt = 'pvt'
Tabular pressure-volume-temperature data.
Typically provided as a Pandas DataFrame for export.
Tip
You should not export this manually. Use SIM2SUMO.
- Content.regions = 'regions'
Distinct areas within the field that have different characteristics.
Examples may be volume regions or contact regions.
Typically provided as an xtgeo.Polygons or xtgeo.GridProperty.
- Content.relperm = 'relperm'
Tabular relative permeability data.
Typically provided as a Pandas DataFrame for export.
Tip
You should not export this manually. Use SIM2SUMO.
- Content.rft = 'rft'
Tabular reservoir formation tests data.
Tip
You should not export this manually. Use SIM2SUMO.
- Content.seismic = 'seismic'
Data that is seismic in nature, including seismic cubes and surface data derived from seismic cubes.
Typically provided as an xtgeo.Cube, xtgeo.RegularSurface, or other.
Tip
This content type requires additional input in the content_metadata field.
Seismic data handling is still immature. More comprehensive data categorization will come in the future.
- Content.simulationtimeseries = 'simulationtimeseries'
Time-series data generated by a reservoir simulator like OPM Flow or Eclipse.
For example, a summary file parsed into a Pandas DataFrame by res2df.
Tip
You should not export this manually. Use SIM2SUMO.
- Content.subcrop = 'subcrop'
Surface or polygon representing a subcrop area.
Typically provided as an xtgeo.RegularSurface or xtgeo.Polygons for export.
- Content.thickness = 'thickness'
A thickness map.
Typically provided as an xtgeo.RegularSurface for export.
- Content.time = 'time'
A seismic time surface or seismic cube in time domain.
Typically provided as an xtgeo.RegularSurface or xtgeo.Cube.
- Content.transmissibilities = 'transmissibilities'
Tabular data containing transmissibilities (neighbour and non-neighbour connections).
Typically provided as a Pandas DataFrame.
Tip
You should not export this manually. Use SIM2SUMO.
- Content.velocity = 'velocity'
A seismic velocity map represented as a regular surface or a cube.
Typically provided as an xtgeo.RegularSurface or xtgeo.Cube for export.
- Content.volumes = 'volumes'
Tabulated inplace volumes per grid, initial (static) conditions.
Typically provided as a Pandas DataFrame.
- Content.well_completions = 'well_completions'
Tabular data representing well completions.
Typically provided as a Pandas DataFrame.
Tip
You should not export this manually. Use SIM2SUMO.
- Content.wellpicks = 'wellpicks'
Tabular data representing wellpicks.
Typically provided as a Pandas DataFrame.
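Validation of a content string against a controlled list can be pictured as a lookup in a string-valued enumeration. The sketch below is an assumption about the general technique, not fmu-dataio's actual code: the enum only carries a handful of the values listed above, and validate_content is a hypothetical helper.

```python
from enum import Enum


class Content(str, Enum):
    """Illustrative subset of the content types listed above."""

    depth = "depth"
    fault_lines = "fault_lines"
    seismic = "seismic"
    volumes = "volumes"


def validate_content(value: str) -> Content:
    """Return the matching member, or raise ValueError for unknown strings."""
    try:
        return Content(value)
    except ValueError:
        valid = ", ".join(member.value for member in Content)
        raise ValueError(
            f"Unknown content {value!r}; expected one of: {valid}"
        ) from None


print(validate_content("volumes").value)  # volumes
```

Only the string value ("volumes") is passed by the caller; the enum member itself stays an internal detail, which matches the advice to use only the string values.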
- content_metadata: dict | None = None
Optional. Dictionary with additional information about the provided content. Only required for some content types, e.g. "seismic".
Example:
content_metadata={"attribute": "amplitude", "calculation": "mean"},
- classification: str | None = None
Optional. Security classification level of the data object.
If present it will override the default found in the config.
The list of classification types that can be provided is controlled and input values are validated against a current list of them. In the following enumeration you would use only the string values of the classification type.
- class Classification
The security classification for a given data object.
- Classification.internal = 'internal'
Grants access to all users with READ access to the asset.
The READ role is an access role defined by the asset’s Unix and Sumo groups. This is the default for most data.
- Classification.restricted = 'restricted'
Grants access to all users with WRITE access to the asset.
The WRITE role is an access role defined by the asset’s Unix and Sumo groups. This is the default for some sensitive data, like volumes, but in general must be explicitly set when restricted access is desired.
- domain_reference: str = 'msl'
Optional. Reference to the vertical scale of the data.
The list of domain reference types that can be provided is controlled and input values are validated against a current list of them. In the following enumeration you would use only the string values of the domain reference type.
- class DomainReference
- DomainReference.msl = 'msl'
In reference to Mean Sea Level.
- DomainReference.sb = 'sb'
In reference to Sea Bottom.
- DomainReference.rkb = 'rkb'
In reference to Rotary Kelly Bushing (RKB).
Note
Use the vertical_domain key to set the domain (depth or time).
- vertical_domain: str | dict = 'depth'
Optional. The vertical domain of the data.
The list of vertical domain types that can be provided is controlled and input values are validated against a current list of them. In the following enumeration you would use only the string values of the vertical domain type.
- class VerticalDomain
- VerticalDomain.depth = 'depth'
In the domain of depth.
- VerticalDomain.time = 'time'
In the domain of time.
A reference for the vertical scale can be provided with the domain_reference value.
Note
If the content is "depth" or "time" this value will be set accordingly.
Warning
Providing a dictionary as a value is deprecated.
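The note about content overriding the vertical domain can be read as a small resolution rule. The following is a minimal sketch of that rule under the assumption that "depth" or "time" content always wins; resolve_vertical_domain is a hypothetical helper and not part of fmu-dataio.

```python
from typing import Optional

VERTICAL_DOMAINS = ("depth", "time")


def resolve_vertical_domain(
    content: Optional[str], vertical_domain: str = "depth"
) -> str:
    """Pick the vertical domain, letting "depth"/"time" content decide it."""
    if content in VERTICAL_DOMAINS:
        # Content of "depth" or "time" sets the domain accordingly.
        return content
    if vertical_domain not in VERTICAL_DOMAINS:
        raise ValueError(
            f"vertical_domain must be one of {VERTICAL_DOMAINS}, "
            f"got {vertical_domain!r}"
        )
    return vertical_domain


print(resolve_vertical_domain("time", "depth"))  # time
print(resolve_vertical_domain("seismic", "time"))  # time
```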
- geometry: str | None = None
Optional. Only for grid properties, which need a reference to the 3D grid geometry object.
The value must point to an existing file which has already been exported with fmu-dataio, and hence has an associated metadata file. The grid name will be derived from the grid metadata, if present, and applied as part of the grid property file name.
Note
This value may replace the usage of both the parent value and the grid_model value in the near future.
- is_observation: bool = False
If True then data will be exported to the share/observations/ directory.
By default this is False which will export results to the share/results/ directory.
However, if preprocessed is True, then the export directory will be set to share/preprocessed/ irrespective of the value of is_observation.
- is_prediction: bool = True
Indicates if the exported data is model prediction data.
- timedata: list[str] | list[list[str]] | None = None
Optional. List of dates, where the dates are strings of the form "YYYYMMDD".
timedata=["20200101"],
timedata=["20200101", "20180101"],
A maximum of two dates can be input. The oldest date will be set as t0 in the metadata and the latest date will be t1.
Note
It is also possible to provide a label to each date by using a list of lists, e.g. [["20200101", "monitor"], ["20180101", "base"]].
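How the dates end up as t0 and t1 regardless of input order can be sketched as follows. This is an illustration only: order_timedata is a hypothetical helper, and the returned dictionary of (date, label) tuples is an assumed shape, not the layout fmu-dataio writes to the metadata.

```python
from datetime import datetime


def order_timedata(timedata):
    """Map one or two dates to "t0" (oldest) and "t1" (latest).

    Each item is either a "YYYYMMDD" string or a [date, label] list.
    """
    if not 1 <= len(timedata) <= 2:
        raise ValueError("timedata accepts one or two dates")
    entries = []
    for item in timedata:
        if isinstance(item, str):
            date, label = item, None
        else:
            date, label = item[0], (item[1] if len(item) > 1 else None)
        parsed = datetime.strptime(date, "%Y%m%d")  # raises ValueError on bad dates
        entries.append((parsed, date, label))
    entries.sort(key=lambda entry: entry[0])
    return {
        key: (date, label)
        for key, (_, date, label) in zip(("t0", "t1"), entries)
    }


print(order_timedata([["20200101", "monitor"], ["20180101", "base"]]))
# {'t0': ('20180101', 'base'), 't1': ('20200101', 'monitor')}
```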
- unit: str | None = ''
Optional. The measurement unit relevant to the exported data.
For example, "m" would be set if the measurement unit is meters.
Caution
This value is not currently controlled by a known list but will be in the future.
- table_index: list[str] | None = None
Optional. A list of strings indicating the index columns for tabular data.
This value should be set for tabular data like Pandas data frames only.
Example:
table_index=["ZONE", "REGION"],
This can also be applied to points or polygons objects that are exported in table format to specify attributes that should act as index columns.
Tip
Index columns in tabular data refer to one or more columns that uniquely identify each row in the dataset. They serve as a reference point for data retrieval and manipulation, enabling simple and efficient access to specific rows.
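Whether a chosen set of index columns really identifies each row uniquely can be checked before export. The sketch below uses plain dictionaries so it runs without Pandas; index_is_unique is a hypothetical helper and the rows are made-up illustration values.

```python
def index_is_unique(rows, table_index):
    """Return True if the table_index columns uniquely identify every row."""
    seen = set()
    for row in rows:
        key = tuple(row[column] for column in table_index)
        if key in seen:
            return False
        seen.add(key)
    return True


# Made-up rows, for illustration only.
rows = [
    {"ZONE": "Upper", "REGION": "West", "VALUE": 1.0},
    {"ZONE": "Upper", "REGION": "East", "VALUE": 2.0},
    {"ZONE": "Lower", "REGION": "West", "VALUE": 3.0},
]

print(index_is_unique(rows, ["ZONE", "REGION"]))  # True
print(index_is_unique(rows, ["ZONE"]))  # False
```

Here "ZONE" alone repeats across rows, so it takes the pair of columns to act as an index.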
- preprocessed: bool = False
If True, data is exported to the "share/preprocessed/" directory.
This metadata can be partially re-used in an Ert model run using the ExportPreprocessedData class.
Note
Most data are not preprocessed data, and as such this key shouldn’t often be used. An example of preprocessed data is seismic data.
- description: str | list[str] = ''
Optional. A multi-line description of the data either as a string or a list of strings.
Tip
You do not need to set this.
- display_name: str | None = None
Optional. Set a display name for clients to use when visualizing.
Tip
You do not need to set this.
- name: str = ''
Optional. The name of the data object being exported.
If not set, fmu-dataio infers it from the object data type. If the name is found in the stratigraphy static metadata list, the official stratigraphic name will be used.
For example, if "TopValysar" is the model name and the actual name is "Valysar Top Fm.", the latter name will be used.
Tip
You do not need to set this.
- tagname: str = ''
Optional. A short tag description which will be a part of the file name.
As an example, when exporting a fault polygon from a horizon named "TopVolantis" with
tagname="faultlines",
the exported filename will be volantis_gp_top--faultlines.csv.
Tip
You do not need to set this, but it may be useful for local workflows.
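The filename in the example above follows a name--tagname.extension pattern. A minimal sketch of that pattern is given below, assuming the name part has already been normalised (how "TopVolantis" becomes "volantis_gp_top" is not covered here) and assuming an empty tagname simply drops the "--" separator. build_filename is a hypothetical helper, not an fmu-dataio function.

```python
def build_filename(name: str, tagname: str = "", extension: str = "csv") -> str:
    """Join an already-normalised name and an optional tagname with "--"."""
    stem = f"{name}--{tagname}" if tagname else name
    return f"{stem}.{extension}"


print(build_filename("volantis_gp_top", "faultlines"))  # volantis_gp_top--faultlines.csv
```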
- workflow: str | dict[str, str] | None = None
Optional. Short string description of workflow.
Warning
Providing a dictionary as a value is deprecated.
Tip
You do not need to set this.
- forcefolder: str = ''
Optional. This value allows exporting to a non-standard directory relative to the casepath/rootpath.
Warning
Using this option is generally not recommended.
This option is dependent upon the FMU context (case or realization) and the is_observation boolean value.
Example:
forcefolder="seismic",
This will replace the cubes/ standard directory for xtgeo.Cube output with seismic/.
Caution
Use with care and avoid if possible!
- parent: str = ''
Optional. This value is required for datatype xtgeo.GridProperty, unless the geometry value is given.
“Parent” refers to the name of the grid geometry. It will only be added in the filename, and not as a genuine metadata entry.
Warning
This value is a candidate for deprecation. Use geometry instead.
If both parent and geometry are given, the grid name derived from the geometry object will have precedence.
- casepath: str | Path | None = None
Optional. Path to a case directory that contains valid case metadata fmu_case.yml in folder <CASE_DIR>/share/metadata/.
Tip
You typically do not need to set this.
- aggregation: bool = False
- fmu_context: str | None = None
- rep_include: bool | None = None
- subfolder: str = ''
- undef_is_zero: bool = False
- case_folder: ClassVar[str] = 'share/metadata'
- polygons_fformat: ClassVar[str] = 'csv'
- points_fformat: ClassVar[str] = 'csv'
- table_fformat: ClassVar[str] = 'csv'
- access_ssdl: dict
- depth_reference: str | None = None
- realization: int | None = None
- reuse_metadata_rule: str | None = None
- runpath: str | Path | None = None
- verbosity: str = 'DEPRECATED'
- grid_model: str | None = None
- __init__(config=<factory>, content=None, content_metadata=None, classification=None, domain_reference='msl', vertical_domain='depth', geometry=None, is_observation=False, is_prediction=True, timedata=None, unit='', table_index=None, preprocessed=False, description='', display_name=None, name='', tagname='', workflow=None, forcefolder='', parent='', casepath=None, aggregation=False, fmu_context=None, rep_include=None, subfolder='', undef_is_zero=False, access_ssdl=<factory>, depth_reference=None, realization=None, reuse_metadata_rule=None, runpath=None, verbosity='DEPRECATED', grid_model=None)
- allow_forcefolder_absolute: ClassVar[bool] = False
- arrow_fformat: ClassVar[str | None] = None
- createfolder: ClassVar[bool] = True
- cube_fformat: ClassVar[str | None] = None
- filename_timedata_reverse: ClassVar[bool] = False
- grid_fformat: ClassVar[str | None] = None
- include_ertjobs: ClassVar[bool] = False
- legacy_time_format: ClassVar[bool] = False
- meta_format: ClassVar[Literal['yaml', 'json'] | None] = None
- surface_fformat: ClassVar[str | None] = None
- dict_fformat: ClassVar[str | None] = None
- table_include_index: ClassVar[bool] = False
- verifyfolder: ClassVar[bool] = True
- generate_metadata(obj, compute_md5=True, **kwargs)
Generate and return the complete metadata for a provided object.
An object may be a map, 3D grid, cube, table, etc. which is of a known and supported type.
Examples of such known types are XTGeo objects (e.g. a RegularSurface), a Pandas DataFrame, a PyArrow table, etc.
- Parameters:
obj (Annotated[Cube|GridProperty|Grid|Points|Polygons|RegularSurface|DataFrame|FaultRoomSurface|TriangulatedSurface|MutableMapping|Table|Path|str]) – XTGeo instance, a Pandas DataFrame instance or other supported object.
compute_md5 (bool) – Deprecated, an MD5 checksum will always be computed.
**kwargs (Any) – Using other ExportData() input keys is now deprecated; input the arguments when initializing the ExportData() instance instead.
- Return type: dict
- Returns: A dictionary with all metadata.
- export(obj, **kwargs)
Export supported data objects with metadata.
This function exports data without changing the content of the data. The file format of the data may be determined by values set in the class.
A file containing metadata will be exported next to it. It will have the same name as the data file, prefixed with a dot (.) and with .yml appended. This causes the metadata to not be visible by a standard ls command. The metadata is stored in a YAML file.
top_volantis--depth.gri
.top_volantis--depth.gri.yml
- Parameters:
obj (Annotated[Cube|GridProperty|Grid|Points|Polygons|RegularSurface|DataFrame|FaultRoomSurface|TriangulatedSurface|MutableMapping|Table|Path|str]) – An xtgeo object, Pandas DataFrame, or other supported object. A full list of supported data types can be found in the documentation.
- Returns: The full path to the exported item.
- Return type: str
Note
Providing **kwargs is deprecated and will be removed in a later version.
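The naming rule for the hidden metadata file can be expressed with pathlib. This is a sketch of the convention described for export() only; metadata_path_for is a hypothetical helper, and it assumes YAML metadata with a .yml suffix.

```python
from pathlib import PurePosixPath


def metadata_path_for(data_file: str) -> PurePosixPath:
    """Return the hidden metadata path stored next to a data file."""
    path = PurePosixPath(data_file)
    # Same directory, same name, prefixed with "." and suffixed with ".yml".
    return path.with_name(f".{path.name}.yml")


print(metadata_path_for("share/results/maps/top_volantis--depth.gri"))
# share/results/maps/.top_volantis--depth.gri.yml
```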
- class ExportPreprocessedData
Bases: object
Export a preprocessed file and its metadata into an FMU run at case level.
The existing metadata will be validated and three fields will be updated:
  - The 'fmu' block will be added with information about the existing FMU/ERT run.
  - The 'file' block will be updated with new file paths.
  - The 'tracklog' block will be extended with a new event tagged "merged".
Note that it is important that the preprocessed data have been created upfront with the ExportData class using the argument fmu_context='preprocessed'. This ensures that the file and metadata are stored in the 'share/preprocessed/' folder.
- Parameters:
casepath (str|Path) – Required casepath for the active ERT experiment. The case needs to contain valid case metadata, i.e. the ERT workflow 'WF_CREATE_CASE_METADATA' has been run prior to using this class.
is_observation (bool) – Default is True. If True, then disk storage will be in the "casepath/share/observations" folder, otherwise in "casepath/share/results".
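The three metadata updates listed for ExportPreprocessedData can be pictured as operations on a plain dictionary. The sketch below is an assumption-laden illustration, not the class's implementation: merge_preprocessed_metadata is a hypothetical helper, and the key names inside each block (relative_path, datetime, event) as well as the sample values are assumed for the example.

```python
import copy


def merge_preprocessed_metadata(meta, fmu_block, relative_path, timestamp):
    """Return a copy of meta with the 'fmu', 'file' and 'tracklog' updates."""
    missing = [key for key in ("file", "tracklog") if key not in meta]
    if missing:
        raise ValueError(f"Existing metadata lacks required blocks: {missing}")
    merged = copy.deepcopy(meta)  # leave the caller's metadata untouched
    merged["fmu"] = fmu_block  # 1. add the 'fmu' block
    merged["file"]["relative_path"] = relative_path  # 2. update file paths
    merged["tracklog"].append(  # 3. extend the tracklog
        {"datetime": timestamp, "event": "merged"}
    )
    return merged


# Made-up metadata, for illustration only.
existing = {
    "file": {"relative_path": "share/preprocessed/cubes/xx.segy"},
    "tracklog": [{"datetime": "2024-01-01T00:00:00", "event": "created"}],
}
updated = merge_preprocessed_metadata(
    existing,
    fmu_block={"case": {"name": "mycase"}},
    relative_path="share/observations/cubes/xx.segy",
    timestamp="2024-02-01T00:00:00",
)
print([entry["event"] for entry in updated["tracklog"]])  # ['created', 'merged']
```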
- exception InvalidMetadataError
Bases: Exception
Raised when valid metadata cannot be generated or returned.
- read_metadata(filename)
Read the metadata as a dictionary given a filename.
If the filename is e.g. /some/path/mymap.gri, the associated metafile will be /some/path/.mymap.gri.yml (or possibly a JSON file).
- Parameters:
filename (str|Path) – The full path filename to the data-object.
- Return type: dict
- Returns: A dictionary with metadata read from the associated metadata file.
Subpackages
- dataio.export package
- Subpackages
- dataio.export.rms package
export_structure_depth_fault_lines(), export_structure_depth_fault_surfaces(), export_structure_depth_surfaces(), export_structure_time_surfaces(), export_grid_extracted_depth_surfaces(), export_grid_model_static(), export_structure_depth_isochores(), export_inplace_volumes(), export_rms_volumetrics(), export_field_outline(), export_fluid_contact_surfaces(), export_fluid_contact_outlines(), create_fipnum_property()
- Submodules
- dataio.export.rms package
- Subpackages
- dataio.manifest package
Submodules
- dataio.dataio module
- dataio.exceptions module
- dataio.preprocessed module
- dataio.types module
- dataio.version module