7.4.1.2 Schema mapping in SQL
If your data is in a relational database with multiple tables that must be joined to generate the GeoSciML-Portrayal fields, it may be easier to write an SQL query to do the mapping. The following example demonstrates a schema mapping query. The source schema is the USGS/AASG National Cooperative Geologic Mapping Program NCGMP09 database schema (NCGMP, 2010), and the target is a GeoSciML-Portrayal GeologicUnitView. This is meant as an illustrative example of the kind of processing that may be required. The level of difficulty in the schema mapping is largely determined by how similar the source relational database data model is to the GeoSciML conceptual model, and how much content the data provider wishes to make available in the portrayal view.
The below code example provides an SQL query that creates a table with fields conforming to the GeologicUnitView for the GeoSciML-Portrayal schema. The SQL is PostGIS dialect; notes on the query construction are at the end of the code example, and details for each field follow. This example is meant to be illustrative of the kind of processing that may be required. The level of difficulty in the schema mapping is largely determined by how similar the internal data model schema is to the GeoSciML conceptual model, and how much content there is that the data provider wishes to make available in the portrayal view.
Notes on schema mapping example
Line numbers refer to numbered lines of text in Code example 1. The Target table is in an ESRI Spatial Database Engine (SDE) geodatabase implemented in PostGIS.
- Line 1
- Result table is placed in PostGIS schema named sde (this is an ESRI geodatabase).
- Line 3
- User-defined unique polygon identifiers from the source dataset are carried into the interchange format view.
- Line 6
- Source data map units are not categorized into specific unit types (e.g. lithostratigraphic, chronostratigraphic, geophysical, etc.), so this field gets a constant value ‘Geologic Unit’, which is the preferred label for the corresponding concept in the CGI Geologic Unit Type vocabulary
- Line 7
- Rank is not assigned for the regional geologic units in the source dataset, so this is a constant that is the preferred label (‘Not Specified’) for the corresponding concept in the CGI Stratigraphic Rank vocabulary.
- Line 8
- Text descriptions of lithology and the age of the unit are derived from corresponding fields in the source data. The lithology description is mapped at the individual polygon level (via the polyextattr Postgres view), note the JOIN in line 22 that uses the .mapunitpolys_id as the foreign key
- Line 9
- The age text description is mapped at the map unit level, note the JOIN in line 24 that uses the .mapunit field as the foreign key.
- Line 10
- The text for the source field in the portrayal schema is calculated by concatenating two fields (.source and .notes) from the datasources table in the input dataset.
- Line 11
- Since all of the map units are classified simply as ‘Geologic Unit’, this URI is also a constant value from the CGI Geologic Unit Type vocabulary. Note that the URL in the example text has been abbreviated (...) to save space.
- Lines 12, 13
- Representative lithology and age categories are assigned on a polygon by polygon basis, thus the polyextattr table is the source of these URIs. Note the JOIN in line 22 that uses the .mapunitpolys_id as the foreign key
- Lines 14, 15
- Younger and older age bounds are assigned by map unit (not at the individual polygon level), so these fields are sourced from the mapunitages aggregation query in the database; this query uses the extendedattributes table to join map units with geologicevent records that contain the lower and upper age bounds for the unit. Note that GeoSciML-Portrayal v1 uses the XML element names ‘representativeUpperAge_uri’ and ‘representativeLowerAge_uri’; these are changed to ‘representativeOlderAge_uri’ and ‘representativeYoungerAge_uri’ respectively in GeoSciML-Portrayal v.2 used here to avoid ambiguity in the interpretation of ‘upper’ and ‘lower’.
- Line 16
- specification_uri is a link to a more complete description of the geologic unit that crops out in the polygon’s extent. This description can be viewed as a resource in the context of web architecture. The intention of the GeoSciML design team was that this URI should dereference to return a full GeoSciML GeologicUnit element instance. With the use of content negotiation on the web, this description might also have representations as a web page or other structured description (rdf, owl...). In the example instance, a more complete description is not available, so an OGC nil URI is used to indicate that the resource is missing (does not exist).
- Line 17
- metadata_uri identifies a metadata resource that contains more complete information on the provenance of the information in the feature element. This may be a metadata record that is scoped to the individual feature, or may identify a metadata record describing some collection of polygons that have similar enough provenance to document together. The text in the source field (Line 11) should be a succinct summary of the information in this metadata record pertinent to the containing feature.
- Line 18
- The genericSymbolizer field should contain an identifier for the symbol used to display this polygon in the default portrayal (legend) chosen by the data provider for the GeologicUnitView instance in the containing feature collection. This field can be used to capture the map unit assignment and legend color scheme from the original data from which the GeologicUnitView polygon was digitized. The legend may be encoded in an accompanying Styled Layer Descriptor (SLD) file, and if such an SLD exists it should be recorded as a related resource in the metadata record specified by metadata_uri.
- Line 19
- The shape field contains a representation of the geometry of the outcrop area described by the GeologicUnitView instance. The content of this field will be managed by the GIS and web map service server, and generally will not need to be manipulated by users outside the GIS environment. The shape field may need to be cast into a recognized geometry field in order to get PostGIS and ESRI SDE to recognize the output as a feature class. In our configuration at AZGS, we ended up using this: ’st_geometryfromtext(ST_AsText(shape)::text, 4326)::geometry as shape‘. More expert PostGIS users may have a better solution for this problem.
- Line 20
- In order for ArcGIS server to recognize this as a feature class that may be used to source a service an integer unique value field must be present. Note that the GeoSciML-Portrayal v.2 schema allows any xml elements to follow the shape field. Since the ObjectID field is not present in the GeoSciML-Portrayal model, but must be present to deploy the service using ArcGIS server, it is placed here in the instance document. Other server or client application required fields can be placed following this element. In the example, the field from the source dataset is used because it serves as an ESRI feature class unique integer identifier there. This field does not need to be set up as an autoincrement field because the resulting table is a read-only view of the data.
Section last modified: 8 October 2015