OSGeodata metadata exchange model

Proposal of an information model to exchange spatial metadata

Initial version by S.F. Keller. See also OSGeodata.

There is strong evidence that the architecture should be centered on data resources, so metadata primarily describes data resources. As a consequence, the type attribute (XML element 'dc:type'), probably in a specialized form, contains data access values such as File API, JDBC, HTTP GET, WMS, or WFS. These are all services that provide a programming interface for accessing data.

But there is also a need to describe services in their own right, also called filters, ultimately as separate metadata records. Normally you feed some data into such a service and get data back, so it is useful to know what kind of data the service accepts and what it returns. Examples are format conversion and coordinate transformation services.

Idea: Take the smallest possible information model for geographic metadata that describes both data and filter services. Dublin Core (DC) is the best-known such model. It must be extended using the existing DC specifications.


Dublin Core and its interpretation in geographic metadata

This is the full list of (possibly repeatable) DC attributes, each with its proposed semantic interpretation and/or enumeration list (this probably needs its own XML namespace):

  • Relation: reference - Reference to other metadata records - especially useful for map service types (WxS) pointing to data service types (= file, database?)
  • Type: text or enum - Protocol type, e.g. file, WMS layer, WFS feature set, etc. IMPORTANT FOR SERVICE DISCOVERY
  • Identifier: string - Unique id to identify a metadata record (use a URI for dc:identifier).
  • Title: string - Title
  • Coverage.box: - Rectangular box (mandatory) in WGS84
  • Coverage.name: string - (optional) a geographic name
  • Description: string - Some free text
  • Subject: enum - Classification from ISO 19115 as enum type (original: The topic of the resource. Typically, subject will be expressed as keywords or phrases that describe the subject or content of the resource. The use of controlled vocabularies and formal classification schemas is encouraged.)
  • Language: enum - ISO Code
  • Format: string - File type or name of originating source system
  • Source: string - Lineage information
  • Date: date - Publication date or date of last change
  • Creator: string - Data owner, else: data capturer
  • Contributor: string - Leave unused?
  • Publisher: string - Distribution information
  • Rights: string - License information about the data
  • Audience: string - Not used or 'GIS' as a constant
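
The attribute list above could be captured in a small data structure. The following is a minimal Python sketch, not part of the proposal: the class name, the field names, the selection of fields, and the (south, west, north, east) ordering of the box are assumptions made for illustration, and the example values are dummies.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed ordering: (south, west, north, east) in WGS84 decimal degrees.
BoundingBox = Tuple[float, float, float, float]


@dataclass
class MetadataRecord:
    """Hypothetical in-memory form of one metadata record (subset of attributes)."""
    identifier: str                      # Identifier: unique id, a URI
    title: str                           # Title
    type: str                            # Type: protocol type, e.g. "file", "WMS"
    coverage_box: BoundingBox            # Coverage.box: mandatory
    coverage_name: Optional[str] = None  # Coverage.name: optional
    description: str = ""                # Description: free text
    subject: Optional[str] = None        # Subject: ISO 19115 classification
    language: Optional[str] = None       # Language: ISO code
    formats: List[str] = field(default_factory=list)    # Format: repeatable
    relations: List[str] = field(default_factory=list)  # Relation: repeatable

    def __post_init__(self) -> None:
        # Reject boxes that cannot be valid WGS84 rectangles.
        south, west, north, east = self.coverage_box
        if not (-90 <= south <= north <= 90):
            raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
        if not (-180 <= west <= 180 and -180 <= east <= 180):
            raise ValueError("longitudes must lie in [-180, 180]")


# Dummy values; the identifier and the box are placeholders, not real data.
record = MetadataRecord(
    identifier="https://example.org/records/poi-rapperswil",
    title="POIs of Rapperswil-Jona",
    type="file",
    coverage_box=(47.20, 8.79, 47.25, 8.87),
    coverage_name="Rapperswil SG",
)
print(record.title, record.coverage_box)
```

Keeping Format and Relation as lists reflects the "possibly repeatable" remark above; whether the other attributes should be repeatable as well is left open here.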

Attributes to be discussed in detail

(Most are probably enums):

  • Relation (specialized for services): 'Relates to': Identifier of other data resource. Note that this may be hierarchical.
  • Relation (specialized for data): 'See also': URL as hint to other data resource. Note that this is an important feature for crawlers as long as no harvesting protocol exists (yet).
  • Type (specialized): Protocol type, URL to WMS etc.
  • Identifier (specialized): URL (not URN as in DC)
  • Coverage (specialized): Coverage.box and Coverage.name
  • Subject (specialized for data): cf. above.
  • Format (specialized for data): Not clear in unqualified DC how to point to original resource
  • Creator and Publisher (specialized): structured address including webpage and/or mail and phone
  • Resource accepted (new for services): In case of a service: Information about the kind of data (e.g. schemas) it can process. Note that metadata about filter services are standalone records.
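
To illustrate how the 'Resource accepted' attribute might support service discovery, here is a minimal Python sketch. It assumes, purely for illustration, that each service record is a dictionary with a `resource_accepted` list of format names; the function name, the key names, and the second service record are hypothetical.

```python
from typing import Any, Dict, List


def services_accepting(data_format: str, services: List[Dict[str, Any]]) -> List[str]:
    """Return the titles of service records that accept the given data format."""
    wanted = data_format.strip().lower()
    matches = []
    for service in services:
        # Compare case-insensitively against the declared accepted formats.
        accepted = {str(f).lower() for f in service.get("resource_accepted", [])}
        if wanted in accepted:
            matches.append(service["title"])
    return matches


services = [
    {"title": "Geographic File Format Converter at HSR",
     "type": "Converter",
     "resource_accepted": ["shapefile"]},
    # Hypothetical second filter service, added only for the example.
    {"title": "Coordinate transformation service",
     "type": "Transformer",
     "resource_accepted": ["shapefile", "georss"]},
]

print(services_accepting("shapefile", services))
```

A real implementation would more likely match on schemas than on bare format names, as the attribute description suggests.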

Example

These are preliminary thoughts...

Given the attributes described above and the two kinds of resource (data and service), here are two example metadata records: one about POI data for Rapperswil and one about a special service/filter:

  • Excerpt of a metadata record (instance) about a geographic data resource (showing only the most important dc elements and dummy values):
    • Title := POIs of Rapperswil-Jona # a data set, i.e. a set of geographic features
    • Coverage.name := Rapperswil SG
    • Language := de_ch
    • Date := 2006-08-14
    • Publisher := http:// www.hsr.ch/en/
    • Creator := S.F. Keller
    • Rights := LGPL for data
    • Type := file
    • Subject := Economy
    • Format.path := georss http:// www.gis.hsr.ch/data/poi_data_rapperswil.georss # a file!
    • Format.path := shapefile http:// www.gis.hsr.ch/data/poi_data_rapperswil.shp # a file!
    • Format.path := wms http:// www.gis.hsr.ch/wms # a web service!
    • Relation.see_also := http:// www.gis.zh.ch/
    • ...
  • Excerpt of a metadata record (instance) about a geographic service resource, a sort of filter (showing only the most important dc elements and dummy values):
    • Title := Geographic File Format Converter at HSR # a filter service
    • Coverage.name := Rapperswil SG
    • Language := de_ch
    • Date := 2006-08-14
    • Publisher := http:// www.hsr.ch/en/
    • Creator := S.F. Keller
    • Type := Converter
    • Format.path := converter http:// www.gis.hsr.ch/converter
    • Resource_accepted := URL to shape file
    • ...
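
The 'Key := value # comment' notation used in the two records above can be read mechanically. The following Python sketch is one possible reading, not a defined syntax: it assumes that a comment starts at ' # ' (with surrounding spaces), that keys may repeat, and it uses placeholder example.org URLs rather than the addresses from the records.

```python
from collections import defaultdict
from typing import Dict, List


def parse_record(text: str) -> Dict[str, List[str]]:
    """Parse 'Key := value # comment' lines into a dict of repeatable values."""
    record: Dict[str, List[str]] = defaultdict(list)
    for line in text.splitlines():
        # Drop a trailing comment; requiring spaces around '#' keeps URL fragments intact.
        line = line.split(" # ", 1)[0].strip()
        if ":=" not in line:
            continue  # skip blank lines and anything that is not an assignment
        key, value = line.split(":=", 1)
        record[key.strip()].append(value.strip())
    return dict(record)


sample = """
Title := POIs of Rapperswil-Jona # a data set
Coverage.name := Rapperswil SG
Type := file
Format.path := georss https://example.org/poi.georss # a file!
Format.path := shapefile https://example.org/poi.shp # a file!
"""

parsed = parse_record(sample)
for key, values in parsed.items():
    print(key, values)
```

Collecting values in lists keeps repeated attributes such as Format.path together, which matches the "possibly repeatable" nature of the DC attributes.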

Encoding?

  • Use GeoRSS (simple) encoding?
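
As a rough idea of what a GeoRSS-Simple encoding could look like, the following Python sketch builds an Atom entry fragment with a `dc:type` element and a `georss:box`. The combination of elements, the function name, and all values are assumptions for illustration; the box is written as "south west north east", and a complete Atom entry would need further elements such as `updated`.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
DC = "http://purl.org/dc/elements/1.1/"
GEORSS = "http://www.georss.org/georss"

# Register readable prefixes for serialization.
for prefix, uri in (("atom", ATOM), ("dc", DC), ("georss", GEORSS)):
    ET.register_namespace(prefix, uri)


def to_georss_entry(title, identifier, dc_type, box):
    """Serialize a few record attributes as an Atom entry with a georss:box."""
    south, west, north, east = box
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM}}}id").text = identifier
    ET.SubElement(entry, f"{{{DC}}}type").text = dc_type
    # GeoRSS-Simple box: lower corner (lat lon), then upper corner (lat lon).
    ET.SubElement(entry, f"{{{GEORSS}}}box").text = f"{south} {west} {north} {east}"
    return ET.tostring(entry, encoding="unicode")


# Dummy values; identifier and box are placeholders.
xml_text = to_georss_entry(
    "POIs of Rapperswil-Jona",
    "https://example.org/records/poi-rapperswil",
    "file",
    (47.2, 8.79, 47.25, 8.87),
)
print(xml_text)
```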