Geodata Discovery Specification - Discovery of Geodata and Geospatial Services
- Version 0.1, 2010-04-10, Stefan Keller -- First release of the specification.
The discovery of geodata and geospatial services is crucial to almost every project that involves geoprocessing, and therefore also to every (geo) data infrastructure.
General purpose search engines are not capable of recognizing geodata formats and geospatial services. Thus, there is a need to support geospatial search services, especially repositories and webcrawlers. The solution is twofold: first, help machines explicitly find the resources they are looking for (through relation links) and, second, let users recognize and control that they are browsing a web site which is enhanced with those machine-readable relationships (through an icon).
Instead of a web of (HTML) documents, this is part of a vision of a web of services (and XML), which is what the (semantic) Web 2.0 is mostly about. "So it is vital that the GeoWeb align itself with the web and the multitude of sources and endpoints that the web is reaching into." (citation from the High Earth Orbit blog).
Relationship to other standards
This specification doesn't require any change to the formats and standards mentioned except for OGC's WxS where it's an add-on. In fact, it's mostly a recommendation for a common usage of these formats.
robots.txt and sitemaps.xml from Google are somewhat similar because they refer to (forbidden) contents. The harvesting protocol OAI-PMH also includes a "See also" element. Beyond these, no similar standard for discovery is known to us.
The Geodata Discovery standard consists of Parts A and B (both are pure extensions of existing standards, called substandards):
- Part A specifies extensions to the syndication formats GeoRSS and Atom as well as protocol extensions to OGC's web services.
- Part B specifies how to use an icon in HTML webpages, including Geotags (and eventually microformats).
Syndication formats and protocols use xlinks to refer in a semantically clear way to geospatial webservices and geodata. These links provide the semantics that help focused crawlers find OGC services such as WMS, WFS, WCS, WPS and/or CSW.
The Geotag icon (i.e. its href HTML encoding) links back to the syndication formats and protocols.
Part A Syndication Format and Protocol Extensions (CONDITIONAL MANDATORY)
Specifies "See" and "See also" links (also called 'cross-links') in well-known formats ("carrier formats") containing typed web links (whenever possible 'xlink' or "Atom link relation types").
- Syndication format extensions
- GeoRSS file format: Use of GeoRSS to syndicate KML content, similar to RSS support in HTML. See the proposal here.
- Atom file format: rel and type attributes for link elements (see IETF Web Linking Draft, chap. 6.2)
<link rel="alternate" type="text/html" href="http://www.naturschutz.zh.ch/internet/bd/aln/ns/de/nsdaten/Geodaten/fns_wms.html"/>
<link rel="alternate" type="application/vnd.google-earth.kml+xml" href="http://example.org/kml_georss.kml"/>
<link rel="alternate" type="text/xml" href="http://www.gis.zh.ch/scripts/wmsFNSSVO.asp"/>
- Syndication extensions to OGC Web Services (protocol)
Extensions of GetCapabilities request with xlink tags ... (tbd.)!
'CONDITIONAL MANDATORY' means that at least one of the mentioned substandards needs to be implemented in order to comply with the "Geodata Discovery" standard.
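To illustrate how a focused crawler might consume the Atom link elements listed in Part A, here is a minimal sketch in Python. It is not part of the specification. The sample feed, the example.org URLs, the function name discover_links and the mapping KIND_BY_TYPE from media type to resource kind are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Atom namespace as defined by RFC 4287.
ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical mapping from media type to a coarse resource kind.
# The specification does not define such a table; it is an assumption.
KIND_BY_TYPE = {
    "text/html": "human-readable page",
    "application/vnd.google-earth.kml+xml": "KML geodata",
    "text/xml": "possible OGC service endpoint",
}


def discover_links(atom_xml):
    """Return (rel, type, href, kind) for every Atom link element with an href."""
    root = ET.fromstring(atom_xml)
    found = []
    for link in root.iter(ATOM_NS + "link"):
        href = link.get("href")
        if href is None:
            continue
        # In Atom, a missing rel attribute is interpreted as "alternate".
        rel = link.get("rel", "alternate")
        media_type = link.get("type", "")
        kind = KIND_BY_TYPE.get(media_type, "unknown")
        found.append((rel, media_type, href, kind))
    return found


# Made-up sample feed; the URLs are placeholders.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example geodata feed</title>
  <entry>
    <title>Example dataset</title>
    <link rel="alternate" type="text/html" href="http://example.org/dataset.html"/>
    <link rel="alternate" type="application/vnd.google-earth.kml+xml" href="http://example.org/kml_georss.kml"/>
    <link rel="alternate" type="text/xml" href="http://example.org/wms"/>
  </entry>
</feed>"""

if __name__ == "__main__":
    for rel, media_type, href, kind in discover_links(SAMPLE_FEED):
        print(rel, media_type, href, kind, sep=" | ")
```

A media type such as text/xml alone cannot tell a crawler which OGC service sits behind a link, so a real crawler would still have to probe the endpoint (for example with a GetCapabilities request) before classifying it.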
Part B Geotag Icon in HTML (OPTIONAL)
Specifies the placement of a Geotag icon (Microformat 'Geo') in an HTML webpage. The icon points to the syndication format (specified in Part A). It serves human readability and trustworthiness as well as general purpose webcrawlers (Google's principle of visual control).
tbd. Ideally there should be a website which lists conformance tests for a given website that was enhanced according to this specification.
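The exact markup of the Geotag icon is not yet fixed by this specification. As a rough sketch only, the following Python code shows how a crawler could follow such an icon back to the syndication format. It assumes that the icon is an img element carrying a hypothetical class name "geotag", wrapped in an anchor whose href is the link target. The class name, the sample page and the example.org URLs are illustrative assumptions, not part of the specification.

```python
from html.parser import HTMLParser


class GeotagLinkFinder(HTMLParser):
    """Collect hrefs of anchors that wrap an image marked as a Geotag icon.

    The marker class "geotag" is a hypothetical convention for this sketch.
    """

    def __init__(self, icon_class="geotag"):
        super().__init__()
        self.icon_class = icon_class
        self._open_hrefs = []  # hrefs of the currently open <a> elements
        self.targets = []      # link targets found behind Geotag icons

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "a":
            self._open_hrefs.append(attr.get("href"))
        elif tag == "img" and self._open_hrefs:
            classes = (attr.get("class") or "").split()
            href = self._open_hrefs[-1]
            if self.icon_class in classes and href:
                self.targets.append(href)

    def handle_endtag(self, tag):
        if tag == "a" and self._open_hrefs:
            self._open_hrefs.pop()


def find_geotag_targets(html_text):
    """Return the link targets of all Geotag icons in an HTML document."""
    finder = GeotagLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.targets


# Made-up sample page; only the first anchor wraps a Geotag icon.
SAMPLE_PAGE = """<html><body>
<p>Nature reserve data
<a href="http://example.org/feed.atom"><img class="geotag" src="geotag.png" alt="Geotag"/></a>
<a href="http://example.org/about.html"><img src="logo.png" alt="Logo"/></a>
</p></body></html>"""

if __name__ == "__main__":
    print(find_geotag_targets(SAMPLE_PAGE))
```

The targets returned here would then be fetched and interpreted as syndication documents according to Part A.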
At OGC there exists an interest group about discovery but there's no specification activity there yet.
This proposal does not compete with existing standards like CSW. On the contrary, it supplements them. It is the basis for better domain-specific search engines such as geocat.ch or geometa.info.
This proposal is based on the author's own experience and, among others, on the sources mentioned below.
- IETF Web Linking Draft (draft-nottingham-http-link-header-06) - Atom link relation types.
- XML Linking Language (XLink) Version 1.0.
- OpenSearch.org > OpenSearch Geo extension (Draft) - A lightweight harvesting interface (Remarks: "rel='search'" notifies an application that there is a service it can query to get at additional resources. This often also applies to web applications that can be called with permalinks).
- Microformats.org 'Geo'
- GeoWeb Standards – Discoverability from High Earth Orbit Blog, 2009-08-28.
- "Proposed standard for web linking" from Sean Gillies Blog, 2010-01-25.
- Version 1.00, 19.4.2010, Stefan Keller -- First release of the specification.