GeoCSV: Difference between revisions

From Geoinformation HSR

Revision as of 1 May 2015, 13:57

Specification of the tabular file format CSV (Comma Separated Values) with a geometry extension.

Date of last modification: see bottom, Author: Stefan. (For notes and discussion see Diskussion:GeoCSV).

Introduction

GeoCSV is an extension of the well-known, "human readable", tabular file format CSV (Comma Separated Values). CSV is a spartan format with possible information loss. Consider using more capable and elegant formats for desktop file exchange, e.g. GeoPackage.

On the other hand, it has some potential since it is considerably more capable than, e.g., a Shapefile. See also TheShapefileChallenge.

This format has the following drawbacks:

  • additional sidecar files that clutter the dataset, like .csvt and .prj
  • no layer name, except for the file name (which can easily be changed by others).

CSV file format specification

About the content:

  • The first row contains attribute names separated by the delimiter.
  • The following rows contain values separated by the delimiter.
  • The delimiter is a semicolon (;) by default.
  • Strings are enclosed in double quotes to allow delimiters inside (e.g. "string").
  • Data types (if supported by the source or target system): see CSVT file format specification.
  • All rows have the same number of attributes.
  • Calculations are possible in fields of type String (like "=A1+B1").
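
The content rules above map closely onto Python's standard csv module. The following is a minimal sketch, assuming the default semicolon delimiter and double quotes around strings (as in the "string" example); the helper name read_csv_rows and the sample text are illustrative only and not part of this specification.

```python
import csv
import io


def read_csv_rows(text, delimiter=";"):
    """Parse CSV text into (header, rows).

    read_csv_rows is a hypothetical helper, not part of the specification.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar='"')
    header = next(reader)  # first row: attribute names
    rows = []
    for line_no, row in enumerate(reader, start=2):
        # All rows must have the same number of attributes as the header.
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
        rows.append(row)
    return header, rows


sample = 'id;name;amount\n1;Kevin;2.1\n2;"Jimmy;Muff";2.3\n'
header, rows = read_csv_rows(sample)
print(header)   # ['id', 'name', 'amount']
print(rows[1])  # ['2', 'Jimmy;Muff', '2.3']
```

The quoted value keeps its embedded semicolon, while a row with a different number of fields is rejected.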

About file extensions, encoding and compression:

  • Proposed file extension is .CSV (or .csv).
  • The file can be accompanied by the following two files with the same base name: .csvt for indicating field types (schema), and .prj for indicating the Coordinate Reference System (CRS) (see Shapefiles).
  • Compression in format .ZIP (or .zip) is also possible and encouraged.
  • Encoding is UTF-8 by default.
  • Line endings may be CR, LF or CR/LF.
  • Line breaks inside (String) fields are disallowed.
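
As an illustration of the packaging rules above, the following sketch reads a UTF-8 encoded CSV out of a ZIP archive using only the Python standard library. The helper name read_zipped_csv_text and the archive contents are assumptions made for this example; the archive is built in memory so that the snippet is self-contained.

```python
import io
import zipfile


def read_zipped_csv_text(zip_bytes, encoding="utf-8"):
    """Return the decoded text of the first CSV member of a ZIP archive.

    read_zipped_csv_text is a hypothetical helper, not part of the specification.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".csv"):  # accepts both .CSV and .csv
                return archive.read(name).decode(encoding)
    raise FileNotFoundError("no .csv member found in archive")


# Build a small archive in memory (illustrative data, CR/LF line endings).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("example1.csv", "id;remarks\r\n2;Zürich\r\n".encode("utf-8"))

text = read_zipped_csv_text(buffer.getvalue())
lines = text.splitlines()  # splitlines() handles CR, LF and CR/LF
print(lines)
```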

GeoCSV file format specification

GeoCSV is based on CSV. The extension comes in two variants: option easting/northing and option WKT.

Option "easting/northing" (longitude/latitude, similar to x/y in mathematics):

  • The geometry type Point is stored as two neighboring columns of type Float: one containing the easting coordinate and one containing the northing coordinate, separated by the common delimiter.
  • Example for the two easting/northing columns: "8.8249;47.2274".
  • This option supports only Points.

Option WKT:

  • It is a single column of type String containing a constructor, for example: "POINT (8.8249 47.2274)".
  • This option supports Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon and even GeometryCollection and ARCs.
  • WKT ("Well Known Text") was originally defined by the Open Geospatial Consortium (OGC) and is described in their Simple Feature Access specification (also ISO SQL/MM). See e.g. http://en.wikipedia.org/wiki/Well-known_text
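
To make the two options concrete, here is a minimal sketch that turns either representation into an (easting, northing) tuple. It is deliberately limited to 2D points; the function names are hypothetical, and the regular expression is a simplification rather than a full WKT parser. For the other geometry types, an existing WKT parser (for example shapely.wkt.loads) would be the more realistic choice.

```python
import re

# Matches only the simplest 2D form, e.g. "POINT (8.8249 47.2274)".
# The keyword is matched case-insensitively.
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*([^\s()]+)\s+([^\s()]+)\s*\)\s*$", re.IGNORECASE
)


def point_from_columns(easting, northing):
    """Option easting/northing: two neighboring columns (hypothetical helper)."""
    return float(easting), float(northing)


def point_from_wkt(wkt):
    """Option WKT: one String column holding a POINT constructor (hypothetical helper)."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"not a 2D WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


print(point_from_columns("8.8249", "47.2274"))   # (8.8249, 47.2274)
print(point_from_wkt("POINT (8.8249 47.2274)"))  # (8.8249, 47.2274)
```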

Common restrictions:

  • Coordinate system is WGS84 (EPSG:4326) by default.
  • Only one geometry column is allowed per table.
  • All geometry values within one table are in the same coordinate reference system (CRS).

CSVT file format specification

Field/column types, case insensitive (if supported by the source or target system):

  • Integer
  • Real
  • String
  • Date ("YYYY-MM-DD"), Time ("HH:MM:SS+nn") and DateTime ("YYYY-MM-DD HH:MM:SS+nn")
  • (Lon/Lat)
  • (WKT)

Notes:

  • The geometry types are a kind of subtype: easting and northing values are stored as Float; with option WKT, the geometry is stored in one column of type String.
  • See also http://www.gdal.org/drv_csv.html section with .csvt extension.
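
The sketch below shows one way the type names from a .csvt file could be applied to the string values of a CSV row. It covers only a subset of the types listed above, does not handle empty values, and accepts both comma and semicolon as separators in the .csvt line, which is an assumption of this example. The names CONVERTERS, parse_csvt and cast_row are hypothetical.

```python
import re
from datetime import date

# Converters for a subset of the CSVT types. Keys are lower-cased because
# type names are case insensitive.
CONVERTERS = {
    "integer": int,
    "real": float,
    "string": str,
    "date": date.fromisoformat,  # expects "YYYY-MM-DD"
    "wkt": str,                  # WKT geometries stay strings
}


def parse_csvt(line):
    """Split a one-line .csvt schema into lower-cased type names."""
    parts = re.split(r"[;,]", line.strip())
    return [part.strip().strip('"').lower() for part in parts]


def cast_row(row, types):
    """Convert the string values of one CSV row according to the schema."""
    if len(row) != len(types):
        raise ValueError("row and schema differ in length")
    return [CONVERTERS[type_name](value) for value, type_name in zip(row, types)]


types = parse_csvt("Integer,String,Real,String,WKT")
row = ["1", "Kevin", "2.1", "Rapperswil", "POINT(8.8249 47.2274)"]
print(cast_row(row, types))
```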

Software

Examples

CSV type file 'example1.csvt':

Integer,String,Real,String,WKT

CSV file 'example1.csv':

id;name;amount;remarks;geom
1;Kevin;2.1;Rapperswil;point(8.8249 47.2274)
2;Eva;2.2;Zürich;point(8.5435 47.3768)
3;"Jimmy;Muff";2.3;"";point(7.4397 46.9487)

...can be shown as following table:

id  name        amount  remarks     geom
1   Kevin       2.1     Rapperswil  POINT(8.8249 47.2274)
2   Eva         2.2     Zürich      POINT(8.5435 47.3768)
3   Jimmy;Muff  2.3                 POINT(7.4397 46.9487)

Note the remarks string in row 2 and the empty remarks string in row 3.