Discussion:HSR Texas Geo Database Benchmark

Feel free to discuss 'HSR Texas Spatial Database Benchmark' issues here! Stefan 18:39, 22 Dec 2009 (CET)

Standardise the data and the input process (circumvent or minimize the floating-point problem)
Define the variables (@bbox and @point)

General:

MS SQL Server representatives report that geography types are 20-30% slower than geometry types.
For PostGIS 1.5, it was said that geography types are somewhat faster (see http://blog.cleverelephant.ca/).
Secondary index optimization makes performance about 50% faster (from: MS SQL Server discussion).
The number of points in a query window matters for the secondary filter overhead.

Data and queries per se:

These queries return a fairly large percentage of the data overall (50%-100% for points and polygons).
The number of points per query window is very small.
The query windows are "aligned bounding boxes", which may favor an R-tree (e.g. Postgres) or a spatial grid (SQL Server) index, depending on the position of the window. If the window is aligned with the grid, performance can be much faster than when it's slightly off.
Volker Mische reported that bounding-box queries with different bounding boxes (selecting 5%, 10%, 15%, 20%, 25% of all features) show the same performance in SpatiaLite.
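
To make the point about window size and position concrete, here is a rough Python sketch of how query windows covering a given share of the data extent could be generated, either centered or at a random offset (so that they are not always aligned the same way relative to an index grid). The function name `bbox_for_fraction` and the `EXTENT` values are assumptions for illustration only, not taken from the benchmark. Note also that a window covering 5% of the area only selects about 5% of the features if the features are spread roughly uniformly.

```python
import math
import random

def bbox_for_fraction(extent, fraction, rng=None):
    """Return (xmin, ymin, xmax, ymax) covering `fraction` of the extent's area.

    The window keeps the aspect ratio of the extent. Without `rng` it is
    centered; with `rng` it is placed at a random offset inside the extent.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    xmin, ymin, xmax, ymax = extent
    scale = math.sqrt(fraction)  # scale both sides so the area scales by `fraction`
    w = (xmax - xmin) * scale
    h = (ymax - ymin) * scale
    if rng is None:
        x0 = xmin + ((xmax - xmin) - w) / 2
        y0 = ymin + ((ymax - ymin) - h) / 2
    else:
        x0 = rng.uniform(xmin, xmax - w)
        y0 = rng.uniform(ymin, ymax - h)
    return (x0, y0, x0 + w, y0 + h)

# Hypothetical extent (lon/lat box, illustrative values only).
EXTENT = (-106.6, 25.8, -93.5, 36.5)
FRACTIONS = (0.05, 0.10, 0.15, 0.20, 0.25)

rng = random.Random(42)  # fixed seed so the windows are reproducible
windows = [bbox_for_fraction(EXTENT, f, rng) for f in FRACTIONS]
```

Each resulting tuple could then be used as the value for @bbox in a run.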

Should we take an average of multiple runs with different values (@point, @bbox)?
Should we vary the queries' result sizes? (see observations)
Should we vary the shape and alignment of the query window? (see observations)
Should we run the benchmark with parallelized queries?
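
If we do decide to average over multiple runs, one possible shape for the timing loop is sketched below in Python. It is only a sketch: `run_query` stands for whatever callable executes one benchmark query against a database for a given parameter value (e.g. one @bbox or @point); that callable and the name `time_query` are hypothetical and not part of the benchmark. Parallel execution is not covered here.

```python
import statistics
import time

def time_query(run_query, params, repeats=3):
    """Run `run_query(p)` `repeats` times for every p in `params`.

    Returns summary statistics of the wall-clock timings in seconds.
    """
    timings = []
    for p in params:
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(p)
            timings.append(time.perf_counter() - start)
    return {
        "runs": len(timings),
        "mean_s": statistics.mean(timings),
        "median_s": statistics.median(timings),
        "min_s": min(timings),
        "max_s": max(timings),
    }
```

Reporting the median and the min/max alongside the mean would make it easier to spot runs distorted by caching or by other load on the machine.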

Query 1 shouldn't actually be included in the benchmark: data copying and preparation aren't done often in a spatial database.