Discussion:HSR Texas Geo Database Benchmark: Difference between revisions

Current revision as of 6 September 2010, 10:02

Feel free to discuss 'HSR Texas Spatial Database Benchmark' issues here! Stefan 18:39, 22 Dec 2009 (CET)

Contents

1 To do
2 Observations
3 Discussion Points
4 Review

To do

Standardise the data and the input process (circumvent or minimize the floating-point problem)
Define the variables (@bbox and @point)
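One way to approach both to-do items is to generate the values for @bbox and @point as WKT text with a fixed number of decimals, so that every database receives exactly the same literal. The following is only a sketch: the helper names `point_wkt` and `bbox_wkt`, the default of 6 decimals and the sample coordinates are assumptions for illustration, not part of the benchmark definition.

```python
def point_wkt(x, y, digits=6):
    """Return a WKT POINT with a fixed number of decimals (hypothetical helper)."""
    return f"POINT({x:.{digits}f} {y:.{digits}f})"


def bbox_wkt(xmin, ymin, xmax, ymax, digits=6):
    """Return an axis-aligned box as a closed WKT POLYGON ring (hypothetical helper)."""
    corners = [
        (xmin, ymin),
        (xmax, ymin),
        (xmax, ymax),
        (xmin, ymax),
        (xmin, ymin),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{x:.{digits}f} {y:.{digits}f}" for x, y in corners)
    return f"POLYGON(({ring}))"


# Sample values only; the real @point and @bbox still have to be defined.
point_literal = point_wkt(-97.7431, 30.2672)
bbox_literal = bbox_wkt(-98.0, 30.0, -97.0, 31.0, digits=1)
```

Passing the same text literal to every system does not remove floating-point differences inside the databases, but it at least rules out differences introduced during input.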

Observations

General:

MS SQL Server representatives report that geography types are 20-30% slower than geometry types.
For PostGIS 1.5 it has been said that geography types are somewhat faster (see http://blog.cleverelephant.ca/).
Secondary index optimization improves performance by about 50% (from: MS SQL Server discussion).
The number of points in a query window matters for the secondary filter overhead.

Data and queries per se:

These queries return quite a large percentage of the data overall (50%-100% for points and polygons).
The number of points per query window is very small.
The query windows are "aligned bounding boxes", which may favor either an R-tree index (e.g. Postgres) or a spatial grid index (SQL Server), depending on the position of the window. If the window is aligned with the grid, performance can be much faster than when it is slightly off.
Volker Mische reported that bounding-box queries with different bounding boxes (selecting 5%, 10%, 15%, 20%, 25% of all features) show the same performance in SpatiaLite.
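To illustrate why window alignment can matter for a grid index, the sketch below counts how many cells of a uniform grid a query window overlaps. This is a deliberately simplified model with an assumed grid origin at (0, 0) and a single cell size; it is not how SQL Server's multi-level grid index works internally, and the function name `cells_touched` is made up for this example.

```python
import math


def cells_touched(xmin, ymin, xmax, ymax, cell):
    """Count the cells of a uniform grid (origin 0,0) overlapped by a box.

    A box edge that lies exactly on a cell border does not count as
    reaching into the neighbouring cell.
    """
    nx = math.ceil(xmax / cell) - math.floor(xmin / cell)
    ny = math.ceil(ymax / cell) - math.floor(ymin / cell)
    return nx * ny


aligned = cells_touched(0.0, 0.0, 2.0, 2.0, cell=1.0)  # window snapped to the grid
shifted = cells_touched(0.1, 0.1, 2.1, 2.1, cell=1.0)  # same size, slightly off
```

In this toy model the aligned window overlaps 4 cells while the shifted window of the same size overlaps 9, so more candidate cells would have to be examined. Whether a real index behaves this way would need to be measured.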

Discussion Points

Should we take an average of multiple runs with different values (@point, @bbox)?
Should we vary the queries' result sizes? (see observations)
Should we vary the shape and alignment of the query window? (see observations)
Should we run the benchmark with parallelized queries?
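The first three points could be explored with a small harness along the following lines. It is a sketch under assumptions: `run_query` stands for any caller-supplied function that executes the benchmark query for a given bounding box, the extent and fractions are placeholders, and a window covering a given fraction of the extent's area only returns that fraction of the features if the data is spread evenly.

```python
import random
import statistics
import time


def random_bbox(extent, fraction, rng):
    """Return a randomly placed box covering `fraction` of the extent's area."""
    xmin, ymin, xmax, ymax = extent
    scale = fraction ** 0.5  # scale both sides equally to keep the aspect ratio
    width = (xmax - xmin) * scale
    height = (ymax - ymin) * scale
    x0 = rng.uniform(xmin, xmax - width)
    y0 = rng.uniform(ymin, ymax - height)
    return (x0, y0, x0 + width, y0 + height)


def benchmark(run_query, bboxes):
    """Time run_query once per box; return (mean, population std dev) in seconds."""
    timings = []
    for bbox in bboxes:
        start = time.perf_counter()
        run_query(bbox)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.pstdev(timings)


# Example: ten random windows for each result-size class.
rng = random.Random(42)  # fixed seed so that runs are repeatable
extent = (0.0, 0.0, 100.0, 100.0)  # placeholder extent
windows = {f: [random_bbox(extent, f, rng) for _ in range(10)]
           for f in (0.05, 0.10, 0.25)}
```

Because the windows are placed at random positions, they are generally not grid aligned, which also touches on the alignment question above.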

Review

Query 1 shouldn't actually be included in the benchmark: data copying and preparation isn't done very often in a spatial database.

@@ Line 1: / Line 1: @@
 Feel free to discusss 'HSR Texas Spatial Database Benchmark' issues here! [[Benutzer:Stefan|Stefan]] 18:39, 22. Dez. 2009 (CET)
-== ToDo ==
+== To do ==
-* standardise data and input process (circumvent or minimize floating point problem)
+* Standardise data and input process (circumvent or minimize floating point problem)
-* define the variables (@bbox and @point)
+* Define the variables (@bbox and @point)
-** should we take an average of multiple runs with different values?
---[[Benutzer:Dominik|Dominik]] 15:12, 12. Jan. 2010 (UTC)
 == Observations ==
@@ Line 19: / Line 17: @@
 * The Number of points for a query windows is very small.
 * The query windows are "aligned bounding boxes" which may put in favor RTree (e.g. Postgres) or spatial grid (SQL Server) index, depending on a position of the window. If the window is aligned with the grid, performance can be much faster than when it's slightly off.
+* Volker Mische reported that boundingbox queries with different boundingboxes (selecting 5%, 10%, 15%, 20%, 25% of all features) show the same performance in spatialite.
 == Discussion Points ==
-Typically the bbox variable is "grid aligned" - What's the impact? Is this typical?
+* Should we take an average of multiple runs with different values (@point, @bbox)?
+* Should we vary the queries' result sizes? (see observations)
+* Should we vary the shape and alignment of the query window? (see observations)
+* Should we run the benchmark with parallelized queries?
-tdb.
+== Review ==
+* Query 1 actually shouldn't be included in the benchmark, data copying and preparation isn't really done often in a spatial database.
