Talk:HSR Texas Geo Database Benchmark
From Geoinformation HSR
Current revision as of 6 September 2010, 10:02
Feel free to discuss 'HSR Texas Spatial Database Benchmark' issues here! Stefan 18:39, 22 Dec. 2009 (CET)
To do
- Standardise data and input process (circumvent or minimize floating point problem)
- Define the variables (@bbox and @point)
Observations
General:
- MS SQL Server representatives report that geography types are 20-30% slower than geometry types.
- For PostGIS 1.5, geography types are said to be somewhat faster (see http://blog.cleverelephant.ca/)
- Secondary index optimization makes performance about 50% faster (from: MS SQL Server discussion)
- The number of points in a query window matters for the secondary filter overhead.
Data and queries per se:
- The queries return a quite large percentage of the data overall (50%-100% for points and polygons).
- The number of points per query window is very small.
- The query windows are "aligned bounding boxes", which may favor an R-tree (e.g. Postgres) or a spatial grid (SQL Server) index, depending on the position of the window. If the window is aligned with the grid, performance can be much faster than when it's slightly off.
- Volker Mische reported that bounding box queries with different bounding boxes (selecting 5%, 10%, 15%, 20%, 25% of all features) show the same performance in SpatiaLite.
Discussion Points
- Should we take an average of multiple runs with different values (@point, @bbox)?
- Should we vary the queries' result sizes? (see observations)
- Should we vary the shape and alignment of the query window? (see observations)
- Should we run the benchmark with parallelized queries?
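To make the first three points more concrete, here is a rough sketch of how query windows of varying size and position could be generated and how timings could be averaged over them. Everything in it is an assumption for illustration: the names `make_bbox` and `time_query`, the placeholder extent, and the `run_query` callable (which would wrap the actual database call with the @bbox value) are not part of the benchmark. Note that a window covering a given fraction of the area only selects the same fraction of features if the data is evenly distributed.

```python
import random
import statistics
import time


def make_bbox(extent, fraction, rng):
    """Return a box covering `fraction` of the extent's area at a random position.

    `extent` and the result are (xmin, ymin, xmax, ymax) tuples.
    The random offset means the window is not aligned with any index grid.
    """
    xmin, ymin, xmax, ymax = extent
    scale = fraction ** 0.5  # scale each side so the area scales by `fraction`
    width = (xmax - xmin) * scale
    height = (ymax - ymin) * scale
    x0 = rng.uniform(xmin, xmax - width)
    y0 = rng.uniform(ymin, ymax - height)
    return (x0, y0, x0 + width, y0 + height)


def time_query(run_query, bboxes):
    """Call `run_query(bbox)` once per box; return (mean, median) in seconds."""
    timings = []
    for bbox in bboxes:
        start = time.perf_counter()
        run_query(bbox)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.median(timings)


if __name__ == "__main__":
    rng = random.Random(42)           # fixed seed so runs are repeatable
    extent = (0.0, 0.0, 100.0, 100.0)  # placeholder, not the real data extent
    for fraction in (0.05, 0.10, 0.15, 0.20, 0.25):
        boxes = [make_bbox(extent, fraction, rng) for _ in range(10)]
        # Stand-in for the real database call; replace with the actual query.
        mean_s, median_s = time_query(lambda bbox: None, boxes)
        print(f"{fraction:.0%}: mean {mean_s:.6f}s, median {median_s:.6f}s")
```

Reporting the median next to the mean would make a single slow run (cold cache, for example) less likely to dominate the result.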
Review
- Query 1 actually shouldn't be included in the benchmark, because data copying and preparation isn't done very often in a spatial database.