PostgreSQL optimieren: Unterschied zwischen den Versionen

Aus Geoinformation HSR
Wechseln zu: Navigation, Suche
(Die Seite wurde neu angelegt: „Siehe auch: * PostGIS, PostGIS - Tipps und Tricks * PostgreSQL, PostgreSQL - Tipps und Tricks * Vortrag PGConf.DE 2011, Stefan Keller. Basiert a…“)
 
Zeile 1: Zeile 1:
 +
PostgreSQL Performance Optimierung und Tuning.
 +
 
Siehe auch:
 
Siehe auch:
 
* [[PostGIS]], [[PostGIS - Tipps und Tricks]]
 
* [[PostGIS]], [[PostGIS - Tipps und Tricks]]
 
* [[PostgreSQL]], [[PostgreSQL - Tipps und Tricks]]
 
* [[PostgreSQL]], [[PostgreSQL - Tipps und Tricks]]
 +
* [[PostGIS optimieren]]
 
* Vortrag PGConf.DE 2011, Stefan Keller.
 
* Vortrag PGConf.DE 2011, Stefan Keller.
  
 +
Siehe auch:
 +
* [http://wiki.postgresql.org/wiki/Performance_Optimization PostgreSQL Performance Optimization] on Wiki
 +
* [http://www.revsys.com/writings/postgresql-performance.html RevSys (Hardware, Tuning, Explain, etc.)], [http://groups.google.ch/group/EtoE/browse_thread/thread/eaabeda775e65260/1d3c809e723c6b07?hl=de&lnk=st&q=Best+CPU+for+PostgreSQL#1d3c809e723c6b07 PostgreSQL 8 Performance Checklist (genaue Erklärung der Variablen in postgresql.conf], [http://www.pcguide.com/ref/hdd/perf/raid/ RAID Hinweise]. Für kleinere bis mittlere PostgreSQL Installationen mit primär Leseoperationen empfiehlt sich RAID 0  oder RAID 1. RAID 0 hat jedoch keine Redundanz. RAID 5 hat schlechte Performanz, ausser wenn 6 oder mehr Disks. SCSI Ultra 320 Disks sind zwar teurer aber immer noch wesentlich schneller als S-ATA/IDE Disks.
 +
* Das Tool pgtune macht Verbesserungsvorschläge der Konfiguration.
 +
* CLUSTER und FILLINDEX: http://www.postgresonline.com/journal/index.php?/archives/10-How-does-CLUSTER-ON-improve-index-performance.html
 +
* [[http://www.westnet.com/~gsmith/content/postgresql/pg-5minute.htm 5-Minute Introduction to PostgreSQL Performance]] by Greg Smith
 +
 +
== Tipps ==
 +
 +
Bevor man beginnt und postgresql.conf etc. verändert...
 +
* ANALYSE und CLUSTER-Befehl ausführen: Siehe [http://wiki.postgresql.org/wiki/VACUUM_FULL#What_to_use_instead].
 +
* Den aktuellen Zustand testen mit dem Tool [http://wiki.postgresql.org/wiki/Regression_Testing_with_pgbench pgbench].
 +
 +
Tipps zur Optimierung (Optimization), Performance, Tuning von PostgreSQL...:
 +
# Make sure you have (spatial) indexes on your tables.
 +
# Cluster on (spatial) indexes tends to improve performance but mostly good for static spatial data.
 +
# PostGIS: A spatial index is more effective on small geometries than larger geometries even for MySQL.
 +
# EXPLAIN ANALYZE is your friend -- especially PgAdmin III Graphical Explain - will pinpoint when indexes are not being used and what navigation strategies are being used.
 +
# posgresql.conf - fiddling with that a lot has been written -- particularly work_mem is important.
 +
# Then there is a whole science on how you write sql queries. In general for PostgreSQL - you can force PostgreSQL to materialize subselects by putting in an ORDER BY and in general a subselect tends to force the planner to evaluate those as a single unit of work, which tends to improve speed in many cases.
 +
 +
SQL:
 +
* Postgres kennt "Partial index": Beispiel: "create index i on table2(field2) where field2 <> 0;" Optimiert z.B. folgende Query: select * from table2 where field2<>0; (vgl. [http://www.postgresql.org/docs/8.4/static/indexes-partial.html Doc.]).
 +
 +
=== Tuning durch Konfiguration von postgresql.conf ===
 +
 +
Konfiguration von postgresql.conf (siehe z.B. [http://www.varlena.com/GeneralBits/Tidbits/perf.html]) durch Parameter heraufsetzen:
 +
* Speziell bei genügend Memory und wenigen Applikationen und Benutzern:
 +
** effective_cache_size (default 128MB )
 +
** work_mem – default 1MB is wayyy small
 +
** maintenance_work_mem – default 16MB small
 +
** shared_buffers (default 24MB): This is a real killer... bump it to 1-2GB, at least, on a server w/ >4G RAM, up to 8GB (don't go above that w/o good testing..).
 +
* sort_memory - speziell bei order-by oder group-by Klauseln
 +
* random_page_cost – speziell bei schnellen Disks wie SCSI und RAID
 +
 +
Auto Vacuum einschalten.
 +
 +
== Tuning Read-Only Databases ==
 +
 +
Steps in order to secure and speedup such PostgreSQL/PostGIS instance:
 +
 +
1. Re-configure PostgreSQL server parameters (postgresql.conf) as following:
 +
 +
fsync=off
 +
synchronous_commit=off
 +
full_page_writes=off
 +
default_transaction_read_only
 +
 +
Greg Smith wrote (19. April 2011 16:30 on pgsql-performance): "The only other thing you probably want to do is set checkpoint_segments to a big number.  Shouldn't matter normally, but when doing this (vacuum) freeze operation it will help that execute quickly.  You want a lower maintenance_work_mem on a read-only system than the master too, possibly a higher shared_buffers as well.  It's all pretty subtle beyond the big parameters you already identified."
 +
 +
  checkpoint_segments=?
 +
  maintenance_work_mem=?
 +
  shared_buffers=?
 +
 +
2. Restart server, login as dbadmin, create database, create an app.-user.
 +
 +
3. Load dataset...:
 +
 +
3.1. with owner 'app.-user' (in schema PUBLIC or other?).
 +
 +
3.2. create indexes and cluster:
 +
 +
CREATE INDEX mytable_idx
 +
ON mytable
 +
-- USING btree (upper(last_name), upper(first_name)) -- tbd.
 +
WITH (FILLFACTOR=100);
 +
 +
ALTER TABLE member CLUSTER ON member_name_idx;
 +
 +
3.3. Config database:
 +
 +
ALTER DATABASE x SET default_transaction_read_only = on;
 +
 +
3.4. Issue a VACUUM command:
 +
 +
VACUUM FREEZE ANALYZE;
 +
 +
4. Create a 'read-only' user (login role) with only read access to user defined tables:
 +
GRANT SELECT ... TO read_only_user
 +
...
 +
 +
5. Optimize and secure session by following parameters:
  
Basiert auf [[PostGIS optimieren]]
+
SET STATEMENT_TIMEOUT TO 60000;
 +
SET transaction_read_only TO TRUE; --?
 +
SET TRANSACTION READ ONLY;
  
tbd.
+
6. Go to step 3 in case of new data or a harddisk crash.

Version vom 16. Oktober 2011, 20:37 Uhr

PostgreSQL Performance Optimierung und Tuning.

Siehe auch:

Siehe auch:

Tipps

Bevor man beginnt und postgresql.conf etc. verändert...

  • ANALYSE und CLUSTER-Befehl ausführen: Siehe [1].
  • Den aktuellen Zustand testen mit dem Tool pgbench.

Tipps zur Optimierung (Optimization), Performance, Tuning von PostgreSQL...:

  1. Make sure you have (spatial) indexes on your tables.
  2. Cluster on (spatial) indexes tends to improve performance but mostly good for static spatial data.
  3. PostGIS: A spatial index is more effective on small geometries than larger geometries even for MySQL.
  4. EXPLAIN ANALYZE is your friend -- especially PgAdmin III Graphical Explain - will pinpoint when indexes are not being used and what navigation strategies are being used.
  5. posgresql.conf - fiddling with that a lot has been written -- particularly work_mem is important.
  6. Then there is a whole science on how you write sql queries. In general for PostgreSQL - you can force PostgreSQL to materialize subselects by putting in an ORDER BY and in general a subselect tends to force the planner to evaluate those as a single unit of work, which tends to improve speed in many cases.

SQL:

  • Postgres kennt "Partial index": Beispiel: "create index i on table2(field2) where field2 <> 0;" Optimiert z.B. folgende Query: select * from table2 where field2<>0; (vgl. Doc.).

Tuning durch Konfiguration von postgresql.conf

Konfiguration von postgresql.conf (siehe z.B. [2]) durch Parameter heraufsetzen:

  • Speziell bei genügend Memory und wenigen Applikationen und Benutzern:
    • effective_cache_size (default 128MB )
    • work_mem – default 1MB is wayyy small
    • maintenance_work_mem – default 16MB small
    • shared_buffers (default 24MB): This is a real killer... bump it to 1-2GB, at least, on a server w/ >4G RAM, up to 8GB (don't go above that w/o good testing..).
  • sort_memory - speziell bei order-by oder group-by Klauseln
  • random_page_cost – speziell bei schnellen Disks wie SCSI und RAID

Auto Vacuum einschalten.

Tuning Read-Only Databases

Steps in order to secure and speedup such PostgreSQL/PostGIS instance:

1. Re-configure PostgreSQL server parameters (postgresql.conf) as following:

fsync=off
synchronous_commit=off
full_page_writes=off
default_transaction_read_only

Greg Smith wrote (19. April 2011 16:30 on pgsql-performance): "The only other thing you probably want to do is set checkpoint_segments to a big number. Shouldn't matter normally, but when doing this (vacuum) freeze operation it will help that execute quickly. You want a lower maintenance_work_mem on a read-only system than the master too, possibly a higher shared_buffers as well. It's all pretty subtle beyond the big parameters you already identified."

 checkpoint_segments=?
 maintenance_work_mem=?
 shared_buffers=?

2. Restart server, login as dbadmin, create database, create an app.-user.

3. Load dataset...:

3.1. with owner 'app.-user' (in schema PUBLIC or other?).

3.2. create indexes and cluster:

CREATE INDEX mytable_idx
ON mytable
-- USING btree (upper(last_name), upper(first_name)) -- tbd.
WITH (FILLFACTOR=100);

ALTER TABLE member CLUSTER ON member_name_idx;

3.3. Config database:

ALTER DATABASE x SET default_transaction_read_only = on;

3.4. Issue a VACUUM command:

VACUUM FREEZE ANALYZE;

4. Create a 'read-only' user (login role) with only read access to user defined tables:

GRANT SELECT ... TO read_only_user
...

5. Optimize and secure session by following parameters:

SET STATEMENT_TIMEOUT TO 60000;
SET transaction_read_only TO TRUE; --?
SET TRANSACTION READ ONLY;

6. Go to step 3 in case of new data or a harddisk crash.