PostgreSQL optimieren: Unterschied zwischen den Versionen
Stefan (Diskussion | Beiträge) (Die Seite wurde neu angelegt: „Siehe auch: * PostGIS, PostGIS - Tipps und Tricks * PostgreSQL, PostgreSQL - Tipps und Tricks * Vortrag PGConf.DE 2011, Stefan Keller. Basiert a…“) |
Stefan (Diskussion | Beiträge) |
||
Zeile 1: | Zeile 1: | ||
+ | PostgreSQL Performance Optimierung und Tuning. | ||
+ | |||
Siehe auch: | Siehe auch: | ||
* [[PostGIS]], [[PostGIS - Tipps und Tricks]] | * [[PostGIS]], [[PostGIS - Tipps und Tricks]] | ||
* [[PostgreSQL]], [[PostgreSQL - Tipps und Tricks]] | * [[PostgreSQL]], [[PostgreSQL - Tipps und Tricks]] | ||
+ | * [[PostGIS optimieren]] | ||
* Vortrag PGConf.DE 2011, Stefan Keller. | * Vortrag PGConf.DE 2011, Stefan Keller. | ||
+ | Siehe auch: | ||
+ | * [http://wiki.postgresql.org/wiki/Performance_Optimization PostgreSQL Performance Optimization] on Wiki | ||
+ | * [http://www.revsys.com/writings/postgresql-performance.html RevSys (Hardware, Tuning, Explain, etc.)], [http://groups.google.ch/group/EtoE/browse_thread/thread/eaabeda775e65260/1d3c809e723c6b07?hl=de&lnk=st&q=Best+CPU+for+PostgreSQL#1d3c809e723c6b07 PostgreSQL 8 Performance Checklist (genaue Erklärung der Variablen in postgresql.conf], [http://www.pcguide.com/ref/hdd/perf/raid/ RAID Hinweise]. Für kleinere bis mittlere PostgreSQL Installationen mit primär Leseoperationen empfiehlt sich RAID 0 oder RAID 1. RAID 0 hat jedoch keine Redundanz. RAID 5 hat schlechte Performanz, ausser wenn 6 oder mehr Disks. SCSI Ultra 320 Disks sind zwar teurer aber immer noch wesentlich schneller als S-ATA/IDE Disks. | ||
+ | * Das Tool pgtune macht Verbesserungsvorschläge der Konfiguration. | ||
+ | * CLUSTER und FILLINDEX: http://www.postgresonline.com/journal/index.php?/archives/10-How-does-CLUSTER-ON-improve-index-performance.html | ||
+ | * [[http://www.westnet.com/~gsmith/content/postgresql/pg-5minute.htm 5-Minute Introduction to PostgreSQL Performance]] by Greg Smith | ||
+ | |||
+ | == Tipps == | ||
+ | |||
+ | Bevor man beginnt und postgresql.conf etc. verändert... | ||
+ | * ANALYSE und CLUSTER-Befehl ausführen: Siehe [http://wiki.postgresql.org/wiki/VACUUM_FULL#What_to_use_instead]. | ||
+ | * Den aktuellen Zustand testen mit dem Tool [http://wiki.postgresql.org/wiki/Regression_Testing_with_pgbench pgbench]. | ||
+ | |||
+ | Tipps zur Optimierung (Optimization), Performance, Tuning von PostgreSQL...: | ||
+ | # Make sure you have (spatial) indexes on your tables. | ||
+ | # Cluster on (spatial) indexes tends to improve performance but mostly good for static spatial data. | ||
+ | # PostGIS: A spatial index is more effective on small geometries than larger geometries even for MySQL. | ||
+ | # EXPLAIN ANALYZE is your friend -- especially PgAdmin III Graphical Explain - will pinpoint when indexes are not being used and what navigation strategies are being used. | ||
+ | # posgresql.conf - fiddling with that a lot has been written -- particularly work_mem is important. | ||
+ | # Then there is a whole science on how you write sql queries. In general for PostgreSQL - you can force PostgreSQL to materialize subselects by putting in an ORDER BY and in general a subselect tends to force the planner to evaluate those as a single unit of work, which tends to improve speed in many cases. | ||
+ | |||
+ | SQL: | ||
+ | * Postgres kennt "Partial index": Beispiel: "create index i on table2(field2) where field2 <> 0;" Optimiert z.B. folgende Query: select * from table2 where field2<>0; (vgl. [http://www.postgresql.org/docs/8.4/static/indexes-partial.html Doc.]). | ||
+ | |||
+ | === Tuning durch Konfiguration von postgresql.conf === | ||
+ | |||
+ | Konfiguration von postgresql.conf (siehe z.B. [http://www.varlena.com/GeneralBits/Tidbits/perf.html]) durch Parameter heraufsetzen: | ||
+ | * Speziell bei genügend Memory und wenigen Applikationen und Benutzern: | ||
+ | ** effective_cache_size (default 128MB ) | ||
+ | ** work_mem – default 1MB is wayyy small | ||
+ | ** maintenance_work_mem – default 16MB small | ||
+ | ** shared_buffers (default 24MB): This is a real killer... bump it to 1-2GB, at least, on a server w/ >4G RAM, up to 8GB (don't go above that w/o good testing..). | ||
+ | * sort_memory - speziell bei order-by oder group-by Klauseln | ||
+ | * random_page_cost – speziell bei schnellen Disks wie SCSI und RAID | ||
+ | |||
+ | Auto Vacuum einschalten. | ||
+ | |||
+ | == Tuning Read-Only Databases == | ||
+ | |||
+ | Steps in order to secure and speedup such PostgreSQL/PostGIS instance: | ||
+ | |||
+ | 1. Re-configure PostgreSQL server parameters (postgresql.conf) as following: | ||
+ | |||
+ | fsync=off | ||
+ | synchronous_commit=off | ||
+ | full_page_writes=off | ||
+ | default_transaction_read_only | ||
+ | |||
+ | Greg Smith wrote (19. April 2011 16:30 on pgsql-performance): "The only other thing you probably want to do is set checkpoint_segments to a big number. Shouldn't matter normally, but when doing this (vacuum) freeze operation it will help that execute quickly. You want a lower maintenance_work_mem on a read-only system than the master too, possibly a higher shared_buffers as well. It's all pretty subtle beyond the big parameters you already identified." | ||
+ | |||
+ | checkpoint_segments=? | ||
+ | maintenance_work_mem=? | ||
+ | shared_buffers=? | ||
+ | |||
+ | 2. Restart server, login as dbadmin, create database, create an app.-user. | ||
+ | |||
+ | 3. Load dataset...: | ||
+ | |||
+ | 3.1. with owner 'app.-user' (in schema PUBLIC or other?). | ||
+ | |||
+ | 3.2. create indexes and cluster: | ||
+ | |||
+ | CREATE INDEX mytable_idx | ||
+ | ON mytable | ||
+ | -- USING btree (upper(last_name), upper(first_name)) -- tbd. | ||
+ | WITH (FILLFACTOR=100); | ||
+ | |||
+ | ALTER TABLE member CLUSTER ON member_name_idx; | ||
+ | |||
+ | 3.3. Config database: | ||
+ | |||
+ | ALTER DATABASE x SET default_transaction_read_only = on; | ||
+ | |||
+ | 3.4. Issue a VACUUM command: | ||
+ | |||
+ | VACUUM FREEZE ANALYZE; | ||
+ | |||
+ | 4. Create a 'read-only' user (login role) with only read access to user defined tables: | ||
+ | GRANT SELECT ... TO read_only_user | ||
+ | ... | ||
+ | |||
+ | 5. Optimize and secure session by following parameters: | ||
− | + | SET STATEMENT_TIMEOUT TO 60000; | |
+ | SET transaction_read_only TO TRUE; --? | ||
+ | SET TRANSACTION READ ONLY; | ||
− | + | 6. Go to step 3 in case of new data or a harddisk crash. |
Version vom 16. Oktober 2011, 20:37 Uhr
PostgreSQL Performance Optimierung und Tuning.
Siehe auch:
- PostGIS, PostGIS - Tipps und Tricks
- PostgreSQL, PostgreSQL - Tipps und Tricks
- PostGIS optimieren
- Vortrag PGConf.DE 2011, Stefan Keller.
Siehe auch:
- PostgreSQL Performance Optimization on Wiki
- RevSys (Hardware, Tuning, Explain, etc.), PostgreSQL 8 Performance Checklist (genaue Erklärung der Variablen in postgresql.conf, RAID Hinweise. Für kleinere bis mittlere PostgreSQL Installationen mit primär Leseoperationen empfiehlt sich RAID 0 oder RAID 1. RAID 0 hat jedoch keine Redundanz. RAID 5 hat schlechte Performanz, ausser wenn 6 oder mehr Disks. SCSI Ultra 320 Disks sind zwar teurer aber immer noch wesentlich schneller als S-ATA/IDE Disks.
- Das Tool pgtune macht Verbesserungsvorschläge der Konfiguration.
- CLUSTER und FILLINDEX: http://www.postgresonline.com/journal/index.php?/archives/10-How-does-CLUSTER-ON-improve-index-performance.html
- [5-Minute Introduction to PostgreSQL Performance] by Greg Smith
Tipps
Bevor man beginnt und postgresql.conf etc. verändert...
Tipps zur Optimierung (Optimization), Performance, Tuning von PostgreSQL...:
- Make sure you have (spatial) indexes on your tables.
- Cluster on (spatial) indexes tends to improve performance but mostly good for static spatial data.
- PostGIS: A spatial index is more effective on small geometries than larger geometries even for MySQL.
- EXPLAIN ANALYZE is your friend -- especially PgAdmin III Graphical Explain - will pinpoint when indexes are not being used and what navigation strategies are being used.
- posgresql.conf - fiddling with that a lot has been written -- particularly work_mem is important.
- Then there is a whole science on how you write sql queries. In general for PostgreSQL - you can force PostgreSQL to materialize subselects by putting in an ORDER BY and in general a subselect tends to force the planner to evaluate those as a single unit of work, which tends to improve speed in many cases.
SQL:
- Postgres kennt "Partial index": Beispiel: "create index i on table2(field2) where field2 <> 0;" Optimiert z.B. folgende Query: select * from table2 where field2<>0; (vgl. Doc.).
Tuning durch Konfiguration von postgresql.conf
Konfiguration von postgresql.conf (siehe z.B. [2]) durch Parameter heraufsetzen:
- Speziell bei genügend Memory und wenigen Applikationen und Benutzern:
- effective_cache_size (default 128MB )
- work_mem – default 1MB is wayyy small
- maintenance_work_mem – default 16MB small
- shared_buffers (default 24MB): This is a real killer... bump it to 1-2GB, at least, on a server w/ >4G RAM, up to 8GB (don't go above that w/o good testing..).
- sort_memory - speziell bei order-by oder group-by Klauseln
- random_page_cost – speziell bei schnellen Disks wie SCSI und RAID
Auto Vacuum einschalten.
Tuning Read-Only Databases
Steps in order to secure and speedup such PostgreSQL/PostGIS instance:
1. Re-configure PostgreSQL server parameters (postgresql.conf) as following:
fsync=off synchronous_commit=off full_page_writes=off default_transaction_read_only
Greg Smith wrote (19. April 2011 16:30 on pgsql-performance): "The only other thing you probably want to do is set checkpoint_segments to a big number. Shouldn't matter normally, but when doing this (vacuum) freeze operation it will help that execute quickly. You want a lower maintenance_work_mem on a read-only system than the master too, possibly a higher shared_buffers as well. It's all pretty subtle beyond the big parameters you already identified."
checkpoint_segments=? maintenance_work_mem=? shared_buffers=?
2. Restart server, login as dbadmin, create database, create an app.-user.
3. Load dataset...:
3.1. with owner 'app.-user' (in schema PUBLIC or other?).
3.2. create indexes and cluster:
CREATE INDEX mytable_idx ON mytable -- USING btree (upper(last_name), upper(first_name)) -- tbd. WITH (FILLFACTOR=100); ALTER TABLE member CLUSTER ON member_name_idx;
3.3. Config database:
ALTER DATABASE x SET default_transaction_read_only = on;
3.4. Issue a VACUUM command:
VACUUM FREEZE ANALYZE;
4. Create a 'read-only' user (login role) with only read access to user defined tables:
GRANT SELECT ... TO read_only_user ...
5. Optimize and secure session by following parameters:
SET STATEMENT_TIMEOUT TO 60000; SET transaction_read_only TO TRUE; --? SET TRANSACTION READ ONLY;
6. Go to step 3 in case of new data or a harddisk crash.