Optimizing PostgreSQL
PostgreSQL performance optimization and tuning.
See also:
- PostGIS, PostGIS - Tipps und Tricks
- PostgreSQL, PostgreSQL - Tipps und Tricks
- PostGIS optimieren
- Talk at PGConf.DE 2011, Stefan Keller.
See:
- pgTune https://pgtune.leopard.in.ua/
- PostgreSQL Performance Optimization on Wiki
- RevSys (hardware, tuning, explain, etc.), PostgreSQL 8 Performance Checklist (detailed explanation of the variables in postgresql.conf, plus RAID hints). For small to medium PostgreSQL installations with primarily read operations, RAID 0 or RAID 1 is recommended; note, however, that RAID 0 provides no redundancy. RAID 5 performs poorly unless 6 or more disks are used. SCSI Ultra 320 disks are more expensive, but still considerably faster than S-ATA/IDE disks.
- The pgtune tool suggests improvements to the configuration.
- CLUSTER and FILLFACTOR: http://www.postgresonline.com/journal/index.php?/archives/10-How-does-CLUSTER-ON-improve-index-performance.html
- [5-Minute Introduction to PostgreSQL Performance] by Greg Smith
Tips
Before you start changing postgresql.conf etc.:
- Run the ANALYZE and CLUSTER commands: see [1].
- Benchmark the current state with the pgbench tool.
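For example, pgbench can initialize a test database and produce a baseline TPS number before and after tuning (the database name 'mydb' is a placeholder):

```shell
# Initialize the pgbench tables with scale factor 10 in database "mydb".
pgbench -i -s 10 mydb
# Run for 60 seconds with 10 clients and 2 worker threads; note the reported TPS.
pgbench -c 10 -j 2 -T 60 mydb
```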
- There are not that many parameters, e.g. random_page_cost='1.1', shared_buffers='60GB', synchronous_commit='off', temp_buffers='120MB', wal_buffers='16MB', max_wal_size='5GB', work_mem='6GB', fsync='off' (caution!).
Tips for optimization, performance and tuning of PostgreSQL:
- Make sure you have (spatial) indexes on your tables.
- Clustering on (spatial) indexes tends to improve performance, but is mostly useful for static spatial data.
- PostGIS: A spatial index is more effective on small geometries than on large ones; the same holds even for MySQL.
- EXPLAIN ANALYZE is your friend -- especially pgAdmin III's Graphical Explain -- it will pinpoint when indexes are not being used and which scan strategies are chosen.
- postgresql.conf -- a lot has been written about tuning it; work_mem in particular is important.
- Then there is a whole science of how you write SQL queries. In general, for PostgreSQL you can force subselects to be materialized by adding an ORDER BY; a subselect then tends to force the planner to evaluate it as a single unit of work, which improves speed in many cases.
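A minimal sketch of checking index usage with EXPLAIN ANALYZE; the table name, columns, and the bounding box are assumptions for illustration (PostGIS assumed installed):

```sql
-- Hypothetical table with a spatial index:
CREATE TABLE roads (
    id   serial PRIMARY KEY,
    name text,
    geom geometry(LineString, 4326)
);
CREATE INDEX roads_geom_idx ON roads USING gist (geom);

-- EXPLAIN ANALYZE shows whether the planner uses the index
-- (Index Scan / Bitmap Index Scan) or falls back to a Seq Scan:
EXPLAIN ANALYZE
SELECT id, name
FROM roads
WHERE geom && ST_MakeEnvelope(7.0, 46.0, 8.0, 47.0, 4326);
```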
SQL:
- Postgres supports partial indexes. Example: "create index i on table2(field2) where field2 <> 0;" optimizes e.g. the following query: select * from table2 where field2<>0; (see the documentation).
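The partial-index example above can be sketched end-to-end; whether the planner actually uses the index can be verified with EXPLAIN:

```sql
CREATE TABLE table2 (id serial PRIMARY KEY, field2 integer);

-- Partial index: only rows with field2 <> 0 are indexed, so the index stays small.
CREATE INDEX i ON table2 (field2) WHERE field2 <> 0;

-- The planner may use the partial index here, because the query's WHERE clause
-- implies the index predicate:
EXPLAIN SELECT * FROM table2 WHERE field2 <> 0;
```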
Performance optimization by configuring postgresql.conf
Configure postgresql.conf (see e.g. [2]) by raising the following parameters:
- Especially with plenty of memory and few applications and users:
- shared_buffers (default 24MB): This is a real killer... bump it to 1-2GB at least on a server with >4GB RAM, up to 8GB (don't go above that without good testing).
- effective_cache_size (default 128MB) - should always be >= shared_buffers.
- work_mem - the default 1MB is far too small.
- maintenance_work_mem - the default 16MB is too small.
- sort_mem (renamed to work_mem in PostgreSQL 8.0) - especially with ORDER BY or GROUP BY clauses.
- random_page_cost - especially with fast disks such as SCSI and RAID.
Enable autovacuum.
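Combined, the parameters above might look like the following postgresql.conf fragment; the values are illustrative for a dedicated server with plenty of RAM, not recommendations:

```
shared_buffers = 2GB             # default 24MB; 1-2GB on a server with >4GB RAM
effective_cache_size = 6GB       # should be >= shared_buffers
work_mem = 64MB                  # default 1MB is far too small
maintenance_work_mem = 256MB     # default 16MB is too small
random_page_cost = 1.1           # for fast disks (SCSI, RAID)
autovacuum = on
```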
Performance optimization of read-only databases
Steps to secure and speed up such a PostgreSQL/PostGIS instance:
1. Re-configure the PostgreSQL server parameters (postgresql.conf) as follows:
fsync = off
synchronous_commit = off
full_page_writes = off
default_transaction_read_only = on
See Gavin M. Roy PGConf 2010 ("Running with Scissors"). Greg Smith wrote (19. April 2011 16:30 on pgsql-performance): "The only other thing you probably want to do is set checkpoint_segments to a big number. Shouldn't matter normally, but when doing this (vacuum) freeze operation it will help that execute quickly. You want a lower maintenance_work_mem on a read-only system than the master too, possibly a higher shared_buffers as well. It's all pretty subtle beyond the big parameters you already identified."
checkpoint_segments=? maintenance_work_mem=? shared_buffers=?
2. Restart server, login as dbadmin, create database, create an app.-user.
3. Load dataset...:
3.1. with owner 'app.-user' (in schema PUBLIC or other?).
3.2. create indexes and cluster:
CREATE INDEX mytable_idx ON mytable
  USING btree (upper(last_name), upper(first_name))  -- tbd.
  WITH (FILLFACTOR=100);
ALTER TABLE member CLUSTER ON member_name_idx;
3.3. Config database:
ALTER DATABASE x SET default_transaction_read_only = on;
3.4. Issue a VACUUM command:
VACUUM FREEZE ANALYZE;
4. Create a 'read-only' user (login role) with only read access to user defined tables:
GRANT SELECT ... TO read_only_user ...
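A sketch of such a read-only role; the database, schema, role name, and password are assumptions:

```sql
CREATE ROLE read_only_user LOGIN PASSWORD 'secret';
GRANT CONNECT ON DATABASE mydb TO read_only_user;
GRANT USAGE ON SCHEMA public TO read_only_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO read_only_user;
```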
5. Optimize and secure the session with the following parameters:
SET STATEMENT_TIMEOUT TO 60000;
SET transaction_read_only TO TRUE; --? SET TRANSACTION READ ONLY;
6. Go back to step 3 in case of new data or a hard disk crash.