PostgreSQL - Tipps und Tricks: Unterschied zwischen den Versionen
Stefan (Diskussion | Beiträge) K (→Full Text Search) |
Stefan (Diskussion | Beiträge) K (→Backup und Restore) |
||
Zeile 144: | Zeile 144: | ||
If going to backup a single database use pg_dump including globals: | If going to backup a single database use pg_dump including globals: | ||
− | # pg_dump -U postgres -h localhost -p | + | # pg_dump -U postgres -h localhost -p 5432 --clean --file=sandbox.sql sandbox |
− | # pg_dumpall -U postgres -h localhost -p | + | # pg_dumpall -U postgres -h localhost -p 5432 --clean --globals-only --file=globals.sql |
To restore, load the globals first, then the database dump: | To restore, load the globals first, then the database dump: |
Version vom 16. Januar 2013, 17:22 Uhr
Siehe auch:
- PostgreSQL, PostgreSQL optimieren und PostgreSQL-Datentyp hstore
- PostGIS und PostGIS - Tipps und Tricks
Inhaltsverzeichnis
Stefan's Notizen...
Passwörter
- Setzen/Ändern Passwort für eine Datenbank:
% psql -d unsere_db -c "ALTER USER unsere_db WITH PASSWORD 'neues_passwort';"
- TIPP: PG Programme haben nicht gern, wenn das PW fehlt!
- TIPP: Das Setzen des Passworts für Super User postgres auf PW postgres verlangt template1 und nicht postgres:
% psql -d template1 -c "ALTER USER postgres WITH PASSWORD 'postgres';"
- psql und wohl alle weiteren PostgreSQL-Tools (dropdb, createdb, pg_dump, etc.) u.a. brauchen Angaben zu Benutzer und Passwort. Der Benutzer wird mit Parameter -U mitgegeben, das Passwort hingegen kann (mit Absicht) nicht mitgegeben werden. Es gibt aber die Möglichkeit, Umgebungsvariablen zu setzen:
SET PGUSER=postgres SET PGPASSWORD=<meinpasswort>
- Start pgsl-Client mit Umlauten (Windows):
> cmd /c chcp 1252 > psql -d gisdb -U postgres postgres=# SET CLIENT_ENCODING TO 'LATIN1'; postgres=# <do whatever you must do in psql...> > cmd /c chcp 850
Create "Read-only User"
From: [1]
If you only grant CONNECT to a database, the user can connect but has no other privileges. In PostgreSQL 8.x you have to grant USAGE on namespaces (schemas) and SELECT on tables and views individually. So something like:
GRANT CONNECT ON DATABASE mydb TO xxx; -- This assumes you're actually connected to mydb.. GRANT USAGE ON SCHEMA public TO xxx; GRANT SELECT ON mytable TO xxx;
Granting 'SELECT' to each table/view individually can be a bore. Something like this can help:
SELECT 'GRANT SELECT ON ' || relname || ' TO xxx;' FROM pg_class JOIN pg_namespace ON pg_namespace.oid = pg_class.relnamespace WHERE nspname = 'public' AND relkind IN ('r', 'v')
This should output the relevant GRANT commands to GRANT SELECT on all tables and views in public, for copy-n-paste.
PostgreSQL 9:
In PostgreSQL 9.0 its easier. You still need to grant USAGE permissions on schemas, but you can grant permissions on all tables, views etc. in the schema using a single command rather than having to generate them:
GRANT SELECT ON ALL TABLES IN SCHEMA public TO xxx;
This only affects tables that have already been created, so it's the equivalent of the generated commands above. More powerfully, you can automatically have default roles assigned to new objects in future:
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO osmgis;
Note that by default this will only affect objects (tables) created by the user that issued this command: although it can also be set on any role that the issuing user is a member of. However, you don't pick up default privileges for all roles you're a member of when creating new objects... so there's still some faffing around. If you adopt the approach that a database has an owning role, and schema changes are performed as that owning role, then you should assign default privileges to that owning role. IMHO this is all a bit confusing and you may need to experiment to come up with a functional workflow.
Daten laden/importieren
Siehe http://wiki.postgresql.org/wiki/COPY
COPY Syntax:
- Tab-Delimited
COPY geometry_columns (f_table_catalog, f_table_schema, f_table_name, f_geometry_column, coord_dimension, srid, type) FROM stdin; catalog public planet_osm_point way 2 900913 POINT catalog public planet_osm_line way 2 900913 LINESTRING \.
Foreign Data Wrappers
Ab Version 9.1 werden Foreign Data Wrappers (FDW) und ein Teil des SQL/MED-Standards unterstützt.
file_fdw ist eine offizielle Extension (contrib module): [2].
Siehe auch PostgreSQL Wiki:
PostgreSQL SQL
Siehe auch im Wiki der Vorlesung Datenbanken (Dbs1) von Prof. S. Keller.
Version:
select version();
Anfrage des Geometrie-Typs ('the_geom' ist der Name des Geometrie-Felds):
SELECT * FROM sometable WHERE GeometryType(the_geom) = 'GEOMETRYCOLLECTION'
Problem mit Datenfelder (for < 8.3):
SELECT the_geom ... (end_date - NOW() >= 10) AS GREEN AND (end_date - NOW() < 0) AS RED ... FROM ...
Casting changes in >= 8.3:
SELECT the_geom ... (end_date - NOW() >= '10 DAYS'::INTERVAL) AS GREEN AND (end_date - NOW() < '0'::INTERVAL) AS RED ... FROM ...
SELECT the_geom ... EXTRACT(DAY FROM (end_date - NOW()) >= 10) AS GREEN AND EXTRACT(DAY FROM (end_date - NOW()) < 0) AS RED ... FROM ...
hstore
Stichworte: Key-Value-Attributes, Hashtable, nosql-Database.
Siehe auch:
- PostgreSQL docs.
- Seminar thesis from M. Ott about "Key/Value Pair Versus hstore - Benchmarking Entity-Attribute-Value Structures in PostgreSQL" in [5]
- Beispiele: PostGIS Terminal.
Binary Large Objects (LOB, BLOB)
Siehe PostgreSQL - Binary Large Objects.
Data Warehousing mit PostgreSQL
Stichworte: PostgreSQL-Module cube, PostgreSQL-Module tablefunc (mit crosstab).
tbd.
Konfiguration
Die pg_hba.conf kann so eingestellt werden, dass bei der Ausführung von PostgreSQL-Kommandozeilen-Tools kein Passwort übertragen werden muss.
Bsp mit 127.0.0.1: Anstelle
# IPv4 local connections: host all all 127.0.0.1/32 md5
neu:
# IPv4 local connections: host all all 127.0.0.1/32 trust
Backup und Restore
Dump whole PostgreSQL cluster ('instance') with pg_dumpall. This includes the globals--information about users and groups, tablespaces, and so on. Outputs are just text files.
# pg_dumpall -U postgres -h localhost -p 5432 --clean --file=dump.sql
If going to backup a single database use pg_dump including globals:
# pg_dump -U postgres -h localhost -p 5432 --clean --file=sandbox.sql sandbox # pg_dumpall -U postgres -h localhost -p 5432 --clean --globals-only --file=globals.sql
To restore, load the globals first, then the database dump:
# psql -U postgres -h localhost -p 5433 < globals.sql # psql -U postgres -h localhost -p 5433 < sandbox.sql
Here's another example:
# Backup full database pg_dump -h host -U user -W database > backup.sql # Restore full database psql -d database -f backup.sql
# Backup database schema pg_dump -sv prueba -O > backup.schema.sql
# Backup database data pg_dump -Fc -f backup.data.dump -a --disable-triggers database
Tipps:
- Dont't store data in schema 'public' for several reasons. One is, that until PG 8, all db additional objects (functions, triggers) are maintainted in schema public.
- Eventually switch off Privileges. See also Tab „Dump Options #1“ > Don’t Save Privilege in pgAdmin3.
Doku.:
- 'PostgreSQL 9.0 pg_dump, pg_dumpall, pg_restore cheatsheet' from postgresonline.com.
- PostGIS Hard Upgrade
- PostgreSQL Chapter 24. Backup and Restore
Weitere Tipps...
Convert text to number
See [6]
DOMAIN, ENUM or Lookup-Table?
DOMAIN is ANSI SQL 92, ENUM is PostgreSQL specific. If compatibility is no issue then I'd prefer ENUM.
See also Blog about "FK, CHECK, ENUM or DOMAIN. That is the question." by Joshua Drake, Jan 21st, 2009.
DATE or TIMESTAMP?
DATE is year, month and day, whereas TIMESTAMP contains minutes, hours and seconds - and timezone. DATE is easier to work with for arithmetic (e.g. something reoccurring at a random interval of days), takes less storage space, and doesn't trail "00:00:00" strings you don't need when printed. However, TIMESTAMP is far better for real calendar calculations (e.g. something that happens on the 15th of each month or the 2nd Thursday of leap years). See http://wiki.postgresql.org/wiki/Working_with_Dates_and_Times_in_PostgreSQL
What is the best way to insert data?
INSERT INTO sometable VALUES (13, 'test', 13, '2012-12-02'); INSERT INTO sometable VALUES (14, 'test', 14, '2012-12-02'::timestamp); INSERT INTO sometable VALUES (15, 'test', 15, timestamp '2012-12-02');
First line does not check datetype, so I'd prefer one of the other two.
SQL MERGE/UPSERT Command
The PostgreSQL documentation provides a solution for an MERGE/INSERT function. 'UPSERT' is the PostgreSQL slang for ANSI SQL 'Merge' and means UPDATE-SELECT. But this solution proposal hast concurrency issues which have been avoided by this proposal (by Depesz's Blog [7]).
PL/pgSQL-Funktion für ein MERGE/UPSERT mit einem Attribut ('in_whatever'):
-- All function parameters and all the update and insert -- statements in the body need to be adapted for specific tables CREATE OR REPLACE FUNCTION merge_db( key INT, data TEXT ) RETURNS void AS $$ BEGIN -- first try to update with primary key as id UPDATE db SET b=data WHERE a = key; --UPDATE db SET b=data,etc. WHERE a = key; IF FOUND THEN RETURN; END IF; -- not there, so try to insert the key -- if someone else inserts the same key BEGIN INSERT INTO db (a,b) VALUES (key, data); --INSERT INTO db (a,b,etc.) VALUES (a,b,etc.); EXCEPTION WHEN OTHERS THEN -- try another time UPDATE db SET b=data WHERE a = key; END; RETURN; END; $$ LANGUAGE plpgsql;
Unfortunately, there's no other generic function around which would accept any table structure.
Triggers
Tipps:
- Prefer triggers over rules (since 9.0).
- ...
Q: Is there a restriction that an insert trigger can't modify own or other rows of the table that fired it?
A: There's no such restriction. The best way to change the inserted row is to do it before the insertion, though. Trying to modify the inserted row can have slightly strange effects.
Full Text Search
Siehe auch: http://delicious.com/sfkeller/fulltext
Freie Text-Corpora (Corpus):
- Project Gutenberg: http://www.gutenberg.org/
- Wikipedia: http://en.wikipedia.org/wiki/Wikipedia:Database_download and http://dumps.wikimedia.org/
- DBPedia: http://dbpedia.org/About
- And...: Stackoverflow, Shakespeare, Library of US Congress?
Zur Konfiguration von Full Text Search (FTS).
Wenn man...
- (allgemeine) englische Texte durchsuchen will, muss man nichts umkonfigurieren (ausser den Index).
- fachspezifische Texte hat, dann ist ein sep. Thesaurus zu konfigurieren (.syn + .thes laden).
- (allgemeine) deutsche Texte hat, dann kann man für's Erste den Stemmer umkonfigurieren (plus Index).
- ...
Die effiziente Textsuche kann verschiedene Ziele verfolgen:
- Suche in deutschen (ev. auch gemischten de+en Dokumenten) allgemein.
- Suche in fach- oder branchen-spezifische Texten.
- Exakte Suche oder unscharfe Suche.
Als unscharfe Suche kommen in PG in Frage:
- soundex, metaphone, dmetaphone (built-in Fn.) und Levenshtein (Modul fuzzystrmatch).
- Trigramm (zusammen mit dem speziellen GiST/kNN-Index).