PostgreSQL - Tipps und Tricks

Aus Geoinformation HSR
Version vom 8. Januar 2012, 01:01 Uhr von Stefan (Diskussion | Beiträge) (Large objects (LO))

Wechseln zu: Navigation, Suche

Siehe auch:


Meine unsortierten Notizen...

Tools und Konfiguration

  • psql und wohl alle weiteren PostgreSQL-Tools (dropdb, createdb, pg_dump, etc.) u.a. brauchen Angaben zu Benutzer und Passwort. Der Benutzer wird mit Parameter -U mitgegeben, das Passwort hingegen kann (mit Absicht) nicht mitgegeben werden. Es gibt aber die Möglichkeit, Umgebungsvariablen zu setzen:
 SET PGUSER=postgres
 SET PGPASSWORD=<meinpasswort>
  • Start pgsl-Client:
 > psql -h localhost -p 5432 template1 "postgres"
 oder
 > cmd /c chcp 1252
 > psql -d gisdb -U postgres
 postgres=# SET CLIENT_ENCODING TO 'LATIN1';
 postgres=# <do whatever you must do in psql...>
 > cmd /c chcp 850

Create "Read-only User"

From: [1]

If you only grant CONNECT to a database, the user can connect but has no other privileges. In PostgreSQL 8.x you have to grant USAGE on namespaces (schemas) and SELECT on tables and views individually. So something like:

 GRANT CONNECT ON DATABASE mydb TO xxx;
 -- This assumes you're actually connected to mydb..
 GRANT USAGE ON SCHEMA public TO xxx;
 GRANT SELECT ON mytable TO xxx;

Granting 'SELECT' to each table/view individually can be a bore. Something like this can help:

 SELECT 'GRANT SELECT ON ' || relname || ' TO xxx;'
 FROM pg_class JOIN pg_namespace ON pg_namespace.oid = pg_class.relnamespace
 WHERE nspname = 'public' AND relkind IN ('r', 'v')

This should output the relevant GRANT commands to GRANT SELECT on all tables and views in public, for copy-n-paste.

PostgreSQL 9:

In PostgreSQL 9.0 its easier. You still need to grant USAGE permissions on schemas, but you can grant permissions on all tables, views etc. in the schema using a single command rather than having to generate them:

 GRANT SELECT ON ALL TABLES IN SCHEMA public TO xxx;

This only affects tables that have already been created, so it's the equivalent of the generated commands above. More powerfully, you can automatically have default roles assigned to new objects in future:

 ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO osmgis;

Note that by default this will only affect objects (tables) created by the user that issued this command: although it can also be set on any role that the issuing user is a member of. However, you don't pick up default privileges for all roles you're a member of when creating new objects... so there's still some faffing around. If you adopt the approach that a database has an owning role, and schema changes are performed as that owning role, then you should assign default privileges to that owning role. IMHO this is all a bit confusing and you may need to experiment to come up with a functional workflow.


Daten laden/importieren

Siehe http://wiki.postgresql.org/wiki/COPY

COPY Syntax:

  • Tab-Delimited

 COPY geometry_columns (f_table_catalog, f_table_schema, f_table_name, f_geometry_column, coord_dimension, srid, type) FROM stdin;
 catalog	public	planet_osm_point	way	2	900913	POINT
 catalog	public	planet_osm_line	way	2	900913	LINESTRING
 \.

Foreign Data Wrappers

Ab Version 9.1 werden Foreign Data Wrappers (FDW) und ein Teil des SQL/MED-Standards unterstützt.

file_fdw ist eine offizielle Extension (contrib module): [2].

Siehe auch PostgreSQL Wiki:

  • Foreign_data_wrappers [3]
  • SQL/MED [4]

PostgreSQL SQL

Siehe auch im Wiki der Vorlesung Datenbanken (Dbs1) von Prof. S. Keller.

Version:

 select version();

Anfrage des Geometrie-Typs ('the_geom' ist der Name des Geometrie-Felds):

 SELECT *
   FROM sometable
   WHERE GeometryType(the_geom) = 'GEOMETRYCOLLECTION'

Problem mit Datenfelder (for < 8.3):

 SELECT the_geom ... (end_date - NOW() >= 10) AS GREEN 
   AND (end_date - NOW() < 0) AS RED ... FROM ...

Casting changes in >= 8.3:

 SELECT the_geom ... (end_date - NOW() >= '10 DAYS'::INTERVAL) AS GREEN 
   AND (end_date - NOW() < '0'::INTERVAL) AS RED ... FROM ...
 SELECT the_geom ...
   EXTRACT(DAY FROM (end_date - NOW()) >= 10) AS GREEN
   AND EXTRACT(DAY FROM (end_date - NOW()) < 0) AS RED ... FROM ...

hstore

Stichworte: Key-Value-Attributes, Hashtable, nosql-Database.

Siehe auch:

  • PostgreSQL docs.
  • Seminar thesis from M. Ott about Key/Value Pair Versus hstore - Benchmarking Entity-Attribute-Value Structures in PostgreSQL in [5]
  • Beispiele: PostGIS Terminal.

Binary Large Objects (LOB, BLOB)

In PostgreSQL there exist three ways to manage Binary Large Objects:

  1. Data type BYTEA
  2. LO facility 'pg_largeobject' (a system table)
  3. Data type DATALINK (experimental)

PostgreSQL also supports a storage system called "TOAST" that automatically stores values larger than a single database page (typically 8 KB) into a secondary storage area per table. This defines the size limit of of any column/field to 1 GB.

BYTEA

The BYTEA data type allows storage of binary strings.

  • It stores a LOB within the table, respectively using TOAST.
  • It is thus limited to 1 GB
  • The storage is octal and allows non printable characters (in contrast to character strings which don't, but allow locale settings).
  • The input/output format is HEX (as of PostgreSQL 9.0).

Notes:

  • BYTEA comes close to the SQL standard binary string type 'BLOB'. The functions and operators provided by BYTEA are mostly the same, while HEX input format of BYTEA is different.
  • BYTEA is slower for lengths >20 MB than the LO facility (it has no random accees).

See PostgreSQL doc Binary Data Types.

Large objects (LO)

Large objects (LO) can also placed in a single system table called 'pg_largeobject' which has to be accessed via identifiers of data type OID.

  • There is a read/write API which offers client (= modules written in C) and server-side (= SQL) functions.
  • The main related SQL functions are: lo_creat(), lo_create(), lo_unlink(), lo_import(), and lo_export(). lo_import() and lo_export() need permissions of the database's owning user (i.e. superuser).
  • LO are broken into "chunks" and stored in btree-indexed rows.
  • LO allows values up to 2 GB in size, whereas TOASTed fields (like BYTEA) can be at most 1 GB.
  • LO entries can be randomly modified using a read/write API that is more efficient than performing such operations using TOAST (and e.g BYTEA).

Note:

  • When PostgreSQL doc. mentions 'lo' (LO = Large Object) it typically refers to this facility.
  • In contrast to e.g. BYTEA - LO is not a data type on its own but a table, a 'facility'.

IMPORTANT NOTE when using JDBC BLOB (or @Lob annotation in Hibernate):

  • Since PostgreSQL considers a LO entry as an object on it's own, pg_largeobjects grows infinitely unless a separate cleaning is made.
  • To prevent this, typically a trigger needs to be added which deletes entries in pg_largeobject as descriebd in module 'lo'.
  • See additional PostgreSQL modules 'lo' and 'vaccumlo' in the PostgreSQL docs.

See PostgreSQL doc 'Large Objects' and JDBC data type BLOB: [6].

Example: Typical usage of the server-side (= SQL) functions (from the Postgres docs):

 CREATE TABLE image (
   name            text,
   raster          oid
 );
 
 SELECT lo_creat(-1);       -- returns OID of new, empty large object.
 
 SELECT lo_create(43213);   -- attempts to create large object with OID 43213.
 
 SELECT lo_unlink(173454);  -- deletes large object with OID 173454.
 
 INSERT INTO image (name, raster)
   VALUES ('beautiful image', lo_import('/etc/motd'));
 
 INSERT INTO image (name, raster)  -- same as above, but specify OID to use.
   VALUES ('beautiful image', lo_import('/etc/motd', 68583));
 
 SELECT lo_export(image.raster, '/tmp/motd') FROM image  -- need superuser permission.
   WHERE name = 'beautiful image';

DATALINK

The DATALINK type maintains a link to a specific file in external storage.

Example:

 CREATE TABLE mydata (
    myid Integer, 
    myimage DATALINK(40) 
    FILE LINK CONTROL INTEGRITY ALL
 );

Weblinks:

  • DATALINK wiki page
  • MSE-Seminar Thesis (2011) from Florian Schwendener about SQL/MED and More - Management of External Data in PostgreSQL and Microsoft SQL Server: [7]

Data Warehousing mit PostgreSQL

Stichworte: PostgreSQL-Module cube, PostgreSQL-Module tablefunc (mit crosstab).

tbd.

Konfiguration

Die pg_hba.conf kann so eingestellt werden, dass bei der Ausführung von PostgreSQL-Kommandozeilen-Tools kein Passwort übertragen werden muss.

Bsp mit 127.0.0.1: Anstelle

 # IPv4 local connections:
 host    all         all         127.0.0.1/32          md5

neu:

 # IPv4 local connections:
 host    all         all         127.0.0.1/32          trust

Weitere Tipps...

  • Code snippets auf Postgres Wiki