PostgreSQL
Basic setup
Let's create and explore a test cluster before using it for something important. So, how does this work? Well, let's create one with the following config. Click the annotation markers for more information.
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster # (1)
metadata:
  name: test
spec:
  postgresVersion: 16 # (2)
  instances:
  - name: instance1
    replicas: 2 # (3)
    dataVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 2Gi # (4)
      storageClassName: rook-block
    # (5)
    affinity: # (6)
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          podAffinityTerm:
            topologyKey: kubernetes.io/hostname
            labelSelector:
              matchLabels:
                postgres-operator.crunchydata.com/cluster: test
                postgres-operator.crunchydata.com/instance-set: instance1
  backups:
    pgbackrest: # (7)
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.47-2
      repos:
      - name: repo1
        volume:
          volumeClaimSpec:
            accessModes:
            - "ReadWriteOnce"
            resources:
              requests:
                storage: 2Gi
            storageClassName: rook-block
1. Remember, operators extend Kubernetes functionality; this is a new resource type.
2. We can set the PostgreSQL version here.
3. Define how many instances we want to run.
4. Define the disk size of each instance.
5. In theory, that's all we need to define. But let's go further for bonus points.
6. This section is what gives us actual high availability. It's not special to the PGO, but native Kubernetes functionality: it tells Kubernetes to spread the instances across different hostnames where possible, so that the loss of one Kubernetes node does not take down all of our instances.
7. pgBackRest is the only backup option, and it is a required setting.
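Assuming you've saved the manifest above as postgres.yaml (a filename chosen here purely for illustration), create the cluster with:

```shell
# Create (or update) the PostgresCluster resource in the current namespace
kubectl apply -f postgres.yaml
```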
Connect and look around the cluster
Check out what the above manifest has created first:
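A minimal check, assuming the cluster was created in the default namespace:

```shell
# List the pods the operator created for our cluster
kubectl get pods
```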
The output will be similar to the following:
NAME                     READY   STATUS    RESTARTS   AGE
test-backup-znmx-zzf84   1/1     Running   0          87s
test-instance1-f74j-0    4/4     Running   0          107s
test-instance1-pf8x-0    4/4     Running   0          106s
test-repo-host-0         2/2     Running   0          106s
We can see the two instances we requested, one backup repository and one active backup job already running.
So what else has been set up for us? We said this is highly available, and the PGO uses Patroni to manage that. So let's look at its status:
kubectl get pods -l postgres-operator.crunchydata.com/instance-set=instance1 -o=custom-columns='NAME:.metadata.name,ROLE:.metadata.labels.postgres-operator\.crunchydata\.com/role'
You should see something like this:

NAME                    ROLE
test-instance1-f74j-0   master
test-instance1-pf8x-0   replica

We can see one master and one replica, which seems sensible. If we had more instances, there would be more replicas.
What about connecting? Well, the PGO also creates an initial user and database for us.
List the cluster's secrets with kubectl get secrets, and you'll find the following:
NAME                        TYPE     DATA   AGE
test-cluster-cert           Opaque   3      26m
test-instance1-f74j-certs   Opaque   6      26m
test-instance1-pf8x-certs   Opaque   6      26m
test-pgbackrest             Opaque   5      26m
test-pguser-test            Opaque   8      26m
test-replication-cert       Opaque   3      26m
We can see that a number of native Kubernetes Secrets have been created. The one we need for connection and user info is test-pguser-test. Let's examine it:
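One way to do that, assuming the default namespace, is to dump the whole Secret as YAML:

```shell
# Show the full Secret, including the base64-encoded connection details
kubectl get secret test-pguser-test -o yaml
```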
and you will see the following, or similar:
apiVersion: v1
data:
  dbname: dGVzdA==
  host: dGVzdC1wcmltYXJ5LmRlZmF1bHQuc3Zj
  jdbc-uri: amRiYzpwb3N0Z3Jlc3FsOi8vdGVzdC1wcmltYXJ5LmRlZmF1bHQuc3ZjOjU0MzIvdGVzdD9wYXNzd29yZD1mJTVEOVBkLjVDSWVYZSU0MGFpJTI5UC5oVjZFeCU1RSZ1c2VyPXRlc3Q=
  password: Zl05UGQuNUNJZVhlQGFpKVAuaFY2RXhe
  port: NTQzMg==
  uri: cG9zdGdyZXNxbDovL3Rlc3Q6ZiU1RDlQZC41Q0llWGUlNDBhaSUyOVAuaFY2RXglNUVAdGVzdC1wcmltYXJ5LmRlZmF1bHQuc3ZjOjU0MzIvdGVzdA==
  user: dGVzdA==
  verifier: U0NSQU0tU0hBLTI1NiQ0MDk2OkdYMG0vRDNFVjRBbFIxa2NCRkN6QlE9PSRVcGpyTlpHekJYNHlsODBYWTZOYjdXZGZ0QjRaT0dSVEs3VGlaMnhpTmlBPTpIakFQdmg5MzA0R0VvY2RwZENURDA2NVJsamJpRTc2OWVoRFBzNjlaZjlNPQ==
kind: Secret
metadata:
  creationTimestamp: "2024-03-10T07:09:01Z"
  labels:
    postgres-operator.crunchydata.com/cluster: test
    postgres-operator.crunchydata.com/pguser: test
    postgres-operator.crunchydata.com/role: pguser
  name: test-pguser-test
  namespace: default
  ownerReferences:
  - apiVersion: postgres-operator.crunchydata.com/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: PostgresCluster
    name: test
    uid: 994f3f91-6aa3-4b2e-a716-3174233f2520
  resourceVersion: "43635550"
  uid: 765cce42-c1f7-4fea-b28b-bbd43fcebf52
type: Opaque
For our purposes, we want the uri value, which bundles the user, password, host, port, and database name into a single connection string.
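Every value in a Kubernetes Secret is base64-encoded. As a quick sketch, you can decode the uri value from the example Secret above on your workstation (your cluster's value will differ):

```shell
# Decode the example uri value; the result is a complete libpq connection string
echo 'cG9zdGdyZXNxbDovL3Rlc3Q6ZiU1RDlQZC41Q0llWGUlNDBhaSUyOVAuaFY2RXglNUVAdGVzdC1wcmltYXJ5LmRlZmF1bHQuc3ZjOjU0MzIvdGVzdA==' | base64 -d
# postgresql://test:f%5D9Pd.5CIeXe%40ai%29P.hV6Ex%5E@test-primary.default.svc:5432/test
```

Against a live cluster you would fetch it directly instead, e.g. kubectl get secret test-pguser-test -o go-template='{{.data.uri | base64decode}}'.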
To use this from outside of the cluster, we'll need to set up a port-forward and then use the URI string to connect:
PG_CLUSTER_PRIMARY_POD=$(kubectl get pod -o name \
-l postgres-operator.crunchydata.com/cluster=test,postgres-operator.crunchydata.com/role=master)
kubectl port-forward "${PG_CLUSTER_PRIMARY_POD}" 5432:5432
Now fetch the secrets and connect:
PG_CLUSTER_USER_SECRET_NAME=test-pguser-test
PGPASSWORD=$(kubectl get secrets "${PG_CLUSTER_USER_SECRET_NAME}" -o go-template='{{.data.password | base64decode}}')
PGUSER=$(kubectl get secrets "${PG_CLUSTER_USER_SECRET_NAME}" -o go-template='{{.data.user | base64decode}}')
PGDATABASE=$(kubectl get secrets "${PG_CLUSTER_USER_SECRET_NAME}" -o go-template='{{.data.dbname | base64decode}}')
PGPASSWORD=${PGPASSWORD} psql -h localhost -U ${PGUSER} ${PGDATABASE}
And you should finally be connected and see something similar to the following:
psql (16.2 (Homebrew), server 16.1)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.
test=>
Connection pooling
The Crunchy PGO can install and configure a PgBouncer instance for us. To do so, add the following stanza under spec in your PostgresCluster manifest:
proxy:
  pgBouncer:
    image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.21-0
That's all you need to do, easy as. Once you've added that, you'll find some new connection parameters added to the secret we used in the previous step:
- pgbouncer-host
- pgbouncer-port
- pgbouncer-uri
- pgbouncer-jdbc-uri
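Because these keys contain hyphens, a go-template needs index rather than dot access (.data.pgbouncer-uri will not parse). A sketch, assuming the same test-pguser-test secret as before:

```shell
# Read the PgBouncer connection URI from the user secret
kubectl get secret test-pguser-test \
  -o go-template='{{index .data "pgbouncer-uri" | base64decode}}'
```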
Backup and disaster recovery
Configuration
Using pgBackRest, there are four different places where backups can be stored:
- Azure
- GCS
- S3
- Kubernetes volume
You can mix and match, and have as many different storage repositories as you would like. We've set up a Kubernetes volume in our configuration, and while the configuration is specific to the storage type, the operations are the same.
Backup operations and management
pgBackRest has three backup types available, aimed at keeping your recovery time (Recovery Time Objective, or RTO) as short as possible:
- full
- differential
- incremental
Selecting the appropriate strategy is outside the scope of this document, but let's show an example using PGO configuration.
Schedule
Schedules are cron-formatted strings that dictate the backup type, and time to execute.
Let's say that our backup policy is to take a full backup weekly on Sunday at 1am and take differential backups daily at 1am on every day except Sunday. We would want to add configuration to our spec that looks similar to:
spec:
  backups:
    pgbackrest:
      repos:
      - name: repo1
        schedules:
          full: "0 1 * * 0"
          differential: "0 1 * * 1-6"
The PGO will create the necessary Kubernetes CronJobs to perform the described backups.
Retention
PGO lets you set backup retention on full and differential backups. When a full backup expires, either through your retention policy or through manual expiration, pgBackRest will clean up any backup and WAL files associated with it. For example, if you have a full backup with four associated incremental backups, when the full backup expires, all of its incremental backups also expire.
There are two different types of backup retention you can set:
- count: This is based on the number of backups you want to keep. This is the default.
- time: This is based on the total number of days you would like to keep a backup.

Let's look at an example where we keep full backups for 14 days. The most convenient way to do this is through the spec.backups.pgbackrest.global section:
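A sketch of that retention policy; repo1-retention-full and repo1-retention-full-type are pgBackRest options passed through the global map (note the value is quoted as a string):

```yaml
spec:
  backups:
    pgbackrest:
      global:
        repo1-retention-full: "14"
        repo1-retention-full-type: time
```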
One-off manual backup
There are times where you may want to take a one-off backup, such as before major application changes or updates. This is not your typical declarative action -- in fact, a one-off backup is imperative in nature! -- but it is possible to take a one-off backup of your Postgres cluster with PGO.
First, you need to configure the spec.backups.pgbackrest.manual section to be able to take a one-off backup. This contains information about the type of backup you want to take and any other pgBackRest configuration options.
Let's configure the custom resource to take a one-off full backup:
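A sketch of that configuration, reusing repo1 from our manifest; repoName selects the target repository, and options passes flags straight through to pgBackRest:

```yaml
spec:
  backups:
    pgbackrest:
      manual:
        repoName: repo1
        options:
        - --type=full
```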
This does not yet trigger the one-off backup -- you have to do that by adding the postgres-operator.crunchydata.com/pgbackrest-backup annotation to your custom resource. The best way to set this annotation is with a timestamp, so you know when you initialized the backup.
For example, for our test cluster, we can run the following command to trigger the one-off backup:
kubectl annotate postgrescluster test \
postgres-operator.crunchydata.com/pgbackrest-backup="$(date)"
PGO will detect this annotation and create a new, one-off backup Job!
Next steps
See the Crunchy PGO documentation for more information such as high availability, monitoring, and managing your cluster.