From db4dc3239ea33f222ce3706980ef60d0a652b373 Mon Sep 17 00:00:00 2001
From: Jinpei Su <jpsu@alauda.io>
Date: Thu, 18 Jun 2026 06:01:28 +0000
Subject: [PATCH 1/2] docs: add PostgreSQL KB how-to and troubleshooting guides
 (MIDDLEWARE-31526)

Consolidate historical internal KB solutions into the product manual,
modernized to the current acid.zalan.do/v1 postgresql CR and verified
live on ACP 4.2/4.3:

how_to: install pgvector, install zhparser, configure pg_hba whitelist,
run as root, disable NodePort exposure.
trouble_shooting: connection SSL off, pg_wal disk full, coredump from
huge pages, repair streaming replica.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/en/how_to/configure_pg_hba_whitelist.mdx |  72 ++++++++++++
 docs/en/how_to/disable_nodeport_exposure.mdx  |  74 ++++++++++++
 docs/en/how_to/install_pgvector_extension.mdx | 109 ++++++++++++++++++
 docs/en/how_to/install_zhparser_extension.mdx | 102 ++++++++++++++++
 docs/en/how_to/run_postgresql_as_root.mdx     |  75 ++++++++++++
 .../trouble_shooting/connection_ssl_off.mdx   |  66 +++++++++++
 .../trouble_shooting/coredump_huge_pages.mdx  |  56 +++++++++
 .../fix_streaming_replication.mdx             |  65 +++++++++++
 docs/en/trouble_shooting/pg_wal_disk_full.mdx |  63 ++++++++++
 9 files changed, 682 insertions(+)
 create mode 100644 docs/en/how_to/configure_pg_hba_whitelist.mdx
 create mode 100644 docs/en/how_to/disable_nodeport_exposure.mdx
 create mode 100644 docs/en/how_to/install_pgvector_extension.mdx
 create mode 100644 docs/en/how_to/install_zhparser_extension.mdx
 create mode 100644 docs/en/how_to/run_postgresql_as_root.mdx
 create mode 100644 docs/en/trouble_shooting/connection_ssl_off.mdx
 create mode 100644 docs/en/trouble_shooting/coredump_huge_pages.mdx
 create mode 100644 docs/en/trouble_shooting/fix_streaming_replication.mdx
 create mode 100644 docs/en/trouble_shooting/pg_wal_disk_full.mdx

diff --git a/docs/en/how_to/configure_pg_hba_whitelist.mdx b/docs/en/how_to/configure_pg_hba_whitelist.mdx
new file mode 100644
index 0000000..805aab3
--- /dev/null
+++ b/docs/en/how_to/configure_pg_hba_whitelist.mdx
@@ -0,0 +1,72 @@
+---
+weight: 42
+title: Configuring the pg_hba Client Authentication Whitelist
+---
+
+# Configuring the pg_hba Client Authentication Whitelist
+
+## Overview
+
+PostgreSQL client authentication is controlled by `pg_hba.conf`. In a cluster
+managed by the PostgreSQL Operator, this file is rendered and managed by
+Patroni — **editing `pg_hba.conf` inside the container has no effect** because
+Patroni overwrites it. Instead, declare the rules in the `postgresql` custom
+resource under `spec.patroni.pg_hba`, and the Operator/Patroni will apply and
+reload them.
+
+## Prerequisites
+
+- A running PostgreSQL cluster managed by the PostgreSQL Operator.
+- Permission to edit the `postgresql` custom resource.
+
+## Procedure
+
+### 1. Locate the custom resource
+
+```bash
+kubectl get postgresql -n $NAMESPACE
+```
+
+### 2. Set the pg_hba rules
+
+Edit the `postgresql` resource and add the whitelist under `spec.patroni.pg_hba`.
+Keep the internal Patroni/replication entries, and append your own rules. Order
+matters — the first matching rule wins.
+
+```yaml
+spec:
+  patroni:
+    pg_hba:
+      - local     all          all                     trust
+      - hostssl   all          +zalandos  127.0.0.1/32 pam
+      - host      all          all        127.0.0.1/32 md5
+      - hostssl   all          +zalandos  ::1/128      pam
+      - host      all          all        ::1/128      md5
+      - hostssl   replication  standby    all          md5
+      - hostssl   all          +zalandos  all          pam
+      - hostssl   all          all        all          md5
+      - host      all          all        0.0.0.0/0    md5
+      - host      all          all        ::0/0        md5
+```
+
+Apply with `kubectl apply` / `kubectl edit`. Patroni reloads the configuration
+without a database restart.
+
+### 3. Verify
+
+```bash
+kubectl exec -n $NAMESPACE $CLUSTER_NAME-0 -c postgres -- \
+  psql -U postgres -c "SELECT line_number, type, database, user_name, address, auth_method, error FROM pg_hba_file_rules ORDER BY line_number;"
+```
+
+The output should reflect the rules you declared. `pg_hba_file_rules` also
+reports parse errors in the `error` column if a rule is malformed.
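+
+To surface only malformed rules, a minimal sketch (assuming PostgreSQL 10 or
+later, where the `pg_hba_file_rules` view exists) filters on the `error` column:
+
+```sql
+-- Rows appear only for rules that failed to parse; an empty result means the
+-- file parsed cleanly.
+SELECT line_number, error
+FROM pg_hba_file_rules
+WHERE error IS NOT NULL
+ORDER BY line_number;
+```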
+
+## Notes
+
+- Prefer `hostssl ... md5` over plain `host ... md5` when exposing the database
+  beyond the cluster, so that credentials are not sent over an unencrypted
+  connection. See also
+  [Connection fails with "SSL off"](../trouble_shooting/connection_ssl_off.mdx).
+- `+zalandos` is an internal role group used by the Operator; do not remove the
+  `+zalandos` lines or internal components may lose access.
diff --git a/docs/en/how_to/disable_nodeport_exposure.mdx b/docs/en/how_to/disable_nodeport_exposure.mdx
new file mode 100644
index 0000000..c49254f
--- /dev/null
+++ b/docs/en/how_to/disable_nodeport_exposure.mdx
@@ -0,0 +1,74 @@
+---
+weight: 44
+title: Disabling NodePort Exposure for a PostgreSQL Cluster
+---
+
+# Disabling NodePort Exposure for a PostgreSQL Cluster
+
+## Overview
+
+By default the Service that fronts a PostgreSQL cluster is of type `NodePort`,
+which opens a port on every node. In environments where exposing a node port is
+not acceptable, you can switch the Service to type `LoadBalancer` and disable
+node-port allocation, so the database is no longer reachable through a node
+port.
+
+:::info
+This requires the platform to provide a LoadBalancer implementation (for example
+MetalLB). If no `IPAddressPool` is configured, the Service's `EXTERNAL-IP` stays
+`<pending>` — the node port is still removed, but no external address is
+assigned until a pool exists. On OpenShift Container Platform, prefer exposing
+the database through a Route / passthrough instead of a node port.
+:::
+
+## Prerequisites
+
+- A running PostgreSQL cluster managed by the PostgreSQL Operator.
+- A LoadBalancer provider on the cluster if external reachability is required.
+
+## Procedure
+
+Set `$CLUSTER_NAME` and `$NAMESPACE` for the target cluster.
+
+### 1. Switch the Services to LoadBalancer
+
+```bash
+kubectl patch postgresql -n $NAMESPACE $CLUSTER_NAME --type merge \
+  -p '{"spec":{"enableMasterLoadBalancer":true,"enableReplicaLoadBalancer":true}}'
+```
+
+Wait ~30 seconds for the Operator to reconcile and the Service type to change to
+`LoadBalancer`:
+
+```bash
+kubectl get svc -n $NAMESPACE $CLUSTER_NAME -o jsonpath='{.spec.type}{"\n"}'
+```
+
+### 2. Remove node-port allocation
+
+Patch the master Service (and the `-repl` Service if you enabled the replica
+LoadBalancer) to stop allocating node ports:
+
+```bash
+kubectl patch service -n $NAMESPACE $CLUSTER_NAME \
+  -p '{"spec":{"allocateLoadBalancerNodePorts":false,"ports":[{"name":"postgresql","nodePort":null,"port":5432,"protocol":"TCP","targetPort":5432}]}}'
+
+kubectl patch service -n $NAMESPACE $CLUSTER_NAME-repl \
+  -p '{"spec":{"allocateLoadBalancerNodePorts":false,"ports":[{"name":"postgresql","nodePort":null,"port":5432,"protocol":"TCP","targetPort":5432}]}}'
+```
+
+### 3. Verify
+
+```bash
+kubectl get svc -n $NAMESPACE $CLUSTER_NAME \
+  -o 'custom-columns=NAME:.metadata.name,TYPE:.spec.type,NODEPORT:.spec.ports[0].nodePort,ALLOC:.spec.allocateLoadBalancerNodePorts'
+```
+
+Expected: `TYPE=LoadBalancer`, `NODEPORT=<none>`, `ALLOC=false`.
+
+:::note
+The `ports[].name` in the patch must match the existing port name on the
+Service. Inspect it first with
+`kubectl get svc -n $NAMESPACE $CLUSTER_NAME -o jsonpath='{.spec.ports[*].name}'`
+and adjust the patch accordingly.
+:::
diff --git a/docs/en/how_to/install_pgvector_extension.mdx b/docs/en/how_to/install_pgvector_extension.mdx
new file mode 100644
index 0000000..b07d21f
--- /dev/null
+++ b/docs/en/how_to/install_pgvector_extension.mdx
@@ -0,0 +1,109 @@
+---
+weight: 40
+title: Installing the pgvector Extension
+---
+
+# Installing the pgvector Extension
+
+## Overview
+
+[pgvector](https://github.com/pgvector/pgvector) adds a `vector` data type and
+nearest-neighbor search to PostgreSQL, which is commonly used for embedding /
+similarity-search workloads. The extension is pre-bundled in the Spilo image
+shipped with the PostgreSQL Operator, so no image rebuild is required — you only
+need to create the extension inside the target database.
+
+## Prerequisites
+
+- A running PostgreSQL cluster managed by the PostgreSQL Operator.
+- A database user with privileges to create extensions (the `postgres`
+  superuser, used below).
+
+## Procedure
+
+### 1. Verify the extension is available
+
+```bash
+kubectl exec -n $NAMESPACE $CLUSTER_NAME-0 -c postgres -- \
+  psql -U postgres -tAc \
+  "SELECT name, default_version FROM pg_available_extensions WHERE name = 'vector';"
+```
+
+Expected output (version may differ depending on the operand release):
+
+```
+vector|0.8.2
+```
+
+### 2. Create the extension
+
+```sql
+CREATE EXTENSION IF NOT EXISTS vector;
+```
+
+### 3. Smoke test
+
+```sql
+-- Create a table with a 3-dimensional vector column
+CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));
+
+-- Insert sample data
+INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');
+
+-- Order by L2 distance to a query vector
+SELECT id, embedding <-> '[3,1,2]' AS l2_distance FROM items ORDER BY l2_distance;
+```
+
+The distance operators are:
+
+| Operator | Distance |
+|----------|----------|
+| `<->`    | L2 (Euclidean) |
+| `<#>`    | negative inner product |
+| `<=>`    | cosine |
+
+## Indexing for approximate nearest-neighbor search
+
+By default pgvector performs an exact search (perfect recall). For larger
+datasets you can add an approximate index, trading some recall for speed.
+
+### IVFFlat
+
+Build the index **after** the table contains data. A good starting point for
+the number of lists is `rows / 1000` (up to 1M rows) or `sqrt(rows)` beyond
+that.
+
+```sql
+-- L2 distance
+CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100);
+
+-- Tune probes at query time (higher = better recall, slower)
+SET ivfflat.probes = 10;
+```
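+
+The starting-point rule above can be worked out from the current row count.
+This is a sketch only: it assumes the `items` table from the smoke test, and
+the result is a first guess to tune, not a definitive value.
+
+```sql
+-- rows / 1000 up to 1M rows, sqrt(rows) beyond that; never below 1
+SELECT CASE
+         WHEN n <= 1000000 THEN GREATEST(n / 1000, 1)
+         ELSE ceil(sqrt(n))::bigint
+       END AS suggested_lists
+FROM (SELECT count(*) AS n FROM items) AS s;
+```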
+
+### HNSW
+
+HNSW has slower build time and higher memory usage than IVFFlat but better
+query performance, and can be created on an empty table.
+
+```sql
+CREATE INDEX ON items USING hnsw (embedding vector_l2_ops) WITH (m = 16, ef_construction = 64);
+
+-- Tune the search candidate list at query time (default 40)
+SET hnsw.ef_search = 100;
+```
+
+Use `vector_ip_ops` (inner product) or `vector_cosine_ops` (cosine) instead of
+`vector_l2_ops` to index the corresponding distance function.
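+
+As an illustration of switching the operator class, the following sketch
+indexes and queries by cosine distance. It reuses the `items` table from the
+smoke test; the `LIMIT` value is arbitrary.
+
+```sql
+-- The operator class must match the operator used in ORDER BY
+CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops);
+
+-- <=> is cosine distance; ordering by the operator with a LIMIT lets the
+-- planner consider the approximate index
+SELECT id, embedding <=> '[3,1,2]' AS cosine_distance
+FROM items
+ORDER BY embedding <=> '[3,1,2]'
+LIMIT 5;
+```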
+
+## Upgrading the extension
+
+```sql
+ALTER EXTENSION vector UPDATE;
+```
+
+## Verification
+
+```sql
+SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';
+```
diff --git a/docs/en/how_to/install_zhparser_extension.mdx b/docs/en/how_to/install_zhparser_extension.mdx
new file mode 100644
index 0000000..756c0ae
--- /dev/null
+++ b/docs/en/how_to/install_zhparser_extension.mdx
@@ -0,0 +1,102 @@
+---
+weight: 41
+title: Installing the zhparser Extension
+---
+
+# Installing the zhparser Extension
+
+## Overview
+
+[zhparser](https://github.com/amutu/zhparser) is a PostgreSQL full-text search
+parser for Chinese, based on SCWS. It is pre-bundled in the Spilo image shipped
+with the PostgreSQL Operator, so you only need to create the extension and a
+text-search configuration that uses it.
+
+## Prerequisites
+
+- A running PostgreSQL cluster managed by the PostgreSQL Operator.
+- A database user with privileges to create extensions (the `postgres`
+  superuser, used below). Managing the custom dictionary requires superuser
+  privileges.
+
+## Procedure
+
+### 1. Create the extension
+
+```sql
+CREATE EXTENSION IF NOT EXISTS zhparser;
+```
+
+### 2. Create a text-search configuration
+
+```sql
+CREATE TEXT SEARCH CONFIGURATION testzhcfg (PARSER = zhparser);
+ALTER TEXT SEARCH CONFIGURATION testzhcfg ADD MAPPING FOR n,v,a,i,e,l WITH simple;
+```
+
+### 3. Tokenize and build search vectors
+
+```sql
+-- Inspect raw tokenization
+SELECT * FROM ts_parse('zhparser', '保障房资金压力');
+
+-- Build a tsvector using the configuration created above
+SELECT to_tsvector('testzhcfg', '2011年保障房进入了更大规模的建设阶段');
+
+-- Build a tsquery
+SELECT to_tsquery('testzhcfg', '保障房资金压力');
+```
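+
+To search stored text with this configuration, one option is an expression
+index. The sketch below assumes a hypothetical `docs` table; the table, column,
+and index names are illustrative and not part of the product.
+
+```sql
+-- Hypothetical table holding Chinese text
+CREATE TABLE docs (id bigserial PRIMARY KEY, body text);
+
+-- GIN index over the tsvector produced by the testzhcfg configuration
+CREATE INDEX docs_body_fts_idx ON docs USING gin (to_tsvector('testzhcfg', body));
+
+-- The query must use the same expression for the index to be usable
+SELECT id
+FROM docs
+WHERE to_tsvector('testzhcfg', body) @@ to_tsquery('testzhcfg', '保障房');
+```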
+
+## Custom dictionary
+
+The custom dictionary is scoped per **database** (not per instance) and is
+stored under the data directory. Adding custom words requires superuser
+privileges.
+
+```sql
+-- Add a custom word
+INSERT INTO zhparser.zhprs_custom_word VALUES ('资金压力');
+
+-- Synchronize the dictionary
+SELECT sync_zhprs_custom_word();
+```
+
+Re-establish your session (reconnect) for the change to take effect. After that,
+`资金压力` is tokenized as a single word instead of `资金` + `压力`.
+
+## Parser configuration
+
+The following options control dictionary loading and tokenization behavior
+(PostgreSQL 9.2+). The boolean options all default to `false`:
+
+| Option | Purpose |
+|--------|---------|
+| `zhparser.punctuation_ignore` | Ignore punctuation and special symbols |
+| `zhparser.seg_with_duality`   | Aggregate loose characters using bigram segmentation |
+| `zhparser.dict_in_memory`     | Load the whole dictionary into memory |
+| `zhparser.multi_short`        | Compound short words |
+| `zhparser.multi_duality`      | Compound loose characters into bigrams |
+| `zhparser.multi_zmain`        | Compound important single characters |
+| `zhparser.multi_zall`         | Compound all single characters |
+
+```sql
+SHOW zhparser.punctuation_ignore;
+ALTER SYSTEM SET zhparser.punctuation_ignore = true;
+SELECT pg_reload_conf();
+```
+
+`zhparser.extra_dicts` and `zhparser.dict_in_memory` must be set before the
+backend starts (set them in the configuration and reload; new connections pick
+them up). The other options can be set per session.
+
+## Upgrading the extension
+
+```sql
+ALTER EXTENSION zhparser UPDATE;
+```
+
+## Verification
+
+```sql
+SELECT extname, extversion FROM pg_extension WHERE extname = 'zhparser';
+```
diff --git a/docs/en/how_to/run_postgresql_as_root.mdx b/docs/en/how_to/run_postgresql_as_root.mdx
new file mode 100644
index 0000000..ca89e2a
--- /dev/null
+++ b/docs/en/how_to/run_postgresql_as_root.mdx
@@ -0,0 +1,75 @@
+---
+weight: 43
+title: Running PostgreSQL Internal Processes as root
+---
+
+# Running PostgreSQL Internal Processes as root
+
+## Overview
+
+By default the PostgreSQL Operator runs the database container as a non-root
+user for security. Some integrations — for example traditional storage backends
+that require root to mount or access volumes — only work when the container runs
+as root. This guide shows how to opt into running as root.
+
+:::warning
+Running the database as root is **not recommended**. It increases the attack
+surface and violates least-privilege principles: container-escape / privilege
+escalation become more damaging, account isolation is weakened, and it may
+violate security-compliance requirements. Only enable this when an integration
+genuinely requires it.
+:::
+
+## Prerequisites
+
+- A running PostgreSQL cluster managed by the PostgreSQL Operator.
+- Permission to edit the `postgresql` custom resource.
+- On OpenShift Container Platform (OCP): the target namespace's ServiceAccount
+  must be allowed to run privileged pods (for example by binding the
+  `privileged` SCC). Without this, the pods will be rejected by the Security
+  Context Constraints admission.
+
+## Procedure
+
+### 1. Set the root security fields
+
+Edit the `postgresql` resource and set the following fields:
+
+```yaml
+spec:
+  spiloRunAsUser: 0
+  spiloRunAsGroup: 0
+  spiloPrivileged: true
+  spiloAllowPrivilegeEscalation: true
+```
+
+Apply the change. The Operator rolls the pods so the new pod security context
+takes effect.
+
+### 2. Verify
+
+```bash
+kubectl exec -n $NAMESPACE $CLUSTER_NAME-0 -c postgres -- id
+```
+
+Expected output (uid 0):
+
+```
+uid=0(root) gid=0(root) groups=0(root),103(postgres)
+```
+
+You can also confirm the pod security context:
+
+```bash
+kubectl get pod $CLUSTER_NAME-0 -n $NAMESPACE \
+  -o jsonpath='{.spec.securityContext}{"\n"}{.spec.containers[0].securityContext}{"\n"}'
+```
+
+It should show `runAsUser: 0`, `runAsGroup: 0`, `privileged: true` and
+`allowPrivilegeEscalation: true`.
+
+## Reverting
+
+Remove the four fields (or set `spiloRunAsUser`/`spiloRunAsGroup` back to the
+non-root defaults `101`/`103` and the privileged flags to `false`) and apply.
+The Operator rolls the pods back to the non-root security context.
diff --git a/docs/en/trouble_shooting/connection_ssl_off.mdx b/docs/en/trouble_shooting/connection_ssl_off.mdx
new file mode 100644
index 0000000..bf9ce4f
--- /dev/null
+++ b/docs/en/trouble_shooting/connection_ssl_off.mdx
@@ -0,0 +1,66 @@
+---
+weight: 40
+title: Connection Fails with "SSL off"
+---
+
+# Connection Fails with "SSL off"
+
+## Problem Description
+
+A client fails to connect to PostgreSQL and the server rejects the connection
+with an error similar to:
+
+```
+[PostgreSQL error] failed to retrieve PostgreSQL server_version_num:
+FATAL: pg_hba.conf rejects connection for host "172.x.x.x", user "postgres", database "iapi", SSL off
+```
+
+The key part is `SSL off`: the client connected without SSL, and no `pg_hba.conf`
+rule matches a non-SSL (`host`) connection for that client, so PostgreSQL
+rejects it.
+
+## Root Cause
+
+`pg_hba.conf` only contains `hostssl` (SSL-only) entries for the client's
+address range, or is missing a catch-all rule for the client. A client that
+does not negotiate SSL therefore has no matching rule.
+
+## Diagnosis
+
+Inspect the effective rules:
+
+```bash
+kubectl exec -n $NAMESPACE $CLUSTER_NAME-0 -c postgres -- \
+  psql -U postgres -c "SELECT type, database, user_name, address, auth_method, error FROM pg_hba_file_rules ORDER BY line_number;"
+```
+
+Confirm there is no `host` (or `hostssl` if the client does use SSL) rule that
+matches the client's address.
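+
+To see which of the sessions that do connect are using SSL, a sketch joining
+`pg_stat_activity` with `pg_stat_ssl` (both standard PostgreSQL views) can help
+identify non-SSL clients. A rejected client does not appear here, because it
+never completes authentication.
+
+```sql
+-- ssl = false marks sessions that connected without SSL
+SELECT a.pid, a.usename, a.client_addr, s.ssl
+FROM pg_stat_activity AS a
+JOIN pg_stat_ssl AS s USING (pid)
+WHERE a.client_addr IS NOT NULL
+ORDER BY s.ssl, a.client_addr;
+```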
+
+## Resolution
+
+Add a matching rule under `spec.patroni.pg_hba` in the `postgresql` custom
+resource. Prefer requiring SSL where possible:
+
+```yaml
+spec:
+  patroni:
+    pg_hba:
+      - local     all          all                     trust
+      - host      all          all        127.0.0.1/32 md5
+      - hostssl   replication  standby    all          md5
+      - hostssl   all          +zalandos  all          pam
+      - hostssl   all          all        all          md5
+      # Add this if the client cannot use SSL:
+      - host      all          all        0.0.0.0/0    md5
+```
+
+Patroni reloads the configuration without a restart. See
+[Configuring the pg_hba Client Authentication Whitelist](../how_to/configure_pg_hba_whitelist.mdx)
+for the full procedure and verification.
+
+:::warning
+Adding `host all all 0.0.0.0/0 md5` allows unencrypted password authentication
+from any address. Prefer fixing the **client** to use SSL and keeping only
+`hostssl` rules whenever possible.
+:::
diff --git a/docs/en/trouble_shooting/coredump_huge_pages.mdx b/docs/en/trouble_shooting/coredump_huge_pages.mdx
new file mode 100644
index 0000000..cdab3b8
--- /dev/null
+++ b/docs/en/trouble_shooting/coredump_huge_pages.mdx
@@ -0,0 +1,56 @@
+---
+weight: 60
+title: PostgreSQL Coredump Caused by Huge Pages
+---
+
+# PostgreSQL Coredump Caused by Huge Pages
+
+## Problem Description
+
+PostgreSQL crashes on start-up with a bus error / coredump. The bootstrap log
+ends with:
+
+```
+selecting default shared_buffers ... 400kB
+selecting default time zone ... Etc/UTC
+creating configuration files ... ok
+running bootstrap script ... Bus error (core dumped)
+```
+
+## Root Cause
+
+Huge pages are enabled on the host, but the pod has no huge-page resource
+allocated, so the container cannot use them. With the default `huge_pages = try`,
+PostgreSQL still attempts to use them; the kernel sends `SIGBUS`, causing the coredump.
+
+## Resolution
+
+Disable huge pages for the database. Set the `huge_pages` parameter to `off` in
+the `postgresql` custom resource:
+
+```yaml
+spec:
+  postgresql:
+    parameters:
+      huge_pages: "off"
+```
+
+Apply the change and let the Operator reconcile. Verify:
+
+```bash
+kubectl exec -n $NAMESPACE $CLUSTER_NAME-0 -c postgres -- \
+  psql -U postgres -tAc "SHOW huge_pages;"
+```
+
+Expected output:
+
+```
+off
+```
+
+:::note
+On current operand releases, setting the `huge_pages` parameter is sufficient —
+it is applied both at runtime and during database initialization. (Older
+guidance that mounted a `postgresql.conf.sample` ConfigMap per PostgreSQL major
+version is no longer required.)
+:::
diff --git a/docs/en/trouble_shooting/fix_streaming_replication.mdx b/docs/en/trouble_shooting/fix_streaming_replication.mdx
new file mode 100644
index 0000000..afebd57
--- /dev/null
+++ b/docs/en/trouble_shooting/fix_streaming_replication.mdx
@@ -0,0 +1,65 @@
+---
+weight: 65
+title: Repairing a Broken Streaming Replica
+---
+
+# Repairing a Broken Streaming Replica
+
+## Problem Description
+
+A standby in a Patroni-managed PostgreSQL cluster is not replicating: it shows a
+large lag, is stuck, or is otherwise out of sync with the leader. The leader has
+no active streaming standby for it.
+
+## Diagnosis
+
+### 1. Inspect the cluster topology
+
+```bash
+kubectl exec -n $NAMESPACE $CLUSTER_NAME-0 -c postgres -- patronictl list
+```
+
+A member with a large `Lag in MB`, a `Pending restart` flag, or a state other
+than `running`/`streaming` is likely the broken replica.
+
+### 2. Check replication state on the leader
+
+```sql
+-- On the leader: a healthy standby appears here in state 'streaming'
+SELECT application_name, state, sent_lsn, replay_lsn, sync_state
+FROM pg_stat_replication;
+
+-- An inactive slot / stale restart_lsn indicates a stuck standby
+SELECT slot_name, active, restart_lsn FROM pg_replication_slots;
+```
+
+If `pg_stat_replication` returns no row for the standby, it is not streaming.
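+
+For standbys that do appear, the lag can be expressed in bytes. This is a
+sketch for PostgreSQL 10 or later, run on the leader:
+
+```sql
+-- Bytes of WAL generated on the leader that the standby has not yet replayed
+SELECT application_name, state,
+       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
+FROM pg_stat_replication;
+```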
+
+## Resolution
+
+Reinitialize the broken member from the leader. This re-clones the standby's
+data directory from the current leader.
+
+```bash
+kubectl exec -n $NAMESPACE $CLUSTER_NAME-0 -c postgres -- \
+  patronictl reinit $CLUSTER_NAME $CLUSTER_NAME-1 --force
+```
+
+Replace `$CLUSTER_NAME-1` with the name of the broken member. Without `--force`,
+`patronictl` prompts for confirmation.
+
+After the reinit completes, confirm the member is healthy:
+
+```bash
+kubectl exec -n $NAMESPACE $CLUSTER_NAME-0 -c postgres -- patronictl list
+```
+
+The repaired member should show role `Replica`, state `running`/`streaming`,
+and `Lag in MB` of `0`. On the leader, `pg_stat_replication` should now list the
+member in state `streaming`.
+
+:::note
+`patronictl reinit` performs a fresh base backup of the member from the leader.
+On large databases this can take a while and consumes leader I/O; run it during
+a low-traffic window where possible.
+:::
diff --git a/docs/en/trouble_shooting/pg_wal_disk_full.mdx b/docs/en/trouble_shooting/pg_wal_disk_full.mdx
new file mode 100644
index 0000000..21c4204
--- /dev/null
+++ b/docs/en/trouble_shooting/pg_wal_disk_full.mdx
@@ -0,0 +1,63 @@
+---
+weight: 50
+title: Disk Full Due to pg_wal Accumulation
+---
+
+# Disk Full Due to pg_wal Accumulation
+
+## Problem Description
+
+The data volume fills up because Write-Ahead Log (WAL) segments under the
+`pg_wal` directory accumulate and are not recycled. These files cannot simply
+be deleted: removing WAL files by hand can corrupt the cluster.
+
+## Root Cause
+
+WAL segments are retained until they are no longer needed by every consumer
+(replicas, replication slots, archiver). The most common cause is that a standby
+cannot keep up — for example because of slow disk I/O — so replication lag grows
+and the primary must retain WAL for the lagging standby, causing `pg_wal` to
+grow without bound.
+
+## Diagnosis
+
+1. Confirm the cluster is otherwise healthy:
+
+   ```bash
+   kubectl exec -n $NAMESPACE $CLUSTER_NAME-0 -c postgres -- patronictl list
+   ```
+
+   A large, growing `Lag in MB` on a replica points to replication lag as the
+   cause.
+
+2. Check replication slots and the streaming state of each standby:
+
+   ```sql
+   SELECT slot_name, active, restart_lsn FROM pg_replication_slots;
+   SELECT * FROM pg_stat_replication;
+   ```
+
+   An inactive slot whose `restart_lsn` is far behind pins WAL on the primary.
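+
+To quantify how much WAL each slot is holding back, a sketch for PostgreSQL 10
+or later, run on the primary (reading `pg_ls_waldir()` requires a superuser or
+a member of `pg_monitor`):
+
+```sql
+-- WAL retained on behalf of each slot, largest first
+SELECT slot_name, active,
+       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
+FROM pg_replication_slots
+ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC NULLS LAST;
+
+-- Total size of the files currently in pg_wal
+SELECT pg_size_pretty(sum(size)) AS pg_wal_size FROM pg_ls_waldir();
+```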
+
+## Resolution
+
+1. **Reduce the write rate.** Lower the application's insert/update throughput
+   (for example from 10 rows/s to 5 rows/s, or pause non-essential writers) so
+   the standby can catch up and WAL can be recycled.
+
+2. **Reduce to a single node temporarily**, if acceptable to the customer, so
+   there is no lagging standby retaining WAL. Edit the `postgresql` resource:
+
+   ```bash
+   kubectl get postgresql -A
+   # set spec.numberOfInstances: 1
+   ```
+
+   After the lag clears, WAL is archived/recycled automatically and the disk
+   space is released. Scale back up once the situation is stable.
+
+:::danger
+Never delete files under `pg_wal` manually. Removing WAL that the database still
+needs will corrupt the cluster. Always resolve the underlying retention cause
+(lagging standby, stale replication slot, or stalled archiver) instead.
+:::

From 2580206f7c70ba05b63918e6320bb4f281f72cd8 Mon Sep 17 00:00:00 2001
From: Jinpei Su <jpsu@alauda.io>
Date: Thu, 18 Jun 2026 07:34:26 +0000
Subject: [PATCH 2/2] docs: address CodeRabbit review on PG KB guides

- pg_hba whitelist: warn about permissive catch-all 0.0.0.0/0 / ::0/0 rules
- pg_wal disk full: show concrete kubectl patch instead of a get + comment
- zhparser: add zhparser.extra_dicts to the options table

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/en/how_to/configure_pg_hba_whitelist.mdx | 11 +++++++++++
 docs/en/how_to/install_zhparser_extension.mdx |  1 +
 docs/en/trouble_shooting/pg_wal_disk_full.mdx | 11 +++++++----
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/docs/en/how_to/configure_pg_hba_whitelist.mdx b/docs/en/how_to/configure_pg_hba_whitelist.mdx
index 805aab3..d7fd64b 100644
--- a/docs/en/how_to/configure_pg_hba_whitelist.mdx
+++ b/docs/en/how_to/configure_pg_hba_whitelist.mdx
@@ -45,6 +45,8 @@ spec:
       - hostssl   replication  standby    all          md5
       - hostssl   all          +zalandos  all          pam
       - hostssl   all          all        all          md5
+      # The two catch-all rules below permit UNENCRYPTED password auth from any
+      # address. Include them only if clients cannot use SSL (see the warning).
       - host      all          all        0.0.0.0/0    md5
       - host      all          all        ::0/0        md5
 ```
@@ -52,6 +54,15 @@ spec:
 Apply with `kubectl apply` / `kubectl edit`. Patroni reloads the configuration
 without a database restart.
 
+:::warning
+`host all all 0.0.0.0/0 md5` (and its IPv6 form `::0/0`) allow unencrypted
+password authentication from any address, exposing credentials to network
+sniffing. Prefer the `hostssl ... md5` rules and require clients to use SSL.
+Only add the plain `host` catch-all rules when a client genuinely cannot use
+SSL — see
+[Connection fails with "SSL off"](../trouble_shooting/connection_ssl_off.mdx).
+:::
+
 ### 3. Verify
 
 ```bash
diff --git a/docs/en/how_to/install_zhparser_extension.mdx b/docs/en/how_to/install_zhparser_extension.mdx
index 756c0ae..115ec6b 100644
--- a/docs/en/how_to/install_zhparser_extension.mdx
+++ b/docs/en/how_to/install_zhparser_extension.mdx
@@ -78,6 +78,7 @@ The following options control dictionary loading and tokenization behavior
 | `zhparser.multi_duality`      | Compound loose characters into bigrams |
 | `zhparser.multi_zmain`        | Compound important single characters |
 | `zhparser.multi_zall`         | Compound all single characters |
+| `zhparser.extra_dicts`        | Comma-separated extra dictionary files (`.txt`/`.xdb`) loaded in addition to the built-in dictionary; must be set before the backend starts |
 
 ```sql
 SHOW zhparser.punctuation_ignore;
diff --git a/docs/en/trouble_shooting/pg_wal_disk_full.mdx b/docs/en/trouble_shooting/pg_wal_disk_full.mdx
index 21c4204..7bf13f7 100644
--- a/docs/en/trouble_shooting/pg_wal_disk_full.mdx
+++ b/docs/en/trouble_shooting/pg_wal_disk_full.mdx
@@ -46,15 +46,18 @@ grow without bound.
    the standby can catch up and WAL can be recycled.
 
 2. **Reduce to a single node temporarily**, if acceptable to the customer, so
-   there is no lagging standby retaining WAL. Edit the `postgresql` resource:
+   there is no lagging standby retaining WAL. Patch the `postgresql` resource to
+   one instance:
 
    ```bash
-   kubectl get postgresql -A
-   # set spec.numberOfInstances: 1
+   # Find the cluster name/namespace first if needed: kubectl get postgresql -A
+   kubectl patch postgresql -n $NAMESPACE $CLUSTER_NAME --type merge \
+     -p '{"spec":{"numberOfInstances":1}}'
    ```
 
    After the lag clears, WAL is archived/recycled automatically and the disk
-   space is released. Scale back up once the situation is stable.
+   space is released. Scale back up (restore `numberOfInstances`) once the
+   situation is stable.
 
 :::danger
 Never delete files under `pg_wal` manually. Removing WAL that the database still