Merged
56 changes: 56 additions & 0 deletions docs/configuration/pgdog.toml/control.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
---
icon: material/remote
---

# Control

Control settings configure PgDog's connection to the PgDog control plane.

!!! note "Enterprise edition"
    This feature is available in [Enterprise Edition](../../enterprise_edition/index.md) only.

```toml
[control]
endpoint = "http://localhost:8080"
token = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
metrics_interval = 1000
stats_interval = 5000
active_queries_interval = 5000
request_timeout = 1000
```

### `endpoint`

Control plane endpoint PgDog connects to.

Default: **`http://localhost:8080`** (required)

### `token`

Authentication token sent to the control plane.

Default: **none** (required)

### `metrics_interval`

How often, in milliseconds, PgDog sends metrics to the control plane.

Default: **`1_000`** (1 second)

### `stats_interval`

How often, in milliseconds, PgDog sends query statistics to the control plane.

Default: **`5_000`** (5 seconds)

### `active_queries_interval`

How often, in milliseconds, PgDog sends active query information to the control plane.

Default: **`5_000`** (5 seconds)

### `request_timeout`

HTTP request timeout, in milliseconds, for requests sent to the control plane.

Default: **`1_000`** (1 second)
@@ -1,4 +1,4 @@
title: Query insights
nav:
- index.md
- control_plane
- ...
40 changes: 0 additions & 40 deletions docs/enterprise_edition/control_plane/cli.md

This file was deleted.

77 changes: 42 additions & 35 deletions docs/enterprise_edition/control_plane/index.md
@@ -4,65 +4,72 @@ icon: material/console

# Control plane

Multi-node PgDog deployments require synchronization to perform certain tasks, like atomic configuration changes, toggling [maintenance mode](../../administration/maintenance_mode.md), [resharding](../../features/sharding/resharding/index.md), and more. To make this work, PgDog Enterprise comes with a control plane, an application deployed alongside PgDog, to provide coordination and collect and present system telemetry.
Multi-node PgDog deployments require synchronization to perform certain tasks, like atomic [configuration](../../configuration/index.md) changes, toggling [maintenance mode](../../administration/maintenance_mode.md) and [resharding](../../features/sharding/resharding/index.md).

## How it works
To make this work, PgDog Enterprise ships with a control plane: an application deployed alongside PgDog, which provides synchronization of administrative commands.

## Installation

Ready to deploy? See the [installation guide](installation.md).

The control plane and PgDog processes communicate via the network using HTTP. They exchange messages to send metrics, commands, and other metadata that allows PgDog to transmit real-time information to the control plane, and for the control plane to control the behavior of each PgDog process.
## How it works

<center>
<img src="/images/control_plane.png" width="90%" alt="Control plane">
<img src="/images/ee/metrics.png" width="100%" alt="Control plane">
</center>

The control plane and PgDog processes communicate over the network using their own protocol, with HTTP(S) as the transport.

They exchange metrics, commands, and other metadata. This lets PgDog report real-time information to the control plane, and lets the control plane control the behavior of each PgDog process.

### Configuration

In order for PgDog to connect to the control plane, it needs to be configured with its endpoint address and an authentication token, both of which are specified in [`pgdog.toml`](../../configuration/pgdog.toml/general.md):
To connect to the control plane, PgDog needs the control plane's endpoint address and an authentication token, both of which are set in [`pgdog.toml`](../../configuration/pgdog.toml/control.md):

```toml
[control]
endpoint = "https://control-plane-endpoint.cloud.pgdog.dev"
token = "cff57e5c-7c4f-4ca0-b81c-c8ed22cf873d"
```
=== "pgdog.toml"
    ```toml
    [control]
    endpoint = "https://control-plane-endpoint.cloud.pgdog.dev"
    token = "cff57e5c-7c4f-4ca0-b81c-c8ed22cf873d"
    ```
=== "Helm chart"
    ```yaml
    control:
      endpoint: https://control-plane-endpoint.cloud.pgdog.dev
      token: cff57e5c-7c4f-4ca0-b81c-c8ed22cf873d
    ```

The authentication token is generated by the control plane and identifies each PgDog deployment. PgDog nodes which are part of the same deployment should use the same token.
PgDog nodes that are part of the same deployment should use the same token. It can be any string value and serves to differentiate one PgDog deployment from another.

For example, if you're using our [Helm chart](../../installation.md#kubernetes), you can configure the endpoint and token in `values.yaml` as follows:
!!! info "Multiple PgDog deployments"
    A single control plane can manage several PgDog deployments. Running one control plane per PgDog deployment is possible, but not necessary.
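
To make the role of the token concrete, the sketch below models how a control plane could group nodes into deployments by the token they present. This is purely illustrative: `DeploymentRegistry` and its methods are hypothetical names, and the real control plane's bookkeeping is not described here.

```python
from collections import defaultdict


class DeploymentRegistry:
    """Hypothetical in-memory model: one deployment per distinct token."""

    def __init__(self) -> None:
        self._nodes_by_token: dict[str, set[str]] = defaultdict(set)

    def register(self, token: str, node_id: str) -> None:
        """Record that a node presented this token when connecting."""
        self._nodes_by_token[token].add(node_id)

    def nodes(self, token: str) -> set[str]:
        """Return a copy of the node identifiers seen for a deployment."""
        return set(self._nodes_by_token.get(token, set()))

    def deployment_count(self) -> int:
        """Number of distinct deployments (tokens) seen so far."""
        return len(self._nodes_by_token)
```

Under this model, two nodes configured with the same token land in the same deployment, while a node with a different token forms a separate one, all on a single control plane.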

```yaml
control:
  endpoint: https://control-plane-endpoint.cloud.pgdog.dev
  token: cff57e5c-7c4f-4ca0-b81c-c8ed22cf873d
```

### Connection flow

The connection to the control plane is initiated by PgDog on startup and happens in the background. Upon connecting, PgDog will send its node identifier (randomly generated, or set in the `NODE_ID` environment variable) to register with the control plane, and start uploading telemetry and poll for commands.
<center>
<img src="/images/control_plane.png" width="65%" alt="Control plane">
</center>

PgDog initiates a connection to the control plane on startup. This happens in the background and doesn't block queries.

!!! note "Error handling"
Since most PgDog functions (including sharding) are configuration-driven, the control plane connection is **not required**
for PgDog to start and serve queries.
Upon connecting, PgDog sends its node identifier (set in the `NODE_ID` environment variable, or randomly generated) to register with the control plane, then starts uploading telemetry and polling for commands.
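
The identifier rule described above can be sketched as follows. This is an illustration only, not PgDog's implementation: the function name `resolve_node_id` is hypothetical, and both the random format (a UUID4 hex string) and treating an empty `NODE_ID` as unset are assumptions.

```python
import os
import uuid
from typing import Mapping


def resolve_node_id(env: Mapping[str, str] = os.environ) -> str:
    """Prefer NODE_ID from the environment, else fall back to a random value."""
    configured = env.get("NODE_ID", "").strip()
    # Assumption: an empty or whitespace-only NODE_ID counts as "not set".
    return configured or uuid.uuid4().hex
```

Setting `NODE_ID` explicitly keeps a node's identity stable across restarts, whereas a random identifier changes every time the process starts.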

If any error is encountered while communicating with the control plane,
PgDog will continue operating normally, while attempting to reconnect periodically.
#### Error handling

Since most PgDog functions (including sharding) are configuration-driven, the control plane connection is **not required**
for PgDog to start or serve queries.

This architecture makes the communication link more resilient to unreliable network conditions.
If any error is encountered while communicating with the control plane,
PgDog will continue operating normally, while attempting to reconnect periodically.
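
The behavior described above (connect in the background, keep serving on failure, retry periodically) can be sketched roughly as follows. The function names, the fixed retry delay, and the use of a thread are assumptions made for illustration; they do not describe PgDog's actual internals.

```python
import threading
from typing import Callable


def run_control_link(
    connect: Callable[[], None],
    stop: threading.Event,
    retry_delay: float = 5.0,
) -> int:
    """Call connect() until it succeeds or stop is set.

    Returns the number of failed attempts. Connection errors are swallowed,
    so a control plane outage never propagates to the caller.
    """
    failures = 0
    while not stop.is_set():
        try:
            connect()
            return failures
        except OSError:  # includes ConnectionError and socket timeouts
            failures += 1
            stop.wait(retry_delay)  # sleep, but wake up early on shutdown
    return failures


def start_in_background(
    connect: Callable[[], None],
    stop: threading.Event,
    retry_delay: float = 5.0,
) -> threading.Thread:
    """Run the retry loop on a daemon thread so startup is not blocked."""
    thread = threading.Thread(
        target=run_control_link, args=(connect, stop, retry_delay), daemon=True
    )
    thread.start()
    return thread
```

Because the loop runs on its own thread and only catches connection errors, the main query path in this sketch is unaffected whether the control plane is reachable or not.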

### Telemetry

PgDog transmits the following information to the control plane:

| Telemetry | Description |
|-|-|
| [Metrics](../metrics.md) | The same [metrics](../../features/metrics.md) as exposed by the Prometheus endpoint (and the admin database), are transmitted at a much higher frequency, to allow for real-time monitoring. |
| [Active queries](../insights/active_queries.md) | Queries that are currently executing through each PgDog node. |
| [Query statistics](../insights/statistics.md) | Real-time statistics on each query executed through PgDog, like duration, idle-in-transaction time, and more. |
| [Errors](../insights/errors.md) | Recent errors encountered by clients, e.g. query syntax issues. |
| [Query plans](../insights/query_plans.md) | Output of `EXPLAIN` for slow and sampled queries, collected by PgDog in the background. |
| Metrics | System and utilization metrics, transmitted once per second by default. |
| Queries | Queries that are currently executing through each PgDog node. |
| Query plans | Output of `EXPLAIN` for slow and sampled queries, collected in the background. |
| Configuration | Current PgDog settings and database schema. |

#### High availability

The control plane itself is backed by a PostgreSQL database, used for storing historical metrics, query statistics, configuration, and other metadata.

This allows multiple instances of the control plane to be deployed in a high-availability setup, since all actions are synchronized by PostgreSQL transactions and locks.
153 changes: 153 additions & 0 deletions docs/enterprise_edition/control_plane/installation.md
@@ -0,0 +1,153 @@
---
icon: material/cog
---

# Installation

## Kubernetes

The PgDog control plane comes with its own [Helm chart](https://github.com/pgdogdev/helm-ee). You can install it directly from our chart repository:

```bash
helm repo add pgdogdev-ee https://helm-ee.pgdog.dev
helm install control pgdogdev-ee/pgdog-control
```

The chart has a few external requirements, [documented below](#requirements).

## Guided install

While the chart creates and manages several resources, including an `Ingress`, some of them have external dependencies which cannot be created by Helm.

If you're not sure whether your Kubernetes cluster has all the necessary dependencies, you can run this script to check:

```bash
curl -fsSL \
https://raw.githubusercontent.com/pgdogdev/helm-ee/main/install.sh | bash
```

The script requires both `awscli` and `kubectl` to be installed, and uses them to inspect your environment.

!!! note "Read-only actions"
    The guided installation script is strictly **read-only** and will never make any modifications to your environment.


## Requirements

The chart creates an `Ingress` resource for the web dashboard, so an ingress controller is required to access it. The chart supports four Ingress settings out of the box:

| Ingress | Description |
|-|-|
| [Nginx](#nginx) | Uses the `ingress-nginx` controller with `cert-manager` for TLS. The controller is widely used, although it is currently deprecated by the Kubernetes project. |
| [AWS ALB](#aws-alb) | Uses the AWS Load Balancer Controller to create an Application Load Balancer. Supports TLS termination with an ACM-managed certificate. |
| Gateway API | Uses the more modern Kubernetes [Gateway API](https://kubernetes.io/docs/concepts/services-networking/gateway/), with support for gateways like Envoy. |
| Custom | All labels and annotations are exposed to the chart caller, so you can configure your own Ingress. |

### Authentication

If the dashboard is reachable from the Internet, make sure to configure authentication to protect it against unauthorized access. The control plane supports OAuth2 with two providers: GitHub and Google.

## Ingress

Most of the settings you need to provide concern the Ingress and OAuth authentication. The [guided install](#guided-install) will configure them automatically. If you're installing manually, they are documented below:

| Setting | Description | Example |
|-|-|-|
| `ingress.mode` | Which [ingress](#ingress) to use for the web dashboard. | `gateway` |
| `ingress.host` | DNS hostname for the dashboard. Tightly coupled to the TLS certificate, if enabled. | `pgdog.acme.com` |


### Nginx

The [nginx ingress](https://github.com/kubernetes/ingress-nginx/) (deprecated, but still available) is supported out of the box, along with automatic TLS termination (using `cert-manager`).

| Setting | Description | Example |
|-|-|-|
| `ingress.nginx.clusterIssuer` | The name of the `ClusterIssuer` resource. | `letsencrypt-prod` |

##### Example

```yaml title="values.yaml"
ingress:
  mode: nginx
  host: pgdog.acme.com
  nginx:
    clusterIssuer: letsencrypt-prod
```

### AWS ALB

The [AWS ALB ingress](https://docs.aws.amazon.com/eks/latest/userguide/aws-load-balancer-controller.html) is supported out of the box and uses ACM for TLS termination at the load balancer.

| Setting | Description | Example |
|-|-|-|
| `ingress.aws.scheme` | `internet-facing` or `internal`. | `internet-facing` |
| `ingress.aws.certificateArn` | ARN of the ACM TLS certificate (validated externally, e.g., with DNS). | `arn:aws:acm:us-east-1:111111111111:certificate/abc-123` |

##### Example

```yaml title="values.yaml"
ingress:
  mode: aws
  host: control.acme.com
  aws:
    scheme: internet-facing
    certificateArn: arn:aws:acm:us-east-1:111111111111:certificate/abc-123
```
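
Before running `helm install`, it can help to sanity-check the ingress values for the Nginx and AWS ALB modes. The sketch below works on an already-parsed `values.yaml` (a plain dictionary) and is only an illustration: `ingress_problems` is a hypothetical helper, and treating every setting listed in the tables above as mandatory is an assumption, since the chart may supply defaults.

```python
# Mode-specific settings taken from the Nginx and AWS ALB tables.
MODE_SETTINGS = {
    "nginx": ("clusterIssuer",),
    "aws": ("scheme", "certificateArn"),
}
AWS_SCHEMES = {"internet-facing", "internal"}


def ingress_problems(values: dict) -> list[str]:
    """Return human-readable problems found in the `ingress` section."""
    ingress = values.get("ingress", {})
    problems = []
    if not ingress.get("host"):
        problems.append("ingress.host is not set")

    mode = ingress.get("mode")
    # Only the two modes with documented settings are checked here.
    section = ingress.get(mode, {}) if mode in MODE_SETTINGS else {}
    for key in MODE_SETTINGS.get(mode, ()):
        if not section.get(key):
            problems.append(f"ingress.{mode}.{key} is not set")

    scheme = section.get("scheme")
    if mode == "aws" and scheme and scheme not in AWS_SCHEMES:
        problems.append(f"ingress.aws.scheme has unexpected value {scheme!r}")
    return problems
```

An empty list means nothing obvious is missing; other modes, such as `gateway`, pass through with only the `ingress.host` check.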

## OAuth2

OAuth2 authentication is supported out of the box for GitHub and Google providers. Either one can be configured as follows:

=== "GitHub"
    ```yaml title="values.yaml"
    control:
      config:
        auth:
          redirect_base_url: https://control.acme.com
          github:
            client_id: Iv1.0123456789abcdef
            client_secret: shhh
            allowed_orgs:
              - acme-corp
    ```
=== "Google"
    ```yaml title="values.yaml"
    control:
      config:
        auth:
          redirect_base_url: https://control.acme.com
          google:
            client_id: 0123456789-abc.apps.googleusercontent.com
            client_secret: shhh
            allowed_domains:
              - acme.com
    ```

Alternatively, the client secret can be set as an environment variable:

| Provider | Variable |
|-|-|
| GitHub | `GITHUB_CLIENT_SECRET` |
| Google | `GOOGLE_CLIENT_SECRET` |


### Access control

`allowed_orgs` (GitHub) and `allowed_domains` (Google) restrict logins to members of those organizations or email domains. If left empty, anyone who can authenticate with the provider is allowed in.

Both accept a list, so you can allow more than one:

=== "GitHub"
    ```yaml title="values.yaml"
    github:
      allowed_orgs:
        - acme-corp
        - acme-labs
    ```
=== "Google"
    ```yaml title="values.yaml"
    google:
      allowed_domains:
        - acme.com
        - acme.io
    ```
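
The allow-list rule above (an empty list admits everyone, otherwise the user must match one entry) can be expressed as a small check. The sketch below covers the Google email-domain case only; `domain_allowed` is a hypothetical name, and the case-insensitive comparison is an assumption rather than documented behavior.

```python
def domain_allowed(email: str, allowed_domains: list[str]) -> bool:
    """Apply the allow-list rule to an authenticated user's email address."""
    if not allowed_domains:
        # Empty list: anyone who authenticated with the provider gets in.
        return True
    domain = email.rpartition("@")[2].lower()
    return domain in {d.lower() for d in allowed_domains}
```

Note the consequence of the empty-list case: leaving `allowed_domains` unset on an Internet-facing dashboard admits any Google account, so setting at least one entry is the safer choice.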