diff --git a/docs/configuration/pgdog.toml/control.md b/docs/configuration/pgdog.toml/control.md new file mode 100644 index 00000000..2429e97c --- /dev/null +++ b/docs/configuration/pgdog.toml/control.md @@ -0,0 +1,56 @@ +--- +icon: material/remote +--- + +# Control + +Control settings configure PgDog's connection to the PgDog control plane. + +!!! note "Enterprise edition" + This feature is available in [Enterprise Edition](../../enterprise_edition/index.md) only. + +```toml +[control] +endpoint = "http://localhost:8080" +token = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" +metrics_interval = 1000 +stats_interval = 5000 +active_queries_interval = 5000 +request_timeout = 1000 +``` + +### `endpoint` + +Control plane endpoint PgDog connects to. + +Default: **`http://localhost:8080`** (required) + +### `token` + +Authentication token sent to the control plane. + +Default: **none** (required) + +### `metrics_interval` + +How often, in milliseconds, PgDog sends metrics to the control plane. + +Default: **`1_000`** (1 second) + +### `stats_interval` + +How often, in milliseconds, PgDog sends query statistics to the control plane. + +Default: **`5_000`** (5 seconds) + +### `active_queries_interval` + +How often, in milliseconds, PgDog sends active query information to the control plane. + +Default: **`5_000`** (5 seconds) + +### `request_timeout` + +HTTP request timeout, in milliseconds, for requests sent to the control plane. + +Default: **`1_000`** (1 second) diff --git a/docs/enterprise_edition/insights/.pages b/docs/enterprise_edition/.pages similarity index 54% rename from docs/enterprise_edition/insights/.pages rename to docs/enterprise_edition/.pages index 52699f93..91498b87 100644 --- a/docs/enterprise_edition/insights/.pages +++ b/docs/enterprise_edition/.pages @@ -1,4 +1,4 @@ -title: Query insights nav: - index.md + - control_plane - ... 
diff --git a/docs/enterprise_edition/control_plane/cli.md b/docs/enterprise_edition/control_plane/cli.md deleted file mode 100644 index bb0346b8..00000000 --- a/docs/enterprise_edition/control_plane/cli.md +++ /dev/null @@ -1,40 +0,0 @@ ---- -icon: material/console-line ---- - -# CLI - -The control plane comes with a command-line interface (CLI). It allows you to create users, access tokens, and control other aspects of its operation. - -## Commands - -The control plane CLI supports the following commands: - -| Command | Description | -|-|-| -| [`onboard`](#onboarding) | Create a new user and PgDog deployment. This is typically the first command you need to run when installing the control plane. | -| `migrate` | Run PostgreSQL database migrations. Run this after upgrading your PgDog Enterprise version. | -| `server` | Run the control plane server. This is executed by default when the `control` executable is running. | -| `token` | Create an access token for PgDog to connect to the control plane. | - -### Onboarding - -The `onboard` command is an all-in-one command to create a PgDog authentication token and a web UI user associated with that token. It's typical to execute this command right after installing the control plane, for example: - -```bash -control onboard \ - --email demo@pgdog.dev \ - --password demopass \ - --token 644b527c-b9d6-4fb2-9861-703bad871ec0 \ - --name Demo -``` - -| Argument | Description | -|-|-| -| `email` | The email for the new user. | -| `password` | The password for the new user. | -| `token` | The authentication token, which grants PgDog access to the control plane to upload telemetry. | -| `name` | The name for the deployment. | -| `generate-token` | If `token` is not specified, this will generate a random one and print it to the terminal. | - -This command is idempotent: if the user exists already, this will update its password. If the token already exists, the user will be associated to that token. 
If all of these are already true, no changes will be made. diff --git a/docs/enterprise_edition/control_plane/index.md b/docs/enterprise_edition/control_plane/index.md index 69c91616..2ccc934d 100644 --- a/docs/enterprise_edition/control_plane/index.md +++ b/docs/enterprise_edition/control_plane/index.md @@ -4,49 +4,64 @@ icon: material/console # Control plane -Multi-node PgDog deployments require synchronization to perform certain tasks, like atomic configuration changes, toggling [maintenance mode](../../administration/maintenance_mode.md), [resharding](../../features/sharding/resharding/index.md), and more. To make this work, PgDog Enterprise comes with a control plane, an application deployed alongside PgDog, to provide coordination and collect and present system telemetry. +Multi-node PgDog deployments require synchronization to perform certain tasks, like atomic [configuration](../../configuration/index.md) changes, toggling [maintenance mode](../../administration/maintenance_mode.md) and [resharding](../../features/sharding/resharding/index.md). -## How it works +To make this work, PgDog Enterprise ships with a control plane: an application deployed alongside PgDog, which provides synchronization of administrative commands. + +## Installation + +Ready to deploy? See the [installation guide](installation.md). -The control plane and PgDog processes communicate via the network using HTTP. They exchange messages to send metrics, commands, and other metadata that allows PgDog to transmit real-time information to the control plane, and for the control plane to control the behavior of each PgDog process. +## How it works
- Control plane + Control plane
+The control plane and PgDog processes communicate over the network using their own protocol, with HTTP(S) as the transport. + +They exchange metrics, commands, and other metadata: PgDog transmits real-time information to the control plane, and the control plane controls the behavior of each PgDog process. + ### Configuration -In order for PgDog to connect to the control plane, it needs to be configured with its endpoint address and an authentication token, both of which are specified in [`pgdog.toml`](../../configuration/pgdog.toml/general.md): +To connect to the control plane, PgDog needs to be configured with the control plane's endpoint address and an authentication token, both of which are set in [`pgdog.toml`](../../configuration/pgdog.toml/control.md): -```toml -[control] -endpoint = "https://control-plane-endpoint.cloud.pgdog.dev" -token = "cff57e5c-7c4f-4ca0-b81c-c8ed22cf873d" -``` +=== "pgdog.toml" + ```toml + [control] + endpoint = "https://control-plane-endpoint.cloud.pgdog.dev" + token = "cff57e5c-7c4f-4ca0-b81c-c8ed22cf873d" + ``` +=== "Helm chart" + ```yaml + control: + endpoint: https://control-plane-endpoint.cloud.pgdog.dev + token: cff57e5c-7c4f-4ca0-b81c-c8ed22cf873d + ``` -The authentication token is generated by the control plane and identifies each PgDog deployment. PgDog nodes which are part of the same deployment should use the same token. +PgDog nodes that are part of the same deployment should use the same token. The token can be any string value and serves to differentiate one PgDog deployment from another. -For example, if you're using our [Helm chart](../../installation.md#kubernetes), you can configure the endpoint and token in `values.yaml` as follows: +!!! info "Multiple PgDog deployments" + A single control plane deployment can manage several PgDog deployments. It's not necessary (although possible) to have one control plane per PgDog deployment. 
-```yaml -control: - endpoint: https://control-plane-endpoint.cloud.pgdog.dev - token: cff57e5c-7c4f-4ca0-b81c-c8ed22cf873d -``` ### Connection flow -The connection to the control plane is initiated by PgDog on startup and happens in the background. Upon connecting, PgDog will send its node identifier (randomly generated, or set in the `NODE_ID` environment variable) to register with the control plane, and start uploading telemetry and poll for commands. +
+ Control plane +
+ +PgDog initiates a connection to the control plane on startup. This happens in the background and doesn't block queries. -The connection to the control plane is initiated by PgDog on startup and happens in the background. Upon connecting, PgDog will send its node identifier (randomly generated, or set in the `NODE_ID` environment variable) to register with the control plane, and start uploading telemetry and poll for commands. +Upon connecting, PgDog will send its node identifier (set in the `NODE_ID` environment variable, or randomly generated) to register with the control plane, and will start uploading telemetry and polling for commands. -!!! note "Error handling" - Since most PgDog functions (including sharding) are configuration-driven, the control plane connection is **not required** - for PgDog to start and serve queries. +#### Error handling +Since most PgDog functions (including sharding) are configuration-driven, the control plane connection is **not required** +for PgDog to start or serve queries. - If any error is encountered while communicating with the control plane, - PgDog will continue operating normally, while attempting to reconnect periodically. +If any error is encountered while communicating with the control plane, +PgDog will continue operating normally while periodically attempting to reconnect. -This architecture makes the communication link more resilient to unreliable network conditions. ### Telemetry @@ -54,15 +69,7 @@ PgDog transmits the following information to the control plane: | Telemetry | Description | |-|-| -| [Metrics](../metrics.md) | The same [metrics](../../features/metrics.md) as exposed by the Prometheus endpoint (and the admin database), are transmitted at a much higher frequency, to allow for real-time monitoring. | -| [Active queries](../insights/active_queries.md) | Queries that are currently executing through each PgDog node. | -| [Query statistics](../insights/statistics.md) | Real-time statistics on each query executed through PgDog, like duration, idle-in-transaction time, and more. | -| [Errors](../insights/errors.md) | Recent errors encountered by clients, e.g. query syntax issues. | -| [Query plans](../insights/query_plans.md) | Output of `EXPLAIN` for slow and sampled queries, collected by PgDog in the background.
| +| Metrics | System and utilization metrics, transmitted once per second. | +| Queries | Queries that are currently executing through each PgDog node. | +| Query plans | Output of `EXPLAIN` for slow and sampled queries, collected in the background. | | Configuration | Current PgDog settings and database schema. | - -#### High availability - -The control plane itself is backed by a PostgreSQL database, used for storing historical metrics, query statistics, configuration, and other metadata. - -This allows multiple instances of the control plane to be deployed in a high-availability setup, since all actions are synchronized by PostgreSQL transactions and locks. diff --git a/docs/enterprise_edition/control_plane/installation.md b/docs/enterprise_edition/control_plane/installation.md new file mode 100644 index 00000000..77144192 --- /dev/null +++ b/docs/enterprise_edition/control_plane/installation.md @@ -0,0 +1,153 @@ +--- +icon: material/cog +--- + +# Installation + +## Kubernetes + +The PgDog control plane comes with its own [Helm chart](https://github.com/pgdogdev/helm-ee). You can install it directly from our chart repository: + +```bash +helm repo add pgdogdev-ee https://helm-ee.pgdog.dev +helm install control pgdogdev-ee/pgdog-control +``` + +The chart has a few external requirements, [documented below](#requirements). + +## Guided install + +While the chart creates and manages several resources, including an `Ingress`, some of them have external dependencies which cannot be created by Helm. + +If you're not sure whether your Kubernetes cluster has all the necessary dependencies, we provide a quick script you can run to check: + +```bash +curl -fsSL \ + https://raw.githubusercontent.com/pgdogdev/helm-ee/main/install.sh | bash +``` + +The script requires both `awscli` and `kubectl` to be installed, which it uses to inspect your environment. + +!!!
note "Read-only actions" + The guided installation script is strictly **read-only** and will never make any modifications to your environment. + + +## Requirements + +Since the chart creates an `Ingress` resource for the web dashboard, an ingress controller is required to access it. The chart supports four Ingress settings out of the box: + +| Ingress | Description | +|-|-| +| [Nginx](#nginx) | Uses the `ingress-nginx` controller with `cert-manager` for TLS. The controller is widely used, although currently deprecated by the Kubernetes project. | +| [AWS ALB](#aws-alb) | Uses the AWS Load Balancer Controller to create a load balancer. Supports TLS termination with an ACM-managed certificate. | +| Gateway API | Uses the more modern Kubernetes [Gateway API](https://kubernetes.io/docs/concepts/services-networking/gateway/), with support for gateways like Envoy. | +| Custom | All labels and annotations are exposed to the chart caller, so you can configure your own Ingress. | + +### Authentication + +If the dashboard is reachable from the Internet, make sure to configure authentication to protect against unauthorized access. The control plane supports OAuth2 with two providers: GitHub and Google. + +## Ingress + +Most of the settings that need to be provided are around the Ingress and OAuth authentication. The [guided install](#guided-install) will configure them automatically. However, if you're installing manually, they are documented below: + +| Setting | Description | Example | +|-|-|-| +| `ingress.mode` | Which [ingress](#requirements) to use for the web dashboard. | `gateway` | +| `ingress.host` | DNS hostname for the dashboard. Tightly coupled to the TLS certificate, if enabled. | `pgdog.acme.com` | + + +### Nginx + +The [nginx ingress](https://github.com/kubernetes/ingress-nginx/) (deprecated, but still available) is supported out of the box, along with automatic TLS termination (using `cert-manager`). 
+ +| Setting | Description | Example | +|-|-|-| +| `ingress.nginx.clusterIssuer` | The name of the `ClusterIssuer` resource. | `letsencrypt-prod` | + +##### Example + +```yaml title="values.yaml" +ingress: + mode: nginx + host: pgdog.acme.com + nginx: + clusterIssuer: letsencrypt-prod +``` + +### AWS ALB + +The [AWS ALB ingress](https://docs.aws.amazon.com/eks/latest/userguide/aws-load-balancer-controller.html) is supported out of the box and uses ACM for TLS termination at the load balancer. + +| Setting | Description | Example | +|-|-|-| +| `ingress.aws.scheme` | `internet-facing` or `internal`. | `internet-facing` | +| `ingress.aws.certificateArn` | ARN of the ACM TLS certificate (validated externally, e.g., with DNS). | `arn:aws:acm:us-east-1:111111111111:certificate/abc-123` | + +##### Example + +```yaml title="values.yaml" +ingress: + mode: aws + host: control.acme.com + aws: + scheme: internet-facing + certificateArn: arn:aws:acm:us-east-1:111111111111:certificate/abc-123 +``` + +## OAuth2 + +OAuth2 authentication is supported out of the box for the GitHub and Google providers. Either one can be configured as follows: + +=== "GitHub" + ```yaml title="values.yaml" + control: + config: + auth: + redirect_base_url: https://control.acme.com + github: + client_id: Iv1.0123456789abcdef + client_secret: shhh + allowed_orgs: + - acme-corp + ``` +=== "Google" + ```yaml title="values.yaml" + control: + config: + auth: + redirect_base_url: https://control.acme.com + google: + client_id: 0123456789-abc.apps.googleusercontent.com + client_secret: shhh + allowed_domains: + - acme.com + ``` + +Alternatively, the client secret can be set as an environment variable: + +| Provider | Variable | +|-|-| +| GitHub | `GITHUB_CLIENT_SECRET` | +| Google | `GOOGLE_CLIENT_SECRET` | + + +### Access control +`allowed_orgs` (GitHub) and `allowed_domains` (Google) restrict logins to members of those organizations or email domains. 
If left empty, anyone who can authenticate with the provider is allowed in. + +Both accept a list, so you can allow more than one: + +=== "GitHub" + ```yaml title="values.yaml" + github: + allowed_orgs: + - acme-corp + - acme-labs + ``` +=== "Google" + ```yaml title="values.yaml" + google: + allowed_domains: + - acme.com + - acme.io + ``` diff --git a/docs/enterprise_edition/control_plane/self-hosting.md b/docs/enterprise_edition/control_plane/self-hosting.md deleted file mode 100644 index b004e6e1..00000000 --- a/docs/enterprise_edition/control_plane/self-hosting.md +++ /dev/null @@ -1,109 +0,0 @@ ---- -icon: material/server ---- - -# Self-hosting - -Self-hosting the [control plane](index.md) (not managed by PgDog) is supported and requires a bit of configuration and setup. - -## Getting started - -The easiest way to run the control plane is using our Docker image. It contains both the backend and the web UI and can be deployed as standalone application, either in Kubernetes or on any machine with Docker installed. - -### Dependencies - -The control plane has two dependencies: - -1. A PostgreSQL database used to store historical metrics, query statistics, users and other metadata -2. A Redis database, used for synchronization and real-time metrics - -If you're using our [Helm chart](#kubernetes), Redis is deployed automatically, while the PostgreSQL database has to be created manually. - -### Kubernetes - -If you're already running PgDog in Kubernetes using our [Helm chart](../../installation.md#kubernetes), you can deploy the control plane into the same cluster using our Enterprise Helm chart: - -``` -helm repo add pgdogdev-ee https://helm-ee.pgdog.dev -helm install pgdogdev-ee/pgdog-control -``` - -The following values should be set in `values.yaml`: - -| Value | Description | -|-|-| -| `image.tag` | The Docker tag for the control plane image. | -| `ingress.host` | The DNS host for the control plane, e.g., `pgdog.database.internal`. 
| -| `env` | A key/value mapping of [environment variables](#configuration) passed to the control plane application. | - -For example: - -```yaml -image: - tag: main-ent -ingress: - host: pgdog.database.internal -env: - DATABASE_URL: postgres://user:password@[...] -``` - -### Configuration - -!!! note "Helm chart" - If you're using the [Helm chart](#kubernetes), all variables except `DATABASE_URL` are generated from settings in `values.yaml` and don't need to be configured manually. - -The control plane is configured via environment variables. The following variables are required for it to work correctly: - -| Environment variable | Description | Example | -|-|-|-| -| `DATABASE_URL` | URL pointing to the Postgres database used for storing control plane data. | `postgres://user:password@host:5432/db` | -| `SESSION_KEY` | Secret key used to encrypt user session cookies. Can be any value, as long as it's at least 64 bytes. | `abcsf32a[...]` | -| `REDIS_URL` | URL pointing to the Redis database used for synchronization. | `redis://127.0.0.1/0` | -| `FRONTEND_URL` | The URL where the frontend application is hosted. This defaults to `ingress.host` if you're using the Helm chart. | `http://pgdog.internal` | - - - -#### Session key - -The control plane requires a 64 bytes randomly generated session key to encrypt user session cookies. If you're not using our Helm chart, you can generate one with just one line of Python: - -=== "Command" - ```bash - python3 -c "import secrets,base64; print(base64.b64encode(secrets.token_bytes(64)).decode())" - ``` -=== "Output" - ``` - 1b80a3cc1640a37b59b7dd591749ebd6532b720712e9ae2c37cb5572828ed5135332595decdadf702f919d5b58099135fbd4344979c2a0e2cf514ff3c6e640ac - ``` - -### Authentication - -The control plane web UI supports two authentication methods: - -1. Email and password -2. OAuth2 - -Password authentication works out of the box and requires no additional setup beyond creating users via the [CLI](cli.md). 
- -For OAuth2, you need to configure each provider, and depending on which provider you choose, different environment variables need to be set: - -=== "Google" - | Environment variable | Description | - |-|-| - | `GOOGLE_CLIENT_ID` | Google OAuth2 client identifier. You can obtain one by creating an OAuth2 application in the [Google Cloud Console](https://console.cloud.google.com/apis/credentials). | - | `GOOGLE_CLIENT_SECRET` | Google OAuth2 client secret. | - | `GOOGLE_REDIRECT_URL` | OAuth redirect URL. It should be set to the following: `${FRONTEND_URL}/google/oauth/callback`. | - -=== "GitHub" - | Environment variable | Description | - |-|-| - | `GITHUB_CLIENT_ID` | GitHub OAuth2 client identifier. You can obtain one by creating an OAuth application in the [Developer Settings](https://github.com/settings/developers) in your GitHub account. | - | `GITHUB_CLIENT_SECRET` | GitHub OAuth2 client secret. | - | `GITHUB_REDIRECT_URL` | OAuth redirect URL. It should be set to the following: `${FRONTEND_URL}/github/oauth/callback`. | - -!!! note "OAuth2 redirect" - The redirect URL (e.g., `GOOGLE_REDIRECT_URL`) is set automatically by the Helm chart. You only need to set it if you're self-hosting using a different orchestration mechanism. - -#### Creating users - -You can create a user using the [CLI](cli.md) [`onboard`](cli.md#onboarding) command. It works for both password-based and OAuth2 authentication mechanisms. diff --git a/docs/enterprise_edition/index.md b/docs/enterprise_edition/index.md index 47b521fb..37e4b423 100644 --- a/docs/enterprise_edition/index.md +++ b/docs/enterprise_edition/index.md @@ -4,24 +4,25 @@ icon: material/office-building # PgDog Enterprise -PgDog Enterprise is a version of PgDog that contains additional features for large-scale monitoring and deployment of sharded (and unsharded) PostgreSQL databases. +PgDog Enterprise is a version of PgDog with additional features for teams running PgDog in production. 
-Unlike PgDog itself, PgDog Enterprise is closed source and available upon the purchase of a license. It comes with a control plane which provides real-time visibility into PgDog's operations and enterprise features and dedicated support from the team that built PgDog. +It comes with a control plane, real-time visibility into PgDog's operations, and dedicated support from the team that built it. Unlike the open source edition, PgDog Enterprise is closed source and available upon the purchase of a license. ## Features +The following features are available exclusively in the Enterprise edition: + | Feature | Description | |-|-| -| [Control plane](control_plane/index.md) | Synchronize and monitor multiple PgDog processes. | -| [Schema management](schema.md) | Synchronize database schema changes between multiple PgDog nodes. | -| [Active queries](insights/active_queries.md) | Real-time view into queries running through PgDog. | -| [Query plans](insights/query_plans.md) | Root cause slow queries and execution anomalies with real-time Postgres query plans, collected in the background. | -| [Real-time metrics](metrics.md) | All PgDog metrics, delivered with second-precision through a dedicated connection. | -| [Query statistics](insights/statistics.md) | Query execution statistics, like duration, idle-in-transaction time, errors, and more. | +| [Control plane](control_plane/index.md) (beta) | Manage multiple PgDog nodes and deployments. | +| Queries | Monitor queries running through PgDog in real-time. | +| Plans | Request and track Postgres query plans for slow queries. | +| Metrics | Second-precision PgDog and resource usage metrics. | +| [Quality of Service](qos.md) (alpha) | Track and block bad queries automatically. 
| ## Demo -You can run a demo version of PgDog Enterprise locally with Docker Compose: +You can run a demo of PgDog Enterprise locally with Docker Compose: ```bash curl -sSL \ @@ -30,48 +31,50 @@ curl -sSL \ && docker-compose up ``` -The demo comes with the control plane, the web UI and PgDog configured as follows: +The demo comes with the control plane, the web dashboard and PgDog configured as follows: | Setting | Value | |-|-| -| Web UI | `http://localhost:8099` | -| Username | `demo@pgdog.dev` | -| Password | `demopass` | -| PgDog | `postgres://pgdog_control:pgdog_control@0.0.0.0:6432/pgdog_control` | +| Web dashboard | http://localhost:8099 | +| PgDog | postgres://postgres:postgres@0.0.0.0:6432/postgres | For questions about the demo, PgDog Enterprise features, or pricing, [contact us](https://calendly.com/lev-pgdog/30min). PgDog can be deployed on-prem, in your cloud account, or entirely managed by us. ## Getting PgDog Enterprise -The Enterprise edition is available from two sources: +You can obtain the Enterprise edition of PgDog as follows: -1. Our Docker repository -2. From source +1. Our [Docker repository](#docker-repository) +2. From [source](#from-source) ### Docker repository !!! note "Enterprise license" - Before deploying these images to production, make sure you purchased our Enterprise Edition license. You're welcome to use these for evaluation purposes, e.g., demo deployment or in a staging environment. + Before deploying these images to production, make sure you purchased our Enterprise Edition license. You're welcome to use these for evaluation purposes, e.g., for an internal demo or in a staging environment. 
Both PgDog and the control plane are available as Docker images: -| Application | Repository | -|-|-| -| PgDog | `ghcr.io/pgdogdev/pgdog-enterprise` | -| Control plane | `ghcr.io/pgdogdev/pgdog-enterprise/control` | +| Application | Repository | Latest tag | +|-|-|-| +| PgDog | `ghcr.io/pgdogdev/pgdog-enterprise` | `{{ enterprise_tag }}` | +| Control plane | `ghcr.io/pgdogdev/pgdog-enterprise/control` | `{{ enterprise_tag }}` | If you're using our [Helm chart](../installation.md#kubernetes), you just need to change the `image.repository` and `image.tag` variables: ```yaml image: repository: ghcr.io/pgdogdev/pgdog-enterprise - tag: a93701bd + tag: {{ enterprise_tag }} ``` -For deploying the [control plane](control_plane/index.md), you have two options: +#### Control plane + +The [control plane](control_plane/index.md) comes with its own [Helm chart](control_plane/installation.md). The chart has a few cluster dependencies, which you can check using our installation script: -1. Use our managed deployment ([contact us](https://calendly.com/lev-pgdog/30min)) -2. [Self-hosting](control_plane/self-hosting.md) +```bash +curl -fsSL \ + https://raw.githubusercontent.com/pgdogdev/helm-ee/main/install.sh | bash +``` ### From source @@ -79,10 +82,10 @@ If you want to manage all aspects of deploying PgDog Enterprise, [get in touch]( ## Roadmap -PgDog Enterprise is new and in active development. A lot of the features we want aren't built yet: +PgDog Enterprise is new and in active development. A lot of the features we want aren't fully built yet: | Feature | Description | |-|-| -| QoS | Quality of service guarantees, incl. throttling on a per-user/database/query level. | +| [Quality of Service](qos.md) | Quality of service guarantees, incl. throttling on a per-user/database/query level. | | AWS RDS integration | Deploy PgDog on top of AWS RDS, without the hassle of Kubernetes or manual configuration. 
| | Automatic resharding | Detect hot shards and re-shard data without operator intervention. | diff --git a/docs/enterprise_edition/insights/active_queries.md b/docs/enterprise_edition/insights/active_queries.md deleted file mode 100644 index 044e68af..00000000 --- a/docs/enterprise_edition/insights/active_queries.md +++ /dev/null @@ -1,59 +0,0 @@ ---- -icon: material/play-circle ---- - -# Active queries - -PgDog provides a real-time view into queries currently executing on its PostgreSQL connections. This is accessible in two places: - -1. [`SHOW ACTIVE_QUERIES`](#admin-database) admin command -2. [Activity](#web-ui) view in the dashboard - -## How it works - -When a client sends a query to PgDog, it will first attempt to acquire a connection from the connection pool. Once acquired, it will register the query with the live query view. After the query finishes running, it's removed from the view. - -Only queries that are currently executing through PgDog are visible. If your application doesn't connect to PgDog, its queries won't appear here. 
- -### Admin database - -You can see which queries are actually running on each instance by connecting to the [admin database](../../administration/index.md) and running the `SHOW ACTIVE_QUERIES` command: - -=== "Command" - ``` - SHOW ACTIVE_QUERIES; - ``` - -=== "Output" - ``` - query | protocol | database | user | running_time | plan - ---------------------------------------------------+----------+----------+-------+--------------+--------------------------------------------------------------- - SELECT * FROM users WHERE id = $1 | extended | pgdog | pgdog | 15 | Index Scan on users (cost=0.15..8.17 rows=1 width=64) - SELECT pg_sleep(50) | simple | pgdog | pgdog | 1662 | Result (cost=0.00..0.01 rows=1 width=4) - INSERT INTO users (id, email) VALUES ($1, $2) | extended | pgdog | pgdog | 1 | Insert on users (cost=0.00..0.01 rows=0 width=0) - ``` - -The following information is available in the running queries view: - -| Column | Description | -|-|-| -| `query` | The SQL statement currently executing on a PostgreSQL connection. | -| `protocol` | What version of the query protocol is used. `simple` protocol injects parameters into text, while `extended` is used by prepared statements. | -| `database` | The name of the connection pool database. | -| `user` | The name of the user executing the query. | -| `running_time` | For how long (in ms) has the query been running. | -| `plan` | The query execution plan obtained from PostgreSQL using `EXPLAIN`. | - -### Web UI - -If you're running multiple instances of PgDog, active queries from all instances are aggregated and sent to the [control plane](../control_plane/index.md). They are then made available in the Activity tab, in real-time, with query plans automatically attached for slow queries. - -
- How PgDog works -
- -### Parameters - -If your application is using prepared statements (or just placeholders in queries), the parameters for these queries are not shown and will not be sent to the control plane. - -If your application is using simple statements (parameters in query text), PgDog will normalize the queries, removing values and replacing them with parameter symbols (e.g., `$1`). This is to make sure no sensitive data leaves the database network. diff --git a/docs/enterprise_edition/insights/errors.md b/docs/enterprise_edition/insights/errors.md deleted file mode 100644 index 63dfcb69..00000000 --- a/docs/enterprise_edition/insights/errors.md +++ /dev/null @@ -1,47 +0,0 @@ ---- -icon: material/alert-circle ---- - -# Errors - -PgDog tracks query errors returned by PostgreSQL, providing a real-time view into recently encountered issues like syntax errors, missing columns, or lock timeouts. - -## Admin database - -You can see recent errors by connecting to the [admin database](../../administration/index.md) and running the `SHOW ERRORS` command: - -=== "Command" - ``` - SHOW ERRORS; - ``` - -=== "Output" - ``` - error | count | age | query - --------------------------------+-------+------+------------------------ - column "sdfsdf" does not exist | 1 | 1444 | SELECT sdfsdf; - syntax error at end of input | 3 | 500 | SELECT FROM users; - relation "foo" does not exist | 2 | 120 | SELECT * FROM foo; - ``` - -The following information is available in the errors view: - -| Column | Description | -|-|-| -| `error` | The error message returned by PostgreSQL. | -| `count` | The number of times this error has been encountered. | -| `age` | How long ago (in ms) was this error last seen. | -| `query` | The last SQL statement that caused the error. | - -## Configuration - -Errors are collected automatically if query statistics are enabled. 
The in-memory view is periodically purged of old errors, configurable in [`pgdog.toml`](../../configuration/pgdog.toml/general.md): - -```toml -[query_stats] -enabled = true -max_errors = 100 -max_error_age = 300_000 # 5 minutes -``` - -By default, PgDog will keep up to 100 distinct errors for a maximum of 5 minutes. This data is periodically sent to the [control plane](../control_plane/index.md), so the history of seen errors is available in the web UI. diff --git a/docs/enterprise_edition/insights/index.md b/docs/enterprise_edition/insights/index.md deleted file mode 100644 index 5cc39bc8..00000000 --- a/docs/enterprise_edition/insights/index.md +++ /dev/null @@ -1,20 +0,0 @@ ---- -icon: material/lightbulb-on ---- - -# Query insights - -PgDog provides visibility into all queries that it serves, which allows it to analyze and report how those queries perform, in real-time. - -## Telemetry - -PgDog collects and displays the following telemetry: - -| Telemetry | Frequency | Description | -|-|-|-| -| [Active queries](active_queries.md) | real time | Queries actively executing through the proxy. | -| [Query plans](query_plans.md) | sample / threshold | Query plans (`EXPLAIN` output) are collected for slow queries and sampled queries automatically. | -| [Query statistics](statistics.md) | real time | Query duration, number of rows returned, idle-in-transaction time, errors, and more. | -| [Errors](errors.md) | real time | View into recently encountered query errors, like syntax errors or lock timeouts. | - -This data is transmitted to the [control plane](../control_plane/index.md) in real-time, which makes it available via its web dashboard and HTTP API. 
diff --git a/docs/enterprise_edition/insights/query_plans.md b/docs/enterprise_edition/insights/query_plans.md deleted file mode 100644 index d26f979b..00000000 --- a/docs/enterprise_edition/insights/query_plans.md +++ /dev/null @@ -1,66 +0,0 @@ ---- -icon: material/chart-timeline ---- -# Query plans - -For any [running query](active_queries.md) exceeding a configurable time threshold, PgDog will ask Postgres for a query plan. The query plans are stored in their own view, accessible via two methods: - -1. [`SHOW QUERY_PLANS`](#admin-database) admin command -2. [Activity](active_queries.md#web-ui) view in the dashboard - -## How it works - -When a [running query](active_queries.md) exceeds a configurable threshold, PgDog will ask Postgres for its query plan by sending an `EXPLAIN` command via a dedicated connection. For prepared statements, PgDog automatically provides the parameters sent with the statement by the client. - -Since `EXPLAIN` itself is very quick, fetching and storing query plans is efficient and doesn't impact database performance. Nonetheless, to avoid planning queries unnecessarily, the plans are stored in an in-memory cache. Old plans are evicted automatically and recomputed. 
- -### Admin database - -The query plans are accessible by connecting to the admin database and running the `SHOW QUERY_PLANS` command: - -=== "Command" - ``` - SHOW QUERY_PLANS; - ``` -=== "Output" - ``` - query | plan | database | user | age - -------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------+----------+-------+--------- - select pg_sleep(50); | Result (cost=0.00..0.01 rows=1 width=4) | pgdog | pgdog | 6984139 - SELECT abalance FROM pgbench_accounts WHERE aid = $1; | Index Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.29..8.31 rows=1 width=4) Index Cond: (aid = 96934) | pgdog | pgdog | 7711 - (2 rows) - ``` - -The following information is available in this view: - -| Column | Description | -|-|-| -| `query` | The query for which the plan is prepared. | -| `plan` | The query plan fetched directly from PostgreSQL. | -| `database` | The name of the connection pool database. | -| `user` | The name of the user running the query. | -| `age` | How long ago the plan was fetched from Postgres (in ms). | - -### Configuration - -Which queries are planned and how frequently is configurable in [`pgdog.toml`](../../configuration/pgdog.toml/general.md): - -```toml -[query_stats] -enabled = true -query_plan_threshold = 250 # 250 ms -query_plans_cache = 100 -query_plans_sample_rate = 0.0 -query_plan_max_age = 15_000 -``` - -| Setting | Description | -|-|-| -| `query_plan_threshold` | Minimum query execution duration (in ms), as recorded by PgDog in [query statistics](statistics.md) which will trigger a plan collection. | -| `query_plans_cache` | How many plans to keep in the cache to avoid planning the same queries multiple times. | -| `query_plans_sample_rate` | Percentage of queries (0.0 - 1.0) to collect plans for irrespective of their execution duration. 
| -| `query_plan_max_age` | For how long (in ms) to keep plans in the cache before they are considered stale and require a new plan. | - -### Dashboard - -The query plans are automatically attached to running queries and sent to the Dashboard via a dedicated connection. They can be viewed in real-time in the [Activity](active_queries.md#web-ui) tab. diff --git a/docs/enterprise_edition/insights/statistics.md b/docs/enterprise_edition/insights/statistics.md deleted file mode 100644 index dc738fd0..00000000 --- a/docs/enterprise_edition/insights/statistics.md +++ /dev/null @@ -1,100 +0,0 @@ ---- -icon: material/chart-bar ---- - -# Query statistics - -PgDog collects detailed per-query statistics, similar to PostgreSQL's `pg_stat_statements`, with extra information useful for debugging application performance. These are viewable and searchable in the [control plane UI](../control_plane/index.md) and in the [admin database](#admin-database). - -## How it works - -All queries are normalized (parameters replaced with `$1`, `$2`, etc.) and grouped, so you can see aggregate performance data for each unique query pattern. Each query execution is recorded, along with the number of rows returned, the time it took to process the request, and how much of it was spent idling inside a transaction. - -This data is accessible via two mediums: - -1. [Admin database](#admin-database) -2. 
The Insights page in the web UI of the [control plane](../control_plane/index.md) - -### Admin database - -You can view query statistics by connecting to the [admin database](../../administration/index.md) and running the `SHOW QUERY_STATS` command: - -=== "Command" - ``` - SHOW QUERY_STATS; - ``` - -=== "Output" - ``` - -[ RECORD 1 ]------------+------------------------------- - query | SELECT now(); - calls | 1 - active | 0 - total_exec_time | 2.045 - min_exec_time | 2.045 - max_exec_time | 2.045 - avg_exec_time | 2.045 - total_rows | 1 - min_rows | 1 - max_rows | 1 - avg_rows | 1.000 - errors | 0 - last_exec | 2026-03-06 13:06:23.255 -08:00 - last_exec_in_transaction | 0 - idle_in_transaction_time | 0.000 - -[ RECORD 2 ]------------+------------------------------- - query | SELECT $1; - calls | 2 - active | 0 - total_exec_time | 5.718 - min_exec_time | 2.322 - max_exec_time | 3.397 - avg_exec_time | 2.859 - total_rows | 2 - min_rows | 1 - max_rows | 1 - avg_rows | 1.000 - errors | 0 - last_exec | 2026-03-06 13:06:15.990 -08:00 - last_exec_in_transaction | 0 - idle_in_transaction_time | 0.000 - ``` - -The following information is available in the query statistics view: - -| Column | Description | -|-|-| -| `query` | The normalized SQL statement. | -| `calls` | Total number of times this query has been executed. | -| `active` | Number of instances of this query currently executing. | -| `total_exec_time` | Total execution time (in ms) across all calls. | -| `min_exec_time` | Minimum execution time (in ms) of a single call. | -| `max_exec_time` | Maximum execution time (in ms) of a single call. | -| `avg_exec_time` | Average execution time (in ms) per call. | -| `total_rows` | Total number of rows returned across all calls. | -| `min_rows` | Minimum number of rows returned by a single call. | -| `max_rows` | Maximum number of rows returned by a single call. | -| `avg_rows` | Average number of rows returned per call. 
| -| `errors` | Total number of errors encountered by this query. | -| `last_exec` | Timestamp of the last time this query was executed. | -| `last_exec_in_transaction` | Number of times the last execution was inside a transaction. | -| `idle_in_transaction_time` | Total time (in ms) spent idle inside a transaction after this query completed. | - -### Configuration - -Query statistics collection can be enabled/disabled and tweaked via configuration in [`pgdog.toml`](../../configuration/pgdog.toml/general.md): - -```toml -[query_stats] -enabled = true -max_entries = 10_000 -``` - -By default, if enabled, query statistics will store 10,000 distinct query entries. When a new query exceeds this limit, PgDog will remove the least frequently seen query from the view, using a similar exponential decay algorithm used by `pg_stat_statements` in PostgreSQL. - - -### Comparison to `pg_stat_statements` - -PgDog's query statistics are an improvement on `pg_stat_statements` because they record information it doesn't, like `errors`, and idle-in-transaction timing. These are important for debugging production performance issues. - -Additionally, PgDog can have multiple instances of the proxy in front of the same database. This allows the query statistics implementation to have a lower impact on overall database performance, by taking advantage of multiple CPUs and reduced locking overhead. diff --git a/docs/enterprise_edition/metrics.md b/docs/enterprise_edition/metrics.md deleted file mode 100644 index 65f85d41..00000000 --- a/docs/enterprise_edition/metrics.md +++ /dev/null @@ -1,113 +0,0 @@ ---- -icon: material/speedometer ---- -# Real-time metrics - -PgDog Enterprise collects and transmits its own metrics to the [control plane](control_plane/index.md), at a configurable interval (1s, by default). This provides a real-time view into PgDog internals, without a delay that's typically present in other monitoring solutions. 
- -## How it works - -Real-time metrics are available in both Open Source and Enterprise versions of PgDog. The [open source metrics](../features/metrics.md) are accessible via an OpenMetrics endpoint or via the admin database. - -In PgDog Enterprise, the same metrics are collected and sent via a dedicated connection to the control plane. Since metrics are just numbers, they can be serialized and sent quickly. To deliver second-precision metrics, PgDog requires less than 1KB/second of bandwidth and little to no additional CPU or memory. - -### Configuration - -The intervals at which metrics are uploaded to the control plane are configurable in [`pgdog.toml`](../configuration/pgdog.toml/general.md): - -```toml -[control] -metrics_interval = 1_000 # 1s -endpoint = "https://control-plane-endpoint.cloud.pgdog.dev" -token = "cff57e5c-7c4f-4ca0-b81c-c8ed22cf873d" -``` - -The default value is **1 second**, which should be sufficient to debug most production issues. - -### Web UI - -Once the metrics reach the control plane, they are pushed down to the web dashboard via a real-time connection. Per-minute aggregates are computed in the background and stored in a separate PostgreSQL database, which provides a historical view into overall database performance. - -
- PgDog Real-time Metrics -
- -## Available dashboard metrics - -Dashboard metrics are distinct from the [OpenMetrics endpoint](../features/metrics.md). They use millisecond units throughout and are collected at specified intervals. - -### Connection pool - -| Metric | Description | -|--------|-------------| -| Clients | Total number of connected clients. | -| Server Connections | Total server connections open across all pools. | -| Connection Rate (cps) | Average number of connections established to servers per second. | -| Waiting | Clients waiting for a connection from a pool. | -| Max Wait (ms) | How long the first (oldest) client in the queue has waited, in milliseconds. | -| Idle Connections | Servers available for clients to use. | -| Idle in Transaction Connections | Servers currently idle in transaction. | -| Checked Out | Servers currently serving client requests. | -| Instances | Number of PgDog instances currently connected to the control plane. | - -### Errors - -| Metric | Description | -|--------|-------------| -| Errors | Errors that connections in the pool have experienced. | -| Server Errors | Errors returned by server connections. | - -### Query throughput - -| Metric | Description | -|--------|-------------| -| Queries | Total number of executed queries. | -| Transactions | Total number of executed transactions. | -| Transaction Rate (tps) | Average number of executed transactions per statistics period. | -| Query Rate (qps) | Average number of executed queries per statistics period. | -| Blocked Queries | Queries blocked by lock contention. | - -### Timing and latency - -| Metric | Description | -|--------|-------------| -| Query Time (ms) | Total time spent executing queries. | -| Transaction Time (ms) | Total time spent executing transactions. | -| Idle in Transaction Time (ms) | Total time spent idling inside transactions. | -| Wait Time (ms) | Total time clients spent waiting for a server connection. 
| -| Query Response Time (ms) | Total client-observed query latency, including connection wait time. | -| Transaction Response Time (ms) | Total client-observed transaction latency, including connection wait time. | - -!!! note "Max Wait vs Wait Time" - **Max Wait** captures the worst single waiter at one instant. It drops to zero the moment that client is served. - - **Wait Time** measures total queuing burden across all clients. It stays elevated when many clients are waiting briefly. - Use both together: high Max Wait with low Wait Time points to a single slow client; high Wait Time with low Max Wait indicates widespread shallow queuing. - -### Network throughput - -| Metric | Description | -|--------|-------------| -| Bytes Received (MB) | Total number of bytes received. | -| Bytes Sent (MB) | Total number of bytes sent. | - -### Memory and caching - -| Metric | Description | -|--------|-------------| -| Prepared Statements | Number of prepared statements in the cache. | -| Prepared Statements Memory (MB) | Number of bytes used for the prepared statements cache. | -| Query Cache Size | Number of queries in the cache. | -| Query Cache Hits | Queries already present in the query cache. | -| Query Cache Misses | New queries added to the query cache. | -| Query Cache Hit Rate (%) | Percentage of queries served from the query cache. | -| Direct Shard Queries | Queries sent directly to a single shard. | -| Cross-Shard Queries | Queries sent to multiple or all shards. | -| Direct Shard Hit Rate (%) | Percentage of queries that avoided a cross-shard fanout. | - -### Query stats - -| Metric | Description | -|--------|-------------| -| Query Stats Tracked Queries | Number of unique query fingerprints currently tracked. | -| Query Stats Memory (MB) | Memory consumed by the query stats store. 
| diff --git a/docs/images/control_plane.png b/docs/images/control_plane.png index cbe919a7..aa2e8d41 100644 Binary files a/docs/images/control_plane.png and b/docs/images/control_plane.png differ diff --git a/docs/images/ee/metrics.png b/docs/images/ee/metrics.png index bd0a161f..ae4c9dfd 100644 Binary files a/docs/images/ee/metrics.png and b/docs/images/ee/metrics.png differ diff --git a/main.py b/main.py index cfcf75f8..1effe41e 100644 --- a/main.py +++ b/main.py @@ -7,9 +7,16 @@ log = logging.getLogger("mkdocs.plugins.macros") +# Latest released tag for the Enterprise Docker images. Update this in one +# place; reference it in docs with {{ enterprise_tag }}. Can be overridden at +# build time with the ENTERPRISE_TAG environment variable. +ENTERPRISE_TAG = os.environ.get("ENTERPRISE_TAG", "v2026-06-04") + def define_env(env): + env.variables["enterprise_tag"] = ENTERPRISE_TAG + def _validate_link(href, page): """Check that a .md link target exists on disk.""" parsed = urlparse(href)