Commit 14aa72e

docs: update audit logging to be more related to bytebase (#1065)
* update audit logging * update * update * update * update
1 parent d7322ab commit 14aa72e

2 files changed

Lines changed: 112 additions & 46 deletions

File tree

content/blog/database-audit-logging.md

Lines changed: 39 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -1,7 +1,7 @@
11
---
22
title: 'Database Audit Logging Best Practices for Compliance'
33
author: Adela
4-
updated_at: 2026/03/19 09:00:00
4+
updated_at: 2026/04/02 09:00:00
55
feature_image: /content/blog/database-audit-logging/banner.webp
66
tags: Explanation
77
description: 'How to set up database audit logging for SOC 2, HIPAA, and ISO 27001 compliance across PostgreSQL, MySQL, SQL Server, and Oracle.'
@@ -150,6 +150,35 @@ SQL is routed through a centralized gateway or workflow before executing.
150150
*For example:*
151151
A workflow platform like **Bytebase** produces complete, contextual audit logs because all SQL flows through a single, identity-aware pipeline.
152152

153+
## How Bytebase Handles Audit Logging
154+
155+
[Bytebase](https://docs.bytebase.com/security/audit-log/) takes the proxy/workflow approach: SQL executed through Bytebase's SQL Editor or change workflows — DDL, DML, and SELECT — is logged before reaching the database. Because Bytebase manages user identity, every audit record is tied to a real person, not a shared `admin` account. Direct database connections that bypass Bytebase are not captured in these logs.
156+
157+
### What gets logged
158+
159+
Bytebase records:
160+
161+
- **SQL execution** — every query that flows through the system, including the full SQL text, target database, and execution result
162+
- **Schema changes** — issue creation, approval decisions, rollout status
163+
- **Data access** — data queries and exports, with the requesting user's identity
164+
- **Authentication** — login, logout, SSO token exchange
165+
- **Permission changes** — role grants, project membership updates, policy modifications
166+
- **System configuration** — instance connection changes, environment settings, workspace policies
167+
168+
Each entry includes the user's email, IP address, timestamp, operation duration, affected resource, and request/response payloads. Sensitive fields (passwords, certificates, SSH keys) are automatically redacted.
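The redaction behavior described above can be pictured with a small sketch. This is not Bytebase's implementation: the key names in `SENSITIVE_KEYS`, the `[REDACTED]` placeholder, and the `redact` helper are all hypothetical, chosen only to show how sensitive fields nested inside a request payload might be scrubbed before an audit entry is stored.

```python
from typing import Any

# Hypothetical key names. The source only states that passwords,
# certificates, and SSH keys are redacted, not how they are named.
SENSITIVE_KEYS = {"password", "ssl_cert", "ssl_key", "ssh_private_key"}


def redact(payload: Any, placeholder: str = "[REDACTED]") -> Any:
    """Return a copy of payload with values under sensitive keys replaced."""
    if isinstance(payload, dict):
        return {
            key: placeholder if key.lower() in SENSITIVE_KEYS
            else redact(value, placeholder)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        # Walk lists so nested objects are scrubbed as well.
        return [redact(item, placeholder) for item in payload]
    return payload


if __name__ == "__main__":
    entry = {
        "user": "dev@example.com",
        "request": {"host": "db.internal", "password": "s3cret"},
    }
    print(redact(entry))
```

The function builds a new structure rather than mutating its input, so the original request object stays usable by the code that handles it.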
169+
170+
### Export and integration
171+
172+
Three ways to get audit data out:
173+
174+
1. **GUI** — filter by user, action type, resource, and date range in Settings → Audit Log
175+
2. **API** — query `/v1/auditLogs:search` (workspace-level) or `/v1/projects/{project}/auditLogs:search` (project-level). Returns structured JSON ready for any SIEM. See the [API audit log tutorial](https://docs.bytebase.com/tutorials/api-audit-log) for examples.
176+
3. **Log streaming** — enable audit log export to stdout in Settings → General → Audit Log Export. Add the `--enable-json-logging` flag to output structured JSON, which a Datadog/Splunk/Grafana agent can ingest directly
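
As a rough illustration of the API option, the sketch below only builds the HTTP request for the two search endpoints named above and does not send it. The endpoint paths come from the text. The HTTP method (`POST`), the bearer-token header, and the `pageSize` / `filter` body fields are assumptions that should be checked against the linked API tutorial. `build_audit_search_request` is a hypothetical helper name.

```python
import json
import urllib.request
from typing import Optional


def build_audit_search_request(
    base_url: str,
    token: str,
    project: Optional[str] = None,
    filter_expr: str = "",
    page_size: int = 100,
) -> urllib.request.Request:
    """Build (but do not send) a request to an audit log search endpoint."""
    # Workspace-level path by default, project-level when a project is given.
    scope = f"projects/{project}/" if project else ""
    url = f"{base_url.rstrip('/')}/v1/{scope}auditLogs:search"

    # Assumed request body fields; verify against the API documentation.
    body = {"pageSize": page_size}
    if filter_expr:
        body["filter"] = filter_expr

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_audit_search_request("https://bytebase.example.com", "TOKEN")
    print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would perform the call. Keeping construction separate makes the URL and body easy to inspect before anything is sent.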
177+
178+
### Availability
179+
180+
Audit logging is available on [Pro and Enterprise plans](https://www.bytebase.com/pricing/). The Pro plan covers most audit needs; Enterprise adds custom approval workflows and advanced access control that generate additional audit events.
181+
153182
## Recommended Best Practices
154183

155184
Regardless of database engine or auditing method, strong audit practices share the same foundations:
@@ -185,4 +214,12 @@ SOC 2, ISO 27001, HIPAA, PCI DSS, and GDPR all require some form of database aud
185214

186215
**How do I export database audit logs to Datadog or Splunk?**
187216

188-
Most engines write audit logs to files or system tables. For PostgreSQL, configure `pgaudit` to write to `csvlog` and use a Datadog or Splunk agent to ingest the files. For MySQL, enable the audit plugin and point the log file at your SIEM collector. For SQL Server, parse the `.sqlaudit` files with `fn_get_audit_file()` and forward via a log shipper. Bytebase provides a built-in [audit log API](/docs/security/audit-log/) that exports structured JSON, ready for any SIEM.
217+
Most engines write audit logs to files or system tables. For PostgreSQL, configure `pgaudit` to write to `csvlog` and use a Datadog or Splunk agent to ingest the files. For MySQL, enable the audit plugin and point the log file at your SIEM collector. For SQL Server, parse the `.sqlaudit` files with `fn_get_audit_file()` and forward via a log shipper. Bytebase provides a built-in [audit log API](https://docs.bytebase.com/security/audit-log/) that exports structured JSON, ready for any SIEM.
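
For the PostgreSQL path, a log shipper usually has to split the `pgaudit` message into fields before forwarding it. The sketch below assumes the message column has already been extracted from the `csvlog` line and follows the comma-separated layout documented in the pgaudit README (audit type, statement ID, substatement ID, class, command, object type, object name, statement, parameter). `parse_pgaudit_message` is a hypothetical helper name, and the sample line is illustrative.

```python
import csv
from typing import Dict, Optional

# Field order as documented for pgaudit log entries.
PGAUDIT_FIELDS = [
    "audit_type", "statement_id", "substatement_id", "class", "command",
    "object_type", "object_name", "statement", "parameter",
]


def parse_pgaudit_message(message: str) -> Optional[Dict[str, str]]:
    """Parse one pgaudit message into a dict, or return None if it is not one."""
    prefix = "AUDIT: "
    if not message.startswith(prefix):
        return None
    # The payload is CSV, and the statement may be quoted and contain commas.
    row = next(csv.reader([message[len(prefix):]]))
    return dict(zip(PGAUDIT_FIELDS, row))


if __name__ == "__main__":
    line = ('AUDIT: SESSION,1,1,DDL,CREATE TABLE,TABLE,public.account,'
            '"create table account (id int, name text);",<not logged>')
    print(parse_pgaudit_message(line))
```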
218+
219+
**How does Bytebase handle database audit logging?**
220+
221+
All SQL executed through Bytebase — via the SQL Editor or change workflows — is automatically logged with the real user's identity, full SQL text, target database, timestamp, and execution result. Direct database connections that bypass Bytebase are not captured. Logs can be queried via the GUI, exported via API (`/v1/auditLogs:search`), or streamed as JSON to any SIEM. Available on Pro and Enterprise plans.
222+
223+
**Do I still need engine-native auditing if I use Bytebase?**
224+
225+
It depends on your compliance scope. Bytebase captures all SQL that flows through its gateway — schema changes, data queries, exports, and admin actions. If you also have direct database connections that bypass Bytebase (e.g., emergency SSH access or application service accounts), you should keep engine-native auditing enabled for those paths. Many teams use Bytebase as the primary audit trail and engine-native logs as a secondary safety net.
Lines changed: 73 additions & 44 deletions
Original file line number | Diff line number | Diff line change
@@ -1,7 +1,7 @@
11
---
22
title: 'What is Dynamic Data Masking (DDM)'
33
author: Tianzhou
4-
updated_at: 2024/09/02 09:00
4+
updated_at: 2026/04/02 09:00
55
feature_image: /content/blog/what-is-dynamic-data-masking/cover.webp
66
tags: Explanation
77
featured: true
@@ -14,50 +14,28 @@ Dynamic Data Masking (DDM) protects sensitive data in real-time by dynamically a
1414

1515
DDM contrasts with Static Data Masking (SDM). While SDM involves creating a permanently altered, non-reversible copy of the original data, DDM modifies the data on-the-fly as it is accessed in real-time. This dynamic approach ensures that sensitive data remains protected during query execution without changing the underlying data at rest.
1616

17-
## Use Case
17+
## When to Use Dynamic Data Masking vs Static Data Masking
1818

19-
The primary use case for SDM is to create safe, sanitized versions of production data for use in non-production environments.
20-
On the other hand, DDM is primarily used in production environments to control and limit access to sensitive data dynamically, based on user roles, permissions, or other contextual factors. This allows organizations to protect sensitive information without needing to alter the underlying data, making it a powerful tool for maintaining security and compliance in real-time data access scenarios.
19+
Static Data Masking (SDM) creates sanitized copies of production data for dev/test environments. DDM is different — it masks data in real-time in production, controlling what each user sees based on their role and permissions. The underlying data stays untouched.
2120

22-
## DDM Complexity
21+
| | Static Data Masking | Dynamic Data Masking |
22+
|---|---|---|
23+
| **Environment** | Non-production (dev, test, staging) | Production |
24+
| **Data altered?** | Yes — permanent copy | No — masked on-the-fly |
25+
| **Use case** | Safe test data | Role-based access control |
2326

24-
The complexity of DDM arises primarily from its dynamic nature, where the system must make real-time decisions about how and when to mask data based on various runtime contexts. These contexts include:
27+
## What Makes Dynamic Data Masking Hard
2528

26-
### User Context
29+
DDM has to make real-time decisions about what each user sees. The complexity comes from the number of variables involved:
2730

28-
- **Role-Based Access**: Different users or roles may have varying levels of access to data. DDM must dynamically adjust the visibility of data based on the user’s identity, ensuring that only authorized users can see sensitive information in its unmasked form.
31+
- **User role and identity** — a DBA sees unmasked data, an analyst sees partial masks, a contractor sees full masks. The same query returns different results depending on who runs it.
32+
- **Temporary access** — an on-call engineer needs unmasked access to debug a production incident, then the access should expire.
33+
- **Column-level granularity** — an `email` column might need partial masking while a `phone` column needs full masking, even in the same table.
34+
- **Multiple databases and environments** — masking rules in production differ from staging. If you run MySQL, PostgreSQL, and Oracle, each has different (or no) native DDM support.
35+
- **Masking algorithm choice** — partial masking keeps data useful for debugging (`john@****`), but full masking or hashing is needed for compliance. Picking the wrong algorithm makes the data either too exposed or too useless.
36+
- **Performance** — masking happens on every query at runtime. A poorly implemented DDM layer adds latency to every SELECT.
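
The first two variables in the list, role and temporary access, can be sketched as follows. The role names come from the examples above. The role-to-level mapping, the `TemporaryGrant` class, and both function names are hypothetical and only illustrate how the same value can come back differently depending on who asks and when.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical mapping based on the examples in the list above.
ROLE_LEVEL = {"dba": "none", "analyst": "partial", "contractor": "full"}


@dataclass
class TemporaryGrant:
    user: str
    expires_at: datetime


def mask_level(user: str, role: str, now: datetime,
               grant: Optional[TemporaryGrant] = None) -> str:
    """Pick a masking level; an unexpired grant for this user lifts masking."""
    if grant is not None and grant.user == user and now < grant.expires_at:
        return "none"
    # Unknown roles fall back to full masking.
    return ROLE_LEVEL.get(role, "full")


def mask_email(value: str, level: str) -> str:
    """Apply the chosen level to an email address."""
    if level == "none":
        return value
    if level == "partial":
        local, _, _ = value.partition("@")
        return f"{local}@****"
    return "******"


if __name__ == "__main__":
    now = datetime(2026, 4, 2, 9, 0, tzinfo=timezone.utc)
    grant = TemporaryGrant("oncall@example.com", now + timedelta(hours=1))
    for when in (now, now + timedelta(hours=2)):
        level = mask_level("oncall@example.com", "analyst", when, grant)
        print(when.isoformat(), mask_email("john@example.com", level))
```

Evaluating the grant at query time, rather than flipping a flag when it is issued, is what lets temporary access expire without a cleanup job.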
2937

30-
- **User Location and Device**: In some scenarios, data access might be influenced by the user's location (e.g., within or outside a corporate network) or the device being used. DDM must be capable of factoring in these variables dynamically.
31-
32-
### Temporal Context
33-
34-
- **Temporary Access**: User may require temporary access to solve emergencies.
35-
36-
- **Date and Time Sensitivity**: Certain data might only be considered sensitive during specific time periods, requiring DDM to adapt its behavior accordingly.
37-
38-
### Target Database Column
39-
40-
- **Column-Specific Masking**: Different columns in a database might require different masking techniques or rules. DDM must dynamically apply the appropriate masking algorithm based on the specific column being accessed.
41-
42-
- **Complex Data Types**: Handling complex data types, such as JSON or XML within columns, adds additional layers of complexity as DDM must parse and selectively mask content within these structures.
43-
44-
### Application Context
45-
46-
- **Environment-Specific Masking**: The masking rules may need to vary depending on the environment in which the application is running (e.g., dev, test, UAT, prod). DDM must recognize the environment and apply the appropriate level of masking.
47-
48-
- **Business Project or Use Case**: Different business projects or use cases might have unique data access requirements.
49-
50-
### Masking Algorithm
51-
52-
- **Algorithm Selection**: DDM must dynamically choose the most suitable masking algorithm based on the context, ensuring that the data remains useful while still protecting sensitive information. Algorithms might include techniques like partial masking, randomization, or tokenization.
53-
54-
- **Algorithm Complexity and Performance**: The choice of masking algorithm has a direct impact on performance. DDM needs to balance the security provided by the algorithm with the need to minimize performance overhead, ensuring that query execution times remain acceptable.
55-
56-
### Performance
57-
58-
Given the dynamic nature of DDM, one of the critical challenges is minimizing the performance overhead associated with real-time masking. This involves optimizing the masking logic to ensure that it is both efficient and scalable, particularly in high-traffic environments.
59-
60-
## Database Support
38+
## Which Databases Support Dynamic Data Masking
6139

6240
| Databases | Supported |
6341
| ---------- | --------------------------------------------------------------------------------------------------- |
@@ -82,11 +60,62 @@ CREATE OR REPLACE MASKING POLICY email_mask AS (val string) RETURNS string ->
8260
ALTER TABLE IF EXISTS user_info MODIFY COLUMN email SET MASKING POLICY email_mask;
8361
```
8462

85-
Database engine only provides the data masking primitives. Holistically configuring the masking policy for
86-
an entire organization is still a big challenge.
63+
Database engines only provide masking primitives. Holistically configuring masking policies for an entire organization — across multiple databases, environments, and user roles — is still a big challenge. For database-specific guides, see [Data Masking for MySQL](/blog/mysql-data-masking/) and [Data Masking for PostgreSQL](/blog/postgres-data-masking/). For Snowflake specifically, see [Snowflake Dynamic Data Masking and Alternatives](/blog/snowflake-dynamic-data-masking-and-alternatives/).
64+
65+
## How Bytebase Handles Dynamic Data Masking
66+
67+
[Bytebase](https://docs.bytebase.com/security/data-masking/overview/) implements DDM at the application layer rather than relying on database-native features. All queries through Bytebase's SQL Editor are masked in real-time based on policies you define. This is particularly valuable for MySQL and PostgreSQL, which have no native DDM support.
68+
69+
### Supported databases
70+
71+
Bytebase DDM works with MySQL, PostgreSQL, Oracle, TiDB, and others — the same masking policies apply across all of them, regardless of whether the engine has native DDM.
72+
73+
### How masking is configured
74+
75+
Bytebase uses a three-level policy system:
76+
77+
1. **Global masking rules** — workspace admins apply batch masking to columns matching a name pattern (e.g., all columns named `ssn` or `email` across every database)
78+
2. **Column-level masking** — project owners set masking on specific table columns
79+
3. **Masking exemptions** — grant specific users access to unmasked data when needed
80+
81+
Precedence: exemptions > global rules > column masking.
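
The stated precedence can be read as a three-step lookup. The sketch below is an illustration under assumed data shapes, not Bytebase's internal model: `exemptions` is a set of `(user, column)` pairs, `global_rules` is an ordered list of `(name pattern, algorithm)` pairs, and `column_masks` maps a column to an algorithm. All names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional, Set, Tuple

Column = Tuple[str, str, str]  # (database, table, column)


def resolve_mask(
    user: str,
    target: Column,
    exemptions: Set[Tuple[str, Column]],
    global_rules: List[Tuple[str, str]],
    column_masks: Dict[Column, str],
) -> Optional[str]:
    """Return the masking algorithm for target, or None for unmasked access."""
    # 1. Exemptions win over everything else.
    if (user, target) in exemptions:
        return None
    # 2. Global rules match on the column name pattern.
    column_name = target[2]
    for pattern, algorithm in global_rules:
        if fnmatchcase(column_name, pattern):
            return algorithm
    # 3. Column-level masking applies only if no global rule matched.
    return column_masks.get(target)


if __name__ == "__main__":
    email: Column = ("shop", "users", "email")
    rules = [("ssn", "full"), ("email", "range")]
    masks = {email: "md5", ("shop", "users", "phone"): "inner"}
    exempt = {("dba@example.com", email)}
    print(resolve_mask("dba@example.com", email, exempt, rules, masks))
    print(resolve_mask("dev@example.com", email, exempt, rules, masks))
```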
82+
83+
Policies are organized around **semantic types** — you classify columns (e.g., "PII-email", "PII-phone") and attach a masking algorithm to the type. Changing one semantic type updates masking for all columns tagged with it.
84+
85+
### Masking algorithms
86+
87+
Five built-in algorithms:
88+
89+
| Algorithm | Example | Use case |
90+
|-----------|---------|----------|
91+
| Full mask | `123456789` → `*` | Completely hide the value |
92+
| Range mask | `john@example.com` → `john@****` | Preserve prefix for usability |
93+
| Inner mask | `123456` → `12**56` | Show edges, hide middle |
94+
| Outer mask | `123456` → `**34**` | Show middle, hide edges |
95+
| MD5 mask | `value` → `2063c1608d6e0baf80249c42e2be5804` | Irreversible hash for analytics |
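
To make the five algorithms concrete, here is a minimal Python sketch that reproduces the input/output pairs in the table. It is an approximation written for illustration, not Bytebase's code. The function names, parameters, and the choice to mask the whole value when the kept or masked edges cover the entire string are assumptions.

```python
import hashlib


def full_mask(value: str, substitution: str = "*") -> str:
    """Replace the whole value with a fixed substitution string."""
    return substitution


def range_mask(value: str, start: int, end: int, substitution: str = "****") -> str:
    """Replace the slice [start, end) with a substitution string."""
    return value[:start] + substitution + value[end:]


def inner_mask(value: str, prefix_len: int, suffix_len: int, mask_char: str = "*") -> str:
    """Keep the edges visible and mask the middle, one mask_char per character."""
    if prefix_len + suffix_len >= len(value):
        # Assumption: mask everything rather than reveal a short value.
        return mask_char * len(value)
    end = len(value) - suffix_len
    return value[:prefix_len] + mask_char * (end - prefix_len) + value[end:]


def outer_mask(value: str, prefix_len: int, suffix_len: int, mask_char: str = "*") -> str:
    """Mask the edges and keep the middle visible."""
    if prefix_len + suffix_len >= len(value):
        return mask_char * len(value)
    end = len(value) - suffix_len
    return mask_char * prefix_len + value[prefix_len:end] + mask_char * suffix_len


def md5_mask(value: str) -> str:
    """Return the hex MD5 digest of the UTF-8 encoded value."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(full_mask("123456789"))
    print(range_mask("john@example.com", 5, 16))
    print(inner_mask("123456", 2, 2))
    print(outer_mask("123456", 2, 2))
    print(md5_mask("value"))
```

MD5 is shown here only because the table lists it. Hashing low-entropy values such as phone numbers can be reversed by brute force, so treat it as masking for analytics rather than as strong protection.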
96+
97+
### Infrastructure as code
98+
99+
Masking policies can be managed via [Bytebase's Terraform provider](https://docs.bytebase.com/tutorials/manage-data-masking-with-terraform/) — define semantic types, global rules, and column masking in HCL and apply across environments.
100+
101+
### Availability
102+
103+
Dynamic Data Masking is available on the [Enterprise plan](https://www.bytebase.com/pricing/). DDM is one part of Bytebase's broader [database access control](/blog/database-access-control-best-practices/) capabilities, which also include role-based access, [just-in-time access](/blog/just-in-time-database-access/), and [audit logging](/blog/database-audit-logging/).
104+
105+
## FAQ
106+
107+
**What is Dynamic Data Masking?**
108+
109+
Dynamic Data Masking (DDM) protects sensitive data by altering query results in real-time based on user roles and policies, without changing the data at rest. Unlike static data masking, which creates a permanent sanitized copy, DDM applies masking on-the-fly during query execution.
110+
111+
**Which databases support Dynamic Data Masking natively?**
112+
113+
Oracle, SQL Server, BigQuery, and Snowflake have built-in DDM features. MySQL and PostgreSQL do not support DDM natively. Bytebase provides application-layer DDM for MySQL, PostgreSQL, Oracle, TiDB, and others, using the same policies across all engines.
114+
115+
**How does Bytebase implement DDM for MySQL and PostgreSQL?**
87116

88-
<HintBlock type="info">
117+
Bytebase applies masking at the application layer when queries run through its SQL Editor. No database extensions, views, or plugins are required. You define masking policies centrally in Bytebase, and they apply consistently across all connected databases.
89118

90-
Bytebase provides an UI interface as well as API to [configure Dynamic Data Masking](https://docs.bytebase.com/security/data-masking/overview/). In particular, Bytebase supports MySQL and PostgreSQL.
119+
**What is the difference between Dynamic Data Masking and Static Data Masking?**
91120

92-
</HintBlock>
121+
Static Data Masking (SDM) creates a permanent, altered copy of production data for use in non-production environments. Dynamic Data Masking (DDM) modifies data on-the-fly as it is queried, without changing the underlying data. SDM is for dev/test environments; DDM is for production access control.
