| title | Manage Data Masking with Terraform |
|---|---|
| author | Adela |
| updated_at | 2025/07/16 21:15 |
| tags | Tutorial |
| integrations | Terraform |
| category | Integration |
| featured | true |
| level | Intermediate |
| estimated_time | 30 mins |
This tutorial is part of the Bytebase Terraform Provider series:
- Part 1: Manage Environments with Terraform - Set up environments with policies
- Part 2: Manage Databases with Terraform - Register database instances
- Part 3: Manage Projects with Terraform - Organize databases into projects
- Part 4: Manage Bytebase Settings with Terraform - Configure workspace profile and approval policies
- Part 5: Manage SQL Review Rules with Terraform - Define SQL review policies
- Part 6: Manage Users and Groups with Terraform - Configure users and groups
- Part 7: Manage Database Access Control with Terraform - Grant database permissions
- Part 8: Manage Data Masking with Terraform 👈
In this tutorial, you'll learn how to:
- Define semantic types with various masking algorithms
- Configure data classification levels and categories
- Create global masking policies that apply workspace-wide
- Set up database-specific column masking
- Grant masking exceptions for specific users
Before starting this tutorial, ensure you have:
- Completed Part 7: Manage Database Access Control with Terraform
- Bytebase running with a service account configured
- Your Terraform files from the previous tutorials
From the previous tutorials, you should have:
- Database instances and projects configured
- Users and access controls set up
- Production database `hr_prod` with employee data
Bytebase employs two concepts for data masking:
- Semantic Types: Define masking algorithms (e.g., full mask, partial mask)
- Classifications: Group data by sensitivity levels (e.g., Level 1, Level 2)
Classifications must be mapped to semantic types for masking to work:
- Classifications define sensitivity levels (Level 1, Level 2, etc.) but cannot mask data by themselves
- Semantic types define the actual masking algorithms (full-mask, range-mask, etc.)
- A global masking rule connects the two (e.g., Level 2 → full-mask)
You can apply masking in two ways:
- Global masking rules: Define workspace-wide rules that map to semantic types
  - Match by column names or patterns → semantic type
  - Match by classification levels → semantic type
- Column-level masking: Apply directly to specific columns
  - Assign semantic types directly to columns
  - Assign classifications to columns (which then use semantic types via global rules)
| Terraform resource | bytebase_setting |
| Sample file | 8-1-semantic-types.tf |
Create 8-1-semantic-types.tf to define masking algorithms:
resource "bytebase_setting" "semantic_types" {
  name = "settings/SEMANTIC_TYPES"

  semantic_types {
    id    = "full-mask"
    title = "Full mask"
    algorithm {
      full_mask {
        substitution = "***"
      }
    }
  }

  semantic_types {
    id    = "date-year-mask"
    title = "Date year mask"
    algorithm {
      range_mask {
        slices {
          start        = 0
          end          = 4
          substitution = "****"
        }
      }
    }
  }

  semantic_types {
    id    = "name-first-letter-only"
    title = "Name first letter only"
    algorithm {
      inner_outer_mask {
        prefix_len   = 1
        suffix_len   = 0
        substitution = "*"
        type         = "INNER"
      }
    }
  }
}

| Terraform resource | bytebase_setting |
| Sample file | 8-2-classification.tf |
Create 8-2-classification.tf to organize data by sensitivity levels:
resource "bytebase_setting" "classification" {
  name = "settings/DATA_CLASSIFICATION"

  classification {
    id    = "classification-example"
    title = "Classification Example"

    levels {
      level = 1
      title = "Level 1"
    }
    levels {
      level = 2
      title = "Level 2"
    }

    classifications {
      id    = "1"
      title = "Basic"
    }
    classifications {
      id    = "1-1"
      title = "User basic"
      level = 1
    }
    classifications {
      id    = "1-2"
      title = "User contact info"
      level = 2
    }
    classifications {
      id    = "2"
      title = "Employment"
    }
    classifications {
      id    = "2-1"
      title = "Employment info"
      level = 2
    }
  }
}

Apply the semantic types and classification configuration:
terraform plan
terraform apply

Verify in Bytebase:
- Click Data Access > Semantic Types on the left sidebar. You should see three masking types configured.
- Click Data Access > Data Classification on the left sidebar. You should see the classification hierarchy with two levels. Note that Level 2 is marked as more sensitive.
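To check the result from the command line as well, one option is a Terraform output that lists the semantic type IDs under management. This is a minimal sketch: the output name `semantic_type_ids` is hypothetical and not part of the sample files, and the attribute path assumes the provider exposes the `semantic_types` blocks from 8-1-semantic-types.tf as a collection on the resource.

```hcl
# Hypothetical helper output; not part of the sample files.
# Assumes the semantic_types blocks are readable as a collection on the resource.
output "semantic_type_ids" {
  description = "IDs of the semantic types defined in 8-1-semantic-types.tf"
  value       = [for t in bytebase_setting.semantic_types.semantic_types : t.id]
}
```

If the assumption holds, running `terraform output semantic_type_ids` after `terraform apply` should print the three IDs defined in 8-1-semantic-types.tf.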
Now that you've defined your masking methods, apply them workspace-wide using a global policy.
Classification levels must be mapped to semantic types to perform actual masking. Classification defines the sensitivity level, while semantic types define the masking algorithm.

| Terraform resource | bytebase_policy |
| Sample file | 8-3-global-data-masking.tf |
Create 8-3-global-data-masking.tf to apply workspace-wide masking rules:
resource "bytebase_policy" "global_masking_policy" {
  depends_on = [
    bytebase_instance.prod,
    bytebase_setting.environments
  ]

  # parent defaults to the current workspace when omitted.
  type                = "MASKING_RULE"
  enforce             = true
  inherit_from_parent = false

  global_masking_policy {
    rules {
      condition     = "resource.column_name == \"birth_date\""
      id            = "birth-date-mask"
      semantic_type = "date-year-mask"
      title         = "Mask Birth Date Year"
    }
    rules {
      condition     = "resource.column_name == \"last_name\""
      id            = "last-name-first-letter-only"
      semantic_type = "name-first-letter-only"
      title         = "Last Name Only Show First Letter"
    }
    rules {
      condition     = "resource.classification_level == 2"
      id            = "classification-level-2"
      semantic_type = "full-mask" # Maps Level 2 classification to full-mask semantic type
      title         = "Full Mask for Classification Level 2"
    }
  }
}

Apply the global policy:
terraform plan
terraform apply

Verify in Bytebase:
- Click Data Access > Global Masking. You should see the global policy with three rules, each mapped to its semantic type.
- Log in as Developer 1 (`dev1@example.com`), then go to SQL Editor to access `hr_prod`. Double-click the `employee` table on the left. `birth_date` has the `Mask Birth Date Year` semantic type, and `last_name` has `Last Name Only Show First Letter`. Hovering over the eye icon shows the masking reason.
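When many columns follow the same "column name maps to semantic type" pattern, the per-column rules can be generated instead of written by hand. The following is a sketch only, using Terraform's `dynamic` blocks; the `column_masks` local and the generated IDs and titles are hypothetical, and the resource is meant to replace the one in 8-3-global-data-masking.tf rather than sit alongside it.

```hcl
locals {
  # Hypothetical map: column name => semantic type ID from 8-1-semantic-types.tf.
  column_masks = {
    birth_date = "date-year-mask"
    last_name  = "name-first-letter-only"
  }
}

resource "bytebase_policy" "global_masking_policy" {
  depends_on = [
    bytebase_instance.prod,
    bytebase_setting.environments
  ]

  type                = "MASKING_RULE"
  enforce             = true
  inherit_from_parent = false

  global_masking_policy {
    # One rules block per map entry; for_each walks map keys in lexical order.
    dynamic "rules" {
      for_each = local.column_masks
      content {
        id            = "${replace(rules.key, "_", "-")}-mask"
        title         = "Mask ${rules.key}"
        condition     = "resource.column_name == \"${rules.key}\""
        semantic_type = rules.value
      }
    }

    # The classification rule stays a plain static block.
    rules {
      id            = "classification-level-2"
      title         = "Full Mask for Classification Level 2"
      condition     = "resource.classification_level == 2"
      semantic_type = "full-mask"
    }
  }
}
```

Note that the generated IDs (`birth-date-mask`, `last-name-mask`) and titles differ slightly from the hand-written ones, so switching to this form would show up as a change in `terraform plan`. If rule order matters in your workspace, keep in mind that the generated rules follow the lexical order of the map keys.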
| Terraform resource | bytebase_database |
| Sample file | 8-4-database-masking.tf |
Create 8-4-database-masking.tf to apply masking to specific columns:
- Column `from_date` is assigned the semantic type `date-year-mask`
- Column `amount` is assigned the classification `2-1` (Employment info)
resource "bytebase_database" "database" {
  depends_on = [
    bytebase_instance.prod,
    bytebase_project.project-two,
    bytebase_setting.environments
  ]

  name        = "instances/prod-sample-instance/databases/hr_prod"
  project     = bytebase_project.project-two.name
  environment = bytebase_setting.environments.environment_setting[0].environment[1].name

  catalog {
    schemas {
      name = "public"
      tables {
        name = "salary"
        columns {
          name          = "from_date"
          semantic_type = "date-year-mask"
        }
        columns {
          name           = "amount"
          classification = "2-1"
        }
      }
    }
  }
}

Apply the column-specific masking:
terraform plan
terraform apply

Verify in Bytebase:
- Go into Project Two, then click Database > Databases and click hr_prod.
- Scroll down to find the `salary` table and click it. You should see:
  - `amount` is assigned the `Employment info` (Level 2) classification
  - `from_date` is assigned the `date-year-mask` semantic type
- Log in as Developer 1 (`dev1@example.com`), then go to SQL Editor to access `hr_prod`. Double-click the `salary` table on the left. `from_date` has the `Date year mask` semantic type, and `amount` has the `L2` classification, which resolves to the `Full mask` semantic type through the global rule.
| Terraform resource | bytebase_policy |
| Sample file | 8-5-masking-exemption.tf |
Create 8-5-masking-exemption.tf to grant bypass permissions:
- Workspace Admin (`admin@example.com`) and QA (`qa1@example.com`) can see unmasked `birth_date` and `last_name` in the `employee` table (expires 2027-07-30).
- Developer 1 (`dev1@example.com`) can see unmasked `first_name`, `last_name`, and `gender` in the `employee` table (via `raw_expression`, no expiry).
resource "bytebase_policy" "masking_exemption_policy" {
  depends_on = [
    bytebase_project.project-two,
    bytebase_instance.prod
  ]

  parent              = bytebase_project.project-two.name
  type                = "MASKING_EXEMPTION"
  enforce             = true
  inherit_from_parent = false

  masking_exemption_policy {
    exemptions {
      reason           = "Business requirement"
      database         = "instances/prod-sample-instance/databases/hr_prod"
      table            = "employee"
      columns          = ["birth_date", "last_name"]
      members          = ["user:admin@example.com", "user:qa1@example.com"]
      expire_timestamp = "2027-07-30T16:11:49Z"
    }
    exemptions {
      reason         = "Grant query access"
      members        = ["user:dev1@example.com"]
      raw_expression = "resource.instance_id == \"prod-sample-instance\" && resource.database_name == \"hr_prod\" && resource.table_name == \"employee\" && resource.column_name in [\"first_name\", \"last_name\", \"gender\"]"
    }
  }
}

`2027-07-30T16:11:49Z` is an ISO 8601 UTC timestamp.
Bytebase stores its metadata in PostgreSQL, where this value is kept as a `timestamptz`.
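Because the expiry is passed as a plain string, a malformed value would only surface when the provider or server rejects it. One way to catch it earlier is a validated input variable. This is a sketch under assumptions: the variable name `masking_exemption_expire` is hypothetical (the sample file hardcodes the timestamp), and the check relies on Terraform's built-in `formatdate`, which only accepts RFC 3339 timestamps.

```hcl
# Hypothetical variable; the sample file hardcodes the timestamp instead.
variable "masking_exemption_expire" {
  type        = string
  description = "RFC 3339 expiry for the masking exemption, for example 2027-07-30T16:11:49Z"
  default     = "2027-07-30T16:11:49Z"

  validation {
    # formatdate fails on anything that is not RFC 3339; can() turns that failure into false.
    condition     = can(formatdate("YYYY-MM-DD", var.masking_exemption_expire))
    error_message = "The expiry must be an RFC 3339 timestamp such as 2027-07-30T16:11:49Z."
  }
}
```

The first `exemptions` block could then use `expire_timestamp = var.masking_exemption_expire`, so an invalid value is reported during `terraform plan`.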
Apply the masking exemptions and test everything:
terraform plan
terraform apply

Verify the masking exemptions are working:
- Log in as Workspace Admin (`admin@example.com`), then go to SQL Editor to access `hr_prod` and double-click the `employee` table. Both `birth_date` and `last_name` are now unmasked for both query and export.
- You may go to Manage > Masking Exemptions to view current exemptions. They will expire automatically after the expiration time.
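The `raw_expression` for Developer 1 packs four conditions into one escaped string, which is easy to mistype. As a readability sketch, the same conditions can be assembled from parts with Terraform's `join` and `jsonencode` functions. The local names below are hypothetical; the resulting string is intended to be equivalent to the hand-written one, differing only in the whitespace inside the column list.

```hcl
locals {
  # Hypothetical local; the column names are the ones used in 8-5-masking-exemption.tf.
  dev1_unmasked_columns = ["first_name", "last_name", "gender"]

  # join() concatenates the conditions with " && ";
  # jsonencode() renders the list as ["first_name","last_name","gender"].
  dev1_exemption_expression = join(" && ", [
    "resource.instance_id == \"prod-sample-instance\"",
    "resource.database_name == \"hr_prod\"",
    "resource.table_name == \"employee\"",
    "resource.column_name in ${jsonencode(local.dev1_unmasked_columns)}",
  ])
}
```

The second `exemptions` block could then set `raw_expression = local.dev1_exemption_expression`, and adding a column becomes a one-word change to the list.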
This tutorial demonstrated how to implement data masking in Bytebase using Terraform. Here are the key concepts:
Define Phase:
- Semantic Types: Define reusable masking algorithms
- Classification: Organize data by sensitivity levels (requires mapping to semantic types)
Apply Phase:
- Global Policies: Apply masking rules workspace-wide based on conditions
- Column-Level Masking: Apply semantic types or classifications to specific columns
Additional Control:
- Masking Exemption: Grant bypass permissions for specific users to query/export unmasked data
Congratulations! You've completed the Bytebase Terraform tutorial series. You now have a fully configured Bytebase workspace with:
- Database instances and environments
- Organized projects
- Risk policies and approval workflows
- SQL review rules for schema standards
- Database access control
- Data masking for sensitive information







