Commit d9955d4

Style improvements
1 parent d4ec3b6 commit d9955d4

4 files changed

Lines changed: 24 additions & 15 deletions

modules/manage/pages/iceberg/iceberg-performance-tuning.adoc

Lines changed: 4 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -12,7 +12,7 @@
1212
include::shared:partial$enterprise-license.adoc[]
1313
====
1414

15-
Use this guide to optimize the performance of Iceberg topics in Redpanda. It covers strategies for improving downstream query performance, tuning the Iceberg translation pipeline, and monitoring translation throughput.
15+
This guide covers strategies for optimizing the performance of Iceberg topics in Redpanda, including improving downstream query performance, tuning the Iceberg translation pipeline, and monitoring translation throughput.
1616

1717
After reading this page, you will be able to:
1818

@@ -22,7 +22,7 @@ After reading this page, you will be able to:
2222
2323
== Prerequisites
2424

25-
Before tuning Iceberg performance, you need to be familiar with how Iceberg topics work in Redpanda. See xref:manage:iceberg/about-iceberg-topics.adoc[About Iceberg Topics].
25+
You must be familiar with how Iceberg topics work in Redpanda. See xref:manage:iceberg/about-iceberg-topics.adoc[About Iceberg Topics].
2626

2727
== Optimize query performance
2828

@@ -32,7 +32,7 @@ Query engines read Parquet files from object storage to process Iceberg table da
3232

3333
To improve query performance, consider implementing custom https://iceberg.apache.org/docs/nightly/partitioning/[partitioning^] for the Iceberg topic. Use the xref:reference:properties/topic-properties.adoc#redpanda-iceberg-partition-spec[`redpanda.iceberg.partition.spec`] topic property to define the partitioning scheme:
3434

35-
[,bash,]
35+
[,bash]
3636
----
3737
# Create new topic with five topic partitions, replication factor 3, and custom table partitioning for Iceberg
3838
rpk topic create <new-topic-name> -p5 -r3 -c redpanda.iceberg.mode=value_schema_id_prefix -c "redpanda.iceberg.partition.spec=(<partition-key1>, <partition-key2>, ...)"
@@ -50,7 +50,7 @@ To learn more about how partitioning schemes can affect query performance, and f
5050

5151
[TIP]
5252
====
53-
* Partition by columns that you frequently use in queries. Columns with relatively few unique values, also known as low cardinality, are also good candidates for partitioning.
53+
* Partition by columns that you frequently use in queries. Columns with relatively few unique values (low cardinality) are good candidates for partitioning.
5454
* If you must partition based on columns with high cardinality, for example timestamps, use Iceberg's available transforms such as extracting the year, month, or day to avoid creating too many partitions. Too many partitions can be detrimental to performance because more files need to be scanned and managed.
5555
====
5656

modules/manage/pages/iceberg/iceberg-topics-gcp-biglake.adoc

Lines changed: 2 additions & 2 deletions
@@ -246,7 +246,7 @@ iceberg_dlq_table_suffix: _dlq
246246
+
247247
--
248248
* Replace `<bucket-name>` with your bucket name and `<gcp-project-id>` with your Google Cloud project ID.
249-
* You must set the `iceberg_dlq_table_suffix` property to a value that does not include dots or tildes (`~`). The example above uses `_dlq` as the suffix for the xref:manage:iceberg/iceberg-troubleshooting.adoc#dead-letter-queue-dlq[dead-letter queue (DLQ) table].
249+
* You must set the `iceberg_dlq_table_suffix` property to a value that does not include dots or tildes (`~`). The example above uses `_dlq` as the suffix for the xref:manage:iceberg/iceberg-troubleshooting.adoc#dead-letter-queue[dead-letter queue (DLQ) table].
250250
--
251251
+
252252
NOTE: If you edit `bootstrap.yml`, you can skip the cluster configuration step in <<configure-redpanda-for-iceberg>> and proceed to the next step in that section to enable Iceberg for a topic.
@@ -293,7 +293,7 @@ iceberg_dlq_table_suffix: _dlq
293293
+
294294
--
295295
* Replace `<bucket-name>` with your bucket name and `<gcp-project-id>` with your Google Cloud project ID.
296-
* You must set the `iceberg_dlq_table_suffix` property to a value that does not include dots or tildes (`~`). The example above uses `_dlq` as the suffix for the xref:manage:iceberg/iceberg-troubleshooting.adoc#dead-letter-queue-dlq[dead-letter queue (DLQ) table].
296+
* You must set the `iceberg_dlq_table_suffix` property to a value that does not include dots or tildes (`~`). The example above uses `_dlq` as the suffix for the xref:manage:iceberg/iceberg-troubleshooting.adoc#dead-letter-queue[dead-letter queue (DLQ) table].
297297
--
298298

299299
ifndef::env-cloud[]

modules/manage/pages/iceberg/iceberg-troubleshooting.adoc

Lines changed: 17 additions & 8 deletions
@@ -1,6 +1,10 @@
11
= Troubleshoot Iceberg Topics
2-
:description: Diagnose and resolve errors in Redpanda Iceberg translation, including dead-letter queue inspection and record reprocessing.
2+
:description: Diagnose and resolve errors in Redpanda Iceberg translation, including dead-letter queue (DLQ) inspection and record reprocessing.
33
:page-categories: Iceberg, Troubleshooting
4+
:page-topic-type: troubleshooting
5+
:personas: ops_admin, streaming_developer
6+
:learning-objective-1: Diagnose Iceberg translation errors using DLQ tables and metrics
7+
:learning-objective-2: Reprocess or drop invalid records from the DLQ table
48

59
// tag::single-source[]
610

@@ -11,11 +15,16 @@ include::shared:partial$enterprise-license.adoc[]
1115
====
1216
endif::[]
1317

14-
This page covers how to diagnose and resolve errors that occur during Iceberg translation, including working with dead-letter queue (DLQ) tables and handling invalid records.
18+
{description}
1519

16-
== Dead-letter queue (DLQ)
20+
Use this page to:
1721

18-
If Redpanda encounters an error while writing a record to the Iceberg table, Redpanda by default writes the record to a separate dead-letter queue (DLQ) Iceberg table named `<topic-name>~dlq`. The following can cause errors to occur when translating records in the `value_schema_id_prefix` and `value_schema_latest` modes to the Iceberg table format:
22+
* [ ] {learning-objective-1}
23+
* [ ] {learning-objective-2}
24+
25+
== Dead-letter queue
26+
27+
If Redpanda encounters an error while writing a record to the Iceberg table, Redpanda by default writes the record to a separate DLQ Iceberg table named `<topic-name>~dlq`. The following can cause errors to occur when translating records in the `value_schema_id_prefix` and `value_schema_latest` modes to the Iceberg table format:
1928

2029
- Redpanda cannot find the embedded schema ID in the Schema Registry.
2130
- Redpanda fails to translate one or more schema data types to an Iceberg type.
@@ -62,7 +71,7 @@ The data is in binary format, and the first byte is not `0x00`, indicating that
6271

6372
=== Reprocess DLQ records
6473

65-
You can apply a transformation and reprocess the record in your data lakehouse to the original Iceberg table. In this case, you have a JSON value represented as a UTF-8 binary. Depending on your query engine, you might need to decode the binary value first before extracting the JSON fields. Some engines may automatically decode the binary value for you:
74+
You can apply a transformation and reprocess the record in your data lakehouse to the original Iceberg table. In this case, you have a JSON value represented as a UTF-8 binary. Depending on your query engine, you might need to decode the binary value first before extracting the JSON fields. Some query engines decode the binary value automatically:
6675

6776
.ClickHouse SQL example to reprocess DLQ record
6877
[,sql]
@@ -87,7 +96,7 @@ FROM (
8796
+---------+--------------+--------------------------+
8897
----
8998

90-
You can now insert the transformed record back into the main Iceberg table. Redpanda recommends employing a strategy for exactly-once processing to avoid duplicates when reprocessing records.
99+
You can now insert the transformed record back into the main Iceberg table. Redpanda recommends using an exactly-once processing strategy to avoid duplicates when reprocessing records.
91100

92101
=== Drop invalid records
93102

@@ -102,8 +111,8 @@ endif::[]
102111

103112
The following xref:reference:public-metrics-reference.adoc#iceberg-metrics[Iceberg metrics] help identify translation errors, invalid records, and catalog connectivity issues:
104113

105-
* xref:reference:public-metrics-reference.adoc#redpanda_iceberg_translation_dlq_files_created[`redpanda_iceberg_translation_dlq_files_created`]: Number of dead letter queue (DLQ) Parquet files created. A non-zero and increasing value indicates records are failing to translate.
106-
* xref:reference:public-metrics-reference.adoc#redpanda_iceberg_translation_invalid_records[`redpanda_iceberg_translation_invalid_records`]: Number of invalid records encountered during translation, labeled by cause.
114+
* xref:reference:public-metrics-reference.adoc#redpanda_iceberg_translation_dlq_files_created[`redpanda_iceberg_translation_dlq_files_created`]: Number of DLQ Parquet files created. A non-zero and increasing value indicates records are failing to translate. See <<inspect-dlq-table>> to examine the failed records.
115+
* xref:reference:public-metrics-reference.adoc#redpanda_iceberg_translation_invalid_records[`redpanda_iceberg_translation_invalid_records`]: Number of invalid records encountered during translation, labeled by cause. See <<drop-invalid-records>> to configure how Redpanda handles these records.
107116
* xref:reference:public-metrics-reference.adoc#redpanda_iceberg_rest_client_num_commit_table_update_requests_failed[`redpanda_iceberg_rest_client_num_commit_table_update_requests_failed`]: Failed table commit requests to the REST catalog. Applies only when using a REST catalog (`iceberg_catalog_type: rest`). Persistent failures indicate catalog connectivity or permission issues.
108117

109118
// end::single-source[]

modules/manage/pages/iceberg/specify-iceberg-schema.adoc

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@ The following modes are compatible with producing to an Iceberg topic using Redp
6060
- `key_value`
6161
- Starting in version 25.2, `value_schema_latest` with a JSON schema
6262
63-
Otherwise, records may fail to write to the Iceberg table and instead write to the xref:manage:iceberg/iceberg-troubleshooting.adoc#dead-letter-queue-dlq[dead-letter queue].
63+
Otherwise, records may fail to write to the Iceberg table and instead write to the xref:manage:iceberg/iceberg-troubleshooting.adoc#dead-letter-queue[dead-letter queue].
6464
====
6565

6666
== Configure Iceberg mode for a topic
