Migrate bigquery storage SDK from v1beta1 to v1 #2168

Open
benhxy wants to merge 1 commit into tensorflow:master from benhxy:ben/dev

@benhxy benhxy commented May 13, 2026

The BigQuery Storage Read API v1 has been generally available (GA) since 2020. This change updates the SDK from v1beta1 to v1 in preparation for the v1beta1 deprecation.

Context: b/505001153

  1. Service & Namespace Updates:

    • Renamed the service from BigQueryStorage to BigQueryRead as per the v1 transition.
    • Updated the namespace from google::cloud::bigquery::storage::v1beta1 to google::cloud::bigquery::storage::v1.
    • Updated C++ includes to use google/cloud/bigquery/storage/v1/storage.grpc.pb.h.
  2. Request/Response Structure Changes:

    • CreateReadSession:
      • Replaced TableReference (nested fields) with a single table string formatted as projects/{project_id}/datasets/{dataset_id}/tables/{table_id}.
      • Moved DataFormat and ReadOptions into the read_session message.
      • Replaced requested_streams and sharding_strategy with max_stream_count in the top-level request.
    • ReadRows:
      • Updated ReadRowsRequest to use read_stream (string) and offset (int64) directly, replacing the old read_position message.
      • Moved row_count from AvroRows/ArrowRecordBatch to the top-level ReadRowsResponse.
  3. Kernel & Library Refactoring:

    • bigquery_lib.h & bigquery_lib.cc: Updated the BigQueryClientResource to manage BigQueryRead::Stub and adjusted EnsureReaderInitialized and EnsureHasRow to use the v1 request/response fields.
    • bigquery_kernels.cc & bigquery_dataset_op.cc: Refactored the kernel implementation to construct the v1 CreateReadSessionRequest and process the updated ReadSession response.
    • bigquery_test_client_op.cc: Updated the test client to communicate with the v1 service.
  4. Test Suite Updates:

    • tests/test_bigquery.py:
      • Migrated the FakeBigQueryServer to the v1 BigQueryRead service.
      • Updated Python imports to use google-cloud-bigquery-storage v1/v2 compatible paths.
      • Adjusted the mock server logic to handle the v1 table path parsing and the new ReadRows response structure.
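
The request changes in item 2 can be illustrated with a small, library-free sketch. It uses plain Python dicts rather than the real protobuf messages, and every helper name here (`build_table_path`, `parse_table_path`, `to_v1_create_read_session`, `to_v1_read_rows`) is hypothetical and not taken from this PR. The dict keys mirror the field names listed above; treating `requested_streams` as the value for `max_stream_count` is an assumption made for illustration only.

```python
import re

# Hypothetical helpers for illustration; none of these names come from the PR.
# Matches the v1 table string: projects/{project_id}/datasets/{dataset_id}/tables/{table_id}
_TABLE_PATH = re.compile(
    r"projects/(?P<project_id>[^/]+)"
    r"/datasets/(?P<dataset_id>[^/]+)"
    r"/tables/(?P<table_id>[^/]+)"
)


def build_table_path(project_id, dataset_id, table_id):
    """Format the single v1 `table` string from the three TableReference fields."""
    return f"projects/{project_id}/datasets/{dataset_id}/tables/{table_id}"


def parse_table_path(table):
    """Split a v1 `table` string into its parts, as a mock server might need to."""
    match = _TABLE_PATH.fullmatch(table)
    if match is None:
        raise ValueError(f"malformed table path: {table!r}")
    return match.groupdict()


def to_v1_create_read_session(old):
    """Map a v1beta1-shaped request dict onto the v1 layout described above.

    Only the fields named in the PR description are handled. The
    sharding_strategy key is dropped because the v1 layout has no such field.
    """
    ref = old["table_reference"]
    return {
        "read_session": {
            "table": build_table_path(
                ref["project_id"], ref["dataset_id"], ref["table_id"]
            ),
            "data_format": old["format"],
            "read_options": old.get("read_options", {}),
        },
        # Assumption: reuse the old requested_streams value as the upper bound.
        "max_stream_count": old.get("requested_streams", 0),
    }


def to_v1_read_rows(old):
    """Flatten the old read_position message into read_stream and offset."""
    position = old["read_position"]
    return {
        "read_stream": position["stream"]["name"],
        "offset": position.get("offset", 0),
    }
```

For example, `build_table_path("my-project", "my_dataset", "my_table")` returns `"projects/my-project/datasets/my_dataset/tables/my_table"`, and `parse_table_path` on that string recovers the three identifiers. The real kernels and test server operate on the generated protobuf classes, so this sketch only shows where each field moves.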

google-cla Bot commented May 13, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@benhxy benhxy marked this pull request as draft May 13, 2026 15:40
@benhxy benhxy marked this pull request as ready for review May 13, 2026 16:38