Commit 5cd086c

Authored by sagebree (Samson Gebre) and Saurabh Badenkal
Implement batch API with changeset, upsert, and DataFrame integration (#129)
## Summary

- Adds `client.batch` namespace -- a deferred-execution batch API that packs multiple Dataverse Web API operations into a single `POST $batch` HTTP request
- Adds `client.batch.dataframe` namespace -- pandas DataFrame wrappers for batch operations
- Adds `client.records.upsert()` and `client.batch.records.upsert()` backed by the `UpsertMultiple` bound action with alternate-key support
- Fixes a bug where alternate key fields were merged into the UpsertMultiple request body, causing `400 Bad Request` on the create path

## Batch API Design

Implements the [Batch API Design](#129 (comment)) spec from @sagebree:

| Capability | How to use | Status |
|---|---|---|
| Record CRUD (create / update / delete / get) | `batch.records.*` | Done |
| Upsert by alternate key | `batch.records.upsert(...)` | Done |
| Table metadata (create / delete / columns / relationships) | `batch.tables.*` | Done |
| SQL queries | `batch.query.sql(...)` | Done |
| Atomic write groups | `batch.changeset()` | Done |
| Continue past failures | `batch.execute(continue_on_error=True)` | Done |
| DataFrame integration | `batch.dataframe.create/update/delete` | Done (new) |

**Design constraints enforced:**

- Maximum 1000 operations per batch (validated before sending)
- `records.get` paginated overload not supported -- single-record only
- GET operations cannot be placed inside a changeset (enforced by API design)
- Content-ID references are only valid within the same changeset
- File upload operations not batchable
- `tables.create` returns no table metadata on success (HTTP 204)
- `tables.add_columns` / `tables.remove_columns` do not flush the picklist cache
- `client.flush_cache()` not supported in batch (client-side operation)

## What's included

### New: `client.batch` API

- `batch.records.create / get / update / delete / upsert`
- `batch.tables.create / get / list / add_columns / remove_columns / delete`
- `batch.tables.list(filter=..., select=...)` -- parity with `client.tables.list()` from #112
- `batch.tables.create_one_to_many_relationship / create_many_to_many_relationship / delete_relationship / get_relationship / create_lookup_field`
- `batch.query.sql`
- `batch.changeset()` context manager for transactional (all-or-nothing) operations
- Content-ID reference chaining inside changesets (globally unique across all changesets via shared counter)
- `execute(continue_on_error=True)` for mixed success/failure batches
- `BatchResult` with `.responses`, `.succeeded`, `.failed`, `.created_ids`, `.has_errors`

### New: `client.batch.dataframe` API

- `batch.dataframe.create(table, df)` -- DataFrame rows to CreateMultiple batch item
- `batch.dataframe.update(table, df, id_column)` -- DataFrame rows to update batch items
- `batch.dataframe.delete(table, ids_series)` -- pandas Series to delete batch items

### Refactored existing APIs

- Payload generation shared between batch and direct API via `_build_*` / `_RawRequest` pattern
- Execution of batch operations deferred to `execute()`

### OData $batch spec compliance

Audited against [Microsoft Learn docs](https://learn.microsoft.com/en-us/power-apps/developer/data-platform/webapi/execute-batch-operations-using-web-api):

- `Content-Transfer-Encoding: binary` per part
- `Content-Type: application/http` per part
- `Content-Type: application/json; type=entry` for POST/PATCH bodies
- CRLF line endings throughout
- Absolute URLs in batch parts
- Empty changesets silently skipped (prevents invalid multipart)
- Top-level batch error handling (non-multipart 4xx/5xx raises `HttpError` with parsed Dataverse error details)
- Accepts `200`, `202 Accepted`, `207 Multi-Status`, and `400` batch response codes

### Review comment fixes

- Fixed `expected` status codes to include `202`/`207` for all Dataverse environments
- Fixed `_split_multipart` / `_parse_mime_part` return type annotations: `List[Tuple[Dict[str, str], str]]`
- Fixed OptionSet string check regression: now uses dict key lookup instead of JSON string search
- Fixed `_build_get` to lowercase select column names (consistency with `_get_multiple`)
- Added RFC 3986 `%20` encoding documentation in `_build_sql` docstring
- Fixed content-id response parsing for non-changeset parts
- Fixed test assertions after merge: `data` bytes instead of `json` kwarg
- Exception type parity: `batch.records.upsert()` raises `TypeError` (matching `client.records.upsert()`)

### Testing

**Unit tests -- 579 tests passing:**

- `test_batch_operations.py` -- BatchRequest, BatchRecordOperations, BatchTableOperations, BatchQueryOperations, ChangeSet, BatchItemResponse, BatchResult
- `test_batch_serialization.py` -- multipart serialization, response parsing, intent resolution, upsert dispatch, batch size limit, content-ID uniqueness, top-level error handling
- `test_batch_edge_cases.py` -- 40 edge case tests: empty changeset, changeset rollback, content-ID in standalone parts, mixed batch, multiple changesets, batch size limits, top-level errors, continue-on-error, serialization compliance, multipart parsing, content-ID references, intent validation
- `test_batch_dataframe.py` -- 18 tests: DataFrame create/update/delete, validation, NaN handling, empty series, bulk delete
- `test_odata_internal.py` -- `_build_upsert_multiple` body exclusion, conflict detection, URL/method correctness

**E2E tests -- 14 tests passing against live Dataverse (`crm10.dynamics.com`):**

1. Basic batch CRUD (single create + CreateMultiple, update, get, delete)
2. Changeset happy path (create + update via `$ref` content-ID)
3. Changeset rollback (failing op rolls back entire changeset)
4. Multiple changesets (globally unique content-IDs)
5. Continue-on-error (mixed success/failure)
6. Batch SQL query
7. Batch tables.get + tables.list
8. DataFrame batch create
9. DataFrame batch update
10. DataFrame batch delete
11. Mixed batch (changeset + standalone GET)
12. Empty changeset (silently skipped)
13. Content-ID chaining (2 creates + 2 updates via `$ref`)
14. Table setup/teardown

### Examples & docs

- `examples/advanced/batch.py` -- reference examples for all batch operation types
- `examples/advanced/walkthrough.py` -- batch section added (section 11)
- `examples/basic/functional_testing.py` -- `test_batch_all_operations()` covering all operation categories against a live environment

---------

Co-authored-by: Samson Gebre <sagebree@microsoft.com>
Co-authored-by: Saurabh Badenkal <sbadenkal@microsoft.com>
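The alternate-key fix summarized above boils down to one rule: key columns identify the target record, so they must stay out of the value body that `UpsertMultiple` sends (merging them in was what caused `400 Bad Request` on the create path). A minimal illustrative sketch of that split -- the helper name is hypothetical, not the SDK's actual `_build_upsert_multiple`:

```python
def build_upsert_targets(rows, alternate_key_columns):
    """Split each row into alternate-key fields and body fields (sketch).

    Key columns identify the record; body columns carry the values to
    write. Keeping keys out of the body avoids the 400 on create.
    """
    targets = []
    for row in rows:
        keys = {c: row[c] for c in alternate_key_columns}
        body = {c: v for c, v in row.items() if c not in alternate_key_columns}
        targets.append({"keys": keys, "values": body})
    return targets
```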
1 parent 5a395ec commit 5cd086c

24 files changed

Lines changed: 6381 additions & 217 deletions


.claude/skills/dataverse-sdk-dev/SKILL.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -13,7 +13,7 @@ This skill provides guidance for developers working on the PowerPlatform Dataver
 
 ### API Design
 
-1. **Public methods in operation namespaces** - New public methods go in the appropriate namespace module under `src/PowerPlatform/Dataverse/operations/` (`records.py`, `query.py`, `tables.py`). The `client.py` file exposes these via namespace properties (`client.records`, `client.query`, `client.tables`). Public types and constants live in their own modules (e.g., `models/metadata.py`, `common/constants.py`)
+1. **Public methods in operation namespaces** - New public methods go in the appropriate namespace module under `src/PowerPlatform/Dataverse/operations/` (`records.py`, `query.py`, `tables.py`, `batch.py`). The `client.py` file exposes these via namespace properties (`client.records`, `client.query`, `client.tables`, `client.batch`). Public types and constants live in their own modules (e.g., `models/metadata.py`, `models/batch.py`, `common/constants.py`)
 2. **Every public method needs README example** - Public API methods must have examples in README.md
 3. **Reuse existing APIs** - Always check if an existing method can be used before making direct Web API calls
 4. **Update documentation** when adding features - Keep README and SKILL files (both copies) in sync
```
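The namespace-property pattern in point 1 above can be sketched in a few lines -- class and attribute names here are illustrative stand-ins, not the SDK's actual internals:

```python
class BatchNamespace:
    """Illustrative stand-in for a per-namespace operations module."""

    def __init__(self, client):
        # The namespace keeps a reference back to the client so its
        # methods can reuse the client's HTTP session and auth.
        self._client = client


class DataverseClient:
    # Each operation namespace is exposed as a read-only property, so
    # callers write client.batch.records.create(...) without ever
    # constructing namespace objects themselves.
    @property
    def batch(self):
        return BatchNamespace(self)
```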

.claude/skills/dataverse-sdk-use/SKILL.md

Lines changed: 45 additions & 0 deletions
````diff
@@ -22,6 +22,7 @@ Use the PowerPlatform Dataverse Client Python SDK to interact with Microsoft Dat
 - `client.query` -- query and search operations
 - `client.tables` -- table metadata, columns, and relationships
 - `client.files` -- file upload operations
+- `client.batch` -- batch multiple operations into a single HTTP request
 
 ### Bulk Operations
 The SDK supports Dataverse's native bulk operations: Pass lists to `create()`, `update()` for automatic bulk processing, for `delete()`, set `use_bulk_delete` when passing lists to use bulk operation
@@ -369,6 +370,50 @@ client.files.upload(
 )
 ```
 
+### Batch Operations
+
+Use `client.batch` to send multiple operations in one HTTP request. All batch methods return `None`; results arrive via `BatchResult` after `execute()`.
+
+```python
+# Build a batch request
+batch = client.batch.new()
+batch.records.create("account", {"name": "Contoso"})
+batch.records.update("account", account_id, {"telephone1": "555-0100"})
+batch.records.get("account", account_id, select=["name"])
+batch.query.sql("SELECT TOP 5 name FROM account")
+
+result = batch.execute()
+for item in result.responses:
+    if item.is_success:
+        print(f"[OK] {item.status_code} entity_id={item.entity_id}")
+        if item.data:
+            # GET responses populate item.data with the parsed JSON record
+            print(item.data.get("name"))
+    else:
+        print(f"[ERR] {item.status_code}: {item.error_message}")
+
+# Transactional changeset (all succeed or roll back)
+with batch.changeset() as cs:
+    ref = cs.records.create("contact", {"firstname": "Alice"})
+    cs.records.update("account", account_id, {"primarycontactid@odata.bind": ref})
+
+# Continue on error
+result = batch.execute(continue_on_error=True)
+print(f"Succeeded: {len(result.succeeded)}, Failed: {len(result.failed)}")
+```
+
+**BatchResult properties:**
+- `result.responses` -- list of `BatchItemResponse` in submission order
+- `result.succeeded` -- responses with 2xx status codes
+- `result.failed` -- responses with non-2xx status codes
+- `result.has_errors` -- True if any response failed
+- `result.entity_ids` -- GUIDs from OData-EntityId headers (creates and updates)
+
+**Batch limitations:**
+- Maximum 1000 operations per batch
+- Paginated `records.get()` (without `record_id`) is not supported in batch
+- `flush_cache()` is not supported in batch
+
 ## Error Handling
 
 The SDK provides structured exceptions with detailed error information:
````
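On the wire, each operation added to a batch becomes one MIME part of the single `POST $batch` request, following the OData conventions this commit audits against (CRLF line endings, `Content-Type: application/http` and `Content-Transfer-Encoding: binary` per part, `Content-Type: application/json; type=entry` for JSON bodies, absolute URLs). A rough stdlib-only sketch of that framing -- not the SDK's actual serializer:

```python
import json

CRLF = "\r\n"  # the $batch format requires CRLF line endings throughout


def serialize_batch_part(method, absolute_url, body=None):
    """Frame one operation as an OData $batch MIME part (sketch)."""
    lines = [
        "Content-Type: application/http",
        "Content-Transfer-Encoding: binary",
        "",  # blank line separates part headers from the embedded request
        f"{method} {absolute_url} HTTP/1.1",
    ]
    if body is not None:
        lines += [
            "Content-Type: application/json; type=entry",
            "",  # blank line separates request headers from its body
            json.dumps(body),
        ]
    return CRLF.join(lines) + CRLF
```

The real serializer also emits the multipart boundary lines and changeset sub-boundaries around these parts; this sketch only shows the per-part framing.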
README.md

Lines changed: 89 additions & 2 deletions
````diff
@@ -29,6 +29,7 @@ A Python client library for Microsoft Dataverse that provides a unified interfac
 - [Table management](#table-management)
 - [Relationship management](#relationship-management)
 - [File operations](#file-operations)
+- [Batch operations](#batch-operations)
 - [Next steps](#next-steps)
 - [Troubleshooting](#troubleshooting)
 - [Contributing](#contributing)
@@ -43,6 +44,7 @@ A Python client library for Microsoft Dataverse that provides a unified interfac
 - **🔗 Relationship Management**: Create one-to-many and many-to-many relationships between tables with full metadata control
 - **🐼 DataFrame Support**: Pandas wrappers for all CRUD operations, returning DataFrames and Series
 - **📎 File Operations**: Upload files to Dataverse file columns with automatic chunking for large files
+- **📦 Batch Operations**: Send multiple CRUD, table metadata, and SQL query operations in a single HTTP request with optional transactional changesets
 - **🔐 Azure Identity**: Built-in authentication using Azure Identity credential providers with comprehensive support
 - **🛡️ Error Handling**: Structured exception hierarchy with detailed error context and retry guidance
@@ -115,9 +117,9 @@ The SDK provides a simple, pythonic interface for Dataverse operations:
 
 | Concept | Description |
 |---------|-------------|
-| **DataverseClient** | Main entry point; provides `records`, `query`, `tables`, and `files` namespaces |
+| **DataverseClient** | Main entry point; provides `records`, `query`, `tables`, `files`, and `batch` namespaces |
 | **Context Manager** | Use `with DataverseClient(...) as client:` for automatic cleanup and HTTP connection pooling |
-| **Namespaces** | Operations are organized into `client.records` (CRUD & OData queries), `client.query` (QueryBuilder & SQL), `client.tables` (metadata), and `client.files` (file uploads) |
+| **Namespaces** | Operations are organized into `client.records` (CRUD & OData queries), `client.query` (QueryBuilder & SQL), `client.tables` (metadata), `client.files` (file uploads), and `client.batch` (batch requests) |
 | **Records** | Dataverse records represented as Python dictionaries with column schema names |
 | **Schema names** | Use table schema names (`"account"`, `"new_MyTestTable"`) and column schema names (`"name"`, `"new_MyTestColumn"`). See: [Table definitions in Microsoft Dataverse](https://learn.microsoft.com/en-us/power-apps/developer/data-platform/entity-metadata) |
 | **Bulk Operations** | Efficient bulk processing for multiple records with automatic optimization |
@@ -513,6 +515,90 @@ client.files.upload(
 )
 ```
 
+### Batch operations
+
+Use `client.batch` to send multiple operations in one HTTP request. The batch namespace mirrors `client.records`, `client.tables`, and `client.query`.
+
+```python
+# Build a batch request and add operations
+batch = client.batch.new()
+batch.records.create("account", {"name": "Contoso"})
+batch.records.create("account", [{"name": "Fabrikam"}, {"name": "Woodgrove"}])
+batch.records.update("account", account_id, {"telephone1": "555-0100"})
+batch.records.delete("account", old_id)
+batch.records.get("account", account_id, select=["name"])
+
+result = batch.execute()
+for item in result.responses:
+    if item.is_success:
+        print(f"[OK] {item.status_code} entity_id={item.entity_id}")
+    else:
+        print(f"[ERR] {item.status_code}: {item.error_message}")
+```
+
+**Transactional changeset** — all operations in a changeset succeed or roll back together:
+
+```python
+batch = client.batch.new()
+with batch.changeset() as cs:
+    lead_ref = cs.records.create("lead", {"firstname": "Ada"})
+    contact_ref = cs.records.create("contact", {"firstname": "Ada"})
+    cs.records.create("account", {
+        "name": "Babbage & Co.",
+        "originatingleadid@odata.bind": lead_ref,
+        "primarycontactid@odata.bind": contact_ref,
+    })
+result = batch.execute()
+print(f"Created {len(result.entity_ids)} records atomically")
+```
+
+**Table metadata and SQL queries in a batch:**
+
+```python
+batch = client.batch.new()
+batch.tables.create("new_Product", {"new_Price": "decimal", "new_InStock": "bool"})
+batch.tables.add_columns("new_Product", {"new_Rating": "int"})
+batch.tables.get("new_Product")
+batch.query.sql("SELECT TOP 5 name FROM account")
+
+result = batch.execute()
+```
+
+**Continue on error** — attempt all operations even when one fails:
+
+```python
+result = batch.execute(continue_on_error=True)
+print(f"Succeeded: {len(result.succeeded)}, Failed: {len(result.failed)}")
+for item in result.failed:
+    print(f"[ERR] {item.status_code}: {item.error_message}")
+```
+
+**DataFrame integration** -- feed pandas DataFrames directly into a batch:
+
+```python
+import pandas as pd
+
+batch = client.batch.new()
+
+# Create records from a DataFrame
+df = pd.DataFrame([{"name": "Contoso"}, {"name": "Fabrikam"}])
+batch.dataframe.create("account", df)
+
+# Update records from a DataFrame
+updates = pd.DataFrame([
+    {"accountid": id1, "telephone1": "555-0100"},
+    {"accountid": id2, "telephone1": "555-0200"},
+])
+batch.dataframe.update("account", updates, id_column="accountid")
+
+# Delete records from a Series
+batch.dataframe.delete("account", pd.Series([id1, id2]))
+
+result = batch.execute()
+```
+
+For a complete example see [examples/advanced/batch.py](https://github.com/microsoft/PowerPlatform-DataverseClient-Python/blob/main/examples/advanced/batch.py).
+
 ## Next steps
 
 ### More sample code
@@ -527,6 +613,7 @@ Explore our comprehensive examples in the [`examples/`](https://github.com/micro
 - **[Complete Walkthrough](https://github.com/microsoft/PowerPlatform-DataverseClient-Python/blob/main/examples/advanced/walkthrough.py)** - Full feature demonstration with production patterns
 - **[Relationship Management](https://github.com/microsoft/PowerPlatform-DataverseClient-Python/blob/main/examples/advanced/relationships.py)** - Create and manage table relationships
 - **[File Upload](https://github.com/microsoft/PowerPlatform-DataverseClient-Python/blob/main/examples/advanced/file_upload.py)** - Upload files to Dataverse file columns
+- **[Batch Operations](https://github.com/microsoft/PowerPlatform-DataverseClient-Python/blob/main/examples/advanced/batch.py)** - Send multiple operations in a single request with changesets
 
 📖 See the [examples README](https://github.com/microsoft/PowerPlatform-DataverseClient-Python/blob/main/examples/README.md) for detailed guidance and learning progression.
````
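The `$`-prefixed content-ID references that changesets return (like `lead_ref` and `contact_ref` in the README examples) are, per this PR's design notes, drawn from a counter shared across all changesets so they stay globally unique within the batch. A minimal sketch of such an allocator -- class and method names are illustrative, not the SDK's internal ones:

```python
import itertools


class ContentIdAllocator:
    """Hands out $-prefixed content-ID references, unique batch-wide."""

    def __init__(self):
        # One shared counter per batch; every changeset draws from it,
        # so "$1" can never collide with an ID from another changeset.
        self._counter = itertools.count(1)

    def next_ref(self):
        return f"${next(self._counter)}"
```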

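One of the design constraints this PR enforces client-side is the 1000-operation ceiling per batch, validated before anything is sent. A trivial sketch of that pre-flight check (hypothetical helper, not the SDK's actual validation code):

```python
# Dataverse $batch limit enforced client-side, per this PR's design constraints
MAX_BATCH_OPERATIONS = 1000


def validate_batch_size(operations):
    """Reject oversized batches before serializing or sending anything."""
    if len(operations) > MAX_BATCH_OPERATIONS:
        raise ValueError(
            f"Batch has {len(operations)} operations; "
            f"the maximum is {MAX_BATCH_OPERATIONS} per $batch request"
        )
```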