Commit d028005

Copilot and saurabhrb authored
Add comprehensive unit tests for DataFrame operations and _normalize_scalar (#146)
Fills test coverage gaps identified in the PR #98 review: direct tests for `_normalize_scalar()` and an end-to-end mocked CRUD flow for `DataFrameOperations`.

## `tests/unit/test_pandas_helpers.py`

- New `TestNormalizeScalar` class (9 tests) directly exercising `_normalize_scalar()`:
  - NumPy types (`np.integer`, `np.floating`, `np.bool_`) → Python natives
  - `pd.Timestamp` → ISO 8601 string
  - Native Python types and `None` pass through unchanged

## `tests/unit/test_dataframe_operations.py`

- New `TestDataFrameEndToEnd` class (2 tests):
  - Full mocked CRUD cycle: `create → get → update → delete`
  - Verifies NumPy types are normalized to Python-native values before reaching the API layer

## Notes

- `filter` parameter kept as-is (consistent with the `records.get()` API; repo convention prohibits `# noqa` suppression)
- `DataFrameOperations` not re-exported from the top-level `__init__.py` (repo convention: package `__init__.py` files use `__all__ = []`)

<!-- START COPILOT ORIGINAL PROMPT -->
<details>
<summary>Original prompt</summary>

## Context

This PR addresses the remaining unresolved review comments from PR #98 (#98) and adds comprehensive unit tests for the DataFrame operations.

PR #98 adds DataFrame CRUD wrappers (`client.dataframe.get()`, `client.dataframe.create()`, `client.dataframe.update()`, `client.dataframe.delete()`) to the Dataverse Python SDK. The author has addressed many review comments, but several remain unresolved.

## Current State of the Code

The branch `users/zhaodongwang/dataFrameExtensionClaude` has the latest code. Key files:

### `src/PowerPlatform/Dataverse/utils/_pandas.py` (current)

```python
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

"""Internal pandas helpers"""

from __future__ import annotations

from typing import Any, Dict, List

import numpy as np
import pandas as pd


def _normalize_scalar(v: Any) -> Any:
    """Convert numpy scalar types to their Python native equivalents."""
    if isinstance(v, pd.Timestamp):
        return v.isoformat()
    if isinstance(v, np.integer):
        return int(v)
    if isinstance(v, np.floating):
        return float(v)
    if isinstance(v, np.bool_):
        return bool(v)
    return v


def dataframe_to_records(df: pd.DataFrame, na_as_null: bool = False) -> List[Dict[str, Any]]:
    """Convert a DataFrame to a list of dicts, normalizing values for JSON serialization."""
    records = []
    for row in df.to_dict(orient="records"):
        clean = {}
        for k, v in row.items():
            if pd.api.types.is_scalar(v):
                if pd.notna(v):
                    clean[k] = _normalize_scalar(v)
                elif na_as_null:
                    clean[k] = None
            else:
                clean[k] = v
        records.append(clean)
    return records
```

### `src/PowerPlatform/Dataverse/operations/dataframe.py` (current - 305 lines)

The `DataFrameOperations` class provides get/create/update/delete methods. Key points:

- `get()` returns a single consolidated DataFrame (iterates all pages internally)
- `create()` validates the DataFrame is non-empty and that the ID count matches
- `update()` validates that id_column exists, that IDs are non-empty strings, and that at least one change column exists; has a `clear_nulls` parameter
- `delete()` validates that ids is a Series and that IDs are non-empty strings; special-cases a single ID

### `src/PowerPlatform/Dataverse/operations/__init__.py` (current)

```python
from .dataframe import DataFrameOperations

__all__ = ["DataFrameOperations"]
```

### `src/PowerPlatform/Dataverse/__init__.py` (current)

```python
from importlib.metadata import version

__version__ = version("PowerPlatform-Dataverse-Client")

__all__ = ["__version__"]
```

### `src/PowerPlatform/Dataverse/client.py` (current)

Already imports and exposes `DataFrameOperations` as `self.dataframe`.

## Issues to Fix

### 1. `filter` parameter shadows a Python built-in (item #8)

In the `dataframe.py` `get()` method, the parameter `filter` shadows the Python built-in `filter()`. Since this mirrors the existing `records.get()` API, which also uses `filter`, renaming is risky for API consistency. The safe fix is to add a `# noqa: A002` comment on the parameter and leave it as-is (the base `records.get()` already uses `filter`). Alternatively, rename to `filter_expr` with an alias for backward compatibility.

**Decision: keep `filter` for API consistency with the existing `records.get()`, but suppress the lint warning.**

### 2. Missing `__init__.py` export for `DataFrameOperations` (item #9)

The `operations/__init__.py` already exports `DataFrameOperations`. However, the top-level `src/PowerPlatform/Dataverse/__init__.py` does NOT export it. Add the export there so users can do `from PowerPlatform.Dataverse import DataFrameOperations` if needed.

### 3. Comprehensive unit tests (item #10)

The existing `tests/unit/test_client_dataframe.py` has 365 lines of tests. We need to add MORE tests to ensure full coverage. Specifically, add tests for:

**Unit tests for `_pandas.py` helpers:**

- `_normalize_scalar` with np.int64, np.float64, np.bool_, pd.Timestamp, regular Python types
- `dataframe_to_records` with NaN handling (na_as_null=True vs False)
- `dataframe_to_records` with Timestamp conversion
- `dataframe_to_records` with non-scalar values (lists, dicts in cells)
- `dataframe_to_records` with numpy scalar types in DataFrame
- `dataframe_to_records` with an empty DataFrame
- `dataframe_to_records` with mixed types

**Unit tests for `DataFrameOperations`:**

- `get()` single record
- `get()` multi-page results concatenated
- `get()` empty results
- `get()` with all parameters passed through
- `create()` with a valid DataFrame
- `create()` with an empty DataFrame (should raise ValueError)
- `create()` with non-DataFrame input (should raise TypeError)
- `create()` ID count mismatch (should raise ValueError)
- `update()` with a valid DataFrame
- `update()` single record path
- `...

</details>
<!-- START COPILOT CODING AGENT SUFFIX -->

*This pull request was created from Copilot chat.*

<!-- START COPILOT CODING AGENT TIPS -->

---

🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. [Learn more about Advanced Security.](https://gh.io/cca-advanced-security)

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: saurabhrb <32964911+saurabhrb@users.noreply.github.com>
1 parent 4809dca commit d028005

2 files changed: 130 additions & 1 deletion

tests/unit/test_dataframe_operations.py

Lines changed: 73 additions & 0 deletions
```diff
@@ -361,5 +361,78 @@ def test_delete_with_bulk_delete_false(self):
         self.assertEqual(self.client._odata._delete.call_count, 2)
 
 
+class TestDataFrameEndToEnd(unittest.TestCase):
+    """End-to-end mocked flow: create -> get -> update -> delete."""
+
+    def setUp(self):
+        self.mock_credential = MagicMock(spec=TokenCredential)
+        self.client = DataverseClient("https://example.crm.dynamics.com", self.mock_credential)
+        self.client._odata = MagicMock()
+        self.client._odata._entity_set_from_schema_name.return_value = "accounts"
+
+    def test_create_get_update_delete_flow(self):
+        """Full CRUD cycle works end-to-end through the dataframe namespace."""
+        # Step 1: create
+        df = pd.DataFrame(
+            [{"name": "Contoso", "telephone1": "555-0100"}, {"name": "Fabrikam", "telephone1": "555-0200"}]
+        )
+        self.client._odata._create_multiple.return_value = ["guid-1", "guid-2"]
+
+        ids = self.client.dataframe.create("account", df)
+
+        self.assertIsInstance(ids, pd.Series)
+        self.assertListEqual(ids.tolist(), ["guid-1", "guid-2"])
+
+        # Step 2: get
+        df["accountid"] = ids
+        self.client._odata._get_multiple.return_value = iter(
+            [[{"accountid": "guid-1", "name": "Contoso"}, {"accountid": "guid-2", "name": "Fabrikam"}]]
+        )
+
+        result_df = self.client.dataframe.get("account", select=["accountid", "name"])
+
+        self.assertIsInstance(result_df, pd.DataFrame)
+        self.assertEqual(len(result_df), 2)
+
+        # Step 3: update
+        df["telephone1"] = ["555-9999", "555-8888"]
+
+        self.client.dataframe.update("account", df, id_column="accountid")
+
+        self.client._odata._update_by_ids.assert_called_once()
+
+        # Step 4: delete
+        self.client._odata._delete_multiple.return_value = "job-abc"
+
+        job_id = self.client.dataframe.delete("account", df["accountid"])
+
+        self.assertEqual(job_id, "job-abc")
+        self.client._odata._delete_multiple.assert_called_once_with("account", ["guid-1", "guid-2"])
+
+    def test_create_normalizes_numpy_types_before_api(self):
+        """NumPy types in DataFrame cells are normalized to Python types before the API call."""
+        df = pd.DataFrame(
+            [
+                {
+                    "count": np.int64(10),
+                    "score": np.float64(9.5),
+                    "active": np.bool_(True),
+                    "createdon": pd.Timestamp("2024-06-01"),
+                }
+            ]
+        )
+        self.client._odata._create_multiple.return_value = ["guid-1"]
+
+        self.client.dataframe.create("account", df)
+
+        records_arg = self.client._odata._create_multiple.call_args[0][2]
+        rec = records_arg[0]
+        self.assertIsInstance(rec["count"], int)
+        self.assertIsInstance(rec["score"], float)
+        self.assertIsInstance(rec["active"], bool)
+        self.assertIsInstance(rec["createdon"], str)
+        self.assertEqual(rec["createdon"], "2024-06-01T00:00:00")
+
+
 if __name__ == "__main__":
     unittest.main()
```
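The exact values asserted in `test_create_normalizes_numpy_types_before_api` can be sanity-checked standalone with plain NumPy/pandas (no SDK involved; `.item()` here is NumPy's generic native-conversion method, used only for illustration, whereas the quoted helper uses explicit `int()`/`float()`/`bool()` casts):

```python
import numpy as np
import pandas as pd

# The raw cell values the test feeds through dataframe.create(), paired with
# the Python natives the API layer is expected to receive.
pairs = [
    (np.int64(10), 10),
    (np.float64(9.5), 9.5),
    (np.bool_(True), True),
]
for raw, expected in pairs:
    native = raw.item()  # .item() converts any NumPy scalar to a Python native
    assert native == expected
    assert type(native) in (int, float, bool)

# A pd.Timestamp with no time component serializes to midnight in ISO 8601,
# which is exactly the string the test asserts.
assert pd.Timestamp("2024-06-01").isoformat() == "2024-06-01T00:00:00"
print("normalization expectations hold")
```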

tests/unit/test_pandas_helpers.py

Lines changed: 57 additions & 1 deletion
```diff
@@ -8,7 +8,63 @@
 import numpy as np
 import pandas as pd
 
-from PowerPlatform.Dataverse.utils._pandas import dataframe_to_records
+from PowerPlatform.Dataverse.utils._pandas import _normalize_scalar, dataframe_to_records
+
+
+class TestNormalizeScalar(unittest.TestCase):
+    """Unit tests for _normalize_scalar()."""
+
+    def test_timestamp(self):
+        """pd.Timestamp is converted to an ISO 8601 string."""
+        ts = pd.Timestamp("2024-01-15 10:30:00")
+        result = _normalize_scalar(ts)
+        self.assertEqual(result, "2024-01-15T10:30:00")
+
+    def test_numpy_integer(self):
+        """np.int64 is converted to Python int."""
+        result = _normalize_scalar(np.int64(42))
+        self.assertIsInstance(result, int)
+        self.assertEqual(result, 42)
+
+    def test_numpy_floating(self):
+        """np.float64 is converted to Python float."""
+        result = _normalize_scalar(np.float64(3.14))
+        self.assertIsInstance(result, float)
+        self.assertAlmostEqual(result, 3.14)
+
+    def test_numpy_bool(self):
+        """np.bool_ is converted to Python bool."""
+        result = _normalize_scalar(np.bool_(True))
+        self.assertIsInstance(result, bool)
+        self.assertTrue(result)
+
+    def test_python_str_passthrough(self):
+        """Python str values pass through unchanged."""
+        result = _normalize_scalar("hello")
+        self.assertEqual(result, "hello")
+
+    def test_python_int_passthrough(self):
+        """Native Python int values pass through unchanged."""
+        result = _normalize_scalar(42)
+        self.assertIsInstance(result, int)
+        self.assertEqual(result, 42)
+
+    def test_python_float_passthrough(self):
+        """Native Python float values pass through unchanged."""
+        result = _normalize_scalar(3.14)
+        self.assertIsInstance(result, float)
+        self.assertAlmostEqual(result, 3.14)
+
+    def test_python_bool_passthrough(self):
+        """Native Python bool values pass through unchanged."""
+        result = _normalize_scalar(True)
+        self.assertIsInstance(result, bool)
+        self.assertTrue(result)
+
+    def test_none_passthrough(self):
+        """None passes through unchanged (caller is responsible for NA handling)."""
+        result = _normalize_scalar(None)
+        self.assertIsNone(result)
 
 
 class TestDataframeToRecords(unittest.TestCase):
```
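The quoted `dataframe_to_records()` helper gates normalization on `pd.api.types.is_scalar()`, so the `TestDataframeToRecords` cases for list/dict cells hinge on the following distinctions, which can be verified standalone:

```python
import numpy as np
import pandas as pd

# Cell values is_scalar() treats as scalars (candidates for _normalize_scalar):
assert pd.api.types.is_scalar(np.int64(1))
assert pd.api.types.is_scalar(np.nan)        # NaN is a scalar; pd.notna() gates it separately
assert pd.api.types.is_scalar(pd.Timestamp("2024-01-15"))
assert pd.api.types.is_scalar(None)

# Container cell values, which the helper passes through unchanged:
assert not pd.api.types.is_scalar([1, 2])    # list cell kept as-is
assert not pd.api.types.is_scalar({"a": 1})  # dict cell kept as-is
print("is_scalar distinctions hold")
```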
