Commit 9835dba

Chore/claude tests (Pipelex#137)
Claude writes pytests:
* test_path_utils
* test_filetype_utils
1 parent 78a0c0f commit 9835dba

6 files changed

Lines changed: 652 additions & 2 deletions

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -54,4 +54,5 @@ temp/
 
 gcp_credentials.json
 
-/pipelex.toml
+/pipelex.toml
+settings.local.json

CLAUDE.md

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
# General rules

## Repo structure

Pipelex is a framework to run low-code AI workflows for repeatable processes.
This python >=3.10 code is in the `pipelex` directory.

## Code Style & formatting

- Imitate existing style
- Use type hints
- Respect Pydantic v2 standard
- Use Typer for CLIs
- Use explicit keyword arguments for function calls with multiple parameters (e.g., `func(arg_name=value)` not just `func(value)`)
- Add trailing commas to multi-line lists, dicts, function arguments, and tuples with >2 items (helps with cleaner diffs and prevents syntax errors when adding items)
- All imports inside this repo's packages must be absolute package paths from the root
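
As an illustration of the keyword-argument and trailing-comma rules above, here is a minimal sketch. The `build_greeting` helper and the `supported_extensions` list are hypothetical names invented for this example; they are not part of the Pipelex codebase.

```python
def build_greeting(name: str, punctuation: str, repeat: int) -> str:
    """Hypothetical helper, used only to illustrate the call style."""
    return " ".join([f"Hello {name}{punctuation}"] * repeat)


# Explicit keyword arguments, one per line, closed by a trailing comma.
greeting = build_greeting(
    name="Ada",
    punctuation="!",
    repeat=2,
)

# Multi-line collections also end with a trailing comma, so that adding
# an item later only touches one line in the diff.
supported_extensions = [
    "jpg",
    "png",
    "pdf",
]
```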

## Test file structure

- Name test files with `test_` prefix
- Use descriptive names that match the functionality being tested
- Place test files in the appropriate test category directory:
  - `tests/unit/` - for unit tests that test individual functions/classes in isolation
  - `tests/integration/` - for integration tests that test component interactions
  - `tests/e2e/` - for end-to-end tests that test complete workflows
  - `tests/test_pipelines/` - for test pipeline definitions (TOML files and their structuring python files)
- Fixtures are defined in conftest.py modules at different levels of the hierarchy; their scope is handled by pytest
- Test data is placed inside test_data.py modules at different levels of the hierarchy; they must be imported with package paths from the root, like `tests.pipelex.test_data`. Their content is all constants, grouped inside classes to keep things tidy.
- Always put tests inside test classes.
- The pipelex pipelines should be stored in `tests/test_pipelines`, along with the related structured output classes that inherit from `StructuredContent`
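
The test data convention above (constants only, grouped inside classes) could look like the following sketch. The `ImageTestCases` class and its contents are assumptions made up for illustration; a real `test_data.py` would hold whatever constants the tests at that level need.

```python
from typing import ClassVar


class ImageTestCases:
    """Hypothetical test data: constants only, grouped in one class."""

    JPEG: ClassVar[tuple[str, str]] = ("jpg", "image/jpeg")
    PNG: ClassVar[tuple[str, str]] = ("png", "image/png")
    ALL: ClassVar[list[tuple[str, str]]] = [JPEG, PNG]


class TestImageTestCases:
    """Tests live inside a test class and read the shared constants."""

    def test_mime_is_an_image_type(self) -> None:
        for extension, mime in ImageTestCases.ALL:
            assert extension
            assert mime.startswith("image/")
```

In a real layout the data class would sit in its own `test_data.py` and be imported through its full package path from the root.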

## Markers

Apply the appropriate markers:
- "llm: uses an LLM to generate text or objects"
- "imgg: uses an image generation AI"
- "inference: uses either an LLM or an image generation AI"
- "gha_disabled: will not be able to run properly on GitHub Actions"

Several markers may be applied. For instance, if the test uses an LLM, then it uses inference, so you must mark it with both `inference` and `llm`.
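
A minimal sketch of stacking both markers on a test class, assuming the markers are registered in the project's pytest configuration. The class name and the stubbed summary are hypothetical; a real test would call the LLM instead.

```python
import pytest


@pytest.mark.llm
@pytest.mark.inference
class TestSummaryGeneration:
    """Hypothetical test class whose tests would call an LLM."""

    def test_summary_is_not_empty(self) -> None:
        # Placeholder value standing in for a real LLM call.
        summary = "stub summary"
        assert summary
```

Marking the class applies both markers to every test method it contains, so a run can select or deselect them with pytest's `-m` option.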

## Test Class Structure

Always group the tests of a module into a test class:

```python
@pytest.mark.llm
@pytest.mark.inference
@pytest.mark.asyncio(loop_scope="class")
class TestFooBar:
    @pytest.mark.parametrize(
        "topic, test_case_blueprint",
        [
            TestCases.CASE_1,
            TestCases.CASE_2,
        ],
    )
    async def test_pipe_processing(
        self,
        request: FixtureRequest,
        topic: str,
        test_case_blueprint: StuffBlueprint,
    ):
        # Test implementation
        ...
```

## Linting & checking

- Run `make lint` -> it runs `ruff check . --fix` to enforce all our linting rules
- Run `make pyright` -> it typechecks with pyright using proper settings
- Run `make mypy` -> it typechecks with mypy using proper settings
  - If you added a dependency and mypy complains that it's not typed, add it to the list of modules in `[[tool.mypy.overrides]]` in pyproject.toml, and be sure to flag it in your PR recap so that maintainers can look for existing stubs
- After `make pyright`, you must also check with `make mypy`

## Testing

- Always test with `make t` -> it runs pytest using proper settings
- If some pytest tests fail, run pytest on the failed ones with the required verbosity to diagnose the issue
- If all unit tests pass, run `make validate` -> it runs a minimal version of our app with just the inits and data loading

## PR Instructions

- Run `make fix-unused-imports` -> removes unused imports, required to validate PR
- Re-run checks in one call with `make check` -> formatting and linting with Ruff, type-checking with Pyright and Mypy
- Re-run `make codex-tests`
- Write a one-line summary of the changes.
- Be sure to list changes made to configs, tests and dependencies

## More docs

- Scan the *.mdc files in .cursor/rules/ to get useful details and explanations on the codebase
Lines changed: 242 additions & 0 deletions
@@ -0,0 +1,242 @@
import base64
import binascii
from pathlib import Path

import pytest
from pytest_mock import MockerFixture

from pipelex.tools.misc.filetype_utils import (
    FileType,
    FileTypeException,
    detect_file_type_from_base64,
    detect_file_type_from_bytes,
    detect_file_type_from_path,
)


class TestFileType:
    def test_file_type_creation(self):
        file_type = FileType(extension="jpg", mime="image/jpeg")
        assert file_type.extension == "jpg"
        assert file_type.mime == "image/jpeg"

    def test_file_type_pydantic_validation(self):
        # Test that FileType is a proper Pydantic model
        data = {"extension": "png", "mime": "image/png"}
        file_type = FileType(**data)
        assert file_type.extension == "png"
        assert file_type.mime == "image/png"


class TestFileTypeException:
    def test_file_type_exception_inheritance(self):
        error = FileTypeException("test message")
        assert isinstance(error, Exception)
        assert str(error) == "test message"
37+
38+
class TestDetectFileTypeFromPath:
39+
def test_detect_file_type_from_path_success_string(self, mocker: MockerFixture):
40+
# Mock the filetype.guess function
41+
mock_kind = mocker.MagicMock()
42+
mock_kind.extension = "jpg"
43+
mock_kind.mime = "image/jpeg"
44+
mock_filetype_guess = mocker.patch("filetype.guess", return_value=mock_kind)
45+
46+
result = detect_file_type_from_path("/path/to/image.jpg")
47+
48+
assert isinstance(result, FileType)
49+
assert result.extension == "jpg"
50+
assert result.mime == "image/jpeg"
51+
mock_filetype_guess.assert_called_once_with("/path/to/image.jpg")
52+
53+
def test_detect_file_type_from_path_success_pathlib(self, mocker: MockerFixture):
54+
# Mock the filetype.guess function
55+
mock_kind = mocker.MagicMock()
56+
mock_kind.extension = "png"
57+
mock_kind.mime = "image/png"
58+
mock_filetype_guess = mocker.patch("filetype.guess", return_value=mock_kind)
59+
60+
path = Path("/path/to/image.png")
61+
result = detect_file_type_from_path(path)
62+
63+
assert isinstance(result, FileType)
64+
assert result.extension == "png"
65+
assert result.mime == "image/png"
66+
mock_filetype_guess.assert_called_once_with(path)
67+
68+
def test_detect_file_type_from_path_failure(self, mocker: MockerFixture):
69+
# Mock filetype.guess to return None (unrecognized file type)
70+
mock_filetype_guess = mocker.patch("filetype.guess", return_value=None)
71+
72+
with pytest.raises(FileTypeException, match="Could not identify file type of '/unknown/file.xyz'"):
73+
detect_file_type_from_path("/unknown/file.xyz")
74+
75+
mock_filetype_guess.assert_called_once_with("/unknown/file.xyz")
76+
77+
def test_detect_file_type_from_path_failure_pathlib(self, mocker: MockerFixture):
78+
# Mock filetype.guess to return None (unrecognized file type)
79+
mock_filetype_guess = mocker.patch("filetype.guess", return_value=None)
80+
81+
path = Path("/unknown/file.xyz")
82+
with pytest.raises(FileTypeException, match="Could not identify file type of '/unknown/file.xyz'"):
83+
detect_file_type_from_path(path)
84+
85+
mock_filetype_guess.assert_called_once_with(path)


class TestDetectFileTypeFromBytes:
    def test_detect_file_type_from_bytes_success(self, mocker: MockerFixture):
        # Mock the filetype.guess function
        mock_kind = mocker.MagicMock()
        mock_kind.extension = "pdf"
        mock_kind.mime = "application/pdf"
        mock_filetype_guess = mocker.patch("filetype.guess", return_value=mock_kind)

        test_bytes = b"\x25\x50\x44\x46"  # PDF header
        result = detect_file_type_from_bytes(test_bytes)

        assert isinstance(result, FileType)
        assert result.extension == "pdf"
        assert result.mime == "application/pdf"
        mock_filetype_guess.assert_called_once_with(test_bytes)

    def test_detect_file_type_from_bytes_failure(self, mocker: MockerFixture):
        # Mock filetype.guess to return None (unrecognized file type)
        mock_filetype_guess = mocker.patch("filetype.guess", return_value=None)

        test_bytes = b"unknown file content"
        with pytest.raises(FileTypeException, match="Could not identify file type of given bytes: b'unknown file content'"):
            detect_file_type_from_bytes(test_bytes)

        mock_filetype_guess.assert_called_once_with(test_bytes)

    def test_detect_file_type_from_bytes_failure_long_bytes(self, mocker: MockerFixture):
        # Mock filetype.guess to return None and test truncation of long bytes in error message
        mock_filetype_guess = mocker.patch("filetype.guess", return_value=None)

        # Create bytes longer than 300 characters to test truncation
        test_bytes = b"a" * 350
        with pytest.raises(FileTypeException) as exc_info:
            detect_file_type_from_bytes(test_bytes)

        # Check that the error message contains truncated bytes (first 300 chars)
        error_message = str(exc_info.value)
        assert "Could not identify file type of given bytes:" in error_message
        assert len(error_message) < len(f"Could not identify file type of given bytes: {test_bytes!r}")
        mock_filetype_guess.assert_called_once_with(test_bytes)


class TestDetectFileTypeFromBase64:
    def test_detect_file_type_from_base64_string_success(self, mocker: MockerFixture):
        # Mock detect_file_type_from_bytes
        mock_detect_bytes = mocker.patch(
            "pipelex.tools.misc.filetype_utils.detect_file_type_from_bytes", return_value=FileType(extension="gif", mime="image/gif")
        )

        # GIF header encoded in base64
        gif_bytes = b"GIF89a"
        b64_string = base64.b64encode(gif_bytes).decode("ascii")

        result = detect_file_type_from_base64(b64_string)

        assert isinstance(result, FileType)
        assert result.extension == "gif"
        assert result.mime == "image/gif"
        mock_detect_bytes.assert_called_once_with(buf=gif_bytes)

    def test_detect_file_type_from_base64_bytes_success(self, mocker: MockerFixture):
        # Mock detect_file_type_from_bytes
        mock_detect_bytes = mocker.patch(
            "pipelex.tools.misc.filetype_utils.detect_file_type_from_bytes", return_value=FileType(extension="jpg", mime="image/jpeg")
        )

        # JPEG header encoded in base64
        jpeg_bytes = b"\xff\xd8\xff"
        b64_bytes = base64.b64encode(jpeg_bytes)

        result = detect_file_type_from_base64(b64_bytes)

        assert isinstance(result, FileType)
        assert result.extension == "jpg"
        assert result.mime == "image/jpeg"
        mock_detect_bytes.assert_called_once_with(buf=jpeg_bytes)

    def test_detect_file_type_from_base64_data_url_success(self, mocker: MockerFixture):
        # Mock detect_file_type_from_bytes
        mock_detect_bytes = mocker.patch(
            "pipelex.tools.misc.filetype_utils.detect_file_type_from_bytes", return_value=FileType(extension="png", mime="image/png")
        )

        # PNG header encoded in base64
        png_bytes = b"\x89PNG\r\n\x1a\n"
        b64_data = base64.b64encode(png_bytes).decode("ascii")
        data_url = f"data:image/png;base64,{b64_data}"

        result = detect_file_type_from_base64(data_url)

        assert isinstance(result, FileType)
        assert result.extension == "png"
        assert result.mime == "image/png"
        mock_detect_bytes.assert_called_once_with(buf=png_bytes)

    def test_detect_file_type_from_base64_data_url_with_whitespace(self, mocker: MockerFixture):
        # Mock detect_file_type_from_bytes
        mock_detect_bytes = mocker.patch(
            "pipelex.tools.misc.filetype_utils.detect_file_type_from_bytes", return_value=FileType(extension="txt", mime="text/plain")
        )

        test_bytes = b"hello world"
        b64_data = base64.b64encode(test_bytes).decode("ascii")
        # Note: The function strips leading whitespace but not trailing whitespace after comma
        data_url_with_whitespace = f"  data:text/plain;base64,{b64_data}"

        result = detect_file_type_from_base64(data_url_with_whitespace)

        assert isinstance(result, FileType)
        assert result.extension == "txt"
        assert result.mime == "text/plain"
        mock_detect_bytes.assert_called_once_with(buf=test_bytes)

    def test_detect_file_type_from_base64_invalid_base64_string(self, mocker: MockerFixture):
        # Test with invalid base64 string
        invalid_b64 = "invalid!base64!string!"

        with pytest.raises(FileTypeException, match="Could not identify file type of given bytes because input is not valid Base-64"):
            detect_file_type_from_base64(invalid_b64)

    def test_detect_file_type_from_base64_invalid_base64_bytes(self, mocker: MockerFixture):
        # Test with invalid base64 bytes
        invalid_b64_bytes = b"invalid!base64!bytes!"

        with pytest.raises(FileTypeException, match="Could not identify file type of given bytes because input is not valid Base-64"):
            detect_file_type_from_base64(invalid_b64_bytes)

    def test_detect_file_type_from_base64_data_url_no_comma(self, mocker: MockerFixture):
        # Test data URL without comma (should be treated as regular base64)
        mocker.patch(
            "pipelex.tools.misc.filetype_utils.detect_file_type_from_bytes", return_value=FileType(extension="bin", mime="application/octet-stream")
        )

        # This will be treated as a base64 string since there's no comma
        no_comma_url = "data:image/png;base64somedata"
        # base64 decode will likely fail, but let's mock it working for testing
        mocker.patch("base64.b64decode", return_value=b"test")

        result = detect_file_type_from_base64(no_comma_url)

        assert isinstance(result, FileType)
        assert result.extension == "bin"
        assert result.mime == "application/octet-stream"

    def test_detect_file_type_from_base64_binascii_error_chain(self, mocker: MockerFixture):
        # Test that binascii.Error is properly chained when base64 decode fails
        mocker.patch("base64.b64decode", side_effect=binascii.Error("Invalid base64"))

        with pytest.raises(FileTypeException) as exc_info:
            detect_file_type_from_base64("invalid")

        # Check that the original exception is chained
        assert exc_info.value.__cause__ is not None
        assert isinstance(exc_info.value.__cause__, binascii.Error)
        assert str(exc_info.value.__cause__) == "Invalid base64"
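
For readers without the module under test at hand, the behavior these base64 tests describe (leading whitespace stripped, data URL prefix dropped, invalid input reported with the original `binascii.Error` chained) can be sketched as follows. This is not the actual `pipelex.tools.misc.filetype_utils` implementation: `FileType` is replaced here by a dataclass stand-in, the magic-number table is a hypothetical substitute for the real detection backend, and the 300-character truncation is inferred from the test comments only.

```python
import base64
import binascii
from dataclasses import dataclass
from typing import Union


class FileTypeException(Exception):
    """Stand-in for the exception imported by the tests."""


@dataclass(frozen=True)
class FileType:
    """Stand-in for the real model, which the tests treat as a Pydantic model."""

    extension: str
    mime: str


# Assumption: a tiny magic-number table replaces the real detection backend.
_MAGIC_NUMBERS = {
    b"\x89PNG\r\n\x1a\n": FileType(extension="png", mime="image/png"),
    b"GIF89a": FileType(extension="gif", mime="image/gif"),
    b"\xff\xd8\xff": FileType(extension="jpg", mime="image/jpeg"),
}


def detect_file_type_from_bytes(buf: bytes) -> FileType:
    for magic, file_type in _MAGIC_NUMBERS.items():
        if buf.startswith(magic):
            return file_type
    # Truncate the payload so long inputs do not flood the error message.
    raise FileTypeException(f"Could not identify file type of given bytes: {buf[:300]!r}")


def detect_file_type_from_base64(b64: Union[str, bytes]) -> FileType:
    text = b64.decode("ascii") if isinstance(b64, bytes) else b64
    text = text.lstrip()
    # A data URL looks like "data:<mime>;base64,<payload>": keep only the payload.
    if text.startswith("data:") and "," in text:
        text = text.split(",", 1)[1]
    try:
        decoded = base64.b64decode(text, validate=True)
    except binascii.Error as exc:
        # "from exc" keeps the original error available as __cause__.
        raise FileTypeException(
            "Could not identify file type of given bytes because input is not valid Base-64"
        ) from exc
    return detect_file_type_from_bytes(buf=decoded)
```

Calling `detect_file_type_from_bytes` with the `buf=` keyword mirrors the `assert_called_once_with(buf=...)` checks in the tests above.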
