Commit 7a93bc9

Chore/test pyramid (Pipelex#110)
### πŸ“ Description - ClassRegistryUtils implements features removed from dependency `kajson` ### πŸ”„ Type of Change - [ ] πŸ› Bug fix - [ ] ✨ New feature - [ ] πŸ’₯ Breaking change - [ ] πŸ“š Documentation update - [X] 🧹 Code refactor - [ ] ⚑ Performance improvement - [X] βœ… Test update ### πŸ§ͺ Tests - test pyramid: unit/ and integration/ tests, e2e/ empty for now
1 parent f7e43ca commit 7a93bc9

99 files changed

Lines changed: 1199 additions & 284 deletions


β€Ž.cursor/rules/pytest.mdcβ€Ž

Lines changed: 6 additions & 6 deletions
@@ -10,11 +10,11 @@ These rules apply when writing unit tests.
 - Name test files with `test_` prefix
 - Use descriptive names that match the functionality being tested
-- Place test files in the appropriate subdirectory of `tests/`:
-  - `tests/tools/` for tests related to sub-package `pipelex.tools`
-  - `tests/pipelex/` for tests related to `pipelex` and its sub-packages
-  - `tests/pipelex/cogt/` for tests related to sub-package `pipelex.cogt`
-  - More precisely, for `pipelex` and `pipelex.cogt` the async tests are placed inside subdirectories named `cogt_asynch` and `pipelex_asynch`
+- Place test files in the appropriate test category directory:
+  - `tests/unit/` - for unit tests that test individual functions/classes in isolation
+  - `tests/integration/` - for integration tests that test component interactions
+  - `tests/e2e/` - for end-to-end tests that test complete workflows
+  - `tests/test_pipelines/` - for test pipeline definitions (TOML files)
 - Fixtures are defined in conftest.py modules at different levels of the hierarchy, their scope is handled by pytest
 - Test data is placed inside test_data.py at different levels of the hierarchy, they must be imported with package paths from the root like `tests.pipelex.test_data`. Their content is all constants, regrouped inside classes to keep things tidy.
 - Always put test inside Test classes.

@@ -55,7 +55,7 @@ class TestFooBar:
     # Test implementation
 ```

-Sometimes it can be convenient to access the test's name in its body, for instance to include into a job_id. To achieve that, add the argument `request: FixtureRequest` into the signature and then you can get th test name using `cast(str, request.node.originalname), # type: ignore`.
+Sometimes it can be convenient to access the test's name in its body, for instance to include into a job_id. To achieve that, add the argument `request: FixtureRequest` into the signature and then you can get the test name using `cast(str, request.node.originalname), # type: ignore`.

 # Pipe tests

β€Ž.cursor/rules/standards.mdcβ€Ž

Lines changed: 4 additions & 7 deletions
@@ -14,6 +14,7 @@ This document outlines the coding standards and quality control procedures that
 Before finalizing a task, you must run the following command to check for linting issues, type errors, and code quality problems:

 ```bash
+make fix-unused-imports
 make check
 ```

@@ -34,20 +35,16 @@ We have several make commands for running tests:
 ```
 Use this for quick test runs that don't require LLM or image generation.

-2. `make ti`: Runs all tests with these markers:
-   ```
-   inference and not imgg
-   ```
-   Use this for testing LLM functionality without image generation.
-
-3. To run specific tests:
+2. To run specific tests:
 ```bash
 make tp TEST=TestClassName
 # or
 make tp TEST=test_function_name
 ```
 It matches names, so `TEST=test_function_name` is going to run all tests whose function name STARTS with `test_function_name`.

+Note: never run `make ti`, `make test-inference`, `make to`, `make test-ocr`, `make tg`, or `make test-imgg`: these all use inference which is costly.

 ## Important Project Directories

 ### Pipelines Directory

β€ŽMakefileβ€Ž

Lines changed: 22 additions & 1 deletion
@@ -207,7 +207,7 @@ cleanall: cleanderived cleanenv cleanlibraries
 codex-tests: env
 	$(call PRINT_TITLE,"Unit testing for Codex")
 	@echo "β€’ Running unit tests for Codex (excluding inference and codex_disabled)"
-	$(VENV_PYTEST) -n auto --exitfirst --quiet -m "(dry_runnable or not inference) and not (needs_output or pipelex_api or codex_disabled)" || [ $$? = 5 ]
+	$(VENV_PYTEST) --exitfirst -m "(dry_runnable or not inference) and not (needs_output or pipelex_api or codex_disabled)" || [ $$? = 5 ]

 gha-tests: env
 	$(call PRINT_TITLE,"Unit testing for github actions")

@@ -318,6 +318,27 @@ test-pipelex-api: env
 ta: test-pipelex-api
 	@echo "> done: ta = test-pipelex-api"

+cov: env
+	$(call PRINT_TITLE,"Unit testing with coverage")
+	@echo "β€’ Running unit tests with coverage"
+	@if [ -n "$(TEST)" ]; then \
+		$(VENV_PYTEST) --cov=$(if $(PKG),$(PKG),pipelex) -k "$(TEST)" $(if $(filter 2,$(VERBOSE)),-vv,$(if $(filter 3,$(VERBOSE)),-vvv,-v)); \
+	else \
+		$(VENV_PYTEST) --cov=$(if $(PKG),$(PKG),pipelex) $(if $(filter 2,$(VERBOSE)),-vv,$(if $(filter 3,$(VERBOSE)),-vvv,-v)); \
+	fi
+
+cov-missing: env
+	$(call PRINT_TITLE,"Unit testing with coverage and missing lines")
+	@echo "β€’ Running unit tests with coverage and missing lines"
+	@if [ -n "$(TEST)" ]; then \
+		$(VENV_PYTEST) --cov=$(if $(PKG),$(PKG),pipelex) --cov-report=term-missing -k "$(TEST)" $(if $(filter 2,$(VERBOSE)),-vv,$(if $(filter 3,$(VERBOSE)),-vvv,-v)); \
+	else \
+		$(VENV_PYTEST) --cov=$(if $(PKG),$(PKG),pipelex) --cov-report=term-missing $(if $(filter 2,$(VERBOSE)),-vv,$(if $(filter 3,$(VERBOSE)),-vvv,-v)); \
+	fi
+
+cm: cov-missing
+	@echo "> done: cm = cov-missing"

 ############################################################################################
 ############################ Linting ############################
 ############################################################################################
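The new `cov` target selects its pytest flags with nested `$(if ...)` and `$(filter ...)` calls. How those conditionals expand can be illustrated in plain shell (a sketch only; the variable names mirror the Makefile, and the echoed command is illustrative, not the exact pytest invocation):

```shell
# Simulate: make cov TEST=TestFoo (PKG and VERBOSE unset)
TEST="TestFoo"
PKG=""

# $(if $(PKG),$(PKG),pipelex) -> coverage package defaults to pipelex
COV_OPT="--cov=${PKG:-pipelex}"
# $(if $(filter 2,$(VERBOSE)),-vv,$(if $(filter 3,$(VERBOSE)),-vvv,-v)) -> default -v
VERBOSITY="-v"

if [ -n "$TEST" ]; then
  # TEST set: restrict the run with pytest's -k name filter
  echo "pytest $COV_OPT -k \"$TEST\" $VERBOSITY"
else
  # TEST unset: run the whole suite with coverage
  echo "pytest $COV_OPT $VERBOSITY"
fi
```

So `make cov` runs the full suite with coverage over `pipelex`, while `make cov TEST=TestFoo PKG=pipelex.tools` narrows both the tests and the measured package.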

β€Ždocs/pages/build-reliable-ai-workflows-with-pipelex/pipe-operators/PipeLLM.mdβ€Ž

Lines changed: 0 additions & 1 deletion
@@ -133,7 +133,6 @@ Analyze the document page shown in the image and explain how it relates to the p
 2. **Fixed Multiple Outputs**: Use `nb_output = N` (where N is a positive integer) when you need exactly N outputs. For example, `nb_output = 3` will try to generate 3 results. The parameter `_nb_output` will be available in the prompt template, e.g. "Give me the names of $_nb_output flowers".

 3. **Variable Multiple Outputs**: Use `multiple_output = true` when you need a variable-length list where the LLM determines how many outputs to generate based on the content and context.
-| `output_multiplicity` | string or integer | Defines the number of outputs. Use `"list"` for a variable-length list, or an integer (e.g., `3`) for a fixed-size list. | No |

 ## Examples

β€Žpipelex/core/stuff_factory.pyβ€Ž

Lines changed: 12 additions & 3 deletions
@@ -1,7 +1,7 @@
 from typing import Any, Dict, List, Optional, Tuple

 import shortuuid
-from pydantic import BaseModel, Field
+from pydantic import BaseModel, Field, ValidationError

 from pipelex.config import get_config
 from pipelex.core.concept import Concept

@@ -11,6 +11,7 @@
 from pipelex.core.stuff_content import StuffContent, StuffContentInitableFromStr
 from pipelex.exceptions import ConceptError, PipelexError
 from pipelex.hub import get_class_registry, get_required_concept
+from pipelex.tools.typing.pydantic_utils import format_pydantic_validation_error


 class StuffFactoryError(PipelexError):

@@ -148,14 +149,22 @@ def make_multiple_stuff_from_str(cls, str_stuff_and_concepts_dict: Dict[str, Tup
         return result

     @classmethod
-    def combine_stuffs(cls, concept_code: str, stuff_contents: Dict[str, StuffContent], name: Optional[str] = None) -> Stuff:
+    def combine_stuffs(
+        cls,
+        concept_code: str,
+        stuff_contents: Dict[str, StuffContent],
+        name: Optional[str] = None,
+    ) -> Stuff:
         """
         Combine a dictionary of stuffs into a single stuff.
         """
         the_concept = get_required_concept(concept_code=concept_code)
         the_subclass_name = the_concept.structure_class_name
         the_subclass = get_class_registry().get_required_subclass(name=the_subclass_name, base_class=StuffContent)
-        the_stuff_content = the_subclass.model_validate(obj=stuff_contents)
+        try:
+            the_stuff_content = the_subclass.model_validate(obj=stuff_contents)
+        except ValidationError as exc:
+            raise StuffFactoryError(f"Error combining stuffs: {format_pydantic_validation_error(exc=exc)}") from exc
         return cls.make_stuff(
             concept_str=concept_code,
             content=the_stuff_content,
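The change to `combine_stuffs` is a wrap-and-reraise: a validation failure is converted into a domain error carrying a formatted message, with the original exception chained as the cause. A self-contained sketch of that pattern (stand-in names throughout; the real code catches pydantic's `ValidationError` and formats it with `format_pydantic_validation_error`):

```python
class StuffFactoryError(Exception):
    """Domain-level error raised when combining stuffs fails."""


def _validate(data: dict) -> dict:
    # Stand-in for the_subclass.model_validate(obj=stuff_contents)
    if "text" not in data:
        raise ValueError("text: Field required")
    return data


def combine_stuffs(stuff_contents: dict) -> dict:
    try:
        return _validate(stuff_contents)
    except ValueError as exc:
        # Re-raise as a domain error, chaining the original cause with `from`
        raise StuffFactoryError(f"Error combining stuffs: {exc}") from exc


try:
    combine_stuffs({"title": "no text here"})
except StuffFactoryError as err:
    print(err)  # Error combining stuffs: text: Field required
    print(type(err.__cause__).__name__)  # ValueError
```

Chaining with `from exc` keeps the low-level validation details available in the traceback while callers only need to handle `StuffFactoryError`.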

β€Žpipelex/libraries/library_manager.pyβ€Ž

Lines changed: 4 additions & 2 deletions
@@ -24,6 +24,7 @@
     StaticValidationError,
 )
 from pipelex.libraries.library_config import LibraryConfig
+from pipelex.tools.class_registry_utils import ClassRegistryUtils
 from pipelex.tools.misc.file_utils import find_files_in_dir
 from pipelex.tools.misc.json_utils import deep_update
 from pipelex.tools.misc.toml_utils import load_toml_from_path

@@ -74,13 +75,13 @@ def teardown(self) -> None:
     def load_libraries(self):
         log.debug("LibraryManager loading separate libraries")

-        KajsonManager.get_class_registry().register_classes_in_folder(
+        ClassRegistryUtils.register_classes_in_folder(
             folder_path=LibraryConfig.loaded_pipelines_path,
         )
         library_paths = [LibraryConfig.loaded_pipelines_path]
         if runtime_manager.is_unit_testing:
             log.debug("Registering test pipeline structures for unit testing")
-            KajsonManager.get_class_registry().register_classes_in_folder(
+            ClassRegistryUtils.register_classes_in_folder(
                 folder_path=LibraryConfig.test_pipelines_path,
             )
             library_paths += [LibraryConfig.test_pipelines_path]

@@ -117,6 +118,7 @@ def _load_combo_libraries(self, library_paths: List[str]):
             pattern="*.toml",
             is_recursive=True,
         )
+        log.debug(f"Searching for TOML files in {libraries_path}, found '{found_file_paths}'")
         if not found_file_paths:
             log.warning(f"No TOML files found in library path: {libraries_path}")
         toml_file_paths.extend(found_file_paths)

β€Žpipelex/pipe_operators/pipe_llm_factory.pyβ€Ž

Lines changed: 2 additions & 2 deletions
@@ -131,7 +131,7 @@ def make_pipe_from_blueprint(
             user_images=user_images or None,
         )

-        llm_settings = LLMSettingChoices(
+        llm_choices = LLMSettingChoices(
             for_text=pipe_blueprint.llm,
             for_object=pipe_blueprint.llm_to_structure,
             for_object_direct=pipe_blueprint.llm_to_structure_direct,

@@ -152,7 +152,7 @@ def make_pipe_from_blueprint(
             inputs=PipeInputSpec(root=pipe_blueprint.inputs or {}),
             output_concept_code=pipe_blueprint.output,
             pipe_llm_prompt=pipe_llm_prompt,
-            llm_choices=llm_settings,
+            llm_choices=llm_choices,
             structuring_method=pipe_blueprint.structuring_method,
             prompt_template_to_structure=pipe_blueprint.prompt_template_to_structure,
             system_prompt_to_structure=pipe_blueprint.system_prompt_to_structure,

β€Žpipelex/plugins/openai/openai_factory.pyβ€Ž

Lines changed: 2 additions & 2 deletions
@@ -50,8 +50,8 @@ def make_openai_client(cls, llm_platform: LLMPlatform) -> openai.AsyncClient:
                     base_url=endpoint,
                 )
             case LLMPlatform.OPENAI:
-                openai_openai_config = get_config().plugins.openai_config
-                api_key = openai_openai_config.get_api_key(secrets_provider=get_secrets_provider())
+                openai_config = get_config().plugins.openai_config
+                api_key = openai_config.get_api_key(secrets_provider=get_secrets_provider())
                 the_client = openai.AsyncOpenAI(api_key=api_key)
             case LLMPlatform.VERTEXAI:
                 vertexai_config = get_config().plugins.vertexai_config
β€Žpipelex/tools/class_registry_utils.pyβ€Ž

Lines changed: 83 additions & 0 deletions

@@ -0,0 +1,83 @@
+import sys
+from pathlib import Path
+from typing import Any, List, Optional, Type
+
+from pipelex.hub import get_class_registry
+from pipelex.tools.typing.module_inspector import find_classes_in_module, import_module_from_file
+
+
+class ClassRegistryUtils:
+    @classmethod
+    def register_classes_in_file(
+        cls,
+        file_path: str,
+        base_class: Optional[Type[Any]],
+        is_include_imported: bool,
+    ) -> None:
+        """Processes a Python file to find and register classes."""
+        module = import_module_from_file(file_path)
+
+        # Find classes that match criteria
+        classes_to_register = find_classes_in_module(
+            module=module,
+            base_class=base_class,
+            include_imported=is_include_imported,
+        )
+
+        # Clean up sys.modules to prevent memory leaks
+        del sys.modules[module.__name__]
+
+        get_class_registry().register_classes(classes=classes_to_register)
+
+    @classmethod
+    def register_classes_in_folder(
+        cls,
+        folder_path: str,
+        base_class: Optional[Type[Any]] = None,
+        is_recursive: bool = True,
+        is_include_imported: bool = False,
+    ) -> None:
+        """
+        Registers all classes in Python files within a folder that are subclasses of base_class.
+        If base_class is None, registers all classes.
+
+        Args:
+            folder_path: Path to the folder containing Python files
+            base_class: Optional base class to filter registerable classes
+            is_recursive: Whether to search recursively in subdirectories
+            is_include_imported: Whether to include classes imported from other modules
+        """
+        python_files = cls.find_files_in_dir(
+            dir_path=folder_path,
+            pattern="*.py",
+            is_recursive=is_recursive,
+        )
+
+        for python_file in python_files:
+            cls.register_classes_in_file(
+                file_path=str(python_file),
+                base_class=base_class,
+                is_include_imported=is_include_imported,
+            )
+
+    @classmethod
+    def find_files_in_dir(cls, dir_path: str, pattern: str, is_recursive: bool) -> List[Path]:
+        """
+        Find files matching a pattern in a directory.
+
+        Args:
+            dir_path: Directory path to search in
+            pattern: File pattern to match (e.g. "*.py")
+            is_recursive: Whether to search recursively in subdirectories
+
+        Returns:
+            List of matching Path objects
+        """
+        path = Path(dir_path)
+        if is_recursive:
+            return list(path.rglob(pattern))
+        else:
+            return list(path.glob(pattern))
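The `find_files_in_dir` helper in this new file is a thin wrapper over `pathlib`'s `glob`/`rglob`, so its recursive/non-recursive behavior can be checked standalone (the temporary directory layout below is purely illustrative):

```python
import tempfile
from pathlib import Path
from typing import List


def find_files_in_dir(dir_path: str, pattern: str, is_recursive: bool) -> List[Path]:
    # Same logic as ClassRegistryUtils.find_files_in_dir
    path = Path(dir_path)
    return list(path.rglob(pattern)) if is_recursive else list(path.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    # One top-level .py, one nested .py, one non-Python file
    (Path(tmp) / "top.py").write_text("# top-level module\n")
    (Path(tmp) / "sub").mkdir()
    (Path(tmp) / "sub" / "nested.py").write_text("# nested module\n")
    (Path(tmp) / "sub" / "notes.txt").write_text("not python\n")

    flat = find_files_in_dir(tmp, "*.py", is_recursive=False)
    deep = find_files_in_dir(tmp, "*.py", is_recursive=True)
    print(len(flat), len(deep))  # 1 2
```

With `is_recursive=True` (the default in `register_classes_in_folder`), every `.py` file under the folder is picked up and handed to `register_classes_in_file`.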

β€Žpipelex/tools/typing/pydantic_utils.pyβ€Ž

Lines changed: 12 additions & 1 deletion
@@ -27,6 +27,15 @@ def format_pydantic_validation_error(exc: ValidationError) -> str:
     type_errors = [f"{'.'.join(map(str, err['loc']))}: expected {err['type']}" for err in exc.errors() if err["type"] == "type_error"]
     value_errors = [f"{'.'.join(map(str, err['loc']))}: {err['msg']}" for err in exc.errors() if err["type"] == "value_error"]
     enum_errors = [f"{'.'.join(map(str, err['loc']))}: invalid enum value" for err in exc.errors() if err["type"] == "enum"]
+    model_type_errors: List[str] = []
+    for err in exc.errors():
+        if err["type"] == "model_type":
+            field_path = ".".join(map(str, err["loc"]))
+            # Extract expected type from context if available
+            expected_type = err.get("ctx", {}).get("class_name", "unknown model type")
+            actual_input = err.get("input", "unknown")
+            actual_type = type(actual_input).__name__ if actual_input != "unknown" else "unknown"
+            model_type_errors.append(f"{field_path}: expected {expected_type}, got {actual_type}")

     # Add each type of error to the message if present
     if missing_fields:

@@ -39,9 +48,11 @@ def format_pydantic_validation_error(exc: ValidationError) -> str:
         error_msg += f"\nValue errors: {value_errors}"
     if enum_errors:
         error_msg += f"\nEnum errors: {enum_errors}"
+    if model_type_errors:
+        error_msg += f"\nModel type errors: {model_type_errors}"

     # If none of the specific error types were found, add the raw error messages
-    if not any([missing_fields, extra_fields, type_errors, value_errors, enum_errors]):
+    if not any([missing_fields, extra_fields, type_errors, value_errors, enum_errors, model_type_errors]):
         error_msg += "\nOther validation errors:"
         for err in exc.errors():
             error_msg += f"\n{'.'.join(map(str, err['loc']))}: {err['type']}: {err['msg']}"
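The new `model_type` branch operates on the plain dicts returned by pydantic v2's `ValidationError.errors()`, so the extraction logic can be exercised without pydantic by feeding it a dict of the same shape (the sample error below is fabricated for illustration; field and class names are invented):

```python
from typing import Any, Dict, List

# A dict shaped like one entry of pydantic v2's ValidationError.errors()
sample_errors: List[Dict[str, Any]] = [
    {
        "type": "model_type",
        "loc": ("invoice", "line_items"),
        "ctx": {"class_name": "LineItemContent"},
        "input": ["not", "a", "model"],
    }
]

# Same extraction logic as the new branch in format_pydantic_validation_error
model_type_errors: List[str] = []
for err in sample_errors:
    if err["type"] == "model_type":
        field_path = ".".join(map(str, err["loc"]))
        expected_type = err.get("ctx", {}).get("class_name", "unknown model type")
        actual_input = err.get("input", "unknown")
        actual_type = type(actual_input).__name__ if actual_input != "unknown" else "unknown"
        model_type_errors.append(f"{field_path}: expected {expected_type}, got {actual_type}")

print(model_type_errors)  # ['invoice.line_items: expected LineItemContent, got list']
```

This is what turns an opaque "input should be a valid dictionary or instance of X" failure into a one-line "path: expected X, got list" message in `StuffFactoryError`.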
