Commit d155426

Release/v0.3.0 (Pipelex#71)
### Highlights

- **Structured Input Specifications**: Pipe inputs are now defined as a dictionary mapping a required variable name to a concept code (`required_variable` -> `concept_code`). This replaces the previous single `input` field and allows for multiple, named inputs, making pipes more powerful and explicit. This is a **breaking change**.
- **Static Validation for Inference Pipes**: You can now catch configuration and input mistakes in your pipelines *before* running any operations. This static validation checks `PipeLLM`, `PipeOcr`, and `PipeImgGen`. Static validation for controller pipes (PipeSequence, PipeParallel…) will come in a future release.
  - Configure the behavior for different error types using the `static_validation_config` section in your settings. For each error type, choose to `raise`, `log`, or `ignore`.
- **Dry Run Mode for Zero-Cost Pipeline Validation**: A powerful dry-run mode allows you to test entire pipelines without making any actual inference calls. It's fast, costs nothing, works offline, and is perfect for linting and validating pipeline logic.
  - The new `dry_run_config` lets you control settings, like disabling Jinja2 rendering during a dry run.
  - This feature leverages `polyfactory` to generate mock Pydantic models for simulated outputs.
  - Error handling for bad inputs during `run_pipe` has been improved and is fully effective in dry-run mode.
  - One limitation: currently, dry running doesn't work when the pipeline uses a PipeCondition. This will be fixed in a future release.

### Added

- **`native.Anything` Concept**: A new flexible native concept that is compatible with any other concept, simplifying pipe definitions where input types can vary.
- Added dependency on `polyfactory` for mock Pydantic model generation in dry-run mode.

### Changed

- **Refactored Cognitive Workers**: The abstraction for `LLM`, `Imgg`, and `Ocr` workers has been elegantly simplified. The old decorator-based approach (`..._job_func`) has been replaced with a more robust pattern: a public base method now handles pre- and post-execution logic while calling a private abstract method that each worker implements.
- The `b64_image_bytes` field in `PromptImageBytes` was renamed to `base_64` for better consistency.

### Fixed

- Resolved a logged error related to the pipe stack when using `PipeParallel`.
- The pipe tracker functionality has been restored. It no longer crashes when using nested object attributes (e.g., `my_object.attribute`) as pipe inputs.

### Tests

- A new pytest command-line option `--pipe-run-mode` has been added to switch between `live` and `dry` runs (default is `dry`). All pipe tests now respect this mode.
- Introduced the `pipelex_api` pytest marker for tests related to the Pipelex API client, separating them from general `inference` or `llm` tests.
- Added a `make test-pipelex-api` target (shorthand: `make ta`) to exclusively run these new API client tests.

### Removed

- The `llm_job_func.py` file and the associated decorators have been removed as part of the cognitive worker refactoring.

---------

Co-authored-by: Louis Choquel <louis@pipelex.com>
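The breaking change to input specifications can be pictured with plain dictionaries; note that the variable names and concept codes below are invented for illustration, not taken from the Pipelex docs:

```python
# Hypothetical sketch of the breaking change: pre-v0.3.0 pipes declared one
# anonymous input; v0.3.0 maps each required variable name to a concept code.
# All names here are invented for illustration.
old_style = {"input": "legal.Contract"}

new_style = {
    "inputs": {  # required_variable -> concept_code
        "contract": "legal.Contract",
        "review_criteria": "legal.ReviewCriteria",
    }
}

# Multiple named inputs are now possible, and each one is explicit.
assert list(new_style["inputs"]) == ["contract", "review_criteria"]
```

Existing pipelines that used the single `input` field must be migrated to the new `inputs` mapping.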
1 parent e4b61e7 commit d155426

117 files changed

Lines changed: 2303 additions & 1932 deletions

CHANGELOG.md

Lines changed: 38 additions & 0 deletions
@@ -1,5 +1,43 @@
 # Changelog
 
+## [v0.3.0] - 2025-06-10
+
+### Highlights
+
+- **Structured Input Specifications**: Pipe inputs are now defined as a dictionary mapping a required variable name to a concept code (`required_variable` -> `concept_code`). This replaces the previous single `input` field and allows for multiple, named inputs, making pipes more powerful and explicit. This is a **breaking change**.
+- **Static Validation for Inference Pipes**: You can now catch configuration and input mistakes in your pipelines *before* running any operations. This static validation checks `PipeLLM`, `PipeOcr`, and `PipeImgGen`. Static validation for controller pipes (PipeSequence, PipeParallel…) will come in a future release.
+  - Configure the behavior for different error types using the `static_validation_config` section in your settings. For each error type, choose to `raise`, `log`, or `ignore`.
+- **Dry Run Mode for Zero-Cost Pipeline Validation**: A powerful dry-run mode allows you to test entire pipelines without making any actual inference calls. It's fast, costs nothing, works offline, and is perfect for linting and validating pipeline logic.
+  - The new `dry_run_config` lets you control settings, like disabling Jinja2 rendering during a dry run.
+  - This feature leverages `polyfactory` to generate mock Pydantic models for simulated outputs.
+  - Error handling for bad inputs during `run_pipe` has been improved and is fully effective in dry-run mode.
+  - One limitation: currently, dry running doesn't work when the pipeline uses a PipeCondition. This will be fixed in a future release.
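The dry-run highlight amounts to swapping the inference-backed content generator for a mock implementation behind a shared interface. A minimal stdlib-only sketch of that design, with class and method names that are simplified stand-ins for Pipelex's own:

```python
import asyncio
from abc import ABC, abstractmethod


class ContentGenerator(ABC):
    """Simplified stand-in for the content-generation interface."""

    @abstractmethod
    async def make_text(self, prompt: str) -> str: ...


class LiveGenerator(ContentGenerator):
    async def make_text(self, prompt: str) -> str:
        raise RuntimeError("would call a paid inference API")


class DryGenerator(ContentGenerator):
    """Returns canned output: fast, free, and works offline."""

    async def make_text(self, prompt: str) -> str:
        return f"DRY RUN: make_text • prompt={prompt[:40]}"


async def run_pipe(generator: ContentGenerator) -> str:
    # The pipeline logic is exercised identically in both modes.
    return await generator.make_text("Summarize the contract")


print(asyncio.run(run_pipe(DryGenerator())))  # no inference call is made
```

Because both generators satisfy the same interface, the pipeline code under test is exactly the code that runs live.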
+
+### Added
+
+- **`native.Anything` Concept**: A new flexible native concept that is compatible with any other concept, simplifying pipe definitions where input types can vary.
+- Added dependency on `polyfactory` for mock Pydantic model generation in dry-run mode.
+
+### Changed
+
+- **Refactored Cognitive Workers**: The abstraction for `LLM`, `Imgg`, and `Ocr` workers has been elegantly simplified. The old decorator-based approach (`..._job_func`) has been replaced with a more robust pattern: a public base method now handles pre- and post-execution logic while calling a private abstract method that each worker implements.
+- The `b64_image_bytes` field in `PromptImageBytes` was renamed to `base_64` for better consistency.
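The worker refactor described above is the classic template-method shape: the public method owns the shared pre/post logic and delegates the core work to a private abstract method. A minimal sketch (names are illustrative, not Pipelex's actual classes):

```python
import asyncio
from abc import ABC, abstractmethod


class WorkerAbstract(ABC):
    """Public entry point owns pre/post logic; subclasses fill in the core."""

    async def gen_text(self, prompt: str) -> str:
        self._before(prompt)                    # pre-execution logic (shared)
        result = await self._gen_text(prompt)   # worker-specific implementation
        self._after(result)                     # post-execution logic (shared)
        return result

    def _before(self, prompt: str) -> None:
        print(f"starting job for prompt of {len(prompt)} chars")

    def _after(self, result: str) -> None:
        print(f"job done, {len(result)} chars generated")

    @abstractmethod
    async def _gen_text(self, prompt: str) -> str: ...


class EchoWorker(WorkerAbstract):
    async def _gen_text(self, prompt: str) -> str:
        return prompt.upper()


print(asyncio.run(EchoWorker().gen_text("hello")))  # prints HELLO after the hooks
```

Compared with the old decorator approach, the shared behavior lives in one readable method instead of being woven around each worker by a `..._job_func` wrapper.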
+
+### Fixed
+
+- Resolved a logged error related to the pipe stack when using `PipeParallel`.
+- The pipe tracker functionality has been restored. It no longer crashes when using nested object attributes (e.g., `my_object.attribute`) as pipe inputs.
+
+### Tests
+
+- A new pytest command-line option `--pipe-run-mode` has been added to switch between `live` and `dry` runs (default is `dry`). All pipe tests now respect this mode.
+- Introduced the `pipelex_api` pytest marker for tests related to the Pipelex API client, separating them from general `inference` or `llm` tests.
+- Added a `make test-pipelex-api` target (shorthand: `make ta`) to exclusively run these new API client tests.
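A pytest option like `--pipe-run-mode` is typically declared in a `conftest.py` hook. The sketch below shows one plausible shape; the real Pipelex implementation may differ:

```python
# Hypothetical conftest.py sketch of a --pipe-run-mode option like the one
# described in the changelog (details are assumptions, not Pipelex's code).
def pytest_addoption(parser):
    parser.addoption(
        "--pipe-run-mode",
        action="store",
        default="dry",  # default is dry: tests cost nothing and work offline
        choices=("live", "dry"),
        help="Run pipe tests against real inference ('live') or mocks ('dry').",
    )

# A fixture (declared with @pytest.fixture in a real conftest) would then read
# request.config.getoption("--pipe-run-mode") and hand the mode to each pipe test.
```

Making `dry` the default means a plain `pytest` invocation never triggers paid inference calls.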
+
+### Removed
+
+- The `llm_job_func.py` file and the associated decorators have been removed as part of the cognitive worker refactoring.
+
 ## [v0.2.14] - 2025-06-06
 
 - Added a feature flag for the `ReportingManager` in the config:
Makefile

Lines changed: 14 additions & 3 deletions
@@ -203,12 +203,12 @@ cleanall: cleanderived cleanenv cleanlibraries
 codex-tests: env
 	$(call PRINT_TITLE,"Unit testing for Codex")
 	@echo "• Running unit tests for Codex (excluding inference and codex_disabled)"
-	$(VENV_PYTEST) --exitfirst --quiet -m "not inference and not codex_disabled" || [ $$? = 5 ]
+	$(VENV_PYTEST) --exitfirst --quiet -m "not (inference or codex_disabled or pipelex_api)" || [ $$? = 5 ]
 
 gha-tests: env
 	$(call PRINT_TITLE,"Unit testing for github actions")
 	@echo "• Running unit tests for github actions (excluding inference and gha_disabled)"
-	$(VENV_PYTEST) --exitfirst --quiet -m "not inference and not gha_disabled" || [ $$? = 5 ]
+	$(VENV_PYTEST) --exitfirst --quiet -m "not (inference or gha_disabled or pipelex_api)" || [ $$? = 5 ]
 
 run-all-tests: env
 	$(call PRINT_TITLE,"Running all unit tests")
@@ -218,7 +218,7 @@ run-all-tests: env
 run-manual-trigger-gha-tests: env
 	$(call PRINT_TITLE,"Running GHA tests")
 	@echo "• Running GHA unit tests for inference, llm, and not gha_disabled"
-	$(VENV_PYTEST) --exitfirst --quiet -m "not gha_disabled and (inference or llm)" || [ $$? = 5 ]
+	$(VENV_PYTEST) --exitfirst --quiet -m "not (gha_disabled or pipelex_api) and (inference or llm)" || [ $$? = 5 ]
 
 run-gha_disabled-tests: env
 	$(call PRINT_TITLE,"Running GHA disabled tests")
@@ -303,6 +303,17 @@ test-imgg: env
 tg: test-imgg
 	@echo "> done: tg = test-imgg"
 
+test-pipelex-api: env
+	$(call PRINT_TITLE,"Unit testing")
+	@if [ -n "$(TEST)" ]; then \
+		$(VENV_PYTEST) --exitfirst -m "pipelex_api" -s -k "$(TEST)" $(if $(filter 1,$(VERBOSE)),-v,$(if $(filter 2,$(VERBOSE)),-vv,$(if $(filter 3,$(VERBOSE)),-vvv,))); \
+	else \
+		$(VENV_PYTEST) --exitfirst -m "pipelex_api" -s $(if $(filter 1,$(VERBOSE)),-v,$(if $(filter 2,$(VERBOSE)),-vv,$(if $(filter 3,$(VERBOSE)),-vvv,))); \
+	fi
+
+ta: test-pipelex-api
+	@echo "> done: ta = test-pipelex-api"
+
 ############################################################################################
 ############################ Linting ############################
 ############################################################################################
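The Makefile edits rewrite marker expressions such as `"not inference and not codex_disabled"` into the grouped form `"not (inference or codex_disabled or pipelex_api)"`. For the two original markers the forms are logically equivalent (De Morgan's law); the grouped form simply adds the new `pipelex_api` exclusion in one place. A quick exhaustive check:

```python
# Verify that "not a and not b" equals "not (a or b)" for every combination,
# which is why grouping the markers is a safe refactor of the -m expression.
from itertools import product

for inference, codex_disabled in product((False, True), repeat=2):
    old_form = not inference and not codex_disabled
    grouped_form = not (inference or codex_disabled)
    assert old_form == grouped_form

print("marker expression forms agree for all combinations")
```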

pipelex/cli/_cli.py

Lines changed: 17 additions & 16 deletions
@@ -137,22 +137,23 @@ def _format_concept_code(concept_code: Optional[str], current_domain: str) -> str
         pipes_dict[domain] = {}
 
         for pipe in domain_pipes:
-            if pipe.code:
-                input_code = _format_concept_code(pipe.input_concept_code, domain)
-                output_code = _format_concept_code(pipe.output_concept_code, domain)
-
-                table.add_row(
-                    pipe.code,
-                    pipe.definition or "",
-                    input_code,
-                    output_code,
-                )
-
-                pipes_dict[domain][pipe.code] = {
-                    "definition": pipe.definition or "",
-                    "input": pipe.input_concept_code or "",
-                    "output": pipe.output_concept_code or "",
-                }
+            inputs = pipe.inputs
+            formatted_inputs = [f"{name}: {_format_concept_code(concept_code, domain)}" for name, concept_code in inputs.items]
+            formatted_inputs_str = ", ".join(formatted_inputs)
+            output_code = _format_concept_code(pipe.output_concept_code, domain)
+
+            table.add_row(
+                pipe.code,
+                pipe.definition or "",
+                formatted_inputs_str,
+                output_code,
+            )
+
+            pipes_dict[domain][pipe.code] = {
+                "definition": pipe.definition or "",
+                "inputs": formatted_inputs_str,
+                "output": pipe.output_concept_code,
+            }
 
     pretty_print(table)
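The CLI change above renders the new multi-input mapping as `name: Concept` pairs in one table cell. This standalone sketch re-implements that formatting with a simplified stand-in for `_format_concept_code` (whose real logic may differ):

```python
from typing import Dict


def format_concept_code(concept_code: str, current_domain: str) -> str:
    # Simplified assumption: drop the domain prefix for concepts that belong
    # to the domain currently being listed.
    prefix = f"{current_domain}."
    return concept_code[len(prefix):] if concept_code.startswith(prefix) else concept_code


def format_inputs(inputs: Dict[str, str], domain: str) -> str:
    # Mirrors the list comprehension in the CLI: one "name: Concept" pair per
    # required variable, joined into a single table cell.
    return ", ".join(f"{name}: {format_concept_code(code, domain)}" for name, code in inputs.items())


print(format_inputs({"contract": "legal.Contract", "notes": "native.Text"}, "legal"))
# → contract: Contract, notes: native.Text
```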

Lines changed: 249 additions & 0 deletions
@@ -0,0 +1,249 @@
from typing import Any, Dict, List, Optional, Type

from polyfactory.factories.pydantic_factory import ModelFactory
from typing_extensions import override

from pipelex import log
from pipelex.cogt.content_generation.content_generator_protocol import ContentGeneratorProtocol, update_job_metadata
from pipelex.cogt.image.generated_image import GeneratedImage
from pipelex.cogt.imgg.imgg_handle import ImggHandle
from pipelex.cogt.imgg.imgg_job_components import ImggJobConfig, ImggJobParams
from pipelex.cogt.imgg.imgg_prompt import ImggPrompt
from pipelex.cogt.llm.llm_models.llm_setting import LLMSetting
from pipelex.cogt.llm.llm_prompt import LLMPrompt
from pipelex.cogt.llm.llm_prompt_factory_abstract import LLMPromptFactoryAbstract
from pipelex.cogt.ocr.ocr_handle import OcrHandle
from pipelex.cogt.ocr.ocr_input import OcrInput
from pipelex.cogt.ocr.ocr_job_components import OcrJobConfig, OcrJobParams
from pipelex.cogt.ocr.ocr_output import ExtractedImageFromPage, OcrOutput, Page
from pipelex.config import get_config
from pipelex.pipeline.job_metadata import JobMetadata
from pipelex.tools.templating.jinja2_environment import Jinja2TemplateCategory
from pipelex.tools.templating.templating_models import PromptingStyle
from pipelex.tools.typing.pydantic_utils import BaseModelTypeVar


class ContentGeneratorDry(ContentGeneratorProtocol):
    """
    This class is used to generate mock content for testing purposes.
    It does not use any inference.
    """

    @property
    def _text_gen_truncate_length(self) -> int:
        return get_config().pipelex.dry_run_config.text_gen_truncate_length

    @override
    @update_job_metadata
    async def make_llm_text(  # pyright: ignore[reportIncompatibleMethodOverride]
        self,
        job_metadata: JobMetadata,
        llm_setting_main: LLMSetting,
        llm_prompt_for_text: LLMPrompt,
        wfid: Optional[str] = None,
    ) -> str:
        func_name = "make_llm_text"
        log.dev(f"🤡 DRY RUN: {self.__class__.__name__}.{func_name}")
        prompt_truncated = llm_prompt_for_text.desc(truncate_text_length=self._text_gen_truncate_length)
        generated_text = f"DRY RUN: {func_name} • llm_setting={llm_setting_main.desc()} • prompt={prompt_truncated}"
        return generated_text

    @override
    @update_job_metadata
    async def make_object_direct(  # pyright: ignore[reportIncompatibleMethodOverride]
        self,
        job_metadata: JobMetadata,
        object_class: Type[BaseModelTypeVar],
        llm_setting_for_object: LLMSetting,
        llm_prompt_for_object: LLMPrompt,
        wfid: Optional[str] = None,
    ) -> BaseModelTypeVar:
        func_name = "make_object_direct"
        log.dev(f"🤡 DRY RUN: {self.__class__.__name__}.{func_name}")

        class ObjectFactory(ModelFactory[object_class]):  # type: ignore
            __model__ = object_class

        obj = ObjectFactory.build()
        return obj

    @override
    @update_job_metadata
    async def make_text_then_object(  # pyright: ignore[reportIncompatibleMethodOverride]
        self,
        job_metadata: JobMetadata,
        object_class: Type[BaseModelTypeVar],
        llm_setting_main: LLMSetting,
        llm_setting_for_object: LLMSetting,
        llm_prompt_for_text: LLMPrompt,
        llm_prompt_factory_for_object: Optional[LLMPromptFactoryAbstract] = None,
        wfid: Optional[str] = None,
    ) -> BaseModelTypeVar:
        func_name = "make_text_then_object"
        log.dev(f"🤡 DRY RUN: {self.__class__.__name__}.{func_name}")
        return await self.make_object_direct(
            job_metadata=job_metadata,
            object_class=object_class,
            llm_setting_for_object=llm_setting_for_object,
            llm_prompt_for_object=llm_prompt_for_text,
        )

    @override
    @update_job_metadata
    async def make_object_list_direct(  # pyright: ignore[reportIncompatibleMethodOverride]
        self,
        job_metadata: JobMetadata,
        object_class: Type[BaseModelTypeVar],
        llm_setting_for_object_list: LLMSetting,
        llm_prompt_for_object_list: LLMPrompt,
        wfid: Optional[str] = None,
    ) -> List[BaseModelTypeVar]:
        func_name = "make_object_list_direct"
        log.dev(f"🤡 DRY RUN: {self.__class__.__name__}.{func_name}")
        object_1 = await self.make_object_direct(
            job_metadata=job_metadata,
            object_class=object_class,
            llm_setting_for_object=llm_setting_for_object_list,
            llm_prompt_for_object=llm_prompt_for_object_list,
        )
        object_2 = await self.make_object_direct(
            job_metadata=job_metadata,
            object_class=object_class,
            llm_setting_for_object=llm_setting_for_object_list,
            llm_prompt_for_object=llm_prompt_for_object_list,
        )
        two_objects = [object_1, object_2]
        return two_objects

    @override
    @update_job_metadata
    async def make_text_then_object_list(  # pyright: ignore[reportIncompatibleMethodOverride]
        self,
        job_metadata: JobMetadata,
        object_class: Type[BaseModelTypeVar],
        llm_setting_main: LLMSetting,
        llm_setting_for_object_list: LLMSetting,
        llm_prompt_for_text: LLMPrompt,
        llm_prompt_factory_for_object_list: Optional[LLMPromptFactoryAbstract] = None,
        wfid: Optional[str] = None,
    ) -> List[BaseModelTypeVar]:
        func_name = "make_text_then_object_list"
        log.dev(f"🤡 DRY RUN: {self.__class__.__name__}.{func_name}")
        return await self.make_object_list_direct(
            job_metadata=job_metadata,
            object_class=object_class,
            llm_setting_for_object_list=llm_setting_for_object_list,
            llm_prompt_for_object_list=llm_prompt_for_text,
        )

    @override
    @update_job_metadata
    async def make_single_image(  # pyright: ignore[reportIncompatibleMethodOverride]
        self,
        job_metadata: JobMetadata,
        imgg_handle: ImggHandle,
        imgg_prompt: ImggPrompt,
        imgg_job_params: Optional[ImggJobParams] = None,
        imgg_job_config: Optional[ImggJobConfig] = None,
        wfid: Optional[str] = None,
    ) -> GeneratedImage:
        func_name = "make_single_image"
        log.dev(f"🤡 DRY RUN: {self.__class__.__name__}.{func_name}")
        generated_image = GeneratedImage(
            url="https://storage.googleapis.com/public_test_files_7fa6_4277_9ab/fashion/fashion_photo_1.jpg",
            width=1536,
            height=2752,
        )
        return generated_image

    @override
    @update_job_metadata
    async def make_image_list(  # pyright: ignore[reportIncompatibleMethodOverride]
        self,
        job_metadata: JobMetadata,
        imgg_handle: ImggHandle,
        imgg_prompt: ImggPrompt,
        nb_images: int,
        imgg_job_params: Optional[ImggJobParams] = None,
        imgg_job_config: Optional[ImggJobConfig] = None,
        wfid: Optional[str] = None,
    ) -> List[GeneratedImage]:
        func_name = "make_image_list"
        log.dev(f"🤡 DRY RUN: {self.__class__.__name__}.{func_name}")
        generated_image_list = [
            GeneratedImage(
                url="https://storage.googleapis.com/public_test_files_7fa6_4277_9ab/fashion/fashion_photo_1.jpg",
                width=1536,
                height=2752,
            ),
            GeneratedImage(
                url="https://storage.googleapis.com/public_test_files_7fa6_4277_9ab/fashion/fashion_photo_2.png",
                width=1024,
                height=1536,
            ),
        ]
        return generated_image_list

    @override
    async def make_jinja2_text(
        self,
        context: Dict[str, Any],
        jinja2_name: Optional[str] = None,
        jinja2: Optional[str] = None,
        prompting_style: Optional[PromptingStyle] = None,
        template_category: Jinja2TemplateCategory = Jinja2TemplateCategory.LLM_PROMPT,
        wfid: Optional[str] = None,
    ) -> str:
        func_name = "make_jinja2_text"
        log.dev(f"🤡 DRY RUN: {self.__class__.__name__}.{func_name}")
        jinja2_truncated = jinja2[: self._text_gen_truncate_length] if jinja2 else None
        jinja2_text = (
            f"DRY RUN: {func_name} • context={context} • jinja2_name={jinja2_name} • "
            f"jinja2={jinja2_truncated} • prompting_style={prompting_style} • template_category={template_category}"
        )
        return jinja2_text

    @override
    async def make_ocr_extract_pages(
        self,
        job_metadata: JobMetadata,
        ocr_input: OcrInput,
        ocr_handle: OcrHandle,
        ocr_job_params: Optional[OcrJobParams] = None,
        ocr_job_config: Optional[OcrJobConfig] = None,
        wfid: Optional[str] = None,
    ) -> OcrOutput:
        func_name = "make_ocr_extract_pages"
        log.dev(f"🤡 DRY RUN: {self.__class__.__name__}.{func_name}")
        if ocr_input.image_uri:
            ocr_image_as_page = Page(
                text="DRY RUN: OCR text",
                extracted_images=[],
                page_view=None,
            )
            ocr_output = OcrOutput(
                pages={1: ocr_image_as_page},
            )
        else:
            ocr_page_1 = Page(
                text="DRY RUN: OCR text",
                extracted_images=[],
                page_view=ExtractedImageFromPage(
                    image_id="page_view_1",
                    base_64="",
                    caption="DRY RUN: OCR text",
                ),
            )
            ocr_page_2 = Page(
                text="DRY RUN: OCR text",
                extracted_images=[],
                page_view=ExtractedImageFromPage(
                    image_id="page_view_2",
                    base_64="",
                    caption="DRY RUN: OCR text",
                ),
            )
            ocr_output = OcrOutput(
                pages={1: ocr_page_1, 2: ocr_page_2},
            )
        return ocr_output
