
Commit 9ca4e6b

Merge pull request Pipelex#79 from Pipelex/release/v0.3.2
Release/v0.3.2
2 parents 08e00ab + 7b42792 commit 9ca4e6b

14 files changed

Lines changed: 1451 additions & 578 deletions

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
@@ -1,5 +1,11 @@
 # Changelog
 
+## [v0.3.2] - 2025-06-13
+
+- Improved automatic insertion of class structure from BaseModel into prompts, based on the PipeLLM's `output_concept`. New unit test included.
+- The ReportingManager now reports costs for all pipeline IDs when no `pipeline_run_id` is specified.
+- The `make_from_str` method from the `StuffFactory` class now uses the `Text` concept by default.
+
 ## [v0.3.1] - 2025-06-10
 
 ### Added

README.md

Lines changed: 36 additions & 9 deletions
@@ -1,8 +1,8 @@
 <div align="center">
 <a href="https://www.pipelex.com/"><img src="https://raw.githubusercontent.com/Pipelex/pipelex/main/.github/assets/logo.png" alt="Pipelex Logo" width="400" style="max-width: 100%; height: auto;"></a>
 
-<h3 align="center">The simpler way to build reliable LLM Pipelines</h3>
-<p align="center">Pipelex is an open‑source dev tool based on a simple declarative language<br/>that lets you define replicable, structured, composable LLM pipelines.</p>
+<h2 align="center">Lean-code language for repeatable workflows</h2>
+<p align="center">Pipelex is based on a simple declarative language that lets you define repeatable, structured, composable AI workflows.</p>
 
 <div>
 <a href="https://www.pipelex.com/demo"><strong>Demo</strong></a> -
@@ -19,14 +19,24 @@
 <br/>
 <br/>
 <a href="https://www.youtube.com/@PipelexAI"><img src="https://img.shields.io/badge/YouTube-FF0000?logo=youtube&logoColor=white" alt="YouTube"></a>
-<a href="https://pipelex.com"><img src="https://img.shields.io/badge/Web-pipelex.com-03bb95?logo=google-chrome&logoColor=white&style=flat" alt="Website"></a>
+<a href="https://pipelex.com"><img src="https://img.shields.io/badge/Homepage-03bb95?logo=google-chrome&logoColor=white&style=flat" alt="Website"></a>
+<a href="https://github.com/Pipelex/pipelex-cookbook"><img src="https://img.shields.io/badge/Cookbook-03bb95?logo=github&logoColor=white&style=flat" alt="Cookbook"></a>
 <a href="https://discord.gg/SReshKQjWt"><img src="https://img.shields.io/badge/Discord-5865F2?logo=discord&logoColor=white" alt="Discord"></a>
 <br/>
 <br/>
 </div>
 
 <div align="center">
-<a href="https://www.pipelex.com/demo"><strong>Checkout our demo!</strong></a>
+<h2 align="center">📜 The Knowledge Pipeline Manifesto</h2>
+<p align="center">
+<a href="https://www.pipelex.com/post/the-knowledge-pipeline-manifesto"><strong>Read why we built Pipelex to transform unreliable AI workflows into deterministic pipelines 🔗</strong></a>
+</p>
+
+<h2 align="center">🚀 See Pipelex in Action</h2>
+<p align="center">
+<a href="https://www.pipelex.com/demo"><strong>Checkout our Demo</strong></a>
+</p>
+
 </div>
 
 # 📑 Table of Contents
@@ -42,15 +52,32 @@
 
 # Introduction
 
-Pipelex™ is a developer tool designed to simplify building reliable AI applications. At its core is a clear, declarative pipeline language specifically crafted for knowledge-processing tasks.
+Pipelex makes it easy for developers to define and run repeatable AI workflows. At its core is a clear, declarative pipeline language specifically crafted for knowledge-processing tasks.
 
-**The Pipelex language uses pipelines,** or "pipes", each capable of integrating different language models (LLMs) or software to process knowledge. Pipes consistently deliver **structured, predictable outputs** at each stage.
+Build **pipelines** from modular pipes that snap together. Each pipe can use a different language model (LLM) or software to process knowledge. Pipes consistently deliver **structured, predictable outputs** at each stage.
 
-Pipelex employs user-friendly TOML syntax, enabling developers to intuitively define workflows in a narrative-like manner. This approach facilitates collaboration between business professionals, developers, and language models (LLMs), ensuring clarity and ease of communication.
+Pipelex uses TOML syntax, making workflows readable and shareable. Business professionals, developers, and AI coding agents can all understand and modify the same pipeline definitions.
+
+Example:
+```toml
+[concept]
+Buyer = "The person who made the purchase"
+PurchaseDocumentText = "Transcript of a receipt, invoice, or order confirmation"
+
+[pipe.extract_buyer]
+PipeLLM = "Extract buyer from purchase document"
+inputs = { purchase_document_text = "PurchaseDocumentText" }
+output = "Buyer"
+llm = "llm_to_extract_info"
+prompt_template = """
+Extract the first and last name of the buyer from this purchase document:
+@purchase_document_text
+"""
+```
 
-Pipes function like modular building blocks, **assembled by connecting other pipes sequentially, in parallel, or by calling sub-pipes.** This assembly resembles function calls in traditional programming but emphasizes a more intuitive, plug-and-play structure, focused explicitly on clear knowledge input and output.
+Pipes are modular building blocks that **connect sequentially, run in parallel, or call sub-pipes.** Like function calls in traditional programming, but with a clear contract: knowledge-in, knowledge-out. This modularity makes pipelines perfect for sharing: fork someone's invoice processor, adapt it for receipts, share it back.
 
-Pipelex is distributed as an **open-source Python library,** with a hosted API launching soon, enabling effortless integration into existing software systems and automation frameworks. Additionally, Pipelex will provide an MCP server that will enable AI Agents to run pipelines like any other tool.
+Pipelex is an **open-source Python library** with a hosted API launching soon. It integrates seamlessly into existing systems and automation frameworks. Plus, it works as an [MCP server](https://github.com/Pipelex/pipelex-mcp) so AI agents can use pipelines as tools.
 
 # 🚀 Quick start
 

pipelex/cogt/llm/llm_worker_abstract.py

Lines changed: 3 additions & 1 deletion
@@ -131,7 +131,9 @@ async def gen_object(
             result = await self._gen_object(llm_job=llm_job, schema=schema)
         except InstructorRetryException as exc:
             raise LLMCompletionError(
-                f"LLM Worker error: Instructor failed after retry with llm '{self.llm_engine.tag}': {exc}\nLLMPrompt: {llm_job.llm_prompt.desc}"
+                f"""Instructor failed to generate object: {schema} after retry with llm '{self.llm_engine.tag}'
+Reason: {exc}
+LLMPrompt: {llm_job.llm_prompt.desc}"""
             ) from exc
 
     # Cleanup result
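The change above swaps a one-line error string for a triple-quoted f-string so that the schema, the failure reason, and the prompt each land on their own line. A minimal sketch of the pattern with hypothetical names (`CompletionError`, `gen_object`), not Pipelex's actual classes:

```python
class CompletionError(Exception):
    """Raised when structured generation fails after retries."""


def gen_object(schema: str) -> dict:
    try:
        # Simulate the underlying failure that would come from the LLM client
        raise ValueError("model returned malformed JSON")
    except ValueError as exc:
        # Triple-quoted f-string: each fact gets its own line in the message,
        # and `from exc` preserves the original error as __cause__
        raise CompletionError(
            f"""Failed to generate object: {schema}
Reason: {exc}"""
        ) from exc


try:
    gen_object("Buyer")
except CompletionError as err:
    print(str(err).splitlines()[0])  # → Failed to generate object: Buyer
```

Chaining with `from exc` keeps the full traceback of the root cause while the multi-line message makes logs easier to scan.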

pipelex/core/stuff.py

Lines changed: 1 addition & 1 deletion
@@ -136,7 +136,7 @@ def as_list_of_fixed_content_type(self, item_type: Type[StuffContentType]) -> Li
         # Validate all items are of the expected type
         for i, item in enumerate(list_content.items):
             if not isinstance(item, item_type):
-                raise TypeError(f"Item {i} in list is of type {type(item)}, not {item_type}")
+                raise TypeError(f"Item {i} in list is of type {type(item)}, not {item_type}, in {self.stuff_name=} and {self.concept_code=}")
 
         return list_content
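The enriched `TypeError` uses Python's f-string `=` debug specifier (3.8+), which renders both the expression and its repr, e.g. `self.stuff_name='scores'`, so the error names the container it came from. A minimal sketch with a hypothetical `Box` class standing in for Pipelex's `Stuff`:

```python
from typing import List, Type, TypeVar

T = TypeVar("T")


class Box:
    """Hypothetical container, standing in for Pipelex's Stuff."""

    def __init__(self, name: str, items: List[object]) -> None:
        self.name = name
        self.items = items

    def as_list_of(self, item_type: Type[T]) -> List[T]:
        for i, item in enumerate(self.items):
            if not isinstance(item, item_type):
                # The `=` specifier expands to "self.name='...'" in the message,
                # identifying which container held the bad item
                raise TypeError(f"Item {i} is of type {type(item)}, not {item_type}, in {self.name=}")
        return self.items  # type: ignore[return-value]


box = Box(name="scores", items=[1, "two", 3])
try:
    box.as_list_of(int)
except TypeError as exc:
    print(exc)  # names item index 1 and includes self.name='scores'
```

Without the `=` specifier, a bare `{self.name}` would print only the value, leaving the reader to guess which attribute it was.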

pipelex/core/stuff_factory.py

Lines changed: 1 addition & 1 deletion
@@ -95,9 +95,9 @@ def make_from_blueprint(cls, blueprint: StuffBlueprint) -> "Stuff":
     @classmethod
     def make_from_str(
         cls,
-        concept_code: str,
         str_value: str,
         name: Optional[str] = None,
+        concept_code: str = NativeConcept.TEXT.code,
         pipelex_session_id: Optional[str] = None,
     ) -> Stuff:
         if not Concept.concept_str_contains_domain(concept_code):
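This diff moves `concept_code` behind `str_value` and gives it a default, so plain-text callers no longer have to name a concept. A minimal sketch of the new calling convention, with hypothetical stand-ins (`TEXT_CODE`, a simplified `Stuff`), not Pipelex's actual implementation:

```python
from dataclasses import dataclass
from typing import Optional

TEXT_CODE = "native.Text"  # hypothetical stand-in for NativeConcept.TEXT.code


@dataclass
class Stuff:
    concept_code: str
    value: str
    name: Optional[str] = None


def make_from_str(
    str_value: str,
    name: Optional[str] = None,
    concept_code: str = TEXT_CODE,  # defaulted: plain text unless stated otherwise
) -> Stuff:
    return Stuff(concept_code=concept_code, value=str_value, name=name)


# Callers no longer need to name a concept for plain text...
plain = make_from_str("hello")
# ...but can still override it explicitly.
typed = make_from_str("ACME Corp", name="buyer", concept_code="billing.Buyer")
```

Note the trade-off: any caller that previously passed `concept_code` positionally as the first argument would now silently pass it as `str_value`, so a reorder like this favors keyword-only call sites.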

pipelex/pipe_operators/pipe_llm.py

Lines changed: 5 additions & 0 deletions
@@ -219,11 +219,14 @@ async def _run_operator_pipe(
             # TODO: This DYNAMIC_OUTPUT_CONCEPT should not be a field in the params attribute of PipeRunParams.
             # It should be an attribute of PipeRunParams.
             output_concept_code = pipe_run_params.dynamic_output_concept_code or pipe_run_params.params.get(PipeRunParamKey.DYNAMIC_OUTPUT_CONCEPT)
+
             if not output_concept_code:
                 raise RuntimeError(f"No output concept code provided for dynamic output pipe '{self.code}'")
         else:
             output_concept_code = self.output_concept_code
 
+        self.pipe_llm_prompt.output_concept_code = output_concept_code
+
         applied_output_multiplicity, is_multiple_output, fixed_nb_output = output_multiplicity_to_apply(
             output_multiplicity_base=self.output_multiplicity,
             output_multiplicity_override=pipe_run_params.output_multiplicity,
@@ -318,6 +321,7 @@ async def _run_operator_pipe(
             domain=self.domain,
             user_pipe_jinja2=user_pipe_jinja2,
             system_prompt=system_prompt,
+            output_concept_code=output_concept_code,
         )
         llm_prompt_2_factory = PipedLLMPromptFactory(
             pipe_llm_prompt=pipe_llm_prompt_2,
@@ -340,6 +344,7 @@ async def _run_operator_pipe(
             domain=self.domain,
             user_pipe_jinja2=user_pipe_jinja2,
             system_prompt=system_prompt,
+            output_concept_code=output_concept_code,
         )
         llm_prompt_2_factory = PipedLLMPromptFactory(
             pipe_llm_prompt=pipe_llm_prompt_2,
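The first hunk resolves the output concept from run-time parameters for dynamic pipes and falls back to the pipe's declared concept otherwise; the enclosing conditional is outside the hunk, so the sketch below is an assumption about that control flow, with hypothetical names (`resolve_output_concept`, the `"_dynamic_output_concept"` key), not Pipelex's actual API:

```python
from typing import Dict, Optional


def resolve_output_concept(
    static_code: str,
    is_dynamic: bool,
    dynamic_code: Optional[str] = None,
    params: Optional[Dict[str, str]] = None,
) -> str:
    """Prefer run-time concept codes for dynamic pipes; otherwise use the declared one."""
    if is_dynamic:
        # Run-time value wins; a params entry is the secondary source
        code = dynamic_code or (params or {}).get("_dynamic_output_concept")
        if not code:
            raise RuntimeError("No output concept code provided for dynamic output pipe")
        return code
    return static_code


assert resolve_output_concept("domain.Text", is_dynamic=False) == "domain.Text"
assert resolve_output_concept("domain.Text", True, dynamic_code="domain.Buyer") == "domain.Buyer"
```

The second and third hunks then thread the resolved code into both prompt factories, so the output-structure prompt (see `pipe_llm_prompt.py` below) always knows which concept it is describing.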

pipelex/pipe_operators/pipe_llm_prompt.py

Lines changed: 10 additions & 4 deletions
@@ -174,6 +174,10 @@ async def _run_operator_pipe(
         # Append output structure prompt if needed
         if pipe_run_params.dynamic_output_concept_code:
             user_text += PipeLLMPrompt.get_output_structure_prompt(output_concept=pipe_run_params.dynamic_output_concept_code)
+        else:
+            user_text += PipeLLMPrompt.get_output_structure_prompt(output_concept=self.output_concept_code)
+
+        log.verbose(f"User text with {self.output_concept_code=}:\n {user_text}")
 
         ############################################################
         # System text
@@ -219,16 +223,18 @@ def get_output_structure_prompt(output_concept: str) -> str:
         if not output_class:
             return ""
 
-        fields = get_type_structure(output_class, base_class=StuffContent)
+        class_structure = get_type_structure(output_class, base_class=StuffContent)
 
-        if not fields:
+        if not class_structure:
             return ""
 
         output_structure_prompt = (
-            f"\n\n---\nRequested output format: The output should contain the following fields:\n"
-            f"{chr(10).join(fields)}\n"
+            f"\n\n---\nRequested output format: The output should be the following class: {class_name}\n"
+            f"{chr(10).join(class_structure)}\n"
             "You do NOT need to output a formatted JSON object, another LLM will take care of that. "
+            "If you cannot find a value that is Optional, output None for that field."
             "However, you MUST clearly output the values for each of these fields in your response.\n---\n"
+            "DO NOT create information. If the information is not present, output None."
        )
        return output_structure_prompt
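`get_output_structure_prompt` turns the output class's structure into prose appended to the user prompt. A hedged reimplementation of the idea using only the stdlib (`dataclasses` plus `typing`, with a hypothetical `Buyer` class); Pipelex's `get_type_structure` over `StuffContent`/BaseModel may produce richer descriptions:

```python
from dataclasses import dataclass, fields
from typing import Optional, get_type_hints


@dataclass
class Buyer:
    """Hypothetical output class, standing in for a StuffContent subclass."""

    first_name: str
    last_name: str
    middle_name: Optional[str] = None


def output_structure_prompt(output_class: type) -> str:
    """Render a class's field names and types as a prompt fragment."""
    hints = get_type_hints(output_class)
    lines = [f"- {f.name}: {hints[f.name]}" for f in fields(output_class)]
    if not lines:
        return ""
    return (
        f"\n---\nRequested output format: the output should be the following class: {output_class.__name__}\n"
        + "\n".join(lines)
        + "\nIf you cannot find a value that is Optional, output None for that field.\n---\n"
    )


print(output_structure_prompt(Buyer))
```

The diff's other additions ("output None for Optional fields", "DO NOT create information") are prompt-level guardrails against hallucinated field values, layered on top of this structural description.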

pipelex/reporting/reporting_manager.py

Lines changed: 13 additions & 8 deletions
@@ -94,7 +94,6 @@ def report_inference_job(self, inference_job: InferenceJobAbstract):
 
     @override
     def generate_report(self, pipeline_run_id: Optional[str] = None):
-        pipeline_run_id = pipeline_run_id or SpecialPipelineId.UNTITLED
         cost_report_file_path: Optional[str] = None
         if self._reporting_config.is_generate_cost_report_file_enabled:
             ensure_path(self._reporting_config.cost_report_dir_path)
@@ -104,13 +103,19 @@ def generate_report(self, pipeline_run_id: Optional[str] = None):
                 extension=self._reporting_config.cost_report_extension,
             )
 
-        registry = self._get_registry(pipeline_run_id)
-        CostRegistry.generate_report(
-            pipeline_run_id=pipeline_run_id,
-            llm_tokens_usages=registry.get_current_tokens_usage(),
-            unit_scale=self._reporting_config.cost_report_unit_scale,
-            cost_report_file_path=cost_report_file_path,
-        )
+        registries_to_process: Dict[str, UsageRegistry] = {}
+        if pipeline_run_id:
+            registries_to_process = {pipeline_run_id: self._get_registry(pipeline_run_id)}
+        else:
+            registries_to_process = self._usage_registries
+
+        for run_id, registry in registries_to_process.items():
+            CostRegistry.generate_report(
+                pipeline_run_id=run_id,
+                llm_tokens_usages=registry.get_current_tokens_usage(),
+                unit_scale=self._reporting_config.cost_report_unit_scale,
+                cost_report_file_path=cost_report_file_path,
+            )
 
     @override
     def close_registry(self, pipeline_run_id: str):
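The key change: when no `pipeline_run_id` is given, the manager now iterates every tracked registry instead of collapsing everything into an "untitled" bucket. A minimal sketch of the one-or-all dispatch pattern, with hypothetical names (`UsageRegistry`, `generate_reports`), not Pipelex's actual classes:

```python
from typing import Dict, List, Optional


class UsageRegistry:
    """Hypothetical per-pipeline usage store."""

    def __init__(self, tokens: int) -> None:
        self.tokens = tokens


def generate_reports(
    registries: Dict[str, UsageRegistry],
    pipeline_run_id: Optional[str] = None,
) -> List[str]:
    # One specific run if requested, otherwise every tracked run
    if pipeline_run_id:
        to_process = {pipeline_run_id: registries[pipeline_run_id]}
    else:
        to_process = registries
    return [f"{run_id}: {reg.tokens} tokens" for run_id, reg in to_process.items()]


registries = {"run-a": UsageRegistry(120), "run-b": UsageRegistry(45)}
print(generate_reports(registries, "run-a"))  # → ['run-a: 120 tokens']
print(generate_reports(registries))           # reports both runs
```

Normalizing both branches into one dict keeps a single reporting loop, so per-run report formatting lives in exactly one place.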
