LCORE-1724: Establish a reliable method for deploying RHEL AI instances in CI #2028
Walkthrough
Adds a new Konflux Tekton pipeline for RHEL AI/vLLM E2E tests, along with supporting script, manifest, config, and documentation updates to wire vLLM-specific secrets, models, and runtime settings through the test flow.
Changes: RHEL AI Konflux Integration Pipeline
Estimated code review effort: 4 (Complex) | ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml:
- Around line 395-399: The lightspeed-stack repo/revision selection in the
Tekton test script is being overwritten by a hardcoded fork and branch, so the
pipeline ignores the SNAPSHOT-derived values. Remove the unconditional REPO_URL
and REPO_REV reassignment in the lightspeed-stack test step and keep using the
values parsed from SNAPSHOT in that block, leaving any temporary override behind
the existing TODO only if it is explicitly gated for local use. Reference the
REPO_URL and REPO_REV assignments in the lightspeed-stack pipeline step when
updating this logic.
- Around line 112-171: In the spot provisioning path of the Tekton step, the
exit status of `mapt aws rhel-ai create` is not checked, so failures can fall
through and emit empty results. Update the spot branch in the shell block to
guard the `mapt aws rhel-ai create` call the same way the on-demand path uses
`CREATED`, and fail fast with a clear error if creation does not succeed. Keep
the fix localized around the existing `if [[ "$(params.spot)" == "true" ]]`
branch and the subsequent result-writing commands so `host` and `vllm-api-key`
are only written after a successful create.
- Around line 348-349: Remove the onError: continue setting from the
run-e2e-tests task so failures are not masked when PIPELINE_EXIT is non-zero.
Update the task definition in lightspeed-stack-rhelai-test.yaml for
run-e2e-tests, and keep destroy-rhelai in finally as the cleanup path so the
pipeline correctly fails on e2e errors.
In `@tests/e2e-prow/rhoai/pipeline-konflux.sh`:
- Line 54: The OPENAI_API_KEY check in pipeline-konflux.sh is incorrectly tied
to log()’s return value, so `QUIET=1` can trigger the failure path even when the
key exists. Update the validation near the OPENAI_API_KEY guard to use an
explicit conditional instead of `&& ... || ...`, and keep the existence check
separate from the `log` side effect so `log()` cannot influence the exit
behavior.
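The spot-provisioning finding above (results written even when the create call fails) can be illustrated with a minimal sketch. It assumes nothing about the real pipeline beyond what the finding states: `create_instance` is a hypothetical stand-in for `mapt aws rhel-ai create`, and the host value and results directory are placeholders.

```bash
#!/usr/bin/env bash
set -uo pipefail

# Hypothetical stand-in for `mapt aws rhel-ai create`; it fails unless
# called with "ok", so both outcomes can be exercised locally.
create_instance() { [ "${1:-ok}" = "ok" ]; }

# Write results only after a successful create, and fail fast otherwise.
provision() {
  local mode="$1" results_dir="$2"
  if ! create_instance "$mode"; then
    echo "ERROR: instance creation failed" >&2
    return 1
  fi
  # Placeholder value; a real step would write the provisioned host here.
  printf '%s' "host.example.invalid" > "$results_dir/host"
}

results_dir="$(mktemp -d)"

failed_status=0
provision fail "$results_dir" 2>/dev/null || failed_status=$?
if [ -e "$results_dir/host" ]; then host_after_fail="yes"; else host_after_fail="no"; fi

ok_status=0
provision ok "$results_dir" || ok_status=$?

echo "failed=$failed_status host_after_fail=$host_after_fail ok=$ok_status"
```

The point of the structure is ordering: the result file is only touched on the success path, so a failed create cannot emit an empty or stale `host`.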
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: aa3c3a80-cd0a-4ff3-8b89-20dfd359c0dd
📒 Files selected for processing (7)
- .tekton/integration-tests/README.md
- .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml
- tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml
- tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
- tests/e2e-prow/rhoai/pipeline-konflux.sh
- tests/e2e/configs/run-rhelai.yaml
- tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml
📜 Review details
⏰ Context from checks skipped due to timeout. (12)
- GitHub Check: build-pr
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: integration_tests (3.12)
- GitHub Check: integration_tests (3.13)
⚠️ CI failures not shown inline (4)
GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt: LCORE-1724: Establish a reliable method for deploying RHEL AI instances in CI
Conclusion: failure
##[group]Run set -euo pipefail
set -euo pipefail
uv run python scripts/generate_openapi_schema.py /tmp/openapi-generated.json
if ! diff -u docs/openapi.json /tmp/openapi-generated.json; then
  echo "::error::docs/openapi.json is out of date. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json"
GitHub Actions: OpenAPI (Spectral) / spectral: LCORE-1724: Establish a reliable method for deploying RHEL AI instances in CI
Conclusion: failure
GitHub Actions: Unit tests / 1_unit_tests (3.13).txt: LCORE-1724: Establish a reliable method for deploying RHEL AI instances in CI
Conclusion: failure
##[group]Run uv run pytest tests/unit --cov=src --cov=runner --cov-report term-missing
uv run pytest tests/unit --cov=src --cov=runner --cov-report term-missing
shell: /usr/bin/bash -e {0}
env:
UV_PYTHON: 3.13
VIRTUAL_ENV: /home/runner/work/lightspeed-stack/lightspeed-stack/.venv
UV_CACHE_DIR: /home/runner/work/_temp/setup-uv-cache
##[endgroup]
Uninstalled 1 package in 3ms
Installed 1 package in 3ms
============================= test session starts ==============================
platform linux -- Python 3.13.14, pytest-9.1.1, pluggy-1.6.0
benchmark: 5.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/runner/work/lightspeed-stack/lightspeed-stack
configfile: pyproject.toml
plugins: asyncio-1.4.0, benchmark-5.2.3, anyio-4.14.1, order-1.5.0, mock-3.15.1, cov-7.1.0, logfire-4.37.0
asyncio: mode=Mode.AUTO, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 2925 items
tests/unit/a2a_storage/test_in_memory_context_store.py ........ [ 0%]
tests/unit/a2a_storage/test_sqlite_context_store.py .......... [ 0%]
tests/unit/a2a_storage/test_storage_factory.py ........... [ 0%]
tests/unit/app/endpoints/test_a2a.py .............................. [ 2%]
tests/unit/app/endpoints/test_authorized.py ... [ 2%]
tests/unit/app/endpoints/test_config.py .. [ 2%]
tests/unit/app/endpoints/test_conversations.py ......................... [ 3%]
................. [ 3%]
tests/unit/app/endpoints/test_conversations_v2.py ...................... [ 4%]
............... [ 4%]
tests/unit/app/endpoints/test_feedback.py ....................... [ 5%]
tests/unit/ap...
GitHub Actions: Unit tests / unit_tests (3.13): LCORE-1724: Establish a reliable method for deploying RHEL AI instances in CI
Conclusion: failure
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-02-19T10:06:50.647Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1181
File: tests/e2e-prow/rhoai/manifests/lightspeed/mock-jwks.yaml:32-34
Timestamp: 2026-02-19T10:06:50.647Z
Learning: In the rhoai tests under tests/e2e-prow/rhoai/manifests, avoid static ConfigMap definitions for mock-jwks-script and mcp-mock-server-script since these ConfigMaps are created dynamically by the pipeline.sh deployment script using 'oc create configmap'. Ensure there are no static ConfigMap resources for these names in the manifests. If such ConfigMaps are added in the future, coordinate with the pipeline to reflect dynamic creation or adjust tests to rely on the dynamic provisioning.
Applied to files:
- tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml
- tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
📚 Learning: 2026-05-20T08:09:30.641Z
Learnt from: max-svistunov
Repo: lightspeed-core/lightspeed-stack PR: 1580
File: docs/design/llama-stack-config-merge/poc-results/library-mode/synthesized-run.yaml:107-110
Timestamp: 2026-05-20T08:09:30.641Z
Learning: In Llama-stack config YAMLs, when defining a Llama Guard safety shield entry, set `provider_shield_id` to the *guard model identifier* (e.g., `meta-llama/Llama-Guard-3-8B`). Do not use a chat/generative model id (e.g., `openai/gpt-4o-mini`): a chat-model id (or `native_override`) indicates only an override landed and does **not** mean the safety shield is actually gating queries. Ensure any E2E coverage for the related implementation (JIRA/E2E tests) exercises a real Llama Guard model to verify that the shield is effective.
Applied to files:
- tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml
- tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
- tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml
- tests/e2e/configs/run-rhelai.yaml
🪛 markdownlint-cli2 (0.22.1)
.tekton/integration-tests/README.md
[warning] 38-38: Files should end with a single newline character
(MD047, single-trailing-newline)
🔇 Additional comments (6)
tests/e2e/configs/run-rhelai.yaml (1)
24-32: LGTM!
tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml (1)
21-35: LGTM!
tests/e2e-prow/rhoai/pipeline-konflux.sh (1)
75-79: LGTM! Also applies to: 385-391
tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml (1)
33-38: LGTM!
tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml (1)
146-166: LGTM!
.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml (1)
90-91: 🩺 Stability & Availability
`#!/bin/sh` is fine for this image. `ghcr.io/redhat-developer/mapt:pr-848` is based on UBI 9, so the shell supports the `[[ ... ]]` and `pipefail` usage here.
> Likely an incorrect or invalid review comment.
[[ -n "$QUAY_ROBOT_NAME" ]] && log "✅ QUAY_ROBOT_NAME is set" || { echo "❌ Missing QUAY_ROBOT_NAME"; exit 1; }
[[ -n "$QUAY_ROBOT_PASSWORD" ]] && log "✅ QUAY_ROBOT_PASSWORD is set" || { echo "❌ Missing QUAY_ROBOT_PASSWORD"; exit 1; }
[[ -n "$OPENAI_API_KEY" ]] && log "✅ OPENAI_API_KEY is set" || { echo "❌ Missing OPENAI_API_KEY"; exit 1; }
[[ -n "${OPENAI_API_KEY:-}" ]] && log "✅ OPENAI_API_KEY is set" || { echo "❌ Missing OPENAI_API_KEY"; exit 1; }
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
&& log || { exit 1 } can falsely fail when QUIET=1.
log() returns the status of [ "$QUIET" != "1" ], which is non-zero whenever QUIET=1. In that case the && log "…" arm returns non-zero and the || arm executes, printing ❌ Missing OPENAI_API_KEY and exiting 1 even though the key is set. Use an explicit if to decouple validation from the side-effecting log.
🐛 Proposed fix
-[[ -n "${OPENAI_API_KEY:-}" ]] && log "✅ OPENAI_API_KEY is set" || { echo "❌ Missing OPENAI_API_KEY"; exit 1; }
+if [[ -n "${OPENAI_API_KEY:-}" ]]; then
+ log "✅ OPENAI_API_KEY is set"
+else
+ echo "❌ Missing OPENAI_API_KEY"; exit 1
+fi
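To make the failure mode concrete, here is a small self-contained reproduction. It assumes `log()` behaves as the finding describes (its status is that of the `QUIET` test); the function names `check_chained` and `check_explicit` are illustrative and do not appear in `pipeline-konflux.sh`.

```bash
#!/usr/bin/env bash

# Assumption: log() returns the status of the QUIET test, as described above.
QUIET=1
log() { [ "$QUIET" != "1" ] && echo "$1"; }

# Chained form: the || arm also runs when log returns non-zero.
check_chained() {
  [[ -n "${1:-}" ]] && log "key is set" || { echo "missing"; return 1; }
}

# Explicit form: validation and logging are decoupled.
check_explicit() {
  if [[ -n "${1:-}" ]]; then
    log "key is set" || true   # extra precaution: logging cannot change the result
  else
    echo "missing"; return 1
  fi
}

chained_status=0
check_chained "dummy-key" >/dev/null || chained_status=$?

explicit_status=0
check_explicit "dummy-key" >/dev/null || explicit_status=$?

echo "chained=$chained_status explicit=$explicit_status"
```

With `QUIET=1` and a non-empty key, the chained form reports failure while the explicit form succeeds. The `|| true` after `log` goes slightly beyond the proposed fix; it may matter if the surrounding script runs under `set -e`, since a non-zero `log` inside the `then` branch would otherwise still count as a failing command.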
4dda48f to 66520fa
Note: The SNAPSHOT image is currently missing the |
- Update run-rhelai.yaml: use base_url, VLLM_* env vars, restore comments
- Add lightspeed-stack-rhelai.yaml: LCS config with vllm provider
- Sync examples/vllm-rhelai.yaml with test config
- Parameterize pipeline-konflux.sh for LLAMA_STACK_CONFIG, LCS_CONFIG, VLLM_URL, VLLM_MODEL, VLLM_API_KEY
- Add optional VLLM env vars to pod manifests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml:
- Around line 61-62: The pipeline definition is exposing the vLLM API key
through Tekton params/results, which leaks it into TaskRun/PipelineRun metadata.
Update the lightspeed stack test pipeline to stop using `params.vllm-api-key`
and `results.vllm-api-key`, and switch the relevant tasks/steps to read the key
from a Secret-backed runtime handoff instead. Make the same change anywhere the
key is passed or referenced in the pipeline spec so the `vllm-api-key` stays out
of metadata.
- Around line 108-124: The shell snippets are currently treated as /bin/sh but
rely on Bash-only features, so make them consistent by either converting the
script logic in create-instance and the other script blocks to POSIX sh syntax
or explicitly switching them to Bash with a Bash-capable shebang/image. Update
the create-instance script to remove non-POSIX constructs like pipefail and [[
... ]], and audit the script blocks around the other referenced Tekton steps
that use here-strings so they also run under the chosen shell.
- Around line 150-166: The teardown path is hard-coding a region that may differ
from the one selected by `mapt aws rhel-ai create`, which can leave
non-us-east-1 resources behind. Update the `lightspeed-stack-rhelai-test` flow
so the chosen region from the create loop is persisted and reused by `mapt aws
rhel-ai destroy`, or iterate destroy over the same `REGIONS` set used during
provisioning. Make sure the logic around the `REGION` loop and the destroy step
stays aligned so cleanup targets the actual provisioned region.
In `@examples/vllm-rhelai.yaml`:
- Around line 145-148: The `llama-guard` shield is currently wired to a chat
model instead of a real guard model, so update the `shields` entry in the
`vllm-rhelai` example to use a proper guard-model `provider_shield_id` rather
than `openai/gpt-4o-mini`. Keep the `shield_id`/`provider_id` mapping for
`llama-guard`, but point `provider_shield_id` at an actual Llama Guard model
identifier that can enforce safety checks in the deployed vLLM setup.
- Around line 51-68: The vector store configuration is inconsistent because
vector_stores.default_provider_id still points to faiss while
providers.vector_io is empty. Update the vllm-rhelai.yaml config by either
restoring the matching providers.vector_io entry with provider_id faiss or
removing/changing the default_provider_id so it references an existing vector_io
provider. Keep the vector_stores and providers.vector_io sections aligned so the
default can resolve correctly.
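The region-alignment finding above (teardown must target the region that create actually selected) can be sketched as follows. Everything here is illustrative: the region list, the state file, and the `create_in_region` / `destroy_in_region` functions are hypothetical stand-ins for the pipeline's `REGIONS` loop and the `mapt aws rhel-ai create` / `destroy` calls.

```bash
#!/usr/bin/env bash
set -uo pipefail

# Illustrative region list; the real pipeline defines its own REGIONS.
REGIONS="us-east-1 us-east-2 us-west-2"
state_file="$(mktemp)"

# Hypothetical stand-ins: pretend only us-west-2 has capacity.
create_in_region()  { [ "$1" = "us-west-2" ]; }
destroy_in_region() { echo "destroying in $1"; }

# Provisioning: persist the first region where create succeeds.
for region in $REGIONS; do
  if create_in_region "$region"; then
    printf '%s' "$region" > "$state_file"
    break
  fi
done

# Teardown: reuse the persisted region instead of a hard-coded one.
if [ -s "$state_file" ]; then
  destroy_output="$(destroy_in_region "$(cat "$state_file")")"
else
  destroy_output="nothing to destroy"
fi
echo "$destroy_output"
```

In a Tekton pipeline the state file would more likely be a task result or a workspace file. Unlike the API key, the region is not sensitive, so passing it as a result should be acceptable.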
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5ed83875-41f7-472d-b41f-b303c3ce28da
📒 Files selected for processing (8)
- .tekton/integration-tests/README.md
- .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml
- examples/vllm-rhelai.yaml
- tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml
- tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
- tests/e2e-prow/rhoai/pipeline-konflux.sh
- tests/e2e/configs/run-rhelai.yaml
- tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml
📜 Review details
⏰ Context from checks skipped due to timeout. (7)
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
- GitHub Check: integration_tests (3.12)
- GitHub Check: pydocstyle
- GitHub Check: mypy
- GitHub Check: Pyright
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-02-19T10:06:50.647Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1181
File: tests/e2e-prow/rhoai/manifests/lightspeed/mock-jwks.yaml:32-34
Timestamp: 2026-02-19T10:06:50.647Z
Learning: In the rhoai tests under tests/e2e-prow/rhoai/manifests, avoid static ConfigMap definitions for mock-jwks-script and mcp-mock-server-script since these ConfigMaps are created dynamically by the pipeline.sh deployment script using 'oc create configmap'. Ensure there are no static ConfigMap resources for these names in the manifests. If such ConfigMaps are added in the future, coordinate with the pipeline to reflect dynamic creation or adjust tests to rely on the dynamic provisioning.
Applied to files:
- tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml
- tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
📚 Learning: 2026-05-20T08:09:30.641Z
Learnt from: max-svistunov
Repo: lightspeed-core/lightspeed-stack PR: 1580
File: docs/design/llama-stack-config-merge/poc-results/library-mode/synthesized-run.yaml:107-110
Timestamp: 2026-05-20T08:09:30.641Z
Learning: In Llama-stack config YAMLs, when defining a Llama Guard safety shield entry, set `provider_shield_id` to the *guard model identifier* (e.g., `meta-llama/Llama-Guard-3-8B`). Do not use a chat/generative model id (e.g., `openai/gpt-4o-mini`): a chat-model id (or `native_override`) indicates only an override landed and does **not** mean the safety shield is actually gating queries. Ensure any E2E coverage for the related implementation (JIRA/E2E tests) exercises a real Llama Guard model to verify that the shield is effective.
Applied to files:
- tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml
- tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml
- tests/e2e/configs/run-rhelai.yaml
- tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
- examples/vllm-rhelai.yaml
🪛 markdownlint-cli2 (0.22.1)
.tekton/integration-tests/README.md
[warning] 38-38: Files should end with a single newline character
(MD047, single-trailing-newline)
🔇 Additional comments (14)
.tekton/integration-tests/README.md (1)
19-19: 🎯 Functional Correctness
Clarify the exact source of the per-run S3 prefix.
The README states the prefix uses `<pipelinerun-name>`, but the pipeline script uses `${RUN_ID}`. Please confirm that `RUN_ID` is set to the PipelineRun name (or a unique derivative), and document the exact variable or mechanism used so operators don't assume a different naming scheme.
tests/e2e-prow/rhoai/pipeline-konflux.sh (6)
54-54: 🎯 Functional Correctness
Still using `&& log || { exit 1 }` for OPENAI_API_KEY check.
This is the same pattern flagged in a prior review: `log()` returns the exit status of `[ "$QUIET" != "1" ]`, which is non-zero when `QUIET=1`. In that case the `&&` arm "fails" and the `||` arm runs, printing `❌ Missing OPENAI_API_KEY` and exiting even though the key is present.
🐛 Proposed fix
-[[ -n "${OPENAI_API_KEY:-}" ]] && log "✅ OPENAI_API_KEY is set" || { echo "❌ Missing OPENAI_API_KEY"; exit 1; }
+if [[ -n "${OPENAI_API_KEY:-}" ]]; then
+  log "✅ OPENAI_API_KEY is set"
+else
+  echo "❌ Missing OPENAI_API_KEY"; exit 1
+fi
2-3: LGTM!
68-79: LGTM!
174-183: LGTM!
385-391: LGTM!
54-57: 🎯 Functional Correctness
`OPENAI_API_KEY` is already supplied by the mounted secret. The RHEL AI Tekton task mounts `openai-api-key` at `/var/run/openai`, and `pipeline-konflux.sh` reads `/var/run/openai/openai-api-key` before this guard.
> Likely an incorrect or invalid review comment.
tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml (1)
33-38: LGTM!
tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml (1)
141-166: LGTM!
Marking `OPENAI_API_KEY` optional and adding the vLLM env vars (each `optional: true`) is consistent with supporting mode-switching between OpenAI and vLLM.
.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml (2)
128-145: Spot provisioning failure handling is still missing.
The previous finding still applies: the spot branch does not guard the `mapt aws rhel-ai create` exit status before writing `host` and `vllm-api-key` results.
364-365: `onError: continue` still masks e2e failures.
The previous finding still applies: `finally` already performs cleanup, so this makes failed tests look successful.
tests/e2e/configs/run-rhelai.yaml (1)
25-33: LGTM! Also applies to: 135-138
tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml (1)
1-36: LGTM!
examples/vllm-rhelai.yaml (1)
15-15: LGTM! Also applies to: 25-33, 114-114, 135-144
- name: vllm-api-key
  description: "Random API key for vLLM authentication"
🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy lift
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Map the target file and inspect the referenced ranges.
file='.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml'
echo '== file exists =='
git ls-files "$file"
echo
echo '== line count =='
wc -l "$file"
echo
echo '== relevant lines =='
sed -n '50,80p;175,200p;305,320p;388,402p' "$file" | cat -n
Repository: lightspeed-core/lightspeed-stack
Length of output: 4241
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file='.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml'
echo '== surrounding pipeline blocks =='
sed -n '160,220p;300,410p' "$file" | cat -n
echo
echo '== search for secret/workspace handoff patterns in Tekton pipelines =='
rg -n "secretRef|workspace|Secret|secretName|results\.(.*api-key|.*token)|params.*api-key|VLLM_API_KEY" .tekton -g '*.yaml'
Repository: lightspeed-core/lightspeed-stack
Length of output: 16183
🌐 Web query:
Tekton Task results are stored in TaskRun status and PipelineRun status. Are Tekton results and params appropriate for secret values?
💡 Result:
No, Tekton results and parameters are not appropriate for storing secret values [1][2]. Storing sensitive information as Tekton parameters or results is a security risk because these values are stored as plain text within the TaskRun and PipelineRun Kubernetes resources [1]. Once a value is recorded in the status of a TaskRun or PipelineRun, it becomes visible to anyone with read access to those Kubernetes objects [1][3].
To securely handle secret values in Tekton, use the following established patterns:
1. Kubernetes Secrets: Reference existing Kubernetes Secret objects directly in your Task definitions using env, envFrom, or VolumeMounts [4][5][6][7]. This prevents the secret value from being exposed as a parameter or result. When using env or envFrom, you can map specific keys from a Secret to environment variables available only to the steps that require them [4][6].
2. Scope Access: Avoid mounting sensitive secrets to all steps in a task. Use VolumeMounts to make a secret available only to specific steps that require it [8][7].
3. External Secrets: For more advanced security, use tools like the External Secrets Operator or HashiCorp Vault to inject secrets at runtime [9][10]. This ensures that secrets are not permanently stored in the cluster as Kubernetes Secret objects or defined in plain text within Git repositories [10].
4. Service Account Authorization: Ensure that the ServiceAccount associated with a TaskRun has the minimum necessary RBAC permissions (e.g., restricted to specific secrets via resourceNames) to access only the secrets it requires [9][11].
If you need to pass data between tasks, use Tekton results for non-sensitive data only [2]. If you encounter a scenario where you feel tempted to pass a secret as a result, reconsider your architecture to have the consuming task fetch the required credential directly from a secure source (e.g., Vault or a Kubernetes Secret) using an identity-based approach [9][11][10].
Citations:
- 1: Use secret values as parameters of a task tektoncd/pipeline#3443
- 2: Pipelinerun should be able to read from Secret and Configmap tektoncd/pipeline#8347
- 3: https://tekton.dev/docs/pipelines/pipelineruns/
- 4: https://tekton.dev/docs/pipelines/tasks/
- 5: https://tekton.dev/vault/pipelines-main/tasks/
- 6: https://github.com/tektoncd/pipeline/blob/main/docs/tasks.md
- 7: https://github.com/tektoncd/pipeline/blob/main/docs/auth.md
- 8: https://safeguard.sh/resources/blog/tekton-pipelines-hardening-guide
- 9: https://developer.hashicorp.com/well-architected-framework/secure-systems/secure-applications/ci-cd-secrets/tekton
- 10: https://developers.redhat.com/articles/2025/12/16/manage-credentials-tekton-openshift-ibm-cloud
- 11: https://safeguard.sh/resources/blog/tekton-pipeline-security-guide
Do not pass the vLLM API key through Tekton results or params. It ends up in TaskRun/PipelineRun metadata via results.vllm-api-key and params.vllm-api-key; use a Secret reference or another runtime secret handoff instead. Also applies to 185-187, 313-314, 394-397.
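On the consuming side, a step that receives the key through a mounted Secret rather than a param could read it roughly like this. This is a sketch under stated assumptions: the mount path and the file name `vllm-api-key` are hypothetical, a temporary directory stands in for the Secret volume, and how the generated key gets into a Secret in the first place is outside what this example covers.

```bash
#!/usr/bin/env bash
set -uo pipefail

# read_secret PATH: print the secret value, or fail if the file is
# missing, unreadable, or empty. The value itself is never logged.
read_secret() {
  local path="$1" value
  [ -r "$path" ] || { echo "ERROR: secret file not readable: $path" >&2; return 1; }
  value="$(cat "$path")"
  [ -n "$value" ] || { echo "ERROR: secret file is empty: $path" >&2; return 1; }
  printf '%s' "$value"
}

# A temp dir stands in for a Secret volume mount (hypothetical path and name).
secret_dir="$(mktemp -d)"
printf 'dummy-key-for-demo' > "$secret_dir/vllm-api-key"
chmod 600 "$secret_dir/vllm-api-key"

VLLM_API_KEY="$(read_secret "$secret_dir/vllm-api-key")"

# Report only that the key was loaded, not its contents.
echo "vLLM API key loaded (${#VLLM_API_KEY} characters)"
```

Because the value is read from a file at run time, it does not need to appear in the PipelineRun or TaskRun spec or status.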
#!/bin/sh
set -uo pipefail

BUCKET=$(cat /opt/aws-credentials/S3_BUCKET)
RUN_ID="$(context.pipelineRun.name)"
HF_TOKEN=$(cat /opt/aws-credentials/HUGGING_FACE_HUB_TOKEN)
VLLM_API_KEY=$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')

echo "[mapt] ========== Version Info =========="
echo "[mapt] RHEL AI AMI version: $(params.rhelai-version)"
echo "[mapt] Model: $(params.model)"
echo "[mapt] Run ID: ${RUN_ID}"
echo "[mapt] Spot: $(params.spot)"
echo "[mapt] =================================="

SPOT_ARGS=""
if [[ "$(params.spot)" == "true" ]]; then
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml"
# Show the relevant sections around the reported lines.
sed -n '100,140p;280,300p;400,420p' "$FILE" | cat -n
# Inspect how these script blocks are embedded/executed.
rg -n 'script:|#!/bin/sh|#!/bin/bash|pipefail|\[\[|<<<|/bin/sh|/bin/bash' "$FILE"
Repository: lightspeed-core/lightspeed-stack
Length of output: 6179
🌐 Web query:
Tekton step script shebang default shell /bin/sh bash shebang documentation
💡 Result:
In Tekton, when you use the script field in a Task step, the script is executed within the container image [1][2][3]. If you do not provide a shebang line at the top of your script, Tekton automatically prepends the following default preamble [1][2][3]:
#!/bin/sh
set -e
This default configuration ensures that the script runs using /bin/sh and exits immediately if any command fails [4]. If you want to use a different interpreter or change this behavior, you can override the default by explicitly including a shebang line at the beginning of your script [1][2][3]. For example:
- To use Bash: #!/usr/bin/env bash
- To use Python: #!/usr/bin/env python3
- To use Node: #!/usr/bin/env node
The interpreter you specify must be present within the container image used for that step [1][2][3]. For Windows-based nodes, the behavior differs slightly; you must use a special #!win shebang (e.g., #!win powershell.exe -File or simply #!win to execute a command as a .cmd file) [1][3].
Citations:
- 1: https://github.com/tektoncd/pipeline/blob/main/docs/tasks.md
- 2: https://tekton.dev/vault/pipelines-main/tasks/
- 3: https://tekton.dev/docs/pipelines/tasks/
- 4: Script mode set -xe by default when no shebang used tektoncd/pipeline#1736
🌐 Web query:
Red Hat UBI 9 minimal /bin/sh bash symlink documentation
💡 Result:
In Red Hat Universal Base Image (UBI) 9, including the Minimal variant, the directory /bin is a symbolic link to /usr/bin [1][2]. Consequently, system binaries such as bash are located at /usr/bin/bash rather than /bin/bash [1][3]. Standard UBI 9 container images are configured with /bin/bash as their default entry point (CMD) [4]. Because of the filesystem layout in RHEL 9 and UBI 9, any reference to /bin/bash or /bin/sh effectively resolves to the files residing in /usr/bin [1][2]. Users or automated security scanners looking for shells at the traditional /bin path may fail to detect them if they do not account for the symlink, as the literal path /bin/bash does not exist as an independent file entry in the container image's filesystem layer [1][3].
Citations:
- 1: fix(hardening): remove the shell ubi-micro ships + detect shells at real paths NWarila/ubi9-application-template#2
- 2: NWarila/ubi9-application-template@13a7cca
- 3: NWarila/ubi9-application-template@bb08a2f
- 4: https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/9/html/building_running_and_managing_containers/assembly_types-of-container-images_building-running-and-managing-containers
Make these Tekton scripts POSIX-sh compatible or switch them to Bash.
create-instance is #!/bin/sh but uses pipefail and [[ ... ]]; the script: blocks at 286-289 and 401-411 also run as /bin/sh by default and use <<<. Convert them to POSIX shell or add a Bash shebang with a Bash-capable image.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml around
lines 108 - 124, The shell snippets are currently treated as /bin/sh but rely on
Bash-only features, so make them consistent by either converting the script
logic in create-instance and the other script blocks to POSIX sh syntax or
explicitly switching them to Bash with a Bash-capable shebang/image. Update the
create-instance script to remove non-POSIX constructs like pipefail and [[ ...
]], and audit the script blocks around the other referenced Tekton steps that
use here-strings so they also run under the chosen shell.
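For reference, a minimal sketch of the POSIX-sh route (the other option being a `#!/usr/bin/env bash` shebang on an image that ships Bash). It shows replacements for the three Bash-only constructs named above. `SPOT`, `REGIONS`, `MODE`, and `first_region` are hypothetical stand-ins and are not taken from the pipeline.

```shell
#!/bin/sh
# Sketch only: POSIX replacements for the Bash-only constructs flagged above.
# SPOT, REGIONS, MODE and first_region are hypothetical names.
# 'set -o pipefail' is left out because not every /bin/sh supports it.
set -eu

SPOT="${SPOT:-true}"
REGIONS="${REGIONS:-us-east-1 us-west-2}"

# Bash:  if [[ "$SPOT" == "true" ]]; then ...
# POSIX: single brackets and a single '='.
if [ "$SPOT" = "true" ]; then
  MODE="spot"
else
  MODE="on-demand"
fi

# Bash:  read -r first rest <<< "$1"
# POSIX: feed the value through a here-document instead of a here-string.
first_region() {
  read -r first rest <<EOF
$1
EOF
  printf '%s\n' "$first"
}

echo "mode=${MODE} first=$(first_region "$REGIONS")"
```

Where a pipeline's failure must not be masked, a POSIX script can run the producer separately (for example into a temporary file) and check its exit status before consuming the output.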
for REGION in $REGIONS; do
  echo "[mapt] Trying on-demand in ${REGION}..."
  export AWS_DEFAULT_REGION="$REGION"

  if timeout $TIMEOUT mapt aws rhel-ai create \
    --project-name "mapt-rhel-ai-${RUN_ID}" \
    --backed-url "s3://${BUCKET}/mapt/rhel-ai/${RUN_ID}" \
    --conn-details-output /opt/host-info \
    --compute-sizes "$(params.instance-type)" \
    --version "$(params.rhelai-version)" \
    --auto-start \
    --model "$(params.model)" \
    --hf-token "${HF_TOKEN}" \
    --api-key "${VLLM_API_KEY}" \
    --expose-ports 8000 \
    --vllm-extra-args "--max-model-len 131072 --enable-auto-tool-choice --tool-call-parser llama3_json --chat-template /opt/app-root/template/tool_chat_template_llama3.1_json.jinja" \
    --tags "project=lightspeed-core,environment=konflux-ci"; then
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file=".tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml"
echo "== line count =="
wc -l "$file"
echo "== provisioning snippet (around lines 140-175) =="
sed -n '140,175p' "$file" | cat -n
echo "== teardown snippet (around lines 470-500) =="
sed -n '470,500p' "$file" | cat -n
echo "== search for destroy-rhelai and AWS_DEFAULT_REGION =="
rg -n "destroy-rhelai|AWS_DEFAULT_REGION|REGION|regions|us-east-1" .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml
echo "== search repo-wide for destroy-rhelai references =="
rg -n "destroy-rhelai" .
Repository: lightspeed-core/lightspeed-stack
Length of output: 4396
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== context around spot/on-demand branches =="
sed -n '120,190p' .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml | cat -n
echo "== all mapt aws rhel-ai destroy invocations with context =="
rg -n -C 3 "mapt aws rhel-ai destroy" .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml
echo "== search for MAPT documentation and region handling =="
rg -n -i "AWS_DEFAULT_REGION|backed-url|rhel-ai destroy|mapt aws rhel-ai" . --glob '!**/.git/**'
Repository: lightspeed-core/lightspeed-stack
Length of output: 5227
🌐 Web query:
"mapt aws rhel-ai destroy backed-url AWS_DEFAULT_REGION region"
💡 Result:
The command mapt aws rhel-ai destroy is used within the Multi Architecture Provisioning Tool (mapt) to tear down RHEL AI infrastructure previously provisioned on AWS [1]. When running this command, mapt requires the --backed-url flag to locate the state or metadata associated with the provisioned resource, ensuring the correct environment is destroyed [1]. The --backed-url typically points to a storage location (such as an S3 bucket or local directory) where the project's state was saved during the create operation [1][2][3]. Although the AWS_DEFAULT_REGION environment variable is not an explicit argument of the mapt command itself, it is standard practice to set it in the execution environment (e.g., as an environment variable in a CI/CD pipeline or shell session) to ensure the AWS SDK interacts with the correct region [3].
Example usage:
mapt aws rhel-ai destroy \
  --project-name mapt-rhel-ai-[ID] \
  --backed-url s3://[BUCKET]/mapt/rhel-ai/[ID]
In this command, [ID] refers to a unique identifier for the specific instance or deployment [1]. The tool uses this information to identify and clean up the AWS resources associated with that project [1][4].
Citations:
- 1: adrianriobo/mapt@9090627
- 2: feat(aws): add RHAIIS auto-start and expose-ports flags for RHEL AI provisioning redhat-developer/mapt#835
- 3: https://github.com/redhat-developer/mapt/blob/main/docs/aws/eks.md
- 4: https://github.com/redhat-developer/mapt
🌐 Web query:
"site:github.com/redhat-developer/mapt docs aws eks destroy region backed-url AWS_DEFAULT_REGION"
💡 Result:
To destroy an EKS cluster provisioned by the Red Hat Multi Architecture Provisioning Tool (mapt), you must use the same project name and backend state information that were used during the cluster creation [1]. The state for the resources is stored at the location specified by the --backed-url flag [1]. When running mapt via a container (such as podman or docker), you must ensure that the --backed-url location is correctly mounted as a volume so the tool can access the state files [1]. You should also provide the appropriate AWS credentials and set the AWS_DEFAULT_REGION environment variable to match the region where the cluster was deployed [1][2].
Example command for destroying an EKS cluster:
podman run -d --rm \
  -v ${PWD}:/workspace:z \
  -e AWS_ACCESS_KEY_ID=XXX \
  -e AWS_SECRET_ACCESS_KEY=XXX \
  -e AWS_DEFAULT_REGION=us-east-2 \
  quay.io/redhat-developer/mapt:v1.0.0-dev aws eks destroy \
  --project-name "mapt-eks" \
  --backed-url file:///workspace
Key considerations for the destroy operation:
- Project Name: Must match the name used during creation [1].
- Backed URL: Must point to the directory or bucket containing the Pulumi state files [1][2].
- Region: Ensure AWS_DEFAULT_REGION is set correctly; mapt uses this environment variable if no explicit region flag is provided to the destroy command [2].
- Force Destroy: If the creation process was interrupted and a state lock prevents destruction, you can use the --force-destroy flag to remove the lock and proceed [3]. Note that this is generally supported for remote backends (e.g., S3) [3][2].
Citations:
- 1: https://github.com/redhat-developer/mapt/blob/main/docs/aws/eks.md
- 2: https://github.com/redhat-developer/mapt/blob/v0.14.1/pkg/provider/aws/aws.go
- 3: redhat-developer/mapt@5cdec34
🌐 Web query:
"site:github.com/redhat-developer/mapt \"AWS_DEFAULT_REGION\" \"rhel-ai destroy\""
💡 Result:
In the redhat-developer/mapt project, AWS_DEFAULT_REGION is a required environment variable used to specify the AWS region for operations [1][2]. When executing the mapt aws rhel-ai destroy command, this variable (or the region specified in the stack configuration) is utilized by the AWS provider to identify the correct environment and resources to clean up [1].
Technical usage patterns include:
1. Environment Configuration: Users typically export or pass AWS_DEFAULT_REGION as an environment variable when running mapt via container (e.g., podman run) or in CI/CD pipelines [3][4][2].
2. Provider Logic: The mapt AWS provider checks AWS_DEFAULT_REGION (and AWS_REGION) when determining the hosting region for resources [1]. During a destroy operation, the provider ensures it targets the correct stack by resolving the region, either from the provided stack configuration or the environment [1].
3. Automation: In CI/CD integrations like Tekton, this environment variable is commonly set alongside other AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) to ensure the mapt binary has the necessary context to perform the destroy operation [3].
Example usage for a destroy operation:
podman run -d --rm \
  -v ${PWD}:/workspace:z \
  -e AWS_ACCESS_KEY_ID=XXX \
  -e AWS_SECRET_ACCESS_KEY=XXX \
  -e AWS_DEFAULT_REGION=us-east-1 \
  quay.io/redhat-developer/mapt:latest aws rhel-ai destroy \
  --project-name \
  --backed-url
Citations:
- 1: https://github.com/redhat-developer/mapt/blob/v0.14.1/pkg/provider/aws/aws.go
- 2: https://github.com/redhat-developer/mapt/blob/main/docs/aws/eks.md
- 3: redhat-developer/mapt@9090627
- 4: https://github.com/redhat-developer/mapt/blob/main/docs/aws/openshift-snc.md
Reuse the provisioned region during destroy.
mapt aws rhel-ai destroy uses AWS_DEFAULT_REGION, but final teardown hard-codes us-east-1 while on-demand provisioning can land in another region. Persist the chosen region from create, or destroy across the same region set, so non-us-east-1 instances don't leak.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml around
lines 150 - 166, The teardown path is hard-coding a region that may differ from
the one selected by `mapt aws rhel-ai create`, which can leave non-us-east-1
resources behind. Update the `lightspeed-stack-rhelai-test` flow so the chosen
region from the create loop is persisted and reused by `mapt aws rhel-ai
destroy`, or iterate destroy over the same `REGIONS` set used during
provisioning. Make sure the logic around the `REGION` loop and the destroy step
stays aligned so cleanup targets the actual provisioned region.
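A minimal sketch of the "persist the chosen region" option, under stated assumptions: `REGION_FILE`, `try_create`, `provision`, and `teardown_region` are hypothetical names, and `try_create` is a stand-in for the `mapt aws rhel-ai create` call in the diff (here it pretends that only us-east-2 has capacity). It is not the pipeline's actual code.

```shell
#!/bin/sh
# Sketch only: record the region where provisioning succeeded so that
# teardown can reuse it instead of hard-coding us-east-1.
# REGION_FILE, try_create, provision and teardown_region are hypothetical.
set -eu

REGION_FILE="${REGION_FILE:-/tmp/mapt-region}"
REGIONS="${REGIONS:-us-east-1 us-east-2 us-west-2}"

# Stand-in for the real 'mapt aws rhel-ai create' invocation.
# For illustration it succeeds only in us-east-2.
try_create() {
  [ "$AWS_DEFAULT_REGION" = "us-east-2" ]
}

provision() {
  for REGION in $REGIONS; do
    export AWS_DEFAULT_REGION="$REGION"
    if try_create; then
      # Persist the winning region for the destroy step.
      printf '%s\n' "$REGION" > "$REGION_FILE"
      return 0
    fi
  done
  return 1
}

# Teardown side: print the recorded region, or fail if none was recorded
# so the caller can fall back to iterating over REGIONS.
teardown_region() {
  if [ -s "$REGION_FILE" ]; then
    cat "$REGION_FILE"
  else
    return 1
  fi
}

provision
AWS_DEFAULT_REGION="$(teardown_region)"
export AWS_DEFAULT_REGION
echo "destroy would run in ${AWS_DEFAULT_REGION}"
```

If create and destroy run in different Tekton tasks, the file would have to live on a shared workspace or be passed as a task result; a plain `/tmp` path only works within a single step.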
- provider_id: basic
provider_type: inline::basic
config: {}
- provider_id: llm-as-judge
provider_type: inline::llm-as-judge
config: {}
- provider_id: braintrust
provider_type: inline::braintrust
config:
openai_api_key: '********'
tool_runtime:
- config: {}
- config: {} # Enable the RAG tool
provider_id: rag-runtime
provider_type: inline::rag-runtime
vector_io:
- config:
persistence:
namespace: vector_io::faiss
backend: kv_default
provider_id: faiss
provider_type: inline::faiss
- config: {} # Enable MCP (Model Context Protocol) support
provider_id: model-context-protocol
provider_type: remote::model-context-protocol
vector_io: []
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether the RHEL AI e2e flow relies on llama-stack's own vector_stores API
# (which would need a registered vector_io provider) vs. lightspeed-stack's byok_rag path.
rg -n "default_provider_id|vector_stores|vector_io" tests/e2e/configs/run-rhelai.yaml examples/vllm-rhelai.yaml tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml
Repository: lightspeed-core/lightspeed-stack
Length of output: 690
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the relevant sections in both files.
sed -n '1,220p' examples/vllm-rhelai.yaml | nl -ba | sed -n '1,220p'
printf '\n---\n'
sed -n '1,220p' tests/e2e/configs/run-rhelai.yaml | nl -ba | sed -n '1,220p'
# Check for any faiss/vector_io provider registration elsewhere in the repo.
rg -n "provider_id:\s*faiss|vector_io:|default_provider_id:\s*faiss|vector_stores:" .
Repository: lightspeed-core/lightspeed-stack
Length of output: 210
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== examples/vllm-rhelai.yaml =="
sed -n '1,220p' examples/vllm-rhelai.yaml | cat -n | sed -n '130,170p'
echo
echo "== tests/e2e/configs/run-rhelai.yaml =="
sed -n '1,220p' tests/e2e/configs/run-rhelai.yaml | cat -n | sed -n '130,170p'
echo
echo "== repo search =="
rg -n "provider_id:\s*faiss|default_provider_id:\s*faiss|vector_io:|vector_stores:" examples/vllm-rhelai.yaml tests/e2e/configs/run-rhelai.yaml tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml
Repository: lightspeed-core/lightspeed-stack
Length of output: 2956
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find other examples/docs that explain how vector store defaults are wired.
rg -n "default_provider_id:\s*faiss|default_provider_id:|vector_stores:|vector_io:" \
examples tests docs .github README.md
echo
echo "== files mentioning vector_stores/default_provider_id in configs =="
rg -n "vector_stores:" -g '*.yaml' -g '*.yml' .
Repository: lightspeed-core/lightspeed-stack
Length of output: 7382
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== docs/deployment_guide.md around vector_io =="
sed -n '350,390p' docs/deployment_guide.md | cat -n
echo
sed -n '915,950p' docs/deployment_guide.md | cat -n
echo
echo "== tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml =="
sed -n '1,240p' tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml | cat -n
echo
echo "== examples/run.yaml vector_io/vector_stores section =="
sed -n '130,170p' examples/run.yaml | cat -n
Repository: lightspeed-core/lightspeed-stack
Length of output: 6598
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect tests that describe how vector_io and vector_stores are rendered.
rg -n "vector_stores|vector_io|default_provider_id" tests/unit/test_llama_stack_configuration.py
echo
# Show the surrounding assertions for the faiss cases.
sed -n '240,360p' tests/unit/test_llama_stack_configuration.py | cat -n
echo
sed -n '400,460p' tests/unit/test_llama_stack_configuration.py | cat -n
echo
sed -n '600,640p' tests/unit/test_llama_stack_configuration.py | cat -n
Repository: lightspeed-core/lightspeed-stack
Length of output: 16793
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the implementation of vector store configuration construction.
sed -n '1,240p' tests/unit/test_llama_stack_configuration.py | cat -n
echo
sed -n '240,420p' tests/unit/test_llama_stack_configuration.py | cat -n
echo
sed -n '1,260p' <(python3 - <<'PY'
from pathlib import Path
p = Path('tests/unit/test_llama_stack_configuration.py')
text = p.read_text()
for name in ['construct_vector_stores_section', 'construct_vector_io_providers_section', 'enrich_byok_rag', '_build_vector_io_config']:
    idx = text.find(f'def {name}')
    if idx != -1:
        start = text.rfind('\n', 0, idx)
        print(f'### {name} ###')
        print(text[idx:idx+2200])
        print()
PY
)
Repository: lightspeed-core/lightspeed-stack
Length of output: 18102
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate the implementation file first.
rg -n "def (construct_vector_stores_section|construct_vector_io_providers_section|enrich_byok_rag|_build_vector_io_config)" .
Repository: lightspeed-core/lightspeed-stack
Length of output: 523
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the implementation that consumes vector_stores and vector_io.
sed -n '180,460p' src/llama_stack_configuration.py | cat -n
echo
# Inspect the config model definitions around these fields.
rg -n "vector_stores|vector_io" src models -g '*.py'
Repository: lightspeed-core/lightspeed-stack
Length of output: 20510
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the config schema for vector_stores and providers.vector_io.
sed -n '1180,1265p' src/models/config.py | cat -n
echo
sed -n '1265,1345p' src/models/config.py | cat -n
echo
# Show how the run example wires vector_io and vector_stores in one complete config.
sed -n '1,220p' examples/run.yaml | cat -n
Repository: lightspeed-core/lightspeed-stack
Length of output: 12786
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show the vector store config model and how the API validates/uses default_provider_id.
rg -n "default_provider_id|class .*VectorStore|vector_stores" src/models/config.py src/app/endpoints/vector_stores.py src -g '*.py'
echo
sed -n '1450,1565p' src/models/config.py | cat -n
echo
sed -n '150,260p' src/app/endpoints/vector_stores.py | cat -n
Repository: lightspeed-core/lightspeed-stack
Length of output: 16595
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find the exact vector store config model definition and any validation around default_provider_id.
rg -n "class .*Vector.*Store|default_provider_id|default_embedding_model" src/models/config.py src -g '*.py'
echo
# Show the surrounding model definition.
sed -n '1340,1465p' src/models/config.py | cat -n
Repository: lightspeed-core/lightspeed-stack
Length of output: 6962
🌐 Web query:
Llama Stack vector_stores default_provider_id providers.vector_io configuration faiss
💡 Result:
In Llama Stack, the configuration for vector stores involves defining the vector_io providers and setting default behavior within the stack's distribution configuration (run.yaml). The FAISS provider is an inline vector database used for high-performance, in-process similarity search [1][2].
Configuration Overview
To use FAISS as your vector store, you must define it within the vector_io section of your providers configuration [3][4]. A standard configuration for the FAISS provider includes the provider_id, provider_type (inline::faiss), and the persistence settings [5][3][4].
Example configuration:
providers:
  vector_io:
  - provider_id: faiss
    provider_type: inline::faiss
    config:
      persistence:
        namespace: vector_io::faiss
        backend: kv_default
Stack-Level Defaults
Llama Stack supports configuring a default vector store provider and a default embedding model to ensure deterministic behavior when creating vector stores [6][7]. This is defined at the stack level, typically within the vector_stores section of your configuration [5][8][3].
Example of setting defaults:
vector_stores:
  default_provider_id: faiss
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5
Key Considerations
1. Precedence Logic: If an API call explicitly specifies an embedding model, it takes precedence [6][7]. If not, the system uses the global default defined in vector_stores [6][7].
2. Limitations: FAISS is primarily in-memory and does not support keyword or hybrid search [9][10]. It is optimized for scenarios where fast, local retrieval is required and is generally less suited for frequent incremental updates compared to disk-based alternatives like sqlite-vec [9].
3. Registration: While run.yaml defines the providers, there have been known issues regarding the registration of specific vector store resources from configuration [11]. In many current workflows, vector stores are registered dynamically via the Llama Stack client API [11][12].
Citations:
- 1: https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/3.3/html/working_with_llama_stack/llama-stack-adv-examples_rag
- 2: cdoern/llama-stack@19ae4b3
- 3: bug: llama stack not erroring out when creating vector store with unknown embedding model ogx-ai/ogx#4047
- 4: ogx-ai/ogx@725a0c3
- 5: https://github.com/lightspeed-core/lightspeed-stack/blob/main/run.yaml
- 6: feat(vector-io): implement global default embedding model configuration ogx-ai/ogx#2918
- 7: Make embedding model configurable for vector-io providers ogx-ai/ogx#2729
- 8: ogx-ai/ogx@48581bf
- 9: Document and benchmark performance tradeoffs between sqlite-vec and FAISS inline VectorDB providers ogx-ai/ogx#1165
- 10: Implement v1/vector_stores/{vector_store_id}/search for Faiss ogx-ai/ogx#3012
- 11: Config-based vector store registration not implemented ogx-ai/ogx#4613
- 12: https://milvus.io/docs/llama_stack_with_milvus.md
Remove the dangling faiss default or restore the matching vector_io provider.
providers.vector_io is empty here, but vector_stores.default_provider_id: faiss still points at a provider that no longer exists. FAISS-backed configs register provider_id: faiss under providers.vector_io; without that entry, any vector-store creation that relies on the default can fail. Either re-add the FAISS provider or drop this default if vector stores are no longer part of this config.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/vllm-rhelai.yaml` around lines 51 - 68, The vector store
configuration is inconsistent because vector_stores.default_provider_id still
points to faiss while providers.vector_io is empty. Update the vllm-rhelai.yaml
config by either restoring the matching providers.vector_io entry with
provider_id faiss or removing/changing the default_provider_id so it references
an existing vector_io provider. Keep the vector_stores and providers.vector_io
sections aligned so the default can resolve correctly.
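A coarse lint for this kind of mismatch can be sketched in shell. This is an illustration only: `check_default_provider` is a hypothetical helper, the sample file is made up for the demo, and the check matches text rather than YAML structure, so it confirms that some `provider_id` with the default's name exists in the file but not that it sits under `providers.vector_io`. It also assumes unquoted values with no trailing comments.

```shell
#!/bin/sh
# Sketch only: verify that vector_stores.default_provider_id names a
# provider_id declared somewhere in the same file. Text matching, not a
# YAML parser; check_default_provider is a hypothetical helper.
set -eu

check_default_provider() {
  file=$1
  default=$(sed -n 's/^[[:space:]]*default_provider_id:[[:space:]]*//p' "$file" | head -n 1)
  # No default configured: nothing to verify.
  [ -n "$default" ] || return 0
  # Drop the default_provider_id line itself, then look for a provider
  # entry whose id equals the default.
  grep -v 'default_provider_id' "$file" |
    grep -Eq "provider_id:[[:space:]]*${default}[[:space:]]*\$"
}

# Made-up sample mirroring the situation described above:
# an empty vector_io list next to a faiss default.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
providers:
  vector_io: []
vector_stores:
  default_provider_id: faiss
EOF

if check_default_provider "$tmp"; then
  echo "default provider resolves"
else
  echo "dangling default_provider_id"
fi
```

A structural check with a YAML parser would be more reliable; the grep version is only meant as a quick pre-merge sanity test.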
shields:
- shield_id: llama-guard
provider_id: llama-guard
provider_shield_id: openai/gpt-4o-mini
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Llama Guard shield backed by a chat model, not a guard model.
provider_shield_id: openai/gpt-4o-mini uses a generative chat model id for the llama-guard safety shield. Based on learnings, this pattern only indicates a landed override and does not mean the shield actually gates queries — Llama Guard needs a real guard model (e.g. meta-llama/Llama-Guard-3-8B) to function as intended, otherwise e2e coverage gives false confidence that content safety is enforced.
🛡️ Suggested fix (pending guard-model availability in the RHEL AI vLLM deployment)
shields:
- shield_id: llama-guard
provider_id: llama-guard
- provider_shield_id: openai/gpt-4o-mini
+ provider_shield_id: meta-llama/Llama-Guard-3-8B
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
shields:
- shield_id: llama-guard
provider_id: llama-guard
provider_shield_id: openai/gpt-4o-mini

shields:
- shield_id: llama-guard
provider_id: llama-guard
provider_shield_id: meta-llama/Llama-Guard-3-8B
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/vllm-rhelai.yaml` around lines 145 - 148, The `llama-guard` shield
is currently wired to a chat model instead of a real guard model, so update the
`shields` entry in the `vllm-rhelai` example to use a proper guard-model
`provider_shield_id` rather than `openai/gpt-4o-mini`. Keep the
`shield_id`/`provider_id` mapping for `llama-guard`, but point
`provider_shield_id` at an actual Llama Guard model identifier that can enforce
safety checks in the deployed vLLM setup.
Source: Learnings
Description
Add a Konflux Tekton pipeline for running the full e2e test suite against RHEL AI instances provisioned on AWS. The pipeline uses MAPT to provision a GPU instance with vLLM (RHAIIS) auto-started, then deploys and tests lightspeed-stack as in the existing Konflux integration tests but configured to use the RHEL AI vLLM as its inference provider.
The pipeline provisions instances with 96GB+ total VRAM (4x GPU) because the e2e tests require a 131072-token context window — some test requests exceed 65K tokens and fail with smaller context. Single-GPU instances (24GB) cannot fit both the model weights and the required KV cache.
Key features:
pipeline-konflux.sh to support both OpenAI and vLLM inference providers
New/modified files:
- tests/e2e/configs/run-rhelai.yaml: Llama Stack config with remote::vllm provider
- tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml: LCS config with vllm as default provider
- tests/e2e-prow/rhoai/pipeline-konflux.sh: parameterized for VLLM_URL, VLLM_MODEL, VLLM_API_KEY
- tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml: optional vLLM env vars
- tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml: optional VLLM_MODEL env var
- .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml: full pipeline
- .tekton/integration-tests/README.md: documentation
Type of change
Tools used to create PR
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit