
LCORE-1724: Establish a reliable method for deploying RHEL AI instances in CI #2028

Merged
are-ces merged 1 commit into lightspeed-core:main from are-ces:rhelai-konflux
Jul 1, 2026

Conversation

@are-ces

@are-ces are-ces commented Jun 30, 2026

Contributor

Description

Add a Konflux Tekton pipeline for running the full e2e test suite against RHEL AI instances provisioned on AWS. The pipeline uses MAPT to provision a GPU instance with vLLM (RHAIIS) auto-started, then deploys and tests lightspeed-stack as in the existing Konflux integration tests but configured to use the RHEL AI vLLM as its inference provider.

The pipeline provisions instances with 96GB+ total VRAM (4x GPU) because the e2e tests require a 131072-token context window — some test requests exceed 65K tokens and fail with smaller context. Single-GPU instances (24GB) cannot fit both the model weights and the required KV cache.
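The sizing argument above can be checked with rough arithmetic. The sketch below assumes the served model is Llama-3.1-8B (the model named in the Testing section) with 32 layers, 8 KV heads, a head dimension of 128, and 16-bit weights and cache. These numbers are assumptions that do not appear in the PR, and the estimate ignores activation memory and vLLM runtime overhead.

```bash
#!/usr/bin/env bash
# Rough KV-cache sizing. All model numbers are assumptions, not values from the PR.
set -eu

# Bytes of KV cache per token: 2 (K and V) * layers * kv_heads * head_dim * bytes_per_value
kv_bytes_per_token() {
  echo $(( 2 * $1 * $2 * $3 * $4 ))
}

# KV cache in GiB for one sequence: bytes_per_token * context_length / 2^30
kv_gib_for_context() {
  echo $(( $1 * $2 / 1024 / 1024 / 1024 ))
}

per_token=$(kv_bytes_per_token 32 8 128 2)        # 131072 bytes (128 KiB)
kv_gib=$(kv_gib_for_context "$per_token" 131072)  # 16 GiB
weights_gib=15                                    # assumed: ~8e9 params * 2 bytes

echo "KV cache per token: ${per_token} bytes"
echo "KV cache for one 131072-token sequence: ${kv_gib} GiB"
echo "Weights plus one full-length sequence: $(( weights_gib + kv_gib )) GiB (one GPU has 24 GiB)"
```

Under these assumptions, a single full-length sequence plus the weights needs roughly 31 GiB, which is consistent with the statement that a 24GB GPU is not enough.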

Key features:

  • RHEL AI provisioning via MAPT with auto-start, tool calling, and configurable context window
  • Spot/on-demand toggle with multi-instance-type fallback (g5.12xlarge, g6.12xlarge, g5.24xlarge, g6.24xlarge)
  • On-demand mode retries across 6 AWS regions with 10-minute timeout per attempt
  • Per-run S3 state isolation using PipelineRun name (no concurrent run conflicts)
  • Random API key per run for vLLM authentication
  • Parameterized pipeline-konflux.sh to support both OpenAI and vLLM inference providers
  • Integration tests README documenting MAPT, S3 bucket, provisioning modes, and AMI versioning
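The on-demand retry behaviour listed above (several regions, a fixed timeout per attempt) can be sketched as a small loop. This is not the pipeline's actual script: `try_regions`, the `REGION` variable, and the region names are hypothetical, and the provisioning command is passed in by the caller rather than spelled out. It assumes GNU coreutils `timeout` is available.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of region fallback with a per-attempt timeout.
set -eu

# try_regions <timeout_seconds> "<space-separated regions>" <command> [args...]
# Runs the command once per region with REGION set in its environment.
# Prints the first region that succeeds; returns 1 if all fail or time out.
try_regions() {
  _secs=$1
  _regions=$2
  shift 2
  for _region in $_regions; do
    # Command output goes to stderr so stdout carries only the chosen region.
    if REGION="$_region" timeout "$_secs" "$@" >&2; then
      echo "$_region"
      return 0
    fi
    echo "provisioning failed or timed out in ${_region}; trying next region" >&2
  done
  return 1
}

# Demo with a stand-in command that only succeeds in us-west-2.
try_regions 5 "us-east-1 us-west-2" sh -c '[ "$REGION" = "us-west-2" ]'
```

In a real step, the stand-in command would be replaced by the provisioning call and the timeout by 600 seconds to match the 10-minute limit described above.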

New/modified files:

  • tests/e2e/configs/run-rhelai.yaml — Llama Stack config with remote::vllm provider
  • tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml — LCS config with vllm as default provider
  • tests/e2e-prow/rhoai/pipeline-konflux.sh — parameterized for VLLM_URL, VLLM_MODEL, VLLM_API_KEY
  • tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml — optional vLLM env vars
  • tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml — optional VLLM_MODEL env var
  • .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml — full pipeline
  • .tekton/integration-tests/README.md — documentation

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

  • Assisted-by: Claude Opus 4.6
  • Generated by: N/A

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  1. Pipeline tested end-to-end in Konflux with RHEL AI 3.4.0 GA on g5.12xlarge
  2. 270/276 e2e scenarios pass (6 failures due to model behavior differences between Llama-3.1-8B and gpt-4o-mini)
  3. Spot and on-demand provisioning validated locally and in Konflux
  4. Per-run S3 isolation verified with concurrent pipeline runs

Summary by CodeRabbit

  • New Features
    • Added Konflux integration test documentation covering OpenAI and RHEL AI/vLLM E2E variants, including AWS provisioning flow.
    • Introduced a new E2E pipeline for provisioning RHEL AI/vLLM, deploying an ephemeral hypershift cluster, and running the full Konflux suite.
    • Added vLLM-driven configuration for Lightspeed Core Service server-mode with RAG and external llama-stack connectivity.
  • Bug Fixes
    • Improved flexibility by making relevant secret/env keys optional for more environments.
    • Enhanced cleanup to reduce leftover cloud resources after failed or interrupted runs.

@coderabbitai

coderabbitai Bot commented Jun 30, 2026

Contributor

Review Change Stack

Warning

Review limit reached

@are-ces, you've reached your PR review limit, so we couldn't start this review.


Review details
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: d88ebb40-e530-44c4-9870-7a14cfed70ef

📥 Commits

Reviewing files that changed from the base of the PR and between 66520fa and a6a366f.

📒 Files selected for processing (8)
  • .tekton/integration-tests/README.md
  • .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml
  • examples/vllm-rhelai.yaml
  • tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml
  • tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
  • tests/e2e-prow/rhoai/pipeline-konflux.sh
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml

Walkthrough

Adds a new Konflux Tekton pipeline for RHEL AI/vLLM E2E tests, along with supporting script, manifest, config, and documentation updates to wire vLLM-specific secrets, models, and runtime settings through the test flow.

Changes

RHEL AI Konflux Integration Pipeline

| Layer | File(s) | Summary |
| --- | --- | --- |
| RHEL AI configs | tests/e2e/configs/run-rhelai.yaml, tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml | Updates the RHEL AI run config to use VLLM_URL, VLLM_API_KEY, and VLLM_MODEL, and adds a new server-mode config with vLLM inference defaults and RAG wiring. |
| Example configuration | examples/vllm-rhelai.yaml | Updates the example vLLM configuration for the new provider wiring, registrations, storage path, and embedding defaults. |
| Pipeline script and manifests | tests/e2e-prow/rhoai/pipeline-konflux.sh, tests/e2e-prow/rhoai/manifests/lightspeed/* | Updates the Konflux E2E script and pod manifests to create optional vLLM secrets, use configurable config paths, and switch defaults when VLLM_URL is set. |
| Tekton pipeline | .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml | Defines the new Tekton pipeline for RHEL AI provisioning, cluster setup, image extraction, E2E execution, and MAPT cleanup. |
| Integration tests README | .tekton/integration-tests/README.md | Adds documentation for the Konflux integration-test pipelines and the RHEL AI provisioning flow. |

Estimated code review effort: 4 (Complex) | ~60 minutes


Possibly related PRs

Suggested labels: Review effort 2/5

Suggested reviewers: tisnik, radofuchs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding a reliable CI path to provision RHEL AI instances. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml:
- Around line 395-399: The lightspeed-stack repo/revision selection in the
Tekton test script is being overwritten by a hardcoded fork and branch, so the
pipeline ignores the SNAPSHOT-derived values. Remove the unconditional REPO_URL
and REPO_REV reassignment in the lightspeed-stack test step and keep using the
values parsed from SNAPSHOT in that block, leaving any temporary override behind
the existing TODO only if it is explicitly gated for local use. Reference the
REPO_URL and REPO_REV assignments in the lightspeed-stack pipeline step when
updating this logic.
- Around line 112-171: In the spot provisioning path of the Tekton step, the
exit status of `mapt aws rhel-ai create` is not checked, so failures can fall
through and emit empty results. Update the spot branch in the shell block to
guard the `mapt aws rhel-ai create` call the same way the on-demand path uses
`CREATED`, and fail fast with a clear error if creation does not succeed. Keep
the fix localized around the existing `if [[ "$(params.spot)" == "true" ]]`
branch and the subsequent result-writing commands so `host` and `vllm-api-key`
are only written after a successful create.
- Around line 348-349: Remove the onError: continue setting from the
run-e2e-tests task so failures are not masked when PIPELINE_EXIT is non-zero.
Update the task definition in lightspeed-stack-rhelai-test.yaml for
run-e2e-tests, and keep destroy-rhelai in finally as the cleanup path so the
pipeline correctly fails on e2e errors.

In `@tests/e2e-prow/rhoai/pipeline-konflux.sh`:
- Line 54: The OPENAI_API_KEY check in pipeline-konflux.sh is incorrectly tied
to log()’s return value, so `QUIET=1` can trigger the failure path even when the
key exists. Update the validation near the OPENAI_API_KEY guard to use an
explicit conditional instead of `&& ... || ...`, and keep the existence check
separate from the `log` side effect so `log()` cannot influence the exit
behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: aa3c3a80-cd0a-4ff3-8b89-20dfd359c0dd

📥 Commits

Reviewing files that changed from the base of the PR and between 8efa018 and cb7ad02.

📒 Files selected for processing (7)
  • .tekton/integration-tests/README.md
  • .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml
  • tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml
  • tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
  • tests/e2e-prow/rhoai/pipeline-konflux.sh
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml
📜 Review details
⏰ Context from checks skipped due to timeout. (12)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
  • GitHub Check: integration_tests (3.12)
  • GitHub Check: integration_tests (3.13)
⚠️ CI failures not shown inline (4)

GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt: LCORE-1724: Establish a reliable method for deploying RHEL AI instances in CI

Conclusion: failure

View job details

##[group]Run set -euo pipefail
 set -euo pipefail
 uv run python scripts/generate_openapi_schema.py /tmp/openapi-generated.json
 if ! diff -u docs/openapi.json /tmp/openapi-generated.json; then
   echo "::error::docs/openapi.json is out of date. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json"


GitHub Actions: Unit tests / 1_unit_tests (3.13).txt: LCORE-1724: Establish a reliable method for deploying RHEL AI instances in CI

Conclusion: failure

View job details

##[group]Run uv run pytest tests/unit --cov=src --cov=runner --cov-report term-missing
 uv run pytest tests/unit --cov=src --cov=runner --cov-report term-missing
 shell: /usr/bin/bash -e {0}
 env:
   UV_PYTHON: 3.13
   VIRTUAL_ENV: /home/runner/work/lightspeed-stack/lightspeed-stack/.venv
   UV_CACHE_DIR: /home/runner/work/_temp/setup-uv-cache
 ##[endgroup]
 Uninstalled 1 package in 3ms
 Installed 1 package in 3ms
 ============================= test session starts ==============================
 platform linux -- Python 3.13.14, pytest-9.1.1, pluggy-1.6.0
 benchmark: 5.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
 rootdir: /home/runner/work/lightspeed-stack/lightspeed-stack
 configfile: pyproject.toml
 plugins: asyncio-1.4.0, benchmark-5.2.3, anyio-4.14.1, order-1.5.0, mock-3.15.1, cov-7.1.0, logfire-4.37.0
 asyncio: mode=Mode.AUTO, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
 collected 2925 items
 tests/unit/a2a_storage/test_in_memory_context_store.py ........          [  0%]
 tests/unit/a2a_storage/test_sqlite_context_store.py ..........           [  0%]
 tests/unit/a2a_storage/test_storage_factory.py ...........               [  0%]
 tests/unit/app/endpoints/test_a2a.py ..............................      [  2%]
 tests/unit/app/endpoints/test_authorized.py ...                          [  2%]
 tests/unit/app/endpoints/test_config.py ..                               [  2%]
 tests/unit/app/endpoints/test_conversations.py ......................... [  3%]
 .................                                                        [  3%]
 tests/unit/app/endpoints/test_conversations_v2.py ...................... [  4%]
 ...............                                                          [  4%]
 tests/unit/app/endpoints/test_feedback.py .......................        [  5%]
 tests/unit/ap...

🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-02-19T10:06:50.647Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1181
File: tests/e2e-prow/rhoai/manifests/lightspeed/mock-jwks.yaml:32-34
Timestamp: 2026-02-19T10:06:50.647Z
Learning: In the rhoai tests under tests/e2e-prow/rhoai/manifests, avoid static ConfigMap definitions for mock-jwks-script and mcp-mock-server-script since these ConfigMaps are created dynamically by the pipeline.sh deployment script using 'oc create configmap'. Ensure there are no static ConfigMap resources for these names in the manifests. If such ConfigMaps are added in the future, coordinate with the pipeline to reflect dynamic creation or adjust tests to rely on the dynamic provisioning.

Applied to files:

  • tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml
  • tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
📚 Learning: 2026-05-20T08:09:30.641Z
Learnt from: max-svistunov
Repo: lightspeed-core/lightspeed-stack PR: 1580
File: docs/design/llama-stack-config-merge/poc-results/library-mode/synthesized-run.yaml:107-110
Timestamp: 2026-05-20T08:09:30.641Z
Learning: In Llama-stack config YAMLs, when defining a Llama Guard safety shield entry, set `provider_shield_id` to the *guard model identifier* (e.g., `meta-llama/Llama-Guard-3-8B`). Do not use a chat/generative model id (e.g., `openai/gpt-4o-mini`): a chat-model id (or `native_override`) indicates only an override landed and does **not** mean the safety shield is actually gating queries. Ensure any E2E coverage for the related implementation (JIRA/E2E tests) exercises a real Llama Guard model to verify that the shield is effective.

Applied to files:

  • tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml
  • tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml
  • tests/e2e/configs/run-rhelai.yaml
🪛 markdownlint-cli2 (0.22.1)
.tekton/integration-tests/README.md

[warning] 38-38: Files should end with a single newline character

(MD047, single-trailing-newline)

🔇 Additional comments (6)
tests/e2e/configs/run-rhelai.yaml (1)

24-32: LGTM!

tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml (1)

21-35: LGTM!

tests/e2e-prow/rhoai/pipeline-konflux.sh (1)

75-79: LGTM!

Also applies to: 385-391

tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml (1)

33-38: LGTM!

tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml (1)

146-166: LGTM!

.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml (1)

90-91: 🩺 Stability & Availability

#!/bin/sh is fine for this image. ghcr.io/redhat-developer/mapt:pr-848 is based on UBI 9, so the shell supports the [[ ... ]] and pipefail usage here.

			> Likely an incorrect or invalid review comment.

Comment thread .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml Outdated
[[ -n "$QUAY_ROBOT_NAME" ]] && log "✅ QUAY_ROBOT_NAME is set" || { echo "❌ Missing QUAY_ROBOT_NAME"; exit 1; }
[[ -n "$QUAY_ROBOT_PASSWORD" ]] && log "✅ QUAY_ROBOT_PASSWORD is set" || { echo "❌ Missing QUAY_ROBOT_PASSWORD"; exit 1; }
-[[ -n "$OPENAI_API_KEY" ]] && log "✅ OPENAI_API_KEY is set" || { echo "❌ Missing OPENAI_API_KEY"; exit 1; }
+[[ -n "${OPENAI_API_KEY:-}" ]] && log "✅ OPENAI_API_KEY is set" || { echo "❌ Missing OPENAI_API_KEY"; exit 1; }


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

&& log || { exit 1 } can falsely fail when QUIET=1.

log() returns the status of [ "$QUIET" != "1" ], which is non-zero whenever QUIET=1. In that case the && log "…" arm returns non-zero and the || arm executes, printing ❌ Missing OPENAI_API_KEY and exiting 1 even though the key is set. Use an explicit if to decouple validation from the side-effecting log.

🐛 Proposed fix
-[[ -n "${OPENAI_API_KEY:-}" ]] && log "✅ OPENAI_API_KEY is set" || { echo "❌ Missing OPENAI_API_KEY"; exit 1; }
+if [[ -n "${OPENAI_API_KEY:-}" ]]; then
+  log "✅ OPENAI_API_KEY is set"
+else
+  echo "❌ Missing OPENAI_API_KEY"; exit 1
+fi
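The failure mode described above can be reproduced in isolation. In this sketch `log()` is reconstructed from the review's description (it returns the status of the `QUIET` test); the real helper in pipeline-konflux.sh may differ. `check_fragile` and `check_explicit` are hypothetical names, and the `|| true` after `log` is an added assumption so that a quiet `log` cannot become the function's return status.

```bash
#!/usr/bin/env bash
# Minimal reproduction of the "A && B || C" pitfall with a quiet log helper.
set -eu

# Assumed shape of log(): returns non-zero whenever QUIET=1.
log() { [ "$QUIET" != "1" ] && echo "$@"; }

# Fragile form: the || branch also runs when log itself returns non-zero.
check_fragile() {
  [ -n "${1:-}" ] && log "set" || { echo "missing"; return 1; }
}

# Explicit form: only the emptiness test decides the outcome.
check_explicit() {
  if [ -n "${1:-}" ]; then
    log "set" || true
  else
    echo "missing"
    return 1
  fi
}

QUIET=1
check_fragile "some-key" || echo "fragile check failed although the key is set"
check_explicit "some-key" && echo "explicit check passed"
```

With `QUIET=1`, the fragile form prints `missing` and fails even though a value was supplied, while the explicit form passes.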
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/e2e-prow/rhoai/pipeline-konflux.sh` at line 54, The OPENAI_API_KEY
check in pipeline-konflux.sh is incorrectly tied to log()’s return value, so
`QUIET=1` can trigger the failure path even when the key exists. Update the
validation near the OPENAI_API_KEY guard to use an explicit conditional instead
of `&& ... || ...`, and keep the existence check separate from the `log` side
effect so `log()` cannot influence the exit behavior.

@are-ces are-ces marked this pull request as draft June 30, 2026 10:38
@are-ces are-ces force-pushed the rhelai-konflux branch 2 times, most recently from 4dda48f to 66520fa July 1, 2026 06:57
@are-ces

are-ces commented Jul 1, 2026

Contributor Author

Note: The SNAPSHOT image is currently missing the pydantic_ai package, so the RHEL AI e2e tests cannot run against the SNAPSHOT-built lightspeed-stack image. With the 0.6 release this should be resolved and the pipeline will start working properly with the SNAPSHOT image.

@are-ces are-ces marked this pull request as ready for review July 1, 2026 07:02
@are-ces are-ces requested a review from tisnik July 1, 2026 07:02
- Update run-rhelai.yaml: use base_url, VLLM_* env vars, restore comments
- Add lightspeed-stack-rhelai.yaml: LCS config with vllm provider
- Sync examples/vllm-rhelai.yaml with test config
- Parameterize pipeline-konflux.sh for LLAMA_STACK_CONFIG, LCS_CONFIG,
  VLLM_URL, VLLM_MODEL, VLLM_API_KEY
- Add optional VLLM env vars to pod manifests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml:
- Around line 61-62: The pipeline definition is exposing the vLLM API key
through Tekton params/results, which leaks it into TaskRun/PipelineRun metadata.
Update the lightspeed stack test pipeline to stop using `params.vllm-api-key`
and `results.vllm-api-key`, and switch the relevant tasks/steps to read the key
from a Secret-backed runtime handoff instead. Make the same change anywhere the
key is passed or referenced in the pipeline spec so the `vllm-api-key` stays out
of metadata.
- Around line 108-124: The shell snippets are currently treated as /bin/sh but
rely on Bash-only features, so make them consistent by either converting the
script logic in create-instance and the other script blocks to POSIX sh syntax
or explicitly switching them to Bash with a Bash-capable shebang/image. Update
the create-instance script to remove non-POSIX constructs like pipefail and [[
... ]], and audit the script blocks around the other referenced Tekton steps
that use here-strings so they also run under the chosen shell.
- Around line 150-166: The teardown path is hard-coding a region that may differ
from the one selected by `mapt aws rhel-ai create`, which can leave
non-us-east-1 resources behind. Update the `lightspeed-stack-rhelai-test` flow
so the chosen region from the create loop is persisted and reused by `mapt aws
rhel-ai destroy`, or iterate destroy over the same `REGIONS` set used during
provisioning. Make sure the logic around the `REGION` loop and the destroy step
stays aligned so cleanup targets the actual provisioned region.
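One way to keep create and destroy aligned, as the teardown finding above asks, is to record the region that provisioning actually used and read it back during cleanup. The helpers below are hypothetical and use a plain file as the handoff; in the Tekton pipeline the same value could instead travel as a task result. The `us-east-1` fallback mirrors the region named in the finding.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: persist the provisioned region so teardown can reuse it.
set -eu

# save_region <state_file> <region>: record the region that create succeeded in.
save_region() {
  printf '%s' "$2" > "$1"
}

# load_region <state_file> <fallback>: print the recorded region,
# or the fallback only when nothing was recorded.
load_region() {
  if [ -s "$1" ]; then
    cat "$1"
  else
    echo "$2"
  fi
}

state_file=$(mktemp)
save_region "$state_file" "eu-west-1"
echo "destroy would target: $(load_region "$state_file" us-east-1)"
rm -f "$state_file"
```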

In `@examples/vllm-rhelai.yaml`:
- Around line 145-148: The `llama-guard` shield is currently wired to a chat
model instead of a real guard model, so update the `shields` entry in the
`vllm-rhelai` example to use a proper guard-model `provider_shield_id` rather
than `openai/gpt-4o-mini`. Keep the `shield_id`/`provider_id` mapping for
`llama-guard`, but point `provider_shield_id` at an actual Llama Guard model
identifier that can enforce safety checks in the deployed vLLM setup.
- Around line 51-68: The vector store configuration is inconsistent because
vector_stores.default_provider_id still points to faiss while
providers.vector_io is empty. Update the vllm-rhelai.yaml config by either
restoring the matching providers.vector_io entry with provider_id faiss or
removing/changing the default_provider_id so it references an existing vector_io
provider. Keep the vector_stores and providers.vector_io sections aligned so the
default can resolve correctly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5ed83875-41f7-472d-b41f-b303c3ce28da

📥 Commits

Reviewing files that changed from the base of the PR and between cb7ad02 and 66520fa.

📒 Files selected for processing (8)
  • .tekton/integration-tests/README.md
  • .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml
  • examples/vllm-rhelai.yaml
  • tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml
  • tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
  • tests/e2e-prow/rhoai/pipeline-konflux.sh
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml
📜 Review details
⏰ Context from checks skipped due to timeout. (7)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-0-6-on-pull-request
  • GitHub Check: integration_tests (3.12)
  • GitHub Check: pydocstyle
  • GitHub Check: mypy
  • GitHub Check: Pyright
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-02-19T10:06:50.647Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1181
File: tests/e2e-prow/rhoai/manifests/lightspeed/mock-jwks.yaml:32-34
Timestamp: 2026-02-19T10:06:50.647Z
Learning: In the rhoai tests under tests/e2e-prow/rhoai/manifests, avoid static ConfigMap definitions for mock-jwks-script and mcp-mock-server-script since these ConfigMaps are created dynamically by the pipeline.sh deployment script using 'oc create configmap'. Ensure there are no static ConfigMap resources for these names in the manifests. If such ConfigMaps are added in the future, coordinate with the pipeline to reflect dynamic creation or adjust tests to rely on the dynamic provisioning.

Applied to files:

  • tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml
  • tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
📚 Learning: 2026-05-20T08:09:30.641Z
Learnt from: max-svistunov
Repo: lightspeed-core/lightspeed-stack PR: 1580
File: docs/design/llama-stack-config-merge/poc-results/library-mode/synthesized-run.yaml:107-110
Timestamp: 2026-05-20T08:09:30.641Z
Learning: In Llama-stack config YAMLs, when defining a Llama Guard safety shield entry, set `provider_shield_id` to the *guard model identifier* (e.g., `meta-llama/Llama-Guard-3-8B`). Do not use a chat/generative model id (e.g., `openai/gpt-4o-mini`): a chat-model id (or `native_override`) indicates only an override landed and does **not** mean the safety shield is actually gating queries. Ensure any E2E coverage for the related implementation (JIRA/E2E tests) exercises a real Llama Guard model to verify that the shield is effective.

Applied to files:

  • tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
  • examples/vllm-rhelai.yaml

🔇 Additional comments (14)
.tekton/integration-tests/README.md (1)

19-19: 🎯 Functional Correctness

Clarify the exact source of the per-run S3 prefix.

The README states the prefix uses <pipelinerun-name>, but the pipeline script uses ${RUN_ID}. Please confirm that RUN_ID is set to the PipelineRun name (or a unique derivative), and document the exact variable or mechanism used so operators don't assume a different naming scheme.
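A minimal sketch of how a per-run prefix could be derived follows. It assumes `RUN_ID` is populated from Tekton's `$(context.pipelineRun.name)` substitution, which the review asks the author to confirm; the bucket name, prefix, and the `state_url` helper are placeholders, not values from the PR.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a per-run state location.
set -eu

# state_url <bucket> <prefix> <run_id>: print an S3 URL unique to this run.
state_url() {
  if [ -z "$3" ]; then
    echo "ERROR: run id is empty; refusing to share state between runs" >&2
    return 1
  fi
  printf 's3://%s/%s/%s\n' "$1" "$2" "$3"
}

# In a Tekton step, RUN_ID could be set from the $(context.pipelineRun.name) substitution.
RUN_ID="${RUN_ID:-example-run-abc12}"
state_url "example-bucket" "mapt/rhelai" "$RUN_ID"
```

Rejecting an empty run ID keeps two runs from silently sharing one state path when the substitution is missing.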

tests/e2e-prow/rhoai/pipeline-konflux.sh (6)

54-54: 🎯 Functional Correctness

Still using && log || { exit 1 } for OPENAI_API_KEY check.

This is the same pattern flagged in a prior review: log() returns the exit status of [ "$QUIET" != "1" ], which is non-zero when QUIET=1. In that case the && arm "fails" and the || arm runs, printing ❌ Missing OPENAI_API_KEY and exiting even though the key is present.

🐛 Proposed fix
-[[ -n "${OPENAI_API_KEY:-}" ]] && log "✅ OPENAI_API_KEY is set" || { echo "❌ Missing OPENAI_API_KEY"; exit 1; }
+if [[ -n "${OPENAI_API_KEY:-}" ]]; then
+  log "✅ OPENAI_API_KEY is set"
+else
+  echo "❌ Missing OPENAI_API_KEY"; exit 1
+fi
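
The pitfall described above can be reproduced in isolation. The following is a minimal sketch, not the repository's code: `log`, `check_fragile`, and `check_robust` are hypothetical stand-ins, and `log` is assumed to return the status of its `QUIET` test, as the comment describes.

```shell
#!/bin/sh
# Hypothetical stand-in for the script's log helper: it prints only when
# QUIET is not "1" and returns the status of that test.
log() {
  [ "${QUIET:-}" != "1" ] && echo "$1"
}

# Fragile form: the || arm also runs when log itself returns non-zero.
check_fragile() {
  [ -n "${OPENAI_API_KEY:-}" ] && log "key is set" || { echo "missing key"; return 1; }
}

# Robust form: only the key test selects the branch.
check_robust() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    # Ignore log's status so a quiet run still counts as success.
    log "key is set" || true
  else
    echo "missing key"
    return 1
  fi
}
```

With `QUIET=1` and a non-empty `OPENAI_API_KEY`, `check_fragile` reports a missing key and fails, while `check_robust` succeeds silently.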

2-3: LGTM!


68-79: LGTM!


174-183: LGTM!


385-391: LGTM!


54-57: 🎯 Functional Correctness

OPENAI_API_KEY is already supplied by the mounted secret.

The RHEL AI Tekton task mounts openai-api-key at /var/run/openai, and pipeline-konflux.sh reads /var/run/openai/openai-api-key before this guard.

			> Likely an incorrect or invalid review comment.
tests/e2e-prow/rhoai/manifests/lightspeed/lightspeed-stack.yaml (1)

33-38: LGTM!

tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml (1)

141-166: LGTM!

Marking OPENAI_API_KEY optional and adding the vLLM env vars (each optional: true) is consistent with supporting mode-switching between OpenAI and vLLM.

.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml (2)

128-145: Spot provisioning failure handling is still missing.

The previous finding still applies: the spot branch does not guard the mapt aws rhel-ai create exit status before writing host and vllm-api-key results.


364-365: onError: continue still masks e2e failures.

The previous finding still applies: finally already performs cleanup, so this makes failed tests look successful.

tests/e2e/configs/run-rhelai.yaml (1)

25-33: LGTM!

Also applies to: 135-138

tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml (1)

1-36: LGTM!

examples/vllm-rhelai.yaml (1)

15-15: LGTM!

Also applies to: 25-33, 114-114, 135-144

Comment on lines +61 to +62
- name: vllm-api-key
description: "Random API key for vLLM authentication"

🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Map the target file and inspect the referenced ranges.
file='.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml'

echo '== file exists =='
git ls-files "$file"

echo
echo '== line count =='
wc -l "$file"

echo
echo '== relevant lines =='
sed -n '50,80p;175,200p;305,320p;388,402p' "$file" | cat -n

Repository: lightspeed-core/lightspeed-stack

Length of output: 4241


🏁 Script executed:

#!/bin/bash
set -euo pipefail

file='.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml'

echo '== surrounding pipeline blocks =='
sed -n '160,220p;300,410p' "$file" | cat -n

echo
echo '== search for secret/workspace handoff patterns in Tekton pipelines =='
rg -n "secretRef|workspace|Secret|secretName|results\.(.*api-key|.*token)|params.*api-key|VLLM_API_KEY" .tekton -g '*.yaml'

Repository: lightspeed-core/lightspeed-stack

Length of output: 16183


🌐 Web query:

Tekton Task results are stored in TaskRun status and PipelineRun status. Are Tekton results and params appropriate for secret values?

💡 Result:

No, Tekton results and parameters are not appropriate for storing secret values [1][2]. Storing sensitive information as Tekton parameters or results is a security risk because these values are stored as plain text within the TaskRun and PipelineRun Kubernetes resources [1]. Once a value is recorded in the status of a TaskRun or PipelineRun, it becomes visible to anyone with read access to those Kubernetes objects [1][3].

To securely handle secret values in Tekton, use the following established patterns:

1. Kubernetes Secrets: Reference existing Kubernetes Secret objects directly in your Task definitions using env, envFrom, or VolumeMounts [4][5][6][7]. This prevents the secret value from being exposed as a parameter or result. When using env or envFrom, you can map specific keys from a Secret to environment variables available only to the steps that require them [4][6].
2. Scope Access: Avoid mounting sensitive secrets to all steps in a task. Use VolumeMounts to make a secret available only to specific steps that require it [8][7].
3. External Secrets: For more advanced security, use tools like the External Secrets Operator or HashiCorp Vault to inject secrets at runtime [9][10]. This ensures that secrets are not permanently stored in the cluster as Kubernetes Secret objects or defined in plain text within Git repositories [10].
4. Service Account Authorization: Ensure that the ServiceAccount associated with a TaskRun has the minimum necessary RBAC permissions (e.g., restricted to specific secrets via resourceNames) to access only the secrets it requires [9][11].

If you need to pass data between tasks, use Tekton results for non-sensitive data only [2]. If you encounter a scenario where you feel tempted to pass a secret as a result, reconsider your architecture to have the consuming task fetch the required credential directly from a secure source (e.g., Vault or a Kubernetes Secret) using an identity-based approach [9][11][10].



Do not pass the vLLM API key through Tekton results or params. It ends up in TaskRun/PipelineRun metadata via results.vllm-api-key and params.vllm-api-key; use a Secret reference or another runtime secret handoff instead. Also applies to 185-187, 313-314, 394-397.
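
One way a consuming step could pick up the key without a param is to read it from a mounted Secret file, mirroring how the OpenAI key is read from /var/run/openai. This is a sketch under assumptions: the mount path `/var/run/vllm/vllm-api-key`, the `VLLM_KEY_FILE` variable, and the `read_secret_file` helper are all hypothetical, and how the per-run key gets into a Secret is outside its scope.

```shell
#!/bin/sh
# read_secret_file PATH: print the contents of a mounted secret file.
# Fails (non-zero) when the file is missing or empty, so callers can abort
# early instead of continuing with an empty key.
read_secret_file() {
  path="$1"
  if [ ! -s "$path" ]; then
    echo "secret file missing or empty: $path" >&2
    return 1
  fi
  cat "$path"
}

# Hypothetical mount point; not taken from the pipeline under review.
VLLM_KEY_FILE="${VLLM_KEY_FILE:-/var/run/vllm/vllm-api-key}"

# Intended use inside a step (left commented so the sketch has no side effects):
#   VLLM_API_KEY=$(read_secret_file "$VLLM_KEY_FILE") || exit 1
#   export VLLM_API_KEY
```

Because the value never passes through `params` or `results`, it would not be recorded in TaskRun or PipelineRun status.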

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml around
lines 61 - 62, The pipeline definition is exposing the vLLM API key through
Tekton params/results, which leaks it into TaskRun/PipelineRun metadata. Update
the lightspeed stack test pipeline to stop using `params.vllm-api-key` and
`results.vllm-api-key`, and switch the relevant tasks/steps to read the key from
a Secret-backed runtime handoff instead. Make the same change anywhere the key
is passed or referenced in the pipeline spec so the `vllm-api-key` stays out of
metadata.

Comment on lines +108 to +124
#!/bin/sh
set -uo pipefail

BUCKET=$(cat /opt/aws-credentials/S3_BUCKET)
RUN_ID="$(context.pipelineRun.name)"
HF_TOKEN=$(cat /opt/aws-credentials/HUGGING_FACE_HUB_TOKEN)
VLLM_API_KEY=$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')

echo "[mapt] ========== Version Info =========="
echo "[mapt] RHEL AI AMI version: $(params.rhelai-version)"
echo "[mapt] Model: $(params.model)"
echo "[mapt] Run ID: ${RUN_ID}"
echo "[mapt] Spot: $(params.spot)"
echo "[mapt] =================================="

SPOT_ARGS=""
if [[ "$(params.spot)" == "true" ]]; then

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml"

# Show the relevant sections around the reported lines.
sed -n '100,140p;280,300p;400,420p' "$FILE" | cat -n

# Inspect how these script blocks are embedded/executed.
rg -n 'script:|#!/bin/sh|#!/bin/bash|pipefail|\[\[|<<<|/bin/sh|/bin/bash' "$FILE"

Repository: lightspeed-core/lightspeed-stack

Length of output: 6179


🌐 Web query:

Tekton step script shebang default shell /bin/sh bash shebang documentation

💡 Result:

In Tekton, when you use the script field in a Task step, the script is executed within the container image [1][2][3]. If you do not provide a shebang line at the top of your script, Tekton automatically prepends the following default preamble [1][2][3]:

    #!/bin/sh
    set -e

This default configuration ensures that the script runs using /bin/sh and exits immediately if any command fails [4]. If you want to use a different interpreter or change this behavior, you can override the default by explicitly including a shebang line at the beginning of your script [1][2][3]. For example:

- To use Bash: #!/usr/bin/env bash
- To use Python: #!/usr/bin/env python3
- To use Node: #!/usr/bin/env node

The interpreter you specify must be present within the container image used for that step [1][2][3]. For Windows-based nodes, the behavior differs slightly; you must use a special #!win shebang (e.g., #!win powershell.exe -File or simply #!win to execute a command as a .cmd file) [1][3].



🌐 Web query:

Red Hat UBI 9 minimal /bin/sh bash symlink documentation

💡 Result:

In Red Hat Universal Base Image (UBI) 9, including the Minimal variant, the directory /bin is a symbolic link to /usr/bin [1][2]. Consequently, system binaries such as bash are located at /usr/bin/bash rather than /bin/bash [1][3]. Standard UBI 9 container images are configured with /bin/bash as their default entry point (CMD) [4]. Because of the filesystem layout in RHEL 9 and UBI 9, any reference to /bin/bash or /bin/sh effectively resolves to the files residing in /usr/bin [1][2]. Users or automated security scanners looking for shells at the traditional /bin path may fail to detect them if they do not account for the symlink, as the literal path /bin/bash does not exist as an independent file entry in the container image's filesystem layer [1][3].



Make these Tekton scripts POSIX-sh compatible or switch them to Bash.

create-instance is #!/bin/sh but uses pipefail and [[ ... ]]; the script: blocks at 286-289 and 401-411 also run as /bin/sh by default and use <<<. Convert them to POSIX shell or add a Bash shebang with a Bash-capable image.
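
If the POSIX route is chosen, the Bash-only constructs have direct replacements. The sketch below is illustrative only: `is_true` and `first_field` are hypothetical helpers, and the `cut` pipeline stands in for whatever the real here-strings feed, which the review does not show.

```shell
#!/bin/sh
# Runs under plain POSIX sh: no pipefail, no [[ ]], no here-strings.

# is_true VALUE: POSIX replacement for [[ "$VALUE" == "true" ]].
is_true() {
  [ "$1" = "true" ]
}

# first_field LINE: POSIX replacement for a here-string such as
#   cut -d' ' -f1 <<< "$LINE"
# printf feeds the same text through a pipe instead.
first_field() {
  printf '%s\n' "$1" | cut -d' ' -f1
}
```

For example, `is_true "$SPOT"` can gate the spot branch, and `first_field 'alpha beta'` prints `alpha`. Note that `pipefail` has no equivalent in older POSIX shells, so pipelines whose early stages can fail need explicit status checks.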

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml around
lines 108 - 124, The shell snippets are currently treated as /bin/sh but rely on
Bash-only features, so make them consistent by either converting the script
logic in create-instance and the other script blocks to POSIX sh syntax or
explicitly switching them to Bash with a Bash-capable shebang/image. Update the
create-instance script to remove non-POSIX constructs like pipefail and [[ ...
]], and audit the script blocks around the other referenced Tekton steps that
use here-strings so they also run under the chosen shell.

Comment on lines +150 to +166
for REGION in $REGIONS; do
echo "[mapt] Trying on-demand in ${REGION}..."
export AWS_DEFAULT_REGION="$REGION"

if timeout $TIMEOUT mapt aws rhel-ai create \
--project-name "mapt-rhel-ai-${RUN_ID}" \
--backed-url "s3://${BUCKET}/mapt/rhel-ai/${RUN_ID}" \
--conn-details-output /opt/host-info \
--compute-sizes "$(params.instance-type)" \
--version "$(params.rhelai-version)" \
--auto-start \
--model "$(params.model)" \
--hf-token "${HF_TOKEN}" \
--api-key "${VLLM_API_KEY}" \
--expose-ports 8000 \
--vllm-extra-args "--max-model-len 131072 --enable-auto-tool-choice --tool-call-parser llama3_json --chat-template /opt/app-root/template/tool_chat_template_llama3.1_json.jinja" \
--tags "project=lightspeed-core,environment=konflux-ci"; then

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

file=".tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml"

echo "== line count =="
wc -l "$file"

echo "== provisioning snippet (around lines 140-175) =="
sed -n '140,175p' "$file" | cat -n

echo "== teardown snippet (around lines 470-500) =="
sed -n '470,500p' "$file" | cat -n

echo "== search for destroy-rhelai and AWS_DEFAULT_REGION =="
rg -n "destroy-rhelai|AWS_DEFAULT_REGION|REGION|regions|us-east-1" .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml

echo "== search repo-wide for destroy-rhelai references =="
rg -n "destroy-rhelai" .

Repository: lightspeed-core/lightspeed-stack

Length of output: 4396


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== context around spot/on-demand branches =="
sed -n '120,190p' .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml | cat -n

echo "== all mapt aws rhel-ai destroy invocations with context =="
rg -n -C 3 "mapt aws rhel-ai destroy" .tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml

echo "== search for MAPT documentation and region handling =="
rg -n -i "AWS_DEFAULT_REGION|backed-url|rhel-ai destroy|mapt aws rhel-ai" . --glob '!**/.git/**'

Repository: lightspeed-core/lightspeed-stack

Length of output: 5227


🌐 Web query:

"mapt aws rhel-ai destroy backed-url AWS_DEFAULT_REGION region"

💡 Result:

The command mapt aws rhel-ai destroy is used within the Multi Architecture Provisioning Tool (mapt) to tear down RHEL AI infrastructure previously provisioned on AWS [1]. When running this command, mapt requires the --backed-url flag to locate the state or metadata associated with the provisioned resource, ensuring the correct environment is destroyed [1]. The --backed-url typically points to a storage location (such as an S3 bucket or local directory) where the project's state was saved during the create operation [1][2][3]. Although the AWS_DEFAULT_REGION environment variable is not an explicit argument of the mapt command itself, it is standard practice to set it in the execution environment (e.g., as an environment variable in a CI/CD pipeline or shell session) to ensure the AWS SDK interacts with the correct region [3].

Example usage:

    mapt aws rhel-ai destroy \
      --project-name mapt-rhel-ai-[ID] \
      --backed-url s3://[BUCKET]/mapt/rhel-ai/[ID]

In this command, [ID] refers to a unique identifier for the specific instance or deployment [1]. The tool uses this information to identify and clean up the AWS resources associated with that project [1][4].



🌐 Web query:

"site:github.com/redhat-developer/mapt docs aws eks destroy region backed-url AWS_DEFAULT_REGION"

💡 Result:

To destroy an EKS cluster provisioned by the Red Hat Multi Architecture Provisioning Tool (mapt), you must use the same project name and backend state information that were used during the cluster creation [1]. The state for the resources is stored at the location specified by the --backed-url flag [1]. When running mapt via a container (such as podman or docker), you must ensure that the --backed-url location is correctly mounted as a volume so the tool can access the state files [1]. You should also provide the appropriate AWS credentials and set the AWS_DEFAULT_REGION environment variable to match the region where the cluster was deployed [1][2].

Example command for destroying an EKS cluster:

    podman run -d --rm \
      -v ${PWD}:/workspace:z \
      -e AWS_ACCESS_KEY_ID=XXX \
      -e AWS_SECRET_ACCESS_KEY=XXX \
      -e AWS_DEFAULT_REGION=us-east-2 \
      quay.io/redhat-developer/mapt:v1.0.0-dev aws eks destroy \
      --project-name "mapt-eks" \
      --backed-url file:///workspace

Key considerations for the destroy operation:

- Project Name: Must match the name used during creation [1].
- Backed URL: Must point to the directory or bucket containing the Pulumi state files [1][2].
- Region: Ensure AWS_DEFAULT_REGION is set correctly; mapt uses this environment variable if no explicit region flag is provided to the destroy command [2].
- Force Destroy: If the creation process was interrupted and a state lock prevents destruction, you can use the --force-destroy flag to remove the lock and proceed [3]. Note that this is generally supported for remote backends (e.g., S3) [3][2].



🌐 Web query:

"site:github.com/redhat-developer/mapt \"AWS_DEFAULT_REGION\" \"rhel-ai destroy\""

💡 Result:

In the redhat-developer/mapt project, AWS_DEFAULT_REGION is a required environment variable used to specify the AWS region for operations [1][2]. When executing the mapt aws rhel-ai destroy command, this variable (or the region specified in the stack configuration) is utilized by the AWS provider to identify the correct environment and resources to clean up [1].

Technical usage patterns include:

1. Environment Configuration: Users typically export or pass AWS_DEFAULT_REGION as an environment variable when running mapt via container (e.g., podman run) or in CI/CD pipelines [3][4][2].
2. Provider Logic: The mapt AWS provider checks AWS_DEFAULT_REGION (and AWS_REGION) when determining the hosting region for resources [1]. During a destroy operation, the provider ensures it targets the correct stack by resolving the region, either from the provided stack configuration or the environment [1].
3. Automation: In CI/CD integrations like Tekton, this environment variable is commonly set alongside other AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) to ensure the mapt binary has the necessary context to perform the destroy operation [3].

Example usage for a destroy operation:

    podman run -d --rm \
      -v ${PWD}:/workspace:z \
      -e AWS_ACCESS_KEY_ID=XXX \
      -e AWS_SECRET_ACCESS_KEY=XXX \
      -e AWS_DEFAULT_REGION=us-east-1 \
      quay.io/redhat-developer/mapt:latest aws rhel-ai destroy \
      --project-name \
      --backed-url



Reuse the provisioned region during destroy. mapt aws rhel-ai destroy uses AWS_DEFAULT_REGION, but final teardown hard-codes us-east-1 while on-demand provisioning can land in another region. Persist the chosen region from create, or destroy across the same region set, so non-us-east-1 instances don’t leak.
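
Persisting the chosen region could look roughly like the sketch below. It assumes a location that both the create and destroy steps can read, such as a shared workspace file or a Tekton result (the region is not sensitive). `REGION_FILE`, `save_region`, and `load_region` are hypothetical names, and the `mapt` invocation is shown only as a comment, using the flags already present in the pipeline.

```shell
#!/bin/sh
# Hypothetical location shared between the create and destroy steps.
REGION_FILE="${REGION_FILE:-/tmp/mapt-region}"

# save_region REGION: record the region where create succeeded.
save_region() {
  printf '%s\n' "$1" > "$REGION_FILE"
}

# load_region FALLBACK: print the recorded region, or FALLBACK when
# nothing was recorded (for example, create never succeeded).
load_region() {
  fallback="$1"
  if [ -s "$REGION_FILE" ]; then
    cat "$REGION_FILE"
  else
    printf '%s\n' "$fallback"
  fi
}

# In the create loop, after a successful `mapt aws rhel-ai create`:
#   save_region "$REGION"
# In the teardown step:
#   export AWS_DEFAULT_REGION="$(load_region us-east-1)"
#   mapt aws rhel-ai destroy \
#     --project-name "mapt-rhel-ai-${RUN_ID}" \
#     --backed-url "s3://${BUCKET}/mapt/rhel-ai/${RUN_ID}"
```

The fallback keeps the current us-east-1 behaviour for spot runs, while on-demand runs that land elsewhere are destroyed in the region they were created in.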

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.tekton/integration-tests/pipeline/lightspeed-stack-rhelai-test.yaml around
lines 150 - 166, The teardown path is hard-coding a region that may differ from
the one selected by `mapt aws rhel-ai create`, which can leave non-us-east-1
resources behind. Update the `lightspeed-stack-rhelai-test` flow so the chosen
region from the create loop is persisted and reused by `mapt aws rhel-ai
destroy`, or iterate destroy over the same `REGIONS` set used during
provisioning. Make sure the logic around the `REGION` loop and the destroy step
stays aligned so cleanup targets the actual provisioned region.

Comment thread examples/vllm-rhelai.yaml
Comment on lines +51 to +68
- provider_id: basic
provider_type: inline::basic
config: {}
- provider_id: llm-as-judge
provider_type: inline::llm-as-judge
config: {}
- provider_id: braintrust
provider_type: inline::braintrust
config:
openai_api_key: '********'
tool_runtime:
- config: {}
- config: {} # Enable the RAG tool
provider_id: rag-runtime
provider_type: inline::rag-runtime
vector_io:
- config:
persistence:
namespace: vector_io::faiss
backend: kv_default
provider_id: faiss
provider_type: inline::faiss
- config: {} # Enable MCP (Model Context Protocol) support
provider_id: model-context-protocol
provider_type: remote::model-context-protocol
vector_io: []

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether the RHEL AI e2e flow relies on llama-stack's own vector_stores API
# (which would need a registered vector_io provider) vs. lightspeed-stack's byok_rag path.
rg -n "default_provider_id|vector_stores|vector_io" tests/e2e/configs/run-rhelai.yaml examples/vllm-rhelai.yaml tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 690


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the relevant sections in both files.
sed -n '1,220p' examples/vllm-rhelai.yaml | nl -ba | sed -n '1,220p'
printf '\n---\n'
sed -n '1,220p' tests/e2e/configs/run-rhelai.yaml | nl -ba | sed -n '1,220p'

# Check for any faiss/vector_io provider registration elsewhere in the repo.
rg -n "provider_id:\s*faiss|vector_io:|default_provider_id:\s*faiss|vector_stores:" .

Repository: lightspeed-core/lightspeed-stack

Length of output: 210


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== examples/vllm-rhelai.yaml =="
sed -n '1,220p' examples/vllm-rhelai.yaml | cat -n | sed -n '130,170p'

echo
echo "== tests/e2e/configs/run-rhelai.yaml =="
sed -n '1,220p' tests/e2e/configs/run-rhelai.yaml | cat -n | sed -n '130,170p'

echo
echo "== repo search =="
rg -n "provider_id:\s*faiss|default_provider_id:\s*faiss|vector_io:|vector_stores:" examples/vllm-rhelai.yaml tests/e2e/configs/run-rhelai.yaml tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 2956


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find other examples/docs that explain how vector store defaults are wired.
rg -n "default_provider_id:\s*faiss|default_provider_id:|vector_stores:|vector_io:" \
  examples tests docs .github README.md

echo
echo "== files mentioning vector_stores/default_provider_id in configs =="
rg -n "vector_stores:" -g '*.yaml' -g '*.yml' .

Repository: lightspeed-core/lightspeed-stack

Length of output: 7382


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== docs/deployment_guide.md around vector_io =="
sed -n '350,390p' docs/deployment_guide.md | cat -n

echo
sed -n '915,950p' docs/deployment_guide.md | cat -n

echo
echo "== tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml =="
sed -n '1,240p' tests/e2e/configuration/server-mode/lightspeed-stack-rhelai.yaml | cat -n

echo
echo "== examples/run.yaml vector_io/vector_stores section =="
sed -n '130,170p' examples/run.yaml | cat -n

Repository: lightspeed-core/lightspeed-stack

Length of output: 6598


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect tests that describe how vector_io and vector_stores are rendered.
rg -n "vector_stores|vector_io|default_provider_id" tests/unit/test_llama_stack_configuration.py

echo
# Show the surrounding assertions for the faiss cases.
sed -n '240,360p' tests/unit/test_llama_stack_configuration.py | cat -n

echo
sed -n '400,460p' tests/unit/test_llama_stack_configuration.py | cat -n

echo
sed -n '600,640p' tests/unit/test_llama_stack_configuration.py | cat -n

Repository: lightspeed-core/lightspeed-stack

Length of output: 16793


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the implementation of vector store configuration construction.
sed -n '1,240p' tests/unit/test_llama_stack_configuration.py | cat -n
echo
sed -n '240,420p' tests/unit/test_llama_stack_configuration.py | cat -n
echo
sed -n '1,260p' <(python3 - <<'PY'
from pathlib import Path
p = Path('tests/unit/test_llama_stack_configuration.py')
text = p.read_text()
for name in ['construct_vector_stores_section', 'construct_vector_io_providers_section', 'enrich_byok_rag', '_build_vector_io_config']:
    idx = text.find(f'def {name}')
    if idx != -1:
        start = text.rfind('\n', 0, idx)
        print(f'### {name} ###')
        print(text[idx:idx+2200])
        print()
PY
)

Repository: lightspeed-core/lightspeed-stack

Length of output: 18102


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate the implementation file first.
rg -n "def (construct_vector_stores_section|construct_vector_io_providers_section|enrich_byok_rag|_build_vector_io_config)" .

Repository: lightspeed-core/lightspeed-stack

Length of output: 523


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the implementation that consumes vector_stores and vector_io.
sed -n '180,460p' src/llama_stack_configuration.py | cat -n

echo
# Inspect the config model definitions around these fields.
rg -n "vector_stores|vector_io" src models -g '*.py'

Repository: lightspeed-core/lightspeed-stack

Length of output: 20510


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the config schema for vector_stores and providers.vector_io.
sed -n '1180,1265p' src/models/config.py | cat -n

echo
sed -n '1265,1345p' src/models/config.py | cat -n

echo
# Show how the run example wires vector_io and vector_stores in one complete config.
sed -n '1,220p' examples/run.yaml | cat -n

Repository: lightspeed-core/lightspeed-stack

Length of output: 12786


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the vector store config model and how the API validates/uses default_provider_id.
rg -n "default_provider_id|class .*VectorStore|vector_stores" src/models/config.py src/app/endpoints/vector_stores.py src -g '*.py'

echo
sed -n '1450,1565p' src/models/config.py | cat -n

echo
sed -n '150,260p' src/app/endpoints/vector_stores.py | cat -n

Repository: lightspeed-core/lightspeed-stack

Length of output: 16595


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find the exact vector store config model definition and any validation around default_provider_id.
rg -n "class .*Vector.*Store|default_provider_id|default_embedding_model" src/models/config.py src -g '*.py'

echo
# Show the surrounding model definition.
sed -n '1340,1465p' src/models/config.py | cat -n

Repository: lightspeed-core/lightspeed-stack

Length of output: 6962


🌐 Web query:

Llama Stack vector_stores default_provider_id providers.vector_io configuration faiss

💡 Result:

In Llama Stack, the configuration for vector stores involves defining the vector_io providers and setting default behavior within the stack's distribution configuration (run.yaml). The FAISS provider is an inline vector database used for high-performance, in-process similarity search [1][2].

Configuration Overview

To use FAISS as your vector store, you must define it within the vector_io section of your providers configuration [3][4]. A standard configuration for the FAISS provider includes the provider_id, provider_type (inline::faiss), and the persistence settings [5][3][4]. Example configuration:

    providers:
      vector_io:
      - provider_id: faiss
        provider_type: inline::faiss
        config:
          persistence:
            namespace: vector_io::faiss
            backend: kv_default

Stack-Level Defaults

Llama Stack supports configuring a default vector store provider and a default embedding model to ensure deterministic behavior when creating vector stores [6][7]. This is defined at the stack level, typically within the vector_stores section of your configuration [5][8][3]. Example of setting defaults:

    vector_stores:
      default_provider_id: faiss
      default_embedding_model:
        provider_id: sentence-transformers
        model_id: nomic-ai/nomic-embed-text-v1.5

Key Considerations

1. Precedence Logic: If an API call explicitly specifies an embedding model, it takes precedence [6][7]. If not, the system uses the global default defined in vector_stores [6][7].
2. Limitations: FAISS is primarily in-memory and does not support keyword or hybrid search [9][10]. It is optimized for scenarios where fast, local retrieval is required and is generally less suited for frequent incremental updates compared to disk-based alternatives like sqlite-vec [9].
3. Registration: While run.yaml defines the providers, there have been known issues regarding the registration of specific vector store resources from configuration [11]. In many current workflows, vector stores are registered dynamically via the Llama Stack client API [11][12].



Remove the dangling faiss default or restore the matching vector_io provider.
providers.vector_io is empty here, but vector_stores.default_provider_id: faiss still points at a provider that no longer exists. FAISS-backed configs register provider_id: faiss under providers.vector_io; without that entry, any vector-store creation that relies on the default can fail. Either re-add the FAISS provider or drop this default if vector stores are no longer part of this config.
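
A quick consistency check for this kind of dangling reference might look like the following. It is a text-level heuristic rather than a YAML parser: `check_default_provider` is a hypothetical helper that accepts any `provider_id:` line with the same value anywhere in the file, so it can miss a provider registered under the wrong section and does not handle trailing comments.

```shell
#!/bin/sh
# check_default_provider FILE: succeed when the value of
# default_provider_id (if present) also appears as some provider_id.
# Heuristic only: matches lines textually, not by YAML structure.
check_default_provider() {
  file="$1"
  default_id=$(sed -n 's/^[[:space:]]*default_provider_id:[[:space:]]*//p' "$file" | head -n 1)
  if [ -z "$default_id" ]; then
    # No default configured, nothing to verify.
    return 0
  fi
  # The pattern anchors provider_id at the start of the line (after optional
  # indentation and list dash), so default_provider_id lines do not match.
  grep -Eq "^[[:space:]]*(-[[:space:]]*)?provider_id:[[:space:]]*${default_id}[[:space:]]*\$" "$file"
}
```

Running `check_default_provider examples/vllm-rhelai.yaml` would be expected to fail while `vector_io` is empty and `default_provider_id: faiss` remains.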

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/vllm-rhelai.yaml` around lines 51 - 68, The vector store
configuration is inconsistent because vector_stores.default_provider_id still
points to faiss while providers.vector_io is empty. Update the vllm-rhelai.yaml
config by either restoring the matching providers.vector_io entry with
provider_id faiss or removing/changing the default_provider_id so it references
an existing vector_io provider. Keep the vector_stores and providers.vector_io
sections aligned so the default can resolve correctly.

Comment thread examples/vllm-rhelai.yaml
Comment on lines 145 to 148
shields:
- shield_id: llama-guard
provider_id: llama-guard
provider_shield_id: openai/gpt-4o-mini

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Llama Guard shield backed by a chat model, not a guard model.

provider_shield_id: openai/gpt-4o-mini uses a generative chat model id for the llama-guard safety shield. Based on learnings, this pattern only indicates a landed override and does not mean the shield actually gates queries — Llama Guard needs a real guard model (e.g. meta-llama/Llama-Guard-3-8B) to function as intended, otherwise e2e coverage gives false confidence that content safety is enforced.

🛡️ Suggested fix (pending guard-model availability in the RHEL AI vLLM deployment)
   shields:
   - shield_id: llama-guard
     provider_id: llama-guard
-    provider_shield_id: openai/gpt-4o-mini
+    provider_shield_id: meta-llama/Llama-Guard-3-8B
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
shields:
- shield_id: llama-guard
provider_id: llama-guard
provider_shield_id: openai/gpt-4o-mini
shields:
- shield_id: llama-guard
provider_id: llama-guard
provider_shield_id: meta-llama/Llama-Guard-3-8B
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/vllm-rhelai.yaml` around lines 145 - 148, The `llama-guard` shield
is currently wired to a chat model instead of a real guard model, so update the
`shields` entry in the `vllm-rhelai` example to use a proper guard-model
`provider_shield_id` rather than `openai/gpt-4o-mini`. Keep the
`shield_id`/`provider_id` mapping for `llama-guard`, but point
`provider_shield_id` at an actual Llama Guard model identifier that can enforce
safety checks in the deployed vLLM setup.

Source: Learnings

@tisnik tisnik left a comment

LGTM

@radofuchs radofuchs left a comment

LGTM

@are-ces are-ces merged commit eb8c5ec into lightspeed-core:main Jul 1, 2026
31 checks passed
3 participants