
Commit f17d9ff

Add inject-based test case support to evaluation function and docs
Implemented support for `inject`-based test cases, allowing variables to be pre-set before student code execution instead of relying on stdin. Updated the evaluation function, added unit tests for `inject` mode, and revised documentation (`CLAU
1 parent 36fd5b2 commit f17d9ff

5 files changed

Lines changed: 117 additions & 17 deletions


CLAUDE.md

Lines changed: 7 additions & 0 deletions
@@ -36,9 +36,16 @@ All source lives in `evaluation_function/`:
     "mode": "io_test",
     "tests": [
         {
+            # stdin-based: student code calls input()
             "input": "5\n", # stdin fed to student code
             "expected_output": "25\n", # expected stdout
             "hidden": False # True = suppress input/output in feedback
+        },
+        {
+            # inject-based: variables are set before student code runs (no input() needed)
+            "inject": {"n": 5}, # dict of {variable_name: value} to inject
+            "expected_output": "25\n",
+            "hidden": False
         }
     ]
 }

docs/dev.md

Lines changed: 19 additions & 4 deletions
@@ -43,21 +43,36 @@ Feedback tags produced: `output` (stdout + any plots), or `error` (timeout / run
 
 Run student code against a list of stdin/stdout test cases.
 
+Each test case uses either `input` (stdin-based) or `inject` (variable injection):
+
 ```json
 {
   "mode": "io_test",
   "tests": [
     {
-      "input": "5\n", // stdin fed to the process
-      "expected_output": "25\n", // expected stdout (trailing whitespace stripped before comparison)
-      "hidden": false // true = suppress input/output values from feedback
+      "input": "5\n", // stdin — student code calls input()
+      "expected_output": "25\n",
+      "hidden": false
+    },
+    {
+      "inject": {"n": 5}, // variables set before student code runs — no input() needed
+      "expected_output": "25\n",
+      "hidden": false
     }
   ]
 }
 ```
 
+| Field | Description |
+|-------|-------------|
+| `input` | Text piped to stdin. Mutually exclusive with `inject`. |
+| `inject` | Dict of `{variable_name: value}` prepended as assignments before student code. Values can be any JSON type. Mutually exclusive with `input`. |
+| `expected_output` | Expected stdout; trailing whitespace stripped before comparison. |
+| `hidden` | `true` = suppress input/variables and expected output from feedback. |
+
 - `tests` is required; an empty list sets `is_correct = true` with `0/0 tests passed`.
-- `hidden: true` replaces input/output details with `"Hidden test N: failed."` so students cannot reverse-engineer the answer.
+- `hidden: true` replaces details with `"Hidden test N: failed."` so students cannot reverse-engineer the answer.
+- With `inject`, feedback shows a "Variables:" block (e.g. `n = 5`) instead of "Input:".
 - Matplotlib figures generated during a test are uploaded to S3 and embedded in the feedback.
 
 Feedback tags produced per test: `pass`, `fail`, or `hidden_fail`. Global: `summary`, `error` (timeout / runtime error).
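
Review note on docs/dev.md: the field table calls `input` and `inject` mutually exclusive, but the `evaluation.py` change in this commit only branches on `if inject:` and does not reject a test case that sets both (the `input` value is silently ignored). A minimal sketch of an explicit check, assuming a hypothetical helper named `validate_test_case` that is not part of this commit:

```python
def validate_test_case(test: dict) -> None:
    """Raise ValueError if a test case mixes or misuses `input` and `inject`.

    Hypothetical helper for illustration only; the commit does not include it.
    """
    if "input" in test and "inject" in test:
        raise ValueError("use either 'input' or 'inject' in a test case, not both")
    inject = test.get("inject")
    if inject is not None and not isinstance(inject, dict):
        raise ValueError("'inject' must be a dict of {variable_name: value}")


# Both documented shapes pass validation without raising.
validate_test_case({"input": "5\n", "expected_output": "25\n"})
validate_test_case({"inject": {"n": 5}, "expected_output": "25\n"})
```

Whether a mixed test case should be an error or simply prefer `inject` is a design choice the commit leaves open; the sketch only shows the stricter reading of the docs.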

docs/user.md

Lines changed: 30 additions & 6 deletions
@@ -44,11 +44,14 @@ Runs the student's code once per test case, feeding it a string via stdin and co
 
 ### Test case fields
 
-| Field | Required | Description |
-|-------|----------|-------------|
-| `input` | No | Text sent to the program's stdin. Use `\n` for newlines. Omit or use `""` if the program reads no input. |
-| `expected_output` | Yes | The exact stdout the program should produce. Trailing whitespace is ignored during comparison. |
-| `hidden` | No | Set to `true` to hide the input and expected output from the student. They see only "Hidden test N: passed/failed." |
+Each test case uses **either** `input` (student reads via `input()`) **or** `inject` (variables are pre-set, no `input()` needed):
+
+| Field | Description |
+|-------|-------------|
+| `input` | Text sent to stdin. Student code reads it with `input()`. Use `\n` for newlines. |
+| `inject` | Dict of variable names and values injected before student code runs. Student uses the variables directly — no `input()` required. Values can be numbers, strings, lists, or dicts. |
+| `expected_output` | The exact stdout the program should produce. Trailing whitespace is ignored. |
+| `hidden` | `true` = hide the input/variables and expected output from the student. They see only "Hidden test N: passed/failed." |
 
 ### Tips
 
@@ -57,7 +60,7 @@ Runs the student's code once per test case, feeding it a string via stdin and co
 - Matplotlib figures produced during a passing or failing test are shown to the student.
 - A 25-second per-test timeout applies; timed-out tests count as failures.
 
-### Example — square a number
+### Example — square a number (stdin-based)
 
 Student code:
 ```python
@@ -77,6 +80,27 @@ Params:
 }
 ```
 
+### Example — square a number (inject-based)
+
+Use `inject` when students shouldn't need to handle input themselves — they just write an expression or use the named variable directly:
+
+Student code:
+```python
+print(n * n)
+```
+
+Params:
+```json
+{
+  "mode": "io_test",
+  "tests": [
+    { "inject": {"n": 5}, "expected_output": "25\n" },
+    { "inject": {"n": 0}, "expected_output": "0\n" },
+    { "inject": {"n": -3}, "expected_output": "9\n", "hidden": true }
+  ]
+}
+```
+
 ---
 
 ## Mode: `unit_test`
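
Review note on docs/user.md: "Trailing whitespace is ignored" corresponds to the `.rstrip()` calls on both `expected_output` and stdout in `_evaluate_io`. The sketch below illustrates what that comparison does and does not forgive; `outputs_match` is a hypothetical name used only for this illustration:

```python
def outputs_match(actual: str, expected: str) -> bool:
    """Compare two outputs after stripping whitespace at the very end of each.

    Mirrors the rstrip-then-compare approach in _evaluate_io; the function
    name itself is an assumption for illustration.
    """
    return actual.rstrip() == expected.rstrip()


# A missing or extra final newline does not matter.
print(outputs_match("25", "25\n"))
# Trailing spaces on an inner line are NOT stripped, so this still differs.
print(outputs_match("1 \n2\n", "1\n2\n"))
```

Only whitespace at the end of the whole output is removed, so authors writing multi-line `expected_output` values should still match inner lines exactly.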

evaluation_function/evaluation.py

Lines changed: 18 additions & 7 deletions
@@ -138,11 +138,22 @@ def _evaluate_io(response: str, tests: list, result: Result) -> Result:
     passed = 0
 
     for i, test in enumerate(tests, 1):
+        inject = test.get("inject")
         stdin = test.get("input", "")
         expected = test.get("expected_output", "").rstrip()
         hidden = test.get("hidden", False)
 
-        stdout, stderr, timed_out, images = _run_code(response, stdin)
+        if inject:
+            prefix = "".join(f"{k} = {v!r}\n" for k, v in inject.items())
+            run_code = prefix + response
+            run_stdin = ""
+            input_block = _code_block("Variables", "\n".join(f"{k} = {v!r}" for k, v in inject.items()))
+        else:
+            run_code = response
+            run_stdin = stdin
+            input_block = _code_block("Input", stdin.rstrip()) if stdin.strip() else None
+
+        stdout, stderr, timed_out, images = _run_code(run_code, run_stdin)
         actual = stdout.rstrip()
         label = f"Hidden test {i}" if hidden else f"Test {i}"
 
@@ -155,8 +166,8 @@ def _evaluate_io(response: str, tests: list, result: Result) -> Result:
                 result.add_feedback(tag, f"{label}: runtime error.")
             else:
                 parts = [f"{label}: runtime error."]
-                if stdin.strip():
-                    parts.append(_code_block("Input", stdin.rstrip()))
+                if input_block:
+                    parts.append(input_block)
                 parts.append(_code_block("Error", stderr.strip()))
                 result.add_feedback(tag, "\n\n".join(parts))
         elif actual == expected:
@@ -165,8 +176,8 @@ def _evaluate_io(response: str, tests: list, result: Result) -> Result:
                 result.add_feedback("pass", f"{label}: passed.")
             else:
                 parts = [f"{label}: passed."]
-                if stdin.strip():
-                    parts.append(_code_block("Input", stdin.rstrip()))
+                if input_block:
+                    parts.append(input_block)
                 parts.append(_code_block("Output", actual or "(no output)"))
                 parts.extend(_upload_plots(images))
                 result.add_feedback("pass", "\n\n".join(parts))
@@ -176,8 +187,8 @@ def _evaluate_io(response: str, tests: list, result: Result) -> Result:
                 result.add_feedback(tag, f"{label}: failed.")
             else:
                 parts = [f"{label}: failed."]
-                if stdin.strip():
-                    parts.append(_code_block("Input", stdin.rstrip()))
+                if input_block:
+                    parts.append(input_block)
                 parts.append(_code_block("Your output", actual or "(no output)"))
                 parts.append(_code_block("Expected", expected))
                 parts.extend(_upload_plots(images))
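
Review note on evaluation.py: the injection step renders each `inject` entry as a `name = repr(value)` assignment and prepends it to the student code. The standalone sketch below reproduces that rendering and adds a name check that the commit does not perform; `build_inject_prefix` and the identifier validation are assumptions for illustration, not code from this commit:

```python
import keyword


def build_inject_prefix(inject: dict) -> str:
    """Render {name: value} as Python assignment lines, one per variable.

    Uses the same f"{k} = {v!r}\n" rendering as the commit. The identifier
    check is an added assumption: the commit accepts any key as-is.
    """
    lines = []
    for name, value in inject.items():
        if not name.isidentifier() or keyword.iskeyword(name):
            raise ValueError(f"not a usable variable name: {name!r}")
        lines.append(f"{name} = {value!r}\n")
    return "".join(lines)


# JSON values decoded by Python (int, str, list, bool, None) all have a repr
# that is valid Python source, so the prefix can be executed directly.
prefix = build_inject_prefix(
    {"n": 5, "name": "Alice", "xs": [1, 2], "flag": True, "missing": None}
)
run_code = prefix + "print(n * n)"
print(run_code)
```

Two behaviours of the committed code follow from `if inject:` and are worth knowing when writing tests: an empty `inject` dict is falsy, so such a test case falls back to the stdin path, and the prefix shifts line numbers in any traceback the student sees by one line per injected variable.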

evaluation_function/evaluation_test.py

Lines changed: 43 additions & 0 deletions
@@ -18,6 +18,10 @@ def _test(inp, expected, hidden=False):
     return {"input": inp, "expected_output": expected, "hidden": hidden}
 
 
+def _inject_test(inject, expected, hidden=False):
+    return {"inject": inject, "expected_output": expected, "hidden": hidden}
+
+
 class TestEvaluationFunction(unittest.TestCase):
 
     def test_all_pass(self):
@@ -66,6 +70,45 @@ def test_missing_mode(self):
         self.assertIn("mode", result["feedback"])
 
 
+class TestInjectMode(unittest.TestCase):
+
+    def test_inject_pass(self):
+        params = _params(_inject_test({"n": 5}, "25\n"))
+        result = evaluation_function("print(n * n)", None, params).to_dict()
+
+        self.assertTrue(result["is_correct"])
+        self.assertIn("1/1 tests passed", result["feedback"])
+        self.assertIn("Variables", result["feedback"])
+        self.assertIn("n = 5", result["feedback"])
+
+    def test_inject_fail_shows_variables(self):
+        params = _params(_inject_test({"n": 5}, "999\n"))
+        result = evaluation_function("print(n * n)", None, params).to_dict()
+
+        self.assertFalse(result["is_correct"])
+        self.assertIn("Variables", result["feedback"])
+        self.assertIn("n = 5", result["feedback"])
+
+    def test_inject_multiple_vars(self):
+        params = _params(_inject_test({"a": 3, "b": 4}, "7\n"))
+        result = evaluation_function("print(a + b)", None, params).to_dict()
+
+        self.assertTrue(result["is_correct"])
+
+    def test_inject_hidden_suppresses_variables(self):
+        params = _params(_inject_test({"n": 5}, "999\n", hidden=True))
+        result = evaluation_function("print(n * n)", None, params).to_dict()
+
+        self.assertFalse(result["is_correct"])
+        self.assertNotIn("n = 5", result["feedback"])
+
+    def test_inject_string_value(self):
+        params = _params(_inject_test({"name": "Alice"}, "Hello, Alice\n"))
+        result = evaluation_function('print(f"Hello, {name}")', None, params).to_dict()
+
+        self.assertTrue(result["is_correct"])
+
+
 _PLOT_CODE = "import matplotlib.pyplot as plt\nplt.plot([1, 2, 3])\n"
 _MULTI_PLOT_CODE = (
     "import matplotlib.pyplot as plt\n"
