# Description of Changes
The current keynote-2 benchmark pipelines operations via
`MAX_INFLIGHT_PER_WORKER` to simulate a large number of client
connections when running locally or on a single machine.
This patch adds a distributed benchmark mode for `templates/keynote-2`
so explicit SpacetimeDB client connections can be spread across multiple
machines without changing the existing single-process benchmark flow.
This is a pure extension. `npm run bench` and the current
`src/core/runner.ts` path remain intact. The new distributed path adds a
small coordinator/generator/control-plane harness specifically for
multi-machine TypeScript client runs.
- New CLI entry points `bench-dist-coordinator`, `bench-dist-generator`,
and `bench-dist-control` were added
- The coordinator defines the benchmark window
- Generators begin submitting requests during warmup, but warmup
transactions are excluded from TPS
- Throughput is measured from the server-side committed transfer
counter, not client-local TPS
- Each connection runs closed-loop with one request at a time in this
distributed mode
- Connection startup is bounded-parallel (`--open-parallelism`) to avoid
a connection storm
- Verification is run by the coordinator after the epoch
- Late generators can be registered after a run to increase load on the
server incrementally
- If a participating generator dies and never sends `/stopped`, the
epoch result is flagged with an error so the run can be retried cleanly
See `DEVELOP.md` for instructions on how to run.
# API and ABI breaking changes
N/A
# Expected complexity level and risk
3
# Testing
Manual
File: `templates/keynote-2/DEVELOP.md` (179 additions, 0 deletions)
@@ -186,6 +186,185 @@ npm run bench test-1 --alpha 2.0

```bash
npm run bench test-1 --connectors spacetimedb,sqlite
```

### 2. Run the distributed TypeScript SpacetimeDB benchmark

Use this mode when you want to spread explicit TypeScript client connections across multiple machines. The existing `npm run bench` flow is still the single-process benchmark; the distributed flow is a separate coordinator + generator setup.

The commands below are written so they run unchanged on a single machine. For a true multi-machine run, replace `127.0.0.1` with the actual coordinator and server hostnames or IP addresses reachable from each generator machine.

#### Machine roles

- **Server machine**: runs SpacetimeDB and hosts the benchmarked module.
- **Coordinator machine**: runs `bench-dist-coordinator` and `bench-dist-control`. It may also run one or more generators if you want.
- **Generator machines**: run `bench-dist-generator`. You can run multiple generator processes on the same machine as long as each one has a unique `--id`.

#### Distributed setup

All coordinator and generator machines should use the same `templates/keynote-2` checkout and have dependencies installed:

```bash
cd templates/keynote-2
pnpm install
cp .env.example .env
```

Generate TypeScript bindings in that checkout on each machine that will run the coordinator or a generator:

If you are using a named server instead of `local`, replace `--server local` with the correct server name.

#### Step 3: Start the coordinator

On the **coordinator machine**:

```bash
cd templates/keynote-2

pnpm run bench-dist-coordinator -- \
  --test test-1 \
  --connector spacetimedb \
  --warmup-seconds 15 \
  --window-seconds 30 \
  --verify 1 \
  --stdb-url ws://127.0.0.1:3000 \
  --stdb-module test-1 \
  --bind 127.0.0.1 \
  --port 8080
```

Notes:

- `--warmup-seconds` is the unmeasured warmup period. Generators submit requests during warmup, but those transactions are excluded from TPS.
- `--window-seconds` is the measured interval.
- `--verify 1` preserves the existing benchmark semantics by running one verification pass centrally after the epoch completes.
- The coordinator derives the HTTP metrics endpoint from `--stdb-url` by switching to `http://` or `https://` and appending `/v1/metrics`.
- For a real multi-machine run, change `--bind 127.0.0.1` to `--bind 0.0.0.0` so remote generators can reach the coordinator.
- For a real multi-machine run, set `--stdb-url` to the server machine's reachable address.
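
The endpoint derivation described in the notes above can be sketched roughly as follows. This is a minimal illustration, not the coordinator's actual code: the function name `deriveMetricsUrl` is hypothetical, and the handling of trailing slashes and unsupported schemes is an assumption.

```typescript
// Hypothetical helper: maps a SpacetimeDB WebSocket URL to its HTTP metrics URL.
// Assumption: ws:// maps to http:// and wss:// maps to https://.
function deriveMetricsUrl(stdbUrl: string): string {
  // Swap the WebSocket scheme for the matching HTTP scheme.
  const httpUrl = stdbUrl.replace(/^ws(s?):\/\//, 'http$1://');
  if (!/^https?:\/\//.test(httpUrl)) {
    throw new Error(`unsupported URL scheme: ${stdbUrl}`);
  }
  // Drop any trailing slashes before appending the metrics path.
  return httpUrl.replace(/\/+$/, '') + '/v1/metrics';
}

console.log(deriveMetricsUrl('ws://127.0.0.1:3000'));
// prints "http://127.0.0.1:3000/v1/metrics"
```

If the derived URL is not reachable from the coordinator machine, the committed transaction counter cannot be read, so it is worth checking this address by hand before a multi-machine run.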
#### Step 4: Start generators on one or more client machines

On **generator machine 1**:

```bash
cd templates/keynote-2

pnpm run bench-dist-generator -- \
  --id gen-a \
  --coordinator-url http://127.0.0.1:8080 \
  --test test-1 \
  --connector spacetimedb \
  --concurrency 2500 \
  --accounts 100000 \
  --alpha 1.5 \
  --open-parallelism 128 \
  --control-retries 3 \
  --stdb-url ws://127.0.0.1:3000 \
  --stdb-module test-1
```

On **generator machine 2**:

```bash
cd templates/keynote-2

pnpm run bench-dist-generator -- \
  --id gen-b \
  --coordinator-url http://127.0.0.1:8080 \
  --test test-1 \
  --connector spacetimedb \
  --concurrency 2500 \
  --accounts 100000 \
  --alpha 1.5 \
  --open-parallelism 128 \
  --control-retries 3 \
  --stdb-url ws://127.0.0.1:3000 \
  --stdb-module test-1
```

Repeat that on as many generator machines as needed, adjusting `--id` and `--concurrency` for each process.
For a real multi-machine run, replace `127.0.0.1` with the coordinator host in `--coordinator-url` and the SpacetimeDB server host in `--stdb-url`.

`--open-parallelism` controls connection ramp-up only. It deliberately avoids a connection storm by opening connections in bounded parallel batches.
`--control-retries` sets the retry cap for `register`, `ready`, `/state`, and `/stopped`. The default is `3`.
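
To make the ramp-up behavior of `--open-parallelism` concrete, here is a minimal sketch of opening connections in bounded batches. It is an illustration under assumptions, not the generator's actual code: `openInBatches` and the `openOne` callback are hypothetical names, and the real harness may schedule its batches differently.

```typescript
// Hypothetical sketch: open `total` connections, at most `parallelism` at a time.
// `openOne` stands in for whatever opens a single client connection.
async function openInBatches<T>(
  total: number,
  parallelism: number,
  openOne: (index: number) => Promise<T>,
): Promise<T[]> {
  if (!Number.isInteger(parallelism) || parallelism < 1) {
    throw new Error('parallelism must be a positive integer');
  }
  const opened: T[] = [];
  for (let start = 0; start < total; start += parallelism) {
    const end = Math.min(start + parallelism, total);
    const batch: Promise<T>[] = [];
    for (let i = start; i < end; i++) {
      batch.push(openOne(i));
    }
    // Wait for the whole batch before starting the next one, so no more
    // than `parallelism` opens are ever in flight.
    const results = await Promise.all(batch);
    for (const r of results) opened.push(r);
  }
  return opened;
}
```

With `--concurrency 2500` and `--open-parallelism 128`, a scheme like this would open the connections in 20 batches (19 full batches of 128 and one of 68) rather than issuing 2500 opens at once.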
#### Step 5: Confirm generators are ready

On the **coordinator machine**:

```bash
cd templates/keynote-2

pnpm run bench-dist-control -- status --coordinator-url http://127.0.0.1:8080
```

Wait until each generator shows `state=ready` and `opened=N/N`.

#### Step 6: Start an epoch

On the **coordinator machine**:

```bash
cd templates/keynote-2

pnpm run bench-dist-control -- start-epoch --coordinator-url http://127.0.0.1:8080 --label run-1
```

`start-epoch` waits for the epoch to finish, then prints the final result.

#### Step 7: Check results

Each completed epoch writes one JSON result file on the coordinator machine under:

```text
templates/keynote-2/runs/distributed/
```

The result contains:

- participating generator IDs
- total participating connections
- committed transaction delta from the server metrics endpoint
- measured window duration
- computed TPS
- verification result
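
The relationship between the committed transaction delta, the window duration, and the computed TPS can be sketched as below. This is an illustrative assumption about the arithmetic only: the `EpochSample` shape and its field names are hypothetical and are not the actual schema of the JSON result file.

```typescript
// Hypothetical shape; the field names are illustrative, not the real result schema.
interface EpochSample {
  committedAtWindowStart: number; // server counter read after warmup ends
  committedAtWindowEnd: number;   // server counter read when the window closes
  windowSeconds: number;          // measured window duration in seconds
}

// TPS is the committed-transaction delta divided by the measured window.
function computeTps(sample: EpochSample): number {
  if (sample.windowSeconds <= 0) {
    throw new Error('windowSeconds must be positive');
  }
  const delta = sample.committedAtWindowEnd - sample.committedAtWindowStart;
  return delta / sample.windowSeconds;
}

console.log(
  computeTps({ committedAtWindowStart: 1000, committedAtWindowEnd: 31000, windowSeconds: 30 }),
);
// prints 1000
```

Because the first counter reading is taken after warmup, transactions committed during warmup never enter the delta.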
#### Operational notes

- Start the coordinator before the generators.
- Generators begin submitting requests when the coordinator enters `warmup`, not when the measured window begins.
- Throughput is measured only from the committed transaction counter delta recorded after warmup, so warmup transactions are excluded.
- For this distributed TypeScript mode, each connection runs closed-loop with one request at a time. There is no pipelining in this flow.
- Late generators are allowed to register and become ready while an epoch is already running, but they only participate in the next epoch.
- The coordinator does not use heartbeats. It includes generators that most recently reported `ready`.
- If a participating generator dies and never sends `/stopped`, the epoch result is written with an `error`, and that generator remains `running` in coordinator status until you restart it and let it register again.
- You can run multiple generator processes on the same machine if you want to test the harness locally. Just make sure each process uses a unique `--id`.
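
The participation and failure rules in the notes above can be sketched as two small functions. This is a hedged illustration, not the coordinator's implementation: `selectParticipants` and `epochError` are hypothetical names, and the `registered` and `stopped` state names are assumptions (only `ready` and `running` appear in the notes).

```typescript
// Assumed state names; only `ready` and `running` are mentioned in the notes above.
type GeneratorState = 'registered' | 'ready' | 'running' | 'stopped';

// At epoch start, include only generators whose latest reported state is `ready`.
function selectParticipants(states: Map<string, GeneratorState>): string[] {
  const ids: string[] = [];
  states.forEach((state, id) => {
    if (state === 'ready') ids.push(id);
  });
  return ids.sort();
}

// At epoch end, flag an error if any participant never reported `/stopped`.
function epochError(participants: string[], stoppedIds: Set<string>): string | null {
  const missing = participants.filter((id) => !stoppedIds.has(id));
  return missing.length === 0
    ? null
    : `generators did not report stopped: ${missing.join(', ')}`;
}

const states = new Map<string, GeneratorState>([
  ['gen-a', 'ready'],
  ['gen-b', 'running'], // still busy from an earlier epoch, so it is skipped
  ['gen-c', 'ready'],
]);
const participants = selectParticipants(states);
console.log(participants); // prints [ 'gen-a', 'gen-c' ]
console.log(epochError(participants, new Set(['gen-a'])));
// prints "generators did not report stopped: gen-c"
```

Without heartbeats, a missing `/stopped` report is the only signal that a generator died mid-epoch, which is why the result is flagged rather than silently computed.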