Commit a673ec1

Address PR review feedback for optimizer tutorial
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent b58f9ef commit a673ec1

2 files changed

Lines changed: 33 additions & 33 deletions

docs/toolhive/guides-vmcp/optimizer.mdx

Lines changed: 14 additions & 8 deletions
@@ -148,11 +148,21 @@ For the complete field reference, see the
 
 :::warning[ARM64 compatibility]
 
-The default TEI CPU images depend on Intel MKL, which is x86_64-only. No
-official ARM64 images exist yet. On ARM64 nodes (including Apple Silicon with
-kind), you can run the amd64 image under emulation as a workaround.
+The default TEI CPU images depend on Intel MKL, which is x86_64-only. Native
+ARM64 support has been merged upstream but is not yet included in a published
+release. Track the
+[TEI GitHub repository](https://github.com/huggingface/text-embeddings-inference)
+for updates on ARM64 image availability.
+
+In the meantime, you can run the amd64 image under emulation on ARM64 nodes. If
+you are using Docker Desktop, you must first disable the containerd image store
+(**Settings > General > uncheck "Use containerd for pulling and storing
+images" > Apply & Restart**). Without this, `kind load docker-image` silently
+fails because the containerd store preserves multi-arch manifest indexes that
+kind cannot import. See
+[kind#3795](https://github.com/kubernetes-sigs/kind/issues/3795) for details.
 
-First, pull the amd64 image and load it into your cluster:
+Then pull the amd64 image and load it into your cluster:
 
 ```bash
 docker pull --platform linux/amd64 \
@@ -178,10 +188,6 @@ spec:
   image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.7
 ```
 
-Native ARM64 support is in progress upstream. Track the
-[TEI GitHub repository](https://github.com/huggingface/text-embeddings-inference)
-for updates.
-
 :::
 
 ## Tune the optimizer
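
Note on the hunks above: the diff shows only the first line of the `docker pull` command, because the rest of that code block is unchanged in the doc. A minimal end-to-end sketch of the emulation workaround follows, assuming the `cpu-1.7` image tag shown in the second hunk and the `toolhive` kind cluster name that appears in the tutorial's cleanup section below; the doc's actual unchanged lines may differ.

```bash
# Pull the x86_64 (amd64) TEI image explicitly; on an ARM64 host it runs under emulation.
docker pull --platform linux/amd64 \
  ghcr.io/huggingface/text-embeddings-inference:cpu-1.7

# Optional sanity check: confirm the local image is linux/amd64.
docker image inspect --format '{{.Os}}/{{.Architecture}}' \
  ghcr.io/huggingface/text-embeddings-inference:cpu-1.7

# Load the image into the kind cluster so the EmbeddingServer pod does not pull it.
# Requires the containerd image store to be disabled, as explained in the hunk above.
kind load docker-image \
  ghcr.io/huggingface/text-embeddings-inference:cpu-1.7 \
  --name toolhive
```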

docs/toolhive/tutorials/mcp-optimizer.mdx

Lines changed: 19 additions & 25 deletions
@@ -31,9 +31,8 @@ Server (vMCP) and an EmbeddingServer for semantic tool search.
 - How to create an MCPGroup with multiple backend MCP servers
 - How to deploy an EmbeddingServer for semantic search
 - How to create a VirtualMCPServer with the optimizer enabled
-- How to connect your AI client to the optimized endpoint
-- How to verify the optimizer reduces the visible toolset to `find_tool` and
-  `call_tool`
+- How to connect your AI client to the optimized endpoint and verify it exposes
+  only `find_tool` and `call_tool`
 
 ## About MCP Optimizer
 
@@ -94,10 +93,12 @@ Before starting this tutorial, make sure you have:
 :::warning[ARM64 compatibility]
 
 The default text embeddings inference (TEI) images depend on Intel MKL, which is
-x86_64-only. If you are using Apple Silicon or any other ARM64 node (including
-kind on macOS), you need to pre-pull the amd64 image before proceeding. See
-[ARM64 compatibility](../guides-vmcp/optimizer.mdx#arm64-compatibility) for the
-workaround steps.
+x86_64-only. Native ARM64 support has been merged upstream but is not yet
+included in a published release. If you are using Apple Silicon or any other
+ARM64 node (including kind on macOS), you can run the amd64 image under
+emulation as a workaround. See the
+[EmbeddingServer resource](../guides-vmcp/optimizer.mdx#embeddingserver-resource)
+section for the required steps, including a Docker Desktop configuration change.
 
 :::
 
@@ -125,7 +126,7 @@ kubectl apply -f mcpgroup.yaml
 Next, deploy two MCP servers in the group. Both reference `optimizer-demo` in
 the `groupRef` field:
 
-```yaml {11,30} title="mcpservers.yaml"
+```yaml {11-12,31-32} title="mcpservers.yaml"
 apiVersion: toolhive.stacklok.dev/v1alpha1
 kind: MCPServer
 metadata:
@@ -136,7 +137,8 @@ spec:
   transport: streamable-http
   proxyPort: 8080
   mcpPort: 8080
-  groupRef: optimizer-demo
+  groupRef:
+    name: optimizer-demo
   resources:
     limits:
       cpu: '100m'
@@ -155,7 +157,8 @@ spec:
   transport: streamable-http
   proxyPort: 8080
   mcpPort: 8080
-  groupRef: optimizer-demo
+  groupRef:
+    name: optimizer-demo
   resources:
     limits:
       cpu: '100m'
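
The two hunks above change `groupRef` from a plain string to an object with a nested `name` field. A quick way to check which shape your installed CRD expects and that both servers reconciled, sketched here on the assumption that the CRD publishes its schema to `kubectl explain`; the server names `fetch` and `osv` and the `toolhive-system` namespace are taken from the cleanup commands later in this diff.

```bash
# Show the schema for spec.groupRef in the installed MCPServer CRD;
# an object type here means the nested `name:` form from this commit is required.
kubectl explain mcpserver.spec.groupRef

# List the two MCP servers and confirm they picked up the change.
kubectl get mcpserver fetch osv -n toolhive-system
```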
@@ -234,7 +237,7 @@ Create a VirtualMCPServer that aggregates the backend servers and enables the
 optimizer. Adding `embeddingServerRef` is the only change needed to enable the
 optimizer - sensible defaults are applied automatically:
 
-```yaml {8-9} title="virtualmcpserver.yaml"
+```yaml title="virtualmcpserver.yaml"
 apiVersion: toolhive.stacklok.dev/v1alpha1
 kind: VirtualMCPServer
 metadata:
@@ -249,7 +252,8 @@ spec:
     type: anonymous
   serviceType: ClusterIP
   config:
-    groupRef: optimizer-demo
+    groupRef:
+      name: optimizer-demo
     aggregation:
       conflictResolution: prefix
       conflictResolutionConfig:
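
To verify the aggregation layer after applying the change above, a small status check; the resource names `optimizer-vmcp` and `optimizer-embedding` and the `toolhive-system` namespace are taken from the cleanup commands later in this diff, and the status columns depend on what the operator reports.

```bash
# Check that the VirtualMCPServer and its EmbeddingServer exist and are reconciling.
kubectl get virtualmcpserver optimizer-vmcp -n toolhive-system
kubectl get embeddingserver optimizer-embedding -n toolhive-system

# Inspect the stored spec to confirm the nested groupRef and embeddingServerRef took effect.
kubectl get virtualmcpserver optimizer-vmcp -n toolhive-system -o yaml
```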
@@ -348,9 +352,10 @@ To check your token savings, send this prompt to your AI client:
 
 ## Clean up
 
-Delete the resources when you're done:
+Remove the local workload and delete the Kubernetes resources when you're done:
 
 ```bash
+thv rm optimizer-vmcp
 kubectl delete virtualmcpserver optimizer-vmcp -n toolhive-system
 kubectl delete embeddingserver optimizer-embedding -n toolhive-system
 kubectl delete mcpserver fetch osv -n toolhive-system
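
After running the updated cleanup commands, one way to confirm the cluster side is clean; the resource kinds below are the ones the delete commands in this hunk use, and each listing should report "No resources found".

```bash
# Confirm the optimizer resources were removed from the cluster.
kubectl get virtualmcpserver,embeddingserver,mcpserver -n toolhive-system
```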
@@ -363,19 +368,8 @@ To tear down the entire kind cluster from the K8s Quickstart:
 kind delete cluster --name toolhive
 ```
 
-:::note[Legacy: standalone MCP Optimizer]
-
-The standalone `mcp-optimizer` container can also run alongside ToolHive on
-desktop (macOS, Windows, Linux) without Kubernetes. This approach is being
-replaced by the Kubernetes-based optimizer described above. For the standalone
-setup, see the [MCP Optimizer UI guide](../guides-ui/mcp-optimizer.mdx).
-
-:::
-
 ## Next steps
 
-Now that you've set up the MCP Optimizer, consider exploring these next steps:
-
 - [Tune the optimizer](../guides-vmcp/optimizer.mdx#tune-the-optimizer) to
   adjust search parameters for your workload
 - [Configure authentication](../guides-vmcp/authentication.mdx) for production
@@ -396,6 +390,6 @@ Now that you've set up the MCP Optimizer, consider exploring these next steps:
 - [Virtual MCP Server overview](../concepts/vmcp.mdx) - conceptual overview of
   vMCP
 - [MCP Optimizer UI guide](../guides-ui/mcp-optimizer.mdx) - standalone desktop
-  approach (legacy)
+  approach without Kubernetes (legacy, being replaced by the vMCP path)
 - [Quickstart: Kubernetes Operator](../guides-k8s/quickstart.mdx) - prerequisite
   tutorial
