
Commit fdc70b9

Use ARM64 TEI images, move groupRef to spec level
Replace the ARM64 emulation workaround with the now-published cpu-arm64-latest
image. Move groupRef from spec.config to spec level in all VirtualMCPServer
examples to match the current CRD. Address remaining PR review feedback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent a673ec1 commit fdc70b9
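The groupRef relocation described in the commit message amounts to the following before/after fragment (the group name is taken from the guide example below; surrounding fields are omitted):

```yaml
# Before: groupRef nested under spec.config
spec:
  config:
    groupRef:
      name: my-tools

# After this commit: groupRef at spec level, matching the current CRD
spec:
  groupRef:
    name: my-tools
```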

2 files changed: 23 additions & 43 deletions

docs/toolhive/guides-vmcp/optimizer.mdx (6 additions, 31 deletions)
````diff
@@ -146,37 +146,10 @@ are:
 For the complete field reference, see the
 [EmbeddingServer CRD specification](../reference/crd-spec.md#apiv1alpha1embeddingserver).
 
-:::warning[ARM64 compatibility]
+:::tip[ARM64 support]
 
-The default TEI CPU images depend on Intel MKL, which is x86_64-only. Native
-ARM64 support has been merged upstream but is not yet included in a published
-release. Track the
-[TEI GitHub repository](https://github.com/huggingface/text-embeddings-inference)
-for updates on ARM64 image availability.
-
-In the meantime, you can run the amd64 image under emulation on ARM64 nodes. If
-you are using Docker Desktop, you must first disable the containerd image store
-(**Settings > General > uncheck "Use containerd for pulling and storing
-images" > Apply & Restart**). Without this, `kind load docker-image` silently
-fails because the containerd store preserves multi-arch manifest indexes that
-kind cannot import. See
-[kind#3795](https://github.com/kubernetes-sigs/kind/issues/3795) for details.
-
-Then pull the amd64 image and load it into your cluster:
-
-```bash
-docker pull --platform linux/amd64 \
-  ghcr.io/huggingface/text-embeddings-inference:cpu-1.7
-kind load docker-image \
-  ghcr.io/huggingface/text-embeddings-inference:cpu-1.7
-```
-
-The `kind load` command is specific to kind. For other cluster distributions,
-use the equivalent image-loading mechanism (for example, `ctr images import` for
-containerd, or push the image to a registry your cluster can pull from).
-
-Then, pin the image in your EmbeddingServer so the operator uses the pre-pulled
-tag instead of the default `cpu-latest`:
+The default TEI image (`cpu-latest`) is x86_64-only. If you are running on ARM64
+nodes (for example, Apple Silicon), override the image in your EmbeddingServer:
 
 ```yaml title="embedding-server.yaml"
 apiVersion: toolhive.stacklok.dev/v1alpha1
@@ -185,7 +158,7 @@ metadata:
   name: my-embedding
   namespace: toolhive-system
 spec:
-  image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.7
+  image: ghcr.io/huggingface/text-embeddings-inference:cpu-arm64-latest
 ```
 
 :::
@@ -294,6 +267,8 @@ metadata:
   name: full-vmcp
   namespace: toolhive-system
 spec:
+  groupRef:
+    name: my-tools
   embeddingServerRef:
     name: full-embedding
   groupRef:
````
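Assembled from the two hunks above, the guide's ARM64 EmbeddingServer example now reads as follows (the `kind` line falls outside the hunks and is assumed from the resource name):

```yaml
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: EmbeddingServer # assumed; this line is outside the diff hunks shown
metadata:
  name: my-embedding
  namespace: toolhive-system
spec:
  image: ghcr.io/huggingface/text-embeddings-inference:cpu-arm64-latest
```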

docs/toolhive/tutorials/mcp-optimizer.mdx (17 additions, 12 deletions)
```diff
@@ -31,8 +31,7 @@ Server (vMCP) and an EmbeddingServer for semantic tool search.
 - How to create an MCPGroup with multiple backend MCP servers
 - How to deploy an EmbeddingServer for semantic search
 - How to create a VirtualMCPServer with the optimizer enabled
-- How to connect your AI client to the optimized endpoint and verify it exposes
-  only `find_tool` and `call_tool`
+- How to connect your AI client to the optimized endpoint
 
 ## About MCP Optimizer
 
@@ -90,15 +89,13 @@ Before starting this tutorial, make sure you have:
 - An MCP client (Visual Studio Code with GitHub Copilot is used in this
   tutorial)
 
-:::warning[ARM64 compatibility]
+:::tip[ARM64 support]
 
-The default text embeddings inference (TEI) images depend on Intel MKL, which is
-x86_64-only. Native ARM64 support has been merged upstream but is not yet
-included in a published release. If you are using Apple Silicon or any other
-ARM64 nodes (including kind on macOS), you can run the amd64 image under
-emulation as a workaround. See the
+The default TEI image is x86_64-only. If you are running on ARM64 nodes (for
+example, Apple Silicon with kind), set the `image` field in your EmbeddingServer
+to use the ARM64 image. See
 [EmbeddingServer resource](../guides-vmcp/optimizer.mdx#embeddingserver-resource)
-section for the required steps, including a Docker Desktop configuration change.
+for details.
 
 :::
 
@@ -245,15 +242,15 @@ metadata:
   namespace: toolhive-system
 spec:
   # highlight-start
+  groupRef:
+    name: optimizer-demo
   embeddingServerRef:
     name: optimizer-embedding
   # highlight-end
   incomingAuth:
     type: anonymous
   serviceType: ClusterIP
   config:
-    groupRef:
-      name: optimizer-demo
     aggregation:
       conflictResolution: prefix
       conflictResolutionConfig:
@@ -350,6 +347,14 @@ To check your token savings, send this prompt to your AI client:
 
 - "How many tokens did I save using MCP Optimizer?"
 
+:::note
+
+With only two backend MCP servers and a small number of tools, the optimizer may
+report minimal or no token savings. The benefit becomes more significant as you
+add more backends and tools to your MCPGroup.
+
+:::
+
 ## Clean up
 
 Remove the local workload and delete the Kubernetes resources when you're done:
@@ -384,7 +389,7 @@ kind delete cluster --name toolhive
 ## Related information
 
 - [Optimize tool discovery](../guides-vmcp/optimizer.mdx) - full parameter
-  reference, high availability, and ARM64 workaround details
+  reference, high availability, and ARM64 support details
 - [Optimizing LLM context](../concepts/tool-optimization.mdx) - background on
   tool filtering and context pollution
 - [Virtual MCP Server overview](../concepts/vmcp.mdx) - conceptual overview of
```
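Put together from the tutorial hunk above, the highlighted VirtualMCPServer spec now reads as follows (fields above `namespace` and below `conflictResolution` fall outside the hunk and are omitted):

```yaml
spec:
  # highlight-start
  groupRef:
    name: optimizer-demo
  embeddingServerRef:
    name: optimizer-embedding
  # highlight-end
  incomingAuth:
    type: anonymous
  serviceType: ClusterIP
  config:
    aggregation:
      conflictResolution: prefix
```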
