SuperMarioYL/README.md
Leo — AI systems, made to run in production.

I build the infra that makes LLM agents reliable in production — inference serving, MCP tool layers, multi-agent orchestration, and eval/observability.

How my agents run

User → Orchestrator → Tools/MCP + Memory/RAG → Inference, on a Cloud Native AI substrate, instrumented by Eval & Observability

Every tool call and LLM span is traced; guardrails gate actions; eval feedback closes the loop — agents as observable, cost-bounded systems on a cloud-native substrate.
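As a rough illustration of the idea above (traced spans, a guardrail gating each action, and a cost bound), here is a minimal Python sketch. It is not taken from any of the projects listed on this page. The names `Tracer`, `Span`, `budget`, and `allowed_tools` are hypothetical, and the guardrail is assumed to be a simple tool allowlist plus a per-run cost ceiling.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Span:
    name: str          # tool or LLM call name
    duration_s: float  # wall-clock time spent in the call
    cost: float        # caller-supplied cost estimate (hypothetical unit)
    allowed: bool      # False if the guardrail blocked the call


@dataclass
class Tracer:
    budget: float            # hypothetical cost ceiling for one agent run
    allowed_tools: set[str]  # hypothetical guardrail: tool allowlist
    spans: list[Span] = field(default_factory=list)

    @property
    def spent(self) -> float:
        # Only calls that actually ran count against the budget.
        return sum(s.cost for s in self.spans if s.allowed)

    def call(self, name: str, fn: Callable[..., Any], cost: float, *args: Any) -> Any:
        """Run fn as a traced span if the allowlist and the budget permit it."""
        allowed = name in self.allowed_tools and self.spent + cost <= self.budget
        start = time.perf_counter()
        result = fn(*args) if allowed else None
        # Blocked calls are recorded too, so the trace shows what was refused.
        self.spans.append(Span(name, time.perf_counter() - start, cost, allowed))
        return result


tracer = Tracer(budget=1.0, allowed_tools={"search", "llm"})
r1 = tracer.call("search", lambda q: f"results for {q}", 0.2, "vllm")
r2 = tracer.call("shell", lambda c: c, 0.1, "ls")           # blocked: not on the allowlist
r3 = tracer.call("llm", lambda p: p.upper(), 0.9, "recap")  # blocked: would exceed the budget
```

Recording refused calls alongside successful ones is a design choice in this sketch: it keeps the trace complete enough for an eval step to inspect what the guardrail rejected.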

Capabilities

AI Agent: plan→act→reflect loop · Cloud Native: scheduler + pods · Inference: lower latency, higher throughput
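The plan→act→reflect loop named above can be sketched in a few lines of Python. This is an assumed, generic shape for such a loop, not the author's implementation: `run_agent`, `plan`, `act`, `reflect`, and `max_steps` are hypothetical names, and the toy callables stand in for an LLM planner, a tool call, and an evaluator.

```python
from typing import Any, Callable

History = list[tuple[Any, Any]]  # (planned step, observation) pairs


def run_agent(
    goal: Any,
    plan: Callable[[Any, History], Any],
    act: Callable[[Any], Any],
    reflect: Callable[[Any, History], bool],
    max_steps: int = 5,
) -> History:
    """Repeat plan -> act -> reflect until reflect reports success or steps run out."""
    history: History = []
    for _ in range(max_steps):
        step = plan(goal, history)        # decide the next action from goal and history
        observation = act(step)           # execute it (for example, a tool call)
        history.append((step, observation))
        if reflect(goal, history):        # stop once the goal is judged to be met
            break
    return history


# Toy stand-ins: the "goal" is to reach the number 3.
history = run_agent(
    goal=3,
    plan=lambda goal, hist: len(hist) + 1,
    act=lambda step: step,
    reflect=lambda goal, hist: hist[-1][1] >= goal,
)
```

The `max_steps` cap keeps the loop bounded even when `reflect` never reports success.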

Tech stack

Tech stack grouped by pillar: AI Agent, Cloud Native AI, Inference

Journey

From infrastructure to agents: Cloud Native AI → Inference Acceleration → AI Agent

Selected work


Let's build reliable AI systems together · blog.lei6393.com


Pinned

  1. trouve Public

    trouve: a general-purpose component for Spring projects with built-in service discovery, service registration, and service forwarding

    Java · 31 stars · 9 forks

  2. Bison Public

    Enterprise GPU Resource Billing & Multi-Tenant Management Platform

    TypeScript · 7 stars

  3. vllm-project/vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 85.3k stars · 18.9k forks

  4. NousResearch/hermes-agent Public

    The agent that grows with you

    Python · 208k stars · 37.9k forks

  5. kubernetes/kubernetes Public

    Production-Grade Container Scheduling and Management

    Go · 123k stars · 43.4k forks

  6. sgl-project/sglang Public

    SGLang is a high-performance serving framework for large language models and multimodal models.

    Python · 29.9k stars · 6.9k forks