A self-hosted systems programming language with multi-language keywords
Seen is a compiled systems programming language where the compiler is written entirely in Seen itself. It targets LLVM, ships with a built-in LSP, and lets you write code using keywords in English, Arabic, Spanish, Russian, Chinese, or Japanese.
fun main() {
    let names = ["Alice", "Bob", "Charlie"]
    for name in names {
        println("Hello, {name}!")
    }
}
The same program in Arabic:
دالة رئيسية() {
    اجعل أسماء = ["أحمد", "سارة", "خالد"]
    لكل اسم في أسماء {
        اطبع("مرحبا، {اسم}!")
    }
}
And in Chinese:
函数 主函数() {
    让 名字列表 = ["小明", "小红", "小华"]
    对于 名字 在 名字列表 {
        打印("你好,{名字}!")
    }
}
Performance -- Seen compiles through LLVM with ThinLTO, vectorization, and aggressive inlining. Benchmarks track within 1.0x--1.5x of equivalent Rust programs across 17 workloads (matrix multiplication, sieves, binary trees, n-body simulation, etc.).
Self-hosted -- The compiler (62,000+ lines of Seen across 123 source files) compiles itself. Bootstrap verification confirms the fixed-point: stage 2 and stage 3 produce identical binaries.
Fast compilation -- Fork-parallel IR generation across 50+ modules with content-addressed incremental caching. Only changed modules recompile.
Multi-language keywords -- Keywords are defined in TOML files under languages/. Adding a new language means adding a directory of TOML files -- no compiler changes required.
Region-based memory -- No garbage collector. Memory is managed through regions and arenas with compile-time lifetime tracking.
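To make the arena idea concrete, here is a minimal Python sketch of a bump-pointer arena. It is an illustration only, not Seen's runtime: the `Arena` class and its methods are hypothetical names, and the compile-time lifetime tracking described above is not modeled.

```python
class Arena:
    """Bump allocator over one fixed buffer; all allocations are freed together."""

    def __init__(self, capacity: int) -> None:
        self._buffer = bytearray(capacity)
        self._offset = 0  # next free byte in the buffer

    def alloc(self, size: int, align: int = 8) -> memoryview:
        # Round the current offset up to the requested alignment.
        start = (self._offset + align - 1) // align * align
        end = start + size
        if end > len(self._buffer):
            raise MemoryError("arena exhausted")
        self._offset = end
        return memoryview(self._buffer)[start:end]

    def reset(self) -> None:
        # Releasing the whole arena is a single assignment.
        self._offset = 0

    @property
    def used(self) -> int:
        return self._offset
```

Allocation is a pointer bump and `reset()` releases every allocation at once, which is the usual reason arenas need no per-object frees and no garbage collector.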
- LLVM 18+ (clang, opt, lld)
- GCC (for runtime compilation)
- Git
git clone https://github.com/codeyousef/SeenLang.git
cd SeenLang
./scripts/safe_rebuild.sh

The production compiler lands at compiler_seen/target/seen.

sudo cp compiler_seen/target/seen /usr/local/bin/seen

Or add to your shell profile:

export PATH="$PATH:/path/to/SeenLang/compiler_seen/target"

echo 'fun main() { println("Hello, Seen!") }' > hello.seen
seen build hello.seen -o hello
./hello

seen build source.seen -o output # Compile to native binary
seen build source.seen --fast # Fast build (skip Polly, O1)
seen run source.seen # JIT execution
seen check source.seen # Type check only
seen fmt source.seen # Format code
seen pkg fetch # Install package dependencies from Seen.toml
seen lsp # Start language server

| Flag | Description |
|---|---|
| --fast | Skip heavy optimizations, use O1 |
| --release | Full optimization with LTO |
| --emit-llvm | Dump generated LLVM IR |
| --backend c | Use C backend instead of LLVM |
| --debug | Enable debug symbols and tracing |
| --trace-llvm | Trace LLVM IR generation |
| --dump-struct-layouts | Print struct field layouts |
| --null-safety | Enable null safety checks |
| --warn-uninit | Warn on uninitialized variables |
| --stack-check | Enable stack overflow checks |

fun main() {
    let name = "Seen"
    var count = 0
    while count < 5 {
        count = count + 1
        if count == 3 {
            println("Three!")
        }
    }
    println("{name}: counted to {count}")
}
class Vec2 {
    var x: Float
    var y: Float
    static fun new(x: Float, y: Float) r: Vec2 {
        return Vec2 { x: x, y: y }
    }
    fun length() r: Float {
        return sqrt(this.x * this.x + this.y * this.y)
    }
    fun add(other: Vec2) r: Vec2 {
        return Vec2.new(this.x + other.x, this.y + other.y)
    }
}

fun main() {
    let a = Vec2.new(3.0, 4.0)
    let b = Vec2.new(1.0, 2.0)
    let c = a.add(b)
    println("Length: {c.length()}")
}
enum Shape {
    Circle(radius: Float)
    Rectangle(width: Float, height: Float)
}

fun area(shape: Shape) r: Float {
    return when shape {
        is Circle(r) => 3.14159 * r * r
        is Rectangle(w, h) => w * h
    }
}
trait Printable {
    fun display() r: String
}

impl Printable for Vec2 {
    fun display() r: String {
        return "({this.x}, {this.y})"
    }
}
fun max<T>(a: T, b: T) r: T {
    if a > b { return a }
    return b
}

class Stack<T> {
    var items: Array<T>
    fun push(item: T) {
        this.items.push(item)
    }
    fun pop() r: T {
        return this.items.pop()
    }
}
@async
fun fetchData(url: String) r: String {
    let response = await httpGet(url)
    return response.body
}
fun apply(arr: Array<Int>, f: Fun) r: Array<Int> {
    var result = Array<Int>()
    for item in arr {
        result.push(f(item))
    }
    return result
}

fun main() {
    let nums = [1, 2, 3, 4, 5]
    let doubled = apply(nums, |x| x * 2)
}
fun dot_product(a: Array<Float>, b: Array<Float>, n: Int) r: Float {
    var sum = f32x4(0.0, 0.0, 0.0, 0.0)
    var i = 0
    while i + 4 <= n {
        let va = simd_load_f32x4(a, i)
        let vb = simd_load_f32x4(b, i)
        sum = sum + va * vb
        i = i + 4
    }
    var total = reduce_add(sum)
    while i < n {
        total = total + a[i] * b[i]
        i = i + 1
    }
    return total
}
@compute(workgroup_size = 64)
fun vector_add(a: Buffer<Float>, b: Buffer<Float>, out: Buffer<Float>) {
    let idx = global_invocation_id.x
    out[idx] = a[idx] + b[idx]
}
fun main() {
    var results = Array<Int>.withLength(1000)
    parallel_for i in 0..1000 {
        results[i] = i * i
    }
}
comptime fun factorial(n: Int) r: Int {
    if n <= 1 { return 1 }
    return n * factorial(n - 1)
}

let TABLE_SIZE = comptime { factorial(10) }
fun readFile(path: String) r: String {
    let file = File.open(path)
    defer { file.close() }
    try {
        return file.readAll()
    } catch e {
        println("Error: {e}")
        return ""
    }
}
- Immutable by default (let), opt-in mutability (var)
- Nullable types (T?) with safe access (?.) and null coalescing (??)
- Generics with constraints (<T: Ord>)
- Type aliases and distinct types
- Result<T, E> and Option<T> types
- Classes with methods, inheritance, and traits
- Enums (simple and data-carrying)
- Structs (value types)
- Array<T>, Vec<T>, HashMap<K, V>, BTreeMap<K, V>, LinkedList<T>, SmallVec<T, N>
- Region-based memory (no GC)
- move, borrow, ref semantics
- defer for cleanup
- arena allocators
- @packed, @cache_line layout control
- async/await with LLVM coroutines
- parallel_for with fork-based parallelism
- Mutex, RwLock, Barrier, Channel, AtomicInt
- @send / @sync markers for thread safety
- comptime evaluation
- Decorators: @derive(Clone, Hash, Eq, Debug, Serialize, Deserialize, Json)
- @reflect for runtime type information
- @intrinsic for LLVM intrinsic mapping
- @compute, @vertex, @fragment shader annotations
- Buffer<T>, Uniform<T>, Image<T> types
- GLSL codegen with Vulkan runtime
- --emit-glsl to inspect generated shaders
- Vector types: i8x16, i16x8, i32x4, i64x2, f32x4, f64x2
- Arithmetic, comparison, shuffle, swizzle
- Horizontal reductions (reduce_add, reduce_min, reduce_max)
- Aligned load/store, gather/scatter
- extern fun for C FFI
- @cImport for C header inclusion
- @repr(C) for C-compatible struct layout
- Word operators: and, or, not (alongside &&, ||, !)
- String interpolation: "Hello, {name}!"
- Range: 0..n, 0..=n
- Pipe-style chaining
17 production benchmarks in benchmarks/production/:
| Benchmark | Description |
|---|---|
| 01_matrix_mult | Dense matrix multiplication |
| 02_sieve | Sieve of Eratosthenes |
| 03_binary_trees | GC-stress binary tree allocation |
| 04_fasta | FASTA sequence generation |
| 05_nbody | N-body planetary simulation |
| 06_revcomp | Reverse-complement DNA |
| 07_mandelbrot | Mandelbrot set rendering |
| 08_lru_cache | LRU cache with hash map |
| 09_json_serialize | JSON serialization |
| 11_spectral_norm | Spectral norm computation |
| 12_fannkuch | Fannkuch-redux permutations |
| 13_great_circle | Great-circle distance |
| 14_hyperbolic_pde | Hyperbolic PDE solver |
| 15_dft_spectrum | Discrete Fourier transform |
| 16_euler_totient | Euler's totient function |
| 17_fibonacci | Recursive Fibonacci |

Run benchmarks:
./scripts/run_production_benchmarks.sh

Comparison benchmarks against C, C++, Rust, and Zig are in benchmarks/comparison/.
Seen's keywords are defined externally in TOML files. Six languages ship with the compiler:
| Language | Directory | Example keyword for fun |
|---|---|---|
| English | languages/en/ | fun |
| Arabic | languages/ar/ | دالة |
| Spanish | languages/es/ | fun |
| Russian | languages/ru/ | функция |
| Chinese | languages/zh/ | 函数 |
| Japanese | languages/ja/ | 関数 |

Each language has 17 TOML files covering keywords, operators, and standard library names.
- Create languages/xx/ (where xx is the language code)
- Copy the English TOML files as templates
- Translate keyword values
- The compiler auto-detects available languages
No compiler rebuild required.
The vscode-seen/ directory contains a full-featured extension:
- Syntax highlighting with TextMate grammar
- IntelliSense via built-in LSP
- Real-time error diagnostics
- Code formatting, debugging, REPL
- Snippets for common patterns
- Multi-language keyword support
cd vscode-seen
npm install
npm run package
code --install-extension seen-*.vsix

Seen includes a built-in language server:

seen lsp

Neovim:

require'lspconfig'.seen.setup{
    cmd = {"seen", "lsp"},
    filetypes = {"seen"},
    root_dir = require'lspconfig.util'.root_pattern("Seen.toml", ".git"),
}

Emacs:

(lsp-register-client
 (make-lsp-client :new-connection (lsp-stdio-connection '("seen" "lsp"))
                  :major-modes '(seen-mode)
                  :server-id 'seen-lsp))

SeenLang/
├── compiler_seen/ # Self-hosted compiler (62K+ lines of Seen)
│   └── src/
│       ├── main.seen # Entry point
│       ├── lexer/ # Tokenizer with multi-language support
│       ├── parser/ # Recursive descent parser
│       ├── typechecker/ # Type inference and checking
│       ├── codegen/ # LLVM IR generation (13 modules)
│       ├── ir/ # IR builder and SSA construction
│       ├── bootstrap/ # Frontend orchestration
│       └── lsp/ # Language server implementation
├── bootstrap/ # Frozen bootstrap compiler
│   └── stage1_frozen # Verified binary (SHA-256 checked)
├── seen_std/ # Standard library (Seen)
├── seen_runtime/ # C runtime (memory, I/O, collections)
├── languages/ # Keyword definitions (6 languages, 102 TOML files)
├── vscode-seen/ # VS Code extension
├── tests/ # Test suites
│   └── e2e_multilang/ # 66 end-to-end tests across 6 languages
├── benchmarks/ # 17 production benchmarks + comparison suite
├── scripts/ # Build, test, and IR validation tools
├── installer/ # Platform installers (Linux, macOS, Windows)
└── docs/ # Design documents and specifications
The compiler follows a 5-stage pipeline:
Source (.seen)
→ Lexer (tokenize with language-specific keywords)
→ Parser (recursive descent → AST)
→ Type Checker (inference, validation, smart casts)
→ IR Generator (AST → LLVM IR, three-pass: signatures → types → bodies)
→ LLVM Backend (opt -O3 → ThinLTO → lld link)
→ Native Binary
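The three-pass IR generation step (signatures → types → bodies) can be pictured with the following Python sketch. It is a simplified model, not the compiler's code: `FunctionDecl`, `TYPE_MAP`, and `generate_ir` are hypothetical names, and the output is IR-like text rather than valid LLVM IR.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionDecl:
    name: str
    param_types: list[str]
    return_type: str
    calls: list[str] = field(default_factory=list)  # functions called in the body


# Hypothetical mapping from source type names to LLVM-style type names.
TYPE_MAP = {"Int": "i64", "Float": "double", "Bool": "i1"}


def generate_ir(decls: list[FunctionDecl]) -> list[str]:
    # Pass 1: collect every signature before looking at any body.
    signatures = {d.name: d for d in decls}

    # Pass 2: resolve the type names used by each signature.
    resolved = {
        d.name: (TYPE_MAP[d.return_type], [TYPE_MAP[t] for t in d.param_types])
        for d in decls
    }

    # Pass 3: emit bodies; every callee is already known, even if declared later.
    lines = []
    for d in decls:
        ret, params = resolved[d.name]
        lines.append(f"define {ret} @{d.name}({', '.join(params)}) {{")
        for callee in d.calls:
            if callee not in signatures:
                raise NameError(f"unknown function: {callee}")
            callee_ret, _ = resolved[callee]
            lines.append(f"  ; call {callee_ret} @{callee}(...)")
        lines.append("}")
    return lines


program = [
    FunctionDecl("main", [], "Int", calls=["helper"]),  # calls a later function
    FunctionDecl("helper", ["Int"], "Int"),
]
print("\n".join(generate_ir(program)))
```

Here `main` refers to `helper` before `helper` appears in the source order, and the body pass still resolves it because the signature table was filled first.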
Key architectural decisions:
- Fork-parallel codegen: Each module's IR is generated in a forked child process with copy-on-write memory
- Content-addressed IR cache: Cache key = hash(declarations_digest + module_source), so editing one function only recompiles that module
- Three-pass IR generation: First pass collects all signatures, second resolves types, third emits function bodies -- enables forward references without a separate declaration phase
- IR validation: scripts/seen_ir_verify.sh runs llvm-as structural checks and seen_ir_lint semantic checks on every .ll file before optimization
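As a rough illustration of how the first two decisions fit together, the Python sketch below forks one child per module that misses the cache and keys the cache on hash(declarations_digest + module_source). Everything beyond that formula is assumed: SHA-256 as the hash, pipes as the way children return IR, and the names `cache_key`, `generate_ir`, and `build_modules` are all hypothetical. It relies on `os.fork`, so it runs on Unix-like systems only.

```python
import hashlib
import os


def cache_key(declarations_digest: str, module_source: str) -> str:
    """hash(declarations_digest + module_source); SHA-256 is an assumption."""
    payload = declarations_digest + module_source
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def generate_ir(name: str, source: str) -> str:
    """Hypothetical stand-in for real per-module IR generation."""
    return f"; module {name} ({len(source)} bytes of source)\n"


def build_modules(modules: dict[str, str], declarations_digest: str,
                  cache: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Return (IR per module, names regenerated this run); updates cache."""
    results: dict[str, str] = {}
    pending = []
    for name, source in modules.items():
        key = cache_key(declarations_digest, source)
        if key in cache:
            results[name] = cache[key]  # cache hit: no child process needed
            continue
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: shares the parent's memory copy-on-write, writes IR, exits.
            status = 1
            try:
                os.close(read_fd)
                with os.fdopen(write_fd, "w") as pipe:
                    pipe.write(generate_ir(name, source))
                status = 0
            finally:
                os._exit(status)
        os.close(write_fd)
        pending.append((name, key, pid, read_fd))

    regenerated = []
    for name, key, pid, read_fd in pending:
        with os.fdopen(read_fd) as pipe:
            ir = pipe.read()
        _, status = os.waitpid(pid, 0)
        if status != 0:
            raise RuntimeError(f"IR generation failed for module {name}")
        cache[key] = ir
        results[name] = ir
        regenerated.append(name)
    return results, regenerated
```

Calling `build_modules` twice with the same cache and changing only one module's source regenerates just that module. Under this key definition, a change to the declarations digest would invalidate every module, since the digest is part of each key.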
The compiler compiles itself. After any change to compiler_seen/src/, verify bootstrap:
./scripts/safe_rebuild.sh

This builds stage 2 from the frozen bootstrap, then stage 3 from stage 2. If stage 2 == stage 3, the fixed-point is confirmed.
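The fixed-point condition amounts to a byte-for-byte comparison of two binaries. A minimal Python sketch of that check follows; it is not what safe_rebuild.sh necessarily does internally, and `file_digest` and `is_fixed_point` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def is_fixed_point(stage2: Path, stage3: Path) -> bool:
    """True when both compiler binaries hash to the same digest."""
    return file_digest(stage2) == file_digest(stage3)
```

Given the paths of the stage 2 and stage 3 binaries (wherever your build places them), `is_fixed_point(stage2_path, stage3_path)` returning `True` corresponds to the "stage 2 == stage 3" condition above.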
# End-to-end tests (66 tests, 6 languages)
bash tests/e2e_multilang/run_all_e2e.sh
# IR validation on generated modules
./scripts/seen_ir_verify.sh /tmp/seen_module_*.ll

# Type checker tracing
SEEN_DEBUG_TYPES=1 seen build program.seen
# LLVM IR generation tracing
SEEN_TRACE_LLVM=all seen build program.seen
# Struct layout debugging
SEEN_TRACE_LLVM=gep seen build program.seen

- Fork the repository
- Create a feature branch
- Make changes
- Run tests: bash tests/e2e_multilang/run_all_e2e.sh
- Verify bootstrap: ./scripts/safe_rebuild.sh
- Submit a pull request

MIT License. See LICENSE for details.
