GPU-instanced particle rendering #1404

## Summary

Render particles using GPU instancing instead of individual `addQuad()` calls. Per-particle data (position, scale, rotation, alpha, tint, remaining life) is uploaded to a single GPU instance buffer and all particles are rendered in one draw call. Particle physics (gravity, fade, spin) can optionally move into the vertex shader by uploading spawn parameters such as velocity instead of per-frame state, in which case the per-particle CPU cost per frame drops to near zero.

## Current State

- `ParticleEmitter` creates individual `Particle` renderables
- Each particle is drawn via `addQuad()` in the quad batcher: 4 vertices pushed per particle per frame
- The CPU handles all per-particle updates: position, velocity, gravity, rotation, alpha fade, scaling
- Performance caps out at around 500-1000 visible particles before frames start to drop
- Each particle's `update()` and `draw()` are called individually

## Proposed Architecture

### Instance buffer

A single vertex buffer containing per-instance data for all active particles:
- `vec2 position`: world position
- `float rotation`: current angle
- `float scale`: current size
- `float alpha`: current opacity
- `uint32 tint`: color tint (packed ARGB)
- `float life`: remaining lifetime (0-1 normalized)

The buffer is updated once per frame via `bufferSubData()`: one upload for all particles.
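
As a rough illustration of the layout above, the sketch below packs the seven listed fields into one interleaved `ArrayBuffer` (7 x 4 bytes = 28 bytes per instance) and uploads only the active range. The helper names (`createInstanceData`, `writeInstance`, `uploadInstances`) and the particle object shape are assumptions made for this example, not existing engine API, and the 5-argument `bufferSubData()` overload used here is WebGL2-only.

```js
// Hypothetical layout mirroring the field list above: seven 4-byte fields.
const FIELDS_PER_INSTANCE = 7;
const INSTANCE_STRIDE = FIELDS_PER_INSTANCE * 4; // 28 bytes

function createInstanceData(maxParticles) {
    const buffer = new ArrayBuffer(maxParticles * INSTANCE_STRIDE);
    return {
        buffer,
        f32: new Float32Array(buffer),
        // Second view over the same memory, used only for the packed tint.
        u32: new Uint32Array(buffer),
    };
}

// `p` is an assumed plain object: { x, y, rotation, scale, alpha, tint, life }.
function writeInstance(data, index, p) {
    const o = index * FIELDS_PER_INSTANCE;
    data.f32[o + 0] = p.x;
    data.f32[o + 1] = p.y;
    data.f32[o + 2] = p.rotation;
    data.f32[o + 3] = p.scale;
    data.f32[o + 4] = p.alpha;
    data.u32[o + 5] = p.tint >>> 0; // packed ARGB, kept as an unsigned 32-bit value
    data.f32[o + 6] = p.life;
}

function uploadInstances(gl, glBuffer, data, activeCount) {
    // A length of 0 would mean "copy to the end of the view", so skip empty uploads.
    if (activeCount === 0) return;
    gl.bindBuffer(gl.ARRAY_BUFFER, glBuffer);
    // WebGL2 overload: srcOffset and length are counted in elements of the view.
    gl.bufferSubData(gl.ARRAY_BUFFER, 0, data.f32, 0, activeCount * FIELDS_PER_INSTANCE);
}
```

Interleaving keeps each particle's data contiguous, so one `bufferSubData()` call covers every active instance; a struct-of-arrays layout would need one upload per field.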

### Vertex shader

Computes the final quad corners for each instance:
- Applies position, rotation, scale from the instance buffer
- Optionally computes physics on GPU (velocity, gravity, fade) if spawn parameters are uploaded instead of per-frame state
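
A minimal sketch of the per-frame-state variant of that shader is shown below as a GLSL ES 3.00 source string, together with a plain JavaScript mirror of the corner math that is handy for checking the transform on the CPU. All attribute, uniform and function names (`aCorner`, `aPosition`, `uProjection`, `quadCorner`, ...) are assumptions for illustration. It also assumes `scale` is the quad's edge length in world units and that the tint is fed as four normalized unsigned bytes on a little-endian machine, so the packed ARGB value arrives in the shader as (B, G, R, A).

```js
// Hypothetical vertex shader; names are illustrative, not existing engine code.
const PARTICLE_VERTEX_SHADER = `#version 300 es
// Per-vertex: unit quad corner in [-0.5, 0.5], shared by every instance.
in vec2 aCorner;
// Per-instance attributes (attribute divisor 1).
in vec2 aPosition;
in float aRotation;
in float aScale;
in float aAlpha;
in vec4 aTint; // bytes of a packed ARGB uint32, read as (B, G, R, A)
uniform mat4 uProjection;
out vec2 vUV;
out vec4 vColor;
void main() {
    float c = cos(aRotation);
    float s = sin(aRotation);
    vec2 scaled = aCorner * aScale;
    vec2 rotated = vec2(scaled.x * c - scaled.y * s,
                        scaled.x * s + scaled.y * c);
    gl_Position = uProjection * vec4(aPosition + rotated, 0.0, 1.0);
    vUV = aCorner + 0.5;
    vColor = vec4(aTint.bgr, aTint.a * aAlpha);
}`;

// CPU mirror of the scale -> rotate -> translate steps in main() above.
function quadCorner(corner, position, rotation, scale) {
    const c = Math.cos(rotation);
    const s = Math.sin(rotation);
    const sx = corner[0] * scale;
    const sy = corner[1] * scale;
    return [
        position[0] + sx * c - sy * s,
        position[1] + sx * s + sy * c,
    ];
}
```

Scaling before rotating keeps the quad square at any angle, and translating last means `position` is the particle's center rather than a corner.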

### Single draw call

`gl.drawArraysInstanced(gl.TRIANGLE_STRIP, 0, 4, particleCount)`: one call renders all particles regardless of count.
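
To make the draw call concrete, here is a sketch of the attribute setup that would precede it on WebGL2. It assumes an interleaved 28-byte instance layout matching the field list in the instance buffer section, a 4-vertex unit quad, and a hypothetical `loc` object holding attribute locations from `gl.getAttribLocation()`. None of these names exist in the engine today, and a real implementation would probably record this state once in a vertex array object.

```js
// Unit quad in triangle-strip order; assumed to already be uploaded to `quadBuffer`.
const QUAD_CORNERS = new Float32Array([-0.5, -0.5, 0.5, -0.5, -0.5, 0.5, 0.5, 0.5]);
const STRIDE = 28; // bytes per instance: seven 4-byte fields (assumed layout)

function drawParticles(gl, loc, quadBuffer, instanceBuffer, particleCount) {
    if (particleCount === 0) return;

    // Shared quad corners advance once per vertex (divisor 0).
    gl.bindBuffer(gl.ARRAY_BUFFER, quadBuffer);
    gl.enableVertexAttribArray(loc.corner);
    gl.vertexAttribPointer(loc.corner, 2, gl.FLOAT, false, 0, 0);
    gl.vertexAttribDivisor(loc.corner, 0);

    // Per-instance fields: [location, component count, type, normalized, byte offset].
    gl.bindBuffer(gl.ARRAY_BUFFER, instanceBuffer);
    const perInstance = [
        [loc.position, 2, gl.FLOAT, false, 0],
        [loc.rotation, 1, gl.FLOAT, false, 8],
        [loc.scale, 1, gl.FLOAT, false, 12],
        [loc.alpha, 1, gl.FLOAT, false, 16],
        // Packed uint32 tint read as 4 normalized bytes, i.e. a vec4 in the shader.
        [loc.tint, 4, gl.UNSIGNED_BYTE, true, 20],
        [loc.life, 1, gl.FLOAT, false, 24],
    ];
    for (const [location, size, type, normalized, offset] of perInstance) {
        if (location < 0) continue; // attribute unused by the shader
        gl.enableVertexAttribArray(location);
        gl.vertexAttribPointer(location, size, type, normalized, STRIDE, offset);
        gl.vertexAttribDivisor(location, 1); // advance once per instance
    }

    gl.drawArraysInstanced(gl.TRIANGLE_STRIP, 0, 4, particleCount);
}
```

The divisor is what distinguishes the two buffers: with divisor 0 an attribute advances per vertex, with divisor 1 it advances per instance, so the same four corners are reused for every particle.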

### Integration

- New `GPUParticleEmitter` class (or opt-in mode on existing `ParticleEmitter`)
- Shares the same configuration API: `minLife`, `maxLife`, `speed`, `gravity`, `wind`, etc.
- Falls back to the existing CPU particle system for the Canvas renderer
- Requires WebGL2 for `drawArraysInstanced` (or the `ANGLE_instanced_arrays` extension on WebGL1, which exposes the same call as `drawArraysInstancedANGLE`)
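
One way the fallback chain above could be wired up is a small capability check that returns a uniform pair of instancing functions, sketched below. `pickParticleBackend` and the returned object shape are hypothetical. The only real API relied on is `drawArraysInstanced` / `vertexAttribDivisor` on WebGL2 contexts and `drawArraysInstancedANGLE` / `vertexAttribDivisorANGLE` on the `ANGLE_instanced_arrays` extension object.

```js
// Hypothetical helper: decide how (or whether) instanced rendering is available.
function pickParticleBackend(gl) {
    // No WebGL context at all (e.g. Canvas renderer): use the CPU particle system.
    if (!gl) return { mode: "cpu" };

    // WebGL2 contexts expose instancing directly on the context.
    if (typeof gl.drawArraysInstanced === "function") {
        return {
            mode: "webgl2",
            drawArraysInstanced: (mode, first, count, instances) =>
                gl.drawArraysInstanced(mode, first, count, instances),
            vertexAttribDivisor: (index, divisor) =>
                gl.vertexAttribDivisor(index, divisor),
        };
    }

    // WebGL1: the same calls live on the extension object with an ANGLE suffix.
    const ext = gl.getExtension("ANGLE_instanced_arrays");
    if (ext) {
        return {
            mode: "webgl1-angle",
            drawArraysInstanced: (mode, first, count, instances) =>
                ext.drawArraysInstancedANGLE(mode, first, count, instances),
            vertexAttribDivisor: (index, divisor) =>
                ext.vertexAttribDivisorANGLE(index, divisor),
        };
    }

    // WebGL1 without the extension: fall back to the CPU path as well.
    return { mode: "cpu" };
}
```

Returning wrapped functions lets the emitter call `backend.drawArraysInstanced(...)` without caring which WebGL version it is running on.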

## Performance expectations

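- **CPU**: near-zero per-frame cost when physics runs in the shader (no per-particle update loop, no per-particle vertex pushing); with CPU-side physics the update loop remains, but vertex pushing is replaced by one buffer upload
- **GPU**: single draw call, single texture bind, hardware instancing handles the rest
- **Capacity**: 50,000-100,000+ particles expected, versus the current cap of ~500-1000
- **Memory**: one instance buffer at 28 bytes per particle for the layout above (seven 4-byte fields), so 100K particles = ~2.8 MB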
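
The near-zero CPU figure depends on the shader deriving each particle's state from spawn parameters and elapsed time rather than from per-frame uploads. As a rough CPU-side reference for what that closed form could look like, the sketch below assumes constant gravity and wind accelerations, constant spin, and a linear fade over the particle's lifetime. The function name, the `spawn` object shape, and the use of seconds as the time unit are all assumptions for this example and may not match how the existing emitter interprets `gravity` and `wind`.

```js
// Hypothetical closed-form state: no per-frame integration, just spawn data + age.
// spawn: { x, y, vx, vy, gravity, wind, rotation, spin, lifetime }
function particleStateAt(spawn, age) {
    // Clamp so a particle past its lifetime stays at its final state.
    const t = Math.min(Math.max(age, 0), spawn.lifetime);
    // Remaining life, normalized from 1 (just spawned) down to 0 (expired).
    const life = 1 - t / spawn.lifetime;
    return {
        // Constant acceleration: p(t) = p0 + v0 * t + 0.5 * a * t^2
        x: spawn.x + spawn.vx * t + 0.5 * spawn.wind * t * t,
        y: spawn.y + spawn.vy * t + 0.5 * spawn.gravity * t * t,
        rotation: spawn.rotation + spawn.spin * t,
        alpha: life, // linear fade
        life,
    };
}
```

Because the result depends only on `spawn` and `age`, the same expressions can be evaluated per instance in the vertex shader with a single time uniform, and the instance buffer only needs rewriting when particles spawn or die.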

## API Sketch

```js
// opt-in GPU mode on existing emitter
const emitter = new ParticleEmitter(x, y, {
    image: texture,
    totalParticles: 10000,
    gpu: true, // enable GPU instancing
    // same config as before...
});

// or a dedicated class
const gpuEmitter = new GPUParticleEmitter(x, y, {
    image: texture,
    totalParticles: 50000,
    speed: { min: 1, max: 5 },
    gravity: 0.5,
    wind: 0.1,
});
```

## Challenges

- **Particle spawn/despawn**: need a ring buffer or free-list on the CPU side to manage active particles without reallocating
- **Per-particle animation**: frame-based sprite animation would need a frame index in the instance data + atlas UV lookup in the shader
- **Sorting**: transparent particles may need back-to-front sorting for correct blending; this can be done on the CPU before upload, or avoided where order does not matter (additive blending, for example, is order-independent)
- **WebGL1 fallback**: the `ANGLE_instanced_arrays` extension provides instancing on WebGL1 but is not universally available
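
For the spawn/despawn challenge, one alternative to the ring buffer or free-list mentioned above is a fixed-capacity pool with swap-remove, sketched below. It keeps live particles packed in slots `0..count-1`, so `count` can be passed straight to the instanced draw call without skipping holes. `ParticlePool` and its methods are hypothetical names, not existing engine classes.

```js
// Hypothetical fixed-capacity pool; live particles always occupy slots [0, count).
class ParticlePool {
    constructor(capacity) {
        this.capacity = capacity;
        this.count = 0;
        this.slots = new Array(capacity).fill(null);
    }

    // Returns the slot index, or -1 when the pool is full (no reallocation).
    spawn(particle) {
        if (this.count === this.capacity) return -1;
        this.slots[this.count] = particle;
        return this.count++;
    }

    // Swap-remove: move the last live particle into the freed slot.
    kill(index) {
        if (index < 0 || index >= this.count) return;
        const last = --this.count;
        this.slots[index] = this.slots[last];
        this.slots[last] = null;
    }

    // Iterate backwards so a particle swapped into slot i has already been checked.
    reap(isDead) {
        for (let i = this.count - 1; i >= 0; i--) {
            if (isDead(this.slots[i])) this.kill(i);
        }
    }
}
```

The trade-off is that swap-remove does not preserve order, so it suits order-independent blending best; if back-to-front sorting is required, the live range would need to be re-sorted before upload anyway.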

## References

- `ParticleEmitter`: `src/particles/emitter.js`
- `Particle`: `src/particles/particle.js`
- `QuadBatcher`: `src/video/webgl/batchers/quad_batcher.js`
- WebGL2 instancing: `gl.drawArraysInstanced()`
- WebGL1 extension: `ANGLE_instanced_arrays`
