Chat Demo

Chat with an open language model running entirely in your browser. Everything runs locally on your machine, so nothing is sent to a server. Models small enough to run in a browser are still limited, so treat this as a demo for now; it should improve over time.

Responses come from Gemma 3 1B, an open instruction-tuned model from Google, run in the browser by Transformers.js 4.2.0 on ONNX Runtime Web with WebGPU.

Model: Gemma 3 1B (instruction-tuned)
Parameters: ~1 billion
Context window: 32K tokens (32,768)
Quantization: 4-bit (q4)
Download size: ~1 GB
Knowledge cutoff: August 2024
Languages: English
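
The parameter count and quantization figures above can be sanity-checked with simple arithmetic. The sketch below estimates raw weight storage as parameters times bits per weight; `estimateWeightBytes` is a hypothetical helper, and the 1e9 parameter count is the rounded "~1 billion" from the list. At 4 bits per weight the estimate is about 0.5 GB, which is below the stated ~1 GB download. One plausible reason, offered here as an assumption rather than something this page states, is that some tensors are stored at higher precision or carry quantization metadata.

```typescript
// Rough weight-size estimate: parameters x bits per weight, converted to bytes.
// Hypothetical helper for illustration; it ignores file-format overhead.
function estimateWeightBytes(paramCount: number, bitsPerWeight: number): number {
  return (paramCount * bitsPerWeight) / 8;
}

const PARAMS = 1e9; // "~1 billion", rounded figure from the spec list
const GB = 1e9;     // decimal gigabytes

const q4Gb = estimateWeightBytes(PARAMS, 4) / GB;    // 4-bit quantized weights
const fp16Gb = estimateWeightBytes(PARAMS, 16) / GB; // half-precision weights
const fp32Gb = estimateWeightBytes(PARAMS, 32) / GB; // full-precision weights

console.log({ q4Gb, fp16Gb, fp32Gb });
```

Even as a rough estimate, this shows why quantization matters for a browser demo: the same weights at 16 or 32 bits would be several times larger to download.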

It runs entirely on your machine: no server, no upload. The model (onnx-community/gemma-3-1b-it-ONNX, ~1 GB) downloads once from the Hugging Face Hub and is then cached by the browser.
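
The download-once behaviour described above follows a general cache-then-network pattern. The sketch below illustrates that pattern only; it is not how Transformers.js or the browser actually implements caching. `ByteCache`, `MemoryCache`, `loadOnce`, and the placeholder URL are all hypothetical names chosen for the example, and an in-memory map stands in for browser storage so the code runs outside a browser.

```typescript
// Hypothetical minimal cache shape; real browser storage would need an adapter.
interface ByteCache {
  get(key: string): Promise<Uint8Array | undefined>;
  set(key: string, value: Uint8Array): Promise<void>;
}

// Hypothetical downloader signature, standing in for a network fetch.
type Downloader = (url: string) => Promise<Uint8Array>;

// In-memory stand-in for persistent browser storage.
class MemoryCache implements ByteCache {
  private store = new Map<string, Uint8Array>();
  async get(key: string): Promise<Uint8Array | undefined> {
    return this.store.get(key);
  }
  async set(key: string, value: Uint8Array): Promise<void> {
    this.store.set(key, value);
  }
}

// Return cached bytes when present; otherwise download, store, and return them.
async function loadOnce(
  url: string,
  cache: ByteCache,
  download: Downloader,
): Promise<{ bytes: Uint8Array; fromCache: boolean }> {
  const hit = await cache.get(url);
  if (hit !== undefined) {
    return { bytes: hit, fromCache: true };
  }
  const bytes = await download(url);
  await cache.set(url, bytes);
  return { bytes, fromCache: false };
}

// Simulate two visits and report how many downloads actually happened.
async function demo(): Promise<number> {
  let downloads = 0;
  const fakeDownload: Downloader = async () => {
    downloads += 1;
    return new Uint8Array([1, 2, 3]); // placeholder bytes, not a real model
  };
  const cache = new MemoryCache();
  const url = "https://example.invalid/model.onnx"; // placeholder URL
  await loadOnce(url, cache, fakeDownload); // first visit: goes to the network
  await loadOnce(url, cache, fakeDownload); // second visit: served from cache
  return downloads;
}

demo().then((count) => console.log("downloads:", count));
```

Passing the cache and downloader in as parameters keeps the pattern testable without a network connection; in a browser, the same shape could be backed by persistent storage instead of a map.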