Hello, Gemma! - Part 1: The Build
Hello, Gemma3n!
Jetson + MatFormer + PLE Caching + Audio Input
Bringing Google latest open-source model, Gemma3n, to NVIDIA Jetson Orin to enable on-device, live audio chat!
Google Gemma 3n is nothing short of incredible! In addition to incredible multi-modal performance in a tiny, efficient package, they also managed to add multi-language audio input!!
Design Considerations:
- Python + transformers + torch
- Audio input is so new, we'll have to leverage the latest tranformers package from HuggingFace to leverage it.
- jetson-containers
- To make things easy, I'll build a container with the latest transformers for the Jetson using jetson-containers
- Piper for efficient on device text-to-speech
…video coming when I get a chance shrink it…
Full build and details on GitHub: GregariousEngineering/hello-gemma
Up Next!
- Wake word and query completion detection
- Internet access
- Remote LLMs
Comments
Post a Comment