Hello, Gemma! - Part 1: The Build

Hello, Gemma3n!

Jetson + MatFormer + PLE Caching + Audio Input

Bringing Google latest open-source model, Gemma3n, to NVIDIA Jetson Orin to enable on-device, live audio chat!


Google Gemma 3n is nothing short of incredible! In addition to incredible multi-modal performance in a tiny, efficient package, they also managed to add multi-language audio input!!

 

Design Considerations: 

  • Python + transformers + torch
    • Audio input is so new, we'll have to leverage the latest tranformers package from HuggingFace to leverage it.
  • jetson-containers
    • To make things easy, I'll build a container with the latest transformers for the Jetson using jetson-containers
  • Piper for efficient on device text-to-speech 

 …video coming when I get a chance shrink it…

Full build and details on GitHub: GregariousEngineering/hello-gemma 

Up Next!

  • Wake word and query completion detection
  • Internet access
  • Remote LLMs 


Comments

Popular posts from this blog

Turn Off Those Super Annoying Autoplay Ads in FireTV

Hackles Feedback Hypothesis

Change these “Off-Facebook” Facebook Privacy Settings