Installation Guide

There are several ways to run Gemma 4 locally.

Ollama (Recommended)

Download and install Ollama from ollama.ai, then pull and run the model:

$ ollama run gemma4

Start chatting with Gemma 4!
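Beyond the interactive prompt, Ollama also serves a local REST API (by default on port 11434), so you can drive the model from your own scripts. A minimal sketch using only the standard library; the `gemma4` model tag is the one from the command above, and this assumes the Ollama server is already running:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(prompt: str, model: str = "gemma4") -> dict:
    """Build a non-streaming generation request for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str, model: str = "gemma4") -> str:
    """Send the prompt to a locally running Ollama server and return the reply text."""
    data = json.dumps(build_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires `ollama run gemma4` to have pulled the model and the server running):
# print(generate("Why is the sky blue?"))
```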

LM Studio

Download LM Studio from lmstudio.ai

Search for 'Gemma 4' in the model library

Select your preferred model size and download

Load the model and start chatting
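LM Studio can also expose the loaded model through its local server, which speaks an OpenAI-compatible API (by default on port 1234). A standard-library sketch, assuming the server is started from LM Studio's interface; the model identifier is a placeholder — check the one LM Studio reports for your download:

```python
import json
import urllib.request

LMSTUDIO_URL = "http://localhost:1234/v1/chat/completions"  # LM Studio's default server port


def build_chat_request(prompt: str, model: str = "gemma-4") -> dict:
    """Build an OpenAI-style chat completion request.

    The model name is an assumption; use the identifier shown in LM Studio.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }


def chat(prompt: str) -> str:
    """Send a chat request to the local LM Studio server and return the reply text."""
    data = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        LMSTUDIO_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]


# Usage (requires the LM Studio server to be running with a model loaded):
# print(chat("Summarize the plot of Hamlet in two sentences."))
```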

Hugging Face

Install the Transformers library and PyTorch:

$ pip install transformers torch

Load the model and tokenizer:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained('google/gemma-4-31b')
tokenizer = AutoTokenizer.from_pretrained('google/gemma-4-31b')
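With the model and tokenizer loaded, generation follows the usual Transformers pattern: tokenize, call `model.generate`, and decode only the newly produced tokens. A minimal sketch continuing from the loading snippet above (the helper name is ours, not part of the library):

```python
def generate_reply(model, tokenizer, prompt: str, max_new_tokens: int = 128) -> str:
    """Tokenize a prompt, run generation, and decode only the new tokens."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Slice off the prompt tokens so only the model's reply is returned.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Usage, continuing from the loading snippet above:
# print(generate_reply(model, tokenizer, "Explain attention in one sentence."))
```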

Docker

$ docker run -it --gpus all ghcr.io/google-deepmind/gemma4:latest

Hardware Requirements

E2B / E4B

Runs on mobile phones and Raspberry Pi

26B

Requires ~16 GB of GPU memory

31B

Requires ~20 GB of GPU memory
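These figures can be sanity-checked with a back-of-the-envelope rule: weight memory is roughly parameter count times bytes per parameter, with the KV cache and runtime overhead on top. The quoted ~16 GB and ~20 GB figures are consistent with ~4-bit quantized weights plus overhead (this interpretation is our assumption); full bf16 weights for the 26B model would need far more:

```python
def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9


# Weights-only estimates (KV cache and runtime overhead come on top):
print(weight_memory_gb(26, 16))  # bf16:  52.0 GB -- far beyond a single consumer GPU
print(weight_memory_gb(26, 4))   # 4-bit: 13.0 GB -- consistent with the ~16 GB figure
print(weight_memory_gb(31, 4))   # 4-bit: 15.5 GB -- consistent with the ~20 GB figure
```

Actual usage varies with quantization scheme, context length, and batch size, so treat these as lower bounds.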