Our most capable open models yet


At the edge, our E2B and E4B models redefine on-device utility, prioritizing multimodal capabilities, low-latency processing, and seamless ecosystem integration over raw parameter count.

Powerful, accessible, open

To drive the next generation of research and pioneering products, we’ve sized Gemma 4 models to fit and run efficiently on your hardware, from billions of Android devices around the world, to laptop GPUs, to workstations and developer accelerators.

Using these highly optimized models, you can tune Gemma 4 to achieve cutting-edge performance for your specific tasks. We’ve already seen incredible success with this approach; for example, INSAIT created a pioneering Bulgarian language model (BgGPT) and we worked with Yale University on Cell2Sentence-Scale to discover new avenues for cancer therapy, among many others.

Here’s what makes Gemma 4 our most capable family of open models yet:

  • Advanced reasoning: Capable of multi-step planning and deep logic, Gemma 4 demonstrates significant improvements on math and instruction-following benchmarks.
  • Agent workflows: Native support for function calling, structured JSON output, and system instructions lets you build autonomous agents that interact with external tools and APIs and reliably execute workflows.
  • Code generation: Gemma 4 generates high-quality code entirely offline, turning your workstation into a local AI coding assistant.
  • Vision and audio: All models natively process video and images, support variable resolutions, and excel at visual tasks like OCR and graphics understanding. Additionally, the E2B and E4B models feature native audio input for speech recognition and understanding.
  • Longer context: Process long-form content seamlessly. Edge models feature a 128K-token context window, while larger models offer up to 256K tokens, allowing you to pass entire repositories or documents in a single prompt.
  • More than 140 languages: Natively trained in over 140 languages, Gemma 4 helps developers build high-performing, inclusive apps for a global audience.
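To make the agent-workflow bullet above concrete, here is a minimal sketch of dispatching a structured JSON function call to a local tool. The JSON schema, the `get_weather` tool, and the hardcoded model output are all illustrative assumptions, not Gemma 4's actual function-calling format:

```python
import json

# Hypothetical tool registry; tool names and the JSON call format shown
# here are illustrative, not Gemma 4's actual function-calling schema.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

# Suppose the model, given a system instruction describing the available
# tools, emits a structured JSON function call like this:
model_output = '{"tool": "get_weather", "arguments": {"city": "Sofia"}}'

def dispatch(raw: str) -> str:
    """Parse the model's JSON output and invoke the matching tool."""
    call = json.loads(raw)
    fn = TOOLS[call["tool"]]
    return fn(**call["arguments"])

print(dispatch(model_output))  # -> Sunny in Sofia
```

The reliability claim in the bullet rests on exactly this pattern: because the output is structured JSON rather than free text, the dispatch step is a plain parse-and-call, with no fragile string matching.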

Versatile models for diverse hardware

We’re launching Gemma 4 model weights in sizes designed for specific hardware and use cases, ensuring you get cutting-edge reasoning wherever you need it:

Models 26B and 31B: Frontier Intelligence, Offline on Your Personal Computers

Optimized to provide researchers and developers with next-generation reasoning on affordable hardware, our unquantized bfloat16 weights fit efficiently on a single NVIDIA H100 80GB GPU. For on-premises setups, quantized versions run natively on consumer GPUs to power your IDEs, coding assistants, and agent workflows. Our 26B Mixture of Experts (MoE) model focuses on latency, activating only 3.8 billion of its total parameters during inference to deliver exceptionally high tokens per second, while our 31B dense model maximizes raw quality and provides a powerful foundation for tuning.
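As a rough back-of-envelope check on the single-H100 claim above, bfloat16 stores each parameter in 2 bytes, so weight memory scales linearly with parameter count. The sketch below estimates weights only; KV cache and activation overheads are ignored, so real deployments need extra headroom:

```python
# Back-of-envelope memory estimate for bfloat16 model weights.
# Weights only: KV cache and activations are deliberately excluded.
BYTES_PER_PARAM = 2  # bfloat16 = 16 bits = 2 bytes per parameter

def weight_gb(params_billion: float) -> float:
    """Decimal gigabytes needed to hold the weights alone."""
    return params_billion * 1e9 * BYTES_PER_PARAM / 1e9

print(weight_gb(26))   # 26B weights: 52.0 GB -> fits in an 80GB H100
print(weight_gb(31))   # 31B weights: 62.0 GB -> also fits, less headroom
print(weight_gb(3.8))  # 3.8B active MoE params per token: 7.6 GB touched
```

The last line also hints at why the MoE variant is fast: each token's forward pass reads only the active experts' weights, a fraction of the full 26B, which cuts memory bandwidth per token.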
