# Groq

> Groq's ultra-fast LPU inference in ZeroTwo — open-source models at exceptional speeds.

Groq is not an AI model company. It is an **AI inference hardware company** that builds LPU (Language Processing Unit) chips designed specifically for running large language models. As a result, open-source models on Groq hardware can run 10x or more faster than the same models on conventional GPU infrastructure.

In ZeroTwo, Groq provides access to popular open-source models running on this fast inference stack.

***

## What Groq Offers

Groq provides **ultra-fast inference** for open-source models. When you select a Groq-hosted model in ZeroTwo, you're running well-known open-source models (Llama, Mixtral, and others) on Groq's LPU hardware, which delivers responses dramatically faster than typical cloud GPU inference.

**What this means in practice:**

* Responses start streaming almost instantly
* Large outputs complete in seconds rather than tens of seconds
* Model capability is unchanged: Llama or Mixtral on Groq produces the same quality of output as it does elsewhere, only much faster
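
To make the "seconds rather than tens of seconds" point concrete, here is a rough back-of-the-envelope sketch. It assumes a simple latency model (time to first token plus output length divided by throughput). The function name and every number in it are hypothetical illustrations, not measured Groq or ZeroTwo figures.

```python
def estimated_response_seconds(
    output_tokens: int,
    tokens_per_second: float,
    time_to_first_token: float = 0.0,
) -> float:
    """Rough latency model: startup delay plus time spent streaming tokens."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return time_to_first_token + output_tokens / tokens_per_second


# Illustrative values only (assumptions, not benchmarks).
OUTPUT_TOKENS = 1500  # a fairly long response

# Hypothetical baseline backend.
slow = estimated_response_seconds(
    OUTPUT_TOKENS, tokens_per_second=50, time_to_first_token=1.0
)
# Hypothetical backend with 10x the throughput and a shorter startup delay.
fast = estimated_response_seconds(
    OUTPUT_TOKENS, tokens_per_second=500, time_to_first_token=0.2
)

print(f"baseline: {slow:.1f}s, 10x throughput: {fast:.1f}s")
```

Under these assumed numbers the same 1,500-token response drops from roughly half a minute to a few seconds. In a workflow that chains several responses one after another, the saving applies to every step.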

***

## Available Models

Groq hosts several open-source models in ZeroTwo. The specific selection may vary as Groq updates its model portfolio. Models may include variants of:

* **Llama** (Meta's open-source LLM family — various sizes)
* **Mixtral** (Mistral's mixture-of-experts open-source model)
* Other open-source models as added

Check the **Model Picker** in ZeroTwo for the current full list of Groq-hosted models and their specific names.

***

## Strengths

**Exceptional speed:** Groq's LPU hardware delivers response speeds that are substantially faster than GPU-based inference. For interactive use cases, this creates a noticeably snappier experience.

**Cost-effective:** Groq models are typically classified as standard models in ZeroTwo, so using them does not consume premium quota.

**Good for iteration:** When you're rapidly iterating (trying many variations, testing prompts, brainstorming), faster response times reduce friction significantly.

**Open-source models:** Llama and Mixtral are powerful, well-studied models with broad capability across many tasks.

***

## Best Use Cases

<CardGroup cols={2}>
  <Card title="Rapid prototyping" icon="zap">
    When you're iterating quickly through ideas, testing prompt variations, or exploring a problem space, fast responses reduce friction.
  </Card>

  <Card title="High-volume workflows" icon="layers">
    Tasks requiring many sequential AI responses benefit most from Groq's speed advantage.
  </Card>

  <Card title="Latency-sensitive applications" icon="cpu">
    Use cases where time to first token matters, such as interactive demos and live coding assistance.
  </Card>

  <Card title="Standard quality tasks" icon="check-circle">
    Summarization, drafting, Q\&A, and other everyday tasks where you want good results fast without using premium quota.
  </Card>
</CardGroup>

***

## Limitations

**Open-source model capability ceiling:** While Llama and Mixtral are strong models, they have a capability ceiling below the top frontier models (GPT-5, Claude Opus, Gemini 2.5 Pro). For the most complex reasoning or nuanced tasks, a premium frontier model will generally outperform Groq-hosted models.

**Model selection:** Groq's model portfolio is determined by what Groq chooses to host. It's more limited than ZeroTwo's full model library. Check the Model Picker for current availability.

***

<Tip>
  Use Groq models when speed matters more than maximum quality — for brainstorming, drafts, quick Q\&A, and iterative workflows. Switch to a premium model when you need the highest quality output.
</Tip>