
Commit d6168de

Update providers/groq.mdx
Co-Authored-By: mintlify[bot] <109931778+mintlify[bot]@users.noreply.github.com>
1 parent 8f724ad commit d6168de

1 file changed: providers/groq.mdx
Lines changed: 28 additions & 63 deletions
@@ -1,80 +1,45 @@
 ---
 title: "Groq"
-description: "Learn how to configure and use Groq's lightning-fast inference to access models from OpenAI, Meta, DeepSeek, and more with Groq."
+description: "Configure Groq's ultra-fast LPU inference for models from OpenAI, Meta, and DeepSeek."
 ---
 
-Groq provides ultra-fast AI inference through their custom LPU™ (Language Processing Unit) architecture, purpose-built for inference rather than adapted from training hardware. Groq hosts open-source models from various providers including OpenAI, Meta, DeepSeek, Moonshot AI, and others.
+Groq provides ultra-fast AI inference through its custom LPU™ (Language Processing Unit) architecture. It hosts open-source models from OpenAI, Meta, DeepSeek, and others.
 
 **Website:** [https://groq.com/](https://groq.com/)
 
-### Getting an API Key
+## Getting an API Key
 
-1. **Sign Up/Sign In:** Go to [Groq](https://groq.com/) and create an account or sign in.
-2. **Navigate to Console:** Go to the [Groq Console](https://console.groq.com/) to access your dashboard.
-3. **Create a Key:** Navigate to the API Keys section and create a new API key. Give your key a descriptive name (e.g., "CodinIT").
-4. **Copy the Key:** Copy the API key immediately. You will not be able to see it again. Store it securely.
+1. Go to the [Groq Console](https://console.groq.com/) and sign in
+2. Navigate to the API Keys section
+3. Create a new API key and name it (e.g., "CodinIT")
+4. Copy the key immediately - you won't see it again
 
-### Supported Models
+## Configuration
 
-CodinIT supports the following Groq models:
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "Groq" as the API Provider
+3. Paste your API key
+4. Choose your model
 
-- `llama-3.3-70b-versatile` (Meta) - Balanced performance with 131K context
-- `llama-3.1-8b-instant` (Meta) - Fast inference with 131K context
-- `openai/gpt-oss-120b` (OpenAI) - Featured flagship model with 131K context
-- `openai/gpt-oss-20b` (OpenAI) - Featured compact model with 131K context
-- `moonshotai/kimi-k2-instruct` (Moonshot AI) - 1 trillion parameter model with prompt caching
-- `deepseek-r1-distill-llama-70b` (DeepSeek/Meta) - Reasoning-optimized model
-- `qwen/qwen3-32b` (Alibaba Cloud) - Enhanced for Q&A tasks
-- `meta-llama/llama-4-maverick-17b-128e-instruct` (Meta) - Latest Llama 4 variant
-- `meta-llama/llama-4-scout-17b-16e-instruct` (Meta) - Latest Llama 4 variant
+## Supported Models
 
-### Configuration in CodinIT
+- `llama-3.3-70b-versatile` (Meta) - 131K context
+- `openai/gpt-oss-120b` (OpenAI) - 131K context
+- `moonshotai/kimi-k2-instruct` - 1T parameters with caching
+- `deepseek-r1-distill-llama-70b` - Reasoning optimized
+- `qwen/qwen3-32b` (Alibaba) - Q&A enhanced
+- `meta-llama/llama-4-maverick-17b-128e-instruct`
 
-1. **Open CodinIT Settings:** Click the settings icon (⚙️) in the CodinIT panel.
-2. **Select Provider:** Choose "Groq" from the "API Provider" dropdown.
-3. **Enter API Key:** Paste your Groq API key into the "Groq API Key" field.
-4. **Select Model:** Choose your desired model from the "Model" dropdown.
+## Key Features
 
-### Groq's Speed Revolution
+- **Ultra-fast inference:** Sub-millisecond latency with LPU architecture
+- **Large context:** Up to 131K tokens
+- **Prompt caching:** Available on select models
+- **Vision support:** Available on select models
 
-Groq's LPU architecture delivers several key advantages over traditional GPU-based inference:
+Learn more about [LPU architecture](https://groq.com/blog/inside-the-lpu-deconstructing-groq-speed).
 
-#### LPU Architecture
-Unlike GPUs that are adapted from training workloads, Groq's LPU is purpose-built for inference. This eliminates architectural bottlenecks that create latency in traditional systems.
+## Notes
 
-#### Unmatched Speed
-- **Sub-millisecond latency** that stays consistent across traffic, regions, and workloads
-- **Static scheduling** with pre-computed execution graphs eliminates runtime coordination delays
-- **Tensor parallelism** optimized for low-latency single responses rather than high-throughput batching
-
-#### Quality Without Tradeoffs
-- **TruePoint numerics** reduce precision only in areas that don't affect accuracy
-- **100-bit intermediate accumulation** ensures lossless computation
-- **Strategic precision control** maintains quality while achieving 2-4× speedup over BF16
-
-#### Memory Architecture
-- **SRAM as primary storage** (not cache) with hundreds of megabytes on-chip
-- **Eliminates DRAM/HBM latency** that plagues traditional accelerators
-- **Enables true tensor parallelism** by splitting layers across multiple chips
-
-Learn more about Groq's technology in their [LPU architecture blog post](https://groq.com/blog/inside-the-lpu-deconstructing-groq-speed).
-
-### Special Features
-
-#### Prompt Caching
-The Kimi K2 model supports prompt caching, which can significantly reduce costs and latency for repeated prompts.
-
-#### Vision Support
-Select models support image inputs and vision capabilities. Check the model details in the Groq Console for specific capabilities.
-
-#### Reasoning Models
-Some models like DeepSeek variants offer enhanced reasoning capabilities with step-by-step thought processes.
-
-### Tips and Notes
-
-- **Model Selection:** Choose models based on your specific use case and performance requirements.
-- **Speed Advantage:** Groq excels at single-request latency rather than high-throughput batch processing.
-- **OSS Model Provider:** Groq hosts open-source models from multiple providers (OpenAI, Meta, DeepSeek, etc.) on their fast infrastructure.
-- **Context Windows:** Most models offer large context windows (up to 131K tokens) for including substantial code and context.
-- **Pricing:** Groq offers competitive pricing with their speed advantages. Check the [Groq Pricing](https://groq.com/pricing) page for current rates.
-- **Rate Limits:** Groq has generous rate limits, but check their documentation for current limits based on your usage tier.
+- **Speed:** Optimized for single-request latency
+- **Pricing:** See [Groq Pricing](https://groq.com/pricing)
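The configuration steps in the updated doc assume a working Groq API key. As a quick sanity check outside CodinIT, the key can be exercised against Groq's OpenAI-compatible chat-completions endpoint; the sketch below is illustrative and not part of the commit (the `GROQ_API_KEY` environment variable and the choice of `llama-3.3-70b-versatile` are assumptions):

```python
import json
import os
import urllib.request

# Groq exposes an OpenAI-compatible API under this base path.
API_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completions request for Groq's OpenAI-compatible API."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # key from the Groq Console
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__":
    key = os.environ.get("GROQ_API_KEY")  # assumed env var; set it to your key
    req = build_request(key or "gsk_placeholder", "llama-3.3-70b-versatile", "Say hello")
    if key:  # only hit the network when a key is actually configured
        with urllib.request.urlopen(req) as resp:
            print(json.load(resp)["choices"][0]["message"]["content"])
```

A 200 response confirms the key and model name are valid; a 401 means the key was mistyped or revoked.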
