
Commit 049f116

Update providers/fireworks.mdx
Co-Authored-By: mintlify[bot] <109931778+mintlify[bot]@users.noreply.github.com>
1 parent ca07101 commit 049f116

1 file changed: providers/fireworks.mdx (28 additions, 115 deletions)
```diff
@@ -1,131 +1,44 @@
 ---
 title: "Fireworks AI"
-description: "Configure Fireworks AI's lightning-fast inference platform with CodinIT for up to 4x faster performance and access to 40+ optimized models."
+description: "Configure Fireworks AI for fast inference with 40+ optimized models."
 ---
 
-Fireworks AI is a leading infrastructure platform for generative AI that focuses on delivering exceptional performance through optimized inference capabilities. With up to 4x faster inference speeds than alternative platforms and support for over 40 different AI models, Fireworks eliminates the operational complexity of running AI models at scale.
+Fireworks AI provides optimized inference with up to 4x faster performance than alternatives.
 
 **Website:** [https://fireworks.ai/](https://fireworks.ai/)
 
-### Getting an API Key
+## Getting an API Key
 
-1. **Sign Up/Sign In:** Go to [Fireworks AI](https://fireworks.ai/) and create an account or sign in.
-2. **Navigate to API Keys:** Access the API keys section in your dashboard.
-3. **Create a Key:** Generate a new API key. Give it a descriptive name (e.g., "CodinIT").
-4. **Copy the Key:** Copy the API key immediately. Store it securely.
+1. Go to [Fireworks AI](https://fireworks.ai/) and sign in
+2. Navigate to API Keys in your dashboard
+3. Create a new API key and name it (e.g., "CodinIT")
+4. Copy the key immediately
 
-### Supported Models
+## Configuration
 
-Fireworks AI supports a wide variety of models across different categories. Popular models include:
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "Fireworks" as the API Provider
+3. Paste your API key
+4. Enter the model ID (e.g., "accounts/fireworks/models/llama-v3p1-70b-instruct")
 
-**Text Generation Models:**
-- Llama 3.1 series (8B, 70B, 405B)
-- Mixtral 8x7B and 8x22B
-- Qwen 2.5 series
-- DeepSeek models with reasoning capabilities
-- Code Llama models for programming tasks
+## Supported Models
 
-**Vision Models:**
-- Llama 3.2 Vision models
-- Qwen 2-VL models
+- Llama 3.1 series (8B, 70B, 405B)
+- Mixtral 8x7B and 8x22B
+- Qwen 2.5 series
+- DeepSeek models
+- Code Llama models
+- Vision models (Llama 3.2, Qwen 2-VL)
 
-**Embedding Models:**
-- Various text embedding models for semantic search
+## Key Features
 
-The platform curates, optimizes, and deploys models with custom kernels and inference optimizations for maximum performance.
+- **Ultra-fast inference:** Up to 4x faster than alternatives
+- **Custom optimizations:** Advanced kernels for maximum performance
+- **40+ models:** Wide selection of optimized models
+- **Fine-tuning:** Available for custom models
+- **OpenAI compatible:** Standard API format
 
-### Configuration in CodinIT
+## Notes
 
-1. **Open CodinIT Settings:** Click the settings icon (⚙️) in the CodinIT panel.
-2. **Select Provider:** Choose "Fireworks" from the "API Provider" dropdown.
-3. **Enter API Key:** Paste your Fireworks API key into the "Fireworks API Key" field.
-4. **Enter Model ID:** Specify the model you want to use (e.g., "accounts/fireworks/models/llama-v3p1-70b-instruct").
-5. **Configure Tokens:** Optionally set max completion tokens and context window size.
-
-### Fireworks AI's Performance Focus
-
-Fireworks AI's competitive advantages center on performance optimization and developer experience:
-
-#### Lightning-Fast Inference
-- **Up to 4x faster inference** than alternative platforms
-- **250% higher throughput** compared to open source inference engines
-- **50% faster speed** with significantly reduced latency
-- **6x lower cost** than HuggingFace Endpoints with 2.5x generation speed
-
-#### Advanced Optimization Technology
-- **Custom kernels** and inference optimizations increase throughput per GPU
-- **Multi-LoRA architecture** enables efficient resource sharing
-- **Hundreds of fine-tuned model variants** can run on shared base model infrastructure
-- **Asset-light model** focuses on optimization software rather than expensive GPU ownership
-
-#### Comprehensive Model Support
-- **40+ different AI models** curated and optimized for performance
-- **Multiple GPU types** supported: A100, H100, H200, B200, AMD MI300X
-- **Pay-per-GPU-second billing** with no extra charges for start-up times
-- **OpenAI API compatibility** for seamless integration
-
-### Pricing Structure
-
-Fireworks AI uses a usage-based pricing model with competitive rates:
-
-#### Text and Vision Models (2025)
-| Parameter Count | Price per 1M Input Tokens |
-|---|---|
-| Less than 4B parameters | $0.10 |
-| 4B - 16B parameters | $0.20 |
-| More than 16B parameters | $0.90 |
-| MoE 0B - 56B parameters | $0.50 |
-
-#### Fine-Tuning Services
-| Base Model Size | Price per 1M Training Tokens |
-|---|---|
-| Up to 16B parameters | $0.50 |
-| 16.1B - 80B parameters | $3.00 |
-| DeepSeek R1 / V3 | $10.00 |
-
-#### Dedicated Deployments
-| GPU Type | Price per Hour |
-|---|---|
-| A100 80GB | $2.90 |
-| H100 80GB | $5.80 |
-| H200 141GB | $6.99 |
-| B200 180GB | $11.99 |
-| AMD MI300X | $4.99 |
-
-### Special Features
-
-#### Fine-Tuning Capabilities
-Fireworks offers sophisticated fine-tuning services accessible through CLI interface, supporting JSON-formatted data from databases like MongoDB Atlas. Fine-tuned models cost the same as base models for inference.
-
-#### Developer Experience
-- **Browser playground** for direct model interaction
-- **REST API** with OpenAI compatibility
-- **Comprehensive cookbook** with ready-to-use recipes
-- **Multiple deployment options** from serverless to dedicated GPUs
-
-#### Enterprise Features
-- **HIPAA and SOC 2 Type II compliance** for regulated industries
-- **Self-serve onboarding** for developers
-- **Enterprise sales** for larger deployments
-- **Post-paid billing options** and Business tier
-
-#### Reasoning Model Support
-Advanced support for reasoning models with `<think>` tag processing and reasoning content extraction, making complex multi-step reasoning practical for real-time applications.
-
-### Performance Advantages
-
-Fireworks AI's optimization delivers measurable improvements:
-- **250% higher throughput** vs open source engines
-- **50% faster speed** with reduced latency
-- **6x cost reduction** compared to alternatives
-- **2.5x generation speed** improvement per request
-
-### Tips and Notes
-
-- **Model Selection:** Choose models based on your specific use case - smaller models for speed, larger models for complex reasoning.
-- **Performance Focus:** Fireworks excels at making AI inference fast and cost-effective through advanced optimizations.
-- **Fine-Tuning:** Leverage fine-tuning capabilities to improve model accuracy with your proprietary data.
-- **Compliance:** HIPAA and SOC 2 Type II compliance enables use in regulated industries.
-- **Pricing Model:** Usage-based pricing scales with your success rather than traditional seat-based models.
-- **Developer Resources:** Extensive documentation and cookbook recipes accelerate implementation.
-- **GPU Options:** Multiple GPU types available for dedicated deployments based on performance needs.
+- **Pricing:** Usage-based, see [Fireworks Pricing](https://fireworks.ai/pricing)
+- **Compliance:** HIPAA and SOC 2 Type II certified
```
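Both versions of the doc note OpenAI API compatibility and configure the provider with a model ID such as `accounts/fireworks/models/llama-v3p1-70b-instruct`. A minimal sketch of what that compatibility looks like outside CodinIT, assuming the `openai` Python package and Fireworks' documented OpenAI-compatible base URL `https://api.fireworks.ai/inference/v1`; the prompt and token limit are placeholders:

```python
# Sketch: call Fireworks through its OpenAI-compatible endpoint.
# Assumes FIREWORKS_API_KEY is set in the environment.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",  # Fireworks' OpenAI-compatible endpoint
    api_key=os.environ["FIREWORKS_API_KEY"],
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p1-70b-instruct",  # model ID from the doc
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    max_tokens=64,  # corresponds to the optional "max completion tokens" setting
)
print(response.choices[0].message.content)
```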
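The removed pricing tables bill input tokens per million by parameter tier. A back-of-envelope cost estimate under those rates (taken from the deleted table and possibly stale, so treat the numbers as illustrative):

```python
# Illustrative cost math using the per-1M-input-token rates from the removed table.
RATES_USD_PER_M_INPUT = {
    "<4B": 0.10,
    "4B-16B": 0.20,
    ">16B": 0.90,
    "MoE 0B-56B": 0.50,
}

def input_cost_usd(tokens: int, tier: str) -> float:
    """Cost of `tokens` input tokens at the given parameter-tier rate."""
    return tokens / 1_000_000 * RATES_USD_PER_M_INPUT[tier]

# 2M input tokens on a 70B model (">16B" tier): 2 * $0.90 = $1.80
print(f"${input_cost_usd(2_000_000, '>16B'):.2f}")
```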
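The removed "Reasoning Model Support" section mentions `<think>` tag processing and reasoning-content extraction. A hypothetical illustration of that idea; `split_reasoning` is an assumed helper for this sketch, not a Fireworks or CodinIT API, and actual tag conventions vary by model:

```python
# Sketch: separate <think> reasoning from the final answer in a raw completion string.
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> block is present."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

reasoning, answer = split_reasoning("<think>2 + 2 = 4.</think>The answer is 4.")
print(reasoning)  # -> 2 + 2 = 4.
print(answer)     # -> The answer is 4.
```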
