Grok 4.1 Fast
Status: Active
Overview
Grok 4.1 Fast is xAI's best agentic tool-calling model, suited to real-world use cases such as customer support and deep research. It has a 2M-token context window. Reasoning can be enabled/disabled using...
History
Grok 4.1 Fast became available via the xAI API on 2025-11-19.
Training & availability
xAI has not released the underlying model weights; access is through its hosted API only.
Capabilities
- Context window: 2.0M tokens.
- Input modalities: text, image, file.
- Recommended for: vision, long-context, cheap.
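Since image is listed as an input modality, a request can carry an image alongside text. The sketch below only builds the message payload, using the OpenAI-compatible `image_url` content-part format with a base64 data URL. The helper name `image_message` and the placeholder bytes are illustrative, and it is an assumption that the endpoint you call accepts this format for this model.

```python
import base64


def image_message(prompt, image_bytes, mime_type="image/png"):
    """Build a user message with text plus one inline image (illustrative helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }


# Placeholder bytes stand in for the contents of a real image file.
message = image_message("Describe this image.", b"\x89PNG placeholder")
print(message["content"][0]["text"])
```

The resulting dictionary can be passed as one element of the `messages` list in a chat completion request.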
Pricing
- Input: $0.2000 per 1M tokens
- Output: $0.5000 per 1M tokens
Monthly spend is the workload's input and output token volumes multiplied by these rates.
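The arithmetic behind a monthly estimate can be sketched in a few lines. The rates come from the Pricing list; the function name, the 30-day month, and the example workload are assumptions for illustration, and the sketch treats every request as the same size.

```python
# Rates from the Pricing list, in USD per 1M tokens.
INPUT_USD_PER_M = 0.20
OUTPUT_USD_PER_M = 0.50


def monthly_cost_usd(requests_per_day, input_tokens, output_tokens, days=30):
    """Estimate monthly spend for a uniform workload (illustrative helper)."""
    total_in = requests_per_day * input_tokens * days
    total_out = requests_per_day * output_tokens * days
    return (total_in * INPUT_USD_PER_M + total_out * OUTPUT_USD_PER_M) / 1_000_000


# Hypothetical workload: 1,000 requests/day, 2,000 input and 500 output tokens each.
print(f"${monthly_cost_usd(1_000, 2_000, 500):.2f}")
```

For that hypothetical workload the estimate comes to $19.50 per month: 60M input tokens at $0.20 per 1M plus 15M output tokens at $0.50 per 1M.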
Quick start
Minimal example using the OpenRouter API. Copy, paste, replace the key.
```python
from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible endpoint, so the OpenAI SDK can be pointed at it.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # replace with your OpenRouter API key
)

resp = client.chat.completions.create(
    model="xai/grok-4-1-fast",  # slug as shown on this page; confirm it against OpenRouter's model list
    messages=[{"role": "user", "content": "Explain quantum computing in one sentence."}],
)
print(resp.choices[0].message.content)
```
Providers & performance
1 provider
Inference routes for this model, sorted by throughput. Latency is time to first token (TTFT); throughput is output tokens per second. Data is from OpenRouter, measured over the last 30 minutes.
| Provider | Throughput | Latency (TTFT) | Input $ / 1M | Output $ / 1M | Context | Quant | Supports |
|---|---|---|---|---|---|---|---|
| xAI | 103 tok/s | 641 ms | $0.20 | $0.50 | 2.0M | - | tools · json |
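
The two performance columns combine into a rough end-to-end time for a streamed response: time to first token plus output length divided by throughput. The sketch below uses the table's snapshot values as defaults. The function name is illustrative, and it assumes throughput stays constant for the whole response, which real traffic will not guarantee.

```python
def estimated_response_seconds(output_tokens, ttft_ms=641, tokens_per_second=103):
    """Rough total time: time to first token plus generation time (illustrative helper)."""
    return ttft_ms / 1000 + output_tokens / tokens_per_second


# Hypothetical 500-token reply at the snapshot figures from the table.
print(round(estimated_response_seconds(500), 2))
```

At those figures a 500-token reply takes roughly 5.5 seconds, almost all of it generation time, not initial latency.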
Integrations & tooling support
- Tool calling: supported (listed as "tools" in the provider table above)
- Structured outputs: supported (listed as "json" in the provider table above)
Price vs quality
This model has no benchmark scores recorded yet.