Category: AI & Machine Learning | Authentication: Bearer Token

Groq REST API

Ultra-fast AI inference with LPU technology

Groq provides fast AI inference powered by its custom Language Processing Unit (LPU) architecture. Developers use Groq's API to access language models such as Llama, Mixtral, and Gemma with low latency and high throughput. The API is OpenAI-compatible, so existing LLM applications can usually switch to it by changing the base URL and API key.

Base URL https://api.groq.com/openai/v1
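
Every endpoint hangs off this base URL and expects the API key as a Bearer token. A minimal sketch of that pattern in Python, using only the standard library: the helper names `build_request` and `send` are illustrative and not part of any Groq SDK, and `gsk_YOUR_API_KEY` is a placeholder.

```python
import json
import urllib.request

BASE_URL = "https://api.groq.com/openai/v1"  # base URL listed on this page


def build_request(path, api_key, payload=None):
    """Build (but do not send) an authenticated request for `path`."""
    url = BASE_URL + "/" + path.lstrip("/")
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        # A JSON body makes urllib issue a POST; without one it issues a GET.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers)


def send(request, timeout=30):
    """Send the request and decode the JSON body (needs network and a valid key)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request is side-effect free; nothing is sent here.
req = build_request("/models", "gsk_YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Calling `send(req)` with a real key would perform the request. Keeping construction and sending separate makes the header and URL logic easy to check offline.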

API Endpoints

Method  Endpoint                           Description
POST    /chat/completions                  Create a chat completion with streaming or non-streaming responses
GET     /models                            List all available models and their capabilities
GET     /models/{model_id}                 Retrieve detailed information about a specific model
POST    /completions                       Generate text completions from a prompt
POST    /embeddings                        Create embeddings from input text for semantic search or RAG applications
POST    /audio/transcriptions              Transcribe audio files to text using Whisper models
POST    /audio/translations                Translate audio files to English text
GET     /usage                             Retrieve API usage statistics and quota information
POST    /moderations                       Check text content for policy violations
DELETE  /chat/completions/{completion_id}  Cancel an ongoing streaming completion request
GET     /rate_limits                       Get current rate limit status and remaining quota
POST    /chat/completions/function_call    Create chat completions with function calling capabilities
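
The function-calling row can be illustrated with a short sketch. It assumes the OpenAI chat-completions schema that this API mirrors, where tools are declared in a `tools` array on a regular chat completion request and the model answers with `tool_calls`. The `get_weather` function, the helper names, and the sample response are all hypothetical, written by hand to show the assumed shape.

```python
import json


def get_weather(city):
    """Hypothetical local function the model is allowed to call."""
    return {"city": city, "forecast": "sunny"}


LOCAL_TOOLS = {"get_weather": get_weather}


def build_tool_payload(user_prompt, model="llama-3.3-70b-versatile"):
    """Chat payload that advertises one tool in the OpenAI-style `tools` schema."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the weather forecast for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
        "tool_choice": "auto",
    }


def run_tool_calls(response):
    """Run each requested tool call and return the `tool` role messages."""
    message = response["choices"][0]["message"]
    results = []
    for call in message.get("tool_calls") or []:
        func = LOCAL_TOOLS[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string, not as a dict.
        args = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(func(**args)),
        })
    return results


# Hand-written sample response in the assumed shape, for illustration only.
sample_response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_1",
                "type": "function",
                "function": {"name": "get_weather", "arguments": '{"city": "Oslo"}'},
            }],
        }
    }]
}
print(run_tool_calls(sample_response))
```

In a full loop, the returned `tool` messages would be appended to the conversation and sent back so the model can produce its final answer.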

Code Examples

curl https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer gsk_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing in simple terms"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
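
The curl request above returns a single JSON body. For the streaming mode mentioned in the endpoint list, OpenAI-compatible APIs typically send server-sent events: lines of the form `data: {json}` ending with `data: [DONE]`. The sketch below assumes that format. It rebuilds the same request body in Python with a `stream` flag and parses a hand-written sample stream; `build_chat_payload` and `iter_stream_text` are illustrative names, not library functions.

```python
import json


def build_chat_payload(prompt, stream=False):
    """Same body as the curl example, plus an optional `stream` flag."""
    return {
        "model": "llama-3.3-70b-versatile",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.7,
        "max_tokens": 1024,
        "stream": stream,
    }


def iter_stream_text(lines):
    """Yield text fragments from an iterable of SSE lines (str or bytes)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        choices = json.loads(data).get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


# Hand-written sample stream in the assumed format, for illustration only.
sample_lines = [
    b'data: {"choices": [{"delta": {"role": "assistant"}}]}\n',
    b"\n",
    b'data: {"choices": [{"delta": {"content": "Hello"}}]}\n',
    b'data: {"choices": [{"delta": {"content": ", world"}}]}\n',
    b"data: [DONE]\n",
]
print("".join(iter_stream_text(sample_lines)))
```

The response object returned by `urllib.request.urlopen` iterates over lines as bytes, so it could be passed to `iter_stream_text` directly in place of `sample_lines`.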

Use Groq from Claude / Cursor / ChatGPT

Get a hosted MCP endpoint for Groq. Paste your Groq API key, copy the URL you get back, and add it to Claude Desktop, Cursor, or any AI client that supports remote MCP. Your AI calls Groq directly with your credentials. There is no local install, and it works on mobile.

groq_chat_completion: Generate AI responses using Groq's ultra-fast inference with support for multiple models including Llama, Mixtral, and Gemma
groq_stream_completion: Stream chat completions in real-time for interactive applications with minimal latency
groq_function_call: Execute structured function calls with AI models for tool use and API integration workflows
groq_list_models: Retrieve available models and their specifications to select optimal models for specific tasks
groq_transcribe_audio: Transcribe audio files to text using Whisper models with high accuracy and speed

Connect in 60 seconds

Paste your Groq key → get an MCP URL → paste into Claude/Cursor. Hosted by IOX, encrypted at rest.
