# API Documentation

An OpenAI-compatible chat completion LLM API to easily integrate AI into your applications.
## Quick Start
All Mammouth subscribers have some credits included.

| Plan | Starter | Standard | Expert |
|---|---|---|---|
| Monthly credits | $2 | $4 | $10 |

➡️ Get your API key.
### With the Mammouth API directly

Generates a chat completion response based on your prompt.
```python
import requests

url = "https://api.mammouth.ai/v1/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
}
data = {
    "model": "gpt-4.1",
    "messages": [
        {
            "role": "user",
            "content": "Explain the basics of machine learning",
        }
    ],
}

response = requests.post(url, headers=headers, json=data)
print(response.json())
```
```javascript
// On Node 18+, fetch is available globally and node-fetch is unnecessary.
const fetch = require("node-fetch");

async function callMammouth() {
  const url = "https://api.mammouth.ai/v1/chat/completions";
  const headers = {
    Authorization: "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  };
  const data = {
    model: "gpt-4.1",
    messages: [
      {
        role: "user",
        content: "Create an example JavaScript function",
      },
    ],
  };

  try {
    const response = await fetch(url, {
      method: "POST",
      headers: headers,
      body: JSON.stringify(data),
    });
    const result = await response.json();
    console.log(result.choices[0].message.content);
  } catch (error) {
    console.error("Error:", error);
  }
}

callMammouth();
```
```bash
curl -X POST https://api.mammouth.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {
        "role": "user",
        "content": "Hello, how are you doing?"
      }
    ]
  }'
```
➡️ Get your API key in your settings.
### With the OpenAI Library

```python
from openai import OpenAI

# Configure the client to use Mammouth.ai (openai>=1.0 interface;
# the legacy openai.api_base style was removed in v1.0)
client = OpenAI(
    base_url="https://api.mammouth.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "user", "content": "What are the benefits of renewable energy?"}
    ],
)
print(response.choices[0].message.content)
```
## Response Format

### Successful Response

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4.1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing very well, thank you for asking. How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 19,
    "total_tokens": 31
  }
}
```
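As a sketch, the two fields most applications read from this payload are the assistant's message text and the token usage, assuming the response has already been parsed into a Python dict (e.g. via `response.json()`); the helper name here is our own, not part of the API:

```python
def extract_reply(resp: dict) -> tuple[str, int]:
    """Pull the assistant's text and total token count out of a
    chat.completion payload."""
    content = resp["choices"][0]["message"]["content"]
    total_tokens = resp["usage"]["total_tokens"]
    return content, total_tokens

# The example response above, trimmed to the fields this helper reads.
example = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 19, "total_tokens": 31},
}

text, tokens = extract_reply(example)  # → ("Hello! How can I help you today?", 31)
```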
### Streaming Response

When `stream: true` is set, responses are returned as Server-Sent Events:

```text
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4.1","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"gpt-4.1","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: [DONE]
```
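A minimal sketch of reassembling the streamed deltas into one string. It assumes only the SSE line format shown above; with `requests`, you would feed it `response.iter_lines(decode_unicode=True)` from a request made with `stream=True` in the body:

```python
import json

def join_stream(sse_lines):
    """Concatenate delta.content fragments from 'data:' lines,
    stopping at the [DONE] sentinel."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content", ""))
    return "".join(parts)

# The example stream above, reduced to the fields the parser reads.
stream = [
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    "data: [DONE]",
]
print(join_stream(stream))  # → Hello!
```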
## Models & Pricing

| Model | Input ($/M tokens) | Output ($/M tokens) |
|---|---|---|
| gpt-4.1 | 2 | 8 |
| gpt-4.1-mini | 0.4 | 1.6 |
| gpt-4.1-nano | 0.1 | 0.4 |
| gpt-4o | 2.5 | 10 |
| o4-mini | 1.1 | 4.4 |
| o3 | 2 | 8 |
| mistral-large-2411 | 2 | 6 |
| mistral-medium-3 | 0.4 | 2 |
| mistral-small-3.2-24b-instruct | 0.1 | 0.3 |
| magistral-medium-2506 | 2 | 5 |
| codestral-2501 | 0.3 | 0.9 |
| grok-3 | 3 | 15 |
| grok-3-mini | 0.3 | 0.5 |
| gemini-2.5-flash | 0.3 | 2.5 |
| gemini-2.5-pro | 2.5 | 15 |
| deepseek-r1-0528 | 3 | 8 |
| deepseek-v3-0324 | 0.9 | 0.9 |
| llama-4-maverick | 0.22 | 0.88 |
| llama-4-scout | 0.15 | 0.6 |
| claude-3-5-haiku-20241022 | 0.8 | 4 |
| claude-3-7-sonnet-20250219 | 3 | 15 |
| claude-sonnet-4-20250514 | 3 | 15 |
| claude-opus-4-20250514 | 15 | 75 |
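Combined with the `usage` block returned with each response, these per-million-token prices make cost estimation a one-line calculation. A sketch (the prices are copied from the table above and may drift; check the table for current values):

```python
# $ per million tokens as (input, output), copied from the pricing table.
PRICES = {
    "gpt-4.1": (2.0, 8.0),
    "gpt-4.1-mini": (0.4, 1.6),
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Dollar cost of one request, from its usage counts."""
    input_price, output_price = PRICES[model]
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000

# The example usage block earlier: 12 prompt + 19 completion tokens on gpt-4.1.
cost = estimate_cost("gpt-4.1", 12, 19)  # (12*2 + 19*8) / 1e6 = 0.000176
```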
Prices may vary and may not be up to date in this table.

📜 Usage and cost are logged in your settings.

💡 We added aliases aligned with the Mammouth app to simplify model selection: if you write `mistral`, it will use `mistral-medium-3`.
## Error Codes

| Code | Description |
|---|---|
| 400 | Bad Request - Missing or incorrect parameters |
| 401 | Unauthorized - Invalid API key |
| 429 | Too Many Requests - Rate limit exceeded |
| 500 | Internal Server Error - Server-side issue |
| 503 | Service Unavailable - Server temporarily unavailable |
### Error Response Format

```json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
```
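Of these codes, 429, 500 and 503 are transient and worth retrying with backoff, while 400 and 401 indicate a request that will fail again unchanged. A sketch of that policy (the helper names are our own, not part of the API):

```python
RETRYABLE = {429, 500, 503}  # transient errors worth another attempt

def should_retry(status_code: int, attempt: int, max_attempts: int = 3) -> bool:
    """Retry only transient errors, and only while attempts remain."""
    return status_code in RETRYABLE and attempt < max_attempts

def backoff_seconds(attempt: int, cap: float = 30.0) -> float:
    """Exponential backoff: 1s, 2s, 4s, ... capped."""
    return min(2.0 ** attempt, cap)
```

In a request loop you would `time.sleep(backoff_seconds(attempt))` between tries, and surface the `error.message` field from the response body once retries are exhausted.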
## Parameters

### Required Parameters

| Parameter | Type | Description |
|---|---|---|
| messages | array | List of messages in the conversation |
| model | string | Model identifier to use |

### Optional Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| temperature | number | 0.7 | Controls randomness and creativity (0.0 to 2.0) |
| max_tokens | integer | 2048 | Maximum number of tokens to generate |
| top_p | number | 1.0 | Nucleus sampling cutoff; lower values restrict output to higher-probability tokens (0.0 to 1.0) |
| stream | boolean | false | Stream the response in real time as Server-Sent Events |
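As a sketch, a small helper that assembles a request body with these defaults (the defaults mirror the table above; the function name is our own, not part of the API):

```python
def build_payload(model: str, messages: list, *, temperature: float = 0.7,
                  max_tokens: int = 2048, top_p: float = 1.0,
                  stream: bool = False) -> dict:
    """Assemble a chat completion request body with the documented defaults."""
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "top_p": top_p,
        "stream": stream,
    }

payload = build_payload(
    "gpt-4.1",
    [{"role": "user", "content": "Hello"}],
    temperature=0.2,  # low temperature for predictable output
)
```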
## Optimization Tips

### Temperature Settings

- 0.0 - 0.3: Very consistent and predictable responses
- 0.4 - 0.7: Balance between creativity and coherence
- 0.8 - 1.0: More creative and varied responses
### Message Structure

```json
{
  "messages": [
    {
      "role": "system",
      "content": "You are an AI assistant specialized in programming."
    },
    {
      "role": "user",
      "content": "How to optimize a for loop in Python?"
    }
  ]
}
```
### Role Types

- `system`: Sets the behavior and context for the assistant
- `user`: Represents messages from the user
- `assistant`: Represents previous responses from the AI
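The three roles combine into a conversation history like the following hypothetical multi-turn exchange, where the earlier `assistant` message gives the final `user` question its context:

```python
messages = [
    {"role": "system", "content": "You are a concise math tutor."},
    {"role": "user", "content": "What is 7 times 8?"},
    {"role": "assistant", "content": "7 times 8 is 56."},
    # A follow-up that only makes sense with the assistant turn above in context.
    {"role": "user", "content": "And half of that?"},
]
```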
## Migration from OpenAI

If you're already using OpenAI's API, migrating to Mammouth.ai is simple:

- Change the base URL from `https://api.openai.com/v1` to `https://api.mammouth.ai/v1`
- Update your API key
- Keep all other parameters the same
### OpenAI Python Library

```python
from openai import OpenAI

# Before
client = OpenAI(
    base_url="https://api.openai.com/v1",
    api_key="sk-openai-key",
)

# After
client = OpenAI(
    base_url="https://api.mammouth.ai/v1",
    api_key="your-mammouth-key",
)
```
➡️ Get your API key.