Chat Completion
This endpoint generates chat completions based on a list of messages.
https://api.geodd.io/inference/v1/chat/completions
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
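As a minimal sketch of how the bearer header and a JSON body fit together, the Python below builds (but does not send) a request using only the standard library. The helper name `build_chat_request` is hypothetical, and the POST method is an assumption based on the usual convention for chat completion endpoints, since this page does not state the method.

```python
import json
import urllib.request

API_URL = "https://api.geodd.io/inference/v1/chat/completions"


def build_chat_request(token, model, messages, **params):
    """Build an authenticated request object (hypothetical helper; nothing is sent)."""
    body = {"model": model, "messages": messages}
    body.update(params)  # optional fields such as temperature or max_tokens
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # header form described above
            "Content-Type": "application/json",
        },
        method="POST",  # assumed; not stated in this reference
    )


request = build_chat_request(
    "YOUR_TOKEN",  # placeholder, replace with a real auth token
    "thedrummer/unslopnemo-12b",
    [{"role": "user", "content": "Hello"}],
    max_tokens=64,
)
print(request.get_header("Authorization"))
```

Passing the resulting object to `urllib.request.urlopen` would send it; the response body is not documented here, so the sketch stops before that step.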
Body
application/json
Model name to use for text generation.
thedrummer/unslopnemo-12b
A list of messages comprising the conversation so far.
The role of the message's author.
user, assistant
The contents of the user's input query or the assistant's previous response.
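A conversation is therefore an ordered list of role/content pairs. The sketch below shows one way to accumulate turns while checking them against the two roles listed above. `add_message` is a hypothetical helper, and it rejects other roles only because this reference does not list them.

```python
ALLOWED_ROLES = {"user", "assistant"}


def add_message(history, role, content):
    """Return a new history list with one validated message appended."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unsupported role: {role!r}")
    if not isinstance(content, str) or not content:
        raise ValueError("content must be a non-empty string")
    return history + [{"role": role, "content": content}]


history = []
history = add_message(history, "user", "Weather in Colombo?")
history = add_message(history, "assistant", "I can look that up for you.")
history = add_message(history, "user", "Thanks, please do.")
print([message["role"] for message in history])  # ['user', 'assistant', 'user']
```

Returning a new list instead of mutating the old one keeps earlier snapshots of the conversation intact, which is convenient when retrying a request.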
Maximum number of tokens to generate.
Controls randomness.
0 → deterministic
1 → more creative
Nucleus sampling. Limits token selection to the most likely tokens whose cumulative probability reaches this value.
Stops generation when one of the sequences is generated.
Makes output reproducible (best effort).
Limits token selection to top-K candidates.
Filters tokens below a minimum probability threshold.
Penalizes tokens that appear frequently.
Encourages introducing new tokens.
Discourages repeating tokens or phrases.
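To make the sampling controls above concrete, here is a toy sketch of how temperature, top-K, minimum-probability, and nucleus filtering are commonly applied to a next-token distribution. It illustrates the general techniques and is not the service's decoding code. The function name, the order of the filters, and the convention that the minimum-probability threshold is relative to the most likely token are all assumptions.

```python
import math


def filter_distribution(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    """Return {token: probability} after temperature scaling and filtering.

    Toy illustration only; real inference engines may order or define
    these filters differently.
    """
    if temperature <= 0:
        # Temperature 0 is treated as greedy decoding: keep only the best token.
        best = max(logits, key=logits.get)
        return {best: 1.0}

    # Temperature-scaled softmax (subtract the max for numerical stability).
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((tok, w / total) for tok, w in weights.items()),
        key=lambda item: item[1],
        reverse=True,
    )

    # Top-K: keep only the K most likely tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Min-p (assumed convention): drop tokens below min_p * best probability.
    floor = min_p * ranked[0][1]
    ranked = [(tok, p) for tok, p in ranked if p >= floor]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize the survivors so they sum to 1.
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


logits = {"sunny": 2.0, "rainy": 1.0, "humid": 0.5, "snowy": -3.0}
greedy = filter_distribution(logits, temperature=0)
nucleus = filter_distribution(logits, temperature=0.7, top_k=3, top_p=0.9)
print(greedy)
print(sorted(nucleus))
```

Lower temperatures sharpen the distribution, so fewer tokens are needed to reach the nucleus threshold; in this example only the two most likely tokens survive.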
Constrains output format.
{
  "type": "json_object"
}

{
  "type": "json_schema",
  "json_schema": {
    "name": "response",
    "schema": {
      "type": "object",
      "properties": {
        "answer": { "type": "string" }
      },
      "required": ["answer"]
    }
  }
}
Enables strict schema-constrained generation. When enabled, the model will follow JSON/schema constraints more reliably.
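A sketch of consuming a schema-constrained reply: it builds the json_schema object shown above and checks that a reply string parses and contains the required keys. The request field name `response_format` follows the common OpenAI-compatible convention and is an assumption, since this page does not name the field. `check_reply` is a hypothetical helper, and `raw_reply` is a hand-written stand-in, not real model output.

```python
import json

ANSWER_SCHEMA = {
    "type": "object",
    "properties": {"answer": {"type": "string"}},
    "required": ["answer"],
}

# Assumed field name: "response_format" (not named in this reference).
request_fragment = {
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "response", "schema": ANSWER_SCHEMA},
    }
}


def check_reply(text, schema):
    """Parse a reply and verify required keys and string-typed properties.

    Deliberately minimal; this is not a full JSON Schema validator.
    """
    data = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    for key in schema.get("required", []):
        if key not in data:
            raise ValueError(f"missing required key: {key}")
    for key, spec in schema.get("properties", {}).items():
        if spec.get("type") == "string" and key in data:
            if not isinstance(data[key], str):
                raise ValueError(f"{key} must be a string")
    return data


raw_reply = '{"answer": "Colombo is warm and humid."}'  # stand-in text
print(check_reply(raw_reply, ANSWER_SCHEMA)["answer"])
```

Validating on the client side remains useful even with constrained decoding, because a reply cut off by the token limit may not be complete JSON.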
List of available tools (functions). Allows the model to request external function execution.
[
  {
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get weather by city",
      "parameters": {
        "type": "object",
        "properties": {
          "city": { "type": "string" }
        },
        "required": ["city"]
      }
    }
  }
]
Controls how tools are used.
"none": never call tools
"auto": model decides
"required": must call a tool
{
  "type": "function",
  "function": { "name": "get_weather" }
}
Responses
Authorization header.
Example Request:
{
  "model": "thedrummer/unslopnemo-12b",
  "messages": [
    { "role": "user", "content": "Weather in Colombo?" }
  ],
  "temperature": 0.7,
  "top_p": 0.9,
  "max_tokens": 256,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "parameters": {
          "type": "object",
          "properties": {
            "city": { "type": "string" }
          },
          "required": ["city"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}
Notes
- Tool execution is not handled by the model. Your application must run the requested function itself and return the results.
- Structured outputs rely on constrained decoding and may vary slightly by underlying model architecture.
- Sampling parameters that are not set explicitly in the request JSON may fall back to framework defaults.
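
The first note can be sketched as a small dispatch step on the application side. The shape of the assistant's tool request (a `tool_calls` list whose entries carry `id`, `function.name`, and a JSON-encoded `function.arguments` string) and the `tool`-role result message follow the common OpenAI-compatible convention. Both are assumptions here, because this page does not document the response body. `get_weather` is a stub returning canned data.

```python
import json


def get_weather(city):
    """Stub tool: a real application would query a weather service here."""
    return {"city": city, "forecast": "placeholder"}


# Registry mapping tool names declared in the request to local callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(assistant_message):
    """Execute each requested tool locally and build result messages."""
    results = []
    for call in assistant_message.get("tool_calls", []):
        name = call["function"]["name"]
        if name not in TOOLS:
            raise KeyError(f"model requested unknown tool: {name}")
        arguments = json.loads(call["function"]["arguments"])
        output = TOOLS[name](**arguments)
        results.append(
            {
                "role": "tool",  # assumed role for tool results
                "tool_call_id": call["id"],
                "content": json.dumps(output),
            }
        )
    return results


# Hand-written example of what a tool request might look like (assumed shape).
example_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Colombo"}'},
        }
    ],
}
tool_results = run_tool_calls(example_message)
print(tool_results)
```

Under the same assumptions, the application would append the assistant message and these result messages to the conversation and send a second request so the model can compose its final answer.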