Nvidia Nemotron 3 Super | Model Details | Geodd AI

nvidia/NVIDIA-Nemotron-3-Super-120B-A12B

Nemotron-3-Super-120B-A12B-FP8 is a large language model (LLM) trained by NVIDIA for agentic, reasoning, and conversational tasks. It is optimized for collaborative agents and high-volume workloads such as IT ticket automation. Like other models in the family, it responds to a query by first generating a reasoning trace and then giving a final response. Reasoning can be configured through a flag in the chat template. The model is served on the Geodd network and optimized for high-performance inference in production workloads.

Features

Serverless API

Pay per token via our optimized endpoints.

Available Serverless
Run queries immediately, pay only for usage
Input: $0.09 / M tokens
Output: $0.50 / M tokens
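
As a rough illustration of the per-token pricing above, the following Python sketch estimates the cost of a single request from its token counts. The helper name `estimate_cost_usd` and the constants are illustrative and not part of any Geodd SDK. The rates are the ones listed on this page, and actual billing (rounding, minimums) may differ.

```python
from decimal import Decimal

# Listed serverless rates, in USD per one million tokens.
INPUT_USD_PER_M = Decimal("0.09")
OUTPUT_USD_PER_M = Decimal("0.50")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request (hypothetical helper, not an official API)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_M
        + Decimal(output_tokens) * OUTPUT_USD_PER_M
    )
    return total / TOKENS_PER_M
```

Under these rates, a request with 200,000 input tokens and 10,000 output tokens comes to about $0.023. `Decimal` is used instead of floats so that small per-request amounts add up without binary rounding drift.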

API Usage

cURL
curl --location 'https://api.geodd.io/gateway/v1/chat/completions' \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B",
  "messages": [
    { "role": "user", "content": "Hello, how are you?" }
  ]
}'
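
The same request can be sketched in Python using only the standard library. This is a minimal sketch and not an official client: the function names `build_chat_request` and `send_chat_request` are hypothetical, and the response schema is not described on this page, so the decoded JSON is returned unchanged.

```python
import json
import urllib.request

API_URL = "https://api.geodd.io/gateway/v1/chat/completions"
MODEL_ID = "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B"


def build_chat_request(token: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the cURL example."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON body as-is.

    The response format is an unknown here, so no fields are extracted.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

To try it, call `send_chat_request(build_chat_request(token, "Hello, how are you?"))` with a valid API token. Building and sending are kept separate so the request can be inspected or logged before any network call is made.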

Info

Provider: nvidia
Quantization: fp4
Created: 5/13/2026
Available Regions: US

Supported Functionality

Context Length: 1,000,000
Max Output: 1,000,000
Serverless: Supported
Input Capabilities: text
Output Capabilities: text

Parameters

temperature, top_p, top_k, frequency_penalty, presence_penalty, seed, max_tokens, stop
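
A request body using these parameters might be assembled as in the sketch below. It assumes the parameters are passed as top-level fields next to `model` and `messages`, as is common for chat-completions style APIs; this page does not confirm that placement or the accepted value ranges. The helper `build_payload` is hypothetical, and the only limit it checks is the Max Output value listed above.

```python
# Parameter names as listed on this page.
SUPPORTED_PARAMS = frozenset({
    "temperature", "top_p", "top_k", "frequency_penalty",
    "presence_penalty", "seed", "max_tokens", "stop",
})
MAX_OUTPUT_TOKENS = 1_000_000  # "Max Output" from the listing above


def build_payload(model: str, messages: list, **params) -> dict:
    """Assemble a request body, rejecting parameter names not listed on this page.

    Assumption: parameters are top-level JSON fields next to "model" and "messages".
    """
    unknown = set(params) - SUPPORTED_PARAMS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    max_tokens = params.get("max_tokens")
    if max_tokens is not None and not 1 <= max_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError(f"max_tokens must be between 1 and {MAX_OUTPUT_TOKENS}")
    return {"model": model, "messages": messages, **params}


payload = build_payload(
    "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B",
    [{"role": "user", "content": "Hello, how are you?"}],
    temperature=0.2,  # example values only, not recommended defaults
    max_tokens=256,
    seed=42,
)
```

Validating names on the client side catches typos such as `max_token` before a request is sent, though the server remains the authority on what it accepts.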