NVIDIA: Nemotron 3 Nano 30B A3B

nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16

Nemotron-3-Nano-30B-A3B-BF16 is a large language model (LLM) trained from scratch by NVIDIA and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and tasks by first generating a reasoning trace and then concluding with a final response. This model is optimized for high-performance inference on the Geodd network, with an emphasis on speed and reliability for production workloads.

Serverless API

Pay per token via our optimized endpoints.

Available Serverless

Run queries immediately, pay only for usage

Input: $0.050 / M tokens

Output: $0.200 / M tokens
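As a rough illustration of the per-token pricing above, the cost of a single request can be estimated from its token counts. This is a minimal sketch: the function name `estimate_cost_usd` and the constant names are hypothetical, and it assumes billing is strictly linear in tokens at the listed rates, with no minimums or rounding.

```python
# Rates copied from the pricing listed above (USD per one million tokens).
INPUT_USD_PER_MILLION = 0.050
OUTPUT_USD_PER_MILLION = 0.200


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming purely linear per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000
```

For example, a request with 12,000 input tokens and 3,000 output tokens comes to about $0.0012 under these assumptions: 12,000 × 0.050 / 1,000,000 plus 3,000 × 0.200 / 1,000,000.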

API Usage

cURL

curl --location 'https://api.geodd.io/gateway/v1/chat/completions' \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16",
  "messages": [
    { "role": "user", "content": "Hello, how are you?" }
  ]
}'
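Python

The same request can be sketched in Python using only the standard library. The endpoint, headers, and body mirror the cURL example; the helper names `build_request` and `send` are hypothetical, and the shape of the response body is not documented on this page, so `send` simply returns the decoded JSON.

```python
import json
import urllib.request

# Endpoint and model ID as given in the cURL example.
API_URL = "https://api.geodd.io/gateway/v1/chat/completions"
MODEL = "nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16"


def build_request(token: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) the POST request from the cURL example."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON body (schema not documented here)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_request("<token>", "Hello, how are you?"))` with a valid token would issue the request; building and sending are kept separate so the request can be inspected without network access.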

Info

Provider: nvidia

Quantization: bf16

Created: 5/1/2026

Available Regions: US

Supported Functionality

Context Length: 262,144

Max Output: 262,144

Serverless: Supported

Input Capabilities: text

Output Capabilities: text

Parameters

temperature, top_p, top_k, frequency_penalty, presence_penalty, seed, max_tokens, stop
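The parameters listed above can be attached to a request body as in the sketch below. It assumes, without confirmation from this page, that they are passed as top-level JSON fields alongside `model` and `messages`, as in other chat-completions-style APIs. The helper `build_payload` is hypothetical, and the values shown in the usage note are illustrative rather than recommended defaults.

```python
# Parameter names copied from the list above.
SUPPORTED_PARAMETERS = {
    "temperature",
    "top_p",
    "top_k",
    "frequency_penalty",
    "presence_penalty",
    "seed",
    "max_tokens",
    "stop",
}


def build_payload(model: str, messages: list, **params) -> dict:
    """Build a request body, rejecting parameters not listed as supported.

    Assumption: parameters are top-level fields of the JSON body.
    """
    unknown = set(params) - SUPPORTED_PARAMETERS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    return {"model": model, "messages": messages, **params}
```

For instance, `build_payload("nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16", [{"role": "user", "content": "Hello"}], temperature=0.2, max_tokens=256, seed=7)` yields a body that could be serialized with `json.dumps` and sent to the endpoint. Rejecting unknown names locally catches typos before a request is billed.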