API Reference

Build with signal speed.

quickstart.ts
import { Axon } from '@axon/sdk';

const client = new Axon({
  apiKey: process.env.AXON_API_KEY,
  region: 'us-west-2',
});

// Generate a completion — P95 <20ms guaranteed
const completion = await client.inference.complete({
  model: 'llama-3-70b',
  messages: [{ role: 'user', content: 'Explain neural networks' }],
  max_tokens: 512,
});

console.log(completion.choices[0].message.content);
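The quickstart reads `completion.choices[0].message.content`; a minimal sketch of the response shape that access pattern implies. Only `choices[0].message.content` is confirmed by the quickstart — the surrounding field names (such as `role`) are assumptions for illustration:

```typescript
// Response shape implied by the quickstart's access pattern.
// Fields beyond choices[0].message.content are assumptions.
interface CompletionMessage {
  role: string;
  content: string;
}

interface Completion {
  choices: { message: CompletionMessage }[];
}

// Helper: pull the first choice's text, guarding against an empty choices array.
function firstText(c: Completion): string | undefined {
  return c.choices[0]?.message.content;
}

const sample: Completion = {
  choices: [{ message: { role: 'assistant', content: 'A neural network is...' } }],
};

console.log(firstText(sample)); // A neural network is...
```

Typing the response up front lets the compiler catch a missing `choices` entry before it becomes a runtime crash.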
REST Endpoints
POST /v1/inference/completions   Generate text completions
POST /v1/inference/embeddings    Generate embedding vectors
GET  /v1/models                  List available deployed models
POST /v1/models/{id}/deploy      Deploy a new model version
GET  /v1/pipelines               List MLOps pipelines
POST /v1/vectors/upsert          Upsert vectors into an index
POST /v1/vectors/query           Run a semantic search query
POST /v1/evals/run               Run an evaluation suite
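The vector endpoints take JSON bodies; a minimal sketch of what upsert and query payloads might look like. The field names (`index`, `vectors`, `top_k`, and so on) are assumptions for illustration — this reference does not specify the request schema:

```typescript
// Hypothetical request bodies for POST /v1/vectors/upsert and
// POST /v1/vectors/query. All field names are assumptions.
interface UpsertRequest {
  index: string;
  vectors: { id: string; values: number[]; metadata?: Record<string, string> }[];
}

interface QueryRequest {
  index: string;
  vector: number[];
  top_k: number;
}

// Build an upsert body for a single vector.
function buildUpsert(index: string, id: string, values: number[]): UpsertRequest {
  return { index, vectors: [{ id, values }] };
}

const upsert = buildUpsert('docs', 'doc-1', [0.1, 0.2, 0.3]);
const query: QueryRequest = { index: 'docs', vector: [0.1, 0.2, 0.3], top_k: 5 };

console.log(JSON.stringify(upsert));
console.log(JSON.stringify(query));
```

A typical flow pairs the two: embed your documents, upsert them with stable `id`s, then embed the user's question and query with a small `top_k`.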
Join the Waitlist

Ready to build at signal speed?

2,400 teams are already in line. Request access today and we'll reach out when your spot is ready. No spam. No BS.

No credit card required · 14-day free trial · Cancel anytime