AXIOM Cloud
PORTAL

AXIOM CRM

Context Reasoning Manager — Intelligent LLM Gateway

One API endpoint. Five model tiers. Automatic complexity routing. Pay only for what you use — 75% of requests route to free models.

1. You send a prompt: any OpenAI-compatible client
2. We classify complexity: signal-based scoring in <1ms
3. We pick the cheapest capable model: 75% of requests route free

Five Intelligence Tiers

T0
TRIVIAL
Gemini Flash-Lite
FREE
Greetings and trivial math. Runs on Google's free cloud tier, using the same model as T1.
"hello"
"what is 2+2"
T1
ROUTINE
Gemini Flash-Lite
FREE
File operations, boilerplate code, documentation, simple unit tests. Google's free tier handles 75% of daily coding work.
"list files in src"
"write a unit test"
T2
COMPLEX
Gemini Flash
$0.15 / 1M tokens
Debugging, cross-module analysis, build pipelines, firmware flashing. Supports tool calling for IDE integrations.
"debug the timeout"
"flash ESP32 to COM10"
T3
AGENT
Claude Sonnet 4
$3 / 1M tokens
Multi-step workflows, full-project refactoring, security audits. Anthropic's 200K context window for complex reasoning chains.
"refactor the entire auth"
"full rewrite of module"
T4
ARCHITECT
Claude Opus 4
$15 / 1M tokens
System architecture, distributed design, regulatory compliance (MaRisk, DORA, Basel). 200K context for the hardest problems.
"design distributed auth"
"MaRisk compliance audit"
94%: average cost reduction vs. single-model
75%: requests routed to free tiers
<1ms: classification overhead

Recent Transactions

Type | Amount | Description | Time

System Status

API Gateway Online
Database Connected

Credit Packs

Pre-paid credits for LLM API usage. 1 credit = $0.01 USD.
Starter: 1,000 credits for $10 USD (~220 T2 requests or ~22 T3 requests)
Enterprise: 10,000 credits for $100 USD (~2,200 T2 requests or ~220 T3 requests)
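
As a rough illustration of how token usage could translate into credits, the sketch below combines the stated rate (1 credit = $0.01 USD) with the per-tier list prices on this page. It assumes a single flat price per token with no separate input/output rates, which the page does not confirm; the function and table names are hypothetical.

```python
# Assumed flat list prices, expressed in credits per 1M tokens
# (1 credit = $0.01 USD, so $0.15 -> 15 credits, $3 -> 300, $15 -> 1500).
CREDITS_PER_MILLION_TOKENS = {"T0": 0, "T1": 0, "T2": 15, "T3": 300, "T4": 1500}


def credits_for_request(tier: str, tokens: int) -> float:
    """Estimate the credits deducted for one request (hypothetical helper)."""
    return CREDITS_PER_MILLION_TOKENS[tier] * tokens / 1_000_000


def requests_per_pack(pack_credits: int, tier: str, tokens_per_request: int) -> float:
    """Estimate how many requests of a given size a credit pack covers."""
    cost = credits_for_request(tier, tokens_per_request)
    return float("inf") if cost == 0 else pack_credits / cost
```

Under these assumptions, a 150,000-token T3 request costs 45 credits. Actual deductions depend on the gateway's real metering.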

Transaction History

Complete ledger of all credit operations
Type | Amount | Balance After | Description | Time

API Keys

Manage authentication keys for the AXIOM Gateway API
New API key created. Copy it now — it won't be shown again.
Key | Label | Status | Last Used | Created

Integration Guide

Use your API key in the Authorization header:

# OpenAI-compatible endpoint
curl -H "Authorization: Bearer axk_your_key" \
     -H "Content-Type: application/json" \
     https://api.dexsi.com/v1/chat/completions \
     -d '{"model":"gpt-4o","messages":[{"role":"user","content":"hello"}]}'

The gateway automatically classifies complexity and routes to the optimal model (T0-T4). Credits are deducted based on actual token usage.
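
The same call can be sketched in Python using only the standard library. The URL, header names, and request body mirror the curl example above; the helper names `build_request` and `send` are illustrative, and the response is assumed to be JSON in the usual OpenAI-compatible shape.

```python
import json
import urllib.request

API_URL = "https://api.dexsi.com/v1/chat/completions"  # endpoint from the curl example


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build the POST request without sending it (hypothetical helper)."""
    body = {
        "model": "gpt-4o",  # same model name as in the curl example
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Usage (performs a real network call, so it is left commented out):
# reply = send(build_request("axk_your_key", "hello"))
```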

Usage Analytics

Request distribution, cost breakdown, and tier analytics

Requests by Tier

[Chart: T0 Free, T1 Free, T2 Flash, T3 Sonnet, T4 Opus]

Cost by Provider

[Chart: Google, Anthropic]

How the Tier Decision is Made

Every request passes through a 4-layer classification pipeline in under 1ms. No manual model selection needed — the system always picks the cheapest model that can handle your task correctly.

Layer 1 — Signal Scoring
Your prompt is matched against 100+ regex patterns. Each tier has its own signal list. Matches are weighted: T1 x1.0, T2 x2.0, T3 x2.0, T4 x3.0. Highest weighted score wins.
"debug" → T2
"refactor entire" → T3
"distributed" → T4
Layer 2 — Session Momentum
If you've been working at T3 for several turns, a simple follow-up like "what about tests?" stays at T3 instead of dropping to T1. This prevents jarring model switches mid-conversation.
hint = avg(last 15min)
score += 0.4 x hint
de-escalate after 3+ simple
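
The momentum rule could look roughly like the sketch below. The text does not say what is averaged or how the bonus enters the score, so this assumes `hint` is the average tier routed in the last 15 minutes and that the bonus is added to the score of the tier nearest that average. The function name and arguments are hypothetical.

```python
WINDOW_SECONDS = 15 * 60     # "last 15min"
MOMENTUM_WEIGHT = 0.4        # "score += 0.4 x hint"
SIMPLE_STREAK_LIMIT = 3      # "de-escalate after 3+ simple"


def apply_momentum(scores, history, now, simple_streak):
    """Return the winning tier after adding session momentum.

    scores        -- {tier: weighted signal score} from Layer 1
    history       -- [(timestamp_seconds, routed_tier), ...] for this session
    now           -- current time in seconds
    simple_streak -- consecutive simple prompts so far, including this one
    """
    adjusted = dict(scores)
    recent = [tier for ts, tier in history if now - ts <= WINDOW_SECONDS]
    if recent and simple_streak < SIMPLE_STREAK_LIMIT:
        hint = sum(recent) / len(recent)
        # Assumption: the bonus is credited to the tier nearest the average.
        hinted_tier = round(hint)
        adjusted[hinted_tier] = adjusted.get(hinted_tier, 0.0) + MOMENTUM_WEIGHT * hint
    if not adjusted:
        return 0  # assumption: nothing scored and no history means trivial
    # Highest score wins; ties go to the higher tier (assumption).
    return max(adjusted, key=lambda t: (adjusted[t], t))
```

With three recent T3 turns, a follow-up scoring 1.0 at T1 picks up a 1.2 bonus at T3 and stays there. After a streak of three simple prompts, the bonus is skipped and the request drops back to T1.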
Layer 3 — Token Floor
If the conversation exceeds 150K tokens, it's forced to T3 minimum regardless of signal score. Only Claude's 200K+ context window can handle payloads that large reliably.
if tokens > 150K:
  tier = max(tier, 3)
Layer 4 — Budget Guard
Before routing, the system checks whether the provider has budget remaining. If the budget is exhausted, the request automatically falls back to a cheaper tier (T4 → T3 → T2). If no tier has budget left, the gateway returns HTTP 429 instead of failing silently.
Anthropic exhausted?
  T4 → T3 → T2
All gone? → HTTP 429
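
A sketch of the fallback chain, assuming budgets are tracked per provider and that tiers map to providers as listed on this page (Google for T0 to T2, Anthropic for T3 and T4). The exception class and function name are hypothetical. Walking all the way down to T0 is also an assumption, since the text only shows the chain to T2.

```python
class BudgetExhausted(Exception):
    """Raised when no tier has budget left; maps to HTTP 429."""

    status_code = 429


# Provider per tier, following the tier descriptions on this page.
TIER_PROVIDER = {0: "google", 1: "google", 2: "google", 3: "anthropic", 4: "anthropic"}


def pick_with_budget(tier: int, remaining: dict) -> int:
    """Return the highest tier at or below `tier` whose provider has budget.

    remaining -- {provider_name: remaining_budget}, a hypothetical structure
    """
    for candidate in range(tier, -1, -1):  # e.g. T4 -> T3 -> T2 -> ...
        provider = TIER_PROVIDER[candidate]
        if remaining.get(provider, 0) > 0:
            return candidate
    raise BudgetExhausted("all provider budgets are exhausted")
```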
Your prompt → Signal score + Momentum → Token check → Budget check → Model selected

The Five Tiers

T0
TRIVIAL
FREE
Gemini Flash-Lite. Ultra-trivial prompts are routed to Google's free tier. Same model as T1, zero cost.
T1
ROUTINE
FREE
Gemini Flash-Lite. File ops, docs, boilerplate, simple tests. Handles 75% of daily work at zero cost.
T2
COMPLEX
$0.15/1M
Gemini Flash with tool support. Debug, firmware, pipelines. 20x cheaper than Claude for mid-complexity.
T3
AGENT
$3/1M
Claude Sonnet 4. Multi-step agent workflows, full refactors, security audits. 200K context window.
T4
ARCHITECT
$15/1M
Claude Opus 4. System architecture, distributed design, regulatory compliance. 200K context for the hardest problems.

Settings

Account preferences and notification settings

Account

Active

Tenant ID

Notifications

Low balance alert

Get notified when credits drop below 100

Usage reports

Weekly email with usage summary and cost breakdown

Tier escalation alerts

Notify when requests are routed to T3/T4 (expensive tiers)

Danger Zone

Revoke all API keys

Immediately invalidate all active keys for this account

Close account

Permanently close this account. Credits are non-refundable.