Frequently Asked Questions


What is Moji Router?

Moji Router cuts the API cost of multi-turn AI products. It owns the session, accounts for how each provider caches, and tunes the routing to your own workflow. Your application keeps talking to the providers it already uses, and the router decides where each turn lands.

How does it cut the cost?

Three levers. It routes the turns that do not need the frontier model to a cheaper one, inside a spend mix you set. It groups turns so you switch models rarely, which keeps the cache warm. And it routes each session across the providers you already use to where it runs cheapest.

What does cache-aware mean?

In a long session most tokens are the same context, resent every turn. On Anthropic a cache read is billed at about a tenth of the normal input-token price, roughly a 90% discount on that repeated context. Cache-aware routing keeps a session on a warm cache so you read the context back cheaply instead of paying full price each turn.
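To make that arithmetic concrete, here is a minimal sketch of what the repeated context costs over a session, with and without a warm cache. It is an illustration under stated assumptions, not our billing model: the function name, the token price, and the session shape are hypothetical, the read multiplier of 0.1 is the "about a tenth" figure above, and the write multiplier for the first turn is an assumed placeholder you should replace with your provider's published rate.

```python
def session_context_cost(
    turns: int,
    context_tokens: int,
    price_per_token: float,
    read_multiplier: float = 0.1,    # cache read at about a tenth of the input price
    write_multiplier: float = 1.25,  # ASSUMPTION: surcharge for writing the cache once
    cached: bool = True,
) -> float:
    """Cost of resending the same context on every turn of one session."""
    if turns <= 0:
        return 0.0
    full_price = context_tokens * price_per_token
    if not cached:
        # Every turn pays full input price for the repeated context.
        return turns * full_price
    # First turn writes the cache; each later turn reads it back cheaply.
    return full_price * write_multiplier + (turns - 1) * full_price * read_multiplier


if __name__ == "__main__":
    # Hypothetical session: 20 turns, 50,000 context tokens, 3e-6 per input token.
    cold = session_context_cost(20, 50_000, 3e-6, cached=False)
    warm = session_context_cost(20, 50_000, 3e-6)
    print(f"cold cache: {cold:.4f}  warm cache: {warm:.4f}")
```

The sketch ignores output tokens and the new tokens each turn adds, and it assumes the cache never expires mid-session. A model switch or a long pause means paying the write price again, which is why keeping a session on one warm cache matters.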

Will it change my model outputs or quality?

Routing chooses where a turn runs, not what the model says. We keep the frontier model on the turns that need it and move the ones that do not, holding quality inside a bound you set. The cost-quality frontier shows the trade, so you pick the point rather than trust a black box.
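One way to picture picking a point on that frontier is the sketch below. It is a hypothetical illustration, not the router's internals: it assumes each candidate routing configuration has been scored as a (cost, quality) pair, keeps only the configurations that no cheaper option matches or beats on quality, and then picks the cheapest one that meets a quality floor. The function names and the sample scores are made up for the example.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (cost, quality); hypothetical units


def cost_quality_frontier(points: List[Point]) -> List[Point]:
    """Keep the points that no cheaper-or-equal point matches or beats on quality."""
    frontier: List[Point] = []
    best_quality = float("-inf")
    # Sort by cost ascending; at equal cost, look at the higher quality first.
    for cost, quality in sorted(points, key=lambda p: (p[0], -p[1])):
        if quality > best_quality:
            frontier.append((cost, quality))
            best_quality = quality
    return frontier


def cheapest_within_bound(frontier: List[Point], min_quality: float) -> Optional[Point]:
    """Return the cheapest frontier point meeting the quality floor, if any."""
    return next((p for p in frontier if p[1] >= min_quality), None)


if __name__ == "__main__":
    # Made-up scores for five candidate routing configurations.
    candidates = [(1.0, 0.70), (2.0, 0.80), (2.5, 0.78), (4.0, 0.95), (4.0, 0.90)]
    frontier = cost_quality_frontier(candidates)
    print(frontier)
    print(cheapest_within_bound(frontier, min_quality=0.79))
```

How quality is scored is the part that depends on your workload, which is what the calibration sample is for.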

Which providers does it work with?

Moji Router is provider-agnostic. It runs across the frontier providers you already use, routing each session to where it runs cheapest.

Do I have to switch providers or rewrite my app?

No. The router runs in front of the providers you already use, over your existing endpoints. After a small calibration sample passes through, it learns where each model is strong for you and routes the traffic behind your app.

The router sits in the request path. What about latency?

It is a thin layer in the path, so it adds little. We measure the router's own overhead and report it, so you can see what it costs in time as well as what it saves in spend before you commit.

What happens if a provider is down?

If a provider returns an error or times out, the router can retry the turn on another provider in your pool, so one provider's trouble does not have to end the session. You set which providers the router may use and the order it prefers them in.
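The retry behaviour described above can be sketched roughly as follows. This is a simplified illustration under assumptions, not the router's actual code: `ProviderError`, `run_turn_with_failover`, and the provider callables are hypothetical names, and each provider is modelled as a plain function that either returns a reply or raises.

```python
from typing import Callable, List, Sequence, Tuple


class ProviderError(Exception):
    """Hypothetical error type standing in for a failed provider call."""


# A provider is a (name, call) pair; the list order is your preference order.
Provider = Tuple[str, Callable[[str], str]]


def run_turn_with_failover(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try each allowed provider in order; return (provider_name, reply)."""
    failures: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except (ProviderError, TimeoutError) as exc:
            # Record the failure and fall through to the next provider in the pool.
            failures.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(failures))


if __name__ == "__main__":
    def primary(prompt: str) -> str:
        raise TimeoutError("timed out")  # simulate an outage

    def secondary(prompt: str) -> str:
        return f"reply to: {prompt}"

    print(run_turn_with_failover("hello", [("primary", primary), ("secondary", secondary)]))
```

A real failover also has to weigh the cache: the fallback provider starts cold, so the first retried turn pays full price for the context.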

How is my data handled?

We route your traffic and hold session state in order to route it. We do not train on your content, and we do not sell it. The calibration sample you send is used to tune the router and is handled under agreement. The Privacy page has the detail.

How do you handle security and compliance?

Routed traffic is encrypted in transit. For the traffic we route you are the data controller and we act as your processor under a data processing agreement; for your account and contact data we are the controller. We do not train on your content or sell it, and we can scope data residency and a signed agreement during onboarding. The Privacy page sets out the detail.

How does pricing work?

We are still finalising pricing. We start by measuring your saving on a sample of your traffic and showing you the figure, and we scope pricing with you from there, tied to the value the routing delivers.

How does the trial work?

Send us a sample of your traffic. We tune the router to it, run your own sessions through the same routing and caching we would use in production, and show you the saving against what you pay today before you decide.

How do I get started?

Email [email protected] with a line about your workload. We will scope a traffic sample and come back with a measured saving.