Choosing Model Routes for Agentic Work

A practical guide to assigning premium, lighter, cheaper, and local models deliberately.

Start with the task

Do not begin with a model leaderboard. Begin with the work. Identify the expected output, risk, privacy boundary, latency requirement, context size, tool needs, and cost tolerance. The route should follow the task.
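
One way to make "begin with the work" concrete is to write the task down as a small record before any model is named. The sketch below is illustrative only: the `TaskProfile` and `Risk` names, the field names, and the units are assumptions, not part of any particular tool.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class TaskProfile:
    """Hypothetical task description, filled in before any model is considered."""

    expected_output: str   # e.g. "label", "summary", "migration plan"
    risk: Risk             # cost of a wrong or shallow answer
    private_data: bool     # True if inputs must stay inside the privacy boundary
    max_latency_s: float   # acceptable wall-clock time for one run
    context_tokens: int    # estimated input size
    needs_tools: bool      # whether the task must call tools
    max_cost_usd: float    # cost tolerance for one run

    def __post_init__(self) -> None:
        # Reject profiles that were filled in carelessly.
        if not self.expected_output.strip():
            raise ValueError("expected_output must describe the deliverable")
        if self.max_latency_s <= 0:
            raise ValueError("max_latency_s must be positive")
        if self.context_tokens < 0 or self.max_cost_usd < 0:
            raise ValueError("context_tokens and max_cost_usd must be non-negative")


# Example profile; every value here is a placeholder.
profile = TaskProfile(
    expected_output="classification label",
    risk=Risk.LOW,
    private_data=False,
    max_latency_s=5.0,
    context_tokens=2_000,
    needs_tools=False,
    max_cost_usd=0.01,
)
```

A profile like this gives later routing decisions something explicit to read, instead of relying on habit or on whichever model happens to be configured.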

Use premium reasoning where it earns its place

Premium models are appropriate for ambiguous planning, difficult debugging, security-sensitive review, final synthesis, and work where a shallow answer creates downstream cost. Keep this route available. Do not spend it automatically.
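
"Available, but not automatic" can be expressed as an explicit gate. This is a minimal sketch, assuming tasks carry a short kind label; the labels and the `pick_route` function are hypothetical and would need to match however your own system names its work.

```python
# Task kinds the guide lists as premium-worthy; the identifiers are illustrative.
PREMIUM_KINDS = frozenset({
    "ambiguous_planning",
    "difficult_debugging",
    "security_review",
    "final_synthesis",
})


def pick_route(kind: str, shallow_answer_is_costly: bool = False) -> str:
    """Return "premium" only when the task earns it, otherwise "lighter"."""
    if kind in PREMIUM_KINDS or shallow_answer_is_costly:
        return "premium"
    # Default to the cheaper route so premium is a decision, not a habit.
    return "lighter"
```

The design choice is the default: anything not explicitly justified falls through to the lighter route, and the caller has to state why a task deserves more.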

Use lighter or cheaper routes for suitable work

Smaller cloud models can be useful for extraction, summaries, classification, evidence compression, rough drafts, and repetitive helper operations. Review the output boundary: a draft route should not silently become the final decision-maker.
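
The output boundary can be enforced in code rather than by convention. The sketch below assumes each output is tagged with the route that produced it; `RouteOutput`, `finalize`, and the route labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RouteOutput:
    text: str
    route: str              # hypothetical labels: "draft" or "premium"
    reviewed: bool = False  # set by a human or a stronger route after review


class UnreviewedDraftError(RuntimeError):
    """Raised when draft output is about to be treated as a final decision."""


def finalize(output: RouteOutput) -> str:
    """Release text as final only if it has crossed the review boundary."""
    if output.route == "draft" and not output.reviewed:
        raise UnreviewedDraftError("draft output needs review before it becomes final")
    return output.text
```

With a check like this, a draft can still flow into later steps, but promoting it to a final answer requires someone or something to mark it as reviewed.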

Use local routes deliberately

Local models can support private preprocessing, low-risk summaries, repository evidence packaging, and offline work. Local does not automatically mean free or better. Account for hardware, latency, model size, context limits, maintenance, and output quality.
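
A rough per-task estimate helps test the assumption that local is free. The function below is a back-of-the-envelope sketch: the parameter names are assumptions, the model is deliberately simple (amortized hardware plus energy plus maintenance), and any numbers you pass in should come from your own measurements.

```python
def local_cost_per_task(
    hardware_cost: float,
    hardware_lifetime_tasks: int,
    energy_cost_per_hour: float,
    seconds_per_task: float,
    maintenance_cost_per_task: float = 0.0,
) -> float:
    """Estimate the cost of one task on a local model, in the caller's currency."""
    if hardware_lifetime_tasks <= 0:
        raise ValueError("hardware_lifetime_tasks must be positive")
    if seconds_per_task < 0:
        raise ValueError("seconds_per_task must be non-negative")
    amortized_hardware = hardware_cost / hardware_lifetime_tasks
    energy = energy_cost_per_hour * seconds_per_task / 3600
    return amortized_hardware + energy + maintenance_cost_per_task


# Placeholder inputs, not measurements.
estimate = local_cost_per_task(
    hardware_cost=2000.0,
    hardware_lifetime_tasks=100_000,
    energy_cost_per_hour=0.10,
    seconds_per_task=36.0,
)
```

With these placeholder inputs the estimate comes to roughly 0.021 per task, most of it hardware amortization. The figure says nothing about output quality or context limits, which still need their own check.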

Use deterministic tools before asking a model

If a command, parser, search index, database query, or policy rule can answer the question mechanically, use it first. Send the resulting evidence to a model only when interpretation is useful.
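
As a small illustration of "mechanical first", the sketch below tries a regular expression before falling back to a model. The version-string question, the pattern, and the `ask_model` callback are all assumptions made for the example; `ask_model` stands in for whatever client you actually use.

```python
import re
from typing import Callable, Optional, Tuple

# Matches lines such as "version: 1.4.2" or "Version = 2.0.10".
VERSION_RE = re.compile(r"\bversion\s*[:=]\s*(\d+\.\d+\.\d+)\b", re.IGNORECASE)


def find_version(text: str) -> Optional[str]:
    """Deterministic step: return the version if the text states it plainly."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else None


def answer_version(text: str, ask_model: Callable[[str], str]) -> Tuple[str, str]:
    """Return (answer, source), where source is "parser" or "model"."""
    found = find_version(text)
    if found is not None:
        return found, "parser"
    # Only reach for a model when the mechanical check cannot answer.
    prompt = f"Which version does this text describe?\n\n{text}"
    return ask_model(prompt), "model"
```

Returning the source alongside the answer makes it easy to count how often the model was actually needed.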

Review the route with evidence

Inspect context size and token counts.
Check whether repeated context can be reused or cached.
Watch for loops, retries, and verbose intermediate output.
Keep high-risk actions behind approval boundaries.
Move a task only when the new route preserves the result you need.
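
The checks above can be turned into a simple comparison between the current route and a candidate. This is a sketch under stated assumptions: `RunRecord` and `should_move` are hypothetical names, results are compared by exact equality (real tasks often need a looser, task-specific comparison), and the retry limit is an arbitrary default.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class RunRecord:
    task_id: str
    result: str
    input_tokens: int
    output_tokens: int
    retries: int


def total_tokens(records: Sequence[RunRecord]) -> int:
    """Sum input and output tokens so routes can be inspected side by side."""
    return sum(r.input_tokens + r.output_tokens for r in records)


def should_move(
    baseline: Sequence[RunRecord],
    candidate: Sequence[RunRecord],
    max_retries: int = 1,
) -> bool:
    """Approve the candidate route only if it preserves every baseline result."""
    base: Dict[str, RunRecord] = {r.task_id: r for r in baseline}
    cand: Dict[str, RunRecord] = {r.task_id: r for r in candidate}
    if not base or base.keys() != cand.keys():
        return False  # no evidence, or the runs cover different tasks
    for task_id, expected in base.items():
        observed = cand[task_id]
        if observed.result != expected.result:
            return False  # the new route changed the result
        if observed.retries > max_retries:
            return False  # the result was preserved only by looping
    return True
```

Token totals are reported separately instead of being folded into the decision, because token counts from different models are not directly comparable as costs.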

Read "Token Costs Are Becoming an Operating Issue" for the economic argument, or explore Kaptain for the operating layer.