How Cost Estimation Works

copilot-session-usage estimates session costs from VS Code debug logs. This page explains each step.


Where the logs live

VS Code stores one directory per workspace under workspaceStorage. Inside each workspace directory, the Copilot extension writes debug logs:

workspaceStorage/
└── <workspace-hash>/
    └── GitHub.copilot-chat/
        └── debug-logs/
            └── <session-uuid>/
                ├── *.jsonl      ← token events
                └── ...

Default locations by platform:

Platform        Path
macOS           ~/Library/Application Support/Code/User/workspaceStorage/
Linux           ~/.config/Code/User/workspaceStorage/
Windows         %APPDATA%\Code\User\workspaceStorage\
WSL2 (remote)   ~/.vscode-server/data/User/workspaceStorage/

The tool also checks Code - Insiders variants automatically.
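
The path lookup described above can be sketched roughly as follows. This is an illustration only, not the tool's actual code: the function name candidate_storage_roots and the platform keys ("macos", "linux", "windows", "wsl-remote") are hypothetical, and the sketch assumes the Insiders variant differs only in the "Code - Insiders" folder name. The remote (WSL2) Insiders location is not covered.

```python
from pathlib import PurePosixPath, PureWindowsPath


def candidate_storage_roots(platform, home=None, appdata=None):
    """Return candidate workspaceStorage paths for a platform key.

    The platform keys are hypothetical labels used only in this sketch.
    """
    if platform == "macos":
        base = PurePosixPath(home) / "Library" / "Application Support"
    elif platform == "linux":
        base = PurePosixPath(home) / ".config"
    elif platform == "windows":
        base = PureWindowsPath(appdata)
    elif platform == "wsl-remote":
        # The remote server layout has no product-name folder.
        return [PurePosixPath(home) / ".vscode-server" / "data" / "User" / "workspaceStorage"]
    else:
        raise ValueError(f"unknown platform key: {platform!r}")

    # Stable build first, then the Insiders variant.
    return [base / product / "User" / "workspaceStorage"
            for product in ("Code", "Code - Insiders")]
```

A caller would typically keep only the candidates that exist on disk.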


How sessions are discovered

The tool reads the session title from VS Code’s SQLite workspace database (state.vscdb). Sessions are sorted by the last-modified time of their debug-log directory, most recent first.

The list command reads only metadata (no JSONL parsing). The analyze, id, and batch commands parse the full JSONL files.
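
A minimal sketch of the metadata-only listing, assuming the directory layout shown earlier. The function name list_session_dirs is hypothetical, and reading the title from state.vscdb is left out because the key it is stored under is not documented here.

```python
from pathlib import Path


def list_session_dirs(storage_root):
    """Return debug-log session directories, most recently modified first.

    Only directory metadata is read; no JSONL file is opened.
    """
    root = Path(storage_root)
    # <workspace-hash>/GitHub.copilot-chat/debug-logs/<session-uuid>
    pattern = "*/GitHub.copilot-chat/debug-logs/*"
    session_dirs = [p for p in root.glob(pattern) if p.is_dir()]
    return sorted(session_dirs, key=lambda p: p.stat().st_mtime, reverse=True)
```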


Parsing JSONL logs

Each .jsonl file contains one JSON object per line. The tool extracts token-count events emitted by the Copilot extension for each LLM call:

  • input_tokens — tokens sent to the model

  • output_tokens — tokens generated by the model

  • cached_tokens — input tokens served from the provider’s prompt cache

  • model — the model name as reported by the provider

A single session may call multiple models (e.g., Claude Sonnet for the main request and Claude Haiku for a subagent). Each model’s tokens are summed separately, then costs are computed per model and aggregated.
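
The per-model summation can be sketched as below. This assumes each token event is a flat JSON object carrying the four fields listed above; the real log lines may nest them differently, and the function name sum_tokens_by_model is hypothetical.

```python
import json
from collections import defaultdict

TOKEN_FIELDS = ("input_tokens", "output_tokens", "cached_tokens")


def sum_tokens_by_model(lines):
    """Sum token fields per model over an iterable of JSONL lines."""
    totals = defaultdict(lambda: dict.fromkeys(TOKEN_FIELDS, 0))
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than abort the session
        if not isinstance(event, dict) or "model" not in event:
            continue
        if not any(field in event for field in TOKEN_FIELDS):
            continue  # not a token-count event
        for field in TOKEN_FIELDS:
            totals[event["model"]][field] += int(event.get(field) or 0)
    return dict(totals)
```

Costs would then be computed once per model from these totals and added together.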


Cost calculation

The VS Code debug log reports inputTokens as the total prompt sent (cached + non-cached combined). cachedTokens is the subset served from the provider’s cache. The non-cached portion is therefore inputTokens - cachedTokens.

Each llm_request event also carries copilotUsageNanoAiu, VS Code’s own cost for that call in nano-AI Credits (nanoAIU). It is present for all Copilot-plan models (Claude, GPT, and others). It is absent for Azure-hosted models that are not billed via Copilot AIC. Kimi-K2.6-azure is one such model: it is billed through Azure separately and correctly reports $0 AIC.

When copilotUsageNanoAiu is present and non-zero, the tool uses it directly:

cost_usd = copilotUsageNanoAiu / 100_000_000_000   (1 nanoAIU = 1e-11 USD)

For models that do not report copilotUsageNanoAiu — Azure-hosted models billed outside the Copilot plan — the tool falls back to token-based computation (or $0 when the model has no Copilot pricing entry):

cost_usd = (
    (inputTokens - cachedTokens) × rate.input          # fresh tokens
  +  cachedTokens                × rate.cached_input   # cache-read tokens
  +  outputTokens                × rate.output
) / 1_000_000

plus the Anthropic cache-write approximation when applicable.
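
The two paths above can be combined into one per-call function, sketched here under a few assumptions: the event is a dict with the field names used in this section, rate is a dict of USD per million tokens with the keys input, cached_input, and output, and the Anthropic cache-write approximation is deliberately omitted. The function name call_cost_usd is hypothetical.

```python
NANO_AIU_PER_USD = 100_000_000_000  # 1 nanoAIU = 1e-11 USD


def call_cost_usd(event, rate=None):
    """Estimate the USD cost of one llm_request event."""
    nano_aiu = event.get("copilotUsageNanoAiu")
    if nano_aiu:
        # Present and non-zero: use VS Code's own figure directly.
        return nano_aiu / NANO_AIU_PER_USD

    if rate is None:
        # No pricing entry for this model.
        return 0.0

    cached = event.get("cachedTokens", 0)
    fresh = event.get("inputTokens", 0) - cached  # non-cached input
    output = event.get("outputTokens", 0)
    return (
        fresh * rate["input"]
        + cached * rate["cached_input"]
        + output * rate["output"]
    ) / 1_000_000
```

For example, with illustrative rates of 3.00 (input), 0.30 (cached input), and 15.00 (output) per million tokens, a call with 1,000 input tokens (400 of them cached) and 200 output tokens comes to (600 × 3.00 + 400 × 0.30 + 200 × 15.00) / 1,000,000 = 0.00492 USD.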

Verified on a real session: copilotUsageNanoAiu-based cost matches the VS Code AIC panel at 0.000% error for Claude models.


Subagent attribution

runSubagent calls appear in the JSONL as a distinct event type. The tool tracks them separately so --detail full can show which fraction of tokens was consumed by subagents vs. the main conversation.
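
As a rough illustration of the split that --detail full reports, the share of tokens attributed to subagents can be computed once each call has been classified. How a call is recognised as a subagent call depends on the log's event type, which is not reproduced here; the (is_subagent, tokens) pairs and the function name subagent_token_share are hypothetical.

```python
def subagent_token_share(calls):
    """Return the fraction of tokens consumed by subagent calls.

    calls: iterable of (is_subagent, tokens) pairs, one per LLM call.
    """
    total = 0
    subagent = 0
    for is_subagent, tokens in calls:
        total += tokens
        if is_subagent:
            subagent += tokens
    # An empty session has no meaningful share; report 0.0.
    return subagent / total if total else 0.0
```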


Accuracy

For Copilot-plan models (Claude, GPT, etc.), the tool reads copilotUsageNanoAiu — VS Code’s own per-call cost field — directly from the JSONL, so the session total matches the VS Code AIC panel at 0% error.

For Azure-hosted models billed outside the Copilot plan (Kimi-K2.6-azure is the current example), copilotUsageNanoAiu is absent. The tool falls back to token-based pricing using rates in custom-models-pricing.yml. These models are billed through Azure separately; setting their prices to $0.00 in custom-models-pricing.yml correctly reflects that they do not consume AIC.

Why the LLM-call count differs by 1 from the panel

The VS Code Agent Debug panel excludes the title-*.jsonl file (background title-generation calls). copilot-session-usage counts them because they consume real tokens. A session with one title-generation call will show one more LLM call and the corresponding (small) token counts compared to the panel.

Why token counts can differ between tool and panel

Token counts in the tool include title-*.jsonl; the VS Code panel does not. For session 438d24a8, the delta is exactly one Kimi call: +441 input, +1,245 output.
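
To reproduce the panel's numbers for comparison, the title-generation logs can be separated from the rest before counting. A minimal sketch, assuming only the title-*.jsonl naming described above; the function name split_title_logs and the other file names used in examples are hypothetical.

```python
from fnmatch import fnmatchcase


def split_title_logs(filenames):
    """Split JSONL log names into (conversation_logs, title_logs)."""
    title_logs = [n for n in filenames if fnmatchcase(n, "title-*.jsonl")]
    conversation_logs = [
        n for n in filenames
        if n.endswith(".jsonl") and n not in title_logs
    ]
    return conversation_logs, title_logs
```

Counting calls and tokens over conversation_logs alone should line up with the panel, while counting over both lists gives the tool's totals.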

Pricing table updates

The bundled pricing table (data/models-and-pricing.yml) is updated with each release. Run just refresh-pricing to pull the latest rates. Stale rates only affect the token-based fallback path; copilotUsageNanoAiu-based costs are unaffected by the table.