
Design: Wallet & Billing System

Token Model

BacMR uses a virtual currency ("Tokens") to manage AI costs.

  • Deduction Rule:
      • Base Cost: 5 tokens (covers retrieval and processing).
      • Output Cost: 1 token per 200 characters generated.
      • Clarification: Flat 2 tokens (cheaper).
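The deduction rule can be sketched as a small function. This is a minimal illustration, not the project's actual code: the name `compute_cost` and the constants are hypothetical, and rounding partial 200-character blocks up is an assumption, since the rule above does not say how remainders are handled.

```python
import math

BASE_COST = 5                  # tokens: covers retrieval and processing
CHARS_PER_OUTPUT_TOKEN = 200   # 1 token per 200 characters generated
CLARIFICATION_COST = 2         # flat cost for a clarification


def compute_cost(output_chars: int, is_clarification: bool = False) -> int:
    """Return the token cost of one response (hypothetical helper)."""
    if is_clarification:
        return CLARIFICATION_COST
    # Assumption: a partial block of 200 characters is billed as a full token.
    return BASE_COST + math.ceil(output_chars / CHARS_PER_OUTPUT_TOKEN)
```

Under these assumptions, a 600-character answer costs 5 + 3 = 8 tokens, while a clarification costs 2 tokens regardless of length.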

Reserve / Finalize Pattern

The billing system uses atomic reservations to prevent over-billing:

sequenceDiagram
    participant User
    participant Agent as Teacher Agent
    participant WS as WalletService
    participant DB as Postgres

    User->>Agent: POST /chat
    Agent->>WS: reserve(user_id, estimated=15)
    WS->>DB: Deduct 15 from wallet
    WS->>DB: INSERT reservation (status=reserved, TTL=5min)
    WS-->>Agent: reservation_id, balance=85

    Note over Agent: LLM retrieval + generation

    Agent->>WS: finalize(reservation_id, actual=8)
    WS->>DB: UPDATE reservation (status=finalized, actual=8)
    WS->>DB: Refund delta (15-8=7) to wallet
    WS->>DB: INSERT ledger entry (delta=-8)
    WS-->>Agent: balance=92
  1. Reserve: Before the LLM call, deduct the estimated cost from the wallet and create a reservations record (status: reserved, TTL: 5 minutes).
  2. LLM Call: Execute the retrieval + generation pipeline.
  3. Finalize: After the LLM response, update the reservation with the actual cost, capped at 2x the estimate. If actual < estimated, refund the delta to the wallet.
  4. Expire: A background job (scripts/expire_reservations.py) returns tokens for stale reservations.
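The four steps above can be sketched with an in-memory stand-in for the wallet. This is an illustration under stated assumptions, not the `WalletReservationService` implementation: the class `InMemoryWallet`, its method signatures, and the `InsufficientBalance` error are hypothetical, and charging the extra amount when actual exceeds the estimate (up to the 2x cap) is an assumption. A real implementation would run each step inside a database transaction.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Optional

RESERVATION_TTL_SECONDS = 5 * 60   # TTL: 5 minutes
MAX_ACTUAL_MULTIPLIER = 2          # actual cost is capped at 2x the estimate


class InsufficientBalance(Exception):
    """Hypothetical error raised when the wallet cannot cover the estimate."""


@dataclass
class Reservation:
    user_id: str
    estimated: int
    expires_at: float
    status: str = "reserved"
    actual: Optional[int] = None


@dataclass
class InMemoryWallet:
    balances: dict = field(default_factory=dict)
    reservations: dict = field(default_factory=dict)
    ledger: list = field(default_factory=list)

    def reserve(self, user_id, estimated, now=None):
        """Step 1: deduct the estimate and record a reservation."""
        now = time.time() if now is None else now
        if self.balances.get(user_id, 0) < estimated:
            raise InsufficientBalance(user_id)
        self.balances[user_id] -= estimated
        reservation_id = str(uuid.uuid4())
        self.reservations[reservation_id] = Reservation(
            user_id, estimated, now + RESERVATION_TTL_SECONDS
        )
        return reservation_id, self.balances[user_id]

    def finalize(self, reservation_id, actual):
        """Step 3: settle the reservation and log the actual cost."""
        res = self.reservations[reservation_id]
        if res.status != "reserved":
            raise ValueError(f"reservation is already {res.status}")
        actual = min(actual, MAX_ACTUAL_MULTIPLIER * res.estimated)
        # Positive difference is a refund; a negative one is an extra
        # charge (assumption: overruns are billed up to the cap).
        self.balances[res.user_id] += res.estimated - actual
        res.status, res.actual = "finalized", actual
        self.ledger.append((res.user_id, -actual, "agent_chat"))
        return self.balances[res.user_id]

    def expire_stale(self, now=None):
        """Step 4: return tokens held by reservations past their TTL."""
        now = time.time() if now is None else now
        expired = 0
        for res in self.reservations.values():
            if res.status == "reserved" and res.expires_at <= now:
                self.balances[res.user_id] += res.estimated
                res.status = "expired"
                expired += 1
        return expired
```

With a starting balance of 100, reserving 15 leaves 85, and finalizing at an actual cost of 8 refunds 7 for a balance of 92, matching the sequence diagram.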

Implementation

  • Service: WalletReservationService in app/services/wallet_reservation.py
  • Singleton: wallet_service in app/core/dependencies.py
  • Agent integration: check_wallet node calls wallet_service.reserve(), finalize node calls wallet_service.finalize()

Ledger System

Every token transaction is logged in the wallet_ledger table.

  • Delta: Positive (top-up) or negative (usage).
  • Reason: agent_chat, topup, reservation_expired, etc.
  • Request ID: Links the deduction to a specific usage_log entry.
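As a rough illustration of the ledger shape, the sketch below models an entry with the three fields described above and sums the deltas for one user. The class `LedgerEntry`, the `user_id` field, and the helper `net_tokens` are hypothetical names, and the real table may carry additional columns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LedgerEntry:
    user_id: str                      # assumed column; not named in this document
    delta: int                        # positive = top-up, negative = usage
    reason: str                       # e.g. "agent_chat", "topup", "reservation_expired"
    request_id: Optional[str] = None  # links a deduction to a usage_log entry


def net_tokens(entries, user_id: str) -> int:
    """Sum all ledger deltas recorded for one user."""
    return sum(e.delta for e in entries if e.user_id == user_id)
```

Because every entry carries a signed delta, the net movement for a user is a plain sum, which makes the ledger easy to audit against the wallet balance.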

Financial Transactions

A separate transactions table tracks real-money flows (top-ups, refunds, API costs, server expenses).

  • Direction: credit (incoming) or debit (outgoing).
  • Types: topup, refund, promotion, api_cost, server_cost, infrastructure.
  • Currency: Supports MRU (Mauritanian Ouguiya), USD, EUR.
  • Payment methods: cash, bank_transfer, mobile_money, bankily, masrivi, seddad.
  • Migration: db/migrations/20260217000020_transactions.sql

Top-Up Flow

graph LR
    A[Admin calls<br>POST /wallet/topup] --> B[Increment wallet<br>balance]
    B --> C[Insert wallet_ledger<br>delta = +tokens]
    C --> D[Insert transaction<br>direction=credit<br>type=topup]
    D --> E[Return new balance]
  1. Admin calls POST /wallet/topup with student UUID, token amount, and payment details.
  2. Wallet balance is incremented.
  3. A wallet_ledger entry is created (delta = +token_amount).
  4. A transactions record is created (direction = credit, type = topup).
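The top-up steps can be sketched as one function over an in-memory store. This is an assumption-laden illustration rather than the endpoint's real handler: `Store`, `topup`, and the record field names are hypothetical, and only the currency and payment-method values come from this document. In a real database the three writes would presumably share one transaction so they succeed or fail together.

```python
from dataclasses import dataclass, field

ALLOWED_CURRENCIES = {"MRU", "USD", "EUR"}
ALLOWED_PAYMENT_METHODS = {
    "cash", "bank_transfer", "mobile_money", "bankily", "masrivi", "seddad",
}


@dataclass
class Store:
    """In-memory stand-in for the wallet, wallet_ledger, and transactions tables."""
    balances: dict = field(default_factory=dict)
    wallet_ledger: list = field(default_factory=list)
    transactions: list = field(default_factory=list)


def topup(store, student_id, tokens, amount, currency, payment_method) -> int:
    """Credit tokens to a student's wallet and return the new balance."""
    if tokens <= 0:
        raise ValueError("token amount must be positive")
    if currency not in ALLOWED_CURRENCIES:
        raise ValueError(f"unsupported currency: {currency}")
    if payment_method not in ALLOWED_PAYMENT_METHODS:
        raise ValueError(f"unsupported payment method: {payment_method}")

    # Step 2: increment the wallet balance.
    store.balances[student_id] = store.balances.get(student_id, 0) + tokens
    # Step 3: record the token movement.
    store.wallet_ledger.append(
        {"user_id": student_id, "delta": tokens, "reason": "topup"}
    )
    # Step 4: record the real-money flow.
    store.transactions.append({
        "user_id": student_id,
        "direction": "credit",
        "type": "topup",
        "amount": amount,
        "currency": currency,
        "payment_method": payment_method,
    })
    return store.balances[student_id]
```

Validating the inputs before the first write keeps a rejected request from leaving a partial top-up behind.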

Tiers & Caps

Different subscription tiers control the "quality" of the AI:

| Tier | Top-K (Search) | Rerank-N | Max Tokens/Request | Daily Limit |
| :--- | :--- | :--- | :--- | :--- |
| Free | 10 | 3 | 500 | 50 |
| Standard | 20 | 5 | 2,000 | 500 |
| Premium | 30 | 8 | 4,000 | Unlimited |

Tier configuration is managed by TierConfig in app/services/tier_config.py.
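One way to picture the tier table in code is a lookup keyed by tier name. This is a sketch only: the `Tier` dataclass, the `TIERS` mapping, and `within_daily_limit` are hypothetical and do not describe the actual `TierConfig` interface. Representing "Unlimited" as `None` is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    top_k: int                    # Top-K (Search)
    rerank_n: int                 # Rerank-N
    max_tokens_per_request: int   # Max Tokens/Request
    daily_limit: Optional[int]    # None stands for "Unlimited"


TIERS = {
    "free": Tier(top_k=10, rerank_n=3, max_tokens_per_request=500, daily_limit=50),
    "standard": Tier(top_k=20, rerank_n=5, max_tokens_per_request=2000, daily_limit=500),
    "premium": Tier(top_k=30, rerank_n=8, max_tokens_per_request=4000, daily_limit=None),
}


def within_daily_limit(tier_name: str, used_today: int) -> bool:
    """Return True if the user may still consume against the daily limit."""
    limit = TIERS[tier_name].daily_limit
    return limit is None or used_today < limit
```

Keeping the values in one immutable mapping means retrieval depth and caps for a request can be read from a single place.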
