🧪 Postman Testing Workflows

Quick reference for testing all Phase A-F features with Postman.

🎯 Workflow 1: Student Experience (5 min)

Goal: Test the full student chat flow, including wallet deduction

1. Auth → Signup (Student)         → Creates account, JWT auto-saved
2. Wallet → Get Balance            → Check balance (50 tokens for new users)
3. Chat → Ask Question (French)    → Get answer, tokens deducted
4. Wallet → Get Balance            → Verify deduction (balance = 50 - tokens_used)
5. Chat → Ask Question (Arabic)    → Test translation (Arabic → French corpus)
6. Wallet → Get Balance            → Verify second deduction

Expected Results:
- ✅ Balance starts at 50 tokens
- ✅ Each chat deducts ~10-100 tokens
- ✅ Arabic questions get French corpus results
- ✅ All responses include request_id
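The deduction check in steps 4 and 6 comes down to one subtraction. The sketch below shows it as a small helper. The function name and the sample numbers are illustrative and not part of the collection.

```python
def deduction_matches(balance_before: int, balance_after: int, tokens_used: int) -> bool:
    """Return True when the wallet dropped by exactly tokens_used."""
    return balance_before - balance_after == tokens_used


# Hypothetical values: a new account (50 tokens) asks one question that costs 12 tokens.
assert deduction_matches(balance_before=50, balance_after=38, tokens_used=12)
```

The same comparison can be run after each chat request by feeding in the two Get Balance results and the tokens_used value reported by the chat response.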

🎯 Workflow 2: Admin Ingestion (10 min)

Goal: Upload PDF, trigger ingestion, verify searchable

1. Auth → Signin                                → Login as admin
2. Upload → Get Presigned Upload URL            → Get S3 URL, file_id saved
   [Manual: Upload PDF to presigned URL via curl]
3. Ingestion → Create Ingestion Job             → Create job, job_id saved
4. Ingestion → Get Job Status                   → Poll until status = 'ready'
5. Chat → Ask Question                          → Verify new content searchable

Expected Results:
- ✅ Presigned URL expires in 300 seconds
- ✅ Job transitions: queued → parsing → ... → ready
- ✅ chunks_created > 0, vectors_upserted > 0
- ✅ New content appears in search results

External Step (Upload PDF):

# After getting presigned URL from step 2:
curl -X PUT "{{upload_url}}" \
  -H "Content-Type: application/pdf" \
  --upload-file path/to/your-file.pdf
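Step 4 (poll until status = 'ready') can be scripted instead of clicking Send repeatedly. The following is a minimal sketch, assuming a caller-supplied get_status function that returns the job's current status string. The "failed" terminal state, the interval, and the timeout are assumptions, not values taken from the API.

```python
import time
from typing import Callable


def poll_until_ready(
    get_status: Callable[[], str],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call get_status() until it returns 'ready' or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "ready":
            return status
        if status == "failed":  # assumed failure state, adjust to the real API
            raise RuntimeError("ingestion job failed")
        if clock() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

In practice get_status would wrap the Get Job Status request. Injecting sleep and clock keeps the helper testable without real waiting.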

🎯 Workflow 3: Scraper Deduplication (5 min)

Goal: Test SimHash deduplication and quality checks

1. Auth → Signin (Admin)                        → Login
2. Scraping → Sync Koutoubi Scraper             → First sync, scrape_run_id saved
   Check Console → View stats (found, new, duplicates)
3. Scraping → List References (Koutoubi)        → View discovered references
4. Scraping → Sync Koutoubi Scraper (again)     → Second sync
   Check Console → duplicates should be > 0 (SimHash working)
5. Scraping → List References                   → View canonical vs duplicates

Expected Results:
- ✅ First sync: new > 0, duplicates = 0
- ✅ Second sync: new = 0, duplicates > 0
- ✅ Duplicate references have canonical_id pointing to first discovery
- ✅ Quality-failed documents not inserted (quality_failed > 0 if any)
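For readers unfamiliar with SimHash, here is a minimal sketch of the general technique. It is not the scraper's actual implementation: whitespace tokenization, SHA-256 as the token hash, the 64-bit width, and the distance threshold of 3 are all assumptions chosen for illustration.

```python
import hashlib


def simhash(text: str, bits: int = 64) -> int:
    """Compute a SimHash fingerprint from whitespace-separated tokens."""
    weights = [0] * bits
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest[:8], "big")  # first 64 bits of the digest
        for i in range(bits):
            weights[i] += 1 if (token_hash >> i) & 1 else -1
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(text_a: str, text_b: str, max_distance: int = 3) -> bool:
    """Treat two texts as duplicates when their fingerprints are close."""
    return hamming_distance(simhash(text_a), simhash(text_b)) <= max_distance
```

Identical text always yields the same fingerprint (distance 0), which is why a second sync of unchanged pages should report every reference as a duplicate.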

🎯 Workflow 4: Quiz Generation (3 min)

Goal: Generate quiz and verify RAG context

1. Auth → Signin                                → Login
2. Wallet → Get Balance                         → Check initial balance
3. Quizzes → Generate Quiz (French)             → Get 5 questions
   Check Console → quiz_id, tokens_used
4. Wallet → Get Balance                         → Verify deduction
5. Quizzes → Generate Quiz (Arabic)             → Test Arabic quiz

Expected Results:
- ✅ 5 questions with options, correct answer, explanation
- ✅ Each question has source_page reference
- ✅ Tokens deducted from wallet
- ✅ Arabic quiz generated correctly
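The structural checks above can be automated against the quiz response body. This is a sketch only: apart from source_page, the key names (questions, question, options, correct_answer, explanation) are guesses at the response shape and should be adjusted to the real payload.

```python
# Assumed per-question keys; only source_page is named in this guide.
REQUIRED_KEYS = {"question", "options", "correct_answer", "explanation", "source_page"}


def quiz_problems(quiz: dict, expected_count: int = 5) -> list:
    """Return a list of human-readable problems; an empty list means the quiz looks valid."""
    problems = []
    questions = quiz.get("questions", [])
    if len(questions) != expected_count:
        problems.append(f"expected {expected_count} questions, got {len(questions)}")
    for number, item in enumerate(questions, start=1):
        missing = REQUIRED_KEYS - set(item)
        if missing:
            problems.append(f"question {number} is missing {sorted(missing)}")
    return problems
```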

🎯 Workflow 5: Rate Limiting (2 min)

Goal: Trigger rate limit and verify 429 response

1. Auth → Signin                                → Get JWT
2. Testing → Rate Limit Test (Send 11x)         → Single request
3. Use Postman Runner:
   - Select "Rate Limit Test" request
   - Click "Run" → Iterations: 11
   - Run
4. Check Results                                → 11th should be 429

Expected Results:
- ✅ Requests 1-10: 200 OK
- ✅ Request 11: 429 Too Many Requests
- ✅ 429 response includes:
  - error: "rate_limited"
  - retry_after: 23 (seconds until window resets)
  - limit: 10
  - window: "1m"
  - request_id: "uuid"
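The behavior being tested resembles a fixed-window rate limiter. The sketch below shows how such a limiter could produce the 429 fields listed above. It is an illustration, not the server's code: the class name and the fixed-window strategy are assumptions, and request_id is omitted for brevity.

```python
import math


class FixedWindowLimiter:
    """Allow at most `limit` requests per window of `window_s` seconds."""

    def __init__(self, limit: int = 10, window_s: int = 60):
        self.limit = limit
        self.window_s = window_s
        self.window_start = None
        self.count = 0

    def check(self, now: float):
        """Return (status_code, body) for a request arriving at time `now`."""
        if self.window_start is None or now - self.window_start >= self.window_s:
            self.window_start = now  # start a fresh window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return 200, {}
        retry_after = math.ceil(self.window_start + self.window_s - now)
        return 429, {
            "error": "rate_limited",
            "retry_after": retry_after,
            "limit": self.limit,
            "window": f"{self.window_s // 60}m",
        }
```

With the defaults, ten requests inside one minute pass and the eleventh is rejected, matching the Runner result expected in step 4.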

🎯 Workflow 6: Request-ID Propagation (2 min)

Goal: Verify request-ID correlation across subsystems

1. Auth → Signin                                → Get JWT
2. Testing → Request-ID Test (Custom ID)        → Sends X-Request-ID: test-custom-request-id-123
3. Check Response Headers                       → X-Request-ID should match
4. Check Environment                            → last_request_id = custom ID
5. Wallet → Get Balance                         → Auto-generated UUID
6. Check Environment                            → last_request_id = new UUID

Expected Results:
- ✅ Custom request-ID adopted (echoed in response)
- ✅ Auto-generated UUID if no custom ID provided
- ✅ All error responses include request_id
- ✅ Database rows (reservations, wallet_ledger) share same request_id

Verify in Database:

-- Find all records for a request.
-- Run the two queries separately: UNION ALL with SELECT * only works
-- when both tables have identical column lists.
SELECT * FROM reservations WHERE request_id = 'abc123';

SELECT * FROM wallet_ledger WHERE request_id = 'abc123';
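The adopt-or-generate rule that this workflow exercises can be sketched in a few lines. This is an assumed reconstruction of the behavior, not the service's middleware; the function name is hypothetical.

```python
import uuid
from typing import Mapping


def resolve_request_id(headers: Mapping[str, str]) -> str:
    """Adopt a client-supplied X-Request-ID, otherwise generate a UUID4."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "x-request-id" and value.strip():
            return value.strip()
    return str(uuid.uuid4())


# A custom ID is echoed back unchanged.
assert resolve_request_id({"X-Request-ID": "test-custom-request-id-123"}) == "test-custom-request-id-123"
```

Whatever value this returns is the one that should appear in the response header, in error bodies, and in the reservations and wallet_ledger rows.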

🎯 Workflow 7: Idempotency Test (3 min)

Goal: Verify deterministic chunk IDs prevent duplicates

1. Auth → Signin (Admin)                        → Login
2. Ingestion → Create Ingestion Job             → reference_id = X, job_id saved
3. Wait for job to complete                     → Poll until status = 'ready'
4. Testing → Idempotency Test                   → Same reference_id again

Expected Results:
- ✅ First ingestion: 202 Accepted, job created
- ✅ Second ingestion (before first completes): 409 Conflict
- ✅ Second ingestion (after first completes): 202, but same chunk IDs generated
- ✅ No duplicate vectors in Pinecone (verify via metrics or Pinecone dashboard)

Verify in Database:

-- Check chunk_ids for same file ingested twice
SELECT chunk_id, file_id, page_number, chunk_index, created_at
FROM chunks
WHERE file_id = 'your-file-uuid'
ORDER BY page_number, chunk_index;

-- Should see same chunk_ids even after re-ingestion
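One common way to get deterministic chunk IDs is a name-based UUID (UUIDv5) over the chunk's coordinates. The sketch below assumes the ID is derived from file_id, page_number, and chunk_index, which are the columns in the query above. The namespace string and the derivation itself are illustrative guesses, not the project's actual scheme.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
CHUNK_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/chunks")


def chunk_id(file_id: str, page_number: int, chunk_index: int) -> str:
    """Derive the same ID every time for the same file, page, and position."""
    name = f"{file_id}:{page_number}:{chunk_index}"
    return str(uuid.uuid5(CHUNK_NAMESPACE, name))
```

Because the ID depends only on its inputs, re-ingesting the same file overwrites existing vectors on upsert instead of creating duplicates.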

🎯 Workflow 8: Multilingual Chat (5 min)

Goal: Test cross-lingual retrieval and translation

1. Auth → Signin                                → Login
2. Chat → Ask Question (French)                 → Baseline (no translation)
   Check Response → query_language = "fr", translated_query = null
3. Chat → Ask Question (Arabic → French Corpus) → Test translation
   Check Response → query_language = "ar", translated_query = "<French translation>"
4. Chat → Ask Question (Hassaniya)              → Test Hassaniya detection
   Check Response → query_language = "ha", translated_query = "<French>"

Expected Results:
- ✅ French query: No translation, direct retrieval
- ✅ Arabic query: Translated to French before retrieval
- ✅ Hassaniya query: Detected and translated
- ✅ All queries return relevant French corpus results
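The per-language response checks in steps 2 to 4 follow one rule: French queries carry no translation, and every other detected language carries a non-empty one. Here is a small sketch of that rule, using the field names shown in the steps; the function name is hypothetical.

```python
from typing import Optional


def translation_fields_ok(query_language: str, translated_query: Optional[str]) -> bool:
    """Check query_language / translated_query consistency for one chat response."""
    if query_language == "fr":
        return translated_query is None  # corpus language, no translation expected
    return bool(translated_query and translated_query.strip())
```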

🎯 Workflow 9: Cache Verification (3 min)

Goal: Verify rerank cache reduces latency

1. Auth → Signin                                → Login
2. Chat → Ask Question (French)                 → First request
   Check Console → Response time (e.g., 3000 ms)
   Check Response → cache_hit = false
3. Chat → Ask Question (French) - SAME question → Second request
   Check Console → Response time (e.g., 50 ms) ← Much faster!
   Check Response → cache_hit = true
4. Wait 16 minutes                              → TTL expires (15 min)
5. Chat → Ask Question (French) - SAME question → Third request
   Check Response → cache_hit = false (cache expired)

Expected Results:
- ✅ First request: Full pipeline (2-5 seconds)
- ✅ Second request: Cache hit (<100 ms), about 98% faster
- ✅ After TTL: Cache miss (back to full pipeline)
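The hit, hit, miss pattern above is what a time-to-live cache produces. The following sketch shows the mechanism with the 15-minute TTL from step 4. The class is illustrative and says nothing about how the rerank cache is actually stored.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_s seconds."""

    def __init__(self, ttl_s: float = 15 * 60, clock: Callable[[], float] = time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._store = {}

    def set(self, key: str, value: str) -> None:
        self._store[key] = (self._clock() + self._ttl_s, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None  # never cached: cache_hit = false
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: treated as a miss
            return None
        return value
```

Passing a fake clock makes it possible to verify expiry without the 16-minute wait that the manual workflow needs.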

🎯 Workflow 10: Circuit Breaker Test (Advanced)

Goal: Simulate OpenAI failure and verify fallback

Note: Requires temporarily breaking the OpenAI connection (for example, with an invalid API key) or mocking failures.

1. [Simulate 3 OpenAI failures in 60 seconds]
2. Metrics → Get Metrics (JSON)                 → Check circuit_breaker_state
   Expected: circuit_breaker_state{service="openai_mini"} = 1 (open)
3. Chat → Ask Question                          → Should still work
   Check Response → Rerank skipped, dense retrieval order used
4. Wait 120 seconds                             → Recovery timeout
5. Metrics → Get Metrics (JSON)                 → Check circuit_breaker_state
   Expected: State = 2 (half-open) or 0 (closed after test request)

Expected Results:
- ✅ Circuit opens after 3 failures
- ✅ Requests still work (fallback to dense order)
- ✅ Circuit recovers after 120 seconds
- ✅ No complete service outage
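The state values in the metrics (0 closed, 1 open, 2 half-open) follow the usual circuit breaker pattern. The sketch below models it with the thresholds from this workflow: 3 failures within 60 seconds, 120-second recovery. It is a generic illustration with hypothetical names, not the service's implementation.

```python
CLOSED, OPEN, HALF_OPEN = 0, 1, 2


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, failure_window_s: float = 60.0,
                 recovery_timeout_s: float = 120.0):
        self.failure_threshold = failure_threshold
        self.failure_window_s = failure_window_s
        self.recovery_timeout_s = recovery_timeout_s
        self.state = CLOSED
        self._failures = []  # timestamps of recent failures
        self._opened_at = 0.0

    def current_state(self, now: float) -> int:
        """Return the state at time `now`, moving open to half-open after the timeout."""
        if self.state == OPEN and now - self._opened_at >= self.recovery_timeout_s:
            self.state = HALF_OPEN
        return self.state

    def record_failure(self, now: float) -> None:
        state = self.current_state(now)
        if state == OPEN:
            return  # calls are being skipped, nothing to count
        if state == HALF_OPEN:
            self._open(now)  # the trial request failed, reopen
            return
        self._failures = [t for t in self._failures if now - t < self.failure_window_s]
        self._failures.append(now)
        if len(self._failures) >= self.failure_threshold:
            self._open(now)

    def record_success(self, now: float) -> None:
        if self.current_state(now) == HALF_OPEN:
            self.state = CLOSED  # the trial request succeeded
            self._failures.clear()

    def _open(self, now: float) -> None:
        self.state = OPEN
        self._opened_at = now
        self._failures.clear()
```

While the breaker is open, the caller skips reranking and returns results in dense retrieval order, which is the fallback checked in step 3.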

📊 Verification Checklist

Phase A - Ingestion

[ ] Presigned URL generated with 5-min expiry
[ ] Ingestion job created (status: queued)
[ ] Job transitions through all states
[ ] Chunks created with deterministic IDs
[ ] Vectors upserted to Pinecone
[ ] Same file re-ingested → same chunk IDs (idempotent)

Phase B - Security

[ ] JWT contains app_metadata.role
[ ] Admin endpoints require admin role
[ ] Rate limiting triggers 429 after limit
[ ] Request-ID in all response headers
[ ] Request-ID in all error responses
[ ] Custom X-Request-ID adopted

Phase C - Caching

[ ] Repeat queries have cache_hit = true
[ ] Cache hit latency < 100 ms
[ ] Cache miss latency 2-5 seconds
[ ] Tier limits enforced (Free: 10, Premium: 30)

Phase D - Retrieval

[ ] Language detection works (French, Arabic, Hassaniya)
[ ] Arabic queries translated to French
[ ] Reranking improves relevance
[ ] Quiz generation produces valid questions
[ ] Circuit breaker prevents cascade failures

Phase E - Scraper

[ ] Scraper sync returns statistics
[ ] Duplicates detected via SimHash
[ ] Quality-failed documents rejected
[ ] Canonical references created

Phase F - Observability

[ ] Metrics endpoint returns valid Prometheus format
[ ] Metrics include counters, histograms, gauges
[ ] Wallet reconciliation detects discrepancies
[ ] Background jobs run without errors

🎬 Ready to Test!

1. Import collection_v2.json
2. Import environment_local.json
3. Follow Workflow 1 (Student Experience)
4. Check other workflows as needed

Happy Testing! 🚀