Quick Start - Phase A Testing

🎯 What Was Done

I've implemented Phase A of the backend architecture (S1-S5, S21):

  • ✅ Core Schema: 6 database migrations for ingestion, billing, and vector tracking
  • ✅ Token-Based Chunking: Deterministic chunk IDs prevent duplicates
  • ✅ Atomic Billing: Reservation pattern (reserve → LLM call → finalize)
  • ✅ Canonical Chunk Store: Full text in Postgres, not Pinecone
  • ✅ State Machine: Robust ingestion job lifecycle with retry & audit
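
To make the reservation pattern (reserve → LLM call → finalize) concrete, here is a minimal in-memory sketch. It is not the code in app/services/wallet_reservation.py: the `Wallet` class, its method names, and the `call_with_reservation` helper are all hypothetical, and a real implementation would presumably do each step inside a Postgres transaction rather than on a Python object.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple
from uuid import UUID, uuid4


class InsufficientFunds(Exception):
    """Raised when a hold would exceed the spendable balance."""


@dataclass
class Wallet:
    # Hypothetical in-memory stand-in for a wallet row plus reservations table.
    balance: int
    reserved: Dict[UUID, int] = field(default_factory=dict)

    def reserve(self, amount: int) -> UUID:
        """Place a hold for the estimated cost before the LLM call."""
        if amount > self.balance:
            raise InsufficientFunds(f"need {amount}, have {self.balance}")
        self.balance -= amount
        reservation_id = uuid4()
        self.reserved[reservation_id] = amount
        return reservation_id

    def finalize(self, reservation_id: UUID, actual_cost: int) -> None:
        """Charge the actual cost (capped at the hold) and refund the rest."""
        held = self.reserved.pop(reservation_id)
        charged = min(actual_cost, held)
        self.balance += held - charged

    def release(self, reservation_id: UUID) -> None:
        """Cancel a hold and return the full amount."""
        self.balance += self.reserved.pop(reservation_id)


def call_with_reservation(
    wallet: Wallet, estimate: int, llm_call: Callable[[], Tuple[str, int]]
) -> str:
    """Reserve, run the call, then finalize; release the hold on failure."""
    reservation_id = wallet.reserve(estimate)
    try:
        result, actual_cost = llm_call()
    except Exception:
        wallet.release(reservation_id)
        raise
    wallet.finalize(reservation_id, actual_cost)
    return result
```

With a balance of 100, an estimate of 30, and an actual cost of 12, the wallet ends at 88 with no outstanding holds. If the call raises, the hold is released and the balance is unchanged.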

Branch: feature/sonnet-impl-20260217-155229


📋 Testing on Your Other Laptop (3 Steps)

Step 1: Pull the Code

cd /path/to/BacMR
git fetch origin
git checkout feature/sonnet-impl-20260217-155229

# Install dependencies
source venv/bin/activate  # or create new: python3 -m venv venv
pip install -r requirements.txt

Step 2: Run Migrations

Via Supabase Dashboard (Recommended):

  1. Open https://app.supabase.com → Your Project → SQL Editor
  2. Copy and run each migration file in order:
       • db/migrations/012_ingestion_jobs.sql
       • db/migrations/013_chunks_enhanced.sql
       • db/migrations/014_reservations.sql
       • db/migrations/015_embedding_refs.sql
       • db/migrations/016_rls_new_tables.sql
       • db/migrations/017_references_enhancements.sql

Or via psycopg2:

pip install psycopg2-binary
export DATABASE_URL="postgresql://postgres:[PASSWORD]@db.[PROJECT].supabase.co:5432/postgres"
python scripts/run_migrations_psycopg.py
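
For orientation, a runner of this kind might look roughly like the sketch below. This is not the contents of scripts/run_migrations_psycopg.py: the function names are hypothetical, and the code assumes only that migration files carry a numeric prefix (as in 012_ingestion_jobs.sql) and that `conn` is a DB-API connection such as the one returned by `psycopg2.connect(os.environ["DATABASE_URL"])`.

```python
import re
from pathlib import Path


def ordered_migrations(directory):
    """Return the *.sql files in `directory`, sorted by numeric prefix."""
    numbered = []
    for path in Path(directory).glob("*.sql"):
        match = re.match(r"(\d+)_", path.name)
        if match:  # skip files that do not follow the NNN_name.sql convention
            numbered.append((int(match.group(1)), path))
    return [path for _, path in sorted(numbered)]


def apply_migrations(conn, directory):
    """Apply each migration in its own transaction on a DB-API connection."""
    for path in ordered_migrations(directory):
        cursor = conn.cursor()
        try:
            cursor.execute(path.read_text())
            conn.commit()
        except Exception:
            conn.rollback()  # leave earlier migrations applied, undo this one
            raise
        finally:
            cursor.close()
```

Sorting on the parsed integer rather than the file name keeps the order correct even if the zero padding is ever inconsistent. Committing per file means a failure in one migration leaves the earlier ones applied and rolls back only the failing file.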

Step 3: Run Tests

# Unit tests
pytest tests/unit/test_chunking.py -v

# Quick integration test (optional)
python -c "
from uuid import uuid4
from app.services.chunking import ChunkingService

service = ChunkingService()
file_id = uuid4()

# Test deterministic chunk ID
id1 = service.generate_chunk_id(file_id, 0, 0)
id2 = service.generate_chunk_id(file_id, 0, 0)

assert id1 == id2, 'Chunk IDs are not deterministic!'
print(f'✅ Chunk ID test passed: {id1[:16]}...')

# Test chunking
text = 'Sample text ' * 100
chunks = service.chunk_text(text, file_id, 0, 'fr')
print(f'✅ Generated {len(chunks)} chunks')
"
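
The determinism check above passes for any ID that is a pure function of its inputs. One common way to get that property is to hash the inputs, as in the sketch below. This is an illustration, not the actual `ChunkingService.generate_chunk_id`: the function name `make_chunk_id` is hypothetical, and reading the two integer arguments as a page number and a chunk index is an assumption.

```python
import hashlib
from uuid import UUID


def make_chunk_id(file_id: UUID, page: int, index: int) -> str:
    """Derive a stable hex ID from the inputs (hypothetical scheme).

    The same (file_id, page, index) triple always hashes to the same ID,
    so re-ingesting a file cannot create a second row for the same chunk.
    """
    key = f"{file_id}:{page}:{index}".encode("utf-8")
    return hashlib.sha256(key).hexdigest()
```

Because the ID depends only on the inputs, it can serve as a primary key or upsert key: a retried ingestion job overwrites existing chunks instead of duplicating them.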

📂 Key Files to Review

Services (Core Logic)

  • app/services/chunking.py - Token-based chunking
  • app/services/ingestion.py - Job state machine
  • app/services/wallet_reservation.py - Atomic billing
  • app/services/pinecone_adapter.py - Lightweight metadata
  • app/services/embedding_service.py - Embedding + refs
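
As a rough picture of what a job state machine with retry and audit can look like, here is a small sketch. The state names, the `max_retries` default, and the `IngestionJob` class are all hypothetical; the real lifecycle is defined in app/services/ingestion.py and may differ.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical transition table: each state maps to the states it may move to.
ALLOWED = {
    "pending": {"processing"},
    "processing": {"completed", "failed"},
    "failed": {"pending"},  # a retry puts the job back in the queue
    "completed": set(),     # terminal state
}


class InvalidTransition(Exception):
    """Raised when a transition is not allowed or the retry budget is spent."""


@dataclass
class IngestionJob:
    state: str = "pending"
    retries: int = 0
    max_retries: int = 3
    audit: List[Tuple[str, str]] = field(default_factory=list)

    def transition(self, new_state: str) -> None:
        """Move to `new_state`, enforcing the table and the retry limit."""
        if new_state not in ALLOWED[self.state]:
            raise InvalidTransition(f"{self.state} -> {new_state}")
        if self.state == "failed" and new_state == "pending":
            if self.retries >= self.max_retries:
                raise InvalidTransition("retry limit reached")
            self.retries += 1
        self.audit.append((self.state, new_state))  # record every change
        self.state = new_state
```

Keeping the allowed transitions in one table makes illegal moves (for example, completed back to processing) fail loudly, and the audit list plays the role that an ingestion_audit table would play in the database.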

Migrations (Run These!)

  • db/migrations/012_ingestion_jobs.sql
  • db/migrations/013_chunks_enhanced.sql
  • db/migrations/014_reservations.sql
  • db/migrations/015_embedding_refs.sql
  • db/migrations/016_rls_new_tables.sql
  • db/migrations/017_references_enhancements.sql

Documentation

  • SONNET_RUN.md - Full implementation log
  • ARTIFACTS/PHASE_A_COMPLETE.md - Testing guide
  • docs/backend_architecture.md - Complete architecture spec

✅ Success Criteria

Phase A passes when:

  • [ ] All 6 migrations run without errors
  • [ ] Tables exist: ingestion_jobs, chunks, reservations, embedding_refs, ingestion_audit
  • [ ] RLS enabled on new tables
  • [ ] Unit tests pass (pytest tests/unit/test_chunking.py)
  • [ ] Deterministic chunk IDs generate correctly
  • [ ] Same input produces same chunk ID (idempotency)
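
The table and RLS items in the checklist above can be checked in one query against the `pg_tables` catalog view, whose `rowsecurity` column reports whether row-level security is enabled. The sketch below assumes the tables live in the `public` schema and that all five listed tables are expected to have RLS enabled; `check_schema` is a hypothetical helper, and `cursor` is any psycopg2-style cursor.

```python
EXPECTED_TABLES = [
    "ingestion_jobs",
    "chunks",
    "reservations",
    "embedding_refs",
    "ingestion_audit",
]

QUERY = """
    SELECT tablename, rowsecurity
    FROM pg_tables
    WHERE schemaname = %s AND tablename = ANY(%s)
"""


def check_schema(cursor, schema="public"):
    """Return (missing_tables, tables_without_rls) for the expected tables."""
    # psycopg2 adapts a Python list to a Postgres array for ANY(%s).
    cursor.execute(QUERY, (schema, EXPECTED_TABLES))
    found = {name: rls for name, rls in cursor.fetchall()}
    missing = [t for t in EXPECTED_TABLES if t not in found]
    without_rls = [t for t in EXPECTED_TABLES if t in found and not found[t]]
    return missing, without_rls
```

Both returned lists should be empty when the migrations have been applied. If only a subset of the tables is meant to carry RLS, trim the second list accordingly.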

🚀 What's Next (Phase B)

After Phase A passes:

  1. JWT Custom Claims (S6) - Move roles to JWT
  2. Remove x-admin-key (S7) - Deprecate static admin key
  3. Request-ID Propagation (S9b) - Correlation across subsystems
  4. Rate Limiting (S9b) - Per-user rate limits
  5. Secrets Management (S9) - Move to Cloud Secret Manager

🔧 Troubleshooting

"SSL certificate verify failed" → You're on the corporate laptop. Use the other one.

"relation already exists" → Some tables may exist. Check schema and adjust migrations.

"ModuleNotFoundError: No module named 'app'" → Run from project root: python -m pytest tests/...

"Permission denied" on migrations → Use SUPABASE_SERVICE_ROLE_KEY, not anon key


📝 Notes

  • SSL Issues: The corporate laptop fails SSL certificate verification, so all testing is deferred to a non-corporate machine.
  • No Request-ID Yet: Planned for Phase B (S9b).
  • No Rate Limiting Yet: Planned for Phase B (S9b).
  • Deprecated x-admin-key: Still present; to be removed in Phase B (S7).

Questions? Check:

  • SONNET_RUN.md - Full implementation details
  • ARTIFACTS/PHASE_A_COMPLETE.md - Testing guide
  • ARTIFACTS/MISSING_CREDENTIALS.md - SSL issue notes


Ready to test! Pull the branch and run the migrations.