Backend Gap-Filling Architecture

Date: 2026-02-18

Purpose: Fill the 12 missing endpoints needed to support the BacMR-UI frontend.

1. Documents Table Enrichment

Problem

The documents table is lean: it lacks title, major, weight, and category. Students need this metadata when browsing available content, but it currently exists only in the references table.

Solution

Add optional columns to documents. They are populated automatically when ingesting from a reference and left NULL for manual uploads. An admin can update them later.

Migration: `20260218000021_documents_enrichment.sql`

ALTER TABLE documents
    ADD COLUMN IF NOT EXISTS title TEXT,
    ADD COLUMN IF NOT EXISTS major TEXT,
    ADD COLUMN IF NOT EXISTS weight INT DEFAULT 0,
    ADD COLUMN IF NOT EXISTS category TEXT,
    -- "references" is quoted because REFERENCES is a reserved word in PostgreSQL
    ADD COLUMN IF NOT EXISTS reference_id UUID REFERENCES "references"(id),
    ADD COLUMN IF NOT EXISTS status TEXT DEFAULT 'ready',
    ADD COLUMN IF NOT EXISTS pinecone_index TEXT,
    ADD COLUMN IF NOT EXISTS pinecone_namespace TEXT,
    ADD COLUMN IF NOT EXISTS pinecone_vector_count INT;

CREATE INDEX IF NOT EXISTS idx_documents_status ON documents(status);
CREATE INDEX IF NOT EXISTS idx_documents_grade_subject ON documents(grade, subject);
CREATE INDEX IF NOT EXISTS idx_documents_reference ON documents(reference_id) WHERE reference_id IS NOT NULL;

Data flow

Ingest from reference: Copy title, major, weight, category from the reference row. Set reference_id to link them.
Manual upload: These fields are left unset (NULL, except weight, which takes its column default of 0). An admin can update them via PATCH /admin/documents/{id}.
Student queries: Query documents WHERE status = 'ready'. No joins are needed.
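The two write paths above can be sketched as one small helper. This is a minimal sketch, assuming reference rows arrive as plain dicts; `build_document_metadata` and `COPIED_FIELDS` are hypothetical names, not existing code.

```python
from typing import Any, Dict, Optional

# Fields copied from the reference row on ingest (per the data flow above).
COPIED_FIELDS = ("title", "major", "weight", "category")


def build_document_metadata(reference: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the enrichment columns to set on a new documents row.

    Hypothetical helper: with a reference, copy its metadata and link it.
    For a manual upload (reference is None), return nothing so the
    database column defaults apply.
    """
    if reference is None:
        return {}
    metadata = {field: reference.get(field) for field in COPIED_FIELDS}
    metadata["reference_id"] = reference["id"]
    return metadata
```

Returning an empty dict for manual uploads keeps the insert statement free of explicit NULLs, so the migration's defaults decide the stored values.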

2. Student Content Browsing

New router: `app/api/routers/curriculum.py`

Prefix: /curriculum

Auth: None (public endpoints)

Endpoints

`GET /curriculum/subjects`

Returns distinct subjects from ingested documents.

Query: SELECT DISTINCT subject FROM documents WHERE status = 'ready' AND subject IS NOT NULL

Response:

{
    "subjects": ["Mathematiques", "Physique", "Arabe", "Francais"],
    "total": 4
}
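For illustration, the same result can be assembled in application code when rows have already been fetched. This is a sketch under assumptions: rows are plain dicts, `build_subjects_response` is a hypothetical name, and the alphabetical sort is a choice made here for stable output, not something the design specifies.

```python
from typing import Dict, Iterable, List, Optional, Union


def build_subjects_response(
    rows: Iterable[Dict[str, Optional[str]]],
) -> Dict[str, Union[List[str], int]]:
    """Collapse document rows into the /curriculum/subjects payload."""
    subjects = sorted(
        {
            row["subject"]
            for row in rows
            # Mirrors: WHERE status = 'ready' AND subject IS NOT NULL
            if row.get("status") == "ready" and row.get("subject") is not None
        }
    )
    return {"subjects": subjects, "total": len(subjects)}
```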

`GET /curriculum/textbooks`

Returns available textbooks grouped by education level (grade).

Query params: ?grade=...&subject=...&language=... (all optional filters)

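Query: SELECT * FROM documents WHERE status = 'ready', with any supplied filters applied. The levels list comes from the distinct grade values, computed either client-side or with a separate DISTINCT query.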

Response:

{
    "levels": ["elementary-1", "secondary-3", "high-school-7"],
    "textbooks": [
        {
            "id": "uuid",
            "title": "Manuel de Mathematiques 7C",
            "grade": "high-school-7",
            "subject": "Mathematiques",
            "major": "C",
            "language": "French",
            "weight": 9,
            "page_count": 120,
            "chunk_count": 85,
            "namespace": "grade-high-school-7-Mathematiques"
        }
    ]
}
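The optional filters and the levels list can be sketched in plain Python as follows. This assumes document rows are dicts and that levels are derived from the filtered result in first-seen order; both are assumptions of this sketch, and `list_textbooks` is a hypothetical name.

```python
from typing import Any, Dict, List, Optional


def list_textbooks(
    documents: List[Dict[str, Any]],
    grade: Optional[str] = None,
    subject: Optional[str] = None,
    language: Optional[str] = None,
) -> Dict[str, Any]:
    """Filter ready documents and derive the distinct grade levels."""
    # Only filters the caller actually supplied are applied.
    requested = {"grade": grade, "subject": subject, "language": language}
    active = {key: value for key, value in requested.items() if value is not None}

    textbooks = [
        doc
        for doc in documents
        if doc.get("status") == "ready"
        and all(doc.get(key) == value for key, value in active.items())
    ]
    # dict.fromkeys removes duplicates while keeping first-seen order.
    levels = list(dict.fromkeys(doc["grade"] for doc in textbooks if doc.get("grade")))
    return {"levels": levels, "textbooks": textbooks}
```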

`GET /curriculum/textbooks/{id}`

Returns details for a single textbook including its page range (derived from chunks).

Query: SELECT * FROM documents WHERE id = {id}, followed by SELECT MIN(page_number), MAX(page_number) FROM chunks WHERE file_id = {id} to derive the page range.

Response:

{
    "document": { "id": "...", "title": "...", "grade": "...", "..." : "..." },
    "page_range": { "min": 1, "max": 120 }
}
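The page-range derivation is a MIN/MAX aggregate over the document's chunks. A minimal in-memory equivalent, assuming chunk rows are dicts with `file_id` and `page_number` keys (`derive_page_range` is a hypothetical name):

```python
from typing import Any, Dict, Iterable, Optional


def derive_page_range(
    chunks: Iterable[Dict[str, Any]], file_id: str
) -> Optional[Dict[str, int]]:
    """Return {"min": ..., "max": ...} for one document, or None without chunks."""
    pages = [
        chunk["page_number"]
        for chunk in chunks
        if chunk["file_id"] == file_id and chunk.get("page_number") is not None
    ]
    if not pages:
        # SQL MIN/MAX over zero rows yields NULL; None mirrors that here.
        return None
    return {"min": min(pages), "max": max(pages)}
```

A document that is still processing has no chunks yet, so callers should be prepared for the empty case.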

Schemas: `app/schemas/curriculum.py`

SubjectsResponse(subjects: List[str], total: int)
TextbookOut(id, title, grade, subject, major, language, weight, page_count, chunk_count, namespace)
TextbookListResponse(levels: List[str], textbooks: List[TextbookOut])
TextbookDetailResponse(document: TextbookOut, page_range: dict)
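To make the shapes concrete, here is a sketch of these schemas using standard-library dataclasses so that it runs without dependencies. The real module would presumably define Pydantic models, and every field type below is an assumption inferred from the example responses.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class SubjectsResponse:
    subjects: List[str]
    total: int


@dataclass
class TextbookOut:
    id: str
    # Enrichment fields are optional because manual uploads may lack them.
    title: Optional[str] = None
    grade: Optional[str] = None
    subject: Optional[str] = None
    major: Optional[str] = None
    language: Optional[str] = None
    weight: Optional[int] = None
    page_count: Optional[int] = None
    chunk_count: Optional[int] = None
    namespace: Optional[str] = None


@dataclass
class TextbookListResponse:
    levels: List[str]
    textbooks: List[TextbookOut]


@dataclass
class TextbookDetailResponse:
    document: TextbookOut
    page_range: Dict[str, int]
```

`asdict()` converts nested instances recursively, which yields the JSON-ready structure shown in the responses above.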

3. Ingestion Job Management

Flow change

The current POST /admin/ingest/{reference_id} runs synchronously. This changes to a job-based model:

POST /admin/ingest/{reference_id}
    → Creates ingestion_job (status: "queued")
    → Creates document record (status: "processing")
    → Returns { job_id, status: "queued" }

POST /admin/jobs/dispatch
    → Picks oldest "queued" job
    → Starts processing as a FastAPI BackgroundTask
    → Pipeline: download PDF → extract pages → chunk → embed → upsert Pinecone
    → On success: job status → "ready", document status → "ready"
    → On failure: job status → "failed", document status → "failed"
    → Returns { job_id, status: "parsing" }

GET /admin/jobs?status=...&limit=...&offset=...
    → List jobs with filters

GET /admin/jobs/{id}
    → Single job details with audit trail

POST /admin/jobs/{id}/requeue
    → Reset "failed" job to "queued", increment retry_count
    → Returns updated job
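The queue selection and requeue rules above can be sketched as two pure functions. This assumes job rows are dicts and that `created_at` values are ISO 8601 strings in a single format (so string order matches time order). Clearing `error_message` on requeue is also an assumption; the function names are hypothetical.

```python
from typing import Any, Dict, List, Optional


def pick_oldest_queued(jobs: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the queued job with the earliest created_at, or None."""
    queued = [job for job in jobs if job["status"] == "queued"]
    if not queued:
        return None
    return min(queued, key=lambda job: job["created_at"])


def requeue_job(job: Dict[str, Any]) -> Dict[str, Any]:
    """Reset a failed job to queued and bump its retry counter."""
    if job["status"] != "failed":
        raise ValueError(f"only failed jobs can be requeued, got {job['status']!r}")
    return {
        **job,
        "status": "queued",
        "retry_count": job.get("retry_count", 0) + 1,
        "error_message": None,  # assumption: the stale error is cleared
    }
```

Rejecting anything other than a failed job keeps a queued or running job from being duplicated by an accidental requeue.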

Processing pipeline (inside dispatch background task)

Fetch job + reference from DB
Download PDF from reference.pdf_source via httpx
Extract pages via pdf_processor.extract_pages()
Chunk via pdf_processor.build_chunks()
Embed via embedding_service.generate_embeddings()
Upsert to Pinecone via pinecone_adapter.index.upsert()
Create/update documents row with metadata from reference
Update job status → "ready"
Insert audit trail entries at each state transition
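The control flow of this pipeline can be sketched with the stages passed in as callables. This is an illustration only: `run_pipeline` is a hypothetical name, the real stages would call `pdf_processor`, `embedding_service`, and `pinecone_adapter`, and using the stage name as the intermediate job status is an assumption (the design names only "queued", "parsing", "ready", and "failed").

```python
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[Any], Any]]


def run_pipeline(
    job: Dict[str, Any], steps: List[Step], audit: List[Dict[str, str]]
) -> Dict[str, Any]:
    """Run stages in order, recording an audit entry at every transition."""

    def transition(status: str, detail: str = "") -> None:
        job["status"] = status
        audit.append({"job_id": job["id"], "status": status, "detail": detail})

    data: Any = job["reference_id"]
    try:
        for name, step in steps:
            transition(name)
            data = step(data)  # each stage consumes the previous stage's output
    except Exception as exc:
        # Any stage failure marks the job failed and keeps the reason.
        job["error_message"] = str(exc)
        transition("failed", str(exc))
        return job
    transition("ready")
    return job
```

Because the stages are injected, the same function can be exercised in tests with stubs and run in production inside the background task.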

Endpoints added to: `app/api/routers/admin.py`

The existing POST /admin/ingest/{reference_id} is modified from synchronous to job-creating. Four new endpoints are added for job management.

Schemas: `app/schemas/ingestion.py` (extend existing)

IngestionJobOut(id, reference_id, file_id, status, chunks_created, vectors_upserted, retry_count, error_message, created_at, updated_at)
IngestionJobListResponse(items: List[IngestionJobOut], total: int)

4. Admin User CRUD

New endpoints added to: `app/api/routers/admin.py`

All use Supabase Auth admin API via supabase_service.

`POST /admin/users`

Create a new user account.

Request:

{
    "email": "student@example.com",
    "password": "securepassword",
    "full_name": "Ahmed Mohamed",
    "role": "student"
}

Implementation:

supabase_service.auth.admin.create_user({
    "email": email,
    "password": password,
    "email_confirm": True,
    "user_metadata": {"full_name": full_name, "role": role}
})

Profile row is auto-created by the existing handle_new_user trigger.
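Before calling the Auth admin API, the request can be validated and turned into the payload shown above. A minimal sketch, assuming the only roles are "student" and "admin" (inferred from the stats response, not stated explicitly); `build_create_user_payload` and `ALLOWED_ROLES` are hypothetical names.

```python
from typing import Any, Dict

# Assumption: the role set is inferred from the admins/students stats counts.
ALLOWED_ROLES = {"student", "admin"}


def build_create_user_payload(
    email: str, password: str, full_name: str, role: str = "student"
) -> Dict[str, Any]:
    """Validate the request and build the create_user argument."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    if "@" not in email:
        raise ValueError("email must contain '@'")
    return {
        "email": email,
        "password": password,
        "email_confirm": True,  # admin-created accounts skip email confirmation
        "user_metadata": {"full_name": full_name, "role": role},
    }
```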

`DELETE /admin/users/{user_id}`

Delete a user account.

Implementation:

1. Delete from profiles table
2. Delete from Supabase Auth: supabase_service.auth.admin.delete_user(user_id)

`POST /admin/users/{user_id}/reset-password`

Reset a user's password.

Request: { "new_password": "..." }

Implementation:

supabase_service.auth.admin.update_user_by_id(user_id, {"password": new_password})

Schemas: `app/schemas/admin.py` (extend existing)

UserCreateRequest(email, password, full_name, role)
UserResetPasswordRequest(new_password)

5. Admin Stats

Endpoint added to: `app/api/routers/admin.py`

`GET /admin/stats`

Returns dashboard statistics.

Response:

{
    "documents": { "total": 45, "ready": 40, "processing": 3, "failed": 2 },
    "chunks": { "total": 3200 },
    "references": { "total": 120, "discovered": 75, "ready": 40, "failed": 5 },
    "users": { "total": 150, "admins": 2, "students": 148 },
    "jobs": { "queued": 2, "running": 1, "completed": 42, "failed": 3 },
    "wallet": { "total_tokens_in_circulation": 75000 }
}

Implementation: Multiple count queries against documents, chunks, references, profiles, ingestion_jobs, wallet.
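Each per-status block in the response has the same shape, so one helper can produce all of them. The sketch below works on rows already in memory purely for illustration; in practice the counts would more likely come from COUNT queries. `count_by_status` is a hypothetical name.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List


def count_by_status(
    rows: Iterable[Dict[str, Any]], statuses: List[str]
) -> Dict[str, int]:
    """Return a total plus one counter per requested status."""
    rows = list(rows)
    counts = Counter(row["status"] for row in rows)
    result = {"total": len(rows)}
    # Statuses with no rows still appear, with a count of 0.
    result.update({status: counts.get(status, 0) for status in statuses})
    return result
```

Listing the statuses explicitly keeps the response keys stable even when a status has no rows.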

6. Document Delete

Endpoint added to: `app/api/routers/admin.py`

`DELETE /admin/documents/{id}`

Delete a document and all associated data.

Implementation:

1. Fetch document (get pinecone_namespace, id)
2. Delete Pinecone vectors: pinecone_adapter.delete_by_file_id(document.id, namespace)
3. Delete chunks: DELETE FROM chunks WHERE file_id = document.id (CASCADE should handle this)
4. Delete document row
5. If reference_id exists, update reference status back to discovered

Response: { "status": "deleted", "chunks_removed": 85, "vectors_removed": 85 }
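The deletion order can be sketched against in-memory stand-ins for the tables. Everything here is illustrative: `delete_document` is a hypothetical name, the dicts stand in for database tables, and `delete_vectors` stands in for `pinecone_adapter.delete_by_file_id`. That the callable returns a count of removed vectors is an assumption.

```python
from typing import Any, Callable, Dict, Optional


def delete_document(
    document_id: str,
    documents: Dict[str, Dict[str, Any]],
    chunks: Dict[str, Dict[str, Any]],
    references: Dict[str, Dict[str, Any]],
    delete_vectors: Callable[[str, Optional[str]], int],
) -> Dict[str, Any]:
    """Remove vectors, chunks, and the document row, then reopen the reference."""
    document = documents.get(document_id)
    if document is None:
        raise KeyError(f"document {document_id} not found")

    # Vectors go first, while the namespace is still readable from the row.
    vectors_removed = delete_vectors(document_id, document.get("pinecone_namespace"))

    doomed = [cid for cid, chunk in chunks.items() if chunk["file_id"] == document_id]
    for chunk_id in doomed:
        del chunks[chunk_id]
    del documents[document_id]

    reference_id = document.get("reference_id")
    if reference_id in references:
        references[reference_id]["status"] = "discovered"

    return {
        "status": "deleted",
        "chunks_removed": len(doomed),
        "vectors_removed": vectors_removed,
    }
```

Deleting the vectors before the row means a failure in the vector store leaves the document in place, so the operation can be retried.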

7. Router Registration

Update app/api/router.py to include the new curriculum router:

from app.api.routers import chat, admin, wallet, scraping, auth, me, curriculum

api_router.include_router(curriculum.router, tags=["curriculum"])

8. Files to Create/Modify

File	Action
`db/migrations/20260218000021_documents_enrichment.sql`	New migration
`app/schemas/curriculum.py`	New file: student browsing schemas
`app/api/routers/curriculum.py`	New file: student browsing router
`app/schemas/admin.py`	Add UserCreateRequest, UserResetPasswordRequest
`app/schemas/ingestion.py`	Add IngestionJobOut, IngestionJobListResponse
`app/api/routers/admin.py`	Add jobs, user CRUD, stats, document delete endpoints; modify ingest endpoint to create job instead of running synchronously
`app/api/router.py`	Register curriculum router

9. Endpoint Summary

#	Method	Path	Auth	Category
1	GET	`/curriculum/subjects`	None	Student browsing
2	GET	`/curriculum/textbooks`	None	Student browsing
3	GET	`/curriculum/textbooks/{id}`	None	Student browsing
4	GET	`/admin/jobs`	Admin	Job management
5	GET	`/admin/jobs/{id}`	Admin	Job management
6	POST	`/admin/jobs/dispatch`	Admin	Job management
7	POST	`/admin/jobs/{id}/requeue`	Admin	Job management
8	POST	`/admin/users`	Admin	User CRUD
9	DELETE	`/admin/users/{user_id}`	Admin	User CRUD
10	POST	`/admin/users/{user_id}/reset-password`	Admin	User CRUD
11	GET	`/admin/stats`	Admin	Dashboard
12	DELETE	`/admin/documents/{id}`	Admin	Cleanup

Plus modification of existing POST /admin/ingest/{reference_id} to job-based model, and PATCH /admin/documents/{id} for metadata updates.
