Database Schema

Purpose

Defines the current StudyBoost public schema shape used by backend services and feature modules.


Core Notes

  • Primary database: PostgreSQL (Neon)
  • ORM: Prisma (backend/prisma/schema.prisma)
  • Migration source of truth: backend/prisma/migrations/*
  • Auth/session data currently lives in public (no separate auth schema yet)

Auth Tables (Public Schema)

users

Core identity table used across documents, subscriptions, notifications, and auth.

Important columns:

  • user_id (UUID, PK)
  • email (unique)
  • username (unique)
  • password (hashed)
  • onboarding_draft (JSONB) - Stores volatile AI/Wizard onboarding state.
  • onboarding_completed (Boolean) - Tracks final readiness status.
  • university, major_focus, study_goal - Canonicalized profile fields.
  • is_active, is_verified

Institutions Catalog (Public Schema)

institutions

A master catalog of 100k+ global universities and schools used for profile canonicalization.

Columns:

  • institution_id (UUID, PK)
  • institution_name (unique)
  • slug (unique, URL-friendly name)
  • is_active

Indexes:

  • idx_institution_name_trgm: GIN Trigram index on institution_name for high-performance fuzzy matching.
  • idx_institution_slug: B-tree index for rapid URL lookups.

Notes:

  • Trigram index added via migration 202605030322_onboarding_overhaul.
  • Enabled by the pg_trgm extension.

auth_sessions

Stores active refresh-session state per device/login.

Columns:

  • session_id (UUID, PK)
  • user_id (FK -> users.user_id)
  • refresh_token_hash (unique)
  • user_agent
  • ip_address
  • expires_at
  • created_at

Indexes:

  • idx_auth_sessions_user
  • idx_auth_sessions_expires_at
  • uk_auth_sessions_refresh_token_hash

email_verification_tokens

Stores one-time email verification tokens (hashed).

Columns:

  • id (UUID, PK)
  • user_id (FK -> users.user_id)
  • token_hash (unique)
  • expires_at
  • created_at
  • used_at

Indexes:

  • idx_email_verification_tokens_user
  • idx_email_verification_tokens_expires_at
  • idx_email_verification_tokens_used_at
  • uk_email_verification_tokens_token_hash

Queue Infrastructure Tables (Public Schema)

background_jobs

Tracks lifecycle state for asynchronous queue jobs processed by FEAT-33 workers.

Columns:

  • id (UUID, PK)
  • queue_name (ocr_queue, ai_processing_queue, indexing_queue, email_queue, notification_queue)
  • job_type (logical task type)
  • bull_job_id (BullMQ job identifier)
  • status (pending | processing | completed | failed)
  • payload_json
  • result_json
  • error_message
  • attempts_made
  • max_attempts
  • idempotency_key (unique when present)
  • created_at
  • started_at
  • finished_at
  • updated_at

Indexes:

  • idx_background_jobs_queue_name
  • idx_background_jobs_status
  • idx_background_jobs_created_at
  • idx_background_jobs_queue_status
  • uk_background_jobs_idempotency_key (partial unique index for non-null keys)

Notes:

  • Added via migration 202604272230_add_background_jobs_table.
  • Used by queue producers and worker processors for retry-safe status tracking.

Structured Notes Tables (Public Schema)

structured_notes

Stores user-owned note records for FEAT-59.

Columns:

  • id (UUID, PK)
  • user_id (FK -> users.user_id)
  • title
  • body
  • status (active | archived | deleted)
  • deadline_at
  • reminder_at
  • archived_at
  • deleted_at
  • created_at
  • updated_at

Indexes:

  • idx_structured_notes_user_status_updated
  • idx_structured_notes_user_deadline
  • idx_structured_notes_user_created

structured_note_references

Stores typed note references to catalog entities.

Columns:

  • id (UUID, PK)
  • note_id (FK -> structured_notes.id)
  • reference_type (document | course | institution)
  • document_id (nullable FK -> documents.document_id)
  • course_id (nullable FK -> courses.course_id)
  • institution_id (nullable FK -> institutions.institution_id)
  • context
  • access_state_snapshot
  • deadline_at
  • reminder_at
  • position_order
  • archived_at
  • created_at
  • updated_at

Indexes:

  • idx_structured_note_references_note
  • idx_structured_note_references_type
  • idx_structured_note_references_document
  • idx_structured_note_references_course
  • idx_structured_note_references_institution

structured_note_canvas_items

Stores persisted card coordinates/layout metadata per note.

Columns:

  • id (UUID, PK)
  • note_id (FK -> structured_notes.id)
  • x, y, width, height
  • group_label
  • created_at
  • updated_at

Indexes:

  • idx_structured_note_canvas_note

Notes:

  • Added via migration 202605021540_add_structured_notes_tables.
  • Migration is additive and guard-safe (IF NOT EXISTS).

Referral Ledger Tables (Public Schema)

referral_settings

Global referral configuration row (settings_key = 'global').

Columns:

  • enable_referrals
  • commission_percentage
  • fraud_window_days
  • min_payout_amount
  • created_at, updated_at

referral_codes

One auto-generated referral code per user.

Columns:

  • id (UUID, PK)
  • user_id (unique FK -> users.user_id)
  • code (case-insensitive unique via LOWER(code) index)
  • is_active
  • created_at, updated_at

referrals

Attribution mapping between referrer and referred users.

Columns:

  • id (UUID, PK)
  • referrer_user_id (FK -> users.user_id)
  • referred_user_id (unique FK -> users.user_id)
  • attribution_source (registration | checkout)
  • locked_at
  • created_at

payment_events

Payment ledger rows used as commission source records for subscriptions now and future paid features later.

Columns:

  • id (UUID, PK)
  • user_id (FK -> users.user_id)
  • source_type, source_id (unique pair)
  • amount, currency
  • stripe_payment_intent_id, stripe_charge_id, payment_fingerprint
  • status (confirmed | refunded | fraud_flagged)
  • paid_at, metadata, created_at

commissions

Commission ledger rows (source of truth for earnings state; no balance field).

Columns:

  • id (UUID, PK)
  • referral_id (FK -> referrals.id)
  • payment_event_id (unique nullable FK -> payment_events.id)
  • amount, currency, percentage
  • status (pending | locked | ready_for_payout | paid | cancelled)
  • available_at
  • payout_batch_id (nullable FK -> payout_batches.id)
  • created_at, updated_at

commission_status_events

Trace table for explicit commission status transitions.

Columns:

  • commission_id (FK -> commissions.id)
  • from_status, to_status
  • reason
  • actor_user_id (nullable FK -> users.user_id)
  • created_at

payout_batches

Future payout grouping placeholder; no disbursement logic yet.

Columns:

  • id (UUID, PK)
  • total_amount
  • status (pending | processing | completed | failed)
  • created_at, updated_at

Notes:

  • Added via migration 202605120900_add_referral_ledger_system.
  • Includes default disabled referral settings and default enable_referrals feature flag insertion.

Relationship Highlights

  • users 1 -> many auth_sessions
  • users 1 -> many email_verification_tokens
  • users 1 -> many structured_notes
  • users 1 -> 1 referral_codes
  • users 1 -> many referrals (as referrer) and users 1 -> 0..1 referrals (as referred)
  • users 1 -> many payment_events
  • referrals 1 -> many commissions
  • commissions 1 -> many commission_status_events
  • payout_batches 1 -> many commissions
  • structured_notes 1 -> many structured_note_references
  • structured_notes 1 -> many structured_note_canvas_items
  • background_jobs is standalone in v1 (no FK dependency to domain tables)
  • Existing feature tables (documents, notifications, user_subscriptions, etc.) continue to reference users.user_id

Schema Change Rules

  • Follow mandatory migration controls in /docs/database-schema/migration-convention:
    • required intent header (PURPOSE, IMPACT, DATA RISK, ROLLBACK STRATEGY)
    • dangerous migration classification and required destructive checklist evidence
    • migration rationale/changelog block in the PR
  • Additive auth/security migrations must include QA docs updates in docs/features/*
  • Never push schema changes directly to main; use PR flow to dev