Feature: Feature Flag System

Metadata

Issue ID: FEAT-35
Status: Done
Owner: Kzu0-afk
Related PRs: 35-feature-flag-system → dev

Overview

Centralized, database-backed feature flag system that allows administrators to enable or disable application features dynamically without redeploying. Flags are stored in a PostgreSQL feature_flags table, cached in-memory for fast lookups, and exposed through a REST API for administration and programmatic checks. The system operates at the system level — independently of user subscriptions — and complements the existing subscription-based PlanAccessGuard / @RequireFeature pattern used in the Documents Module.

Key distinction from the existing Documents Module pattern:

@RequireFeature('UPLOAD') + PlanAccessGuard → checks whether a user's subscription plan permits the action (subscription-level gate).
@FeatureFlag(FeatureFlagName.ENABLE_UPLOADS) + FeatureFlagGuard → checks whether the feature itself is turned on system-wide (system-level gate).

Both guards can be composed on the same endpoint. The feature flag guard runs first; if the flag is disabled, the request is rejected with 403 before the subscription guard even executes.

Frontend Behavior

Note: The admin UI is optional for the initial release. The system can be operated entirely via API or direct DB updates until an admin dashboard is built.

Admin Panel (Future)

Displays a table of all feature flags with columns: Name, Description, Enabled (toggle switch), Environment, Last Updated
Toggle switch calls PATCH /feature-flags/:name/toggle and updates the row optimistically
"Create Flag" button opens a form with fields: Name (required, enable_ prefix enforced), Description (optional), Environment (optional dropdown: development, staging, production, test, or blank for all)
"Delete" action shows a confirmation dialog before calling DELETE /feature-flags/:name
Loading state shown while fetching flag list
Error state shown if API calls fail
Empty state: "No feature flags configured" when the table is empty

Consumer-Facing

Features gated by flags show a generic "Feature not available" error message when the flag is disabled
No indication is given to end users that a feature flag exists — the feature simply appears unavailable

Backend Behavior

Module Structure

backend/src/feature-flags/
├── decorators/
│   └── feature-flag.decorator.ts    # @FeatureFlag() SetMetadata decorator
├── dto/
│   └── create-feature-flag.dto.ts   # Validation DTO
├── feature-flag.guard.ts            # CanActivate guard
├── feature-flags.controller.ts      # REST endpoints
├── feature-flags.enum.ts            # Known flag name constants
├── feature-flags.module.ts          # NestJS module
└── feature-flags.service.ts         # Core service with caching

Endpoints

`GET /feature-flags`

Returns all feature flags
Restricted to admin users (RolesGuard + @Roles('admin'))

Response:

{
  "data": [
    {
      "id": "uuid",
      "name": "enable_ai_tools",
      "description": "Controls access to AI-powered features",
      "is_enabled": true,
      "environment": null,
      "created_at": "2026-04-23T12:00:00Z",
      "updated_at": "2026-04-23T14:30:00Z"
    }
  ]
}

`GET /feature-flags/status/:name`

Returns the enabled/disabled status of a single flag
Intended for internal service-to-service or frontend feature checks
Returns { name, is_enabled } — minimal payload for fast checks

Response (flag exists):

{ "name": "enable_ai_tools", "is_enabled": true }

Response (flag not found):

{ "name": "enable_ai_tools", "is_enabled": false }

Missing flags default to false (disabled) — safe default behavior.

`POST /feature-flags`

Creates a new feature flag
Restricted to admin users (RolesGuard + @Roles('admin'))
Validates that name follows the enable_ prefix convention
Returns 409 Conflict if a flag with the same name already exists

Request body:

{
  "name": "enable_exam_mode",
  "description": "Controls access to exam/quiz mode",
  "is_enabled": false,
  "environment": "production"
}

Response: 201 Created with the full flag object.
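
The name check on this endpoint can be pictured as a small predicate. This is a minimal sketch, not the project's DTO: the source only requires the enable_ prefix, so the lowercase snake_case restriction after the prefix and both function names are assumptions made for illustration.

```typescript
// Hypothetical pattern: "enable_" followed by lowercase snake_case segments.
// Only the "enable_" prefix comes from the spec; the rest is an assumption.
const FLAG_NAME_PATTERN = /^enable_[a-z0-9]+(?:_[a-z0-9]+)*$/;

export function isValidFlagName(name: string): boolean {
  return FLAG_NAME_PATTERN.test(name);
}

// Returns a message suitable for a 400 response, or null when the name is acceptable.
export function validateFlagName(name: string): string | null {
  return isValidFlagName(name)
    ? null
    : `Flag name "${name}" must start with "enable_" and use lowercase snake_case`;
}
```

If the DTO relies on class-validator, the same pattern could be passed to its Matches decorator instead of being called by hand.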

`PATCH /feature-flags/:name/toggle`

Flips the is_enabled value of the specified flag using a single atomic SQL statement
Restricted to admin users (RolesGuard + @Roles('admin'))
Returns 404 if flag not found
Invalidates the in-memory cache immediately after toggle

Response:

{ "name": "enable_exam_mode", "is_enabled": true }
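
One way the atomic toggle could look is sketched below. It assumes a client exposing a tagged-template $queryRaw method, as Prisma's client does; the RawQueryClient interface and the toggleFlag name are hypothetical, and the sketch leaves out updated_at handling and cache invalidation.

```typescript
// Minimal structural type for the raw-query call used below (an assumption;
// it only mirrors the shape this sketch needs).
interface RawQueryClient {
  $queryRaw<T>(query: TemplateStringsArray, ...values: unknown[]): Promise<T>;
}

interface ToggleResult {
  name: string;
  is_enabled: boolean;
}

// Flips is_enabled in a single statement, so two concurrent toggles cannot both
// read the same old value. Returns null when no row matched; the caller would
// translate that into a 404.
export async function toggleFlag(
  db: RawQueryClient,
  name: string,
): Promise<ToggleResult | null> {
  const rows = await db.$queryRaw<ToggleResult[]>`
    UPDATE feature_flags
       SET is_enabled = NOT is_enabled
     WHERE name = ${name}
    RETURNING name, is_enabled`;
  return rows[0] ?? null;
}
```

Because the interpolated name is passed as a template value rather than concatenated into the SQL string, it is sent as a bound parameter.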

`DELETE /feature-flags/:name`

Hard-deletes the flag record
Restricted to admin users (RolesGuard + @Roles('admin'))
Returns 404 if flag not found
Invalidates the in-memory cache immediately after deletion

Response:

{ "deleted": true, "name": "enable_exam_mode" }

`POST /feature-flags/cache/refresh`

Force-refreshes the in-memory cache from the database
Restricted to admin users (RolesGuard + @Roles('admin'))
Useful in multi-instance deployments to shorten the window of stale reads after a critical toggle

Response:

{ "refreshed": true }

Business Logic

Cache strategy: All flags are loaded into an in-memory Map on service initialization and reloaded after every write operation. A 60-second TTL (configurable via FEATURE_FLAG_CACHE_TTL_MS) also forces a periodic refresh. In multi-instance deployments, other instances can serve stale values for up to one TTL; the POST /feature-flags/cache/refresh endpoint mitigates this. ensureCacheFresh uses an in-flight promise pattern so that concurrent callers share a single database load instead of each triggering their own (thundering herd).
isEnabled() method: The primary API consumed by FeatureFlagGuard and any service that needs to check a flag. Signature: isEnabled(flagName: string, context?: { environment?: string }): Promise<boolean>.
- If the flag does not exist in cache or DB → returns false (safe default).
- If the flag has an environment value set, it only returns true when the current NODE_ENV matches.
- If the flag has environment = null, it applies to all environments.
Naming convention: All flag names MUST use the enable_ prefix (e.g., enable_ai_tools, enable_ocr_processing). The POST endpoint validates this.
Decorator typing: @FeatureFlag() accepts FeatureFlagName | string. Prefer enum constants (FeatureFlagName.ENABLE_*) to reduce typo risk and improve IDE autocomplete.
No hardcoding: Feature availability must never be determined by hardcoded booleans in code. All checks go through FeatureFlagsService.isEnabled() or FeatureFlagGuard.
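
The cache and lookup rules above can be sketched in a framework-free form. This is an illustration under stated assumptions, not the service's code: FlagCache, its constructor parameters, and the injected clock are hypothetical; the loader stands in for the database query; currentEnv stands in for NODE_ENV; and letting context.environment override it is a guess at how the optional context is used.

```typescript
interface FlagRecord {
  name: string;
  is_enabled: boolean;
  environment: string | null;
}

export class FlagCache {
  private flags = new Map<string, FlagRecord>();
  private loadedAt: number | null = null;
  private inFlight: Promise<void> | null = null;
  private readonly load: () => Promise<FlagRecord[]>; // stand-in for the DB query
  private readonly currentEnv: string; // stand-in for NODE_ENV
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(
    load: () => Promise<FlagRecord[]>,
    currentEnv: string,
    ttlMs: number = 60000,
    now: () => number = () => Date.now(),
  ) {
    this.load = load;
    this.currentEnv = currentEnv;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Reloads only when nothing has been loaded yet or the TTL has lapsed.
  async ensureCacheFresh(): Promise<void> {
    const fresh = this.loadedAt !== null && this.now() - this.loadedAt < this.ttlMs;
    if (!fresh) await this.refresh();
  }

  // In-flight promise pattern: concurrent callers await the same load.
  refresh(): Promise<void> {
    let pending = this.inFlight;
    if (pending === null) {
      const run = async (): Promise<void> => {
        try {
          const next = new Map<string, FlagRecord>();
          for (const record of await this.load()) next.set(record.name, record);
          this.flags = next;
          this.loadedAt = this.now();
        } catch {
          // Load failed: keep serving the previous (possibly empty) cache.
        }
      };
      pending = run().then(() => {
        this.inFlight = null;
      });
      this.inFlight = pending;
    }
    return pending;
  }

  async isEnabled(flagName: string, context?: { environment?: string }): Promise<boolean> {
    await this.ensureCacheFresh();
    const flag = this.flags.get(flagName);
    if (flag === undefined || !flag.is_enabled) return false; // safe default
    const env = context?.environment ?? this.currentEnv;
    return flag.environment === null || flag.environment === env;
  }
}
```

A real service would also log load failures and would likely avoid retrying the database on every call while it is down; both are omitted here to keep the sketch short.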

Guard Composition with Existing Documents Module Pattern

Example of composing both guards on a single endpoint:

import { FeatureFlagName } from '../feature-flags/feature-flags.enum';

@Post('upload')
// FeatureFlagGuard is listed first so it runs before the subscription check.
@UseGuards(FeatureFlagGuard, PlanAccessGuard)
// System-level gate: is the upload feature turned on at all?
@FeatureFlag(FeatureFlagName.ENABLE_UPLOADS)
// Subscription-level gate: does this user's plan allow more uploads?
@RequireFeature('UPLOAD')
uploadDocument(@Body() data: any) {
  return { success: true };
}

Guard execution order: FeatureFlagGuard → PlanAccessGuard. If the feature flag is disabled, the request is rejected immediately with 403 and the subscription check never runs.
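
The short-circuit behavior can be illustrated without NestJS types. The sketch below is a simplified model, not the actual guard: ForbiddenError, featureFlagGate, and runGuards are hypothetical names, and the guard is modelled as a function that resolves to allow and throws to reject.

```typescript
class ForbiddenError extends Error {
  readonly status = 403;
}

// A guard resolves to allow the request and throws to reject it.
type Guard = () => Promise<void>;

// Models the documented FeatureFlagGuard rules: no flag name on the handler
// means no restriction; a disabled or missing flag is rejected with 403.
export function featureFlagGate(
  flagName: string | undefined,
  isEnabled: (name: string) => Promise<boolean>,
): Guard {
  return async () => {
    if (flagName === undefined) return;
    if (!(await isEnabled(flagName))) {
      throw new ForbiddenError(`Feature '${flagName}' is currently disabled`);
    }
  };
}

// Runs guards in the order listed; the first rejection stops the chain,
// so later (more expensive) guards never execute.
export async function runGuards(guards: Guard[]): Promise<void> {
  for (const guard of guards) {
    await guard();
  }
}
```

With the feature flag gate placed first in the array, a disabled flag throws before the plan check is invoked, which mirrors the ordering described above.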

Failure Modes

DB connection failure during cache refresh → stale cache continues to serve; error is logged via appLogger
DB connection failure during isEnabled() with empty cache → returns false (feature disabled); error is logged
Invalid flag name format on POST → returns 400 Bad Request
Duplicate flag name on POST → returns 409 Conflict
Flag not found on PATCH/DELETE → returns 404 Not Found
FeatureFlagGuard with disabled flag → returns 403 Forbidden with message "Feature '[flag_name]' is currently disabled"

QA Test Scenarios

Scenario ID	Description	Steps	Input	Expected Result
FEAT-35-01	List all flags — happy path	`GET /feature-flags`	No params, 3 flags exist in DB	Returns 200 with array of 3 flag objects
FEAT-35-02	List all flags — empty state	`GET /feature-flags`	No flags in DB	Returns 200 with `data: []`
FEAT-35-03	Check flag status — exists and enabled	`GET /feature-flags/status/enable_ai_tools`	Flag exists, `is_enabled = true`	Returns 200 `{ name: "enable_ai_tools", is_enabled: true }`
FEAT-35-04	Check flag status — not found	`GET /feature-flags/status/enable_nonexistent`	Flag does not exist	Returns 200 `{ name: "enable_nonexistent", is_enabled: false }`
FEAT-35-05	Create flag — happy path	`POST /feature-flags`	`{ name: "enable_exam_mode", is_enabled: false }`	Returns 201 with full flag object
FEAT-35-06	Create flag — duplicate name	`POST /feature-flags`	`{ name: "enable_ai_tools" }` when flag already exists	Returns 409 Conflict
FEAT-35-07	Create flag — invalid name (no prefix)	`POST /feature-flags`	`{ name: "ai_tools" }`	Returns 400 Bad Request with validation message
FEAT-35-08	Toggle flag	`PATCH /feature-flags/enable_ai_tools/toggle`	Flag exists with `is_enabled = true`	Returns 200 with `is_enabled: false`; subsequent status check reflects change
FEAT-35-09	Toggle flag — not found	`PATCH /feature-flags/enable_nonexistent/toggle`	Flag does not exist	Returns 404
FEAT-35-10	Delete flag — happy path	`DELETE /feature-flags/enable_exam_mode`	Flag exists	Returns 200 `{ deleted: true }`; flag no longer appears in list
FEAT-35-11	Delete flag — not found	`DELETE /feature-flags/enable_nonexistent`	Flag does not exist	Returns 404
FEAT-35-12	Guard blocks disabled flag	Hit endpoint decorated with `@FeatureFlag(FeatureFlagName.ENABLE_UPLOADS)`	Flag exists, `is_enabled = false`	Returns 403 with message containing flag name
FEAT-35-13	Guard allows enabled flag	Hit endpoint decorated with `@FeatureFlag(FeatureFlagName.ENABLE_UPLOADS)`	Flag exists, `is_enabled = true`	Request passes through to controller handler
FEAT-35-14	Guard handles missing flag (safe default)	Hit endpoint decorated with `@FeatureFlag('enable_nonexistent')`	Flag does not exist in DB	Returns 403 (safe default: missing = disabled)
FEAT-35-15	Cache refresh on toggle	Toggle a flag, then immediately check status	`PATCH` then `GET` in sequence	Status reflects the toggled value without delay
FEAT-35-16	Environment-scoped flag	Create flag with `environment: "production"`, check in `development` environment	`NODE_ENV=development`	`isEnabled()` returns `false` even though `is_enabled = true`
FEAT-35-17	Admin auth — unauthenticated	Hit `GET /feature-flags` with no user object	No `request.user`	Returns 403 Forbidden
FEAT-35-18	Admin auth — insufficient role	Hit `GET /feature-flags` with non-admin user	`request.user.role = 'student'`	Returns 403 Forbidden
FEAT-35-19	Cache force refresh	Hit `POST /feature-flags/cache/refresh`	Admin user	Returns 200 `{ refreshed: true }`

Edge Cases

Missing flag → safe default: isEnabled() returns false for any flag name that does not exist in the database. The system must never crash or throw on a missing flag.
Cache staleness: If a flag is updated directly in the DB, or on another instance in a multi-replica setup, the cache can be stale for up to the cache TTL (60 seconds by default). Use the POST /feature-flags/cache/refresh endpoint if immediate sync is required.
Race condition on toggle: Mitigated by using an atomic raw SQL UPDATE ... SET is_enabled = NOT is_enabled statement.
Environment mismatch: A flag with environment = 'production' will return false in development/staging even if is_enabled = true. Flags with environment = null apply everywhere.
Name uniqueness: The name column has a unique constraint. Attempting to create a duplicate flag returns 409 — never silently overwrites.
DB down during startup: If the database is unreachable when the service initializes, the cache will be empty and all isEnabled() calls will return false (all features disabled). An error is logged.
Guard without decorator: If FeatureFlagGuard is applied to an endpoint without the @FeatureFlag() decorator, the guard allows the request through (no flag name = no restriction).
Composing with PlanAccessGuard: When both guards are applied, FeatureFlagGuard should be listed first in @UseGuards() to short-circuit before the more expensive subscription check.

Notes

Admin Auth implemented: Admin endpoints are protected by RolesGuard + @Roles('admin').
Relationship to Documents Module: The existing @RequireFeature decorator and PlanAccessGuard in the Documents Module operate at the subscription level. The feature flag system operates at the system level. They are independent and composable — see the guard composition example in Backend Behavior.
Naming convention: All flags use the enable_ prefix (e.g., enable_ai_tools, enable_ocr_processing). This is enforced by validation on the POST endpoint.
Cache TTL: Default 60 seconds, configurable via FEATURE_FLAG_CACHE_TTL_MS environment variable.
Database: The feature_flags table was previously dropped in migration 202604220400_drop_deprecated_tables. This is a fresh redesign with a different schema.
Dependencies: PrismaModule (global), ConfigModule (global). No additional npm packages required.
Known limitations: No audit log for flag changes (future work). No percentage-based rollouts (future work). No user-scoped flags (future work — requires auth integration).
Frontend integration: The frontend can call GET /feature-flags/status/:name to conditionally render UI elements. A React hook (useFeatureFlag) is recommended for future frontend work but is out of scope for this PR.
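
Until a useFeatureFlag hook exists, a frontend check against the status endpoint might look like the sketch below. The fetchFlagStatus name, the FetchLike type, and the baseUrl parameter are hypothetical, and treating any network or parse failure as "disabled" is an assumption that mirrors the backend's safe default.

```typescript
interface FlagStatus {
  name: string;
  is_enabled: boolean;
}

// Minimal shape of the fetch function used here, so the helper can be exercised
// without a browser. The global fetch is assignable to it.
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

export async function fetchFlagStatus(
  name: string,
  fetchImpl: FetchLike,
  baseUrl: string = "",
): Promise<boolean> {
  try {
    const res = await fetchImpl(`${baseUrl}/feature-flags/status/${encodeURIComponent(name)}`);
    if (!res.ok) return false;
    const body = (await res.json()) as Partial<FlagStatus>;
    return body.is_enabled === true;
  } catch {
    // Network or parse failure: treat the feature as unavailable.
    return false;
  }
}
```

A component could call this once on mount and render the gated element only when the result is true.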