Feature: Feature Flag System

Metadata

  • Issue ID: FEAT-35
  • Status: Done
  • Owner: Kzu0-afk
  • Related PRs: 35-feature-flag-systemdev

Overview

Centralized, database-backed feature flag system that allows administrators to enable or disable application features dynamically without redeploying. Flags are stored in a PostgreSQL feature_flags table, cached in-memory for fast lookups, and exposed through a REST API for administration and programmatic checks. The system operates at the system level — independently of user subscriptions — and complements the existing subscription-based PlanAccessGuard / @RequireFeature pattern used in the Documents Module.

Key distinction from the existing Documents Module pattern:

  • @RequireFeature('UPLOAD') + PlanAccessGuard → checks whether a user's subscription plan permits the action (subscription-level gate).
  • @FeatureFlag(FeatureFlagName.ENABLE_UPLOADS) + FeatureFlagGuard → checks whether the feature itself is turned on system-wide (system-level gate).

Both guards can be composed on the same endpoint. The feature flag guard runs first; if the flag is disabled, the request is rejected with 403 before the subscription guard even executes.


Frontend Behavior

Note: The admin UI is optional for the initial release. The system can be operated entirely via API or direct DB updates until an admin dashboard is built.

Admin Panel (Future)

  • Displays a table of all feature flags with columns: Name, Description, Enabled (toggle switch), Environment, Last Updated
  • Toggle switch calls PATCH /feature-flags/:name/toggle and updates the row optimistically
  • "Create Flag" button opens a form with fields: Name (required, enable_ prefix enforced), Description (optional), Environment (optional dropdown: development, staging, production, test, or blank for all)
  • "Delete" action shows a confirmation dialog before calling DELETE /feature-flags/:name
  • Loading state shown while fetching flag list
  • Error state shown if API calls fail
  • Empty state: "No feature flags configured" when the table is empty

Consumer-Facing

  • Features gated by flags show a generic "Feature not available" error message when the flag is disabled
  • No indication is given to end users that a feature flag exists — the feature simply appears unavailable

Backend Behavior

Module Structure

backend/src/feature-flags/
├── decorators/
│   └── feature-flag.decorator.ts    # @FeatureFlag() SetMetadata decorator
├── dto/
│   └── create-feature-flag.dto.ts   # Validation DTO
├── feature-flag.guard.ts            # CanActivate guard
├── feature-flags.controller.ts      # REST endpoints
├── feature-flags.enum.ts            # Known flag name constants
├── feature-flags.module.ts          # NestJS module
└── feature-flags.service.ts         # Core service with caching

Endpoints

GET /feature-flags

  • Returns all feature flags
  • Restricted to admin users (RolesGuard + @Roles('admin'))

Response:

{
  "data": [
    {
      "id": "uuid",
      "name": "enable_ai_tools",
      "description": "Controls access to AI-powered features",
      "is_enabled": true,
      "environment": null,
      "created_at": "2026-04-23T12:00:00Z",
      "updated_at": "2026-04-23T14:30:00Z"
    }
  ]
}

GET /feature-flags/status/:name

  • Returns the enabled/disabled status of a single flag
  • Intended for internal service-to-service or frontend feature checks
  • Returns { name, is_enabled } — minimal payload for fast checks

Response (flag exists):

{ "name": "enable_ai_tools", "is_enabled": true }

Response (flag not found):

{ "name": "enable_ai_tools", "is_enabled": false }

Missing flags default to false (disabled) — safe default behavior.
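
A caller can treat this endpoint as a simple boolean lookup. The sketch below is one possible client-side wrapper, not code from the PR: fetchFlagStatus, FetchLike, and the injected fetchFn parameter are illustrative names, and failing closed on network or HTTP errors is an assumption chosen to mirror the missing-flag default described above.

```typescript
// Shape of the documented response for GET /feature-flags/status/:name.
interface FlagStatus {
  name: string;
  is_enabled: boolean;
}

// Minimal subset of the Fetch API used here (hypothetical helper type),
// injected so the function can be exercised without a network.
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Hypothetical helper: resolves to the flag's status and fails closed on any error.
async function fetchFlagStatus(
  flagName: string,
  fetchFn: FetchLike,
  baseUrl = '',
): Promise<boolean> {
  try {
    const url = `${baseUrl}/feature-flags/status/${encodeURIComponent(flagName)}`;
    const res = await fetchFn(url);
    if (!res.ok) return false; // assumption: treat HTTP errors as "disabled"
    const body = (await res.json()) as Partial<FlagStatus>;
    return body.is_enabled === true;
  } catch {
    return false; // assumption: treat network failures as "disabled"
  }
}
```

In a browser, the helper could be called as fetchFlagStatus('enable_ai_tools', (url) => fetch(url)).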

POST /feature-flags

  • Creates a new feature flag
  • Restricted to admin users (RolesGuard + @Roles('admin'))
  • Validates that name follows the enable_ prefix convention
  • Returns 409 Conflict if a flag with the same name already exists

Request body:

{
  "name": "enable_exam_mode",
  "description": "Controls access to exam/quiz mode",
  "is_enabled": false,
  "environment": "production"
}

Response: 201 Created with the full flag object.
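
The prefix rule can be expressed as a single pattern check. The contents of create-feature-flag.dto.ts are not shown in this document, so the sketch below is an assumption: isValidFlagName and FLAG_NAME_PATTERN are hypothetical names, and restricting the remainder of the name to lowercase snake_case goes beyond the stated enable_ prefix requirement.

```typescript
// Assumed convention: "enable_" followed by one or more lowercase
// alphanumeric segments separated by single underscores.
const FLAG_NAME_PATTERN = /^enable_[a-z0-9]+(?:_[a-z0-9]+)*$/;

// Hypothetical validator: true only for strings that follow the naming convention.
function isValidFlagName(name: unknown): name is string {
  return typeof name === 'string' && FLAG_NAME_PATTERN.test(name);
}
```

If the DTO uses class-validator, the same pattern could be applied through its Matches decorator; a failed check would then surface as the 400 Bad Request listed under Failure Modes.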

PATCH /feature-flags/:name/toggle

  • Flips the is_enabled value of the specified flag using a single atomic SQL statement
  • Restricted to admin users (RolesGuard + @Roles('admin'))
  • Invalidates the in-memory cache immediately after toggle

Response:

{ "name": "enable_exam_mode", "is_enabled": true }
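
The atomic toggle can be sketched as one UPDATE that flips the column and returns the new value, so that no read-modify-write window exists between two concurrent toggles. This is a hedged illustration rather than the service's actual code: SqlExecutor is a hypothetical stand-in for the database client (with Prisma, a raw query would play this role), and setting updated_at in the same statement is an assumption.

```typescript
interface ToggleResult {
  name: string;
  is_enabled: boolean;
}

// Hypothetical minimal database interface: runs a parameterized statement
// and resolves to the returned rows.
interface SqlExecutor {
  query<T>(sql: string, params: unknown[]): Promise<T[]>;
}

// Single statement: the flip happens inside the database, not in application code.
// Assumption: updated_at is maintained here because raw SQL bypasses ORM hooks.
const TOGGLE_SQL =
  'UPDATE feature_flags SET is_enabled = NOT is_enabled, updated_at = NOW() ' +
  'WHERE name = $1 RETURNING name, is_enabled';

// Resolves to the new state, or null when no row matched (caller maps this to 404).
async function toggleFlag(db: SqlExecutor, name: string): Promise<ToggleResult | null> {
  const [row] = await db.query<ToggleResult>(TOGGLE_SQL, [name]);
  return row ?? null;
}
```

After a successful toggle the service would still need to invalidate its in-memory cache, as described in the bullets above.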

DELETE /feature-flags/:name

  • Hard-deletes the flag record
  • Restricted to admin users (RolesGuard + @Roles('admin'))
  • Returns 404 if flag not found
  • Invalidates the in-memory cache immediately after deletion

Response:

{ "deleted": true, "name": "enable_exam_mode" }

POST /feature-flags/cache/refresh

  • Force-refreshes the in-memory cache from the database
  • Restricted to admin users (RolesGuard + @Roles('admin'))
  • Useful in multi-instance deployments to mitigate eventual consistency after a critical toggle

Response:

{ "refreshed": true }

Business Logic

  • Cache strategy: All flags are loaded into an in-memory Map on service initialization and reloaded after every write operation. A background TTL of 60 seconds (the default) forces a periodic refresh. In multi-instance deployments, instances are eventually consistent within the TTL window; the POST /feature-flags/cache/refresh endpoint mitigates this. ensureCacheFresh uses an in-flight promise pattern: concurrent callers that find the cache expired await the same refresh promise, so an expiry triggers a single database query rather than a thundering herd.
  • isEnabled() method: The primary API consumed by FeatureFlagGuard and any service that needs to check a flag. Signature: isEnabled(flagName: string, context?: { environment?: string }): Promise<boolean>.
    • If the flag does not exist in cache or DB → returns false (safe default).
    • If the flag has an environment value set, it only returns true when the current NODE_ENV matches.
    • If the flag has environment = null, it applies to all environments.
  • Naming convention: All flag names MUST use the enable_ prefix (e.g., enable_ai_tools, enable_ocr_processing). The POST endpoint validates this.
  • Decorator typing: @FeatureFlag() accepts FeatureFlagName | string. Prefer enum constants (FeatureFlagName.ENABLE_*) to reduce typo risk and improve IDE autocomplete.
  • No hardcoding: Feature availability must never be determined by hardcoded booleans in code. All checks go through FeatureFlagsService.isEnabled() or FeatureFlagGuard.
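
The cache and isEnabled() rules above can be sketched together in a framework-free class. This is a minimal illustration under stated assumptions, not the actual FeatureFlagsService: FlagCache, FlagLoader, and the injected clock are hypothetical names, the loader stands in for the database query, and letting context.environment override the process environment is an assumption about how the optional context is used.

```typescript
interface FlagRecord {
  name: string;
  is_enabled: boolean;
  environment: string | null;
}

// Hypothetical loader; in the real service this would be a database query.
type FlagLoader = () => Promise<FlagRecord[]>;

class FlagCache {
  private flags = new Map<string, FlagRecord>();
  private loadedAt: number | null = null;
  private inFlight: Promise<void> | null = null;

  constructor(
    private readonly load: FlagLoader,
    private readonly ttlMs: number = 60000,
    private readonly currentEnv: string = 'development',
    private readonly now: () => number = () => Date.now(),
  ) {}

  // In-flight promise pattern: concurrent callers share one refresh.
  refresh(): Promise<void> {
    if (this.inFlight) return this.inFlight;
    const pending = this.doRefresh();
    this.inFlight = pending;
    const clear = (): void => {
      if (this.inFlight === pending) this.inFlight = null;
    };
    void pending.then(clear, clear);
    return pending;
  }

  private async doRefresh(): Promise<void> {
    try {
      const rows = await this.load();
      const next = new Map<string, FlagRecord>();
      for (const row of rows) next.set(row.name, row);
      this.flags = next;
      this.loadedAt = this.now();
    } catch {
      // On failure the previous (possibly empty) cache keeps serving;
      // a real service would log the error here.
    }
  }

  private async ensureCacheFresh(): Promise<void> {
    const expired = this.loadedAt === null || this.now() - this.loadedAt >= this.ttlMs;
    if (expired) await this.refresh();
  }

  async isEnabled(flagName: string, context?: { environment?: string }): Promise<boolean> {
    await this.ensureCacheFresh();
    const flag = this.flags.get(flagName);
    if (!flag || !flag.is_enabled) return false; // missing flag: safe default
    if (flag.environment === null) return true; // null applies to all environments
    return flag.environment === (context?.environment ?? this.currentEnv);
  }
}
```

Write paths (create, toggle, delete) would call refresh() after committing, and the force-refresh endpoint would call it directly.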

Guard Composition with Existing Documents Module Pattern

Example of composing both guards on a single endpoint:

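import { Body, Post, UseGuards } from '@nestjs/common';
import { FeatureFlag } from '../feature-flags/decorators/feature-flag.decorator';
import { FeatureFlagGuard } from '../feature-flags/feature-flag.guard';
import { FeatureFlagName } from '../feature-flags/feature-flags.enum';
// PlanAccessGuard and RequireFeature come from the Documents Module (import paths omitted here).
// The method below sits inside a controller class.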

// System-level gate: is the upload feature turned on at all?
// Subscription-level gate: does this user's plan allow more uploads?
@Post('upload')
@UseGuards(FeatureFlagGuard, PlanAccessGuard)
@FeatureFlag(FeatureFlagName.ENABLE_UPLOADS)
@RequireFeature('UPLOAD')
uploadDocument(@Body() data: any) {
  return { success: true };
}

Guard execution order: FeatureFlagGuard → PlanAccessGuard. If the feature flag is disabled, the request is rejected immediately with 403 and the subscription check never runs.
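
The short-circuit behavior can be illustrated without the framework. The sketch below models only the decision logic and is an assumption-laden illustration: GuardDecision, checkFeatureFlag, and runGuards are hypothetical names, and the actual FeatureFlagGuard would read the flag name from decorator metadata and reject by throwing a 403 exception rather than returning a value.

```typescript
type GuardDecision =
  | { allowed: true }
  | { allowed: false; status: 403; message: string };

// Decision logic of the flag guard: no flag name (no decorator) means no restriction.
async function checkFeatureFlag(
  flagName: string | undefined,
  isEnabled: (name: string) => Promise<boolean>,
): Promise<GuardDecision> {
  if (flagName === undefined) return { allowed: true };
  if (await isEnabled(flagName)) return { allowed: true };
  return {
    allowed: false,
    status: 403,
    message: `Feature '${flagName}' is currently disabled`,
  };
}

// Guards run in the order listed; the first denial stops the chain.
async function runGuards(guards: Array<() => Promise<GuardDecision>>): Promise<GuardDecision> {
  for (const guard of guards) {
    const decision = await guard();
    if (!decision.allowed) return decision;
  }
  return { allowed: true };
}
```

Listing the flag check first means a disabled feature never reaches the subscription lookup, which is the more expensive of the two checks.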

Failure Modes

  • DB connection failure during cache refresh → stale cache continues to serve; error is logged via appLogger
  • DB connection failure during isEnabled() with empty cache → returns false (feature disabled); error is logged
  • Invalid flag name format on POST → returns 400 Bad Request
  • Duplicate flag name on POST → returns 409 Conflict
  • Flag not found on PATCH/DELETE → returns 404 Not Found
  • FeatureFlagGuard with disabled flag → returns 403 Forbidden with message "Feature '[flag_name]' is currently disabled"

QA Test Scenarios

| Scenario ID | Description | Steps | Input | Expected Result |
|---|---|---|---|---|
| FEAT-35-01 | List all flags: happy path | GET /feature-flags | No params, 3 flags exist in DB | Returns 200 with array of 3 flag objects |
| FEAT-35-02 | List all flags: empty state | GET /feature-flags | No flags in DB | Returns 200 with data: [] |
| FEAT-35-03 | Check flag status: exists and enabled | GET /feature-flags/status/enable_ai_tools | Flag exists, is_enabled = true | Returns 200 { name: "enable_ai_tools", is_enabled: true } |
| FEAT-35-04 | Check flag status: not found | GET /feature-flags/status/enable_nonexistent | Flag does not exist | Returns 200 { name: "enable_nonexistent", is_enabled: false } |
| FEAT-35-05 | Create flag: happy path | POST /feature-flags | { name: "enable_exam_mode", is_enabled: false } | Returns 201 with full flag object |
| FEAT-35-06 | Create flag: duplicate name | POST /feature-flags | { name: "enable_ai_tools" } when flag already exists | Returns 409 Conflict |
| FEAT-35-07 | Create flag: invalid name (no prefix) | POST /feature-flags | { name: "ai_tools" } | Returns 400 Bad Request with validation message |
| FEAT-35-08 | Toggle flag | PATCH /feature-flags/enable_ai_tools/toggle | Flag exists with is_enabled = true | Returns 200 with is_enabled: false; subsequent status check reflects change |
| FEAT-35-09 | Toggle flag: not found | PATCH /feature-flags/enable_nonexistent/toggle | Flag does not exist | Returns 404 |
| FEAT-35-10 | Delete flag: happy path | DELETE /feature-flags/enable_exam_mode | Flag exists | Returns 200 { deleted: true }; flag no longer appears in list |
| FEAT-35-11 | Delete flag: not found | DELETE /feature-flags/enable_nonexistent | Flag does not exist | Returns 404 |
| FEAT-35-12 | Guard blocks disabled flag | Hit endpoint decorated with @FeatureFlag(FeatureFlagName.ENABLE_UPLOADS) | Flag exists, is_enabled = false | Returns 403 with message containing flag name |
| FEAT-35-13 | Guard allows enabled flag | Hit endpoint decorated with @FeatureFlag(FeatureFlagName.ENABLE_UPLOADS) | Flag exists, is_enabled = true | Request passes through to controller handler |
| FEAT-35-14 | Guard handles missing flag (safe default) | Hit endpoint decorated with @FeatureFlag('enable_nonexistent') | Flag does not exist in DB | Returns 403 (safe default: missing = disabled) |
| FEAT-35-15 | Cache refresh on toggle | Toggle a flag, then immediately check status | PATCH then GET in sequence | Status reflects the toggled value without delay |
| FEAT-35-16 | Environment-scoped flag | Create flag with environment: "production", check in dev environment | NODE_ENV=development | isEnabled() returns false even though is_enabled = true |
| FEAT-35-17 | Admin auth: unauthenticated | Hit GET /feature-flags with no user object | No request.user | Returns 403 Forbidden |
| FEAT-35-18 | Admin auth: insufficient role | Hit GET /feature-flags with non-admin user | request.user.role = 'student' | Returns 403 Forbidden |
| FEAT-35-19 | Cache force refresh | Hit POST /feature-flags/cache/refresh | Admin user | Returns 200 { refreshed: true } |

Edge Cases

  • Missing flag → safe default: isEnabled() returns false for any flag name that does not exist in the database. The system must never crash or throw on a missing flag.
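  • Cache staleness: If a flag is updated directly in the DB or on another instance in a multi-replica setup, the cache will be stale for up to 60 seconds (the default TTL). Use the /cache/refresh endpoint if immediate sync is required; because the cache is held in each instance's memory, expect the call to refresh only the instance that serves the request.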
  • Race condition on toggle: Mitigated by using an atomic raw SQL UPDATE ... SET is_enabled = NOT is_enabled statement.
  • Environment mismatch: A flag with environment = 'production' will return false in development/staging even if is_enabled = true. Flags with environment = null apply everywhere.
  • Name uniqueness: The name column has a unique constraint. Attempting to create a duplicate flag returns 409 — never silently overwrites.
  • DB down during startup: If the database is unreachable when the service initializes, the cache will be empty and all isEnabled() calls will return false (all features disabled). An error is logged.
  • Guard without decorator: If FeatureFlagGuard is applied to an endpoint without the @FeatureFlag() decorator, the guard allows the request through (no flag name = no restriction).
  • Composing with PlanAccessGuard: When both guards are applied, FeatureFlagGuard should be listed first in @UseGuards() to short-circuit before the more expensive subscription check.

Notes

  • Admin Auth implemented: Admin endpoints are protected by RolesGuard + @Roles('admin').
  • Relationship to Documents Module: The existing @RequireFeature decorator and PlanAccessGuard in the Documents Module operate at the subscription level. The feature flag system operates at the system level. They are independent and composable — see the guard composition example in Backend Behavior.
  • Naming convention: All flags use the enable_ prefix (e.g., enable_ai_tools, enable_ocr_processing). This is enforced by validation on the POST endpoint.
  • Cache TTL: Default 60 seconds, configurable via FEATURE_FLAG_CACHE_TTL_MS environment variable.
  • Database: The feature_flags table was previously dropped in migration 202604220400_drop_deprecated_tables. This is a fresh redesign with a different schema.
  • Dependencies: PrismaModule (global), ConfigModule (global). No additional npm packages required.
  • Known limitations: No audit log for flag changes (future work). No percentage-based rollouts (future work). No user-scoped flags (future work — requires auth integration).
  • Frontend integration: The frontend can call GET /feature-flags/status/:name to conditionally render UI elements. A React hook (useFeatureFlag) is recommended for future frontend work but is out of scope for this PR.
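
For the Cache TTL note above, one way to read FEATURE_FLAG_CACHE_TTL_MS is sketched below. The function name resolveCacheTtlMs is hypothetical, and falling back to the 60-second default for missing, non-numeric, or non-positive values is an assumption; the document only states the variable name and the default.

```typescript
const DEFAULT_CACHE_TTL_MS = 60000;

// Hypothetical helper: takes an env-like object so it can be checked in isolation.
function resolveCacheTtlMs(env: Record<string, string | undefined>): number {
  const raw = env['FEATURE_FLAG_CACHE_TTL_MS'];
  if (raw === undefined || raw.trim() === '') return DEFAULT_CACHE_TTL_MS;
  const parsed = Number(raw);
  // Assumption: anything other than a positive integer falls back to the default.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_CACHE_TTL_MS;
}
```

In the service this would typically be called once at startup with the process environment, or the value could be read through ConfigModule instead.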