Manuscript Status System

Complete guide to ChapterWise's manuscript status tracking system for developers

ChapterWise uses a dual-status system to track each manuscript through its lifecycle, from upload to completed analysis. A status column records the current workflow stage, and a JSONB metadata object records per-stage flags and timestamps. Together they drive progress tracking, UI state, and status transitions.

System Overview

The status system consists of two complementary tracking mechanisms:

  1. Primary Status Column (manuscripts.status) - High-level workflow stage
  2. Detailed Status Metadata (manuscripts.file_metadata.status) - Granular progress tracking with timestamps

API Endpoints

  • /api/manuscripts/{id}/status (Recommended) - Returns both status column and file_metadata.status
  • /api/manuscripts/{id}/import-status (Legacy) - Returns detailed workflow status (deprecated)

Status Stages

Complete Lifecycle

pending → uploaded → converted → metadata-detected → chapters-detected → confirmed → analyzed

Stage Definitions

Stage | Description | Triggers | UI Impact
------|-------------|----------|----------
pending | Initial state after manuscript creation | Record creation | Shows upload form
uploaded | File successfully uploaded and validated | Successful file upload | Automatic processing begins
converted | Document converted to processable format | Document conversion completion | Continue to metadata detection
metadata-detected | Combined metadata and TOC extracted | Combined metadata/TOC OpenAI call completion | Continue to chapter detection
chapters-detected | Chapter structure identified and chunked | Full chapter detection completion | Shows confirmation step
confirmed | Import process finalized by user | User confirms chapter structure | Shows analysis options
analyzed | All configured analyses completed | All analysis modules finish | Shows results dashboard
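
Because the lifecycle is strictly ordered, a stage's position in the list is enough to answer questions such as "what comes next?" or "has conversion already happened?". The following is a minimal sketch of that idea. The helper names `next_stage` and `has_reached` are hypothetical and are not part of the ChapterWise codebase.

```python
from typing import Optional

# Stage names copied from the lifecycle above, in order.
STAGES = [
    "pending", "uploaded", "converted", "metadata-detected",
    "chapters-detected", "confirmed", "analyzed",
]


def next_stage(current: str) -> Optional[str]:
    """Return the stage that follows `current`, or None at the end of the lifecycle."""
    index = STAGES.index(current)  # raises ValueError for unknown stage names
    return STAGES[index + 1] if index + 1 < len(STAGES) else None


def has_reached(current: str, target: str) -> bool:
    """True if `current` is at or beyond `target` in the lifecycle."""
    return STAGES.index(current) >= STAGES.index(target)
```

For example, `has_reached(manuscript.status, "converted")` could decide whether a conversion step still needs to run.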

Technical Implementation

Database Schema

-- Primary status tracking
manuscripts.status VARCHAR(50) DEFAULT 'pending'

-- Detailed metadata tracking
manuscripts.file_metadata JSONB
{
  "status": {
    "is_pending": true,
    "is_uploaded": true,
    "is_converted": true,
    "is_metadata_detected": true,
    "is_chapters_detected": true,
    "is_confirmed": false,
    "is_analyzed": false,
    "last_updated": "2025-01-22T10:30:00Z",
    "stage_timestamps": {
      "uploaded_at": "2025-01-22T10:25:00Z",
      "converted_at": "2025-01-22T10:27:00Z",
      "metadata_detected_at": "2025-01-22T10:28:00Z",
      "chapters_detected_at": "2025-01-22T10:30:00Z",
      "confirmed_at": null,
      "analyzed_at": null
    }
  }
}
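
To make the shape of the metadata concrete, here is a sketch of what the status object for a brand-new (pending) manuscript might look like. `default_status_obj` is a hypothetical helper written for illustration. The real `get_status_obj()` is not shown in this guide and may build its defaults differently.

```python
# Stage names as listed in the lifecycle; hyphens become underscores in JSON keys.
STAGES = [
    "pending", "uploaded", "converted", "metadata-detected",
    "chapters-detected", "confirmed", "analyzed",
]


def default_status_obj() -> dict:
    """Hypothetical helper: status object for a manuscript that is still pending."""
    keys = [stage.replace("-", "_") for stage in STAGES]
    status = {f"is_{key}": key == "pending" for key in keys}
    status["last_updated"] = None
    # 'pending' has no timestamp of its own; every later stage gets a slot.
    status["stage_timestamps"] = {
        f"{key}_at": None for key in keys if key != "pending"
    }
    return status
```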

Status Update Method

from datetime import datetime

from sqlalchemy.orm.attributes import flag_modified


def update_status(self, new_stage, save=True):
    """Update manuscript to new stage"""
    stages = ['pending', 'uploaded', 'converted', 'metadata-detected', 'chapters-detected', 'confirmed', 'analyzed']

    # Validate before touching any state
    if new_stage not in stages:
        raise ValueError(f"Invalid stage: {new_stage}")

    if not self.file_metadata:
        self.file_metadata = {}

    status_obj = self.get_status_obj()

    # Mark the new stage and every earlier stage as reached; later stages become False
    stage_index = stages.index(new_stage)
    for i, stage in enumerate(stages):
        # Convert hyphenated stage names to underscore for JSON keys
        stage_key = stage.replace('-', '_')
        status_obj[f'is_{stage_key}'] = i <= stage_index

    # Update timestamp (one clock reading, so all fields agree)
    now = datetime.utcnow()
    if new_stage != 'pending':
        # Convert hyphenated stage names to underscore for timestamp keys
        timestamp_key = new_stage.replace('-', '_')
        # setdefault guards against a status object without 'stage_timestamps'
        status_obj.setdefault('stage_timestamps', {})[f'{timestamp_key}_at'] = now.isoformat()

    status_obj['last_updated'] = now.isoformat()

    # Update both status column and metadata
    self.status = new_stage
    self.file_metadata['status'] = status_obj
    # In-place changes to a JSONB column are not tracked unless it uses a mutable type
    flag_modified(self, 'file_metadata')
    self.updated_at = now

    if save:
        db.session.commit()

Status Transition Points

1. Upload → Uploaded

Location: app/routes/main.py:297

@main_bp.route('/api/manuscripts/upload', methods=['POST'])
def api_upload_manuscript():
    # ... file upload logic ...
    manuscript.update_status('uploaded', save=True)

2. Convert → Converted

Location: app/routes/main.py:884

@main_bp.route('/api/manuscripts/<manuscript_id>/convert', methods=['POST'])
def convert_manuscript(manuscript_id):
    # ... conversion logic ...
    manuscript.update_status('converted', save=True)

3. Converted → Metadata-Detected

Location: app/services/import_orchestrator.py (front-matter completion)

# After front-matter/TOC detection completes
manuscript.update_status('metadata-detected', save=True)

4. Metadata-Detected → Chapters-Detected

Location: agent_worker.py:934 (async completion)

# Direct database update to avoid circular imports
cursor.execute(
    "UPDATE manuscripts SET status = %s, file_metadata = %s WHERE id = %s",
    ('chapters-detected', json.dumps(updated_metadata), manuscript_id)
)
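
The direct update above assumes that `updated_metadata` already carries the new flags and timestamp, because the worker bypasses `update_status()`. The sketch below shows one way the worker might prepare that value so it matches what `update_status('chapters-detected')` would have written. `build_chapters_detected_metadata` is a hypothetical helper, not the worker's actual code.

```python
import copy
import json
from datetime import datetime, timezone
from typing import Optional


def build_chapters_detected_metadata(
    file_metadata: Optional[dict], now: Optional[datetime] = None
) -> dict:
    """Return a copy of file_metadata with the 'chapters-detected' stage applied."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    updated = copy.deepcopy(file_metadata or {})  # never mutate the caller's dict
    status = updated.get("status") or {}
    updated["status"] = status

    # Stages up to and including chapters-detected are reached; later ones are not.
    for key in ("pending", "uploaded", "converted",
                "metadata_detected", "chapters_detected"):
        status[f"is_{key}"] = True
    for key in ("confirmed", "analyzed"):
        status[f"is_{key}"] = False

    status.setdefault("stage_timestamps", {})["chapters_detected_at"] = stamp
    status["last_updated"] = stamp
    return updated
```

The result can then be passed through `json.dumps()` exactly as in the `cursor.execute()` call.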

5. Chapters-Detected → Confirmed

Location: app/routes/main.py:1751

@main_bp.route('/api/manuscripts/<manuscript_id>/finalize-import', methods=['POST'])
def finalize_import(manuscript_id):
    # ... finalization logic ...
    manuscript.update_status('confirmed', save=True)

6. Confirmed → Analyzed

Location: app/analysis/analysis_service.py:209-221

@staticmethod
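def get_analysis_status(manuscript_id: str):
    # ... load all_analyses for this manuscript ...
    # When all analyses complete (all() is True for an empty list, so require at least one)
    if all_analyses and all(a.status == 'completed' for a in all_analyses):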
        manuscript = Manuscript.query.get(manuscript_id)
        if manuscript and manuscript.status != 'analyzed':
            manuscript.update_status('analyzed', save=True)

UI Integration

Frontend Status Initialization

The import page now initializes from server status rather than client state:

function importManuscript(manuscriptId, fileType, csrfToken, manuscriptStatus, currentStatus) {
    return {
        // Server-provided status, exposed so methods and the template can read it
        manuscriptStatus,
        currentStatus,
        // ... other component state (state, step, ...) ...

        initializeFromServerStatus() {
            // Set UI state based on server status
            if (this.manuscriptStatus.is_uploaded) {
                this.state.conversion = 'completed';
            }
            if (this.manuscriptStatus.is_chapters_detected) {
                this.state.chapters = 'completed';
            }
            if (this.manuscriptStatus.is_confirmed) {
                this.state.confirmation = 'completed';
            }

            // Navigate to appropriate step
            if (this.currentStatus === 'confirmed' || this.currentStatus === 'chapters-detected') {
                this.step = 3; // Show confirmation step
            } else if (this.currentStatus === 'converted' || this.currentStatus === 'metadata-detected') {
                this.step = 2; // Show chapter detection
            }
        }
    };
}

Progress Steps Visual State

Steps now reflect server status with proper visual indicators:

<!-- Step circle classes based on server status -->
<div :class="{
    'bg-emerald-500 text-white': state.conversion === 'completed' || manuscriptStatus?.is_uploaded,
    'bg-indigo-500 text-white': state.conversion === 'processing',
    'bg-gray-300': state.conversion === 'idle' && !manuscriptStatus?.is_uploaded
}">

Error Handling

Status Validation

def validate_status_transition(current_status, new_status):
    """Validate that status transition is allowed"""
    stages = ['pending', 'uploaded', 'converted', 'metadata-detected', 'chapters-detected', 'confirmed', 'analyzed']

    if new_status not in stages:
        raise ValueError(f"Invalid status: {new_status}")
    if current_status not in stages:
        raise ValueError(f"Invalid current status: {current_status}")

    current_index = stages.index(current_status)
    new_index = stages.index(new_status)

    # Allow moving forward or staying same
    if new_index < current_index:
        raise ValueError(f"Cannot move backwards from {current_status} to {new_status}")

Rollback Scenarios

  • Upload Failure: Status remains pending
  • Conversion Failure: Status remains uploaded, user can retry
  • Chapter Detection Failure: Status remains metadata-detected, user can retry
  • Analysis Failure: Individual modules fail, manuscript status unchanged
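
The scenarios above share one rule: the status advances only after the work for a stage has succeeded, so a failure leaves the manuscript where it was and the step can be retried. A minimal sketch of that pattern follows. `run_stage` is a hypothetical wrapper, and it assumes only that the manuscript object exposes `status` and `update_status()` as described in this guide.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def run_stage(manuscript, stage: str, work: Callable[[], None]) -> bool:
    """Run `work()` and advance the manuscript to `stage` only if it succeeds."""
    try:
        work()
    except Exception:
        # Leave the status untouched so the user (or a retry job) can try again.
        logger.exception(
            "Stage %r failed; status stays %r", stage, manuscript.status
        )
        return False
    manuscript.update_status(stage, save=True)
    return True
```

A route could call `run_stage(manuscript, 'converted', do_conversion)` and return an error response when it yields `False`; `do_conversion` here stands for whatever callable performs the conversion.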

Monitoring & Debugging

Status Queries

-- Check status distribution
SELECT status, COUNT(*) FROM manuscripts GROUP BY status;

-- Find manuscripts stuck in processing
SELECT id, title, status, updated_at 
FROM manuscripts 
WHERE status IN ('uploaded', 'converted', 'metadata-detected') 
AND updated_at < NOW() - INTERVAL '1 hour';

-- Check metadata consistency
SELECT id, status, file_metadata->'status'->>'last_updated' as last_meta_update
FROM manuscripts 
WHERE file_metadata->'status'->>'last_updated' IS NOT NULL;
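
The queries above return the raw values. To check whether the status column and the metadata flags actually agree, the rows can be compared in application code. The sketch below derives the flags that `update_status()` would have written for a given status and reports any that differ. `expected_flags` and `find_flag_mismatches` are hypothetical helpers, and a missing flag is treated as false by assumption.

```python
from typing import Dict, List

STAGES = [
    "pending", "uploaded", "converted", "metadata-detected",
    "chapters-detected", "confirmed", "analyzed",
]


def expected_flags(status: str) -> Dict[str, bool]:
    """Flags implied by the status column: True up to and including `status`."""
    index = STAGES.index(status)  # raises ValueError for unknown statuses
    return {
        f"is_{stage.replace('-', '_')}": i <= index
        for i, stage in enumerate(STAGES)
    }


def find_flag_mismatches(status: str, status_obj: dict) -> List[str]:
    """Return the flag names in `status_obj` that disagree with `status`."""
    return sorted(
        key
        for key, value in expected_flags(status).items()
        if bool(status_obj.get(key, False)) != value
    )
```

An empty list means the two tracking mechanisms are consistent for that row.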

Common Issues

  1. Status Stuck in Processing
     • Check background workers are running
     • Verify task completion in agent worker logs
     • Manual status update may be required

  2. UI Not Reflecting Server Status
     • Verify manuscript_status passed to template
     • Check initializeFromServerStatus() execution
     • Confirm browser cache not serving stale JavaScript

  3. Missing Timestamps
     • Check update_status() calls include proper stage
     • Verify database migrations applied correctly
     • Ensure flag_modified() called for JSONB updates

Migration & Backwards Compatibility

Legacy Analysis Status

For manuscripts created before the analysis configuration system:

def _get_legacy_status(all_analyses: List[ChapterAnalysis], manuscript_id: str):
    """Fallback for manuscripts without ManuscriptAnalysisConfig"""
    # all() is True for an empty list, so require at least one analysis
    if all_analyses and all(a.status == 'completed' for a in all_analyses):
        # Still updates to 'analyzed' status
        manuscript = Manuscript.query.get(manuscript_id)
        if manuscript and manuscript.status != 'analyzed':
            manuscript.update_status('analyzed', save=True)

This ensures backwards compatibility while maintaining identical functionality.

Best Practices

For Developers

  1. Always use update_status() - Don't manually set status columns
  2. Check current status before transitions to avoid conflicts
  3. Include error handling for invalid status transitions
  4. Log status changes for debugging and audit trails
  5. Test UI state with different server status combinations

For Operations

  1. Monitor stuck manuscripts using status queries
  2. Verify worker health for async status updates
  3. Check database consistency between status column and metadata
  4. Backup before migrations that affect status tracking

API Examples

GET /api/manuscripts/{id}/status

Response:

{
  "success": true,
  "status": "converted",
  "status_obj": {
    "is_pending": false,
    "is_uploaded": true,
    "is_converted": true,
    "is_metadata_detected": false,
    "is_chapters_detected": false,
    "is_confirmed": false,
    "is_analyzed": false,
    "last_updated": "2025-01-22T10:27:00Z",
    "stage_timestamps": {
      "uploaded_at": "2025-01-22T10:25:00Z",
      "converted_at": "2025-01-22T10:27:00Z"
    }
  },
  "file_metadata_status": {
    "is_uploaded": true,
    "is_converted": true
  },
  "is_uploaded": true,
  "is_converted": true,
  "is_metadata_detected": false,
  "is_chapters_detected": false,
  "is_confirmed": false
}
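
A client that waits for background processing typically polls this endpoint until the `status` field reaches a given stage. The sketch below shows that loop with the HTTP call injected as a callable, so it does not depend on any particular HTTP library. `wait_for_stage` and its parameters are hypothetical, and only the `success` and `status` fields from the response above are assumed.

```python
import time
from typing import Callable

STAGES = [
    "pending", "uploaded", "converted", "metadata-detected",
    "chapters-detected", "confirmed", "analyzed",
]


def wait_for_stage(
    fetch_status: Callable[[], dict],
    target: str,
    attempts: int = 30,
    delay: float = 2.0,
) -> bool:
    """Poll until the manuscript is at or beyond `target`; False if attempts run out."""
    target_index = STAGES.index(target)
    for attempt in range(attempts):
        payload = fetch_status()  # parsed JSON body of GET /api/manuscripts/{id}/status
        status = payload.get("status")
        if payload.get("success") and status in STAGES:
            if STAGES.index(status) >= target_index:
                return True
        if attempt + 1 < attempts:
            time.sleep(delay)  # no sleep after the final attempt
    return False
```

Here `fetch_status` would wrap whatever request code the client already uses and return the decoded JSON dictionary.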

Check Import Status (Legacy)

⚠️ DEPRECATED - Use /status endpoint instead.

GET /api/manuscripts/{id}/import-status

Response:

{
  "success": true,
  "import_status": {
    "conversion": {
      "status": "completed",
      "message": "Conversion completed successfully"
    },
    "chapter_detection": {
      "status": "completed", 
      "message": "Successfully detected 12 chapters"
    },
    "finalization": {
      "status": "completed",
      "message": "Import completed successfully"
    }
  }
}

Update Status (Internal)

# In application code
manuscript = Manuscript.query.get(manuscript_id)
manuscript.update_status('analyzed', save=True)

# Verify update
assert manuscript.status == 'analyzed'
assert manuscript.get_status_obj()['is_analyzed'] is True

Summary

The ChapterWise manuscript status system tracks progress in two places: the manuscripts.status column holds the current workflow stage, and file_metadata.status holds per-stage flags and timestamps. Keeping the two in sync on every transition lets the UI initialize from server state, keeps manuscripts created before the analysis configuration system working, and leaves failed steps retryable because the status only advances on success.