Impact Analysis System - Technical Documentation
System Architecture
High-Level Architecture
Presentation Layer: Django Templates, AJAX APIs, Static Assets
Business Logic Layer: Data Processor, Statistical Engine, Qualitative Analyzer
Data Access Layer: Django ORM, File Storage, Cache Layer
Infrastructure Layer: PostgreSQL, File System, Redis Cache
Data Flow Pipeline
Data Sources → Processing Pipeline → Quality Assessment → Unified Data Model
↓
Method Execution Engine → Results Generation → AI Interpretation → Export Generation
Output Specifications
Statistical Result Record Structure
| Component | Field | Type | Description |
| Identification | analysis_job_id | UUID | Unique analysis identifier |
| | method | Enumerated | Statistical method used |
| | outcome_variable | String | Variable being analyzed |
| | created_timestamp | DateTime | When result was created |
| Core Statistics | treatment_effect | Decimal(15,6) | Estimated treatment effect |
| | standard_error | Decimal(15,6) | Standard error of estimate |
| | p_value | Decimal(15,10) | Statistical significance |
| | confidence_interval_lower | Decimal(15,6) | Lower CI bound |
| | confidence_interval_upper | Decimal(15,6) | Upper CI bound |
| | confidence_level | Decimal(5,2) | CI level (default 95.00) |
| Sample Info | treatment_group_size | Integer | Treatment group N |
| | control_group_size | Integer | Control group N |
| | total_sample_size | Integer | Total sample N |
| | effective_sample_size | Integer | Effective N after matching |
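The record structure above can be sketched as a plain Python dataclass. This is an illustration only, not the system's actual model: the class name StatisticalResult, the problems() helper, the method value "did", and the outcome variable name are all hypothetical, and rounding to the declared scale on construction is an assumption about how the Decimal columns behave.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal
from uuid import UUID, uuid4

SCALE_6 = Decimal("0.000001")       # scale of the Decimal(15,6) fields
SCALE_10 = Decimal("0.0000000001")  # scale of the Decimal(15,10) p_value


@dataclass
class StatisticalResult:
    # Identification
    analysis_job_id: UUID
    method: str
    outcome_variable: str
    created_timestamp: datetime
    # Core statistics
    treatment_effect: Decimal
    standard_error: Decimal
    p_value: Decimal
    confidence_interval_lower: Decimal
    confidence_interval_upper: Decimal
    # Sample info
    treatment_group_size: int
    control_group_size: int
    total_sample_size: int
    effective_sample_size: int
    confidence_level: Decimal = Decimal("95.00")

    def __post_init__(self) -> None:
        # Round each statistic to the scale declared in the table (assumed behavior).
        for name in ("treatment_effect", "standard_error",
                     "confidence_interval_lower", "confidence_interval_upper"):
            setattr(self, name, getattr(self, name).quantize(SCALE_6))
        self.p_value = self.p_value.quantize(SCALE_10)

    def problems(self) -> list[str]:
        """Return consistency problems as readable strings (empty if none)."""
        found = []
        if not Decimal(0) <= self.p_value <= Decimal(1):
            found.append("p_value must lie in [0, 1]")
        if self.standard_error < 0:
            found.append("standard_error must be non-negative")
        if self.confidence_interval_lower > self.confidence_interval_upper:
            found.append("confidence interval bounds are reversed")
        if min(self.treatment_group_size, self.control_group_size,
               self.total_sample_size, self.effective_sample_size) < 0:
            found.append("sample sizes must be non-negative")
        return found


# Hypothetical example values, for illustration only.
result = StatisticalResult(
    analysis_job_id=uuid4(), method="did", outcome_variable="household_income",
    created_timestamp=datetime.now(timezone.utc),
    treatment_effect=Decimal("1.23456789"), standard_error=Decimal("0.5"),
    p_value=Decimal("0.0134"),
    confidence_interval_lower=Decimal("0.25"), confidence_interval_upper=Decimal("2.21"),
    treatment_group_size=100, control_group_size=120,
    total_sample_size=220, effective_sample_size=180,
)
print(result.treatment_effect, result.problems())
```

In a Django deployment these fields would more likely be model fields (for example DecimalField with max_digits and decimal_places); the dataclass keeps the sketch runnable without a configured project.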
Significance Classification
| Level | P-Value Range | Description |
| highly_significant | p < 0.01 | Strong evidence against null |
| significant | 0.01 ≤ p < 0.05 | Conventional significance |
| marginally_significant | 0.05 ≤ p < 0.10 | Weak evidence |
| not_significant | p ≥ 0.10 | Insufficient evidence against null |
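The classification can be expressed as a small function that checks thresholds from strictest to loosest, so each p-value receives the first level it qualifies for. The function name classify_significance is hypothetical; the thresholds and labels come from the table above.

```python
from decimal import Decimal


def classify_significance(p_value: Decimal) -> str:
    """Map a p-value to the significance level named in the table."""
    if not Decimal(0) <= p_value <= Decimal(1):
        raise ValueError(f"p-value out of range: {p_value}")
    # Checked from strictest to loosest; the first match wins.
    if p_value < Decimal("0.01"):
        return "highly_significant"
    if p_value < Decimal("0.05"):
        return "significant"
    if p_value < Decimal("0.10"):
        return "marginally_significant"
    return "not_significant"


print(classify_significance(Decimal("0.032")))
```

Using Decimal rather than float keeps boundary values such as exactly 0.05 or 0.10 free of binary rounding surprises.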
Qualitative Result Structure
Theme Information:
• Theme Name: String (255)
• Theme Description: Text
• Theme Keywords: JSON Array
• Theme Frequency: Integer
• Theme Percentage: Decimal (5,2)
Sentiment Analysis:
• Sentiment Label: very_positive, positive, neutral, negative, very_negative
• Sentiment Score: Decimal (5,3) [-1.000 to 1.000]
• Confidence Score: Decimal (5,2)
Supporting Evidence:
• Sample Quotes: JSON Array (max 5)
• Representative Examples: JSON Array
• Context Information: JSON Object
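A minimal sketch of how the constraints listed above (name length, allowed sentiment labels, score range, quote limit) might be enforced. The class name ThemeResult and its attribute names are hypothetical, and truncating extra quotes rather than rejecting them is an assumption; the source only states a maximum of 5.

```python
from dataclasses import dataclass, field
from decimal import Decimal

SENTIMENT_LABELS = ("very_positive", "positive", "neutral", "negative", "very_negative")
MAX_SAMPLE_QUOTES = 5


@dataclass
class ThemeResult:
    name: str
    description: str
    keywords: list[str]
    frequency: int
    percentage: Decimal
    sentiment_label: str
    sentiment_score: Decimal
    confidence_score: Decimal
    sample_quotes: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if len(self.name) > 255:
            raise ValueError("theme name exceeds 255 characters")
        if self.sentiment_label not in SENTIMENT_LABELS:
            raise ValueError(f"unknown sentiment label: {self.sentiment_label}")
        if not Decimal("-1") <= self.sentiment_score <= Decimal("1"):
            raise ValueError("sentiment score must lie in [-1.000, 1.000]")
        # Assumption: keep the first five quotes instead of raising an error.
        self.sample_quotes = list(self.sample_quotes[:MAX_SAMPLE_QUOTES])


# Hypothetical example values, for illustration only.
theme = ThemeResult(
    name="Access to training", description="Mentions of training availability",
    keywords=["training", "workshop"], frequency=42, percentage=Decimal("17.50"),
    sentiment_label="positive", sentiment_score=Decimal("0.612"),
    confidence_score=Decimal("88.00"),
    sample_quotes=[f"quote {i}" for i in range(7)],
)
print(len(theme.sample_quotes))
```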
Export Formats
Power BI Export Package
Data Tables (Excel Workbook):
• Analysis_Summary: Metadata, project info, summary statistics
• Statistical_Results: Method info, effect sizes, significance tests
• Participant_Data: Demographics, baseline/midline/endline values
• Qualitative_Results: Themes, sentiment analysis, word frequencies
• Data_Dictionary: Table descriptions, field definitions
Metadata (JSON):
• Dataset Information: Name, description, version, contact
• Table Relationships: Primary keys, foreign keys, cardinality
• Suggested Measures: Treatment effects, significance rates
• Recommended Visualizations: Bar charts, scatter plots, tables
PDF Report Structure
| Report Type | Pages | Target Audience | Key Sections |
| Executive | 10-15 | Decision-makers | Summary, key findings, recommendations |
| Technical | 20-30 | Researchers | Detailed methodology, comprehensive results |
| Comprehensive | 40+ | All stakeholders | Complete analysis, full documentation |
Excel Export Structure
Workbook Sheets:
• Summary: Analysis overview, key statistics
• Participant_Data: Individual participant records
• Statistical_Results: Method results with full details
• Qualitative_Results: Themes, sentiment, word frequencies
• Treatment_Assignments: Group assignments and propensity scores
• Data_Dictionary: Variable definitions and descriptions
• Metadata: Analysis information, processing details
Comprehensive Export Package (ZIP)
Archive Contents:
📁 powerbi/ - Dataset files and metadata
📁 reports/ - PDF reports (executive, technical, comprehensive)
📁 data/ - Raw analysis data in multiple formats
📁 qualitative/ - Text analysis exports for external tools
📁 visualizations/ - Charts and dashboard previews
📁 documentation/ - User guides and methodology notes
📁 metadata/ - Export summary and configuration files
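The archive layout above could be assembled in memory with the standard zipfile module. This is a sketch under stated assumptions: the function name build_export_package, the example file names, and the export_summary.json file are hypothetical; only the seven top-level folder names come from the list above.

```python
import io
import json
import zipfile

EXPORT_FOLDERS = ("powerbi", "reports", "data", "qualitative",
                  "visualizations", "documentation", "metadata")


def build_export_package(files: dict[str, bytes]) -> bytes:
    """Zip files keyed by archive path, e.g. 'reports/executive.pdf'."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, content in sorted(files.items()):
            top_level = path.split("/", 1)[0]
            if top_level not in EXPORT_FOLDERS:
                raise ValueError(f"unexpected folder in export package: {path}")
            archive.writestr(path, content)
        # Hypothetical summary file placed in metadata/.
        summary = {
            "file_count": len(files),
            "folders": sorted({p.split("/", 1)[0] for p in files}),
        }
        archive.writestr("metadata/export_summary.json", json.dumps(summary, indent=2))
    return buffer.getvalue()


package = build_export_package({
    "reports/executive.pdf": b"%PDF-1.7",
    "data/results.csv": b"method,effect\ndid,1.23\n",
})
print(zipfile.ZipFile(io.BytesIO(package)).namelist())
```

Building the archive in a BytesIO buffer avoids temporary files, which suits returning the package directly from a web response; very large exports would more likely be streamed to disk.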
Component Architecture
Data Processing Components
CSV Upload Processor: Multi-format support, encoding detection, validation, cleaning
Form Response Processor: Project data extraction, form mapping, response linking
Data Validator: Quality assessment, completeness checking, readiness validation
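As an illustration of the encoding detection and cleaning steps, the sketch below tries a short list of candidate encodings and strips whitespace and blank rows. The function name read_csv_bytes and the candidate list are assumptions; the source does not say which detection strategy the CSV Upload Processor uses, and a production system might rely on a dedicated detection library instead.

```python
import csv
import io

# Tried in order; latin-1 accepts every byte sequence, so it acts as the fallback.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1252", "latin-1")


def read_csv_bytes(raw: bytes) -> tuple[list[list[str]], str]:
    """Decode an uploaded CSV and return (rows, encoding_used)."""
    for encoding in CANDIDATE_ENCODINGS:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        rows = [
            [cell.strip() for cell in row]
            for row in csv.reader(io.StringIO(text, newline=""))
            if any(cell.strip() for cell in row)  # drop fully blank rows
        ]
        return rows, encoding
    # Not reachable with the list above, kept in case the candidates change.
    raise ValueError("could not decode upload with any candidate encoding")


upload = "name,city\nZo\u00eb, K\u00f6ln \n\n".encode("cp1252")
rows, used = read_csv_bytes(upload)
print(used, rows)
```

The "utf-8-sig" codec also decodes UTF-8 without a byte order mark, so one entry covers both variants.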
Statistical Engine Components
Causal Inference Methods: DiD, PSM, IV, RDD
Standard Methods: T-Tests, ANOVA, Regression, Descriptive Statistics
Specialized Methods: Survival Analysis, Panel Data, Heckman Selection
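To make one of the causal inference methods concrete, here is the textbook two-group, two-period difference-in-differences (DiD) point estimate. It is a simplified sketch, not the engine's implementation: the function name did_estimate is hypothetical, and it omits standard errors, covariates, and the parallel-trends checks a real analysis would need.

```python
from statistics import mean


def did_estimate(treat_pre, treat_post, control_pre, control_post) -> float:
    """2x2 DiD: change in the treatment group minus change in the control group."""
    treatment_change = mean(treat_post) - mean(treat_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treatment_change - control_change


# Hypothetical outcome values, for illustration only.
effect = did_estimate(
    treat_pre=[10.0, 12.0], treat_post=[15.0, 17.0],
    control_pre=[9.0, 11.0], control_post=[11.0, 13.0],
)
print(effect)  # (16 - 11) - (12 - 10) = 3.0
```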
Qualitative Analyzer Components
Text Processing: Preprocessing, tokenization, lemmatization
Content Analysis: Theme extraction, sentiment analysis, text coding
Pattern Recognition: Word frequency, n-gram analysis, entity recognition
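Word frequency and n-gram counting can be sketched with the standard library alone. The tokenizer here is a deliberately simple regular expression and is an assumption; the lemmatization and entity recognition steps listed above would need an NLP library and are not shown.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Return every run of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


responses = "Training helped. The training helped us."
tokens = tokenize(responses)
word_counts = Counter(tokens)
bigram_counts = Counter(ngrams(tokens, 2))
print(word_counts["training"], bigram_counts[("training", "helped")])
```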
AI Service Components
API Client: Authentication, rate limiting, error handling
Prompt Management: Template management, context optimization
Response Processing: Content extraction, quality assessment, caching
Processing Pipeline
Job Lifecycle States
PENDING → IMPORTING → VALIDATING → CONFIGURING → RUNNING → COMPLETED
↓
FAILED
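The lifecycle can be modeled as a small state machine. The sketch below assumes that each non-terminal state may either advance one step or move to FAILED, and that COMPLETED and FAILED are terminal; the diagram does not say which states can fail, so treat that rule as an assumption. The names JobState, InvalidTransition, and transition are hypothetical.

```python
from enum import Enum


class JobState(str, Enum):
    PENDING = "PENDING"
    IMPORTING = "IMPORTING"
    VALIDATING = "VALIDATING"
    CONFIGURING = "CONFIGURING"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


_HAPPY_PATH = [JobState.PENDING, JobState.IMPORTING, JobState.VALIDATING,
               JobState.CONFIGURING, JobState.RUNNING, JobState.COMPLETED]

# Assumption: every non-terminal state may advance one step or fail.
ALLOWED = {
    current: {following, JobState.FAILED}
    for current, following in zip(_HAPPY_PATH, _HAPPY_PATH[1:])
}


class InvalidTransition(Exception):
    """Raised when a job is asked to move to a state it cannot reach."""


def transition(current: JobState, target: JobState) -> JobState:
    """Return the target state if the move is allowed, otherwise raise."""
    if target not in ALLOWED.get(current, set()):
        raise InvalidTransition(f"{current.value} -> {target.value} is not allowed")
    return target


state = transition(JobState.PENDING, JobState.IMPORTING)
print(state.value)
```

Centralizing the allowed moves in one table makes it harder for a job to skip validation or to leave a terminal state by accident.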
Real-time Progress Tracking
Metrics: Current method, completion percentage, time elapsed, ETA, error count
Channels: WebSocket, AJAX polling, server-sent events, email notifications
Persistence: Database storage, progress logs, checkpoints, recovery points
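The progress metrics can be derived from a few counters. The sketch below assumes progress is measured in equally weighted steps and that the ETA is a linear extrapolation of the average time per completed step; the source does not specify either. The class name ProgressSnapshot and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProgressSnapshot:
    current_method: str
    completed_steps: int
    total_steps: int
    elapsed_seconds: float
    error_count: int = 0

    @property
    def completion_percentage(self) -> float:
        if self.total_steps <= 0:
            return 0.0
        return round(100 * self.completed_steps / self.total_steps, 1)

    @property
    def eta_seconds(self) -> Optional[float]:
        """Estimated seconds remaining, or None before the first step finishes."""
        if self.completed_steps <= 0:
            return None
        seconds_per_step = self.elapsed_seconds / self.completed_steps
        return seconds_per_step * (self.total_steps - self.completed_steps)


snapshot = ProgressSnapshot("psm", completed_steps=3, total_steps=12, elapsed_seconds=60.0)
print(snapshot.completion_percentage, snapshot.eta_seconds)
```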
Error Handling Strategy
Detection: Data errors, processing errors, system errors
Response: Recovery actions, user notification, system response
Outcomes: Graceful degradation, partial results, safe failure
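One way to obtain partial results and safe failure is to run each method in isolation and record failures instead of aborting the whole job. This is a sketch of that idea only: the function name run_methods, the status strings, and the example method names are hypothetical.

```python
from typing import Any, Callable


def run_methods(methods: dict[str, Callable[[], Any]]) -> dict[str, Any]:
    """Run every method, keeping successful results even if others fail."""
    results: dict[str, Any] = {}
    errors: dict[str, str] = {}
    for name, run in methods.items():
        try:
            results[name] = run()
        except Exception as exc:  # record the failure and continue with the rest
            errors[name] = f"{type(exc).__name__}: {exc}"
    if not errors:
        status = "completed"
    elif results:
        status = "partial"
    else:
        status = "failed"
    return {"status": status, "results": results, "errors": errors}


def failing_method() -> float:
    raise ValueError("no control group")


outcome = run_methods({"t_test": lambda: 0.42, "psm": failing_method})
print(outcome["status"], outcome["errors"])
```

Catching Exception broadly is intentional here so that one faulty method cannot take down the others; a real system would likely also log the traceback for diagnosis.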
Security Framework
Authentication: Multi-factor authentication, single sign-on
Authorization: Role-based access, organization-based isolation, project-level permissions
Data Protection: Encryption at rest and in transit, PII handling
API Security: Rate limiting, input validation, injection prevention
Compliance: GDPR, data governance, audit logging