Phoenix Technical Reference

This document provides comprehensive technical specifications for Phoenix, including mathematical models, validation rules, API endpoints, and implementation details.


Mathematical Models

Signal Generation Formula

Phoenix generates time series using a superposition model:

signal(t) = μ + trend(t) + Σ oscillation_i(t) + Σ transient_j(t) + noise(t)

Where:
  μ = mean (baseline value)
  t = time in seconds from start
  trend(t) = linear trend component
  oscillation_i(t) = ith oscillatory component
  transient_j(t) = jth transient event
  noise(t) = random noise component

Components in Detail

Base Signal (Mean)

baseline = μ

Where:
  μ ∈ ℝ (any real number)

Noise Component

Gaussian (normal) distribution with zero mean:

noise(t) ~ N(0, σ²)

Where:
  σ = noise amplitude (standard deviation)
  σ ≥ 0
  N(0, σ²) = normal distribution with mean 0, variance σ²

Implementation:

noise = np.random.normal(0, sigma, size=n_points)

Statistical Properties:

  • Mean: 0
  • Standard Deviation: σ
  • ~68% of values within ±σ
  • ~95% of values within ±2σ
  • ~99.7% of values within ±3σ

Linear Trend

trend(t) = m × i

Where:
  m = slope (rate of change per sample)
  i = sample index (0, 1, 2, ..., n-1)
  m ∈ ℝ (any real number, positive or negative)

Implementation:

trend = slope * np.arange(n_points)

Total Change:

Δtotal = m × (n_points - 1)

Oscillation (Sine Wave)

Each oscillation follows:

oscillation(t) = A × sin(2π × f × t + φ)

Where:
  A = amplitude (peak height)
  f = frequency (Hz, cycles per second)
  t = time (seconds)
  φ = phase offset (radians)

Constraints:
  A ≥ 0
  f > 0
  φ ∈ ℝ

Frequency-Period Relationship:

f = 1 / P
P = 1 / f

Where:
  P = period (seconds per cycle)

Implementation (frequency-based):

time_array = np.arange(n_points) * time_step
oscillation = amplitude * np.sin(2 * np.pi * frequency * time_array + phase)

Implementation (period-based):

frequency = 1.0 / period
oscillation = amplitude * np.sin(2 * np.pi * frequency * time_array + phase)
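
The snippets above can be combined into one superposition. The following is a minimal sketch, not Phoenix's actual code: the function name `compose_signal` and the optional `rng` argument are assumptions made for illustration, and only a single oscillation is included.

```python
import numpy as np

def compose_signal(n_points, time_step, mean, sigma, slope,
                   amplitude, frequency, phase=0.0, rng=None):
    """Sum baseline, linear trend, one sine oscillation and Gaussian noise."""
    # Hypothetical: a Generator is accepted so that tests can pass a seeded one.
    rng = np.random.default_rng() if rng is None else rng
    index = np.arange(n_points)              # sample index i = 0 .. n-1
    time_array = index * time_step           # time t in seconds
    trend = slope * index                    # slope is per sample, not per second
    oscillation = amplitude * np.sin(2 * np.pi * frequency * time_array + phase)
    noise = rng.normal(0.0, sigma, size=n_points)
    return mean + trend + oscillation + noise

# With sigma = 0 the result is fully deterministic.
signal = compose_signal(n_points=11, time_step=0.1, mean=100.0, sigma=0.0,
                        slope=0.5, amplitude=2.0, frequency=1.0)
```

Note that the trend is indexed by sample while the oscillation is indexed by time in seconds, as in the formulas above.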

Transient Response (Second-Order System)

Each transient models an impulse or step response:

transient(t) = A × h(t - t₀)    for t ≥ t₀
transient(t) = 0                  for t < t₀

Where:
  A  = amplitude (peak height)
  t₀ = onset time (seconds)
  h  = normalized response function

Impulse Response

Underdamped (ζ < 1):

h(τ) = exp(-ζωₙτ) × sin(ωdτ) / peak

Where:
  ωₙ = 2π × fₙ (natural angular frequency)
  ωd = ωₙ√(1 - ζ²) (damped natural frequency)
  peak = exp(-ζωₙt_p) × sin(ωd × t_p)
  t_p = atan2(ωd, ζωₙ) / ωd (time of peak)

Critically Damped (ζ = 1):

h(τ) = ωₙτ × exp(-ωₙτ) / e⁻¹

Overdamped (ζ > 1):

h(τ) = exp(-ζωₙτ) × sinh(ω'τ) / max(|h|)

Where:
  ω' = ωₙ√(ζ² - 1)

Step Response

Underdamped (ζ < 1):

h(τ) = 1 - exp(-ζωₙτ) × [cos(ωdτ) + (ζ/√(1-ζ²)) × sin(ωdτ)]

Critically Damped (ζ = 1):

h(τ) = 1 - (1 + ωₙτ) × exp(-ωₙτ)

Overdamped (ζ > 1):

h(τ) = 1 - exp(-ζωₙτ) × [cosh(ω'τ) + (ζ/√(ζ²-1)) × sinh(ω'τ)]

Implementation:

# Impulse response is normalized so peak magnitude ≈ 1.0
# Step response settles to 1.0 (with possible overshoot for underdamped)
# Final signal: amplitude * response
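
The response formulas above could be implemented roughly as follows. This is a sketch under stated assumptions: the function names and the `onset` handling via `np.where` are illustrative, and the overdamped impulse response is normalized over the sampled grid because the source only specifies `max(|h|)`.

```python
import numpy as np

def impulse_response(tau, f_n, zeta):
    """Normalized impulse response h(tau) for tau >= 0."""
    w_n = 2 * np.pi * f_n
    if zeta < 1:  # underdamped
        w_d = w_n * np.sqrt(1 - zeta**2)
        t_p = np.arctan2(w_d, zeta * w_n) / w_d      # time of the first peak
        peak = np.exp(-zeta * w_n * t_p) * np.sin(w_d * t_p)
        return np.exp(-zeta * w_n * tau) * np.sin(w_d * tau) / peak
    if zeta == 1:  # critically damped: peak value e^-1 at tau = 1/w_n
        return w_n * tau * np.exp(-w_n * tau) / np.exp(-1)
    w_o = w_n * np.sqrt(zeta**2 - 1)                 # overdamped
    h = np.exp(-zeta * w_n * tau) * np.sinh(w_o * tau)
    return h / np.max(np.abs(h))                     # needs at least one tau > 0

def step_response(tau, f_n, zeta):
    """Step response h(tau) for tau >= 0; settles to 1.0."""
    w_n = 2 * np.pi * f_n
    decay = np.exp(-zeta * w_n * tau)
    if zeta < 1:
        w_d = w_n * np.sqrt(1 - zeta**2)
        return 1 - decay * (np.cos(w_d * tau)
                            + zeta / np.sqrt(1 - zeta**2) * np.sin(w_d * tau))
    if zeta == 1:
        return 1 - (1 + w_n * tau) * np.exp(-w_n * tau)
    w_o = w_n * np.sqrt(zeta**2 - 1)
    return 1 - decay * (np.cosh(w_o * tau)
                        + zeta / np.sqrt(zeta**2 - 1) * np.sinh(w_o * tau))

def transient(t, amplitude, onset, f_n, zeta, response_type="impulse"):
    """amplitude * h(t - onset) for t >= onset, 0 before the onset."""
    tau = np.clip(t - onset, 0.0, None)              # avoid negative tau
    if response_type == "impulse":
        h = impulse_response(tau, f_n, zeta)
    else:
        h = step_response(tau, f_n, zeta)
    return np.where(t >= onset, amplitude * h, 0.0)

t = np.linspace(0.0, 5.0, 5001)
ringing = transient(t, amplitude=3.0, onset=1.0, f_n=1.0, zeta=0.2)
```

For very long durations the overdamped branch multiplies a vanishing exponential by a growing `sinh`, so a production implementation would likely rewrite it as a difference of two decaying exponentials.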

Multi-Channel Correlation

Correlations are applied using Cholesky decomposition:

Correlation Matrix

For N channels, the correlation matrix C is N×N symmetric:

C = [c_ij] where c_ij = correlation between channel i and j

Properties:
  c_ii = 1 (self-correlation)
  c_ij = c_ji (symmetric)
  -1 ≤ c_ij ≤ 1
  C must be positive semi-definite

Cholesky Decomposition

C = L × L^T

Where:
  L = lower triangular matrix (Cholesky factor)
  L^T = transpose of L

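Existence: A factorization C = L × L^T exists if and only if C is symmetric positive semi-definite. The factor L is unique only when C is strictly positive definite, and standard routines such as np.linalg.cholesky require strict positive definiteness.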

Positive Semi-Definite Test:

C is positive semi-definite ⟺ all eigenvalues λ_i ≥ 0

Phoenix uses tolerance: λ_i ≥ -1×10^-10

Correlation Application Algorithm

For independent signals X = [x_1, x_2, ..., x_N]:

  1. Normalize:

     X_norm = (X - μ_X) / σ_X

     Where:
       μ_X = mean of each channel
       σ_X = standard deviation of each channel

  2. Apply Correlation:

     X_corr = L × X_norm

     Where L is the Cholesky factor of correlation matrix C

  3. Denormalize:

     X_final = X_corr × σ_X + μ_X

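Result: X_final has:

  • Correlation matrix ≈ C (exact only if the input channels are exactly uncorrelated in the sample)
  • Original means μ_X
  • Original standard deviations σ_X (approximately, for the same reason)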
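
The three steps can be sketched as below. This is an illustration, not Phoenix's code: the function name `apply_correlation` and the (n_channels, n_points) array layout are assumptions. `np.linalg.cholesky` requires a strictly positive-definite matrix, so a matrix that only passes the semi-definite tolerance may still fail here; how Phoenix handles that case is not stated in this document.

```python
import numpy as np

def apply_correlation(X, C):
    """Impose target correlation C on independent signals X.

    X has shape (n_channels, n_points); C has shape (n_channels, n_channels).
    """
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=1, keepdims=True)        # per-channel mean
    sigma = X.std(axis=1, keepdims=True)      # per-channel std (must be non-zero)
    X_norm = (X - mu) / sigma                 # step 1: normalize
    L = np.linalg.cholesky(np.asarray(C, dtype=float))
    X_corr = L @ X_norm                       # step 2: mix channels
    return X_corr * sigma + mu                # step 3: denormalize

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 20_000)) * [[2.0], [5.0]] + [[10.0], [-3.0]]
C = np.array([[1.0, 0.8], [0.8, 1.0]])
Y = apply_correlation(X, C)
achieved = np.corrcoef(Y)[0, 1]               # close to 0.8, not exactly 0.8
```

The achieved correlation differs slightly from the target because independently generated finite samples are never perfectly uncorrelated.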

Data Degradation Models

Data Point Removal

Number Mode:

n_remove = N (user-specified count)

Constraints:
  0 < N ≤ total_points
  N ∈ ℤ⁺ (positive integer)

Percentage Mode:

n_remove = ⌈total_points × (P / 100)⌉

Where:
  P = percentage (user-specified)
  ⌈·⌉ = ceiling function

Constraints:
  0 < P ≤ 100

Selection:

indices_to_remove = random_sample(range(total_points), n_remove)

Without replacement (each point removed at most once)
Uniform distribution (each point equally likely)
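
A minimal sketch of the two quantity modes and the uniform selection follows. The function names are hypothetical, and the second error message is an assumption; only "Cannot remove more points than exist" appears in this document.

```python
import math
import numpy as np

def removal_count(total_points, mode, value):
    """Return how many points to remove for 'number' or 'percentage' mode."""
    if mode == "number":
        n_remove = int(value)
    elif mode == "percentage":
        n_remove = math.ceil(total_points * (value / 100))   # ceiling, as specified
    else:
        raise ValueError(f"Unknown mode: {mode!r}")
    if n_remove > total_points:
        raise ValueError("Cannot remove more points than exist")
    if n_remove <= 0:
        raise ValueError("Removal count must be positive")   # hypothetical message
    return n_remove

def choose_indices(total_points, n_remove, rng=None):
    """Pick n_remove distinct indices, each equally likely."""
    rng = np.random.default_rng() if rng is None else rng
    return np.sort(rng.choice(total_points, size=n_remove, replace=False))

n = removal_count(10, "percentage", 25)      # 2.5 rounds up to 3
indices = choose_indices(10, n)
```

Because of the ceiling, any positive percentage removes at least one point.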

Outlier Insertion

Quantity (same as removal):

  • Number mode: exact count
  • Percentage mode: proportion of points

Value Generation:

1. Constant Value:

outlier_value = c (user-specified constant)

c ∈ ℝ (any real number)

2. Random Range:

outlier_value ~ U(v_min, v_max)

Where:
  U(a, b) = uniform distribution between a and b
  v_min < v_max (user-specified)

3. Factor Multiplication:

outlier_value = k × original_value

Where:
  k = multiplication factor (user-specified)
  k ∈ ℝ, k ≠ 0
  original_value = value at selected timestamp before replacement
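
The three value modes could be expressed as a single helper. This is a sketch: the function name `outlier_values` is hypothetical, while the keyword names mirror the request fields `constant_value`, `range_min`, `range_max` and `factor`.

```python
import numpy as np

def outlier_values(original, value_mode, rng=None, constant_value=None,
                   range_min=None, range_max=None, factor=None):
    """Return replacement values for the selected points."""
    original = np.asarray(original, dtype=float)
    rng = np.random.default_rng() if rng is None else rng
    if value_mode == "constant":
        return np.full(original.shape, float(constant_value))
    if value_mode == "range":
        if not range_min < range_max:
            raise ValueError("Range minimum must be less than maximum")
        return rng.uniform(range_min, range_max, size=original.shape)
    if value_mode == "factor":
        if factor == 0:
            raise ValueError("Multiplication factor cannot be zero")
        return factor * original          # scales the value being replaced
    raise ValueError(f"Unknown value_mode: {value_mode!r}")

values = np.array([1.0, 2.0, 3.0, 4.0])
selected = np.array([1, 3])               # indices chosen as for removal
values[selected] = outlier_values(values[selected], "factor", factor=10.0)
```

After the last line, `values` holds 1.0, 20.0, 3.0, 40.0.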

Validation Rules

Time Configuration

Duration Validation:

duration_seconds = days × 86400 + hours × 3600 + minutes × 60 + seconds

Requirements:
  duration_seconds > 0
  At least one component (days, hours, minutes, seconds) > 0
  days ≥ 0
  hours ≥ 0
  minutes ≥ 0
  seconds ≥ 0
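
As a worked example of the duration rule, the following sketch converts the four components to seconds and applies the requirements. The function name and the message for negative components are assumptions; "Duration must be positive" is the message listed in this document.

```python
def duration_in_seconds(days=0, hours=0, minutes=0, seconds=0):
    """Combine duration components into seconds and validate them."""
    components = (days, hours, minutes, seconds)
    if any(c < 0 for c in components):
        # Hypothetical message: the document lists no text for this case.
        raise ValueError("Duration components must be non-negative")
    total = days * 86400 + hours * 3600 + minutes * 60 + seconds
    if total <= 0:
        raise ValueError("Duration must be positive")
    return total

total = duration_in_seconds(days=1, hours=2, minutes=3, seconds=4)
# 86400 + 7200 + 180 + 4 = 93784 seconds
```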

Sampling Frequency:

Requirements:
  frequency_hz > 0
  frequency_hz ≥ 0.001 Hz (minimum)

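OR, when the sampling is specified as a time step:
  time_step_seconds > 0
  time_step_seconds ≥ 0.001 seconds (minimum)

Note: the two minimums are separate bounds, not restatements of one another.
  frequency_hz ≥ 0.001 corresponds to time_step_seconds ≤ 1000
  time_step_seconds ≥ 0.001 corresponds to frequency_hz ≤ 1000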

Relationship:
  frequency_hz = 1 / time_step_seconds
  time_step_seconds = 1 / frequency_hz

Point Count Calculation:

expected_points = ⌈duration_seconds × frequency_hz⌉ + 1

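OR:
expected_points = ⌊duration_seconds / time_step_seconds⌋ + 1

The two forms agree when duration_seconds × frequency_hz is an integer; otherwise the ceiling form yields one more point than the floor form.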

Multi-Channel:
total_points = expected_points × n_channels

Limit:
total_points ≤ MAX_POINTS_PER_SERIES (10,000)
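
The point count and limit check can be illustrated with a short sketch. It uses the frequency-based (ceiling) form; the function names are hypothetical, and the error text follows the "Data point limit exceeded" message listed in this document.

```python
import math

MAX_POINTS_PER_SERIES = 10_000

def expected_points(duration_seconds, frequency_hz):
    """Points per channel, including the sample at t = 0."""
    return math.ceil(duration_seconds * frequency_hz) + 1

def check_point_limit(duration_seconds, frequency_hz, n_channels=1):
    """Return the total point count, or raise if it exceeds the limit."""
    total = expected_points(duration_seconds, frequency_hz) * n_channels
    if total > MAX_POINTS_PER_SERIES:
        raise ValueError(
            f"Data point limit exceeded: {total} points > 10,000")
    return total

per_channel = expected_points(60, 10)         # 600 intervals + 1 = 601 points
total = check_point_limit(60, 10, n_channels=3)
```

Because of the extra endpoint sample, a 10,000-second series at 1 Hz produces 10,001 points and is rejected by this check.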

Signal Parameters

Mean Value:

μ ∈ ℝ (any real number, no constraints)

Noise Amplitude:

σ ≥ 0 (non-negative)

Trend Slope:

m ∈ ℝ (any real number, positive or negative)

Oscillation Parameters

Frequency:

f > 0 (strictly positive)

Period:

P > 0 (strictly positive)

Amplitude:

A ≥ 0 (non-negative)

Phase:

φ ∈ ℝ (any real number, typically 0 to 2π)

Aliasing Check:

Nyquist Frequency: f_N = f_sampling / 2

ERROR if: f_oscillation > f_N
WARNING if: f_oscillation > f_N / 2

Suggested minimum:
f_sampling ≥ 2.5 × f_oscillation
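
The two thresholds translate directly into a small classifier. This is a sketch with a hypothetical function name; the returned strings match the `severity` values used in the API response.

```python
def aliasing_severity(f_oscillation, f_sampling):
    """Classify an oscillation against the Nyquist frequency."""
    nyquist = f_sampling / 2
    if f_oscillation > nyquist:
        return "error"
    if f_oscillation > nyquist / 2:
        return "warning"
    return None

# Sampling at 10 Hz gives a Nyquist frequency of 5 Hz.
levels = [aliasing_severity(f, 10.0) for f in (2.0, 3.0, 5.0, 6.0)]
# levels == [None, "warning", "warning", "error"]
```

Under these thresholds, sampling at exactly 2.5 × f_oscillation still returns "warning", since the oscillation then sits at 0.8 × f_N.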

Transient Parameters

Onset Time:

t₀ ≥ 0 (non-negative)

Natural Frequency:

fₙ > 0 (strictly positive)

Damping Ratio:

ζ > 0 (strictly positive)

Amplitude:

A ∈ ℝ (any real number)

Response Type:

response_type ∈ {"impulse", "step"}

Multi-Channel Validation

Channel Count:

1 ≤ n_channels ≤ MAX_CHANNELS (10)

Channel Names:

Must be non-empty strings
Uniqueness is not validated, but unique names are recommended

Correlation Coefficient:

-1.0 ≤ c_ij ≤ 1.0 for all i ≠ j
c_ii = 1.0 (implied, not user-specified)

Correlation Matrix Validation:

Build symmetric matrix C
Compute eigenvalues λ_1, λ_2, ..., λ_N
Check: all λ_i ≥ -1×10^-10 (tolerance for numerical precision)

If any λ_i < -1×10^-10: Matrix is not positive semi-definite → ERROR
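
The validation steps above might look like the following sketch. The function name is hypothetical, and treating unspecified channel pairs as uncorrelated (0) is an assumption; the entry keys and error messages are the ones listed in this document.

```python
import numpy as np

PSD_TOLERANCE = 1e-10

def build_correlation_matrix(n_channels, correlations, tol=PSD_TOLERANCE):
    """Build and validate a symmetric correlation matrix from pair entries."""
    C = np.eye(n_channels)                   # c_ii = 1; other pairs default to 0
    for entry in correlations:
        a = entry["channel_a_index"]
        b = entry["channel_b_index"]
        value = entry["correlation"]
        if a == b:
            raise ValueError("Cannot correlate a channel with itself")
        if not -1.0 <= value <= 1.0:
            raise ValueError("Correlation value must be between -1.0 and 1.0")
        C[a, b] = C[b, a] = value            # keep the matrix symmetric
    eigenvalues = np.linalg.eigvalsh(C)      # eigenvalues of a symmetric matrix
    if np.any(eigenvalues < -tol):
        raise ValueError("Correlation matrix is not positive semi-definite")
    return C

C = build_correlation_matrix(3, [
    {"channel_a_index": 0, "channel_b_index": 1, "correlation": 0.5},
])
```

A set of pairwise values can each be within [-1, 1] and still fail the eigenvalue check, for example 0.9, 0.9 and -0.9 across three channels.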

Data Degradation Validation

Data Removal:

Number Mode:
  0 < n_remove ≤ total_points
  n_remove ∈ ℤ⁺

Percentage Mode:
  0 < percentage ≤ 100

Outliers:

Number Mode:
  0 < n_outliers ≤ total_points
  n_outliers ∈ ℤ⁺

Percentage Mode:
  0 < percentage ≤ 100

Constant Value Mode:
  c ∈ ℝ (any value)

Random Range Mode:
  v_min < v_max

Factor Multiplication Mode:
  k ≠ 0 (non-zero)

System Limits

Global Constants

MAX_POINTS_PER_SERIES = 10_000
MAX_CHANNELS = 10
MAX_SERIES_PER_USER = 3  # Regular users only

Per-User Limits

Regular Users:

  • Maximum saved series: 3
  • Maximum points per series: 10,000 (total across all channels)
  • Maximum channels: 10

Superusers:

  • Maximum saved series: Unlimited
  • Maximum points per series: 10,000 (enforced)
  • Maximum channels: 10 (enforced)

Calculated Limits

Single-Channel:

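max_duration ≈ MAX_POINTS_PER_SERIES / frequency_hz

(Approximate: the +1 endpoint sample in the point count formula shortens the true maximum by one time step.)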

Example:
  At 1 Hz: 10,000 seconds (2.78 hours)
  At 0.1 Hz: 100,000 seconds (27.78 hours)
  At 10 Hz: 1,000 seconds (16.67 minutes)

Multi-Channel:

max_duration = MAX_POINTS_PER_SERIES / (frequency_hz × n_channels)

Example (3 channels):
  At 1 Hz: 3,333 seconds (55.56 minutes)
  At 0.1 Hz: 33,333 seconds (9.26 hours)

Precision Limits

Time Step:

Minimum: 0.001 seconds (1 millisecond)
Maximum: 1,000,000 seconds (practical, no hard limit)

Frequency:

Minimum: 0.001 Hz
Maximum: 1,000 Hz (practical, limited by point count)

Numerical Precision:

All floating-point calculations use IEEE 754 double precision (64-bit)
Precision: ~15-17 decimal digits
Epsilon: 2.220446049250313×10^-16

API Endpoints

Time Series Generation

Generate Preview

Endpoint: POST /phoenix/generate/preview/

Purpose: Generate time series without saving to database

Request Body (JSON):

{
  "chart_title": "string (optional)",
  "time_config": {
    "duration_days": number (optional),
    "duration_hours": number (optional),
    "duration_minutes": number (optional),
    "duration_seconds": number (optional),
    "sampling_frequency_hz": number,
    "end_time": "ISO 8601 datetime (optional)"
  },
  "channels": [
    {
      "name": "string",
      "unit": "string (optional)",
      "mean": number,
      "noise_amplitude": number,
      "trend_slope": number (optional),
      "oscillations": [
        {
          "frequency_hz": number (optional),
          "period_seconds": number (optional),
          "amplitude": number,
          "phase": number (optional, default 0)
        }
      ],
      "transients": [
        {
          "onset_seconds": number (optional, default 0),
          "amplitude": number,
          "natural_frequency_hz": number,
          "damping_ratio": number,
          "response_type": "impulse" | "step" (optional, default "impulse")
        }
      ]
    }
  ],
  "correlations": [
    {
      "channel_a_index": integer,
      "channel_b_index": integer,
      "correlation": number (-1.0 to 1.0)
    }
  ],
  "data_removal": {
    "mode": "number" | "percentage",
    "value": number
  } (optional),
  "outliers": {
    "mode": "number" | "percentage",
    "quantity": number,
    "value_mode": "constant" | "range" | "factor",
    "constant_value": number (if value_mode=constant),
    "range_min": number (if value_mode=range),
    "range_max": number (if value_mode=range),
    "factor": number (if value_mode=factor)
  } (optional)
}

Response (JSON):

{
  "chart_data": {
    "traces": [
      {
        "x": ["ISO 8601 timestamps"],
        "y": [numbers],
        "name": "string (channel name)",
        "type": "scatter",
        "mode": "lines"
      }
    ],
    "layout": {
      "title": "string",
      "xaxis": {"title": "Time"},
      "yaxis": {"title": "Value"}
    }
  },
  "statistics": {
    "channel_name": {
      "min": number,
      "max": number,
      "mean": number,
      "count": integer
    }
  },
  "aliasing_warnings": [
    {
      "channel": "string",
      "oscillation_index": integer,
      "frequency": number,
      "nyquist_frequency": number,
      "severity": "error" | "warning",
      "message": "string"
    }
  ],
  "generation_params": {...}  // Echo of input parameters
}
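
As an illustration of the request schema, the sketch below builds a small preview request body and serializes it. All values are made up for the example, and passing an empty `correlations` list for a single channel is an assumption. Sending the request is omitted because the host and authentication scheme are not described in this document.

```python
import json

preview_request = {
    "chart_title": "Demo series",
    "time_config": {
        "duration_minutes": 1,
        "sampling_frequency_hz": 10,
    },
    "channels": [
        {
            "name": "X-Axis",
            "mean": 0.0,
            "noise_amplitude": 0.1,
            "oscillations": [
                {"frequency_hz": 1.0, "amplitude": 2.0},
            ],
            "transients": [
                {
                    "onset_seconds": 5,
                    "amplitude": 1.0,
                    "natural_frequency_hz": 2.0,
                    "damping_ratio": 0.3,
                    "response_type": "step",
                },
            ],
        },
    ],
    "correlations": [],   # assumed acceptable when there is only one channel
}

# JSON text to send as the body of POST /phoenix/generate/preview/
body = json.dumps(preview_request)
```

With these values the series has 601 points (60 s × 10 Hz + 1), well under the 10,000-point limit, and the 1 Hz oscillation is below half the 5 Hz Nyquist frequency.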

Save Time Series

Endpoint: POST /phoenix/save/

Purpose: Save generated time series to database

Request Body (JSON):

{
  "name": "string (required)",
  "description": "string (optional)",
  "generation_params": {...},  // Same structure as preview request
  "open_in_sentinel": boolean (optional, default false)
}

Response (JSON):

{
  "success": boolean,
  "time_series_id": integer,
  "redirect_url": "string (if open_in_sentinel=true)",
  "message": "string"
}

Error Responses:

{
  "success": false,
  "error": "string (error message)",
  "error_code": "MAX_SERIES_REACHED" | "POINT_LIMIT_EXCEEDED" | "VALIDATION_ERROR"
}

Time Series Management

List User's Time Series

Endpoint: GET /phoenix/

Response: HTML page with list of saved series

Regenerate from Existing

Endpoint: GET /phoenix/generate/<id>/

Response: HTML page with generation form pre-filled with saved parameters

Delete Time Series

Endpoint: POST /phoenix/delete/<id>/

Response: Redirect to /phoenix/


Data Structures

DataFrame Structure (Pandas)

Single-Channel:

DataFrame(
    index=DatetimeIndex,  # Timestamps
    columns=['value']      # Single data column
)

Example:
                         value
2024-01-15 10:00:00     100.5
2024-01-15 10:00:01     101.2
2024-01-15 10:00:02      99.8

Multi-Channel:

DataFrame(
    index=DatetimeIndex,         # Timestamps (shared)
    columns=['ch1', 'ch2', ...]  # One column per channel
)

Example:
                         X-Axis  Y-Axis  Z-Axis
2024-01-15 10:00:00       0.5     0.3    -9.8
2024-01-15 10:00:01       0.6     0.4    -9.7
2024-01-15 10:00:02       0.4     0.2    -9.9
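
The two layouts can be built in pandas as follows. This is a sketch that reuses the example values above; how Phoenix constructs its frames internally is not shown in this document.

```python
import pandas as pd

# Shared timestamp index: three samples, one second apart.
index = pd.date_range("2024-01-15 10:00:00", periods=3,
                      freq=pd.Timedelta(seconds=1))

# Single-channel frame: one column named 'value'.
single = pd.DataFrame({"value": [100.5, 101.2, 99.8]}, index=index)

# Multi-channel frame: one column per channel, same index.
multi = pd.DataFrame(
    {
        "X-Axis": [0.5, 0.6, 0.4],
        "Y-Axis": [0.3, 0.4, 0.2],
        "Z-Axis": [-9.8, -9.7, -9.9],
    },
    index=index,
)
```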

Database Schema (Simplified)

TimeSeriesData Model:

class TimeSeriesData(models.Model):
    id = AutoField(primary_key=True)
    name = CharField(max_length=255)
    description = TextField(blank=True)
    created_at = DateTimeField(auto_now_add=True)
    updated_at = DateTimeField(auto_now=True)
    user = ForeignKey(CustomUser)

    # Stored as JSON
    generation_params = JSONField()

    # Related model
    # TimeSeriesMetadata (not detailed here)

File Formats

CSV Format

Single-Channel:

Timestamp,value
2024-01-15T10:00:00,100.5
2024-01-15T10:00:01,101.2
2024-01-15T10:00:02,99.8

Multi-Channel:

Timestamp,Channel1,Channel2,Channel3
2024-01-15T10:00:00,0.5,0.3,-9.8
2024-01-15T10:00:01,0.6,0.4,-9.7

Specifications:

  • Encoding: UTF-8
  • Line endings: LF (\n)
  • Timestamp format: ISO 8601
  • Decimal separator: . (period)
  • No thousands separator
  • Header row: Column names
  • No index column
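
A writer that follows these CSV rules could look like the sketch below. It uses only the Python standard library; the function name `to_csv_text` and its arguments are hypothetical, and Phoenix's actual export code is not shown in this document.

```python
import csv
import io
from datetime import datetime, timedelta

def to_csv_text(timestamps, columns):
    """Render timestamps plus named value columns as CSV text with LF endings."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Timestamp", *columns.keys()])          # header row
    for i, ts in enumerate(timestamps):
        row = [ts.isoformat()]                               # ISO 8601 timestamp
        row.extend(values[i] for values in columns.values())
        writer.writerow(row)
    return buffer.getvalue()

start = datetime(2024, 1, 15, 10, 0, 0)
timestamps = [start + timedelta(seconds=i) for i in range(3)]
text = to_csv_text(timestamps, {"value": [100.5, 101.2, 99.8]})
data = text.encode("utf-8")   # bytes to write to the .csv file
```

Here `text` reproduces the single-channel example above, one row per line.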

Excel (XLSX) Format

Structure: Single worksheet with same data as CSV

Specifications:

  • Format: Office Open XML (.xlsx)
  • Worksheet name: "Sheet1"
  • Timestamps: Excel datetime format
  • Numbers: Native Excel numeric type

JSON Format

Structure:

{
  "metadata": {
    "name": "string",
    "description": "string",
    "created_at": "ISO 8601",
    "sampling_frequency": number,
    "channels": ["array", "of", "channel", "names"]
  },
  "statistics": {
    "channel_name": {
      "min": number,
      "max": number,
      "mean": number,
      "count": integer
    }
  },
  "generation_params": {
    ...  // Full generation configuration
  },
  "data": [
    {
      "timestamp": "ISO 8601",
      "channel1": number,
      "channel2": number
    }
  ]
}

Specifications:

  • Encoding: UTF-8
  • Timestamp format: ISO 8601 with timezone
  • Numbers: JSON number type (IEEE 754 double)
  • Arrays: Ordered


Error Messages

Validation Errors

Time Configuration:

"Duration must be positive"
→ All duration components are zero

"Sampling frequency must be greater than 0.001 Hz"
→ frequency_hz < 0.001

"Data point limit exceeded: {calculated} points > 10,000"
→ Duration × Frequency × Channels > MAX_POINTS_PER_SERIES

Signal Parameters:

"Noise amplitude must be non-negative"
→ noise_amplitude < 0

"Oscillation frequency must be positive"
→ frequency_hz ≤ 0

"Oscillation amplitude must be non-negative"
→ amplitude < 0

Multi-Channel:

"Maximum {MAX_CHANNELS} channels allowed"
→ n_channels > MAX_CHANNELS

"Correlation value must be between -1.0 and 1.0"
→ correlation < -1.0 or correlation > 1.0

"Correlation matrix is not positive semi-definite"
→ Matrix has negative eigenvalues

"Cannot correlate a channel with itself"
→ channel_a_index == channel_b_index

Data Degradation:

"Cannot remove more points than exist"
→ n_remove > total_points

"Range minimum must be less than maximum"
→ range_min ≥ range_max (outliers)

"Multiplication factor cannot be zero"
→ factor == 0

User Limit Errors

"Maximum series limit reached (3). Delete an existing series or contact admin."
→ User has 3 saved series (MAX_SERIES_PER_USER)

"Point limit exceeded. Maximum 10,000 total points allowed."
→ total_points > MAX_POINTS_PER_SERIES

Aliasing Warnings

Error Level:

"Aliasing ERROR: {channel} oscillation #{i} ({freq} Hz) exceeds Nyquist frequency ({nyquist} Hz)"
→ frequency > nyquist_frequency

Warning Level:

"Aliasing WARNING: {channel} oscillation #{i} ({freq} Hz) approaches Nyquist limit ({nyquist} Hz)"
→ frequency > nyquist_frequency / 2

Implementation Notes

Random Number Generation

Library: NumPy random module

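Seed: Not set by default (non-reproducible)

  • Each generation produces different random noise/degradation
  • Saving the generation parameters and regenerating reproduces the deterministic components (mean, trend, oscillations, transients); because no seed is set, the random noise and degradation differ on each run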

Distribution:

  • Noise: np.random.normal(0, sigma, size)
  • Removal/Outliers: np.random.choice(indices, size, replace=False)
  • Random Range: np.random.uniform(low, high, size)

Floating-Point Considerations

Precision: 64-bit IEEE 754 (double precision)

Common Issues:

Duration calculation: sum of days/hours/minutes/seconds
→ Potential rounding in conversion to seconds

Frequency ↔ Period conversion: f = 1/P
→ Possible precision loss in reciprocal

Correlation matrix eigenvalues: numerical computation
→ Tolerance of 1×10^-10 for positive semi-definite check

Performance Considerations

Generation Time:

  • Single-channel, 10,000 points: < 100ms (typical)
  • Multi-channel (10 channels), 10,000 points: < 500ms (typical)
  • Correlations add overhead: ~50-200ms (Cholesky decomposition)

Memory Usage:

  • 10,000 points × 1 channel × 8 bytes ≈ 80 KB
  • 10,000 points × 10 channels × 8 bytes ≈ 800 KB
  • Negligible for modern systems


Glossary

Aliasing: Phenomenon where high-frequency signal appears as lower frequency when undersampled

Cholesky Decomposition: Factorization of positive semi-definite matrix into L × L^T

Correlation Coefficient: Measure of linear relationship between two variables (-1 to +1)

Nyquist Frequency: Half the sampling frequency; maximum frequency that can be represented

Oscillation: Periodic variation (sine wave component)

Positive Semi-Definite: Matrix with all eigenvalues ≥ 0 (required for valid correlation matrix)

Transient Event: Localized, decaying response of a second-order system to a sudden input

Damping Ratio (ζ): Dimensionless measure of how quickly a transient response decays

Natural Frequency: Characteristic oscillation frequency of a second-order system

Sampling Frequency: Number of samples per second (Hz)

Time Step: Duration between consecutive samples (seconds)


References

Signal Processing

  • Nyquist-Shannon Sampling Theorem
  • Discrete Fourier Transform
  • Digital Signal Processing fundamentals

Statistical Methods

  • Gaussian (Normal) Distribution
  • Correlation and Covariance
  • Cholesky Decomposition

Standards

  • ISO 8601: Date and time format
  • IEEE 754: Floating-point arithmetic
  • UTF-8: Character encoding

This technical reference provides the foundation for understanding Phoenix's implementation. For user-focused documentation, see the Phoenix Overview and feature-specific guides.