Phoenix Technical Reference
This document provides comprehensive technical specifications for Phoenix, including mathematical models, validation rules, API endpoints, and implementation details.
Table of Contents
- Mathematical Models
- Validation Rules
- System Limits
- API Endpoints
- Data Structures
- File Formats
- Error Messages
Mathematical Models
Signal Generation Formula
Phoenix generates time series using a superposition model:
signal(t) = μ + trend(t) + Σ oscillation_i(t) + Σ transient_j(t) + noise(t)
Where:
μ = mean (baseline value)
t = time in seconds from start
trend(t) = linear trend component
oscillation_i(t) = ith oscillatory component
transient_j(t) = jth transient event
noise(t) = random noise component
Components in Detail
Base Signal (Mean)
baseline = μ
Where:
μ ∈ ℝ (any real number)
Noise Component
Gaussian (normal) distribution with zero mean:
noise(t) ~ N(0, σ²)
Where:
σ = noise amplitude (standard deviation)
σ ≥ 0
N(0, σ²) = normal distribution with mean 0, variance σ²
Implementation:
noise = np.random.normal(0, sigma, size=n_points)
Statistical Properties:
- Mean: 0
- Standard Deviation: σ
- ~68% of values within ±σ
- ~95% of values within ±2σ
- ~99.7% of values within ±3σ
Linear Trend
trend(tᵢ) = m × i
Where:
m = slope (rate of change per sample)
i = sample index (0, 1, 2, ..., n-1), so that tᵢ = i × time_step
m ∈ ℝ (any real number, positive or negative)
Implementation:
trend = slope * np.arange(n_points)
Total Change:
Δtotal = m × (n_points - 1)
Oscillation (Sine Wave)
Each oscillation follows:
oscillation(t) = A × sin(2π × f × t + φ)
Where:
A = amplitude (peak height)
f = frequency (Hz, cycles per second)
t = time (seconds)
φ = phase offset (radians)
Constraints:
A ≥ 0
f > 0
φ ∈ ℝ
Frequency-Period Relationship:
f = 1 / P
P = 1 / f
Where:
P = period (seconds per cycle)
Implementation (frequency-based):
time_array = np.arange(n_points) * time_step
oscillation = amplitude * np.sin(2 * np.pi * frequency * time_array + phase)
Implementation (period-based):
frequency = 1.0 / period
oscillation = amplitude * np.sin(2 * np.pi * frequency * time_array + phase)
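Taken together, the components above can be sketched as a single generator. This is an illustrative helper, not part of Phoenix's API; parameter values are made up for the example:

```python
import numpy as np

def generate_signal(n_points, time_step, mean, slope=0.0, sigma=0.0,
                    oscillations=(), rng=None):
    """Superposition model: mean + trend + oscillations + noise."""
    rng = rng or np.random.default_rng()
    t = np.arange(n_points) * time_step            # time in seconds
    signal = np.full(n_points, float(mean))
    signal = signal + slope * np.arange(n_points)  # trend: slope per sample index
    for amplitude, frequency, phase in oscillations:
        signal = signal + amplitude * np.sin(2 * np.pi * frequency * t + phase)
    if sigma > 0:
        signal = signal + rng.normal(0, sigma, size=n_points)
    return t, signal

# 10 s at 10 Hz: baseline 100, one 0.5 Hz oscillation of amplitude 2, no noise
t, y = generate_signal(n_points=101, time_step=0.1, mean=100.0,
                       oscillations=[(2.0, 0.5, 0.0)])
```

With no noise, the result is deterministic: the signal starts at the baseline (sin(0) = 0) and peaks at mean + amplitude.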
Transient Response (Second-Order System)
Each transient models an impulse or step response:
transient(t) = A × h(t - t₀) for t ≥ t₀
transient(t) = 0 for t < t₀
Where:
A = amplitude (peak height)
t₀ = onset time (seconds)
h = normalized response function
Impulse Response
Underdamped (ζ < 1):
h(τ) = exp(-ζωₙτ) × sin(ωdτ) / peak
Where:
ωₙ = 2π × fₙ (natural angular frequency)
ωd = ωₙ√(1 - ζ²) (damped natural frequency)
peak = exp(-ζωₙt_p) × sin(ωd × t_p)
t_p = atan2(ωd, ζωₙ) / ωd (time of peak)
Critically Damped (ζ = 1):
h(τ) = ωₙτ × exp(-ωₙτ) / e⁻¹
(The unnormalized response ωₙτ × exp(-ωₙτ) peaks at τ = 1/ωₙ with value e⁻¹, so dividing by e⁻¹ normalizes the peak to 1.)
Overdamped (ζ > 1):
h(τ) = exp(-ζωₙτ) × sinh(ω'τ) / max(|h|)
Where:
ω' = ωₙ√(ζ² - 1)
Step Response
Underdamped (ζ < 1):
h(τ) = 1 - exp(-ζωₙτ) × [cos(ωdτ) + (ζ/√(1-ζ²)) × sin(ωdτ)]
Critically Damped (ζ = 1):
h(τ) = 1 - (1 + ωₙτ) × exp(-ωₙτ)
Overdamped (ζ > 1):
h(τ) = 1 - exp(-ζωₙτ) × [cosh(ω'τ) + (ζ/√(ζ²-1)) × sinh(ω'τ)]
Implementation:
# Impulse response is normalized so peak magnitude ≈ 1.0
# Step response settles to 1.0 (with possible overshoot for underdamped)
# Final signal: amplitude * response
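The underdamped impulse-response formulas above can be sketched as follows. The function name is illustrative; Phoenix's internal implementation may differ:

```python
import numpy as np

def impulse_response(tau, fn, zeta):
    """Normalized underdamped impulse response (zeta < 1); peak magnitude ~ 1."""
    wn = 2 * np.pi * fn                       # natural angular frequency
    wd = wn * np.sqrt(1 - zeta**2)            # damped natural frequency
    t_p = np.arctan2(wd, zeta * wn) / wd      # time of first peak
    peak = np.exp(-zeta * wn * t_p) * np.sin(wd * t_p)
    h = np.exp(-zeta * wn * tau) * np.sin(wd * tau) / peak
    return np.where(tau >= 0, h, 0.0)         # zero before onset

tau = np.linspace(0, 5, 2001)
h = impulse_response(tau, fn=1.0, zeta=0.2)
```

Dividing by the analytic peak value makes the maximum magnitude of the sampled response approximately 1, as the normalization described above requires.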
Multi-Channel Correlation
Correlations are applied using Cholesky decomposition:
Correlation Matrix
For N channels, the correlation matrix C is N×N symmetric:
C = [c_ij] where c_ij = correlation between channel i and j
Properties:
c_ii = 1 (self-correlation)
c_ij = c_ji (symmetric)
-1 ≤ c_ij ≤ 1
C must be positive semi-definite
Cholesky Decomposition
C = L × L^T
Where:
L = lower triangular matrix (Cholesky factor)
L^T = transpose of L
Existence: Cholesky decomposition exists if and only if C is positive semi-definite.
Positive Semi-Definite Test:
C is positive semi-definite ⟺ all eigenvalues λ_i ≥ 0
Phoenix uses tolerance: λ_i ≥ -1×10^-10
Correlation Application Algorithm
For independent signals X = [x_1, x_2, ..., x_N]:
- Normalize:
  X_norm = (X - μ_X) / σ_X
  Where:
  μ_X = mean of each channel
  σ_X = standard deviation of each channel
- Apply Correlation:
  X_corr = L × X_norm
  Where L is the Cholesky factor of correlation matrix C
- Denormalize:
  X_final = X_corr × σ_X + μ_X
Result: X_final has:
- Correlation matrix = C
- Original means μ_X
- Original standard deviations σ_X
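The normalize / mix / denormalize steps can be sketched as below. This is a simplified version: `np.linalg.cholesky` requires a strictly positive definite matrix, so a singular but valid positive semi-definite matrix would need an eigendecomposition fallback:

```python
import numpy as np

def apply_correlation(X, C):
    """Impose correlation matrix C on independent channels X (N x n_points)."""
    L = np.linalg.cholesky(np.asarray(C))  # fails if C is not positive definite
    mu = X.mean(axis=1, keepdims=True)
    sd = X.std(axis=1, keepdims=True)
    X_norm = (X - mu) / sd                 # 1. normalize each channel
    X_corr = L @ X_norm                    # 2. mix channels via Cholesky factor
    return X_corr * sd + mu                # 3. restore original means/std devs

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 5000))             # two independent channels
C = np.array([[1.0, 0.8], [0.8, 1.0]])
Y = apply_correlation(X, C)
r = float(np.corrcoef(Y)[0, 1])            # empirical correlation, near 0.8
```

The empirical correlation only approximates the target, with sampling error on the order of 1/√n_points.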
Data Degradation Models
Data Point Removal
Number Mode:
n_remove = N (user-specified count)
Constraints:
0 < N ≤ total_points
N ∈ ℤ⁺ (positive integer)
Percentage Mode:
n_remove = ⌈total_points × (P / 100)⌉
Where:
P = percentage (user-specified)
⌈·⌉ = ceiling function
Constraints:
0 < P ≤ 100
Selection:
indices_to_remove = random_sample(range(total_points), n_remove)
Without replacement (each point removed at most once)
Uniform distribution (each point equally likely)
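The selection step can be sketched with NumPy's `choice` and `replace=False`; the helper name and return shape are illustrative:

```python
import numpy as np

def remove_points(values, n_remove, rng=None):
    """Remove n_remove points, uniformly at random, without replacement."""
    rng = rng or np.random.default_rng()
    idx = rng.choice(len(values), size=n_remove, replace=False)
    keep = np.setdiff1d(np.arange(len(values)), idx)  # surviving indices
    return values[keep], np.sort(idx)

values = np.arange(100, dtype=float)
kept, removed = remove_points(values, n_remove=10, rng=np.random.default_rng(42))
```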
Outlier Insertion
Quantity (same as removal):
- Number mode: exact count
- Percentage mode: proportion of points
Value Generation:
1. Constant Value:
outlier_value = c (user-specified constant)
c ∈ ℝ (any real number)
2. Random Range:
outlier_value ~ U(v_min, v_max)
Where:
U(a, b) = uniform distribution between a and b
v_min < v_max (user-specified)
3. Factor Multiplication:
outlier_value = k × original_value
Where:
k = multiplication factor (user-specified)
k ∈ ℝ, k ≠ 0
original_value = value at selected timestamp before replacement
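The three value-generation modes can be sketched as one dispatch function; the helper and its signature are hypothetical:

```python
import numpy as np

def outlier_value(original, mode, *, constant=None, v_min=None, v_max=None,
                  factor=None, rng=None):
    """Replacement value for one selected point, per the three modes above."""
    rng = rng or np.random.default_rng()
    if mode == "constant":
        return constant                    # fixed user-specified value
    if mode == "range":
        return rng.uniform(v_min, v_max)   # draw from U(v_min, v_max)
    if mode == "factor":
        return factor * original           # scale the original value
    raise ValueError(f"unknown mode: {mode}")
```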
Validation Rules
Time Configuration
Duration Validation:
duration_seconds = days × 86400 + hours × 3600 + minutes × 60 + seconds
Requirements:
duration_seconds > 0
At least one component (days, hours, minutes, seconds) > 0
days ≥ 0
hours ≥ 0
minutes ≥ 0
seconds ≥ 0
Sampling Frequency:
Requirements:
frequency_hz > 0
frequency_hz ≥ 0.001 Hz (minimum)
AND:
time_step_seconds > 0
time_step_seconds ≥ 0.001 seconds (minimum)
Note: these are independent bounds, not restatements of each other — the minimum time step is equivalent to a maximum frequency of 1,000 Hz.
Relationship:
frequency_hz = 1 / time_step_seconds
time_step_seconds = 1 / frequency_hz
Point Count Calculation:
expected_points = ⌊duration_seconds × frequency_hz⌋ + 1
OR equivalently:
expected_points = ⌊duration_seconds / time_step_seconds⌋ + 1
Multi-Channel:
total_points = expected_points × n_channels
Limit:
total_points ≤ MAX_POINTS_PER_SERIES (10,000)
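The point-count check can be sketched as follows, using the floor-based formula above; this is a simplified stand-in for Phoenix's actual validation code:

```python
import math

MAX_POINTS_PER_SERIES = 10_000

def validate_point_count(duration_seconds, frequency_hz, n_channels=1):
    """Expected point count per channel, times channels, against the limit."""
    expected_points = math.floor(duration_seconds * frequency_hz) + 1
    total_points = expected_points * n_channels
    if total_points > MAX_POINTS_PER_SERIES:
        raise ValueError(
            f"Data point limit exceeded: {total_points} points > "
            f"{MAX_POINTS_PER_SERIES}")
    return total_points
```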
Signal Parameters
Mean Value:
μ ∈ ℝ (any real number, no constraints)
Noise Amplitude:
σ ≥ 0 (non-negative)
Trend Slope:
m ∈ ℝ (any real number, positive or negative)
Oscillation Parameters
Frequency:
f > 0 (strictly positive)
Period:
P > 0 (strictly positive)
Amplitude:
A ≥ 0 (non-negative)
Phase:
φ ∈ ℝ (any real number, typically 0 to 2π)
Aliasing Check:
Nyquist Frequency: f_N = f_sampling / 2
ERROR if: f_oscillation > f_N
WARNING if: f_oscillation > f_N / 2
Suggested minimum:
f_sampling ≥ 2.5 × f_oscillation
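A sketch of the aliasing classification, using the thresholds above; the return values are illustrative:

```python
def check_aliasing(f_oscillation, f_sampling):
    """Classify an oscillation frequency against the Nyquist limit."""
    nyquist = f_sampling / 2
    if f_oscillation > nyquist:
        return "error"     # cannot be represented at this sampling rate
    if f_oscillation > nyquist / 2:
        return "warning"   # representable, but close to the limit
    return "ok"
```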
Transient Parameters
Onset Time:
t₀ ≥ 0 (non-negative)
Natural Frequency:
fₙ > 0 (strictly positive)
Damping Ratio:
ζ > 0 (strictly positive)
Amplitude:
A ∈ ℝ (any real number)
Response Type:
response_type ∈ {"impulse", "step"}
Multi-Channel Validation
Channel Count:
1 ≤ n_channels ≤ MAX_CHANNELS (10)
Channel Names:
Must be non-empty strings
No validation on uniqueness (but recommended)
Correlation Coefficient:
-1.0 ≤ c_ij ≤ 1.0 for all i ≠ j
c_ii = 1.0 (implied, not user-specified)
Correlation Matrix Validation:
Build symmetric matrix C
Compute eigenvalues λ_1, λ_2, ..., λ_N
Check: all λ_i ≥ -1×10^-10 (tolerance for numerical precision)
If any λ_i < -1×10^-10: Matrix is not positive semi-definite → ERROR
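The validation steps above can be sketched as below (a simplified stand-in; `eigvalsh` exploits the matrix's symmetry):

```python
import numpy as np

PSD_TOLERANCE = -1e-10

def validate_correlation_matrix(C):
    """Symmetry plus positive semi-definiteness check via eigenvalues."""
    C = np.asarray(C, dtype=float)
    if not np.allclose(C, C.T):
        raise ValueError("Correlation matrix must be symmetric")
    eigenvalues = np.linalg.eigvalsh(C)    # symmetric-matrix eigenvalue solver
    if np.any(eigenvalues < PSD_TOLERANCE):
        raise ValueError("Correlation matrix is not positive semi-definite")
    return eigenvalues
```

Note that a matrix can satisfy the pairwise bound -1 ≤ c_ij ≤ 1 and still fail this check, e.g. three channels with correlations 0.9, 0.9, and -0.9.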
Data Degradation Validation
Data Removal:
Number Mode:
0 < n_remove ≤ total_points
n_remove ∈ ℤ⁺
Percentage Mode:
0 < percentage ≤ 100
Outliers:
Number Mode:
0 < n_outliers ≤ total_points
n_outliers ∈ ℤ⁺
Percentage Mode:
0 < percentage ≤ 100
Constant Value Mode:
c ∈ ℝ (any value)
Random Range Mode:
v_min < v_max
Factor Multiplication Mode:
k ≠ 0 (non-zero)
System Limits
Global Constants
MAX_POINTS_PER_SERIES = 10_000
MAX_CHANNELS = 10
MAX_SERIES_PER_USER = 3 # Regular users only
Per-User Limits
Regular Users:
- Maximum saved series: 3
- Maximum points per series: 10,000 (total across all channels)
- Maximum channels: 10
Superusers:
- Maximum saved series: Unlimited
- Maximum points per series: 10,000 (enforced)
- Maximum channels: 10 (enforced)
Calculated Limits
Single-Channel:
max_duration = MAX_POINTS_PER_SERIES / frequency_hz
Example:
At 1 Hz: 10,000 seconds (2.78 hours)
At 0.1 Hz: 100,000 seconds (27.78 hours)
At 10 Hz: 1,000 seconds (16.67 minutes)
Multi-Channel:
max_duration = MAX_POINTS_PER_SERIES / (frequency_hz × n_channels)
Example (3 channels):
At 1 Hz: 3,333 seconds (55.56 minutes)
At 0.1 Hz: 33,333 seconds (9.26 hours)
Precision Limits
Time Step:
Minimum: 0.001 seconds (1 millisecond)
Maximum: 1,000,000 seconds (practical, no hard limit)
Frequency:
Minimum: 0.001 Hz
Maximum: 1,000 Hz (practical, limited by point count)
Numerical Precision:
All floating-point calculations use IEEE 754 double precision (64-bit)
Precision: ~15-17 decimal digits
Epsilon: 2.220446049250313×10^-16
API Endpoints
Time Series Generation
Generate Preview
Endpoint: POST /phoenix/generate/preview/
Purpose: Generate time series without saving to database
Request Body (JSON):
{
"chart_title": "string (optional)",
"time_config": {
"duration_days": number (optional),
"duration_hours": number (optional),
"duration_minutes": number (optional),
"duration_seconds": number (optional),
"sampling_frequency_hz": number,
"end_time": "ISO 8601 datetime (optional)"
},
"channels": [
{
"name": "string",
"unit": "string (optional)",
"mean": number,
"noise_amplitude": number,
"trend_slope": number (optional),
"oscillations": [
{
"frequency_hz": number (optional),
"period_seconds": number (optional),
"amplitude": number,
"phase": number (optional, default 0)
}
],
"transients": [
{
"onset_seconds": number (optional, default 0),
"amplitude": number,
"natural_frequency_hz": number,
"damping_ratio": number,
"response_type": "impulse" | "step" (optional, default "impulse")
}
]
}
],
"correlations": [
{
"channel_a_index": integer,
"channel_b_index": integer,
"correlation": number (-1.0 to 1.0)
}
],
"data_removal": {
"mode": "number" | "percentage",
"value": number
} (optional),
"outliers": {
"mode": "number" | "percentage",
"quantity": number,
"value_mode": "constant" | "range" | "factor",
"constant_value": number (if value_mode=constant),
"range_min": number (if value_mode=range),
"range_max": number (if value_mode=range),
"factor": number (if value_mode=factor)
} (optional)
}
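A minimal request body matching the schema above might look like this; the field values are illustrative, and only the required fields plus one oscillation are shown:

```python
import json

# Minimal preview request: one channel, 60 s at 1 Hz, a single oscillation.
payload = {
    "chart_title": "Demo",
    "time_config": {
        "duration_seconds": 60,
        "sampling_frequency_hz": 1.0,
    },
    "channels": [
        {
            "name": "temperature",
            "mean": 20.0,
            "noise_amplitude": 0.5,
            "oscillations": [
                {"frequency_hz": 0.05, "amplitude": 2.0},
            ],
            "transients": [],
        },
    ],
}
body = json.dumps(payload)  # request body for POST /phoenix/generate/preview/
```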
Response (JSON):
{
"chart_data": {
"traces": [
{
"x": ["ISO 8601 timestamps"],
"y": [numbers],
"name": "string (channel name)",
"type": "scatter",
"mode": "lines"
}
],
"layout": {
"title": "string",
"xaxis": {"title": "Time"},
"yaxis": {"title": "Value"}
}
},
"statistics": {
"channel_name": {
"min": number,
"max": number,
"mean": number,
"count": integer
}
},
"aliasing_warnings": [
{
"channel": "string",
"oscillation_index": integer,
"frequency": number,
"nyquist_frequency": number,
"severity": "error" | "warning",
"message": "string"
}
],
"generation_params": {...} // Echo of input parameters
}
Save Time Series
Endpoint: POST /phoenix/save/
Purpose: Save generated time series to database
Request Body (JSON):
{
"name": "string (required)",
"description": "string (optional)",
"generation_params": {...}, // Same structure as preview request
"open_in_sentinel": boolean (optional, default false)
}
Response (JSON):
{
"success": boolean,
"time_series_id": integer,
"redirect_url": "string (if open_in_sentinel=true)",
"message": "string"
}
Error Responses:
{
"success": false,
"error": "string (error message)",
"error_code": "MAX_SERIES_REACHED" | "POINT_LIMIT_EXCEEDED" | "VALIDATION_ERROR"
}
Time Series Management
List User's Time Series
Endpoint: GET /phoenix/
Response: HTML page with list of saved series
Regenerate from Existing
Endpoint: GET /phoenix/generate/<id>/
Response: HTML page with generation form pre-filled with saved parameters
Delete Time Series
Endpoint: POST /phoenix/delete/<id>/
Response: Redirect to /phoenix/
Data Structures
DataFrame Structure (Pandas)
Single-Channel:
DataFrame(
index=DatetimeIndex, # Timestamps
columns=['value'] # Single data column
)
Example:
value
2024-01-15 10:00:00 100.5
2024-01-15 10:00:01 101.2
2024-01-15 10:00:02 99.8
Multi-Channel:
DataFrame(
index=DatetimeIndex, # Timestamps (shared)
columns=['ch1', 'ch2', ...] # One column per channel
)
Example:
X-Axis Y-Axis Z-Axis
2024-01-15 10:00:00 0.5 0.3 -9.8
2024-01-15 10:00:01 0.6 0.4 -9.7
2024-01-15 10:00:02 0.4 0.2 -9.9
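Constructing the multi-channel structure with pandas, using the values from the example above:

```python
import pandas as pd

# Shared DatetimeIndex, one column per channel.
index = pd.date_range("2024-01-15 10:00:00", periods=3, freq="1s")
df = pd.DataFrame(
    {
        "X-Axis": [0.5, 0.6, 0.4],
        "Y-Axis": [0.3, 0.4, 0.2],
        "Z-Axis": [-9.8, -9.7, -9.9],
    },
    index=index,
)
```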
Database Schema (Simplified)
TimeSeriesData Model:
class TimeSeriesData(models.Model):
id = AutoField(primary_key=True)
name = CharField(max_length=255)
description = TextField(blank=True)
created_at = DateTimeField(auto_now_add=True)
updated_at = DateTimeField(auto_now=True)
user = ForeignKey(CustomUser)
# Stored as JSON
generation_params = JSONField()
# Related model
# TimeSeriesMetadata (not detailed here)
File Formats
CSV Format
Single-Channel:
Timestamp,value
2024-01-15T10:00:00,100.5
2024-01-15T10:00:01,101.2
2024-01-15T10:00:02,99.8
Multi-Channel:
Timestamp,Channel1,Channel2,Channel3
2024-01-15T10:00:00,0.5,0.3,-9.8
2024-01-15T10:00:01,0.6,0.4,-9.7
Specifications:
- Encoding: UTF-8
- Line endings: LF (\n)
- Timestamp format: ISO 8601
- Decimal separator: . (period)
- No thousands separator
- Header row: Column names
- No index column
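Exporting a frame to this layout might look like the following sketch; Phoenix's actual export code may differ:

```python
import io
import pandas as pd

index = pd.date_range("2024-01-15 10:00:00", periods=3, freq="1s")
df = pd.DataFrame({"value": [100.5, 101.2, 99.8]}, index=index)

buf = io.StringIO()
# Header row, ISO 8601 timestamps, '.' decimal separator, no index column name
# beyond the "Timestamp" label.
df.to_csv(buf, index_label="Timestamp", date_format="%Y-%m-%dT%H:%M:%S")
csv_text = buf.getvalue()
```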
Excel (XLSX) Format
Structure: Single worksheet with same data as CSV
Specifications:
- Format: Office Open XML (.xlsx)
- Worksheet name: "Sheet1"
- Timestamps: Excel datetime format
- Numbers: Native Excel numeric type
JSON Format
Structure:
{
"metadata": {
"name": "string",
"description": "string",
"created_at": "ISO 8601",
"sampling_frequency": number,
"channels": ["array", "of", "channel", "names"]
},
"statistics": {
"channel_name": {
"min": number,
"max": number,
"mean": number,
"count": integer
}
},
"generation_params": {
... // Full generation configuration
},
"data": [
{
"timestamp": "ISO 8601",
"channel1": number,
"channel2": number
}
]
}
Specifications:
- Encoding: UTF-8
- Timestamp format: ISO 8601 with timezone
- Numbers: JSON number type (IEEE 754 double)
- Arrays: Ordered
Error Messages
Validation Errors
Time Configuration:
"Duration must be positive"
→ All duration components are zero
"Sampling frequency must be greater than 0.001 Hz"
→ frequency_hz < 0.001
"Data point limit exceeded: {calculated} points > 10,000"
→ Duration × Frequency × Channels > MAX_POINTS_PER_SERIES
Signal Parameters:
"Noise amplitude must be non-negative"
→ noise_amplitude < 0
"Oscillation frequency must be positive"
→ frequency_hz ≤ 0
"Oscillation amplitude must be non-negative"
→ amplitude < 0
Multi-Channel:
"Maximum {MAX_CHANNELS} channels allowed"
→ n_channels > MAX_CHANNELS
"Correlation value must be between -1.0 and 1.0"
→ correlation < -1.0 or correlation > 1.0
"Correlation matrix is not positive semi-definite"
→ Matrix has negative eigenvalues
"Cannot correlate a channel with itself"
→ channel_a_index == channel_b_index
Data Degradation:
"Cannot remove more points than exist"
→ n_remove > total_points
"Range minimum must be less than maximum"
→ range_min ≥ range_max (outliers)
"Multiplication factor cannot be zero"
→ factor == 0
User Limit Errors
"Maximum series limit reached (3). Delete an existing series or contact admin."
→ User has 3 saved series (MAX_SERIES_PER_USER)
"Point limit exceeded. Maximum 10,000 total points allowed."
→ total_points > MAX_POINTS_PER_SERIES
Aliasing Warnings
Error Level:
"Aliasing ERROR: {channel} oscillation #{i} ({freq} Hz) exceeds Nyquist frequency ({nyquist} Hz)"
→ frequency > nyquist_frequency
Warning Level:
"Aliasing WARNING: {channel} oscillation #{i} ({freq} Hz) approaches Nyquist limit ({nyquist} Hz)"
→ frequency > nyquist_frequency / 2
Implementation Notes
Random Number Generation
Library: NumPy random module
Seed: Not set by default (non-reproducible)
- Each generation produces different random noise/degradation
- For reproducibility, users should save generation parameters and regenerate
Distribution:
- Noise: np.random.normal(0, sigma, size)
- Removal/Outliers: np.random.choice(indices, size, replace=False)
- Random Range: np.random.uniform(low, high, size)
Floating-Point Considerations
Precision: 64-bit IEEE 754 (double precision)
Common Issues:
Duration calculation: sum of days/hours/minutes/seconds
→ Potential rounding in conversion to seconds
Frequency ↔ Period conversion: f = 1/P
→ Possible precision loss in reciprocal
Correlation matrix eigenvalues: numerical computation
→ Tolerance of 1×10^-10 for positive semi-definite check
Performance Considerations
Generation Time:
- Single-channel, 10,000 points: < 100 ms (typical)
- Multi-channel (10 channels), 10,000 points: < 500 ms (typical)
- Correlations add overhead: ~50-200 ms (Cholesky decomposition)
Memory Usage:
- 10,000 points × 1 channel × 8 bytes ≈ 80 KB
- 10,000 points × 10 channels × 8 bytes ≈ 800 KB
- Negligible for modern systems
Glossary
Aliasing: Phenomenon where high-frequency signal appears as lower frequency when undersampled
Cholesky Decomposition: Factorization of positive semi-definite matrix into L × L^T
Correlation Coefficient: Measure of linear relationship between two variables (-1 to +1)
Nyquist Frequency: Half the sampling frequency; maximum frequency that can be represented
Oscillation: Periodic variation (sine wave component)
Positive Semi-Definite: Matrix with all eigenvalues ≥ 0 (required for valid correlation matrix)
Transient Event: Localized, decaying response of a second-order system to a sudden input
Damping Ratio (ζ): Dimensionless measure of how quickly a transient response decays
Natural Frequency: Characteristic oscillation frequency of a second-order system
Sampling Frequency: Number of samples per second (Hz)
Time Step: Duration between consecutive samples (seconds)
References
Signal Processing
- Nyquist-Shannon Sampling Theorem
- Discrete Fourier Transform
- Digital Signal Processing fundamentals
Statistical Methods
- Gaussian (Normal) Distribution
- Correlation and Covariance
- Cholesky Decomposition
Standards
- ISO 8601: Date and time format
- IEEE 754: Floating-point arithmetic
- UTF-8: Character encoding
This technical reference provides the foundation for understanding Phoenix's implementation. For user-focused documentation, see the Phoenix Overview and feature-specific guides.