Designing Custom Health Score Algorithms
Algorithm Architecture & Data Ingestion Pipelines
Establish a deterministic ingestion layer before algorithmic processing begins. Validate raw crawl exports, server log streams, and synthetic monitoring payloads against strict schemas. Apply baseline transformation rules from Metric Scoring & Data Normalization to standardize telemetry formats. Execute the ETL workflow sequentially: extract, validate, normalize, then score.
Enforce JSON Schema or Protobuf contracts on all incoming payloads. Reject malformed records at the gateway to prevent downstream corruption. Implement idempotent ingestion handlers to block duplicate scoring during pipeline retries. Route validated telemetry directly to a time-series database for aggregation.
Workflow Steps
- Configure schema enforcement for incoming telemetry using JSON Schema or Protobuf.
- Implement idempotent data ingestion to prevent duplicate scoring during pipeline retries.
- Route validated payloads to a time-series database for downstream aggregation.
import pandas as pd
from pydantic import BaseModel, ValidationError, field_validator
from typing import Optional
class TelemetryPayload(BaseModel):
url: str
timestamp: str
lcp_ms: Optional[float]
cls_score: Optional[float]
inp_ms: Optional[float]
wcag_violations: int
@field_validator('lcp_ms', 'inp_ms')
def cap_latency(cls, v):
return max(0.0, min(v, 30000.0)) if v is not None else v
def validate_telemetry_payload(raw_json: dict) -> TelemetryPayload:
try:
return TelemetryPayload(**raw_json)
except ValidationError as e:
raise ValueError(f"Schema violation: {e}")
def normalize_metric_distribution(df: pd.DataFrame) -> pd.DataFrame:
df['lcp_norm'] = (df['lcp_ms'] - df['lcp_ms'].min()) / (df['lcp_ms'].max() - df['lcp_ms'].min())
df['inp_norm'] = (df['inp_ms'] - df['inp_ms'].min()) / (df['inp_ms'].max() - df['inp_ms'].min())
return df.dropna(subset=['lcp_norm', 'inp_norm'])
Dynamic Metric Weighting & Normalization Logic
Assign dynamic weights to heterogeneous metrics like crawl errors, LCP, CLS, and JS execution time. Raw values operate on incompatible scales. Apply z-score transformation to center distributions. Use min-max scaling to bound outputs between zero and one. Reference How to Weight Core Web Vitals in Custom Dashboards to prioritize user-centric signals over infrastructure noise.
Calculate composite indices using a weighted geometric mean. This approach penalizes severe outliers more aggressively than arithmetic averaging. Maintain explicit weight matrices that reflect business impact and crawl frequency. Store all configuration matrices in version control to guarantee auditability.
Workflow Steps
- Apply z-score transformation to raw metric distributions.
- Define weight matrices based on business impact and crawl frequency.
- Calculate composite health index using weighted geometric mean to penalize severe outliers.
- Version control weight matrices via Git for auditability.
import numpy as np
import pandas as pd
def apply_min_max_scaling(series: pd.Series) -> pd.Series:
min_val, max_val = series.min(), series.max()
if max_val == min_val:
return pd.Series(1.0, index=series.index)
return (series - min_val) / (max_val - min_val)
def calculate_weighted_geometric_mean(df: pd.DataFrame, weights: dict) -> pd.Series:
scaled_df = df.apply(apply_min_max_scaling)
log_scaled = np.log(scaled_df + 1e-9)
weight_vector = np.array([weights.get(col, 0.0) for col in scaled_df.columns])
normalized_weights = weight_vector / weight_vector.sum()
return np.exp((log_scaled * normalized_weights).sum(axis=1))
Threshold Calibration & Alert Routing
Establish dynamic alerting boundaries that adapt to site architecture complexity. Commercial landing pages require stricter boundaries than low-priority utility endpoints. Implement tiered severity routing as detailed in Calibrating Error Thresholds for Different Site Sections. Route critical INP regressions directly to SRE channels. Direct WCAG compliance drops to the accessibility engineering queue.
Configure adaptive thresholds using rolling percentile baselines. Calculate P95 and P99 values over a 30-day sliding window. Map severity tiers to automated routing rules. Enforce cooldown periods to suppress alert fatigue during active deployment windows.
Workflow Steps
- Segment URL taxonomy by business value and traffic volume.
- Configure adaptive thresholds using rolling percentile baselines (P95/P99).
- Map severity tiers to automated routing rules (SRE vs SEO team).
- Implement cooldown periods to prevent alert fatigue during deployments.
# alert_routing_config.yaml
thresholds:
commercial:
lcp_p95: 2500
cls_p99: 0.15
inp_p95: 200
utility:
lcp_p95: 4000
cls_p99: 0.25
inp_p95: 500
wcag_compliance:
critical_violations: 0
warning_violations: 5
routing:
critical:
channel: pagerduty-sre
cooldown_minutes: 30
warning:
channel: slack-seo-team
cooldown_minutes: 60
info:
channel: slack-monitoring
cooldown_minutes: 1440
Version Control & Release Cycle Tracking
Isolate algorithmic scoring changes from genuine site health fluctuations. Tag every scoring execution with the corresponding deployment hash and configuration snapshot. Correlate algorithm updates with CI/CD pipeline events using methodologies from Tracking Metric Trends Across Release Cycles. Maintain strict rollback procedures when scoring accuracy degrades.
Embed Git commit SHAs directly into scoring metadata. Execute A/B scoring comparisons during iterative development phases. Automate regression testing against frozen historical baseline datasets. Configure drift detection triggers to initiate model retraining when distribution shifts exceed tolerance thresholds.
Workflow Steps
- Embed Git commit SHA and config version in scoring metadata.
- Run A/B scoring comparisons during algorithm iterations.
- Automate regression testing against historical baseline datasets.
- Document drift detection triggers for model retraining.
# .github/workflows/scoring-pipeline.yaml
name: Health Score Validation
on: [push, schedule]
jobs:
validate:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Tag Scoring Run
run: |
echo "SCORING_VERSION=$(git rev-parse --short HEAD)" >> $GITHUB_ENV
echo "CONFIG_HASH=$(sha256sum weights_matrix.json | awk '{print $1}')" >> $GITHUB_ENV
- name: Run Regression Baseline Check
run: |
python scripts/run_regression_baseline_check.py \
--baseline data/historical_scores_v2.parquet \
--current data/latest_scores.parquet \
--threshold 0.05
Implementation Specifications
Configure the orchestrator to execute daily incremental updates and weekly full recalculations. Route metadata to PostgreSQL. Store raw telemetry in ClickHouse or BigQuery. Cache frequently accessed weight matrices in Redis.
Normalization Pipeline Sequence
- Extract raw metrics from crawl logs and RUM payloads.
- Cap outliers using the Interquartile Range (IQR) method.
- Stratify data by device type before aggregation.
- Apply time-decay weighting to emphasize recent data points.
Common Mistakes
- Hardcoding static weights that ignore seasonal traffic variance.
- Mixing incompatible scales without normalization (e.g., combining milliseconds with boolean flags).
- Ignoring device segmentation, which masks mobile performance degradation behind desktop averages.
- Omitting algorithm versioning, which destroys historical trend analysis capabilities.
- Deploying static thresholds that trigger over-alerting across complex page architectures.