The IC Score: An AI-Powered Formula for SOC Alert Triage

The Problem: Manual Triage Does Not Scale

Our SOC had a scaling problem that no amount of hiring could solve.

Every morning, the queue held 100+ alerts waiting for human eyes. Each alert required 15-30 minutes of investigation: pull SIEM logs, check threat intel feeds, correlate with identity data, review historical patterns, assess asset criticality. At 100 alerts, that is 25 to 50 analyst-hours of triage per day, which leaves analysts drowning in repetitive work while genuinely dangerous alerts sit buried in the pile.

The real issue was not just speed – it was consistency. Two analysts investigating the same alert would reach different conclusions depending on which data sources they checked, how deep they went, and how much fatigue had accumulated. There was no structured scoring, no repeatable methodology, and no clear escalation criteria.

I needed a system that could score alerts deterministically, using every available signal source, and surface only the alerts that truly required human judgment.

Designing the IC Score: A Weighted Scoring Formula

The IC Score (Investigation Confidence Score) is a weighted composite that pulls from 7 signal sources, each contributing a normalized value between 0 and 1. The weights reflect each source’s reliability and predictive power, calibrated over months of production data.
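
How each source is normalized is not spelled out here. As a minimal sketch, assuming a source reports a numeric value on a known range, min-max scaling with clamping would produce the 0-1 input the formula expects. The `normalize` helper and the 0-100 reputation scale are illustrative assumptions, not part of the production system.

```python
def normalize(value: float, low: float, high: float) -> float:
    """Min-max scale value from [low, high] to [0, 1], clamping out-of-range input."""
    if high <= low:
        raise ValueError("high must be greater than low")
    scaled = (value - low) / (high - low)
    return max(0.0, min(1.0, scaled))


# Hypothetical feed that reports reputation on a 0-100 scale
print(normalize(72, 0, 100))   # 0.72
print(normalize(130, 0, 100))  # 1.0 (clamped to the top of the range)
```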

Signal Sources and Weights

#	Signal Source	Weight	What It Measures
1	IP Threat Intelligence (SIEM enrichment)	30%	Known malicious IPs, ASN reputation, geo-anomalies
2	Threat Feed Score	20%	IOC matches against curated commercial and open feeds
3	Historical Data Lake Lookup	20%	Has this entity/pattern appeared in past incidents?
4	AI Triage Summary	15%	LLM-generated severity assessment from raw alert context
5	Employee Context (identity lookup)	10%	Role, department, access level, recent activity patterns
6	Threat Intelligence Correlation	10%	Cross-reference with active campaigns and TTPs
7	Raw Payload Analysis	5%	Pattern matching on the actual event payload

Note that the weights intentionally sum to 110%, not 100%. There is no separate normalization step: the formula caps the final score at 100, so a perfect score across all signals on a normal asset (110 before the cap) still reports as 100.
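
The arithmetic behind that ceiling is quick to check. The sketch below copies the weights from the table and assumes a normal-tier asset (multiplier 1.0) with every signal at its maximum of 1.0.

```python
import math

# Weights copied from the signal table above
SIGNAL_WEIGHTS = {
    "ip_threat_intel": 0.30,
    "threat_feed_score": 0.20,
    "historical_lookup": 0.20,
    "ai_triage_summary": 0.15,
    "employee_context": 0.10,
    "threat_intel_correlation": 0.10,
    "raw_payload_analysis": 0.05,
}

total_weight = sum(SIGNAL_WEIGHTS.values())
assert math.isclose(total_weight, 1.10)  # 110%, not 100%

# Every signal at 1.0 on a normal asset (multiplier 1.0)
uncapped = total_weight * 1.0 * 100
capped = min(uncapped, 100.0)
print(round(uncapped, 1), capped)  # 110.0 100.0
```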

Asset Criticality Multiplier

Not all assets are equal. An alert on a domain controller or production database deserves more attention than one on a developer laptop. The criticality multiplier adjusts the raw score:

Asset Tier	Multiplier	Examples
Crown Jewel	x1.5	Domain controllers, key management, production databases
Sensitive	x1.2	Internal APIs, CI/CD pipelines, executive endpoints
Normal	x1.0	Standard workstations, non-sensitive services

The Formula

def calculate_ic_score(signals: dict, asset_tier: str) -> float:
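    """
    Calculate the Investigation Confidence Score for an alert.

    signals: dict with keys matching SIGNAL_WEIGHTS, values 0.0-1.0.
        Missing signals count as 0.0.
    asset_tier: one of 'crown_jewel', 'sensitive', 'normal'.
        Unknown tiers fall back to the 'normal' multiplier (1.0).

    Returns a score from 0.0 to 100.0, rounded to one decimal place.
    """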
    SIGNAL_WEIGHTS = {
        "ip_threat_intel": 0.30,
        "threat_feed_score": 0.20,
        "historical_lookup": 0.20,
        "ai_triage_summary": 0.15,
        "employee_context": 0.10,
        "threat_intel_correlation": 0.10,
        "raw_payload_analysis": 0.05,
    }

    ASSET_MULTIPLIERS = {
        "crown_jewel": 1.5,
        "sensitive": 1.2,
        "normal": 1.0,
    }

    # Weighted sum of normalized signals
    raw_score = sum(
        signals.get(source, 0.0) * weight
        for source, weight in SIGNAL_WEIGHTS.items()
    )

    # Apply asset criticality multiplier, scale to 0-100, and cap at 100
    multiplier = ASSET_MULTIPLIERS.get(asset_tier, 1.0)
    final_score = min(raw_score * multiplier * 100, 100.0)

    return round(final_score, 1)
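
To see the multiplier at work, here is a worked example. The function is restated in condensed form so the snippet runs on its own, and the signal values are made up for illustration. Their weighted sum is 0.745.

```python
SIGNAL_WEIGHTS = {
    "ip_threat_intel": 0.30,
    "threat_feed_score": 0.20,
    "historical_lookup": 0.20,
    "ai_triage_summary": 0.15,
    "employee_context": 0.10,
    "threat_intel_correlation": 0.10,
    "raw_payload_analysis": 0.05,
}
ASSET_MULTIPLIERS = {"crown_jewel": 1.5, "sensitive": 1.2, "normal": 1.0}


def calculate_ic_score(signals: dict, asset_tier: str) -> float:
    # Same logic as the function above, condensed
    raw_score = sum(signals.get(s, 0.0) * w for s, w in SIGNAL_WEIGHTS.items())
    multiplier = ASSET_MULTIPLIERS.get(asset_tier, 1.0)
    return round(min(raw_score * multiplier * 100, 100.0), 1)


# Hypothetical alert: strong IP and feed hits, moderate everything else
example_signals = {
    "ip_threat_intel": 0.9,
    "threat_feed_score": 0.8,
    "historical_lookup": 0.5,
    "ai_triage_summary": 0.7,
    "employee_context": 0.3,
    "threat_intel_correlation": 0.6,
    "raw_payload_analysis": 0.4,
}

for tier in ASSET_MULTIPLIERS:
    print(tier, calculate_ic_score(example_signals, tier))
# crown_jewel 100.0
# sensitive 89.4
# normal 74.5
```

The same evidence scores 74.5 on a standard workstation and hits the 100 cap on a domain controller, which is the behavior the multiplier is there to produce.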

Decision Thresholds

The score alone means nothing without clear action boundaries. After weeks of tuning against historical incident data, we settled on four tiers:

Score Range	Action	Rationale
0-29	Auto-close	Low-confidence signals, known benign patterns, informational noise
30-69	Acknowledge and monitor	Suspicious but insufficient evidence for escalation. Ticket created, watchlist updated
70-84	Escalate for human review	High confidence of malicious activity. Analyst investigates with full context pre-loaded
85+	Auto-containment eligible	Critical threat on critical asset. Automated response (isolate host, revoke session) with immediate analyst notification
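
The table translates into a small lookup. One detail to settle first: scores are rounded to one decimal, so integer ranges like 0-29 and 30-69 leave values such as 29.5 unassigned. The sketch below assumes each tier starts at its lower bound and runs up to the next one; the action labels are illustrative names, not identifiers from the production system.

```python
def decide(score: float) -> str:
    """Map an IC Score (0-100) to an action tier using half-open ranges."""
    if score >= 85:
        return "auto_containment_eligible"
    if score >= 70:
        return "escalate_for_review"
    if score >= 30:
        return "acknowledge_and_monitor"
    return "auto_close"


for score in (12.0, 29.5, 30.0, 74.5, 89.4):
    print(score, decide(score))
# 12.0 auto_close
# 29.5 auto_close
# 30.0 acknowledge_and_monitor
# 74.5 escalate_for_review
# 89.4 auto_containment_eligible
```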

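The auto-containment tier (85+) is deliberately aggressive. It is meant to fire when multiple high-weight signals converge on a crown jewel asset, where the x1.5 multiplier does most of the lifting. The formula alone does not enforce that asset condition, though: a normal-tier alert whose weighted sum reaches 0.85 also lands in this tier. In practice, this triggers 2-3 times per week and has a false positive rate below 1%.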
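
How far the multiplier lowers the bar for containment follows directly from the formula. This sketch uses the multipliers from the asset table and computes the weighted signal sum each tier needs to reach 85.

```python
ASSET_MULTIPLIERS = {"crown_jewel": 1.5, "sensitive": 1.2, "normal": 1.0}
CONTAINMENT_THRESHOLD = 85.0

# score = weighted_sum * multiplier * 100, so solve for weighted_sum
required = {
    tier: CONTAINMENT_THRESHOLD / (multiplier * 100)
    for tier, multiplier in ASSET_MULTIPLIERS.items()
}

for tier, weighted_sum in required.items():
    print(f"{tier}: weighted sum >= {weighted_sum:.3f}")
# crown_jewel: weighted sum >= 0.567
# sensitive: weighted sum >= 0.708
# normal: weighted sum >= 0.850
```

If containment should be limited to crown jewel assets, that would need an explicit tier check alongside the score. Whether the production system has one is not stated here.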

Results

After three months in production:

Mean time to triage: < 2 minutes (down from 15-30 minutes)
Signal fidelity: 95% (validated against analyst post-incident reviews)
Human escalation rate: < 5% of total alert volume
Auto-close accuracy: 99.2% (spot-checked weekly by rotating analyst)
Analyst capacity freed: approximately 40 hours/week redirected to threat hunting

The biggest win was not speed – it was consistency. Every alert now receives the same depth of investigation regardless of queue size, time of day, or analyst fatigue.

Lessons Learned

Start with conservative thresholds, relax them over time. Our initial auto-close range was 0-19, not 0-29. We expanded it only after validating six weeks of decisions against manual review. Rushing threshold expansion is how you miss real incidents.

Always include a human override path. Any analyst can manually override any automated decision. The system flags the override, logs the reasoning, and feeds it back into calibration. In the first month, overrides helped us identify two signal sources that needed weight adjustments.

Log every decision for auditability. Every IC Score calculation is stored with its full signal breakdown, the decision taken, and a timestamp. When leadership asks “why was this alert closed?” – you can show exactly which signals contributed and what thresholds applied. This is non-negotiable for compliance and for continuous improvement.
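
A decision record along these lines can be as simple as one JSON line per score. The sketch below is an assumption about shape, not the actual schema: `ScoreRecord`, its field names, and the alert ID are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ScoreRecord:
    """One audit entry per IC Score calculation (hypothetical schema)."""
    alert_id: str
    signals: dict      # full per-source breakdown, values 0.0-1.0
    asset_tier: str
    score: float
    decision: str
    scored_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record: ScoreRecord) -> str:
    """Serialize a record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


record = ScoreRecord(
    alert_id="ALERT-001",
    signals={"ip_threat_intel": 0.9, "threat_feed_score": 0.8},
    asset_tier="sensitive",
    score=89.4,
    decision="auto_containment_eligible",
)
print(to_log_line(record))
```

Keeping the signal breakdown next to the decision is what makes the "why was this alert closed?" question answerable from the log alone.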

Weight calibration is never done. We re-evaluate weights quarterly using a confusion matrix of scored alerts vs. confirmed incidents. The AI triage signal started at 10% weight and earned its way to 15% after demonstrating consistent accuracy. Treat weights as living parameters.
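
The quarterly review boils down to counting outcomes. As a minimal sketch, assuming each historical alert is stored as a score plus a flag for whether it was a confirmed incident, the confusion matrix at the escalation threshold looks like this. The sample data is invented for illustration.

```python
def confusion_counts(scored_alerts, threshold=70.0):
    """Count outcomes for (score, confirmed_incident) pairs at a threshold."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for score, confirmed in scored_alerts:
        flagged = score >= threshold
        if flagged and confirmed:
            counts["tp"] += 1   # escalated, real incident
        elif flagged:
            counts["fp"] += 1   # escalated, benign
        elif confirmed:
            counts["fn"] += 1   # not escalated, real incident (the costly miss)
        else:
            counts["tn"] += 1   # not escalated, benign
    return counts


# Invented sample: (IC Score, was it a confirmed incident?)
history = [
    (92.0, True), (74.5, False), (65.0, True),
    (12.0, False), (88.1, True), (25.0, False),
]

counts = confusion_counts(history)
precision = counts["tp"] / (counts["tp"] + counts["fp"])
recall = counts["tp"] / (counts["tp"] + counts["fn"])
print(counts)                                  # {'tp': 2, 'fp': 1, 'fn': 1, 'tn': 2}
print(round(precision, 2), round(recall, 2))   # 0.67 0.67
```

Re-running the count after a candidate weight change shows whether the change moves false negatives in the right direction before it goes live.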

The formula is the easy part. The real engineering challenge is building reliable enrichment pipelines that feed normalized signals into the scoring function within seconds. If your threat intel lookup takes 45 seconds, your 2-minute triage target is already dead.
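
One way to protect the latency budget is to run lookups concurrently and give each a hard timeout. The sketch below is one possible approach using `asyncio`, not a description of the actual pipeline: `fake_lookup` stands in for real enrichment calls, and falling back to 0.0 on timeout is an assumption that mirrors how the scoring function treats a missing signal.

```python
import asyncio


async def lookup_with_timeout(name, coro, timeout_s, default=0.0):
    """Await one enrichment lookup, falling back to default if it is too slow."""
    try:
        return name, await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        return name, default


async def gather_signals(lookups: dict, timeout_s: float = 5.0) -> dict:
    """Run all lookups concurrently; total wait is bounded by timeout_s."""
    results = await asyncio.gather(
        *(lookup_with_timeout(name, coro, timeout_s) for name, coro in lookups.items())
    )
    return dict(results)


async def fake_lookup(value: float, delay_s: float) -> float:
    """Hypothetical stand-in for a real enrichment call."""
    await asyncio.sleep(delay_s)
    return value


signals = asyncio.run(gather_signals(
    {
        "ip_threat_intel": fake_lookup(0.9, 0.01),
        "threat_feed_score": fake_lookup(0.8, 1.0),  # slower than the budget
    },
    timeout_s=0.1,
))
print(signals)  # {'ip_threat_intel': 0.9, 'threat_feed_score': 0.0}
```

A timed-out source silently lowers the score, so it is worth recording which signals fell back to the default alongside the decision.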

The IC Score is not a replacement for skilled analysts – it is a force multiplier. It ensures that human attention goes where it matters most, and that the 95% of alerts that are noise get handled deterministically without burning out your team.