The IC Score: An AI-Powered Formula for SOC Alert Triage
The Problem: Manual Triage Does Not Scale
Our SOC had a scaling problem that no amount of hiring could solve.
Every morning, the queue held 100+ alerts waiting for human eyes. Each alert required 15-30 minutes of investigation: pull SIEM logs, check threat intel feeds, correlate with identity data, review historical patterns, assess asset criticality. Multiply that by the volume and you get analysts drowning in repetitive work while genuinely dangerous alerts sit buried in the pile.
The real issue was not just speed – it was consistency. Two analysts investigating the same alert would reach different conclusions depending on which data sources they checked, how deep they went, and how much fatigue had accumulated. There was no structured scoring, no repeatable methodology, and no clear escalation criteria.
I needed a system that could score alerts deterministically, using every available signal source, and surface only the alerts that truly required human judgment.
Designing the IC Score: A Weighted Scoring Formula
The IC Score (Investigation Confidence Score) is a weighted composite that pulls from 7 signal sources, each contributing a normalized value between 0 and 1. The weights reflect each source’s reliability and predictive power, calibrated over months of production data.
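The post does not say how each source is scaled into the 0 to 1 range. As one possible approach, assuming a source reports a bounded numeric value, min-max scaling with clamping could look like the sketch below. The `normalize` helper and the 0-100 feed scale are hypothetical, not part of the original system.

```python
def normalize(value: float, low: float, high: float) -> float:
    """Min-max scale value from [low, high] into [0.0, 1.0].

    Out-of-range input is clamped so the result never leaves [0.0, 1.0].
    Hypothetical helper: the real pipeline may normalize differently.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    scaled = (value - low) / (high - low)
    return max(0.0, min(1.0, scaled))


# Assumed example: a threat feed that reports scores on a 0-100 scale.
feed_signal = normalize(75, 0, 100)
```

Clamping matters here because the scoring formula assumes every input is already within 0 to 1; an unclamped outlier from one source would otherwise distort the weighted sum.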
Signal Sources and Weights
| # | Signal Source | Weight | What It Measures |
|---|---|---|---|
| 1 | IP Threat Intelligence (SIEM enrichment) | 30% | Known malicious IPs, ASN reputation, geo-anomalies |
| 2 | Threat Feed Score | 20% | IOC matches against curated commercial and open feeds |
| 3 | Historical Data Lake Lookup | 20% | Has this entity/pattern appeared in past incidents? |
| 4 | AI Triage Summary | 15% | LLM-generated severity assessment from raw alert context |
| 5 | Employee Context (identity lookup) | 10% | Role, department, access level, recent activity patterns |
| 6 | Threat Intelligence Correlation | 10% | Cross-reference with active campaigns and TTPs |
| 7 | Raw Payload Analysis | 5% | Pattern matching on the actual event payload |
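Note that the weights intentionally sum to 110%, not 100%. A perfect 1.0 on every signal therefore produces a raw score of 1.10 (110 on the 0-100 scale) before any asset multiplier, and the final min(..., 100.0) clamp in the formula caps the result at 100. The extra 10% acts as headroom: an alert can reach the ceiling without every signal sitting at its maximum.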
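A quick check of that arithmetic, using the weights from the table. The key names follow the ones used in the scoring function further down.

```python
# Weights copied from the signal table above.
SIGNAL_WEIGHTS = {
    "ip_threat_intel": 0.30,
    "threat_feed_score": 0.20,
    "historical_lookup": 0.20,
    "ai_triage_summary": 0.15,
    "employee_context": 0.10,
    "threat_intel_correlation": 0.10,
    "raw_payload_analysis": 0.05,
}

total_weight = sum(SIGNAL_WEIGHTS.values())

# If every signal reports 1.0, the weighted sum equals the total weight.
uncapped = total_weight * 100
capped = min(uncapped, 100.0)

print(round(total_weight, 2), round(uncapped, 1), capped)
```

The uncapped value lands at 110, and the cap brings it back to 100.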
Asset Criticality Multiplier
Not all assets are equal. An alert on a domain controller or production database deserves more attention than one on a developer laptop. The criticality multiplier adjusts the raw score:
| Asset Tier | Multiplier | Examples |
|---|---|---|
| Crown Jewel | x1.5 | Domain controllers, key management, production databases |
| Sensitive | x1.2 | Internal APIs, CI/CD pipelines, executive endpoints |
| Normal | x1.0 | Standard workstations, non-sensitive services |
The Formula
def calculate_ic_score(signals: dict, asset_tier: str) -> float:
    """
    Calculate the Investigation Confidence Score for an alert.

    signals: dict with keys matching SIGNAL_WEIGHTS, values 0.0-1.0
    asset_tier: one of 'crown_jewel', 'sensitive', 'normal'
    """
    SIGNAL_WEIGHTS = {
        "ip_threat_intel": 0.30,
        "threat_feed_score": 0.20,
        "historical_lookup": 0.20,
        "ai_triage_summary": 0.15,
        "employee_context": 0.10,
        "threat_intel_correlation": 0.10,
        "raw_payload_analysis": 0.05,
    }
    ASSET_MULTIPLIERS = {
        "crown_jewel": 1.5,
        "sensitive": 1.2,
        "normal": 1.0,
    }

    # Weighted sum of normalized signals
    raw_score = sum(
        signals.get(source, 0.0) * weight
        for source, weight in SIGNAL_WEIGHTS.items()
    )

    # Apply asset criticality multiplier
    multiplier = ASSET_MULTIPLIERS.get(asset_tier, 1.0)
    final_score = min(raw_score * multiplier * 100, 100.0)
    return round(final_score, 1)
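To make the effect of the multiplier concrete, here is a worked example. The `ic_score` function is a condensed restatement of the formula above so that the snippet runs on its own, and the sample signal values are invented for illustration.

```python
WEIGHTS = {
    "ip_threat_intel": 0.30,
    "threat_feed_score": 0.20,
    "historical_lookup": 0.20,
    "ai_triage_summary": 0.15,
    "employee_context": 0.10,
    "threat_intel_correlation": 0.10,
    "raw_payload_analysis": 0.05,
}
MULTIPLIERS = {"crown_jewel": 1.5, "sensitive": 1.2, "normal": 1.0}


def ic_score(signals: dict, asset_tier: str) -> float:
    """Condensed restatement of calculate_ic_score (same arithmetic)."""
    raw = sum(signals.get(name, 0.0) * w for name, w in WEIGHTS.items())
    return round(min(raw * MULTIPLIERS.get(asset_tier, 1.0) * 100, 100.0), 1)


# Hypothetical alert: strong IP and feed hits, moderate everything else.
# Weighted sum: 0.27 + 0.16 + 0.10 + 0.09 + 0.03 + 0.04 + 0.01 = 0.70
sample = {
    "ip_threat_intel": 0.9,
    "threat_feed_score": 0.8,
    "historical_lookup": 0.5,
    "ai_triage_summary": 0.6,
    "employee_context": 0.3,
    "threat_intel_correlation": 0.4,
    "raw_payload_analysis": 0.2,
}

for tier in ("normal", "sensitive", "crown_jewel"):
    print(tier, ic_score(sample, tier))
# normal 70.0
# sensitive 84.0
# crown_jewel 100.0
```

The same evidence scores 70.0 on a normal asset, 84.0 on a sensitive one, and hits the 100.0 cap on a crown jewel (the uncapped value would be 105).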
Decision Thresholds
The score alone means nothing without clear action boundaries. After weeks of tuning against historical incident data, we settled on four tiers:
| Score Range | Action | Rationale |
|---|---|---|
| 0-29 | Auto-close | Low-confidence signals, known benign patterns, informational noise |
| 30-69 | Acknowledge and monitor | Suspicious but insufficient evidence for escalation. Ticket created, watchlist updated |
| 70-84 | Escalate for human review | High confidence of malicious activity. Analyst investigates with full context pre-loaded |
| 85+ | Auto-containment eligible | Critical threat on critical asset. Automated response (isolate host, revoke session) with immediate analyst notification |
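The table maps onto a small lookup function. This is a sketch, not the production routing code: the `decide` name and the action labels are hypothetical, and because scores are rounded to one decimal, it assumes each range's lower bound is an inclusive threshold (so 29.5 auto-closes and 84.9 escalates).

```python
def decide(score: float) -> str:
    """Map an IC Score (0-100) to one of the four action tiers.

    Assumption: lower bounds are inclusive thresholds, which settles
    where fractional scores such as 29.5 or 84.9 fall.
    """
    if score >= 85:
        return "auto_containment_eligible"
    if score >= 70:
        return "escalate_for_human_review"
    if score >= 30:
        return "acknowledge_and_monitor"
    return "auto_close"


for s in (12.0, 29.5, 30.0, 69.9, 70.0, 84.9, 85.0):
    print(s, decide(s))
```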
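The auto-containment tier (85+) is deliberately aggressive. It is meant to fire only when multiple high-weight signals converge on a crown jewel asset. The formula does not enforce the asset condition by itself, since a high enough raw score reaches 85 on any tier; the multiplier only means crown jewel assets get there first. In practice, this triggers 2-3 times per week and has a false positive rate below 1%.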
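To see how the multiplier shapes this tier, the sketch below computes the minimum raw weighted score each asset tier needs to reach 85, using the multipliers and the 1.10 weight total from earlier. The variable names are illustrative.

```python
ASSET_MULTIPLIERS = {"crown_jewel": 1.5, "sensitive": 1.2, "normal": 1.0}
CONTAINMENT_THRESHOLD = 85.0
MAX_RAW = 1.10  # sum of all signal weights

# Solve raw * multiplier * 100 >= 85 for raw, per tier.
required = {
    tier: CONTAINMENT_THRESHOLD / (m * 100)
    for tier, m in ASSET_MULTIPLIERS.items()
}

for tier, raw in required.items():
    print(f"{tier}: raw >= {raw:.3f} ({raw / MAX_RAW:.0%} of the maximum)")
```

A crown jewel needs a raw score of roughly 0.57, a sensitive asset about 0.71, and a normal asset 0.85. If containment should be limited strictly to crown jewels, that would need an explicit asset-tier check in addition to the score threshold.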
Results
After three months in production:
- Mean time to triage: < 2 minutes (down from 15-30 minutes)
- Signal fidelity: 95% (validated against analyst post-incident reviews)
- Human escalation rate: < 5% of total alert volume
- Auto-close accuracy: 99.2% (spot-checked weekly by a rotating analyst)
- Analyst capacity freed: approximately 40 hours/week redirected to threat hunting
The biggest win was not speed – it was consistency. Every alert now receives the same depth of investigation regardless of queue size, time of day, or analyst fatigue.
Lessons Learned
Start with conservative thresholds, then loosen them over time. Our initial auto-close range was 0-19, not 0-29. We expanded it only after validating six weeks of decisions against manual review. Rushing threshold expansion is how you miss real incidents.
Always include a human override path. Any analyst can manually override any automated decision. The system flags the override, logs the reasoning, and feeds it back into calibration. In the first month, overrides helped us identify two signal sources that needed weight adjustments.
Log every decision for auditability. Every IC Score calculation is stored with its full signal breakdown, the decision taken, and a timestamp. When leadership asks “why was this alert closed?” – you can show exactly which signals contributed and what thresholds applied. This is non-negotiable for compliance and for continuous improvement.
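One way to structure such a record is shown below, as a minimal sketch. The `ScoreAuditRecord` class, its field names, and the JSON-lines output are assumptions made for illustration; the post does not describe the actual storage format.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ScoreAuditRecord:
    """One stored scoring decision (hypothetical schema)."""
    alert_id: str
    asset_tier: str
    signals: dict   # full per-source breakdown, values 0.0-1.0
    score: float
    decision: str
    # UTC timestamp captured when the record is created.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record: ScoreAuditRecord) -> str:
    """Serialize a record as one JSON line, with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = ScoreAuditRecord(
    alert_id="ALERT-0001",  # made-up identifier
    asset_tier="normal",
    signals={"ip_threat_intel": 0.9},
    score=27.0,
    decision="auto_close",
)
print(to_log_line(record))
```

Keeping the full signal breakdown in the record is what lets a later reviewer recompute the score and confirm which threshold applied.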
Weight calibration is never done. We re-evaluate weights quarterly using a confusion matrix of scored alerts vs. confirmed incidents. The AI triage signal started at 10% weight and earned its way to 15% after demonstrating consistent accuracy. Treat weights as living parameters.
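The confusion matrix behind that review can be sketched as follows. The sample data is invented, and treating a score of 70 or more as a "positive" is an assumption; the post does not say which cutoff the quarterly review uses or how the results translate into weight changes.

```python
from collections import Counter


def confusion_matrix(scored_alerts, threshold: float = 70.0) -> Counter:
    """Count outcomes for (score, was_confirmed_incident) pairs.

    An alert counts as flagged when its score meets the threshold.
    """
    counts = Counter()
    for score, confirmed in scored_alerts:
        flagged = score >= threshold
        if flagged and confirmed:
            counts["tp"] += 1
        elif flagged:
            counts["fp"] += 1
        elif confirmed:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def precision_recall(counts: Counter) -> tuple:
    """Return (precision, recall), using 0.0 when a denominator is zero."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up review data: (IC Score, confirmed incident?)
history = [(92.0, True), (75.0, False), (40.0, True),
           (12.0, False), (88.0, True), (25.0, False)]
counts = confusion_matrix(history)
precision, recall = precision_recall(counts)
print(dict(counts), round(precision, 2), round(recall, 2))
```

False negatives (confirmed incidents that scored below the cutoff) are the costly cell here, so recall is worth tracking alongside precision when adjusting weights.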
The formula is the easy part. The real engineering challenge is building reliable enrichment pipelines that feed normalized signals into the scoring function within seconds. If your threat intel lookup takes 45 seconds, your 2-minute triage target is already dead.
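One common way to keep slow lookups from blocking the score is to run enrichments concurrently with a per-source deadline. The sketch below uses `asyncio` for this; `fetch_signal` is a stand-in that sleeps instead of calling a real service, and the timeout value is arbitrary. Defaulting a timed-out source to 0.0 mirrors the `signals.get(source, 0.0)` behavior in the formula, though whether the production pipeline does this is an assumption.

```python
import asyncio


async def fetch_signal(delay: float, value: float) -> float:
    """Stand-in for a real enrichment call; sleeps instead of doing I/O."""
    await asyncio.sleep(delay)
    return value


async def enrich(lookups: dict, timeout: float) -> dict:
    """Run all lookups concurrently; a source that misses the deadline scores 0.0."""

    async def guarded(name, coro):
        try:
            return name, await asyncio.wait_for(coro, timeout=timeout)
        except asyncio.TimeoutError:
            return name, 0.0

    pairs = await asyncio.gather(
        *(guarded(name, coro) for name, coro in lookups.items())
    )
    return dict(pairs)


signals = asyncio.run(enrich(
    {
        "ip_threat_intel": fetch_signal(0.01, 0.9),
        "threat_feed_score": fetch_signal(0.5, 0.8),  # slower than the deadline
    },
    timeout=0.1,
))
print(signals)
```

A timed-out source silently lowers the score, so it is worth recording which sources were missing so that an under-scored alert can be re-evaluated once the data arrives.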
The IC Score is not a replacement for skilled analysts – it is a force multiplier. It ensures that human attention goes where it matters most, and that the 95% of alerts that are noise get handled deterministically without burning out your team.