Imagine opening your security dashboard to find 10,000 alerts. Which one do you investigate first?

In 2024, GitGuardian discovered 23.7 million new hardcoded secrets on public GitHub, a 25% surge. 58% of these are "generic" secrets (passwords, database credentials, API keys) that traditional rule-based systems miss. Secrets appear in 31% of data breaches, take an average of 292 days to remediate, and 70% of the secrets leaked in 2022 remain exploitable today.

GitGuardian's Machine Learning automatically ranks incidents by risk, transforming overwhelming alert floods into actionable, prioritized queues. Our ML model examines each incident's context and computes a risk score, surfacing the most dangerous leaks first.

💡
The Impact: 3× Faster Incident Review
Our ML model has tripled security team review efficiency: analysts find nearly three times more critical threats when reviewing the same number of top-ranked incidents as they would with traditional severity rules.

Building the Foundation: Data, Features, and Expert Knowledge

Teaching Machines What "Dangerous" Means


Our ranking model uses supervised learning, trained on thousands of incidents manually labeled by cybersecurity experts across five severity levels (Info, Low, Medium, High, Critical).

Understanding severity in context: Not all secrets are created equal. Consider these real-world examples:

Critical Severity:

  • AWS access key with AdministratorAccess policy found in a public GitHub repository
  • Production database credentials hardcoded in the main branch Docker image
  • Stripe API key with full payment processing permissions exposed in client-side code

Low Severity:

  • Test API key for a development sandbox with no production access
  • Expired credentials for a decommissioned service
  • Example password in documentation (e.g., password123 used for illustration)

The difference is the blast radius and exploitability. We trained on our Good Samaritan program repository, with experts focusing on generic secrets—the fastest-growing leak category—within their specific contexts.

What the Model "Sees": Rich Contextual Features


We never feed actual secret values into the model. Instead, we use rich metadata: location (GitHub, GitLab, Slack), file type, branch (main vs. dev), accessibility (public vs. private), secret type, age, and number of occurrences.

The model also incorporates signals from two upstream ML modules:

  • Secret Enricher: classifies generic secrets by examining the surrounding code context
  • False-Positive Remover: filters out benign strings, reducing false positives by 80%

Together, these signals give the model a 360-degree view of each incident's exploitability.
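
As an illustration, a single incident's feature record might look like the sketch below. The field names and values are hypothetical stand-ins, not GitGuardian's actual schema; the point is that the secret's value itself never appears:

```python
# Hypothetical metadata-only feature record for one incident.
# Note: the secret's actual value is never included.
incident_features = {
    "source": "github",                 # location: GitHub, GitLab, Slack...
    "file_type": "dockerfile",
    "on_default_branch": True,          # main vs. dev
    "publicly_accessible": True,        # accessibility
    "secret_type": "database_credential",
    "age_days": 12,                     # time since first detection
    "occurrence_count": 3,              # how many places it appears
    # Signals from the two upstream ML modules:
    "enricher_label": "postgres_password",  # Secret Enricher classification
    "fp_remover_score": 0.04,                # False-Positive Remover confidence
}
```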

Under the Hood: Why We Chose XGBoost


Why XGBoost?


We selected XGBoost (eXtreme Gradient Boosting), an ensemble of hundreds of decision trees in which each tree learns to correct the errors of the ones before it. Three properties drove the choice:

  1. Speed: Millisecond predictions for thousands of incidents
  2. Efficiency: Optimized for tabular security data
  3. Interpretability: Feature importance scores show which factors (secret type, location, validity) most influence risk, building security team trust
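
To make this concrete, here is a minimal training sketch in the spirit of our setup. It uses synthetic stand-in data rather than real incident features, and the hyperparameters are illustrative, not our production values:

```python
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for incident metadata; 5 classes play the role of
# severity labels (0=Info, 1=Low, 2=Medium, 3=High, 4=Critical).
X, y = make_classification(
    n_samples=5000, n_features=20, n_informative=10,
    n_classes=5, random_state=42,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42,
)

model = xgb.XGBClassifier(
    objective="multi:softprob",   # one probability per severity class
    n_estimators=300,             # hundreds of trees, each correcting the last
    max_depth=6,
    learning_rate=0.1,
)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)

# Rank incidents by expected severity: a simple scalar risk score.
proba = model.predict_proba(X_val)                # shape (n_incidents, 5)
risk_scores = (proba * np.arange(5)).sum(axis=1)  # weighted by class index
ranked = np.argsort(risk_scores)[::-1]            # most dangerous first

# Interpretability: which features most influence the score?
top_features = np.argsort(model.feature_importances_)[::-1][:5]
```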

Human-in-the-Loop Refinement


We implemented a feedback loop with security analysts. When misrankings occurred, analysts flagged them for iterative retraining, ensuring the model reflects real-world security expertise rather than just statistical patterns. We also tuned the model for SecOps workflows, optimizing the quality of the top-ranked incidents rather than raw overall accuracy.
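
In its simplest form, such a feedback loop folds analyst-corrected labels back into the training set before the next retraining run. This is a minimal sketch under that assumption; the function and data shapes are hypothetical, not GitGuardian's implementation:

```python
import numpy as np

def retrain_with_feedback(model, X_train, y_train, flagged):
    """Fold analyst corrections back into the training set.

    `flagged` holds (feature_vector, corrected_severity) pairs collected
    whenever an analyst reports a misranked incident.
    """
    X_fb = np.array([features for features, _ in flagged])
    y_fb = np.array([severity for _, severity in flagged])

    # Append corrected examples; in practice they could be up-weighted
    # so the model pays extra attention to its past mistakes.
    X_new = np.vstack([X_train, X_fb])
    y_new = np.concatenate([y_train, y_fb])

    model.fit(X_new, y_new)
    return model
```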

Measuring Success: Beyond Simple Accuracy


Why "Percentage Correct" Fails


Imagine two models, both 90% accurate:

❌ Model A

Correctly identifies:

  • 9 out of 10 low-severity incidents

Misses:

  • The 1 critical breach

Result: False sense of security

✓ Model B

Correctly identifies:

  • The critical breach

Misclassifies:

  • Some low-severity incidents

Result: Real threats caught

Model B is vastly superior. We evaluate analyst value, not just accuracy, using specialized metrics:

Review Utility: measures the cumulative value of the top N incidents reviewed (Critical=10 pts, High=5 pts, Medium=2 pts, Low=1 pt).

Critical Precision & Recall: how often our "critical" flags are correct, and what percentage of true critical incidents we catch.

Coverage: can we score every incident?

Safe Pruning: can we auto-close low-risk incidents without missing real threats?
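
As an example, Review Utility could be computed along these lines. The sketch assumes the point values above, treats Info as 0 points (an assumption), and returns a per-incident average, which matches the ~10-point scale of the results below:

```python
# Point values from the Review Utility definition; Info=0 is an assumption.
SEVERITY_POINTS = {"Critical": 10, "High": 5, "Medium": 2, "Low": 1, "Info": 0}

def review_utility(ranked_severities, n=30):
    """Average point value of the top-n incidents in a ranked queue."""
    top = ranked_severities[:n]
    return sum(SEVERITY_POINTS[s] for s in top) / len(top)

# A queue whose top 30 is nearly all Critical scores close to 10:
print(review_utility(["Critical"] * 29 + ["High"], n=30))  # ~9.83
```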

The Results: ML vs. Rule-Based Prioritization


Our ML model dramatically outperforms rule-based baselines:

Metric                 | ML Model    | Rule-Based  | Improvement
-----------------------|-------------|-------------|--------------------------
Top-30 Review Utility  | ~9.7 points | ~3.4 points | 3× more value
Critical Precision     | 75%         | ~15%        | 5× fewer false alarms
Critical Recall        | ~72%        | ~14%        | 5× better detection
Coverage               | 100%        | ~18%        | No blind spots
NDCG (Ranking Quality) | ~0.95       | ~0.81       | Near-perfect ordering
Safe Pruning           | 36.7%       | ~2%         | 18× more noise reduction
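
For reference, NDCG (normalized discounted cumulative gain) rewards rankings that place the most severe incidents first, and can be computed with scikit-learn. The relevance values in this sketch reuse the Review Utility point scale, which is our assumption rather than the exact evaluation setup:

```python
import numpy as np
from sklearn.metrics import ndcg_score

# True severity values (relevance) and the model's predicted risk scores
# for five incidents; the 10/5/2/1 relevance scale is an assumption.
true_relevance = np.array([[10, 1, 5, 2, 1]])   # one Critical, one High, ...
model_scores   = np.array([[0.9, 0.2, 0.7, 0.3, 0.1]])

print(ndcg_score(true_relevance, model_scores))  # 1.0: perfect ordering here
```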

What This Means for Your Team


Faster Triage: Find 3× more critical threats in the same review time. 

Trustworthy Alerts: 75% precision on "critical" flags (vs. 15% for rules)—no more false alarm fatigue. 

Comprehensive Detection: Catch 72% of all critical leaks (vs. 14% for rules). 

No Blind Spots: 100% coverage vs. 18% for rules. 

Massive Noise Reduction: Safely auto-close 36.7% of incidents while missing only 2% of critical threats.

Real-World Impact for SecOps Teams


Daily Operations Transformation


Before: 10,000 unranked alerts, hours of manual triage, missed critical incidents, and a 292-day average remediation time.

After: a risk-ranked dashboard where three in four top "critical" alerts are genuine threats, 72% of critical leaks are surfaced automatically, low-priority incidents are auto-filtered, and detection time drops dramatically.

ML prioritization rebuilds trust: analysts can rely on "critical" flags (75% precision), safely defer "low" flags (minimal false negatives), and escape both alert fatigue and the anxiety of missing threats.

From Detection to Prevention

Our ML prioritization transforms millions of raw detections into actionable, risk-ranked queues. SecOps teams no longer guess which leak is most dangerous; the model surfaces it with measured, validated accuracy. This closes the gap between detection and prevention.

The stakes: 70% of 2022's leaked secrets remain valid, and secrets appear in 31% of breaches. Prioritization is the difference between proactive security and reactive crisis management.

Learn More About GitGuardian's ML-Powered Security

Interested in seeing how ML-based prioritization could transform your security operations?

Ready to experience prioritization that actually works? Request a demo to see our ML model in action with your own security data.