ENTRY NO. 02 · SaaS / B2B Revenue Operations · Live delivery · Lead Scoring / Feature Engineering · LIVE

Revon

Identity · Revenue signal engine

AI revenue signal engine — lead scoring, prioritisation, and next-best-action from inbound noise

Lead Scoring / Feature Engineering
Problem

No scoring layer — inbound lead triage relied entirely on manual gut-feel review.

System Flow
6 stages
Signature Module · Revenue signal engine


See how an inbound lead is decomposed into features, scored, ranked, and handed off with reasoning attached.

Lead Scoring Trace · Prototype Benchmark · live
INPUT   Acme Corp · Series B · 85 employees
        ↓ Clearbit enrich + LLM extraction · 1.2s
FEATS   ICP alignment ·················· 0.82
        Intent keyword density ·········· 0.71
        Tech stack overlap ·············· 0.68
        Hiring signal (RevOps) ·········· HIGH
        ↓ XGBoost · 43 features · <5ms inference
SCORE   0.74 → HOT TIER
        ↓ SHAP explanation + outreach draft
ROUTE   AE notified via Slack · CRM updated · 1.8s total

Business Impact

Outcomes
Before

Lead triage method: Manual gut-feel review

->
After

Lead triage method: Calibrated ML score + LLM signals

Inbound leads are scored, ranked, and routed automatically. High-intent signals surface to reps within minutes with a structured qualification card. The modelled pipeline opportunity uplift is €850k, based on historical conversion rates applied to improved triage. The reduction in manual review overhead is estimated at 11h/week per rep, time that is redirected to active selling. Both figures are modelled estimates, not measured outcomes.

€850k · Modelled pipeline uplift
11h/wk · Manual review reduction
78% · Top-3 prioritisation accuracy

Engineering Evaluation

Nightly eval
78% · Top-3 Prioritisation Accuracy
91% · Enrichment Completeness
89% · Score Consistency
82% · SHAP Signal Accuracy
Headline business result: €850k modelled pipeline uplift

Why This Is Hard

4 engineering challenges
Challenge · 01

Feature distribution shift across lead sources

LinkedIn and form leads have systematically different signal distributions. Solved via source-specific feature scaling and a source-channel indicator feature that lets the model adapt to distribution differences without separate per-source models.
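A minimal sketch of the idea, assuming z-score scaling fitted separately per source and a one-hot source indicator. The field names (`source`, `intent_density`) and the sample values are hypothetical, and the production scaler may differ.

```python
from collections import defaultdict
from statistics import mean, pstdev

def fit_source_scalers(rows, feature):
    """Compute a (mean, std) pair per lead source for one numeric feature."""
    by_source = defaultdict(list)
    for row in rows:
        by_source[row["source"]].append(row[feature])
    # Fall back to 1.0 when a source has zero variance, so division is safe.
    return {s: (mean(v), pstdev(v) or 1.0) for s, v in by_source.items()}

def transform(row, feature, scalers, sources):
    """Return the source-scaled feature plus a one-hot source indicator."""
    mu, sigma = scalers[row["source"]]
    out = {feature: (row[feature] - mu) / sigma}
    for s in sources:
        out[f"source_is_{s}"] = 1 if row["source"] == s else 0
    return out

# Hypothetical leads: form leads show systematically higher raw values.
rows = [
    {"source": "linkedin", "intent_density": 0.2},
    {"source": "linkedin", "intent_density": 0.4},
    {"source": "form", "intent_density": 0.6},
    {"source": "form", "intent_density": 0.8},
]
scalers = fit_source_scalers(rows, "intent_density")
sources = sorted(scalers)
features = [transform(r, "intent_density", scalers, sources) for r in rows]
```

After scaling, the weakest lead of each source lands at about the same value, while the indicator columns let a single model still learn any residual per-source effect.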

Challenge · 02

LLM semantic score calibration

Raw GPT-4o-mini scores clustered near 0.5 without explicit calibration. Resolved by prompting the model to use the full 0–1 scale with anchor examples at each decile, followed by a distribution normalisation step that enforces a uniform spread across the scoring output.
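One way to enforce a uniform spread is a rank-based (empirical CDF) mapping. The sketch below is an assumption about how such a normalisation step could look, not the step actually used; the function name and sample scores are hypothetical.

```python
def uniform_normalise(raw_scores):
    """Map scores to evenly spaced values in (0, 1) by rank.

    Tied scores share the average of their ranks, and the original
    ordering of the scores is preserved.
    """
    n = len(raw_scores)
    order = sorted(range(n), key=lambda i: raw_scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of scores equal to the one at position i.
        while j + 1 < n and raw_scores[order[j + 1]] == raw_scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return [(r - 0.5) / n for r in ranks]

# Hypothetical raw LLM scores bunched around 0.5.
raw = [0.48, 0.52, 0.50, 0.55, 0.47]
normalised = uniform_normalise(raw)
```

In a live system the mapping would more likely be fitted on a reference batch and then applied to new scores, so that a single incoming lead can be normalised without re-ranking the whole population.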

Challenge · 03

Recency bias in training data

The model initially over-weighted recently contacted leads due to label timing effects in the CRM data. Fixed via time-normalised feature engineering and a temporal train/test split that evaluates on a future cohort, not a random split of the same period.
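The temporal split can be sketched as a simple cutoff on lead creation date. The record fields (`created`, `converted`), the dates, and the cutoff are hypothetical placeholders.

```python
from datetime import date

def temporal_split(leads, cutoff):
    """Train on leads created before `cutoff`; evaluate on those created on or after it."""
    train = [lead for lead in leads if lead["created"] < cutoff]
    test = [lead for lead in leads if lead["created"] >= cutoff]
    return train, test

# Hypothetical CRM records.
leads = [
    {"id": 1, "created": date(2024, 1, 10), "converted": 0},
    {"id": 2, "created": date(2024, 3, 5), "converted": 1},
    {"id": 3, "created": date(2024, 6, 20), "converted": 0},
    {"id": 4, "created": date(2024, 8, 2), "converted": 1},
]
train, test = temporal_split(leads, cutoff=date(2024, 6, 1))
```

Because every test lead is newer than every training lead, the evaluation mimics scoring leads the model has never seen in time, which a random split of the same period would not.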

Challenge · 04

Sparse data for early-stage startups

~18% of inbound leads have no Clearbit coverage. LLM web extraction fills 73% of those gaps, but the remaining 27% degrade enrichment quality. A completeness score gate (minimum 5 core fields) flags low-coverage records before scoring and adds a data confidence label to the qualification card.
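A completeness gate of this kind could look roughly like the following. Only the minimum of 5 core fields comes from the text above; the list of core fields and the label values are hypothetical.

```python
# Hypothetical set of core enrichment fields; the real list is not given.
CORE_FIELDS = (
    "industry", "employee_count", "funding_stage", "country",
    "tech_stack", "website", "description",
)
MIN_CORE_FIELDS = 5  # threshold stated in the write-up

def completeness_gate(record):
    """Count populated core fields and attach a data confidence label."""
    present = [f for f in CORE_FIELDS if record.get(f) not in (None, "", [])]
    passes = len(present) >= MIN_CORE_FIELDS
    return {
        "fields_present": len(present),
        "passes_gate": passes,
        # Label values are illustrative only.
        "data_confidence": "normal" if passes else "low",
    }

well_covered = {
    "industry": "SaaS", "employee_count": 85, "funding_stage": "Series B",
    "country": "DE", "website": "https://example.com",
}
sparse = {"industry": "SaaS", "website": "https://example.com", "tech_stack": []}

covered_result = completeness_gate(well_covered)
sparse_result = completeness_gate(sparse)
```

Empty strings and empty lists count as missing, so a record that carries the key but no value does not inflate its completeness count.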

Engineering Depth


An XGBoost binary classifier is trained on CRM outcome data with a temporal train/test split. Features: 60% structured firmographic signals, 40% LLM-extracted semantic signals. SHAP TreeExplainer computes per-prediction feature contributions, which are displayed on AE qualification cards. The model is designed for monthly retraining on a rolling 18-month window; retraining is triggered automatically when feature drift metrics exceed their threshold.
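The drift metric is not named, so the sketch below assumes the Population Stability Index (PSI) on a feature scaled to [0, 1], with the commonly quoted 0.2 rule-of-thumb threshold. The metric, the threshold, and the function names are all assumptions for illustration.

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between two samples of a feature in [0, 1]."""
    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp so that out-of-range values fall into the edge bins.
            idx = max(0, min(int(v * bins), bins - 1))
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

DRIFT_THRESHOLD = 0.2  # assumed rule-of-thumb value, not taken from the source

def should_retrain(reference, current):
    """Flag retraining when the feature distribution has drifted past the threshold."""
    return psi(reference, current) > DRIFT_THRESHOLD

# Hypothetical feature samples: training-time reference vs. a shifted recent batch.
reference = [i / 100 for i in range(100)]
shifted = [min(v + 0.3, 0.99) for v in reference]
```

Here `should_retrain(reference, reference)` is false because identical samples give a PSI of zero, while `should_retrain(reference, shifted)` is true because the lowest bins have emptied out.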

Key numbers
1.8s · Avg end-to-end latency (model inference <5ms)
43 · Total input features
Built with: Lead Scoring · Feature Engineering · XGBoost · LLM Features · RevOps