Capstone • 2025
Exploratory Data Analysis • Business Impact Analysis • Market Analysis & Deployment Strategy
Bank Telemarketing Propensity System
Leakage-free pre-call lead prioritization and decision support, using machine learning to improve outreach efficiency.
AUC (best): 0.81 (Neural Network), 0.80 (Logistic Regression)
Capture @ Top 10%: ~49% of subscribers
Capture @ Top 20%: ~72% of subscribers
Estimated savings: ~€165k per 41k-contact campaign

Problem
Telemarketing conversion is low, and each call has a real cost. The goal was to rank customers before outreach, without using post-call leakage signals such as call duration, so that teams can prioritize who to call first and measure lift in conversion and cost per acquisition.
Approach
- Defined the decision: pre-call ranking (what we know before calling) vs post-call outcomes (what happens during/after the call).
- Built leakage-free training data by excluding duration and other post-call signals; standardized preprocessing and evaluation.
- Trained and compared models (Logistic Regression, tree boosting, Neural Network) and validated with AUC + decile/capture metrics.
- Converted model scores into an operational call list: tiers/deciles + threshold guidance for different capacity levels.
- Added explainability (global + per-customer drivers) to support stakeholder trust and adoption.
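The leakage exclusion and the capture metric described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the names `drop_post_call_fields` and `capture_at_top` are hypothetical, only `duration` is named in this write-up as a post-call signal (the other excluded fields are not listed here), and the toy scores and labels are made up for demonstration.

```python
from typing import Dict, Sequence

# Assumption: only "duration" is named in the write-up; extend with any
# other fields that are known only during or after the call.
POST_CALL_FIELDS = {"duration"}


def drop_post_call_fields(record: Dict[str, object]) -> Dict[str, object]:
    """Return a copy of a lead record without post-call (leaky) fields."""
    return {k: v for k, v in record.items() if k not in POST_CALL_FIELDS}


def capture_at_top(scores: Sequence[float], labels: Sequence[int], fraction: float) -> float:
    """Share of all subscribers found in the top `fraction` of contacts by score."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("labels contain no positives")
    # Rank contacts from highest to lowest score and keep the top slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return sum(label for _, label in ranked[:k]) / total_positives


# Toy data (illustrative only): 10 contacts, 3 subscribers.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]

print(drop_post_call_fields({"age": 41, "duration": 312}))  # {'age': 41}
print(round(capture_at_top(toy_scores, toy_labels, 0.3), 2))  # 0.67
```

Unlike accuracy, this metric answers the operational question directly: if the team can only call a given share of the list, what share of eventual subscribers does that slice contain?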
Impact
- Enables a measurable pipeline from leads → ranked call list → tier-based outreach execution.
- Captures ~49% of subscribers in the top 10% of contacts and ~72% in the top 20% (capacity-friendly targeting).
- Estimated cost per acquisition drops from ~€44 to ~€13 (~70% lower), with ~€165k savings per ~41k-contact campaign (planning estimate).
- Improves adoption readiness via explainability + documented rollout, monitoring, and retraining plan.
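The headline figures can be sanity-checked with simple arithmetic. The sketch below uses only numbers quoted in this write-up; the helper names are illustrative, and it does not reproduce the underlying cost model, which is not detailed here.

```python
def lift(capture_share: float, contact_share: float) -> float:
    """How many times better than random the targeted slice performs."""
    return capture_share / contact_share


def relative_reduction(before: float, after: float) -> float:
    """Fractional drop from `before` to `after`."""
    return (before - after) / before


# Capture figures quoted above: ~49% in the top 10%, ~72% in the top 20%.
print(round(lift(0.49, 0.10), 1))  # 4.9
print(round(lift(0.72, 0.20), 1))  # 3.6

# Cost per acquisition quoted above: ~EUR 44 down to ~EUR 13.
print(round(relative_reduction(44, 13) * 100))  # 70

# Quoted savings spread over the quoted campaign size, in EUR per contact.
print(round(165_000 / 41_000, 2))  # 4.02
```

A lift of 4.9 means the top decile converts at roughly 4.9 times the campaign-wide rate, which is what makes capacity-limited targeting worthwhile.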
Deployment & Monitoring
- Integration point: CRM exports a daily lead list → scoring pipeline → ranked call list returned to the calling team.
- Rollout: A/B test (ranked list vs baseline) with clear success metrics (conversion/cost-per-acquisition) and a fixed evaluation window.
- Monitoring: track score drift, conversion drift, and tier performance; recalibrate thresholds monthly or quarterly.
- Governance: leakage guardrails (no duration), data quality checks, and documented retraining cadence.
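One way the score-drift check could be implemented is the Population Stability Index (PSI), which compares the score distribution at training time with the distribution on recent leads. This write-up does not say which drift metric the project uses, so PSI is a swapped-in example; the sketch also assumes scores are probabilities in [0, 1] and uses equal-width bins, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def bin_shares(scores: Sequence[float], bins: int, floor: float) -> List[float]:
    """Share of scores per equal-width bin on [0, 1], floored to avoid log(0)."""
    if not scores:
        raise ValueError("scores must not be empty")
    counts = [0] * bins
    for s in scores:
        # Clamp so that a score of exactly 1.0 lands in the last bin.
        index = min(max(int(s * bins), 0), bins - 1)
        counts[index] += 1
    return [max(c / len(scores), floor) for c in counts]


def population_stability_index(
    baseline: Sequence[float],
    recent: Sequence[float],
    bins: int = 10,
    floor: float = 1e-6,
) -> float:
    """PSI between baseline and recent score distributions (0 means identical)."""
    base = bin_shares(baseline, bins, floor)
    new = bin_shares(recent, bins, floor)
    return sum((n - b) * math.log(n / b) for b, n in zip(base, new))


# Toy data (illustrative only): a uniform baseline and a shifted batch.
baseline_scores = [i / 100 for i in range(100)]
shifted_scores = [min(s + 0.3, 1.0) for s in baseline_scores]

print(population_stability_index(baseline_scores, baseline_scores))  # 0.0
print(population_stability_index(baseline_scores, shifted_scores) > 0)  # True
```

Larger values indicate a bigger shift in the score distribution. Any alert threshold would need to be agreed with the calling team and checked against conversion drift and tier performance before triggering a retrain.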
Risks & Tradeoffs
- Target leakage: call duration is known only after the call and inflates performance, so it is explicitly excluded to keep results realistic for pre-call decisioning.
- Class imbalance: success rate is low, so thresholds and capture metrics matter more than accuracy.
- Operational constraints: calling capacity changes, so the system supports tiered targeting strategies.
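Tiered targeting under changing capacity can be illustrated as follows. This is a sketch under stated assumptions rather than the project's pipeline: the function names, the lead IDs, and the scores are hypothetical, and tiers are simple equal-sized rank groups (deciles when `n_tiers=10`).

```python
from typing import Dict, List, Optional, Tuple

TieredLead = Tuple[str, float, int]  # (lead_id, score, tier)


def rank_into_tiers(scores: Dict[str, float], n_tiers: int = 10) -> List[TieredLead]:
    """Sort leads best-first and assign equal-sized tiers; tier 1 is the top group."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    n = len(ranked)
    return [(lead, score, i * n_tiers // n + 1) for i, (lead, score) in enumerate(ranked)]


def call_list_for_capacity(
    tiered: List[TieredLead], capacity: int
) -> Tuple[List[str], Optional[float]]:
    """Pick the best `capacity` leads and report the score cutoff this implies."""
    selected = tiered[: max(0, capacity)]
    cutoff = selected[-1][1] if selected else None
    return [lead for lead, _, _ in selected], cutoff


# Toy data (illustrative only): four scored leads, split into two tiers.
toy_leads = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.7}
tiered = rank_into_tiers(toy_leads, n_tiers=2)

print(tiered)  # [('a', 0.9, 1), ('d', 0.7, 1), ('c', 0.5, 2), ('b', 0.1, 2)]
print(call_list_for_capacity(tiered, capacity=3))  # (['a', 'd', 'c'], 0.5)
```

Deriving the cutoff from the day's capacity, rather than fixing a probability threshold, keeps the call list the right size when staffing changes.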
Tools
Python, Tableau, MS Office