This project presents an end-to-end fraud detection solution developed as part of a technical assessment for a Data Scientist role at VOM. The objective is to design, evaluate, and document a machine learning model capable of identifying fraudulent card transactions under real-time decision constraints and asymmetric business costs.
The work follows the CRISP-DM methodology, covering business understanding, exploratory analysis, feature preparation, modeling, evaluation, and a proposed deployment and monitoring strategy. Two models were implemented and compared: a class-weighted Logistic Regression baseline and a Gradient Boosting model (XGBoost).
The XGBoost model achieved near-perfect performance on the synthetic dataset, with strong recall and precision while minimizing false positives. Threshold analysis highlights the importance of aligning technical decisions with business risk tolerance and operational capacity.
Beyond predictive performance, the project emphasizes production-oriented considerations such as governance, monitoring, interpretability, and scalability. This repository is intended as a practical reference for applied machine learning in risk and decision systems.
This repository documents a technical challenge proposed by VOM as part of a Data Scientist recruitment process. The objective is to design and evaluate a fraud detection model for card transactions using a structured data science methodology.
The project is published for educational and reference purposes: although it originated from a recruitment process, it documents a complete and realistic data science workflow from which candidates and practitioners can learn problem framing, modeling decisions, and the overall analytical approach.
Many technical assignments remain private, limiting collective learning and transparency around practical problem-solving. By open-sourcing this project, the goal is to provide a concrete reference for:
- Structuring an end-to-end machine learning project using CRISP-DM.
- Translating business objectives into measurable modeling goals.
- Handling class imbalance and asymmetric cost problems.
- Evaluating and comparing models beyond accuracy metrics.
- Incorporating deployment and monitoring considerations early in the design process.
The repository may serve as a template or learning resource for similar fraud detection and risk modeling problems. All data used is synthetic and does not represent real production systems.
VOM provides a low-code decision engine that enables companies to create, manage, and evolve automated decision policies (e.g., credit approval, fraud prevention, pricing, and risk management).
In this challenge, the role of the Data Scientist is to propose a fraud detection solution that could be integrated into such a decision engine. Transactions are evaluated in real time, and the model must balance fraud prevention with customer experience.
Approving a fraudulent transaction is significantly more costly than incorrectly declining a legitimate one, creating an asymmetric cost structure that directly influences model evaluation, metric selection, and decision threshold tuning.
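The asymmetry can be made concrete with a small cost calculation. The cost figures below are illustrative assumptions for this sketch, not values from the challenge:

```python
# Illustrative cost comparison for fraud decisions.
# Both cost figures are hypothetical assumptions, not values from the project.
COST_MISSED_FRAUD = 100.0   # approving a fraudulent transaction (chargeback, loss)
COST_FALSE_DECLINE = 5.0    # declining a legitimate transaction (friction, support)

def expected_cost(false_negatives: int, false_positives: int) -> float:
    """Total business cost implied by a confusion-matrix outcome."""
    return false_negatives * COST_MISSED_FRAUD + false_positives * COST_FALSE_DECLINE

# With asymmetric costs, 10 missed frauds cost far more than 10 false declines:
print(expected_cost(false_negatives=10, false_positives=0))   # 1000.0
print(expected_cost(false_negatives=0, false_positives=10))   # 50.0
```

Under this kind of cost structure, a threshold that trades a few extra false positives for higher fraud recall usually lowers total cost, which is why evaluation cannot rely on accuracy alone.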
The project explicitly follows the CRISP-DM methodology:
- Business Understanding
- Data Understanding
- Data Preparation
- Modeling
- Evaluation
- Deployment and Monitoring (conceptual proposal)
This structure ensures traceability between business objectives, analytical decisions, and operational considerations.
The dataset is synthetic and represents individual card transactions with behavioral and contextual features and a fraud label.
- distance_from_home — Distance between customer residence and transaction location.
- distance_from_last_transaction — Distance between current and previous transaction.
- ratio_to_median_purchase_price — Transaction amount relative to historical median.
- repeat_retailer — Whether the merchant was previously used by the customer.
- used_chip — Whether the card chip was used.
- used_pin_number — Whether a PIN was used.
- online_order — Whether the transaction occurred online.
- fraud (target) — Binary label (1 = fraud, 0 = legitimate).
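The feature schema above can be sketched in pandas. The two-row DataFrame below is a self-contained stand-in for the real dataset (the CSV file name is hypothetical):

```python
import pandas as pd

FEATURES = [
    "distance_from_home",
    "distance_from_last_transaction",
    "ratio_to_median_purchase_price",
    "repeat_retailer",
    "used_chip",
    "used_pin_number",
    "online_order",
]
TARGET = "fraud"

# In practice the data would be loaded from disk, e.g.:
#   df = pd.read_csv("transactions.csv")   # hypothetical file name
# A two-row stand-in with the same schema:
df = pd.DataFrame({
    "distance_from_home": [5.2, 120.7],
    "distance_from_last_transaction": [0.4, 85.1],
    "ratio_to_median_purchase_price": [1.1, 6.3],
    "repeat_retailer": [1, 0],
    "used_chip": [1, 0],
    "used_pin_number": [1, 0],
    "online_order": [0, 1],
    "fraud": [0, 1],
})

X, y = df[FEATURES], df[TARGET]
print(X.shape, y.mean())  # feature matrix shape and fraud rate
```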
The primary goal is to maximize fraud detection recall while maintaining an acceptable false positive rate to preserve customer experience and operational efficiency.
Model evaluation and threshold selection explicitly reflect the asymmetric business cost of fraud versus false declines.
This project uses synthetic data and a simplified business scenario intended solely for technical assessment and learning purposes. It does not represent real production systems or operational constraints.
- Python 3.x
- pandas, numpy
- scikit-learn
- XGBoost
- matplotlib, seaborn
- Jupyter Notebook
Two supervised classification models were evaluated: a class-weighted Logistic Regression baseline and an XGBoost model.
Logistic Regression (baseline):
- ROC-AUC: ~0.98
- Fraud Recall: ~0.95
- Fraud Precision: ~0.58
The model achieved strong discriminative power and high fraud recall but generated a high false positive rate (~48%). While suitable as a baseline, this behavior increases operational costs and customer friction, limiting production suitability in high-volume environments.
XGBoost:
- ROC-AUC: ~0.999
- Fraud Recall: ~1.00
- Fraud Precision: ~0.98
The XGBoost model achieved near-perfect performance on the validation set, detecting almost all fraud cases while keeping false positives very low. This balance delivers strong business value by reducing financial losses and minimizing unnecessary transaction declines.
Feature importance indicates that transaction amount relative to historical behavior, distance metrics, online transactions, and merchant recurrence are strong predictive signals.
Threshold tuning demonstrates the trade-off between recall and precision. Lower thresholds increase fraud capture but raise false positives, while higher thresholds reduce false alarms at the risk of missed fraud. Threshold selection should therefore be driven by business risk tolerance and operational capacity rather than default model settings.
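This trade-off can be explored with scikit-learn's `precision_recall_curve`. The 0.80 precision floor below is an illustrative business assumption, and the data is a synthetic stand-in:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with a ~9% positive class, mirroring the fraud rate.
X, y = make_classification(n_samples=5000, weights=[0.91, 0.09], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]

precision, recall, thresholds = precision_recall_curve(y_te, scores)

# Pick the lowest threshold that keeps precision above a business-chosen floor
# (maximizing recall subject to that floor). 0.80 is an illustrative value.
floor = 0.80
ok = precision[:-1] >= floor   # precision has one more entry than thresholds
chosen = thresholds[ok][0] if ok.any() else 0.5
print(f"threshold={chosen:.3f}")
```

In production, the floor (or an explicit cost function) would be agreed with business stakeholders and revisited as fraud patterns and review capacity change.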
XGBoost is recommended for production deployment due to its superior predictive performance, robustness to nonlinear relationships, and favorable balance between fraud prevention and customer experience.
Fraud detection involves asymmetric costs. Recall for fraud must be prioritized while controlling false positives. Accuracy alone is insufficient for decision-making.
Strong class imbalance (~9% fraud) was effectively handled using class weighting. Oversampling methods should be applied cautiously and validated carefully.
While Logistic Regression offers interpretability, it produced excessive false positives. Tree-based ensembles captured nonlinear patterns more effectively and delivered superior operational performance.
Thresholds directly affect fraud capture, manual review workload, and customer experience. They should be tuned collaboratively with business stakeholders and monitored continuously.
Near-perfect performance likely overestimates real-world behavior. Production systems face concept drift, noisy data, and adversarial dynamics, requiring continuous monitoring and retraining.
Reliable systems require deployment pipelines, model versioning, monitoring, logging, alerting, and safe rollout strategies (e.g., shadow mode or A/B testing).
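One common monitoring primitive is the Population Stability Index (PSI), which flags drift between the training score distribution and live traffic. A minimal sketch with illustrative synthetic data:

```python
import numpy as np

def population_stability_index(expected: np.ndarray, actual: np.ndarray,
                               bins: int = 10) -> float:
    """PSI between a reference (training) distribution and live data.
    Common rule of thumb: <0.10 stable, 0.10-0.25 moderate shift, >0.25 drift."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf            # cover the full real line
    e = np.histogram(expected, bins=edges)[0] / len(expected)
    a = np.histogram(actual, bins=edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)  # avoid log(0)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(0)
train_scores = rng.beta(2, 8, 10_000)        # reference score distribution
live_scores = rng.beta(2, 8, 10_000) + 0.1   # shifted live distribution
print(population_stability_index(train_scores, live_scores))
```

In practice this check would run on a schedule against both model scores and input features, with alerts feeding into the retraining and rollout process described above.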
Explainability supports regulatory compliance, operational trust, and debugging. Feature importance and model interpretation should be part of production workflows.