An Enterprise-Grade Credit Card Fraud Detection System


Overview

CFD is a modular, high-performance fraud detection system designed to identify malicious financial transactions with high accuracy. Built on the PaySim dataset, it processes millions of transaction logs to flag fraud in real time, using ensemble learning techniques to handle extreme class imbalance.

Tip

Deep Dive: For a technical breakdown, see the System Architecture Guide. For a business perspective on value and usage, including impact and reliability analysis, see the Executive Summary.

Key Features

  • Advanced AI Modeling: Utilizes weighted Random Forest and Gradient Boosting classifiers.
  • Enterprise Architecture: Fully modular design with separate data ingestion, feature engineering, and evaluation layers.
  • Audit Compliance: Comprehensive JSON-based audit logging for every system action.
  • Automated Reporting: Generates instant performance metrics (ROC-AUC, Confusion Matrix).
  • Production Ready: Rapid train/predict capabilities via the CLI.
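As an illustration of the weighted Random Forest mentioned above, here is a minimal sketch assuming scikit-learn and its `class_weight="balanced"` option. The synthetic data from `make_classification`, the 1% positive rate, and all hyperparameters are assumptions chosen for the example; they are not taken from the project's `src/model.py`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for transaction features: roughly 1% positives (fraud).
X, y = make_classification(
    n_samples=5000,
    n_features=10,
    weights=[0.99, 0.01],
    random_state=42,
)

# Stratify so the rare class appears in both splits at the same rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# class_weight="balanced" reweights samples inversely to class frequency,
# so errors on the rare fraud class cost more during training.
model = RandomForestClassifier(
    n_estimators=100,
    class_weight="balanced",
    random_state=42,
)
model.fit(X_train, y_train)

# Probability of the positive (fraud) class for each held-out row.
fraud_scores = model.predict_proba(X_test)[:, 1]
```

Class weighting is only one way to address imbalance; resampling the training set is a common alternative, and which approach the project uses is not stated here.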

Model Performance

The current release features a Random Forest Classifier trained on 6.3 million transactions.

Metric      Score   Notes
ROC-AUC     0.999   Excellent discrimination capability
Precision   1.00    Zero false positives on test set
Recall      1.00    100% fraud detection rate on test set
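For readers less familiar with these metrics, the sketch below shows how precision, recall, and ROC-AUC can be derived from labels and scores using only the standard library. The labels, scores, and the 0.5 threshold are hypothetical illustration values, not results from this project, and the helper names are not part of `src/evaluation.py`.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN, FN for binary labels (1 = fraud, 0 = legitimate)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["fn" if truth == 1 else "tn"] += 1
    return counts


def precision(c: Dict[str, int]) -> float:
    """Share of flagged transactions that were actually fraud."""
    return c["tp"] / (c["tp"] + c["fp"])


def recall(c: Dict[str, int]) -> float:
    """Share of actual fraud cases that were flagged."""
    return c["tp"] / (c["tp"] + c["fn"])


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random fraud case outscores a random legitimate
    one, counting ties as half (the pairwise definition of ROC-AUC)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores, for illustration only.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.05, 0.3, 0.6, 0.15, 0.9, 0.4]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

counts = confusion_counts(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

The pairwise AUC loop is quadratic in the number of samples, so it suits small examples only; a library implementation such as scikit-learn's `roc_auc_score` would be the practical choice at the scale of millions of transactions.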

Important

Performance Note: This model is trained on synthetic data (PaySim), where fraud patterns are often deterministic (e.g., specific account-emptying rules). The near-perfect scores (0.999 AUC) are therefore expected for this dataset; in a noisy real-world environment they would likely be lower (roughly 0.95 or above). We have verified that the scores are not due to target leakage (see Feature Analysis).

The trained model is available at models/fraud_model.pkl.

Project Structure

credit_card_fraud/
├── data/               # Dataset storage
├── docs/               # System Documentation
│   └── SYSTEM_OVERVIEW.md
├── src/                # Core Logic
│   ├── data_loader.py  # Ingestion & Cleaning
│   ├── features.py     # Feature Engineering
│   ├── model.py        # Model Definition
│   ├── evaluation.py   # Reporting
│   └── utils.py        # Logging
├── tests/              # Unit Tests
├── logs/               # Audit Logs
├── models/             # Serialized Models
└── main.py             # CLI Entry Point
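The `logs/` directory and the logging role of `utils.py` correspond to the JSON-based audit logging listed under Key Features. A minimal sketch of one way to emit one JSON object per log line with the standard `logging` module follows; the logger name `cfd.audit`, the field names, and the `details` convention are all hypothetical and may differ from what `utils.py` actually does.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "action": record.getMessage(),
            # Optional context passed via extra={"details": {...}}.
            "details": getattr(record, "details", {}),
        }
        return json.dumps(entry)


def get_audit_logger(handler: logging.Handler) -> logging.Logger:
    """Attach a JSON-formatting handler to a dedicated audit logger."""
    logger = logging.getLogger("cfd.audit")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False  # keep audit entries out of the root logger
    return logger


# An in-memory stream keeps the example self-contained.
buffer = io.StringIO()
audit = get_audit_logger(logging.StreamHandler(buffer))
audit.info("model_trained", extra={"details": {"model_type": "rf"}})
```

To write to disk instead, a `logging.FileHandler` pointed at a file under `logs/` could be passed in place of the stream handler; the exact file name is not specified in this README.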

Quick Start

1. Installation

# Clone the repository
git clone https://github.com/arpahls/cfd.git
cd cfd

# Set up a virtual environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install Dependencies
pip install -r requirements.txt

2. Usage

Train the Model

python main.py --mode train --model_type rf

Run Predictions

python main.py --mode predict --model_path models/fraud_model.pkl
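The two commands above use the flags `--mode`, `--model_type`, and `--model_path`. As a rough sketch of how an entry point might parse them, assuming `argparse`, consider the following; the default values and help texts are assumptions, and the real `main.py` may define these flags differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in the usage commands."""
    parser = argparse.ArgumentParser(description="CFD command-line interface (sketch)")
    parser.add_argument(
        "--mode",
        choices=["train", "predict"],
        required=True,
        help="Whether to train a new model or score transactions.",
    )
    parser.add_argument(
        "--model_type",
        default="rf",  # assumed default; "rf" is the only value shown in this README
        help="Model family to train.",
    )
    parser.add_argument(
        "--model_path",
        default="models/fraud_model.pkl",  # assumed default, taken from the path above
        help="Where to load the serialized model from.",
    )
    return parser


# Parse an explicit argument list so the example runs without a real CLI call.
args = build_parser().parse_args(["--mode", "train", "--model_type", "rf"])
print(args.mode, args.model_type, args.model_path)
```

Restricting `--mode` with `choices` makes the parser reject typos up front, while `--model_type` is left free-form here because the full set of accepted values is not documented in this README.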

Run Tests

pytest tests/

For details on our Unit vs. Integration testing strategy, see the Testing Guide.

License

This project is licensed under the MIT License - see the LICENSE file for details.


Developed and Maintained by ARPA HELLENIC LOGICAL SYSTEMS
Support: systems@arpacorp.net
