GitHub - ARPAHLS/cfd: An Enterprise-Grade Credit Card Fraud Detection System.

An Enterprise-Grade Credit Card Fraud Detection System

Overview

CFD is a modular, high-performance fraud detection system designed to identify fraudulent financial transactions with high accuracy (see Model Performance). Built on the PaySim dataset, it processes millions of transaction logs to flag fraud in real time, using ensemble learning techniques to handle extreme class imbalance.
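To make the class-imbalance point concrete, here is a minimal sketch of one common weighting heuristic: each class is weighted by n_samples / (n_classes * class_count), the formula scikit-learn documents for class_weight="balanced". Whether CFD uses exactly this scheme is an assumption; the function name and the toy labels are illustrative only.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Illustrative labels only (not PaySim data): 1 fraud case in 1,000 transactions.
labels = [0] * 999 + [1]
weights = balanced_class_weights(labels)
print(weights)
```

With weights like these, a misclassified fraud case costs far more during training than a misclassified legitimate transaction, which keeps the model from ignoring the rare class.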

Tip

Deep Dive: For a technical breakdown, see the System Architecture Guide. For a business perspective on value and usage, including impact and reliability analysis, see the Executive Summary.

Key Features

Advanced AI Modeling: Utilizes weighted Random Forest and Gradient Boosting classifiers.
Enterprise Architecture: Fully modular design with separate data ingestion, feature engineering, and evaluation layers.
Audit Compliance: Comprehensive JSON-based audit logging for every system action.
Automated Reporting: Generates instant performance metrics (ROC-AUC, Confusion Matrix).
Production Ready: Rapid train/predict capabilities via CLI.
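As an illustration of the JSON-based audit logging listed above, the following sketch emits one JSON object per logged action using only the standard library. The logger name, the field names, and the details payload are hypothetical; the actual format written by src/utils.py may differ.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "action": record.getMessage(),
            # "details" is a hypothetical field passed through logging's `extra`.
            "details": getattr(record, "details", {}),
        }
        return json.dumps(entry, sort_keys=True)


def build_audit_logger(handler, name="cfd.audit"):
    """Attach a JSON-formatting handler to a named logger (the name is an assumption)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep audit entries out of the root logger
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


# An in-memory stream keeps the example self-contained.
stream = io.StringIO()
audit = build_audit_logger(logging.StreamHandler(stream))
audit.info("model_trained", extra={"details": {"model_type": "rf"}})
print(stream.getvalue())
```

In practice the StreamHandler could be swapped for a logging.FileHandler pointing at a file under logs/, so that each system action becomes one line that is easy to parse during an audit.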

Model Performance

The current release features a Random Forest Classifier trained on 6.3 million transactions.

Metric	Score	Notes
ROC-AUC	0.999	Excellent discrimination capability
Precision	1.00	Zero false positives on test set
Recall	1.00	100% fraud detection rate on test set

Important

Performance Note: This model is trained on synthetic data (PaySim), where fraud patterns are often deterministic (e.g., specific account-emptying rules). The near-perfect scores (0.999 AUC) are expected for this dataset; in a noisy, real-world environment they would likely be lower (roughly 0.95 or above). We have verified that the scores are not due to target leakage (see Feature Analysis).
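For readers unfamiliar with the metrics in the table above, this sketch shows how precision, recall, and ROC-AUC are defined, in plain Python. The labels and scores are made-up toy values, not CFD results, and the function names are illustrative rather than taken from src/evaluation.py.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for the positive (fraud) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def roc_auc(y_true, scores):
    """Probability that a random fraud case outscores a random legitimate one.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), so it is
    only suitable for small illustrative inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = fraud, 0 = legitimate.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.7, 0.3, 0.6, 0.9]
y_pred = [int(s >= 0.5) for s in scores]  # classify at a 0.5 threshold

precision, recall = precision_recall(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(precision, recall, auc)
```

Precision and recall depend on the chosen threshold, while ROC-AUC summarizes ranking quality across all thresholds, which is why the three numbers can differ on the same predictions.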

The trained model is available at models/fraud_model.pkl.

Project Structure

credit_card_fraud/
├── data/               # Dataset storage
├── docs/               # System Documentation
│   └── SYSTEM_OVERVIEW.md
├── src/                # Core Logic
│   ├── data_loader.py  # Ingestion & Cleaning
│   ├── features.py     # Feature Engineering
│   ├── model.py        # Model Definition
│   ├── evaluation.py   # Reporting
│   └── utils.py        # Logging
├── tests/              # Unit Tests
├── logs/               # Audit Logs
├── models/             # Serialized Models
└── main.py             # CLI Entry Point

Quick Start

1. Installation

# Clone the repository
git clone https://github.com/arpahls/cfd.git
cd cfd

# Setup Environment
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install Dependencies
pip install -r requirements.txt

2. Usage

Train the Model

python main.py --mode train --model_type rf

Run Predictions

python main.py --mode predict --model_path models/fraud_model.pkl
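The two commands above imply a small argument parser in main.py. The sketch below shows one way such a CLI could be wired with argparse. Only the flag names (--mode, --model_type, --model_path) and the values shown above come from this README; the defaults, the required setting, and the restriction of --mode to two choices are assumptions, and the real main.py may differ.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags used in the commands above."""
    parser = argparse.ArgumentParser(
        description="CFD command-line interface (illustrative sketch)"
    )
    # Assumption: train and predict are the only modes.
    parser.add_argument("--mode", choices=["train", "predict"], required=True)
    # Assumption: "rf" is the default; other accepted values are not documented here.
    parser.add_argument("--model_type", default="rf")
    # Assumption: the default path matches the location given under Model Performance.
    parser.add_argument("--model_path", default="models/fraud_model.pkl")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["--mode", "train", "--model_type", "rf"])
print(args.mode, args.model_type, args.model_path)
```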

Run Tests

pytest tests/

For details on our Unit vs. Integration testing strategy, see the Testing Guide.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Developed and Maintained by ARPA HELLENIC LOGICAL SYSTEMS
Support: systems@arpacorp.net
