
CodeBy-HP/Sentiment-Classification-Mlflow-DVC

🎭 Sentiment Classification - MLOps Pipeline

Production-grade sentiment analysis with automated experimentation, testing, and deployment

Python MLflow FastAPI Docker DVC


🎯 Overview

End-to-end MLOps system for sentiment analysis featuring automated experimentation, intelligent model promotion, and cloud deployment.


🌈 User Interface

[Screenshot of the web interface, 2025-12-17]

🌈 Video Demo

▶️ Click to watch demo


🌈 Architecture and Workflow Diagrams

[Three architecture and workflow diagrams (screenshots, 2025-12-17)]

✨ Key Features

🔬 SYSTEMATIC EXPERIMENTATION

  • Compared multiple models on Bag-of-Words (BoW) and TF-IDF features
  • Tracked all experiments using MLflow
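
To make the difference between the two feature representations concrete, here is a minimal, standard-library-only sketch. It is not the project's actual feature code: the function names, the whitespace tokenizer, and the unsmoothed `idf = ln(N / df)` weighting are assumptions chosen for illustration. Library vectorizers such as scikit-learn's `TfidfVectorizer` use a smoothed variant by default and will give different numbers.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return the sorted vocabulary and one raw term-count vector per document."""
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return the vocabulary and TF-IDF vectors, with idf = ln(N / df)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = [sum(1 for vec in counts if vec[i] > 0) for i in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    vectors = []
    for vec in counts:
        total = sum(vec) or 1  # guard against empty documents
        vectors.append([(c / total) * w for c, w in zip(vec, idf)])
    return vocab, vectors


docs = ["great movie great cast", "terrible movie"]
vocab, bow = bag_of_words(docs)
_, tfidf = tf_idf(docs)
```

In this toy corpus the word "movie" appears in every document, so BoW still counts it while TF-IDF weights it to zero. That down-weighting of uninformative terms is the main reason the two representations are worth comparing per model.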

🔄 AUTOMATED ML PIPELINE

  • End-to-end DVC pipeline from raw data to trained model
  • Fully reproducible with versioned parameters

🎯 SMART MODEL PROMOTION

  • Automatically promotes only high-quality models
  • Uses MLflow registry for staging and production
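
The promotion rule itself is not spelled out here, so the following is only a sketch of what such a quality gate might look like. The `Metrics` class, the `should_promote` function, and both thresholds are hypothetical, not taken from `scripts/promote_model.py`. The idea is that a candidate must clear an absolute quality floor and also beat the current production model by a margin before it is promoted in the registry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Metrics:
    """Evaluation metrics for one model version (hypothetical container)."""
    accuracy: float
    f1: float


def should_promote(
    candidate: Metrics,
    production: Optional[Metrics],
    min_accuracy: float = 0.80,  # assumed absolute floor, not from the repo
    min_f1_gain: float = 0.01,   # assumed required improvement over production
) -> bool:
    """Decide whether the candidate should replace the production model."""
    # Gate 1: never promote a model below the absolute quality floor.
    if candidate.accuracy < min_accuracy:
        return False
    # Gate 2: with no production model yet, any model above the floor is promoted.
    if production is None:
        return True
    # Gate 3: otherwise require a meaningful F1 improvement.
    return candidate.f1 >= production.f1 + min_f1_gain


current = Metrics(accuracy=0.86, f1=0.85)
better = Metrics(accuracy=0.90, f1=0.90)
same = Metrics(accuracy=0.86, f1=0.85)
weak = Metrics(accuracy=0.70, f1=0.95)
```

Keeping the decision a pure function of metrics makes it easy to unit test separately from the registry calls that perform the actual stage transition.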

🚀 COMPLETE CI/CD PIPELINE

  • Automated builds and deployments via GitHub Actions
  • Dockerized deployment on AWS EC2

🧪 COMPREHENSIVE TESTING

  • Validates model performance and API endpoints
  • Prevents faulty models from being deployed

🌐 PRODUCTION-READY APPLICATION

  • FastAPI app for real-time sentiment prediction
  • Clean UI with health and confidence checks
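
As a rough illustration of the health and confidence outputs, the sketch below shows framework-agnostic helpers that a FastAPI route could wrap. Everything in it is an assumption: the function names, the response fields, the binary positive/negative labels, and the 0.5 decision threshold are illustrative and not taken from `fastapi_app/app.py`.

```python
def format_prediction(positive_probability: float) -> dict:
    """Turn a model's P(positive) into a label plus a confidence score."""
    if not 0.0 <= positive_probability <= 1.0:
        raise ValueError("positive_probability must be in [0, 1]")
    # Assumed decision threshold of 0.5 for a binary classifier.
    is_positive = positive_probability >= 0.5
    # Confidence is the probability of whichever class was predicted.
    confidence = positive_probability if is_positive else 1.0 - positive_probability
    return {
        "sentiment": "positive" if is_positive else "negative",
        "confidence": round(confidence, 4),
    }


def health_status(model) -> dict:
    """Report whether a model object is loaded and ready to serve."""
    loaded = model is not None
    return {"status": "ok" if loaded else "degraded", "model_loaded": loaded}


example = format_prediction(0.9)
```

A health payload that reports model state separately from process liveness lets a deployment check catch the case where the server is up but the model failed to load.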

🛠️ Tech Stack

  • Machine Learning: Pandas, NumPy, NLTK
  • MLOps Tools: MLflow, DVC, DagShub
  • Deployment & CI/CD: Docker, GitHub Actions, AWS (EC2, ECR), FastAPI

📁 Project Structure

Sentiment-Classification/
├── sentiment_classification/
│   ├── data/              # Data ingestion & preprocessing
│   ├── features/          # Feature engineering (BoW/TF-IDF)
│   ├── modeling/          # Training, evaluation, registry
│   └── connections/       # AWS S3 integration
├── fastapi_app/
│   ├── app.py            # FastAPI application
│   └── templates/        # Web interface
├── notebooks/            # Experimentation notebooks
├── scripts/
│   └── promote_model.py  # Smart model promotion
├── tests/                # Unit tests
├── data/                 # Dataset (tracked by DVC)
├── models/               # Saved models
├── .github/workflows/    # CI/CD pipeline
├── dvc.yaml              # DVC pipeline definition
└── Dockerfile            # Container configuration

🚀 Setup & Deployment

Want to run this project?

👉 Complete Setup Instructions

Includes local setup, DVC pipeline execution, MLflow tracking, Docker deployment, and AWS deployment guide.


🎓 What I Learned

  • Building reproducible ML pipelines with DVC
  • Experiment tracking and model versioning with MLflow
  • Conditional deployment strategies
  • CI/CD for ML systems
  • Docker containerization best practices
  • AWS cloud deployment (ECR + EC2)
  • Writing production-ready ML code
  • Comprehensive testing for ML systems

🔮 Future Enhancements

  • Kubernetes: Migrate to K8s for auto-scaling
  • Redis Caching: Cache predictions for faster responses
  • Authentication: Add user management with OAuth2/JWT
  • Monitoring: Implement Prometheus + Grafana dashboards
  • A/B Testing: Compare model versions in production
  • Explainability: Add SHAP/LIME for prediction explanations

👤 Author

Harsh Patel
📧 code.by.hp@gmail.com
🔗 GitHub · LinkedIn


⭐ Star this repo if you find it useful

About

An ML system that experiments, tests, and deploys itself. From raw data to production predictions, completely automated.
