

Fin.AI 🤖

A lightweight, trainable transformer-based language model with automated daily training via GitHub Actions.


Features

  • Scalable Architecture: GPT-style transformer, easily adjustable from tiny (10M) to large (350M+) parameters
  • Automated Training: Daily training on different Hugging Face datasets via GitHub Actions
  • Day-based Dataset Rotation: A different dataset is used for training each day of the week (Monday through Sunday)
  • Hugging Face Integration: Trained checkpoints are automatically uploaded to the Hugging Face Hub
  • Wandb Integration: Real-time training metrics and visualization
  • CPU-Optimized: Runs efficiently on GitHub Actions free tier (Ubuntu CPU)
  • Easy Configuration: YAML-based model and dataset configuration

🤗 Model

The trained model is available on Hugging Face:

MeridianAlgo/Fin.AI

Download Model

from huggingface_hub import hf_hub_download

# Download model files
hf_hub_download("MeridianAlgo/Fin.AI", "model.pt", local_dir="./model")
hf_hub_download("MeridianAlgo/Fin.AI", "config.json", local_dir="./model")

Use with Fin.AI

from fin_ai.model import FinAIModel

model = FinAIModel.from_pretrained("./model")

Quick Start

Local Training

# Install dependencies
pip install -r requirements.txt

# Train the model
python train.py --config config/model_config.yaml --datasets config/datasets.yaml

# Generate text
python generate.py --model checkpoints/model --prompt "Once upon a time"

GitHub Actions (Automated)

The model trains automatically every day at 6 AM UTC. Each day uses a different dataset:

  • Monday: WikiText-2 (encyclopedia text)
  • Tuesday: TinyStories (short stories)
  • Wednesday: CNN News (news articles)
  • Thursday: Dolly (instruction data)
  • Friday: arXiv (scientific papers)
  • Saturday: SQuAD (Q&A data)
  • Sunday: WikiText-103 (large encyclopedia)

After training, the model is automatically uploaded to Hugging Face.
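
The rotation itself is straightforward: pick the entry in config/datasets.yaml whose day field matches the current weekday. A minimal sketch of that selection, assuming the datasets.yaml layout shown under Configuration (the actual logic lives in the project's training code and may differ):

import datetime
import yaml

def pick_todays_dataset(config_path="config/datasets.yaml"):
    """Return the dataset entry whose 'day' matches today (1 = Monday ... 7 = Sunday)."""
    with open(config_path) as f:
        config = yaml.safe_load(f)
    today = datetime.date.today().isoweekday()
    for entry in config["datasets"]:
        if entry["day"] == today:
            return entry
    raise ValueError(f"No dataset configured for day {today}")

print(pick_todays_dataset())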

Configuration

Model Sizes

Size     Parameters   Layers   Heads   Embed Dim   Speed
tiny     ~10M         4        4       256         ⚡ Fast
small    ~25M         6        6       384         🚀 Medium
medium   ~85M         12       8       512         🐢 Slow
large    ~350M        24       12      768         🐌 Very Slow

Edit config/model_config.yaml to change model size:

model:
  size_preset: "tiny"  # or small, medium, large
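
The preset presumably just expands into the hyperparameters listed in the table above. A hypothetical sketch of that mapping (the key names are illustrative; the real values live in fin_ai/model/config.py):

# Hypothetical preset table mirroring the model-size table above.
SIZE_PRESETS = {
    "tiny":   {"n_layers": 4,  "n_heads": 4,  "embed_dim": 256},   # ~10M params
    "small":  {"n_layers": 6,  "n_heads": 6,  "embed_dim": 384},   # ~25M params
    "medium": {"n_layers": 12, "n_heads": 8,  "embed_dim": 512},   # ~85M params
    "large":  {"n_layers": 24, "n_heads": 12, "embed_dim": 768},   # ~350M params
}

def resolve_preset(name: str) -> dict:
    """Look up the architecture hyperparameters for a size_preset value."""
    return SIZE_PRESETS[name]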

Datasets

Edit config/datasets.yaml to customize datasets for each day:

datasets:
  - name: "wikitext"
    subset: "wikitext-2-raw-v1"
    split: "train"
    text_column: "text"
    day: 1  # Monday
    max_samples: 100000
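
Each entry maps directly onto a Hugging Face datasets call. A minimal sketch of loading one entry (the project's own loader lives in fin_ai/data/dataset.py and may handle more cases):

from datasets import load_dataset

def load_configured_dataset(entry):
    """Load one datasets.yaml entry and return its raw text samples."""
    ds = load_dataset(entry["name"], entry.get("subset"), split=entry["split"])
    if entry.get("max_samples"):
        ds = ds.select(range(min(entry["max_samples"], len(ds))))
    return ds[entry["text_column"]]

texts = load_configured_dataset({
    "name": "wikitext",
    "subset": "wikitext-2-raw-v1",
    "split": "train",
    "text_column": "text",
    "max_samples": 100000,
})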

Training Parameters

Adjust in config/model_config.yaml:

training:
  batch_size: 4
  learning_rate: 5.0e-4
  max_steps: 500
  warmup_steps: 100
  eval_steps: 100
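
These values interact in the usual way: the learning rate ramps up over warmup_steps and then decays over the remainder of max_steps. A sketch of one common schedule (linear warmup plus cosine decay; the trainer's actual schedule may differ):

import math

def lr_at_step(step, max_lr=5.0e-4, warmup_steps=100, max_steps=500):
    """Linear warmup to max_lr, then cosine decay toward zero (illustrative only)."""
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return max_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at_step(50), lr_at_step(100), lr_at_step(499))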

Project Structure

fin-ai/
├── fin_ai/                 # Main package
│   ├── model/             # Transformer architecture
│   │   ├── config.py      # Model configuration
│   │   └── transformer.py # GPT-style model
│   ├── data/              # Dataset loading
│   │   └── dataset.py     # HF dataset utilities
│   └── training/          # Training loop
│       └── trainer.py     # Trainer with checkpointing
├── config/                # Configuration files
│   ├── model_config.yaml  # Model & training config
│   └── datasets.yaml      # Dataset configuration
├── train.py               # Main training script
├── generate.py            # Text generation script
├── requirements.txt       # Python dependencies
└── .github/workflows/     # GitHub Actions
    └── train.yml          # Daily training workflow

Usage

Training

# Train with default config
python train.py

# Override max steps
python train.py --max-steps 1000

# Limit dataset samples (for testing)
python train.py --max-samples 10000

# Custom output directory
python train.py --output-dir ./my_checkpoints

Generation

# Generate from prompt
python generate.py --prompt "The future of AI"

# Customize generation
python generate.py \
  --model checkpoints/model \
  --prompt "Hello world" \
  --max-tokens 200 \
  --temperature 0.8 \
  --top-k 50 \
  --top-p 0.9
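
temperature, top-k, and top-p are the standard transforms applied to the model's next-token logits before sampling. A minimal PyTorch sketch of how they are conventionally combined (illustrative; generate.py may implement the details differently):

import torch

def sample_next_token(logits, temperature=0.8, top_k=50, top_p=0.9):
    """Sample one token id from a 1-D logits tensor."""
    logits = logits / temperature
    # Top-k: drop everything below the k-th highest logit.
    kth = torch.topk(logits, min(top_k, logits.numel())).values[-1]
    logits = logits.masked_fill(logits < kth, float("-inf"))
    # Top-p (nucleus): keep the smallest prefix of tokens whose mass reaches top_p.
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    drop = cumulative > top_p
    drop[1:] = drop[:-1].clone()   # shift so the token that crosses top_p is kept
    drop[0] = False
    sorted_probs[drop] = 0.0
    sorted_probs = sorted_probs / sorted_probs.sum()
    choice = torch.multinomial(sorted_probs, num_samples=1)
    return sorted_idx[choice].item()

next_id = sample_next_token(torch.randn(50257))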

Monitoring Training

Wandb Dashboard

If you have a Wandb account, add your API key as a GitHub secret:

  1. Get your API key from wandb.ai
  2. Add WANDB_API_KEY to GitHub repo secrets
  3. View live training at wandb.ai/your-username/fin-ai
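
Behind the dashboard, the trainer presumably reports metrics with the standard wandb API. A minimal sketch of what that logging could look like (the project name and metric keys are assumptions, not documented values):

import wandb

wandb.init(project="fin-ai", config={"size_preset": "tiny", "max_steps": 500})
# Inside the training loop, after each optimizer step (values here are placeholders):
wandb.log({"train/loss": 2.31, "lr": 5.0e-4}, step=100)
wandb.finish()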

Local Checkpoints

Checkpoints are saved to checkpoints/:

checkpoints/
├── model/                 # Latest model
│   ├── config.json
│   └── model.pt
├── checkpoint-100.pt      # Intermediate checkpoints
├── checkpoint-200.pt
└── best_model.pt          # Best evaluation checkpoint

Performance

On GitHub Actions free tier (Ubuntu CPU):

  • Tiny model: ~16 seconds per step
  • 500 steps: ~2.2 hours (fits in 3-hour limit)
  • Daily training: ~500 steps per day
  • Monthly: ~15,000 steps (~7.5M tokens; see the arithmetic sketch below)
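
These figures follow from simple arithmetic; the ~7.5M-token estimate implies roughly 500 tokens processed per step, which is an inference from the numbers above rather than a documented constant:

# Back-of-the-envelope check of the figures above.
seconds_per_step = 16
steps_per_day = 500
print(seconds_per_step * steps_per_day / 3600)   # ≈ 2.2 hours per daily run
steps_per_month = steps_per_day * 30             # ≈ 15,000 steps
tokens_per_step = 500                            # assumed; implied by the ~7.5M figure
print(steps_per_month * tokens_per_step)         # ≈ 7,500,000 tokens per month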

Architecture

Fin.AI uses a GPT-2 style transformer with the following components (a sketch of the pre-norm/SwiGLU block follows this list):

  • Multi-head self-attention with rotary positional embeddings
  • Feed-forward layers with SwiGLU activation
  • Pre-norm architecture for stable training
  • Gradient accumulation for larger effective batch sizes
  • Mixed precision training (when GPU available)
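
A minimal PyTorch sketch of a pre-norm block with a SwiGLU feed-forward, to make those two terms concrete (rotary embeddings and the causal mask are omitted for brevity; the real module in fin_ai/model/transformer.py differs in detail):

import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    """Feed-forward with SwiGLU activation: W2(SiLU(W1 x) * W3 x)."""
    def __init__(self, dim, hidden_dim):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x):
        return self.w2(F.silu(self.w1(x)) * self.w3(x))

class PreNormBlock(nn.Module):
    """Pre-norm: LayerNorm is applied before attention and before the feed-forward."""
    def __init__(self, dim, n_heads):
        super().__init__()
        self.ln1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(dim)
        self.ffn = SwiGLU(dim, 4 * dim)

    def forward(self, x):
        h = self.ln1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.ffn(self.ln2(x))

block = PreNormBlock(dim=256, n_heads=4)
out = block(torch.randn(1, 16, 256))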

Customization

Add New Datasets

Edit config/datasets.yaml:

datasets:
  - name: "your-dataset"
    subset: null
    split: "train"
    text_column: "text"
    day: 1
    max_samples: 50000

Change Training Schedule

Edit .github/workflows/train.yml:

schedule:
  - cron: '0 6 * * *'  # Daily at 6 AM UTC

Adjust Model Size

Edit config/model_config.yaml:

model:
  size_preset: "small"  # Larger model

Troubleshooting

Training too slow

  • Reduce batch_size in config
  • Use smaller size_preset (tiny)
  • Reduce max_seq_len to 256

Out of memory

  • Reduce batch_size
  • Reduce max_seq_len
  • Use gradient_accumulation_steps to simulate larger batches (see the sketch below)
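
Gradient accumulation is worth spelling out: the optimizer steps only every N micro-batches, so the effective batch size is batch_size * N with no extra memory. A self-contained PyTorch sketch (a stand-in model and random data replace the real training loop):

import torch
import torch.nn as nn

model = nn.Linear(16, 1)                       # stand-in for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=5.0e-4)
accum_steps = 8                                # effective batch = batch_size * accum_steps

optimizer.zero_grad()
for step in range(64):
    x, y = torch.randn(4, 16), torch.randn(4, 1)              # stand-in micro-batch
    loss = nn.functional.mse_loss(model(x), y) / accum_steps  # scale so gradients average
    loss.backward()
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()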

Dataset loading fails

  • Check dataset name on Hugging Face
  • Verify text_column matches dataset schema
  • Try with max_samples limit first

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Areas for enhancement:

  • GPU support for faster training
  • Distributed training across multiple machines
  • Model quantization for inference
  • Web UI for generation
  • Fine-tuning on custom data

Security

For security concerns, please see SECURITY.md.

Code of Conduct

This project follows the Contributor Covenant Code of Conduct.

License

MIT License - see LICENSE file

Acknowledgments

Status

🚀 Active Development - Daily training on GitHub Actions


Questions? Open an issue on GitHub!

About

We are researching and developing our own in-house LLM, which will be focused on finance-based chats and requests.
