Generate datasets of captchas and their character-level masks for training computer vision models (OCR, segmentation, etc.).
Based on the captcha library, extended in my custom fork to extract a mask for each character.
- Easy generation of captcha images with associated per-character mask images
- Customizable captcha length, fonts, noise, backgrounds, etc.
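
As an illustration of how a per-character mask might be consumed downstream, here is a minimal sketch that derives a tight bounding box from one mask. It assumes each mask is a binary image in which nonzero pixels belong to a single character, represented here as nested lists; the actual mask format produced by this tool may differ, and `mask_bounding_box` is a hypothetical helper, not part of the generator.

```python
from typing import Optional, Sequence, Tuple

BBox = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def mask_bounding_box(mask: Sequence[Sequence[int]]) -> Optional[BBox]:
    """Return the tight bounding box of nonzero pixels, or None if the mask is empty."""
    # Rows that contain at least one character pixel.
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    # Columns that contain at least one character pixel.
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return cols[0], rows[0], cols[-1], rows[-1]


# Hypothetical 4x6 mask for a single character (1 = character pixel).
example_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(mask_bounding_box(example_mask))  # (1, 1, 3, 2)
```

Boxes like this could serve as detection labels, while the masks themselves remain usable as segmentation targets.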

| Captcha Example | Character Mask Example |
|---|---|
| ![]() | ![]() |

(Replace the image paths above with your own example images)

    git clone https://github.com/your-username/captcha-dataset-generator.git
    cd captcha-dataset-generator
    pip install git+https://github.com/your-username/captcha-masks.git

    python generate_dataset.py --count <nbr of captchas to generate> --output /path/to/output/directory --length <nbr of characters in each captcha>

captcha (modified fork)
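
For readers who want to see how the `--count`, `--output`, and `--length` flags of `generate_dataset.py` fit together, the following is a rough sketch of an argument parser that accepts the same three flags. It is not the script's actual source: the types, the `required=True` settings, and the help strings are assumptions, and the example values are made up.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three flags shown in the usage command (assumed types)."""
    parser = argparse.ArgumentParser(
        description="Generate captchas with per-character masks."
    )
    parser.add_argument("--count", type=int, required=True,
                        help="number of captchas to generate")
    parser.add_argument("--output", type=Path, required=True,
                        help="directory to write the dataset to")
    parser.add_argument("--length", type=int, required=True,
                        help="number of characters in each captcha")
    return parser


# Parse a hypothetical invocation instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--count", "1000", "--output", "data/captchas", "--length", "5"]
)
print(args.count, args.output, args.length)
```

With those hypothetical values, the equivalent command line would be `python generate_dataset.py --count 1000 --output data/captchas --length 5`.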
License: MIT

