A Windows application that provides AI assistance through keyboard shortcuts, designed to be invisible to screen recording software.
- Screen OCR using Gemini Vision (because Tesseract and EasyOCR are not good enough)
- AI integration with OpenAI (or the many free compatible endpoints)
- Global keyboard shortcuts
- Stealth mode (invisible to screen recording)
- Modular architecture with separate AI processing component
- Could work on macOS but you'll need to edit some lines
It was made as a response to the following post on X:
Apparently, on whatever planet they live on, everyone makes $7000+ a month. Still, do NOT misuse it; I'm not responsible for your or anyone else's use of this program.
- Install Python 3.8 or higher
- Install dependencies:

    uv pip install -r requirements.txt

- Create a .env file with your configuration. Please do not include any comments in the file; the # lines below only label the groups and should be left out of the real file:

    # Solving Model Configuration
    SOLVING_MODEL_API_KEY=your_api_key_here
    SOLVING_MODEL_BASE_URL=your_custom_endpoint_url
    SOLVING_MODEL=your_model_name

    # OCR Model Configuration
    OCR_API_KEY=your_ocr_api_key_here
    OCR_BASE_URL=your_ocr_endpoint_url
    OCR_MODEL=your_ocr_model_name

    # Hotkey Configuration
    CAPTURE_HOTKEY=Alt+Enter
    QUIT_HOTKEY=Ctrl+Alt+Q
    RESET_HOTKEY=Ctrl+Alt+R

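To sanity-check a .env file before launching, something like the following could help. It is a minimal sketch, not the application's own loader: `parse_env`, `missing_keys`, and `REQUIRED_KEYS` are hypothetical names, and the parsing rules (plain KEY=VALUE lines, no quoting, no inline-comment stripping) are assumptions.

```python
from typing import Dict, List

# The nine keys listed in the configuration above.
REQUIRED_KEYS = (
    "SOLVING_MODEL_API_KEY", "SOLVING_MODEL_BASE_URL", "SOLVING_MODEL",
    "OCR_API_KEY", "OCR_BASE_URL", "OCR_MODEL",
    "CAPTURE_HOTKEY", "QUIT_HOTKEY", "RESET_HOTKEY",
)


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines (hypothetical helper, not the app's loader)."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        # Inline comments are NOT stripped: they stay part of the value.
        settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [k for k in REQUIRED_KEYS if not settings.get(k)]


if __name__ == "__main__":
    sample = "CAPTURE_HOTKEY=Alt+Enter\nQUIT_HOTKEY=Ctrl+Alt+Q # quit\n"
    parsed = parse_env(sample)
    print(parsed["QUIT_HOTKEY"])      # the trailing comment ends up in the value
    print(missing_keys(parsed))
```

With a simple parser like this one, a trailing comment becomes part of the value, which is one plausible reason to keep comments out of the file.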
- Press the configured capture hotkey (default: Alt+Enter) to capture screen and get AI assistance
- Press the configured quit hotkey (default: Ctrl+Alt+Q) to close the application
- Press the configured reset hotkey (default: Ctrl+Alt+R) to reset the application state
You can configure the hotkeys in the .env file using the following format:
- Modifiers: Ctrl, Alt, Win, Shift
- Keys: Any single key (e.g., R, Q, Enter)
- Format: Modifier1+Modifier2+Key (e.g., Ctrl+Alt+R, Alt+Enter)
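The hotkey format above can be validated with a few lines of Python. This is only a sketch under stated assumptions: `parse_hotkey` is a hypothetical helper, not the function the application uses, and treating modifier names as case-insensitive and allowing zero modifiers are both choices made for the example.

```python
from typing import FrozenSet, Tuple

# Modifier names accepted by the format described above.
MODIFIERS = ("Ctrl", "Alt", "Win", "Shift")


def parse_hotkey(spec: str) -> Tuple[FrozenSet[str], str]:
    """Split 'Modifier1+Modifier2+Key' into (modifiers, key)."""
    parts = [p.strip() for p in spec.split("+")]
    if any(not p for p in parts):
        raise ValueError(f"empty segment in hotkey: {spec!r}")

    *mods, key = parts  # everything before the last '+' must be a modifier
    canonical = {m.lower(): m for m in MODIFIERS}

    seen = set()
    for mod in mods:
        name = canonical.get(mod.lower())
        if name is None:
            raise ValueError(f"unknown modifier: {mod!r}")
        if name in seen:
            raise ValueError(f"duplicate modifier: {mod!r}")
        seen.add(name)

    if key.lower() in canonical:
        raise ValueError("hotkey must end with a non-modifier key")
    return frozenset(seen), key


if __name__ == "__main__":
    for spec in ("Alt+Enter", "Ctrl+Alt+Q", "Ctrl+Alt+R"):
        print(spec, "->", parse_hotkey(spec))
```

Returning the modifiers as a set makes `Ctrl+Alt+R` and `Alt+Ctrl+R` compare equal, which is convenient when checking for conflicting bindings.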
Build a standalone executable with PyInstaller:

    pyinstaller --onefile --noconsole --icon=clueme.ico --name=clueme --add-data ".env;." --exclude-module PyQt5 --exclude-module PyQt6 clueme.py

The application is built with a modular architecture:
- clueme.py: Main application file handling UI and hotkey management
- ai_processor.py: Dedicated module for AI processing and OpenAI integration
- ocr.py: OCR functionality using Gemini Vision
If you run Ollama with a vision model, you can use it as the endpoint for both the OCR and the solving model (set the model names too).
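As an illustration, the snippet below prints the six model-related .env lines for a local Ollama setup. It is a sketch under assumptions: `ollama_env` is a hypothetical helper, the base URL is Ollama's OpenAI-compatible endpoint on its default port, the API key is a placeholder (assumed to be needed only because clients expect a non-empty value), and `llava` is just an example vision model name to replace with whatever you have pulled.

```python
# Ollama's OpenAI-compatible endpoint on its default local port.
OLLAMA_BASE_URL = "http://localhost:11434/v1"


def ollama_env(model: str,
               base_url: str = OLLAMA_BASE_URL,
               api_key: str = "ollama") -> str:
    """Render .env lines that point both models at one Ollama endpoint."""
    values = {
        "SOLVING_MODEL_API_KEY": api_key,   # placeholder value (assumption)
        "SOLVING_MODEL_BASE_URL": base_url,
        "SOLVING_MODEL": model,
        "OCR_API_KEY": api_key,             # placeholder value (assumption)
        "OCR_BASE_URL": base_url,
        "OCR_MODEL": model,
    }
    # No comment lines are emitted, matching the note about the .env file.
    return "\n".join(f"{key}={value}" for key, value in values.items())


if __name__ == "__main__":
    print(ollama_env("llava"))  # "llava" is only an example model name
```

The hotkey entries are unaffected and stay as configured.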
I only tested it on Windows 11 24H2, and the flag for screen capture exclusion might not work on older versions.
Testing on Windows 10 build 19045 revealed that it works only when the window is not frameless, so the application checks for this dynamically at runtime.
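The text above does not name the flag. A common way to exclude a window from screen capture on Windows is the Win32 call `SetWindowDisplayAffinity` with `WDA_EXCLUDEFROMCAPTURE`, and the sketch below assumes that is what is meant. `exclude_from_capture` is a hypothetical helper, and `hwnd` is assumed to be the native handle of a window owned by the calling process.

```python
import ctypes
import sys

# Display affinity values from the Win32 API.
WDA_NONE = 0x00000000
WDA_EXCLUDEFROMCAPTURE = 0x00000011


def exclude_from_capture(hwnd: int) -> bool:
    """Ask Windows to hide the window from screen capture.

    Returns True if the call succeeded, False on failure or on
    non-Windows platforms (where the API does not exist).
    """
    if sys.platform != "win32":
        return False

    # Imported here so the module also loads on other platforms.
    from ctypes import wintypes

    user32 = ctypes.windll.user32
    user32.SetWindowDisplayAffinity.argtypes = [wintypes.HWND, wintypes.DWORD]
    user32.SetWindowDisplayAffinity.restype = wintypes.BOOL
    return bool(user32.SetWindowDisplayAffinity(hwnd, WDA_EXCLUDEFROMCAPTURE))


if __name__ == "__main__":
    # Handle 0 is not a real window; pass your own window's handle instead.
    print(exclude_from_capture(0))
```

Checking the return value gives the caller a chance to fall back, for example to a framed window, when the system rejects the request.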
You can integrate Whisper and add the generated speech-to-text transcript as context for each message. This is not implemented yet; you're welcome to open a PR.

