A simple document scanning pipeline implemented in C++ with OpenCV. The program detects the largest 4-point contour in each frame of a video or webcam feed, applies a perspective transform, and shows a flattened, top-down “scan” of the document.
- Preprocessing (grayscale -> blur -> Canny -> dilate -> erode)
- Contour detection and 4-point polygon approximation
- Automatic ordering of detected points
- Perspective warp to get a top-down scan
- 2×2 stacked debug view (original, threshold, contour view, warped result)
- Live processing from video file or webcam
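
The point-ordering step can be sketched as follows. This is a minimal illustration, not necessarily how this project does it: the sum/difference heuristic, the `orderCorners` name, the output order (top-left, top-right, bottom-left, bottom-right), and the `Pt` struct (a stand-in for `cv::Point2f` so the snippet compiles without OpenCV) are all assumptions.

```cpp
#include <algorithm>
#include <array>

// Hypothetical stand-in for cv::Point2f (image coordinates, y grows downward).
struct Pt {
    float x;
    float y;
};

// Orders four corners as: top-left, top-right, bottom-left, bottom-right.
// Heuristic (an assumption, not taken from the project):
//   top-left     has the smallest x + y
//   bottom-right has the largest  x + y
//   top-right    has the largest  x - y
//   bottom-left  has the smallest x - y
std::array<Pt, 4> orderCorners(const std::array<Pt, 4>& in) {
    auto bySum = [](const Pt& a, const Pt& b) { return a.x + a.y < b.x + b.y; };
    auto byDiff = [](const Pt& a, const Pt& b) { return a.x - a.y < b.x - b.y; };

    const auto [minSum, maxSum] = std::minmax_element(in.begin(), in.end(), bySum);
    const auto [minDiff, maxDiff] = std::minmax_element(in.begin(), in.end(), byDiff);

    return {{*minSum, *maxDiff, *minDiff, *maxSum}};
}
```

A consistent corner order matters because the perspective transform pairs each detected corner with a fixed corner of the output rectangle. This heuristic can mislabel corners when the document is rotated by roughly 45 degrees, so treat it as a starting point.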
- C++17
- OpenCV 4.x
- CMake >= 3.10
Install OpenCV (on Linux):
    sudo apt install libopencv-dev

Run build.sh to build the project.

Run the scanner on a video file:

    ./build/document_scanner ./assets/testvideo.mp4

The program displays two windows:
- Work Flow (2×2 grid)
  - Original frame
  - Thresholded frame
  - Contours
  - Warped (scanned) document
- Result
  - Clean final warp of the detected document
Press q to quit.
Note
This project uses no machine learning, only classical OpenCV contour detection. It therefore works best on well-lit videos where the document's edges are clearly visible against the background.