Paper ID: 2211.16098
Three-stage binarization of color document images based on discrete wavelet transform and generative adversarial networks
Rui-Yang Ju, Yu-Shian Lin, Yanlin Jin, Chih-Chia Chen, Chun-Tse Chien, Jen-Shiun Chiang
Efficiently segmenting text from the background in degraded color document images is an important challenge in the preservation of ancient manuscripts. Imperfect preservation has led to various types of degradation over time, such as staining, yellowing, and ink seepage, which badly affect document image binarization results. This work proposes a three-stage method that uses generative adversarial networks (GANs) to generate binarized images from degraded color document images. Stage-1 applies a discrete wavelet transform and retains the low-low (LL) subband images for document image enhancement. In Stage-2, the original input image is split into three single-channel images (red, green, and blue) and one grayscale image, and an independent GAN is trained on each image to extract color foreground information. In Stage-3, the output images of Stage-2 and the resized input images are used to train independent GANs that generate the document binarization results, enabling the combination of global and local features. The experimental results show that the proposed method achieves Avg-Scores of 77.64, 77.95, 79.05, 76.38, 75.34, and 77.00 on the (H)-DIBCO 2011, 2013, 2014, 2016, 2017, and 2018 datasets, respectively, reaching the state-of-the-art level. The implementation code for this work is available at https://github.com/abcpp12383/ThreeStageBinarization.
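The two preprocessing operations named in the abstract, retaining the low-low subband of a discrete wavelet transform and splitting the input into red, green, blue, and grayscale images, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not name the wavelet, so a one-level orthonormal Haar wavelet is assumed here, the grayscale conversion assumes BT.601 luma weights, and the function names `haar_ll` and `split_channels` are hypothetical.

```python
import numpy as np


def haar_ll(channel: np.ndarray) -> np.ndarray:
    """Return the low-low (LL) subband of a one-level 2D Haar DWT.

    Assumption: Haar wavelet; the abstract does not specify the wavelet.
    """
    h, w = channel.shape
    # Crop to even dimensions so the image tiles into 2x2 blocks.
    x = channel[: h - h % 2, : w - w % 2].astype(np.float64)
    # Orthonormal Haar low-pass along rows and columns:
    # LL = (a + b + c + d) / 2 for each 2x2 block.
    return (x[0::2, 0::2] + x[0::2, 1::2] + x[1::2, 0::2] + x[1::2, 1::2]) / 2.0


def split_channels(rgb: np.ndarray) -> dict:
    """Split an H x W x 3 RGB image into R, G, B, and grayscale images."""
    r, g, b = (rgb[..., i].astype(np.float64) for i in range(3))
    # Assumption: BT.601 luma weights for the grayscale image.
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    return {"red": r, "green": g, "blue": b, "gray": gray}
```

Applying `haar_ll` to each image returned by `split_channels` yields four images with half the height and width of the input, each of which could then be passed to its own network.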
Submitted: Nov 29, 2022