H.3.2.2. Computer vision
Mohammad Hossein Khosravi
Abstract
Document Image Quality Assessment (DIQA) is critical for ensuring the reliability of downstream applications such as Optical Character Recognition (OCR), digital archiving, and automated document workflows. In this paper, we propose a deep learning-based DIQA framework using a Siamese neural network architecture with an InceptionV3 backbone. Our model leverages a composite loss function that combines linear regression loss with a monotonic ranking constraint to jointly optimize for score-level accuracy and perceptual consistency. Unlike prior works that rely on handcrafted features or narrow degradation types, our approach generalizes across diverse distortions commonly observed in scanned and photographed documents. Experimental results on the SOC and SmartDoc-QA datasets demonstrate that the proposed model exhibits a strong correlation with OCR accuracy, achieving SROCC values of 0.952 and 0.873, respectively, and outperforming several state-of-the-art DIQA methods.
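The abstract does not give the exact form of the composite loss, so the following is only a minimal sketch of how a regression term and a monotonic ranking penalty might be combined, together with the SROCC metric used to report results. The absolute-error regression term, the pairwise hinge penalty, and the names `composite_loss`, `margin`, `weight`, `average_ranks`, and `srocc` are all assumptions made for illustration, not the paper's definitions.

```python
from itertools import combinations


def composite_loss(pred, target, margin=0.0, weight=1.0):
    """Regression term plus a pairwise ranking penalty (illustrative only).

    The absolute-error term, the hinge penalty, `margin`, and `weight`
    are assumptions; the paper's actual formulation may differ.
    """
    n = len(pred)
    # Score-level accuracy: mean absolute error against the quality labels.
    regression = sum(abs(p - t) for p, t in zip(pred, target)) / n

    # Perceptual consistency: penalise pairs whose predicted order
    # contradicts the order of their labels.
    penalties = []
    for i, j in combinations(range(n), 2):
        if target[i] == target[j]:
            continue  # tied labels impose no order
        sign = 1.0 if target[i] > target[j] else -1.0
        penalties.append(max(0.0, margin - sign * (pred[i] - pred[j])))
    ranking = sum(penalties) / len(penalties) if penalties else 0.0

    return regression + weight * ranking


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

In this sketch, predictions that preserve the label order incur no ranking penalty, so the second term only acts when the model misorders a pair of documents. `srocc` could then be applied to predicted quality scores and per-document OCR accuracies to obtain a figure comparable in kind to the reported values.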
H.5. Image Processing and Computer Vision
Jalaluddin Zarei; Mohammad Hossein Khosravi
Abstract
Agricultural experts aim to detect leaf diseases as quickly as possible. However, limitations such as a shortage of manpower, poor eyesight, insufficient knowledge, and quarantine restrictions on transferring diseased samples to the laboratory justify the use of digital technology to detect pests and diseases and ultimately eliminate them. One available solution in this field is the convolutional neural network (CNN). However, the performance of CNNs depends on the availability of a large amount of data. Because no suitable dataset exists for the native trees of South Khorasan province, we were motivated to create one with a large amount of data. In this article, we introduce a new dataset of images in 9 classes: Healthy Barberry leaves, Barberry Rust disease, Barberry Pandemis ribeana Tortricidae pest, Healthy Jujube leaves, Jujube Ziziphus Tingid disease, Jujube Parenchyma-Eating Butterfly pest, Healthy Pomegranate leaves, Pomegranate Aphis punicae pest, and Pomegranate Leaf-Cutting Bees pest. We also evaluate the performance of several well-known convolutional neural networks using all gradient descent optimizer algorithms on this dataset. Our most important achievement is the creation of a high-volume dataset of pests and diseases in different classes. In addition, our experiments show that common CNN architectures, combined with gradient descent optimizers, achieve acceptable performance on the proposed dataset. We call the proposed dataset the "Birjand Native Plant Leaves (BNPL) Dataset". It is available at https://kaggle.com/datasets/ec17162ca01825fb362419503cbc84c73d162bffe936952253ed522705228e06.
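For readers who want to train a classifier on the nine classes named above, a label map is usually the first thing needed. The sketch below lists the classes as given in the abstract; the index order, the constant names `BNPL_CLASSES`, `PLANTS`, and `CLASS_TO_INDEX`, and the helper `classes_by_plant` are hypothetical and do not necessarily match the folder names or label encoding of the published dataset.

```python
# Class names as listed in the abstract; the ordering is an assumption.
BNPL_CLASSES = [
    "Healthy Barberry leaves",
    "Barberry Rust disease",
    "Barberry Pandemis ribeana Tortricidae pest",
    "Healthy Jujube leaves",
    "Jujube Ziziphus Tingid disease",
    "Jujube Parenchyma-Eating Butterfly pest",
    "Healthy Pomegranate leaves",
    "Pomegranate Aphis punicae pest",
    "Pomegranate Leaf-Cutting Bees pest",
]

# The three native plant species covered by the dataset.
PLANTS = ["Barberry", "Jujube", "Pomegranate"]

# Hypothetical integer encoding for use as classifier targets.
CLASS_TO_INDEX = {name: index for index, name in enumerate(BNPL_CLASSES)}


def classes_by_plant():
    """Group the class names by the plant species mentioned in each name."""
    return {
        plant: [name for name in BNPL_CLASSES if plant in name]
        for plant in PLANTS
    }
```

Grouping by plant makes it easy to check that each species contributes one healthy class and two pest or disease classes.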