Original/Review Paper
H.6.5.2. Computer vision
Rozhin Mohammadizand; Razieh Rastgoo
Abstract
Sign language is a structured, non-vocal form of communication primarily used by individuals who are deaf or hard of hearing, who often face challenges interacting with non-signers. To address this, translation systems between sign and spoken language are essential, encompassing sign language recognition and production. In this work, we focus on sign language production and propose a deep learning framework for generating skeleton-based video representations of sign language at the word level. Our approach employs a conditional Generative Adversarial Network (cGAN) with transformer embeddings in both generator and discriminator, augmented with bone-length and joint-angle constraints and a classifier-guided loss to ensure anatomically plausible and semantically consistent gestures. We further introduce a novel loss function to improve human keypoint generation for sign representation. Extensive experiments on three benchmark datasets demonstrate that our method outperforms state-of-the-art approaches according to statistical (MMD) and perceptual (FID) metrics, while qualitative analyses confirm that the generated gestures are temporally smooth, anatomically accurate, and semantically meaningful. These results highlight the effectiveness of our model in advancing word-level sign language synthesis.
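The abstract does not give the form of the bone-length constraint, so the following is only a minimal sketch of one common way such a term can be written: a mean squared difference between the bone lengths of generated and reference skeletons. The three-joint skeleton, the `BONES` list, and the function names are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical skeleton layout (an assumption): joints are (x, y) pairs and
# each bone is a pair of joint indices.
BONES = [(0, 1), (1, 2)]


def bone_lengths(pose, bones=BONES):
    """Euclidean length of every bone in a single pose."""
    return [math.dist(pose[a], pose[b]) for a, b in bones]


def bone_length_penalty(generated, reference, bones=BONES):
    """Mean squared bone-length difference over all frames and bones.

    `generated` and `reference` are sequences of poses (one pose per frame).
    Returns 0.0 when every generated bone has the reference length.
    """
    total, count = 0.0, 0
    for gen_pose, ref_pose in zip(generated, reference):
        gen_len = bone_lengths(gen_pose, bones)
        ref_len = bone_lengths(ref_pose, bones)
        for g, r in zip(gen_len, ref_len):
            total += (g - r) ** 2
            count += 1
    return total / count if count else 0.0
```

In a cGAN training loop, a term of this kind would typically be added to the generator loss with a weighting factor; the weighting and the joint-angle counterpart are not specified in the abstract and are left out here.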
Applied Article
H.3.8. Natural Language Processing
Hassan Deldar; Mohammad Mehdi Homayounpour
Abstract
In most countries, the legislative process has a long history, which has led to increasing diversity and multiplicity of laws. This has made it difficult to access laws that are valid in both time and place. The focus of this article is on the application of artificial intelligence in the domain of legal statutes to assist in identifying the need for amendments to laws or specific provisions. The general framework of the proposed process consists of two key components. First, the texts of legal clauses or articles are enriched using large language models, which involves producing embedding vectors, performing thematic classification, and extracting the provisions of each law. Second, a retrieval-augmented text generation (RAG) system is developed with the aid of large language models to determine conflicts or the need for expurgation in the output, utilizing the enriched data, predefined prompts, and the Chain of Thought (CoT) technique. The proposed method was evaluated on two benchmark datasets. On the COLIEE 2025 dataset, our approach outperformed the 2024 winners in legal implication tasks, achieving an F1 score of 0.6521 with minimal prompting. The second evaluation used over 1,000 legal clauses covering abrogation and neutral rules, yielding an F1 score exceeding 73.41%. The findings demonstrate that, even with limited expertise in the legal domain, it is possible to identify conflicts and the necessity for refining legal texts to an acceptable degree within a reasonable timeframe for legal experts, leveraging the capabilities of large language models.
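The retrieval step of a RAG pipeline like the one described can be illustrated with a small sketch: given embedding vectors for legal clauses, return the clauses most similar to a query. Cosine similarity, the clause identifiers, and the function names are assumptions made for illustration; the abstract does not state which similarity measure or index the authors use.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_k(query_vec, clause_vecs, k=2):
    """Return the ids of the k clauses whose embeddings are closest to the query.

    `clause_vecs` maps a clause id (hypothetical, e.g. "art-1") to its embedding.
    """
    ranked = sorted(
        clause_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [clause_id for clause_id, _ in ranked[:k]]
```

The retrieved clauses would then be placed in the prompt alongside the clause under review so that the language model can reason about possible conflicts.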
Original/Review Paper
A.5. I/O and Data Communications
Somayyeh Jafarali Jassbi; Sajjad Daliri
Abstract
The rapid growth of the Internet‑of‑Things (IoT) imposes significant challenges on task offloading in fog environments, including service latency, resource constraints, and trust management. Fog computing mitigates these limitations by moving computation and storage closer to end devices. This paper presents BCOFF (Blockchain‑based Computation Offloading Framework for Fog), a secure and efficient framework that jointly optimizes resource allocation and enables verifiable task offloading. In BCOFF, resource allocation is performed using the Grey Wolf Optimization (GWO) algorithm, while blockchain provides a tamper-resistant execution record. Specifically, the blockchain serves three purposes: (i) recording offloading decisions and cryptographic hashes of task results to support post‑execution auditability, (ii) validating the integrity of returned results by matching them with the on‑chain hash reference, and (iii) coordinating consensus among fog nodes through a lightweight Validator‑Selection Proof‑of‑Stake (VNPoS) mechanism. VNPoS is a simplified adaptation of the Nominated Proof‑of‑Stake (NPoS) model that selects validators using stake‑based nomination with variance‑aware stake normalization. By avoiding computationally intensive cryptographic puzzles, VNPoS significantly reduces consensus overhead and is therefore suitable for resource‑constrained fog environments. Experimental evaluation using the iFogSim simulator with workloads of 800–1500 tasks shows that BCOFF reduces execution time by 15–27%, lowers host‑selection latency by 22–25%, and decreases energy consumption by 5–9% compared with existing approaches. These results demonstrate that integrating GWO‑based scheduling with the VNPoS blockchain mechanism provides a more efficient and verifiable fog-offloading framework.
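Purposes (i) and (ii) of the blockchain, recording a hash of each task result and later checking a returned result against it, can be sketched as follows. This is a toy in-memory stand-in and not a blockchain; SHA-256, the `OffloadLedger` class, and the task and node identifiers are assumptions, since the abstract only says "cryptographic hashes".

```python
import hashlib


def result_digest(result: bytes) -> str:
    """Hex digest of a task result (SHA-256 is an assumed choice of hash)."""
    return hashlib.sha256(result).hexdigest()


class OffloadLedger:
    """Toy in-memory stand-in for the on-chain record of offloading decisions."""

    def __init__(self):
        self._records = {}  # task_id -> (node_id, digest)

    def record(self, task_id: str, node_id: str, result: bytes) -> None:
        """Store which node executed the task and the hash of its result."""
        self._records[task_id] = (node_id, result_digest(result))

    def verify(self, task_id: str, returned: bytes) -> bool:
        """True only if the task is known and the returned result matches its hash."""
        entry = self._records.get(task_id)
        return entry is not None and entry[1] == result_digest(returned)
```

For example, after `ledger.record("t1", "fog-3", b"42")`, a call to `ledger.verify("t1", b"42")` succeeds, while any altered result or unknown task id fails the check.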
Original/Review Paper
H.3. Artificial Intelligence
Mohammad Khaki; Fereshte Dehghani
Abstract
The exponential growth of online food platforms has transformed restaurant discovery, yet traditional recommendation systems often struggle with the "cold-start problem" and the inability to capture latent synergies between restaurant attributes. This study proposes a multi-stage machine learning framework designed to uncover hidden patterns within Zomato restaurant data to enhance rating prediction accuracy. To overcome the limitations of raw, uncurated datasets, we implement a hybrid methodology integrating unsupervised latent structure discovery with supervised ensemble classification. Specifically, a dual-clustering strategy (K-means and Hierarchical clustering) is employed to synthesize novel latent features that represent complex relationships between service quality, price, and user preferences. To ensure model robustness, we utilize the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance and Sequential Feature Selection (SFS) to optimize the input space. Experimental results demonstrate that the proposed framework significantly outperforms traditional baseline models, with the Random Forest classifier achieving 90% accuracy and a 37% absolute improvement in F1-score. Granular error analysis reveals that most misclassifications are confined to adjacent rating categories, indicating a high degree of ordinal consistency. These findings underscore the effectiveness of leveraging latent structures to build scalable and interpretable recommendation engines, particularly in data-sparse environments.
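The ordinal-consistency observation can be quantified with a simple measure: the share of misclassifications whose predicted category is exactly one step away from the true category. The sketch below assumes rating categories are encoded as consecutive integers; the function name and this encoding are illustrative and not taken from the paper.

```python
def adjacent_error_share(y_true, y_pred):
    """Fraction of misclassified samples that are off by exactly one category.

    Categories are assumed to be consecutive integers (e.g. 0 = lowest rating).
    Returns 0.0 when there are no misclassifications.
    """
    errors = [(t, p) for t, p in zip(y_true, y_pred) if t != p]
    if not errors:
        return 0.0
    adjacent = sum(1 for t, p in errors if abs(t - p) == 1)
    return adjacent / len(errors)
```

A value close to 1.0 means that almost all errors fall in a neighbouring rating category, which is the pattern the abstract describes.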
Original/Review Paper
H.5. Image Processing and Computer Vision
Mohammad M. AlyanNezhadi; Hesamoddin Pourrostami; Mousa Nazari; Farzan Afshari
Abstract
In Iran’s financial market, authenticating gold coins is essential for transparency, fraud reduction, and proper valuation. Differentiating between bank-issued and non-bank-issued coins poses a challenge because their appearance is almost identical. This paper proposes a deep-learning-based classification method with three main components: extracting the area of interest, aligning images through a CNN regressor, and classifying coins through a CNN classifier. The method is tested on a set of 130 coin images (71 coins from banks and 59 coins from non-banks) and is benchmarked against baseline models employing feature extraction and SVMs. The proposed method outperforms the baseline with 99% accuracy. The results indicate that the model is effective in authenticating the coins, which supports safe transactions in the gold market.
Technical Paper
H.3.2.6. Games and infotainment
Mohammadreza Mohammadnejad; Morteza Dorrigiv; Farzin Yaghmaee
Abstract
Research in recommender systems has largely relied on standardized datasets such as MovieLens, Amazon Reviews, and Last.fm. However, these datasets are unsuitable for in-game recommendations, particularly in Multiplayer Online Battle Arenas (MOBAs), due to the sequential, team-based, and adversarial nature of gameplay. To identify essential characteristics for in-game recommendation datasets, we perform a cross-domain analysis of widely used recommendation datasets, evaluating their structural and distributional properties, including interaction space, matrix shape, sparsity, and Gini-based feature–shape diversity. Building on these insights, we curate DOTA-Draft, a research-ready dataset from raw professional Dota 2 matches, encoding sequential pick/ban states, patch versions, and match outcomes. Using this dataset, we conduct top-k drafting recommendation tasks and provide baseline results with Bayesian Personalized Ranking (BPR) and GRU4Rec. To facilitate adoption, DOTA-Draft is packaged in a RecBole-compatible format. This work establishes principled benchmarks for in-game recommendation, demonstrates the inadequacy of traditional user–item paradigms in dynamic, adversarial environments, and provides a foundation for developing models that account for sequential, multi-agent decision-making.
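Two of the dataset properties named above, sparsity and a Gini-based concentration measure, can be computed as in the sketch below. The abstract does not define its "Gini-based feature–shape diversity" exactly, so the standard Gini coefficient over per-item interaction counts is used here as an assumption; the function names are illustrative.

```python
def sparsity(n_interactions, n_users, n_items):
    """Share of empty cells in the user-item interaction matrix."""
    return 1.0 - n_interactions / (n_users * n_items)


def gini(counts):
    """Standard Gini coefficient of non-negative counts.

    0.0 means interactions are spread evenly; values near 1.0 mean they are
    concentrated on a few items. Returns 0.0 for empty or all-zero input.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending-sorted values (ranks start at 1).
    weighted = sum((rank + 1) * v for rank, v in enumerate(values))
    return (2 * weighted) / (n * total) - (n + 1) / n
```

For a draft dataset, `counts` could be the number of times each hero is picked, so that a high Gini value signals a pick distribution dominated by a few heroes.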
Original/Review Paper
H.5. Image Processing and Computer Vision
Amirhossein Zare Kordkheili; Amirreza Zare Kordkheili; Sekine Asadi Amiri
Abstract
Brain tumor detection is a critical task in medical imaging, requiring accurate and reliable methods. Recent advancements in deep learning have shown great potential in this field. In this article, we present a novel method for brain tumor detection based on a Convolutional Block Attention Module (CBAM) enhanced attention ensemble of deep learning networks. Initially, image augmentation is applied to increase data diversity. We utilize two deep neural network models, EfficientNet-B1 and ResNet-101, for tumor detection. First, we enhance the performance of these models by integrating the CBAM attention module into their architectures. Then, we ensemble the two networks using a soft voting strategy to achieve higher detection accuracy. The proposed method is evaluated on the three-class Figshare dataset, achieving an accuracy of 99.09% in detecting tumors in MRI images, which outperforms existing methods. This approach leverages the strengths of an ensemble of models, offering a promising solution for improving the accuracy and reliability of brain tumor detection in medical imaging.
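Soft voting between two classifiers can be sketched as averaging their per-class probabilities and choosing the class with the highest average. Equal weighting of the two networks is an assumption, as the abstract does not state the weights; the function name is illustrative.

```python
def soft_vote(prob_a, prob_b):
    """Average two per-class probability vectors and pick the best class.

    Returns (predicted_class_index, averaged_probabilities).
    Equal weights for both models are an assumption.
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("both models must score the same number of classes")
    averaged = [(a + b) / 2 for a, b in zip(prob_a, prob_b)]
    predicted = max(range(len(averaged)), key=averaged.__getitem__)
    return predicted, averaged
```

Here `prob_a` and `prob_b` would be the softmax outputs of the two CBAM-enhanced networks for a single MRI image over the three tumor classes.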
Technical Paper
G.5. Information Technology and Systems Applications
Naga Subrahmanyeswari Nimmakayala; Krishna Prasad M H M
Abstract
Breast cancer detection is critical for early diagnosis and treatment. This paper utilized the BreakHis dataset, comprising 7,907 histopathological images of breast tumors (benign and malignant) captured at varying magnification levels. Initially, a basic CNN was applied, followed by advanced deep learning architectures including ResNet, EfficientNet, MobileNet, DenseNet, and VGG19. Among these models, ResNet achieved the highest accuracy of 90.2%. To improve performance, a hybrid combination of hand-crafted features (pHash, HOG, GLCM, Hu Moments, SIFT, ORB, and LBP) and transfer learning features (EfficientNet, DenseNet, ResNet, VGG19, MobileNet) was considered. These features were merged into a single feature vector and classified using machine learning algorithms: Logistic Regression, Naive Bayes, KNN, Decision Tree, Random Forest, Gradient Boosting, and XGBoost. XGBoost yielded the highest accuracy of 96.2%. Additionally, deep learning models including Multilayer Perceptron (MLP) and Artificial Neural Networks (ANN) were explored, with ANN slightly outperforming MLP, achieving an accuracy of 98.3% compared to 97.5% for MLP. The results highlight the efficacy of combining traditional and deep learning-based features for improved diagnostic accuracy.
Original/Review Paper
H.3.2.2. Computer vision
Mohammad Hossein Khosravi
Abstract
Document Image Quality Assessment (DIQA) is critical for ensuring the reliability of downstream applications such as Optical Character Recognition (OCR), digital archiving, and automated document workflows. In this paper, we propose a deep learning-based DIQA framework using a Siamese neural network architecture with an InceptionV3 backbone. Our model leverages a composite loss function that combines linear regression loss with a monotonic ranking constraint to jointly optimize for score-level accuracy and perceptual consistency. Unlike prior works that rely on handcrafted features or narrow degradation types, our approach generalizes across diverse distortions commonly observed in scanned and photographed documents. Experimental results on the SOC and SmartDoc-QA datasets demonstrate that the proposed model exhibits a strong correlation with OCR accuracy, achieving SROCC values of 0.952 and 0.873, respectively, and outperforming several state-of-the-art DIQA methods.
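The SROCC figures reported above are Spearman rank-order correlation coefficients between predicted quality scores and OCR accuracy. A minimal pure-Python sketch of that metric is given below, using average ranks for ties and the Pearson correlation of the ranks; the function names are illustrative and this is not the authors' evaluation code.

```python
def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5
```

Because the metric depends only on ranks, it rewards a model for ordering documents by quality in the same way OCR accuracy does, regardless of the scale of the predicted scores.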