H.3.12. Distributed Artificial Intelligence
Z. Amiri; A. Pouyan; H. Mashayekhi
Abstract
Recently, data collection from the seabed by means of underwater wireless sensor networks (UWSNs) has attracted considerable attention. Autonomous underwater vehicles (AUVs) are increasingly used as UWSN nodes in underwater missions. Events and environmental parameters in underwater regions have a stochastic nature. The target area must be covered by sensors to observe and report events. A ‘topology control algorithm’ characterizes how well a sensing field is monitored and how well pairs of sensors are mutually connected in a UWSN. It is prohibitive to use a central controller to guide the AUVs’ behavior due to ever-changing, unknown environmental conditions, limited bandwidth, and lossy communication media. In this research, a completely decentralized three-dimensional topology control algorithm for AUVs is proposed. It aims at achieving maximal coverage of the target area. The algorithm enables AUVs to autonomously decide on and adjust their speed and direction based on the information collected from their neighbors. Each AUV selects the best movement at each step by independently executing a Particle Swarm Optimization (PSO) algorithm. In the fitness function, the global average neighborhood degree is used as the upper limit on the number of neighbors of each AUV. Experimental results show that limiting the number of neighbors of each AUV can lead to more uniform network topologies with larger coverage. It is further shown that the proposed algorithm is more efficient in terms of major network parameters such as target area coverage, deployment time, and the average distance travelled by the AUVs.
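The abstract does not give the exact fitness formulation; below is a minimal sketch of the kind of fitness term it describes, assuming a coverage proxy plus a penalty whenever an AUV's neighbor count exceeds the global average neighborhood degree (function names, weights, and the coverage proxy are illustrative, not the paper's formula).

```python
import numpy as np

def fitness(position, neighbor_positions, avg_degree, comm_range=100.0,
            coverage_weight=1.0, degree_penalty=1.0):
    """Illustrative fitness for one candidate AUV position (not the paper's exact formula).

    Rewards spreading out from neighbors (a proxy for local coverage) and
    penalizes having more neighbors than the global average neighborhood degree.
    """
    neighbor_positions = np.asarray(neighbor_positions, dtype=float)
    dists = np.linalg.norm(neighbor_positions - position, axis=1)
    n_neighbors = int(np.sum(dists <= comm_range))
    # Proxy coverage term: mean distance to neighbors, capped at the communication range.
    coverage = np.mean(np.minimum(dists, comm_range)) if len(dists) else comm_range
    # Penalize exceeding the global average neighborhood degree.
    penalty = max(0, n_neighbors - avg_degree)
    return coverage_weight * coverage - degree_penalty * penalty

# Example: evaluate one candidate 3-D position against three neighbors.
score = fitness(np.array([0.0, 0.0, -20.0]),
                [[30.0, 0.0, -20.0], [0.0, 40.0, -25.0], [120.0, 0.0, -20.0]],
                avg_degree=2)
```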
H.3.8. Natural Language Processing
B. Bokharaeian; A. Diaz
Abstract
Extracting biomedical relations such as drug-drug interactions (DDI) from text is an important task in biomedical NLP. Due to the large number of complex sentences in the biomedical literature, researchers have employed sentence simplification techniques to improve the performance of relation extraction methods. However, due to the difficulty of the task, there is no noteworthy improvement in the research literature. This paper explores clause dependency related features, alongside linguistic-based negation scope and cues, to overcome the complexity of the sentences. The results show that employing the proposed features combined with a bag-of-words kernel improves the performance of the kernel methods used. Moreover, experiments show that the enhanced local context kernel outperforms the other methods. The proposed method can be used as an alternative to sentence simplification techniques in the biomedical area, which is an error-prone task.
H.6.3.2. Feature evaluation and selection
E. Enayati; Z. Hassani; M. Moodi
Abstract
Breast cancer is one of the most common cancers in the world. Early detection of cancer significantly reduces morbidity and treatment costs. Mammography is a known, effective method for diagnosing breast cancer. One way to identify mammography screening behavior is to evaluate women's awareness of participating in mammography screening programs. Today, intelligent systems can identify the main factors behind a specific incident. These systems can help experts in a wide range of areas, especially health domains such as prevention, diagnosis, and treatment. In this paper we use a hybrid model called H-BwoaSvm, in which BWOA is used for detecting the factors affecting mammography screening behavior and SVM is used for classification. Our model is applied to a data set collected from a segmental analytical descriptive study of 2256 women. The proposed model operates on the data set with 82.27 and 98.89 percent accuracy and selects the features affecting mammography screening behavior.
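A minimal sketch of the wrapper idea behind such hybrids, assuming the optimizer proposes binary feature masks that are scored by SVM cross-validation accuracy; the random mask generation below stands in for BWOA, which is not reproduced here, and the data and parameters are illustrative.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 15))           # placeholder feature matrix
y = (X[:, 0] + X[:, 3] > 0).astype(int)  # placeholder labels

def subset_fitness(mask, X, y):
    """Score a binary feature mask by 5-fold SVM accuracy (wrapper evaluation)."""
    if mask.sum() == 0:
        return 0.0
    return cross_val_score(SVC(kernel="rbf"), X[:, mask.astype(bool)], y, cv=5).mean()

# In place of BWOA, try a handful of random masks and keep the best one.
best_mask, best_score = None, -1.0
for _ in range(20):
    mask = rng.integers(0, 2, size=X.shape[1])
    score = subset_fitness(mask, X, y)
    if score > best_score:
        best_mask, best_score = mask, score
```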
H.3. Artificial Intelligence
V. Ghasemi; A. Pouyan; M. Sharifi
Abstract
This paper proposes a scheme for activity recognition in sensor-based smart homes using the Dempster-Shafer theory of evidence. In this work, opinion owners and their belief masses are constructed from sensors and employed in a single-layered inference architecture. The belief masses are calculated using the beta probability distribution function. The frames of the opinion owners are derived automatically for activities, to achieve more flexibility and extensibility. Our method is verified via two experiments. In the first experiment, it is compared to a naïve Bayes approach and three ontology-based methods. In this experiment our method outperforms the naïve Bayes classifier, with 88.9% accuracy. It is comparable to the ontology-based schemes, but since no manual ontology definition is needed, our method is more flexible and extensible than the previous ones. In the second experiment, a larger dataset is used and our method is compared to three approaches based on naïve Bayes classifiers, hidden Markov models, and hidden semi-Markov models. Three features are extracted from the sensors’ data and incorporated in the benchmark methods, making nine implementations. In this experiment our method shows an accuracy of 94.2%, which in most of the cases outperforms the benchmark methods or is comparable to them.
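As a reference point, here is a minimal sketch of Dempster's rule of combination for two mass functions over a small frame of discernment; the frame, the example masses, and the way masses would be derived from a beta distribution are illustrative, not the paper's construction.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions given as dicts {frozenset: mass} using Dempster's rule."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("total conflict; masses cannot be combined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Two sensors reporting belief over the frame {cooking, sleeping}.
COOK, SLEEP = frozenset({"cooking"}), frozenset({"sleeping"})
THETA = COOK | SLEEP
m_stove  = {COOK: 0.7, THETA: 0.3}
m_motion = {COOK: 0.5, SLEEP: 0.2, THETA: 0.3}
print(dempster_combine(m_stove, m_motion))
```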
H.6.2.5. Statistical
M. Mohammadi; M. Sarmad
Abstract
The purpose of this paper is to identify the points that affect the performance of one of the important data mining algorithms, namely the support vector machine (SVM). The final classification decision is made based on a small portion of the data called the support vectors. Therefore, the existence of atypical observations among these points will result in deviation from the correct decision. Thus, the idea of Debruyne’s “outlier map” is employed in this paper to identify the outlying points in the SVM classification problem. However, for computational reasons such as convenience and speed, a robust Mahalanobis distance based on the minimum covariance determinant (MCD) estimator is utilized. This method is well suited to data with a low-dimensional structure. In addition to the classification accuracy, the margin width is used as a criterion for performance assessment; a larger margin is preferred because of its higher generalization ability. It should be noted that by omitting the outliers detected using the suggested outlier map, the generalization ability and accuracy of the SVM are increased. This leads to the conclusion that the proposed method is very efficient in identifying outliers. The capability of recognizing outlying and misclassified observations in this new version of the outlier map is retained, similar to the older version, and is tested on simulated and real-world data.
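A minimal sketch of the robust-distance ingredient, assuming scikit-learn's MCD estimator is used to compute robust Mahalanobis distances and flag points beyond a chi-square cut-off before refitting the SVM; the cut-off and toy data are illustrative, and the paper's outlier map also uses classification information not shown here.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X[:5] += 8.0                      # inject a few atypical observations

mcd = MinCovDet(random_state=0).fit(X)
d2 = mcd.mahalanobis(X)           # squared robust Mahalanobis distances
cutoff = chi2.ppf(0.975, df=X.shape[1])
inliers = d2 <= cutoff

clf = SVC(kernel="linear").fit(X[inliers], y[inliers])  # SVM trained without flagged outliers
```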
H.6. Pattern Recognition
Kh. Sadatnejad; S. Shiry Ghidari; M. Rahmati
Abstract
The kernel trick and projection to tangent spaces are two choices for linearizing data points lying on Riemannian manifolds. These approaches are used to provide the prerequisites for applying standard machine learning methods on Riemannian manifolds. Classical kernels implicitly project data to a high-dimensional feature space without considering the intrinsic geometry of the data points. Projection to tangent spaces truly preserves topology along radial geodesics. In this paper, we propose a method for extrinsic inference on a Riemannian manifold using a kernel approach while the topology of the entire dataset is preserved. We show that computing the Gramian matrix using geodesic distances, on a complete Riemannian manifold with a unique minimizing geodesic between each pair of points, provides a feature mapping which preserves the topology of the data points in the feature space. The proposed approach is evaluated on real datasets composed of EEG signals of patients with two different mental disorders, texture, visual object classes, and tracking datasets. To assess the effectiveness of our scheme, the extracted features are examined by other state-of-the-art techniques for extrinsic inference over the symmetric positive definite (SPD) Riemannian manifold. Experimental results show the superior accuracy of the proposed approach over approaches which use the kernel trick to compute similarity on SPD manifolds without considering the topology of the dataset or only partially preserving it.
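As an illustration of the distance underlying such a Gramian, here is a minimal sketch of the affine-invariant geodesic distance between two SPD matrices, one common choice for SPD manifolds; the paper's exact metric and kernel construction may differ, and the toy matrices are illustrative.

```python
import numpy as np
from scipy.linalg import logm, sqrtm, inv

def spd_geodesic_distance(X, Y):
    """Affine-invariant geodesic distance between SPD matrices X and Y."""
    X_inv_sqrt = inv(sqrtm(X))
    M = X_inv_sqrt @ Y @ X_inv_sqrt
    return np.linalg.norm(logm(M), "fro")

# Toy covariance matrices standing in for, e.g., EEG covariance descriptors.
A = np.array([[2.0, 0.3], [0.3, 1.0]])
B = np.array([[1.5, -0.2], [-0.2, 0.8]])
d = spd_geodesic_distance(A, B)

# A Gramian built from such pairwise geodesic distances (e.g. a Gaussian of d**2)
# would then feed a standard kernel method.
```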
V. Ghasemi; M. Javadian; S. Bagheri Shouraki
Abstract
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM), which is a fuzzy learning scheme inspired by some behavioral features of human brain functionality. The high-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the data points as one-dimensional ink drop patterns, in order to summarize the effects of all data points, and then applies a threshold on the resulting vectors. It is based on an ensemble clustering method which performs one-dimensional density partitioning to produce an ensemble of clustering solutions. It then assigns a unique prime number to the data points in each partition as their label. A combination is then performed by multiplying the labels of every data point in order to produce the absolute labels; data points with identical absolute labels fall into the same cluster. The hierarchical property of the algorithm is intended to cluster complex data by zooming in on each already formed cluster to find further sub-clusters. The algorithm is verified using several synthetic and real-world datasets. The results show that the proposed method has a promising performance compared to some well-known high-dimensional data clustering algorithms.
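A minimal sketch of the prime-label combination step described above, assuming each dimension is partitioned into equal-width bins standing in for the one-dimensional density partitions; the binning scheme and data are illustrative, not the paper's ink-drop procedure.

```python
import numpy as np
from sympy import prime

def prime_label_clustering(X, bins_per_dim=3):
    """Combine per-dimension partition labels multiplicatively, as described above."""
    n, d = X.shape
    absolute = np.full(n, 1, dtype=object)   # object dtype keeps exact integer products
    next_prime_index = 1
    for j in range(d):
        edges = np.linspace(X[:, j].min(), X[:, j].max(), bins_per_dim + 1)
        bin_ids = np.clip(np.digitize(X[:, j], edges[1:-1]), 0, bins_per_dim - 1)
        # One unique prime per partition of this dimension.
        primes = [prime(next_prime_index + k) for k in range(bins_per_dim)]
        next_prime_index += bins_per_dim
        absolute *= np.array([primes[b] for b in bin_ids], dtype=object)
    # Points sharing the same absolute label end up in the same cluster.
    _, cluster_ids = np.unique(absolute.astype(str), return_inverse=True)
    return cluster_ids

X = np.vstack([np.random.randn(50, 4), np.random.randn(50, 4) + 5.0])
labels = prime_label_clustering(X)
```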
G.4. Information Storage and Retrieval
V. Derhami; J. Paksima; H. Khajeh
Abstract
The principal aim of a search engine is to provide results sorted according to the user’s requirements. To achieve this aim, it employs ranking methods to rank the web documents based on their significance and relevance to the user query. The novelty of this paper is a user feedback-based ranking algorithm using reinforcement learning. The proposed algorithm is called RRLUFF, in which the ranking system is considered the agent of the learning system and the selection of documents displayed to the user is the agent's action. The reinforcement signal in this system is calculated based on the user's clicks on the documents. Action-values in the RRLUFF algorithm are calculated for each feature of the document-query pair. In the RRLUFF method, each feature is scored based on the number of documents related to the query and their position in the ranked list of that feature. For learning, documents are sorted according to the modified scores for the next query. Then, according to the position of a document in the ranking list, some documents are selected based on the random distribution of their scores to display to the user. The OHSUMED and DOTIR benchmark datasets are used to evaluate the proposed method. The evaluation results indicate that the proposed method is more effective than the related methods in terms of P@n, NDCG@n, MAP, and NWN.
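A minimal, schematic sketch of the general idea of click-driven per-feature score updates; this is not the RRLUFF update rule, which the abstract does not spell out, and the reward definition, learning rate, and ranking function are illustrative.

```python
import numpy as np

def update_feature_scores(scores, doc_features, clicked, lr=0.1):
    """Nudge per-feature action-values toward features of clicked documents.

    scores: 1-D array of per-feature action-values.
    doc_features: (n_docs, n_features) matrix for the displayed documents.
    clicked: boolean array marking which displayed documents were clicked.
    """
    reward = np.where(clicked, 1.0, -1.0)   # illustrative click-based reinforcement signal
    for feats, r in zip(doc_features, reward):
        scores += lr * r * feats            # reinforce features of clicked documents
    return scores

def rank(doc_features, scores):
    """Rank documents by the weighted sum of their feature values."""
    return np.argsort(doc_features @ scores)[::-1]
```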
H.3.5. Knowledge Representation Formalisms and Methods
N. Khozouie; F. Fotouhi Ghazvini; B. Minaei
Abstract
Context-aware systems must be interoperable and work across different platforms at any time and in any place. Context data collected from wireless body area networks (WBAN) may be heterogeneous and imperfect, which makes their design and implementation difficult. In this research, we introduce a model which takes the dynamic nature of a context-aware system into consideration. This model is constructed according to the four-dimensional objects approach and three-dimensional events for the data collected from a WBAN. In order to support mobility and reasoning on temporal data transmitted from the WBAN, a hierarchical model based on ontology is presented. It supports the relationship between heterogeneous environments and reasoning on the context data for extracting higher-level knowledge. Location is considered a temporal attribute. To support temporal entities, the reification method and Allen’s interval algebra relations are used. Using reification, new classes Time_slice and Time_Interval and new attributes ts_time_slice and ts_time_Interval are defined in the context-aware ontology. Then the thirteen Allen relations, such as Equal, After, and Before, are added to the properties via the OWL-Time ontology. Integration and consistency of the context-aware ontology are checked by the Pellet reasoner. This hybrid context-aware ontology is evaluated by three experts using the FOCA method based on the Goal-Question-Metrics (GQM) approach. This evaluation methodology diagnoses the ontology numerically and decreases the subjectivity and dependency on the evaluator’s experience. The overall performance quality according to the completeness, adaptability, conciseness, consistency, computational efficiency and clarity metrics is 0.9137.
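A minimal sketch of the reification pattern the abstract describes, using rdflib to declare Time_slice and Time_Interval classes and a ts_time_slice property in a toy namespace; the namespace, URIs, and property domains/ranges are illustrative, not the paper's ontology.

```python
from rdflib import Graph, Namespace, RDF, RDFS, OWL

EX = Namespace("http://example.org/context#")   # illustrative namespace
g = Graph()
g.bind("ex", EX)

# Reification classes for temporal entities.
g.add((EX.Time_slice, RDF.type, OWL.Class))
g.add((EX.Time_Interval, RDF.type, OWL.Class))

# Property attaching a time slice to a context entity.
g.add((EX.ts_time_slice, RDF.type, OWL.ObjectProperty))
g.add((EX.ts_time_slice, RDFS.domain, EX.Time_slice))
g.add((EX.ts_time_slice, RDFS.range, EX.ContextEntity))

# An Allen-style relation (e.g. "before") between two intervals.
g.add((EX.intervalBefore, RDF.type, OWL.ObjectProperty))
g.add((EX.intervalBefore, RDFS.domain, EX.Time_Interval))
g.add((EX.intervalBefore, RDFS.range, EX.Time_Interval))

print(g.serialize(format="turtle"))
```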
Hanieh Mohamadi; Asadollah Shahbahrami; Javad Akbari
Abstract
Image retrieval is an important research field which has received great attention in the last decades. In this paper, we present an approach to image retrieval based on the combination of text-based and content-based features. For the text-based features, keywords are used; for the content-based features, color and texture features are used. A query in this system contains some keywords and an input image. First, the images are retrieved based on the input keywords. Then, visual features are extracted to retrieve the ideal output images. For the extraction of color features we have used color moments, and for texture we have used the color co-occurrence matrix. The COREL image database has been used for our experiments. The experimental results show that the performance of the combination of both text- and content-based features is much higher than that of either of them applied separately.
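A minimal sketch of the color-moment descriptor mentioned above, computing mean, standard deviation, and skewness per color channel; the image source, moment ordering, and skewness form are illustrative.

```python
import numpy as np

def color_moments(image):
    """Return [mean, std, skewness] for each channel of an H x W x C image."""
    img = image.astype(float)
    moments = []
    for c in range(img.shape[2]):
        channel = img[:, :, c].ravel()
        mean = channel.mean()
        std = channel.std()
        skew = np.cbrt(((channel - mean) ** 3).mean())  # cube root of the third central moment
        moments.extend([mean, std, skew])
    return np.array(moments)

# Toy RGB image standing in for a COREL image.
rgb = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
descriptor = color_moments(rgb)   # 9-dimensional color feature vector
```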
H.5.10. Applications
Z. Dorrani; M.S. Mahmoodi
Abstract
The edges of an image define the image boundary. When the image is noisy, it is not easy to identify the edges. Therefore, a method needs to be developed that can identify edges clearly in a noisy image. Many methods have been proposed earlier that detect edges using filters, transforms, and wavelets with ant colony optimization (ACO). Here we use ACO for edge detection in noisy images with Gaussian noise and salt-and-pepper noise. As the image edge frequencies are close to the noise frequency band, edge detection using conventional edge detection methods is challenging. The movement of the ants depends on the local discrepancy of the image’s intensity values. Simulation results are compared with existing conventional methods and support the superior performance of the ACO algorithm in edge detection of noisy images. The Canny, Sobel, and Prewitt operators produce thick, non-continuous edges with less clear image content, whereas the applied method gives thin and clear edges.
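A minimal sketch of the local intensity discrepancy that typically drives ant movement in ACO edge detection; the specific neighbour pairs and normalization are a common choice from the ACO edge-detection literature, not necessarily the exact form used here.

```python
import numpy as np

def intensity_discrepancy(img):
    """Per-pixel local intensity variation used as the ants' heuristic information."""
    I = img.astype(float)
    V = np.zeros_like(I)
    # Absolute differences between opposite neighbours around each interior pixel.
    V[1:-1, 1:-1] = (np.abs(I[:-2, :-2] - I[2:, 2:]) +
                     np.abs(I[:-2, 2:] - I[2:, :-2]) +
                     np.abs(I[1:-1, :-2] - I[1:-1, 2:]) +
                     np.abs(I[:-2, 1:-1] - I[2:, 1:-1]))
    return V / V.max() if V.max() > 0 else V

noisy = np.random.randint(0, 256, size=(128, 128))
eta = intensity_discrepancy(noisy)   # heuristic matrix; ants prefer high-eta pixels
```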
I.3.7. Engineering
A. Ardakani; V. R. Kohestani
Abstract
The prediction of liquefaction potential of soil due to an earthquake is an essential task in Civil Engineering. The decision tree is a tree structure consisting of internal and terminal nodes which process the data to ultimately yield a classification. C4.5 is a well-known algorithm widely used to design decision trees. In this algorithm, a pruning process is carried out to solve the problem of over-fitting. This article examines the capability of the C4.5 decision tree for the prediction of seismic liquefaction potential of soil based on Cone Penetration Test (CPT) data. The database contains information about cone resistance (q_c), total vertical stress (σ_0), effective vertical stress (σ'_0), mean grain size (D_50), normalized peak horizontal acceleration at ground surface (a_max), cyclic stress ratio (τ/σ'_0), and earthquake magnitude (M_w). The overall classification success rate for the entire data set is 98%. The results of the C4.5 decision tree have been compared with the available artificial neural network (ANN) and relevance vector machine (RVM) models. The developed C4.5 decision tree provides a viable tool for civil engineers to determine the liquefaction potential of soil.
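A minimal sketch of this kind of CPT-based classification, assuming scikit-learn's CART decision tree stands in for C4.5 (which scikit-learn does not implement) and a placeholder table of CPT records; column order and values are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Columns: q_c, sigma_0, sigma_0_eff, D50, a_max, cyclic_stress_ratio, M_w
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 7))            # placeholder CPT records
y = (X[:, 5] > 0.5).astype(int)           # placeholder liquefied / not liquefied labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
tree = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0)  # pruning via cost-complexity
tree.fit(X_tr, y_tr)
print("test accuracy:", tree.score(X_te, y_te))
```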
A.10. Power Management
Kh. Valipour; A. Ghasemi
Abstract
The optimal reactive power dispatch (ORPD) problem is a very important aspect of power system planning and is a highly nonlinear, non-convex optimization problem because it consists of both continuous and discrete control variables. Since the power system has inherent uncertainty, this paper presents both deterministic and stochastic models for the ORPD problem, in multi-objective and single-objective formulations respectively. The deterministic model considers three main issues in the ORPD problem, namely real power loss, voltage deviation, and a voltage stability index, whereas in the stochastic model the uncertainty in the demand and in the equivalent availability of shunt reactive power compensators is investigated. To solve them, a new modified harmony search algorithm (HSA) is proposed and implemented in single- and multi-objective forms. Like many other general-purpose optimization methods, the original HSA often becomes trapped in local optima. To cope with this, an efficient local search method called chaotic local search (CLS) and a global search operator are added to the internal architecture of the original HSA to improve its ability to find the best solution, because the ORPD problem is very complex, with different types of continuous and discrete constraints, i.e., excitation settings of generators, sizes of fixed capacitors, tap positions of tap-changing transformers, and the amount of reactive compensation devices. Moreover, a fuzzy decision-making method is employed to select the best solution from the set of Pareto solutions.
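A minimal sketch of a chaotic local search step around an incumbent solution using the logistic map, which is a common way to implement CLS; the map choice, perturbation radius, and toy objective are illustrative, not the paper's exact operator.

```python
import numpy as np

def chaotic_local_search(x_best, objective, lower, upper, n_steps=20, radius=0.05):
    """Perturb the incumbent with logistic-map chaos and keep any improvement."""
    dim = len(x_best)
    z = np.random.uniform(0.01, 0.99, size=dim)      # chaotic variables in (0, 1)
    best, best_val = x_best.copy(), objective(x_best)
    for _ in range(n_steps):
        z = 4.0 * z * (1.0 - z)                      # logistic map iteration
        candidate = best + radius * (upper - lower) * (2.0 * z - 1.0)
        candidate = np.clip(candidate, lower, upper)
        val = objective(candidate)
        if val < best_val:
            best, best_val = candidate, val
    return best, best_val

# Toy objective standing in for the ORPD cost (e.g. real power loss).
sphere = lambda x: float(np.sum(x ** 2))
lo, hi = np.full(5, -1.0), np.full(5, 1.0)
x0 = np.random.uniform(lo, hi)
x_new, f_new = chaotic_local_search(x0, sphere, lo, hi)
```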
H.6.5.10. Remote sensing
M. Imani
Abstract
Due to the abundant spectral information contained in hyperspectral images, they are suitable data for anomalous target detection. The use of spatial features in addition to spectral ones can improve the anomaly detection performance. An anomaly detector, called the nonparametric spectral-spatial detector (NSSD), is proposed in this work which utilizes the benefits of spatial features and local structures extracted by morphological filters. The obtained spectral-spatial hypercube has high dimensionality, so accurate estimates of the background statistics in small local windows may not be obtained, and applying conventional detectors such as local Reed-Xiaoli (RX) to the high-dimensional data is not possible. To deal with this difficulty, a nonparametric distance, without any need to estimate the data statistics, is used instead of the Mahalanobis distance. According to the experimental results, the detection accuracy improvement of the proposed NSSD method compared to global RX, local RX, weighted RX, linear filtering based RX (LF-RX), background joint sparse representation detection (BJSRD), kernel RX, subspace RX (SSRX), and RX and uniform target detector (RX-UTD) is on average 47.68%, 27.86%, 13.23%, 29.26%, 3.33%, 17.07%, 15.88%, and 44.25%, respectively.
H.5. Image Processing and Computer Vision
A. Asilian Bidgoli; H. Ebrahimpour-Komle; M. Askari; Seyed J. Mousavirad
Abstract
This paper parallelizes the spatial pyramid match kernel (SPK) implementation. SPK is one of the most widely used kernel methods, along with the support vector machine classifier, with high accuracy in object recognition. The MATLAB parallel computing toolbox has been used to parallelize SPK. In this implementation, MATLAB Message Passing Interface (MPI) functions and features included in the toolbox help us obtain good performance through two schemes: task-parallelization and data-parallelization models. The parallel SPK algorithm ran over a cluster of computers and achieved a shorter run time. A speedup value of 13 is obtained for a configuration with up to 5 Quad processors.
H.5.10. Applications
S. Shoorabi Sani
Abstract
In this study, a system for monitoring the structural health of a bridge deck and predicting various possible damages to this section was designed, implemented, and investigated based on measuring temperature and humidity with wireless sensor networks. A scaled model of a conventional medium-sized bridge (length of 50 meters, height of 10 meters, and with 2 piers) was examined for the purpose of this study. The method includes installing two sensor nodes with the ability to measure temperature and humidity on both sides of the bridge deck. The data collected by the system, including temperature and humidity values, are received by LabVIEW-based software to be analyzed and stored in a database. The proposed SHM system is equipped with a novel method of applying data mining techniques to a database of the climatic conditions of the past few years at the location of the bridge to predict the occurrence and severity of future damage. In addition, this system has several alarm levels based on analysis of the bridge conditions with a fuzzy inference method, so it can issue proactive and precise warnings and alarms in terms of the place of occurrence and severity of possible damage to the bridge deck, to ensure total productive maintenance (TPM) and proactive maintenance. Very low costs, increased efficiency of the bridge service, and reduced maintenance costs make this SHM system practical and applicable. The data and results related to all of the mentioned subjects are thoroughly discussed.
H.3.8. Natural Language Processing
A. Khazaei; M. Ghasemzadeh
Abstract
This paper compares clusters of aligned Persian and English texts obtained with the k-means method. Text clustering has many applications in various fields of natural language processing. So far, much research on English document clustering has been carried out. Now this question arises: are its results extendable to other languages? Since the goal of document clustering is grouping documents based on their content, it is expected that the answer to this question is yes. On the other hand, many differences between languages can cause the answer to be no. This research focuses on k-means, which is one of the basic and popular document clustering methods. We want to know whether the clusters of aligned Persian and English texts obtained by k-means are similar. To find an answer to this question, the Mizan English-Persian Parallel Corpus was considered as the benchmark. After feature extraction using text mining techniques and applying the PCA dimension reduction method, k-means clustering was performed. The morphological differences between English and Persian caused a larger feature vector length for Persian. So, in almost all experiments, the English results were slightly richer than those in Persian. Aside from these differences, the overall behavior of the Persian and English clusters was similar. These similar behaviors show that the results of k-means research on English can be extended to Persian. Finally, there is hope that despite many differences between languages, clustering methods may be extendable to other languages.
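A minimal sketch of the pipeline described above (feature extraction, PCA reduction, k-means), assuming TF-IDF features stand in for the paper's text-mining features and a toy corpus replaces the Mizan corpus.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell sharply today",
    "investors reacted to the market news",
]

tfidf = TfidfVectorizer().fit_transform(docs).toarray()   # feature extraction
reduced = PCA(n_components=2).fit_transform(tfidf)        # dimension reduction
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)
print(labels)
```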
F.4.5. Markov processes
E. Golrasan; H. Sameti
Abstract
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian combination (for clean speech and noise, respectively) in the HMM framework, namely sparse code shrinkage-HMM (SCS-HMM). The proposed method is evaluated on the TIMIT database in the presence of three noise types at three SNR levels, in terms of PESQ and SNR, and compared with the auto-regressive HMM (AR-HMM) and speech enhancement based on HMM with discrete cosine transform (DCT) coefficients using Laplace and Gaussian distributions (LaGa-HMMDCT). The results confirm the superiority of the SCS-HMM method in the presence of non-stationary noises compared to LaGa-HMMDCT. The results of the SCS-HMM method also show better performance than AR-HMM in the presence of white noise based on the PESQ measure.
H. Haghshenas Gorgani; A. R. Jahantigh Pak
Abstract
Identification of the factors affecting the teaching quality of engineering drawing, and of the interactions between them, is necessary in order to determine which manipulation will improve the quality of teaching this course. Since the above issue is a Multi-Criteria Decision Making (MCDM) problem and, on the other hand, we are faced with human factors, the fuzzy DEMATEL method is suggested for solving it. Also, because DEMATEL analysis does not lead to a weighting of the criteria, it is combined with the ANP, and a hybrid fuzzy DEMATEL-ANP (FDANP) methodology is used. The results of investigating 7 dimensions and 21 criteria show that the quality of teaching this course increases if updated teaching methods and contents are used, the evaluation policy is tailored to the course, the course professor and his/her assistants are available to correct students' mistakes, and there is also an interactive system based on student comments.
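For reference, here is a minimal sketch of the crisp DEMATEL core step, which builds the total-relation matrix from a direct-influence matrix; the fuzzy extension and the ANP weighting used in the paper are not shown, and the influence scores are illustrative.

```python
import numpy as np

# Illustrative direct-influence matrix among 4 factors (0 = no influence, 4 = very high).
D = np.array([
    [0, 3, 2, 1],
    [1, 0, 3, 2],
    [2, 1, 0, 3],
    [1, 2, 1, 0],
], dtype=float)

# Normalize so the largest row/column sum becomes 1.
s = max(D.sum(axis=1).max(), D.sum(axis=0).max())
N = D / s

# Total-relation matrix T = N (I - N)^(-1).
T = N @ np.linalg.inv(np.eye(len(D)) - N)

prominence = T.sum(axis=1) + T.sum(axis=0)   # D + R: overall importance of each factor
relation = T.sum(axis=1) - T.sum(axis=0)     # D - R: net cause (+) or effect (-)
```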
H.3. Artificial Intelligence
T. Zare; M. T. Sadeghi; H. R. Abutalebi; J. Kittler
Abstract
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated considerable interest in the topic of metric learning, especially using kernel functions, which map data to feature spaces with enhanced class separability and implicitly define a new metric in the original feature space. The formulation of the metric learning problem depends on the supervisory information available for the task. In this paper, we focus on semi-supervised kernel-based distance metric learning where the training data set is unlabelled, with the exception of a small subset of pairs of points labelled as belonging to the same class (cluster) or different classes (clusters). The proposed method involves creating a pool of kernel functions. The corresponding kernel matrices are first clustered to remove redundancy in representation. A composite kernel constructed from the kernel clustering result is then expanded into an orthogonal set of basis functions. The mixing parameters of this expansion are then optimised using the point similarity and dissimilarity information conveyed by the labels. The proposed method is evaluated on synthetic and real data sets. The results show the merit of using similarity and dissimilarity information jointly as compared with using just the similarity information, and the superiority of the proposed method over all the recently introduced metric learning approaches.
H.3.8. Natural Language Processing
A. Akkasi; E. Varoglu
Abstract
Chemical Named Entity Recognition (NER) is the basic step for subsequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, and extraction of the names of molecules and their properties. Improvement in the performance of such systems may affect the quality of the subsequent tasks. Chemical text, from which data for named entity recognition is extracted, is naturally imbalanced, since chemical entities are fewer compared to other segments of the text. In this paper, the class imbalance problem in the context of chemical named entity recognition has been studied, and an adapted version of random undersampling for NER data has been leveraged to generate a pool of classifiers. In order to keep the class distribution balanced within each sentence, the well-known random undersampling method is modified to a sentence-based version where the random removal of samples takes place within each sentence instead of considering the dataset as a whole. Furthermore, to take advantage of combining a set of diverse predictors, an ensemble of classifiers trained on the different training sets resulting from sentence-based undersampling is created. The proposed approach is developed and tested using the ChemDNER corpus released by BioCreative IV. Results show that the proposed method improves the classification performance of the baseline classifiers, mainly as a result of an increase in recall. Furthermore, the combination of high-performing classifiers trained using undersampled training data surpasses the performance of all single best classifiers and of the combination of classifiers using the full data.
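A minimal sketch of sentence-based undersampling as described above, dropping a random fraction of non-entity ('O') tokens within each sentence rather than over the whole corpus; the tag scheme, keep ratio, and example sentence are illustrative.

```python
import random

def undersample_sentence(tokens, tags, keep_ratio=0.5, rng=random):
    """Randomly drop non-entity tokens within a single sentence to rebalance classes."""
    kept_tokens, kept_tags = [], []
    for tok, tag in zip(tokens, tags):
        if tag != "O" or rng.random() < keep_ratio:
            kept_tokens.append(tok)
            kept_tags.append(tag)
    return kept_tokens, kept_tags

sentence = ["Aspirin", "reduces", "the", "effect", "of", "ibuprofen", "significantly"]
labels   = ["B-CHEM",  "O",       "O",   "O",      "O",  "B-CHEM",    "O"]

# Each undersampled copy of the corpus trains one member of the classifier ensemble.
random.seed(0)
print(undersample_sentence(sentence, labels, keep_ratio=0.4))
```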
F.3.4. Applications
N. Ashrafi Payaman; M.R. Kangavari
Abstract
One solution for processing and analyzing massive graphs is summarization. Generating a high-quality summary is the main challenge of graph summarization. With the aim of generating a better-quality summary for a given attributed graph, both structural and attribute similarities must be considered. There are two measures, named density and entropy, for evaluating the quality of structural and attribute-based summaries, respectively. For an attributed graph, a high-quality summary is one that covers both the graph structure and its attributes with user-specified degrees of importance. Recently, two methods have been proposed for summarizing a graph based on both graph structure and attribute similarities. In this paper, a new method for hybrid summarization of a given attributed graph is proposed, and the quality of the summary generated by this method is compared with the recently proposed method for this purpose. Experimental results show that our proposed method generates a summary with a better quality.
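A minimal sketch of the two quality measures named above, computed for a toy grouping of nodes into supernodes: density as the fraction of intra-group node pairs that are actually connected, and entropy as the mixing of attribute values inside each group; the exact definitions used in the paper may weight or normalize these differently.

```python
import math
from itertools import combinations
from collections import Counter

edges = {(1, 2), (2, 3), (1, 3), (4, 5)}
attributes = {1: "red", 2: "red", 3: "blue", 4: "red", 5: "red"}
groups = [[1, 2, 3], [4, 5]]                     # a candidate summary (supernodes)

def density(groups, edges):
    """Fraction of possible intra-group edges that exist, averaged over groups."""
    vals = []
    for g in groups:
        pairs = list(combinations(sorted(g), 2))
        if pairs:
            vals.append(sum((a, b) in edges for a, b in pairs) / len(pairs))
    return sum(vals) / len(vals)

def entropy(groups, attributes):
    """Average attribute-value entropy inside each group (lower = purer groups)."""
    vals = []
    for g in groups:
        counts = Counter(attributes[v] for v in g)
        total = sum(counts.values())
        vals.append(-sum(c / total * math.log2(c / total) for c in counts.values()))
    return sum(vals) / len(vals)

print(density(groups, edges), entropy(groups, attributes))
```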
A.R. Tajary; E. Tahanian
Abstract
Wireless network-on-chip (WiNoC) is one of the promising on-chip interconnection networks for on-chip system architectures. In addition to wired links, these architectures also use wireless links. Using these wireless links makes packets reach destination nodes faster and with less power consumption. These wireless links are provided by wireless interfaces in wireless routers. WiNoC architectures differ in the position of the wireless routers and how they interact with the other routers, so the placement of wireless interfaces is an important step in designing WiNoC architectures. In this paper, we propose a simulated annealing (SA) placement method which considers the routing algorithm as a factor in designing the cost function. To evaluate the proposed method, Noxim, a cycle-accurate network-on-chip simulator, is used. The simulation results show that the proposed method can reduce flit latency by up to 24.6% with about a 0.2% increase in power consumption.
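A minimal sketch of simulated-annealing placement of wireless interfaces on a mesh, assuming a hop-count-based cost stands in for a routing-aware cost function; mesh size, move rule, and cooling schedule are illustrative, not the paper's configuration.

```python
import math
import random
from itertools import product

MESH = 8                       # 8 x 8 mesh of routers
N_WI = 4                       # number of wireless interfaces to place
nodes = list(product(range(MESH), range(MESH)))

def cost(placement):
    """Average hop distance from every router to its nearest wireless interface
    (a stand-in for a routing-aware latency cost)."""
    return sum(min(abs(x - a) + abs(y - b) for a, b in placement)
               for x, y in nodes) / len(nodes)

random.seed(0)
current = random.sample(nodes, N_WI)
best, T = list(current), 10.0
for step in range(2000):
    candidate = list(current)
    candidate[random.randrange(N_WI)] = random.choice(nodes)   # move one interface
    delta = cost(candidate) - cost(current)
    if delta < 0 or random.random() < math.exp(-delta / T):
        current = candidate
        if cost(current) < cost(best):
            best = list(current)
    T *= 0.995                                                  # geometric cooling

print("best placement:", best, "cost:", round(cost(best), 3))
```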
Document and Text Processing
S. Momtazi; A. Rahbar; D. Salami; I. Khanijazani
Abstract
Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing the semantic concepts of text has motivated researchers to use semantic models for document vector representations. Latent Dirichlet allocation (LDA) topic modeling and doc2vec neural document embedding are two well-known techniques for this purpose. In this paper, we first study the conceptual difference between the two models and show that they have different behavior and capture semantic features of texts from different perspectives. We then propose a hybrid approach to document vector representation to benefit from the advantages of both models. The experimental results on 20newsgroup show the superiority of the proposed model compared to each of the baselines on both text clustering and classification tasks. We achieve a 2.6% improvement in F-measure for text clustering and a 2.1% improvement in F-measure for text classification compared to the best baseline model.
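A minimal sketch of one way to build such hybrid vectors, concatenating an LDA topic distribution with a doc2vec embedding using gensim; the combination by concatenation, toy corpus, and hyperparameters are illustrative, and the paper's hybrid scheme may differ.

```python
import numpy as np
from gensim.corpora import Dictionary
from gensim.models import LdaModel
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

texts = [
    ["space", "shuttle", "launch", "orbit"],
    ["hockey", "team", "game", "season"],
    ["orbit", "satellite", "space", "mission"],
    ["game", "players", "score", "hockey"],
]

# LDA topic proportions for each document.
dictionary = Dictionary(texts)
bows = [dictionary.doc2bow(t) for t in texts]
lda = LdaModel(bows, num_topics=2, id2word=dictionary, passes=10, random_state=0)
lda_vecs = np.array([[p for _, p in lda.get_document_topics(b, minimum_probability=0.0)]
                     for b in bows])

# doc2vec embeddings for the same documents.
tagged = [TaggedDocument(words, [i]) for i, words in enumerate(texts)]
d2v = Doc2Vec(tagged, vector_size=16, min_count=1, epochs=50, seed=0)
d2v_vecs = np.array([d2v.dv[i] for i in range(len(texts))])

# Hybrid representation: concatenate the two views.
hybrid = np.hstack([lda_vecs, d2v_vecs])
print(hybrid.shape)   # (4, 2 + 16)
```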
C.5. Operating Systems
M. Tajamolian; M. Ghasemzadeh
Abstract
In order to achieve virtual machine live migration, two strategies, "pre-copy" and "post-copy", have been presented. Each of these strategies, depending on the operating conditions of the machine, may perform better than the other. In this article, a new algorithm is presented that automatically decides how the virtual machine live migration takes place. In this approach, the virtual machine memory is considered as an informational object that has a revision number and is constantly changing. We have determined precise criteria for evaluating the behavior of a virtual machine and automatically selecting the appropriate live migration strategy. Also in this article, different aspects of the required simulations and implementations are considered. Analytical evaluation shows that using the proposed scheme and the presented algorithm can significantly improve the virtual machine live migration process.
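A minimal, schematic sketch of this kind of automatic choice, assuming the decision is driven by how fast the memory revision number advances relative to the copy bandwidth; the threshold, comparison, and function names are illustrative, and the paper's precise criteria are not reproduced here.

```python
def choose_migration_strategy(revision_rate_pages_per_s, link_bandwidth_pages_per_s):
    """Pick a live-migration strategy from the observed memory-change rate.

    revision_rate_pages_per_s: how quickly the VM dirties memory pages
    link_bandwidth_pages_per_s: how quickly pages can be copied over the network
    """
    # If pages are dirtied slower than they can be copied, iterative pre-copy converges.
    if revision_rate_pages_per_s < 0.5 * link_bandwidth_pages_per_s:
        return "pre-copy"
    # Otherwise pre-copy would keep re-sending hot pages; switch execution first.
    return "post-copy"

print(choose_migration_strategy(2_000, 10_000))   # -> "pre-copy"
print(choose_migration_strategy(9_000, 10_000))   # -> "post-copy"
```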