IEICE TRANSACTIONS on Information and Systems

  • Impact Factor

    0.59

  • Eigenfactor

    0.002

  • Article Influence

    0.1

  • CiteScore

    1.4

Advance publication (published online immediately after acceptance)

Volume E101-D No.1 (Publication Date: 2018/01/01)

    Special Section on Enriched Multimedia — Potential and Possibility of Multimedia Contents for the Future —
  • FOREWORD Open Access

    Akinori ITO  

     
    FOREWORD

      Page(s):
    1-1
  • BiometricJammer: Method to Prevent Acquisition of Biometric Information by Surreptitious Photography on Fingerprints Open Access

    Isao ECHIZEN  Tateo OGANE  

     
    INVITED PAPER

      Publicized:
    2017/10/16
      Page(s):
    2-12

    Advances in fingerprint authentication technology have led to it being used in a growing range of personal devices such as PCs and smartphones. However, they have also made it possible to capture fingerprints remotely with a digital camera, putting the target person at risk of illegal log-ins and identity theft. This article shows how fingerprints captured in this manner can pass authentication and how people can protect their fingerprints against surreptitious photography. First, we show that photographed fingerprints carry enough information to spoof fingerprint authentication systems by demonstrating with “fake fingers” made from such photographs. Then we present a method that defeats surreptitious photography without preventing the use of legitimate fingerprint authentication devices. Finally, we demonstrate that an implementation of the proposed method called “BiometricJammer”, a wearable device put on a fingertip, can effectively prevent the illegal acquisition of fingerprints by surreptitious photography while still enabling contact-based fingerprint sensors to respond normally.

  • Robust Image Identification without Visible Information for JPEG Images

    Kenta IIDA  Hitoshi KIYA  

     
    PAPER

      Publicized:
    2017/10/16
      Page(s):
    13-19

    A robust identification scheme for JPEG images is proposed in this paper. The aim is to robustly identify JPEG images that are generated from the same original image under various compression conditions, such as differences in compression ratios and initial quantization matrices. The proposed scheme produces no false negative matches in principle. In addition, secure features, which carry no visual information, are used to achieve an identification scheme that is not only robust but also secure. Conventional schemes cannot avoid false negative matches under some compression conditions and must manage a secret key for secure identification. The proposed scheme is applicable to the uploading process of images on social networks like Twitter for image retrieval and forensics. A number of experiments are carried out to demonstrate the effectiveness of the proposed method. The proposed method outperforms conventional ones in terms of query performance, while keeping a reasonable security level.
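
    As a rough illustration of why requantization-robust matching with no false negatives is possible, the sketch below (not the authors' exact construction; the grayscale input, block size, and helper names are assumptions) compares the signs of blockwise DCT coefficients, which may shrink to zero under recompression but never flip:

```python
# Compare blockwise DCT coefficient signs: requantization can shrink a
# coefficient to zero but never flips its sign, so two JPEG versions of
# the same original should never disagree with opposite non-zero signs.
import numpy as np
from scipy.fft import dctn

def sign_feature(gray, block=8):
    h, w = (d - d % block for d in gray.shape)
    g = gray[:h, :w].astype(np.float64) - 128.0
    signs = np.empty((h // block, w // block, block, block), dtype=np.int8)
    for i in range(0, h, block):
        for j in range(0, w, block):
            c = dctn(g[i:i + block, j:j + block], norm='ortho')
            signs[i // block, j // block] = np.sign(np.round(c))
    return signs

def compatible(f1, f2):
    # a true mismatch requires strictly opposite non-zero signs somewhere
    return not np.any(f1.astype(np.int32) * f2 < 0)
```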

  • Scalable Distributed Video Coding for Wireless Video Sensor Networks

    Hong YANG  Linbo QING  Xiaohai HE  Shuhua XIONG  

     
    PAPER

      Publicized:
    2017/10/16
      Page(s):
    20-27

    Wireless video sensor networks face problems such as low power consumption of sensor nodes, low computing capacity of nodes, and unstable channel bandwidth. To transmit distributed video coding video over wireless video sensor networks, we propose an efficient scalable distributed video coding scheme. In this scheme, the scalable Wyner-Ziv frame is based on transmitting different wavelet information, while the Key frame is based on transmitting different residual information. A successive refinement of side information for the Wyner-Ziv and Key frames is also proposed. Test results show that both the Wyner-Ziv and Key frames are scalable over four layers in quality and bit rate, with no increase in encoder complexity.
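
    A minimal sketch of the layering idea, assuming a plain wavelet decomposition with the pywt package rather than the paper's actual Wyner-Ziv pipeline: quality layer k reconstructs the frame from the coarse band plus the k coarsest detail bands received so far.

```python
import numpy as np
import pywt

def wavelet_layers(frame, levels=3, wavelet='haar'):
    coeffs = pywt.wavedec2(frame.astype(np.float64), wavelet, level=levels)
    layers = []
    for k in range(levels + 1):
        kept = [coeffs[0]] + [
            detail if i < k else tuple(np.zeros_like(d) for d in detail)
            for i, detail in enumerate(coeffs[1:])
        ]
        layers.append(pywt.waverec2(kept, wavelet))  # quality layer k
    return layers  # layers[0]: coarsest preview, layers[-1]: full quality
```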

  • A Study on Quality Metrics for 360 Video Communications

    Huyen T. T. TRAN  Cuong T. PHAM  Nam PHAM NGOC  Anh T. PHAM  Truong Cong THANG  

     
    PAPER

      Publicized:
    2017/10/16
      Page(s):
    28-36

    360 videos have recently become a popular virtual reality content type. However, a good quality metric for 360 videos is still an open issue. In this work, our goal is to identify appropriate objective quality metrics for 360 video communications. In particular, fourteen objective quality measures at different processing phases are considered. Also, a subjective test is conducted, and the relationship between objective quality and subjective quality is investigated. It is found that most of the PSNR-related quality measures correlate well with subjective quality. However, for evaluating video quality across different contents, a content-based quality metric is needed.
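
    Since most of the well-correlated measures are PSNR-related, a minimal reference implementation is sketched below, together with a sphere-area-weighted variant in the style of WS-PSNR for equirectangular frames; whether this exact variant is among the fourteen studied measures is an assumption.

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def ws_psnr(ref, test, peak=255.0):
    # weight each row by its sphere area in an equirectangular projection
    h, w = ref.shape
    wt = np.cos((np.arange(h) + 0.5 - h / 2) * np.pi / h)[:, None]
    err = (ref.astype(np.float64) - test.astype(np.float64)) ** 2
    mse = np.sum(wt * err) / (wt.sum() * w)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```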

  • On the Security of Block Scrambling-Based EtC Systems against Extended Jigsaw Puzzle Solver Attacks

    Tatsuya CHUMAN  Kenta KURIHARA  Hitoshi KIYA  

     
    PAPER

      Publicized:
    2017/10/16
      Page(s):
    37-44

    The aim of this paper is to apply automatic jigsaw puzzle solvers, which are methods of assembling jigsaw puzzles, to the field of information security. Encryption-then-Compression (EtC) systems have been considered for the user-controllable privacy protection of digital images in social network services. Block scrambling-based encryption schemes, which have been proposed to construct EtC systems, have key spaces large enough to protect against brute-force attacks. However, each block in an encrypted image retains almost the same correlation as in the original image. Therefore, the security must be considered from viewpoints different from those of number-theory-based encryption methods with provable security, such as RSA and AES. In this paper, existing jigsaw puzzle solvers, which aim to assemble puzzles containing only scrambled and rotated pieces, are first reviewed in terms of attacking strategies on encrypted images. Then, an extended jigsaw puzzle solver for block scrambling-based encryption schemes is proposed to solve encrypted images containing inverted, negative-positive transformed, and color-component-shuffled blocks in addition to scrambled and rotated ones. In the experiments, the jigsaw puzzle solvers are applied to encrypted images to examine the security conditions of the encryption schemes.
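
    The block operations named above can be made concrete with a short numpy sketch; the block size, key handling, and the choice of a flip for "inversion" are illustrative assumptions, not the exact EtC parameterization:

```python
import numpy as np

def encrypt_blocks(img, block=16, seed=42):
    rng = np.random.default_rng(seed)            # the seed acts as the key
    h = img.shape[0] - img.shape[0] % block
    w = img.shape[1] - img.shape[1] % block
    blocks = [img[i:i + block, j:j + block].copy()
              for i in range(0, h, block) for j in range(0, w, block)]
    rng.shuffle(blocks)                          # scramble block positions
    out = []
    for b in blocks:
        b = np.rot90(b, k=int(rng.integers(4)))  # rotate 0/90/180/270
        if rng.integers(2):
            b = b[::-1]                          # flip ("inversion")
        if rng.integers(2):
            b = 255 - b                          # negative-positive transform
        out.append(b[:, :, rng.permutation(3)])  # shuffle color components
    cols = w // block
    rows = [np.concatenate(out[r * cols:(r + 1) * cols], axis=1)
            for r in range(len(out) // cols)]
    return np.concatenate(rows, axis=0)

enc = encrypt_blocks(np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8))
```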

  • Shoulder-Surfing Resistant Authentication Using Pass Pattern of Pattern Lock

    So HIGASHIKAWA  Tomoaki KOSUGI  Shogo KITAJIMA  Masahiro MAMBO  

     
    PAPER

      Publicized:
    2017/10/16
      Page(s):
    45-52

    We study an authentication method using the secret figures of Pattern Lock, called pass patterns. In recent years, preventing the leakage of personal and company information from mobile devices has become important. Android devices adopt a login authentication method called Pattern Lock, which achieves both high resistance to brute-force attack and good usability by virtue of pass patterns. However, Pattern Lock has the problem that pass patterns directly input to the terminal can be easily captured by a shoulder-surfing attack. In this paper, we propose a shoulder-surfing resistant authentication scheme using the pass pattern of Pattern Lock, which adopts challenge & response authentication and also exploits users' short-term memory. We implement the proposed method as an Android application and measure its success rate, authentication time, and resistance against shoulder surfing. We also evaluate security and usability in comparison with related work.
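
    The following is a purely hypothetical illustration of the challenge & response idea (the grid alphabet, pattern encoding, and verification flow are invented for exposition, not taken from the paper): cells are relabeled at every login with symbols from a small alphabet, so several cells share a symbol and a single observed response does not uniquely reveal the pass pattern.

```python
import random

ALPHABET = "ABCD"                # small on purpose: cells share symbols
PASS_PATTERN = [0, 1, 2, 5, 8]   # secret cell indices on the 3x3 grid

def challenge():
    # a fresh random labeling of the 9 grid cells for this login attempt
    return [random.choice(ALPHABET) for _ in range(9)]

def expected_response(chal, pattern=PASS_PATTERN):
    # the user answers with the symbols lying on the secret pattern,
    # recalled from short-term memory, instead of tracing the pattern
    return "".join(chal[cell] for cell in pattern)

chal = challenge()
print("grid:", chal)                       # what user (and surfer) sees
print("type:", expected_response(chal))    # what the legitimate user enters
```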

  • Speech Privacy for Sound Surveillance Using Super-Resolution Based on Maximum Likelihood and Bayesian Linear Regression

    Ryouichi NISHIMURA  Seigo ENOMOTO  Hiroaki KATO  

     
    PAPER

      Publicized:
    2017/10/16
      Page(s):
    53-63

    Surveillance with multiple cameras and microphones is promising for tracing the activities of suspicious persons for security purposes. When these sensors are connected to the Internet, they might also jeopardize innocent people's privacy because, as a result of human error, signals from the sensors might allow eavesdropping by malicious persons. This paper proposes exploiting super-resolution to address this problem. Super-resolution is a signal processing technique by which a high-resolution version of a signal can be reproduced from low-resolution versions of the same signal source. Because of this property, an intelligible speech signal can be reconstructed from multiple sensor signals, each of which is completely unintelligible on its own because of its sufficiently low sampling rate. A method based on Bayesian linear regression is proposed and compared with one based on maximum likelihood. Computer simulations using a simple sinusoidal input demonstrate that the methods restore the original signal from actually measured signals. Moreover, the results show that the method based on Bayesian linear regression is more robust than maximum likelihood under various microphone configurations in noisy environments, and that this advantage is most pronounced when the number of microphones enrolled in the process is close to the minimum required. Finally, listening tests using speech signals confirmed that the mean opinion score (MOS) of the reconstructed signal reaches 3, while those of the original signals captured at each single microphone are almost 1.
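
    A minimal numpy sketch of the Bayesian-linear-regression reconstruction, under assumed priors and a sinusoidal basis: several microphones sample the source at a low rate with different time offsets, and the posterior mean of the basis coefficients is used to resynthesize the signal at a high rate.

```python
import numpy as np

rng = np.random.default_rng(0)
f_sig, fs_hi, fs_lo, n_mics, dur = 440.0, 16000, 1000, 4, 0.05
freqs = np.arange(100.0, 800.0, 20.0)          # candidate frequencies

def design(t):                                 # sin/cos features at times t
    return np.hstack([np.sin(2 * np.pi * f * t[:, None]) for f in freqs] +
                     [np.cos(2 * np.pi * f * t[:, None]) for f in freqs])

# pooled low-rate observations from microphones with different offsets
t_lo = np.concatenate([np.arange(m / (n_mics * fs_lo), dur, 1 / fs_lo)
                       for m in range(n_mics)])
y = np.sin(2 * np.pi * f_sig * t_lo) + 0.05 * rng.standard_normal(t_lo.size)

alpha, beta = 1e-2, 1e2              # prior and noise precisions (assumed)
A = design(t_lo)
S_inv = alpha * np.eye(A.shape[1]) + beta * A.T @ A
w_mean = np.linalg.solve(S_inv, beta * A.T @ y)  # posterior coefficient mean
t_hi = np.arange(0, dur, 1 / fs_hi)
x_hat = design(t_hi) @ w_mean                    # high-rate reconstruction
```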

  • A Local Feature Aggregation Method for Music Retrieval

    Jin S. SEO  

     
    LETTER

      Publicized:
    2017/10/16
      Page(s):
    64-67

    Song-level feature summarization is an essential building block for browsing, retrieval, and indexing of digital music. This paper proposes a local pooling method to aggregate the feature vectors of a song over a universal background model. Two types of local activation patterns of the feature vectors are derived: one representation takes the form of a histogram, and the other is given by a binary vector. Experiments on three publicly available music datasets show that the proposed local aggregation of auditory features is promising for music-similarity computation.
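
    A sketch of the aggregation step, with sklearn's GaussianMixture standing in for the universal background model (UBM); the component count and binarization threshold are illustrative choices:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_ubm(all_frames, n_components=64, seed=0):
    # UBM trained on feature frames pooled across many songs
    return GaussianMixture(n_components, covariance_type='diag',
                           random_state=seed).fit(all_frames)

def song_descriptor(frames, ubm, thresh=0.01):
    post = ubm.predict_proba(frames)           # frame-level activations
    hist = post.sum(axis=0) / len(frames)      # histogram representation
    binary = (hist > thresh).astype(np.uint8)  # binary-vector representation
    return hist, binary
```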

  • Tolerance Evaluation of Audio Watermarking Method Based on Modification of Sound Pressure Level between Channels

    Harumi MURATA  Akio OGIHARA  Shigetoshi HAYASHI  

     
    LETTER

      Publicized:
    2017/10/16
      Page(s):
    68-71

    We have proposed an audio watermarking method based on modification of the sound pressure level between channels. The method exploits the invariability of sound localization under sound processing such as MP3 compression and the imperceptibility of slight changes in sound localization. In this paper, we evaluate its tolerance against various attacks with reference to the IHC criteria.
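
    An illustrative embedder and detector in the spirit of the method; the segment length, level offset, and the assumption of roughly balanced source channels are ours, not the authors':

```python
import numpy as np

def embed(stereo, bits, seg=4096, delta=0.05):
    # hide one bit per segment by nudging the left/right level ratio,
    # which slightly shifts perceived localization but survives coding
    out = stereo.astype(np.float64).copy()
    for k, bit in enumerate(bits):
        s = slice(k * seg, (k + 1) * seg)
        g = 1.0 + delta if bit else 1.0 - delta
        out[s, 0] *= g
        out[s, 1] /= g
    return out

def detect(stereo, n_bits, seg=4096):
    # assumes the unmarked channels are roughly balanced on average
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    return [int(rms(stereo[k * seg:(k + 1) * seg, 0]) >
                rms(stereo[k * seg:(k + 1) * seg, 1]))
            for k in range(n_bits)]
```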

  • Special Section on Semantic Web and Linked Data
  • FOREWORD Open Access

    Takahiro KAWAMURA  

     
    FOREWORD

      Page(s):
    72-72
  • A Joint Neural Model for Fine-Grained Named Entity Classification of Wikipedia Articles

    Masatoshi SUZUKI  Koji MATSUDA  Satoshi SEKINE  Naoaki OKAZAKI  Kentaro INUI  

     
    PAPER

      Publicized:
    2017/09/15
      Page(s):
    73-81

    This paper addresses the task of assigning labels of fine-grained named entity (NE) types to Wikipedia articles. Information on NE types is useful when extracting knowledge of NEs from natural language text. It is common to apply an approach based on supervised machine learning to named entity classification. However, in the setting of classifying into fine-grained types, one big challenge is how to alleviate the data sparseness problem, since far fewer instances are available for each fine-grained type. To address this problem, we propose two methods. First, we introduce a multi-task learning framework, in which NE type classifiers are all jointly trained with a neural network. The neural network has a hidden layer, where we expect effective combinations of input features to be learned across different NE types. Second, we propose to extend the input feature set by exploiting the hyperlink structure of Wikipedia. While most previous studies focus on engineering features from the articles' contents, we observe that the contexts in which an article is mentioned can also be a useful clue for NE type classification. Concretely, we propose to learn article vectors (i.e., entity embeddings) from Wikipedia's hyperlink structure using a Skip-gram model, and then incorporate the learned article vectors into the input feature set for NE type classification. To conduct large-scale practical experiments, we created a new dataset containing over 22,000 manually labeled articles. With this dataset, we empirically show that each of our ideas yields a statistically significant improvement in classification accuracy. Moreover, we show that the proposed methods are particularly effective in labeling infrequent NE types. We have made the learned article vectors publicly available; the labeled dataset is available upon contacting the authors.
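
    A sketch of the article-vector step, treating each article's outgoing links as a "sentence" for gensim's Skip-gram implementation; the toy corpus and hyperparameters are placeholders, not the paper's settings:

```python
from gensim.models import Word2Vec

# each "sentence" lists the articles linked from one Wikipedia article
link_sequences = [
    ["Tokyo", "Japan", "Kanto_region"],
    ["Kyoto", "Japan", "Kansai_region"],
    ["Japan", "Tokyo", "Kyoto"],
]
model = Word2Vec(sentences=link_sequences, vector_size=100, window=5,
                 min_count=1, sg=1)          # sg=1 selects Skip-gram
article_vector = model.wv["Japan"]           # entity embedding feature
```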

  • A Comparative Study of Rule-Based Inference Engines for the Semantic Web

    Thanyalak RATTANASAWAD  Marut BURANARACH  Kanda Runapongsa SAIKAEW  Thepchai SUPNITHI  

     
    PAPER

      Publicized:
    2017/09/15
      Page(s):
    82-89

    With the Semantic Web data standards defined, more applications demand inference engines to support intelligent processing of Semantic Web data. Rule-based inference engines, or rule-based reasoners, are used in many domains, such as clinical decision support and e-commerce recommender system development. This article reviews and compares key features of three freely available rule-based reasoners: the Jena inference engine, Euler YAP Engine, and BaseVISor. A performance evaluation study was conducted to assess the scalability and efficiency of these systems using data and rule sets adapted from the Berlin SPARQL Benchmark. We describe our methodology for assessing rule-based reasoners based on the benchmark. The results show the efficiency of the systems in performing reasoning tasks over different data sizes and rules involving various rule properties. The review and comparison results can provide a basis for users in choosing appropriate rule-based inference engines to match their application requirements.
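
    To make the kind of rule these engines evaluate concrete, here is a toy forward-chaining reasoner for one RDFS-style rule (subclass transitivity); real reasoners evaluate large user-defined rule sets over RDF graphs:

```python
def infer(triples):
    # apply "A subClassOf B, B subClassOf C => A subClassOf C" to a fixpoint
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        new = {(a, "subClassOf", c)
               for (a, p1, b) in facts if p1 == "subClassOf"
               for (b2, p2, c) in facts if p2 == "subClassOf" and b2 == b}
        if not new <= facts:
            facts |= new
            changed = True
    return facts

kb = {("Car", "subClassOf", "Vehicle"), ("Vehicle", "subClassOf", "Thing")}
assert ("Car", "subClassOf", "Thing") in infer(kb)
```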

  • An Automatic Knowledge Graph Creation Framework from Natural Language Text

    Natthawut KERTKEIDKACHORN  Ryutaro ICHISE  

     
    PAPER

      Publicized:
    2017/09/15
      Page(s):
    90-98

    Knowledge graphs (KGs) play a crucial role in many modern applications. However, constructing a KG from natural language text is challenging due to the complex structure of the text. Recently, many approaches have been proposed to transform natural language text into triples to obtain KGs. Such approaches have not yet provided efficient results for mapping the extracted elements of triples, especially predicates, to their equivalent elements in a KG. Predicate mapping is essential because it can reduce the heterogeneity of the data and increase the searchability of a KG. In this article, we propose T2KG, an automatic KG creation framework for natural language text, to map natural language text to predicates more effectively. In our framework, a hybrid combination of a rule-based approach and a similarity-based approach is presented for mapping a predicate to its corresponding predicate in a KG. Based on experimental results, the hybrid approach identifies more similar predicate pairs than a baseline method in the predicate mapping task. An experiment on KG creation is also conducted to investigate the performance of T2KG. The experimental results show that T2KG also outperforms the baseline in KG creation. Although KG creation is conducted in open domains, in which prior knowledge is not provided, T2KG still achieves an F1 score of approximately 50% when generating triples in the KG creation task. In addition, an empirical study on knowledge population using various text sources is conducted, and the results indicate that T2KG could be used to obtain knowledge that is not currently available in DBpedia.
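
    A sketch of the hybrid mapping idea: an exact rule table is tried first, with a string-similarity fallback; the rule table, KG predicate list, and threshold are invented for illustration:

```python
import difflib

RULES = {"was born in": "dbo:birthPlace"}        # rule-based exact mappings
KG_PREDICATES = ["dbo:birthPlace", "dbo:deathPlace", "dbo:spouse"]

def map_predicate(text_predicate, threshold=0.4):
    if text_predicate in RULES:                  # rule-based stage
        return RULES[text_predicate]
    scored = [(difflib.SequenceMatcher(None, text_predicate, p).ratio(), p)
              for p in KG_PREDICATES]            # similarity-based stage
    score, best = max(scored)
    return best if score >= threshold else None  # unmapped: new predicate

print(map_predicate("birth place of"))           # falls back to similarity
```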

  • Classification of Linked Data Sources Using Semantic Scoring

    Semih YUMUSAK  Erdogan DOGDU  Halife KODAZ  

     
    PAPER

      Publicized:
    2017/09/15
      Page(s):
    99-107

    Linked data sets are created using Semantic Web technologies; they are usually big, and the number of such datasets is growing. Query execution is therefore costly, and knowing the content of such datasets should help in targeted querying. Our aim in this paper is to classify linked data sets by their knowledge content. Earlier projects such as LOD Cloud, LODStats, and SPARQLES analyze linked data sources in terms of content, availability, and infrastructure; in these projects, linked data sets are classified and tagged principally using the VoID vocabulary. Although all linked data sources listed in these projects appear to be classified or tagged, there are only a limited number of studies on automated tagging and classification of newly arriving linked data sets. Here, we focus on automated classification of linked data sets using semantic scoring methods. We collected the SPARQL endpoints of 1,328 unique linked datasets from the Datahub, LOD Cloud, LODStats, SPARQLES, and SpEnD projects. We then queried textual descriptions of resources in these data sets using their rdfs:comment and rdfs:label property values. We analyzed these texts with document analysis techniques, treating every SPARQL endpoint as a separate document. In this regard, we used the WordNet semantic relations library combined with an adapted term frequency-inverse document frequency (tf-idf) analysis of the words and their semantic neighbours. From the WordNet database, we extracted information about comment/label objects in linked data sources using the hypernym, hyponym, homonym, meronym, region, topic, and usage semantic relations. We obtained significant results for the hypernym and topic semantic relations: we can find words that identify data sets, and this can be used in automatic classification and tagging of linked data sources. Using these words, we experimented with different classifiers and different scoring methods, which yielded better classification accuracy.
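
    A sketch of the scoring pipeline, assuming nltk's WordNet corpus is downloaded: each endpoint description is expanded with hypernym lemmas of its words before a standard tf-idf analysis; the expansion depth and toy documents are illustrative:

```python
from nltk.corpus import wordnet as wn
from sklearn.feature_extraction.text import TfidfVectorizer

def expand(text):
    # append one hypernym lemma of the top sense of each word
    tokens = text.lower().split()
    extra = [lemma.name()
             for t in tokens
             for syn in wn.synsets(t)[:1]
             for hyper in syn.hypernyms()
             for lemma in hyper.lemmas()[:1]]
    return " ".join(tokens + extra)

docs = ["university research publication dataset",
        "gene protein drug interaction dataset"]
tfidf = TfidfVectorizer().fit_transform(expand(d) for d in docs)
```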

  • An Ontological Model for Fire Emergency Situations

    Kattiuscia BITENCOURT  Frederico ARAÚJO DURÃO  Manoel MENDONÇA  Lassion LAIQUE BOMFIM DE SOUZA SANTANA  

     
    PAPER

      Publicized:
    2017/09/15
      Page(s):
    108-115

    The emergency response process is quite complex, since a wide variety of elements must be evaluated when making decisions. Uncertainties generated by subjectivity and imprecision affect the safety and effectiveness of actions. The aim of this paper is to develop an ontology for emergency response protocols, in particular for fires in buildings. The developed ontology supports knowledge sharing and the evaluation and review of the protocols used, contributing to the tactical and strategic planning of organizations. The construction of the ontology was based on the Methontology methodology. The domain specification and conceptualization were based on qualitative research, in which 131 terms with definitions were evaluated, of which 85 were approved by specialists. From there, the domain's taxonomy and axioms were created in the Protégé tool. The specialists validated the ontology using the human assessment approach (taxonomy, application, and structure). Thus, a sustainable ontology model for the rescue tactical phase was ensured.

  • Temporal and Spatial Expansion of Urban LOD for Solving Illegally Parked Bicycles in Tokyo

    Shusaku EGAMI  Takahiro KAWAMURA  Akihiko OHSUGA  

     
    PAPER

      Publicized:
    2017/09/15
      Page(s):
    116-129

    The illegal parking of bicycles is a serious urban problem in Tokyo. The purpose of this study was to sustainably build Linked Open Data (LOD) to assist in solving the problem of illegally parked bicycles (IPBs) by raising social awareness, in cooperation with the Office for Youth Affairs and Public Safety of the Tokyo Metropolitan Government (Tokyo Bureau). We first extracted information on the problem factors and designed an LOD schema for IPBs. We then collected data from social networking services (SNSs) and the websites of municipalities to build the illegally parked bicycle LOD (IPBLOD), with more than 200,000 triples. We then estimated the temporal missing data in the LOD based on causal relations among the problem factors and estimated the spatial missing data based on geospatial features. As a result, the number of IPBs can be inferred with about 70% accuracy, and places where bicycles might be illegally parked are estimated with about 31% accuracy. We then published the complemented LOD and a Web application to visualize the distribution of IPBs in the city. Finally, we applied IPBLOD to a large social activity in order to raise social awareness of the IPB issue and to remove IPBs, in cooperation with the Tokyo Bureau.
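
    A sketch of what one IPB observation might look like as LOD, using rdflib; the namespace and property names are invented for illustration, not the published IPBLOD schema:

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

IPB = Namespace("http://example.org/ipb/")      # hypothetical namespace
g = Graph()
obs = URIRef(IPB["observation/2016-10-01/shibuya-01"])
g.add((obs, RDF.type, IPB.Observation))
g.add((obs, IPB.numberOfBicycles, Literal(27, datatype=XSD.integer)))
g.add((obs, IPB.nearStation, Literal("Shibuya")))
print(g.serialize(format="turtle"))
```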

  • Regular Section
  • Changes of Evaluation Values on Component Rank Model by Taking Code Clones into Consideration

    Reishi YOKOMORI  Norihiro YOSHIDA  Masami NORO  Katsuro INOUE  

     
    PAPER-Software System

      Publicized:
    2017/10/05
      Page(s):
    130-141

    Many software systems have been used and maintained for a long time. Through such maintenance processes, similar code fragments are intentionally left in the source code, so knowing how to manage a software system that contains many similar code fragments becomes a major concern. In this study, we propose a method to pick up components that are commonly used in similar code fragments of a target software system. The method is realized by using the component rank model and by checking the differences in the evaluation values of each component before and after merging components that have similar code fragments. In many cases, a component whose evaluation value decreases is used by both of the merged components, so we consider such components to be commonly used in similar code fragments. Based on the proposed approach, we implemented a system to calculate the differences in evaluation values for each component, and we conducted three evaluation experiments to confirm that our method is useful for detecting components commonly used in similar code fragments and to see how it can help developers when they add similar components. Based on the experimental results, we also discuss some improvements and present results from applying them.
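
    The component rank model evaluates components by propagating use relations, in the spirit of PageRank; the toy sketch below (networkx, with an invented use graph) compares evaluation values before and after merging two similar components:

```python
import networkx as nx

# use relations: similar components A and B both use Util and X
g = nx.DiGraph([("A", "Util"), ("B", "Util"), ("A", "X"), ("B", "X")])
before = nx.pagerank(g)

g_merged = nx.DiGraph([("AB", "Util"), ("AB", "X")])  # A and B merged
after = nx.pagerank(g_merged)

# a drop in a component's value after the merge suggests it was used
# by both of the merged (similar) components
print(before["Util"], "->", after["Util"])
```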

  • Pivot Generation Algorithm with a Complete Binary Tree for Efficient Exact Similarity Search

    Yuki YAMAGISHI  Kazuo AOYAMA  Kazumi SAITO  Tetsuo IKEDA  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2017/10/20
      Page(s):
    142-151

    This paper presents a pivot-set generation algorithm for accelerating exact similarity search in large-scale data sets. To deal with a large-scale data set, it is important both to construct a search index efficiently offline and to perform fast exact similarity search online. Our proposed algorithm efficiently generates competent pivots with two novel techniques: hierarchical data partitioning and fast pivot optimization. To make effective use of a small number of pivots, the former recursively partitions a data set into two subsets of equal size depending on the rank order from each of two assigned pivots, resulting in a complete binary tree. The latter calculates the objective function defined for pivot optimization at a low computational cost by skillfully operating on data objects mapped into a pivot space. Since the generated pivots provide tight lower bounds on the distances between a query object and the data objects, an exact similarity search algorithm effectively avoids unnecessary distance calculations. We demonstrate that a search algorithm using the pivots generated by the proposed algorithm prunes distance calculations at an extremely high rate for range queries on real large-scale image data sets.
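
    The pruning rule that the pivots enable can be stated in a few lines: by the triangle inequality, |d(q,p) - d(x,p)| lower-bounds d(q,x), so an object whose bound exceeds the query radius is skipped without a distance computation. A minimal sketch with assumed Euclidean data:

```python
import numpy as np

def range_search(query, data, pivots, radius, dist):
    dq = np.array([dist(query, p) for p in pivots])
    hits = []
    for x, dx in data:                   # dx: precomputed pivot distances
        if np.max(np.abs(dq - dx)) > radius:
            continue                     # pruned: lower bound already too big
        if dist(query, x) <= radius:     # exact check only for survivors
            hits.append(x)
    return hits

euclid = lambda a, b: float(np.linalg.norm(a - b))
rng = np.random.default_rng(0)
pts = [rng.random(16) for _ in range(1000)]
pivots = pts[:4]
data = [(x, np.array([euclid(x, p) for p in pivots])) for x in pts]
hits = range_search(rng.random(16), data, pivots, 0.8, euclid)
```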

  • Concurrency Control Protocol for Parallel B-Tree Structures That Improves the Efficiency of Request Transfers and SMOs within a Node

    Tomohiro YOSHIHARA  Dai KOBAYASHI  Haruo YOKOTA  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2017/10/18
      Page(s):
    152-170

    Many concurrency control protocols for B-trees use latch-coupling because it executes efficiently on a single machine. Some studies have indicated that latch-coupling may become a performance bottleneck when using multicore processors in a shared-everything environment, but no studies have considered the possible bottleneck caused by sending messages between processing elements (PEs) in shared-nothing environments. We propose two new concurrency control protocols, “LCFB” and “LCFB-link”, which require no latch-coupling in optimistic processes. LCFB-link also adopts a B-link approach within each PE to reduce the cost of modifications in the PE, as a solution to the difficulty of consistency management for the side pointers in a parallel B-tree. The B-link algorithm is well known as a protocol without latch-coupling, but it has difficulty guaranteeing the consistency of side pointers in a parallel B-tree. Experimental results in various environments indicated that the system throughput of the proposed protocols was always superior to that of the conventional protocols, particularly in large-scale configurations, and that LCFB-link is effective for higher update ratios. In addition, data should migrate between PEs to mitigate access skew. We have demonstrated that our protocols always improve system throughput and are effective as concurrency controls for data migration.

  • Fisheye Map Using Stroke-Based Generalization for Web Map Services

    Daisuke YAMAMOTO  Masaki MURASE  Naohisa TAKAHASHI  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2017/10/05
      Page(s):
    171-180

    A fisheye map lets users view both detailed and wide areas. The Focus+Glue+Context map is a fisheye map suited to Web map systems; it consists of a detailed map (Focus), a wide-area map (Context), and an area that absorbs the difference in scale between Focus and Context (Glue). Because Glue is compressed, the road density there is too high to draw all of the roads. Although existing methods can filter the roads to draw, they have problems with road density and connectivity in Glue. This paper proposes an improved method for filtering roads in Glue by applying a generalization method based on weighted strokes. In addition, a technique to speed up the proposed method by using a weighted-stroke database is described. A prototype Web map system with a high level of responsiveness was developed and evaluated in terms of its connectivity, road density, and response time.

  • An Efficient Algorithm for Location-Aware Query Autocompletion Open Access

    Sheng HU  Chuan XIAO  Yoshiharu ISHIKAWA  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2017/10/05
      Page(s):
    181-192

    Query autocompletion is an important and practical technique when users search for desirable information. As mobile devices become more and more popular, one of the main applications is location-aware services, such as Web mapping. In this paper, we propose a new solution to location-aware query autocompletion. We devise a trie-based index structure and integrate spatial information into the trie nodes. Our method is able to answer both range and top-k queries. In addition, we discuss the extension of our method to support error tolerance in case users' queries contain typographical errors. Experiments on real datasets show that the proposed method outperforms existing methods in terms of query processing performance.
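
    A sketch of the index idea: each trie node keeps a bounding box over the locations of all completions below it, so a range query can prune whole subtrees; the structure and the two-city toy data are illustrative, not the paper's exact index:

```python
class TrieNode:
    def __init__(self):
        self.children, self.items = {}, []
        self.mbr = [float('inf'), float('inf'), -float('inf'), -float('inf')]

    def insert(self, word, x, y):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
            m = node.mbr   # grow the bounding box along the path
            node.mbr = [min(m[0], x), min(m[1], y), max(m[2], x), max(m[3], y)]
        node.items.append((word, x, y))

    def complete_in_range(self, prefix, rect):
        node = self
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        out, stack = [], [node]
        while stack:
            n = stack.pop()
            if (n.mbr[0] > rect[2] or n.mbr[2] < rect[0] or
                    n.mbr[1] > rect[3] or n.mbr[3] < rect[1]):
                continue                   # subtree entirely outside range
            out += [(w, x, y) for (w, x, y) in n.items
                    if rect[0] <= x <= rect[2] and rect[1] <= y <= rect[3]]
            stack += list(n.children.values())
        return out

root = TrieNode()
root.insert("nagoya", 136.9, 35.2)
root.insert("nagano", 138.2, 36.7)
print(root.complete_in_range("nag", (135.0, 34.0, 137.5, 36.0)))
```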

  • Personal Viewpoint Navigation Based on Object Trajectory Distribution for Multi-View Videos

    Xueting WANG  Kensho HARA  Yu ENOKIBORI  Takatsugu HIRAYAMA  Kenji MASE  

     
    PAPER-Human-computer Interaction

      Publicized:
    2017/10/12
      Page(s):
    193-204

    Multi-camera videos, with their abundant information and high flexibility, are useful in a wide range of applications, such as surveillance systems, web lectures, news broadcasting, concerts, and sports viewing. Viewers can enjoy an enhanced viewing experience by choosing their own viewpoint through viewing interfaces. However, some viewers may feel annoyed by the need for continual manual viewpoint selection, especially when the number of selectable viewpoints is relatively large. To solve this issue, we propose an automatic viewpoint navigation method designed especially for sports. The method focuses on a viewer's personal preference in viewpoint selection, instead of common and professional editing rules. We assume that different trajectory distributions of the viewed objects lead to different viewpoint selections according to personal preference. We learn the relationship between the viewer's personal viewpoint-selection tendency and the spatio-temporal game context represented by the object trajectories. We compare three methods, based on a Gaussian mixture model, an SVM with a general histogram, and an SVM with a bag-of-words, to seek the best learning scheme for this relationship. The performance of the proposed methods is evaluated by assessing the degree of similarity between the selected viewpoints and the viewers' edited records.

  • Learning Supervised Feature Transformations on Zero Resources for Improved Acoustic Unit Discovery

    Michael HECK  Sakriani SAKTI  Satoshi NAKAMURA  

     
    PAPER-Speech and Hearing

      Publicized:
    2017/10/20
      Page(s):
    205-214

    In this work we utilize feature transformations that are common in supervised learning without having prior supervision, with the goal of improving Dirichlet process Gaussian mixture model (DPGMM) based acoustic unit discovery. The motivation for using such transformations is to create feature vectors that are more suitable for clustering. The need for labels makes it difficult to use these methods in a zero-resource setting. To overcome this issue, we utilize a first iteration of DPGMM clustering to generate frame-based class labels for the target data. The labels serve as the basis for learning linear discriminant analysis (LDA), maximum likelihood linear transform (MLLT), and feature-space maximum likelihood linear regression (fMLLR) based feature transformations. The novelty of our approach is the way we use a traditional acoustic model training pipeline for supervised learning to estimate feature transformations in a zero-resource scenario. We show that the learned transformations greatly support the DPGMM sampler in finding better clusters, according to the performance of the DPGMM posteriorgrams on the ABX sound class discriminability task. We also introduce a method for combining the posteriorgram outputs of multiple clusterings and demonstrate that such combinations can further improve sound class discriminability.
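
    A sketch of the zero-resource trick with sklearn standing in for the real pipeline (BayesianGaussianMixture approximates the DPGMM, and only the LDA step is shown): labels from a first clustering pass supervise a transform whose output feeds the next pass.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

frames = np.random.randn(2000, 39)          # stand-in acoustic feature frames

first_pass = BayesianGaussianMixture(n_components=20, max_iter=200,
                                     random_state=0).fit(frames)
pseudo_labels = first_pass.predict(frames)  # frame-level class labels

lda = LinearDiscriminantAnalysis().fit(frames, pseudo_labels)
transformed = lda.transform(frames)         # re-shaped features for pass two
```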

  • Optimal Permutation Based Block Compressed Sensing for Image Compression Applications

    Yuqiang CAO  Weiguo GONG  Bo ZHANG  Fanxin ZENG  Sen BAI  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2017/10/20
      Page(s):
    215-224

    Block compressed sensing (CS) with optimal permutation is a promising method for improving sampling efficiency in CS-based image compression. However, the existing optimal permutation scheme brings a large amount of extra data for encoding the permutation information, because the decoder needs the permutation information to accomplish signal reconstruction. When this extra data is taken into consideration, the improvement in sampling efficiency is limited. To solve this problem, a new optimal permutation strategy for block CS (BCS) is proposed. Based on the proposed strategy, an improved optimal permutation based BCS method called BCS-NOP (BCS with new optimal permutation) is presented in this paper. Simulation results show that the proposed approach significantly reduces the amount of extra data needed to encode the permutation information and thereby improves sampling efficiency compared with the existing optimal permutation based BCS approach.
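
    A sketch of permuted block CS sampling (sizes and measurement ratio are illustrative): pixels are permuted across blocks to balance per-block sparsity before each block is measured. The decoder needs the permutation to reconstruct, which is exactly the extra data the proposed strategy reduces.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.standard_normal((64, 64))              # stand-in image
B, m = 16, 96                                    # block size, measurements

x = img.flatten()
perm = rng.permutation(x.size)                   # keyed pixel permutation
blocks = x[perm].reshape(-1, B * B)              # permuted pixel blocks
Phi = rng.standard_normal((m, B * B)) / np.sqrt(m)  # Gaussian sensing matrix
Y = blocks @ Phi.T                               # per-block measurements
```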

  • An Empirical Study of Classifier Combination Based Word Sense Disambiguation

    Wenpeng LU  Hao WU  Ping JIAN  Yonggang HUANG  Heyan HUANG  

     
    PAPER-Natural Language Processing

      Publicized:
    2017/08/23
      Page(s):
    225-233

    Word sense disambiguation (WSD) identifies the right sense of an ambiguous word by mining its context information. Previous studies show that classifier combination is an effective approach to enhancing the performance of WSD. In this paper, we systematically review state-of-the-art methods for classifier combination based WSD, including probability-based and voting-based approaches. Furthermore, a new classifier combination based WSD method, namely probability weighted voting with dynamic self-adaptation, is proposed. Compared with existing approaches, the new method takes into consideration both the differences among classifiers and those among ambiguous instances. Exhaustive experiments performed on a real-world dataset show the superiority of our method over state-of-the-art methods.
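
    A minimal sketch of probability-weighted voting; the per-classifier weights here are fixed by hand, whereas the paper's contribution is adapting them dynamically per instance:

```python
import numpy as np

def weighted_vote(posteriors, weights):
    # posteriors: (n_classifiers, n_senses); weights: (n_classifiers,)
    combined = np.average(posteriors, axis=0, weights=weights)
    return int(np.argmax(combined))

post = np.array([[0.6, 0.4],     # classifier 1's sense posteriors
                 [0.3, 0.7],     # classifier 2
                 [0.2, 0.8]])    # classifier 3
print(weighted_vote(post, weights=np.array([0.5, 0.3, 0.2])))  # sense 1
```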

  • A GPU-Based Rasterization Algorithm for Boolean Operations on Polygons

    Yi GAO  Jianxin LUO  Hangping QIU  Bin TANG  Bo WU  Weiwei DUAN  

     
    LETTER-Fundamentals of Information Systems

      Publicized:
    2017/09/29
      Page(s):
    234-238

    This paper presents a new GPU-based rasterization algorithm for Boolean operations that handles arbitrary closed polygons. We construct an efficient data structure for interoperation between the CPU and GPU and propose a fast GPU-based contour extraction method to ensure the performance of our algorithm. We then design a novel traversal strategy to achieve error-free calculation of intersection points for correct Boolean operations. Finally, we give a detailed evaluation; the results show that our algorithm outperforms existing algorithms on polygons with large numbers of vertices.
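
    The rasterization view of polygon Booleans can be shown on the CPU with numpy and matplotlib for clarity (the paper performs these stages on the GPU with error-free intersection handling): each polygon becomes a coverage mask, and the Boolean operations become per-pixel logicals.

```python
import numpy as np
from matplotlib.path import Path

def rasterize(poly, size=256):
    # coverage mask: which grid cells fall inside the closed polygon
    ys, xs = np.mgrid[0:size, 0:size]
    pts = np.column_stack([xs.ravel(), ys.ravel()])
    return Path(poly).contains_points(pts).reshape(size, size)

a = rasterize([(30, 30), (200, 40), (120, 220)])
b = rasterize([(80, 80), (240, 90), (160, 240)])
union, intersection, difference = a | b, a & b, a & ~b
```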

  • SEDONA: A Novel Protocol for Identifying Infrequent, Long-Running Daemons on a Linux System

    Young-Kyoon SUH  

     
    LETTER-Software Engineering

      Publicized:
    2017/05/30
      Page(s):
    239-243

    Measuring program execution time is a much-used technique for performance evaluation in computer science. Without proper care, however, timed results may vary widely, making it hard to trust their validity. We propose a novel timing protocol that significantly reduces such variability by eliminating executions perturbed by infrequent, long-running daemons.
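
    A sketch in the spirit of the protocol (the outlier rule and factor are assumptions, not the published SEDONA procedure): time many executions and drop runs inflated by rare long-running daemons.

```python
import time

def measure(workload, runs=30, outlier_factor=1.5):
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        workload()
        times.append(time.perf_counter() - t0)
    median = sorted(times)[len(times) // 2]
    # runs far above the median are treated as daemon-perturbed
    kept = [t for t in times if t <= outlier_factor * median]
    return sum(kept) / len(kept), len(times) - len(kept)

mean_s, dropped = measure(lambda: sum(i * i for i in range(100000)))
print(f"mean {mean_s * 1e3:.2f} ms, dropped {dropped} perturbed runs")
```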

  • An Approach to Effective Recommendation Considering User Preference and Diversity Simultaneously

    Sang-Chul LEE  Sang-Wook KIM  Sunju PARK  Dong-Kyu CHAE  

     
    LETTER-Data Engineering, Web Information Systems

      Publicized:
    2017/09/28
      Page(s):
    244-248

    This paper addresses recommendation diversification. Existing diversification methods have difficulty dealing with the tradeoff between accuracy and diversity. We point out the root of this problem in existing diversification methods and propose a novel method that avoids it. Our method aims to find an optimal solution to an objective function that is carefully designed to consider user preference and the diversity among recommended items simultaneously. In addition, we propose an item clustering and a greedy approximation to achieve efficiency in recommendation.
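
    A greedy selection in the style of maximal marginal relevance illustrates the preference-versus-diversity trade-off; the weight and toy scores are assumptions, not the paper's exact objective function:

```python
import numpy as np

def greedy_diverse(pref, sim, k, lam=0.7):
    # repeatedly add the item with the best preference/diversity balance
    chosen, candidates = [], list(range(len(pref)))
    while len(chosen) < k:
        def score(i):
            div = 0.0 if not chosen else max(sim[i][j] for j in chosen)
            return lam * pref[i] - (1 - lam) * div
        best = max(candidates, key=score)
        candidates.remove(best)
        chosen.append(best)
    return chosen

pref = np.array([0.9, 0.85, 0.8, 0.4])
sim = np.array([[1.0, 0.95, 0.2, 0.1],
                [0.95, 1.0, 0.15, 0.1],
                [0.2, 0.15, 1.0, 0.3],
                [0.1, 0.1, 0.3, 1.0]])
print(greedy_diverse(pref, sim, k=2))  # picks 0 then 2, not the near-twin 1
```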

  • A White-Box Cryptographic Implementation for Protecting against Power Analysis

    Seungkwang LEE  

     
    LETTER-Information Network

      Publicized:
    2017/10/19
      Page(s):
    249-252

    Encoded lookup tables used in white-box cryptography are known to be vulnerable to power analysis due to their imbalanced encoding. This means that countermeasures against white-box attacks cannot even defend against gray-box attacks. For this reason, those who want to defend against power analysis through a white-box cryptographic implementation need to find other ways. In this paper, we propose a method to defend against power analysis without resolving the problematic encoding. Compared with existing white-box cryptography techniques, the proposed method doubles the size of the lookup tables while requiring nearly the same amount of computation.

  • Regular Expression Filtering on Multiple q-Grams

    Seon-Ho SHIN  HyunBong KIM  MyungKeun YOON  

     
    LETTER-Information Network

      Publicized:
    2017/10/11
      Page(s):
    253-256

    Regular expression matching is essential in network and big-data applications; however, it still has a serious performance bottleneck. State-of-the-art schemes use a multi-pattern exact string-matching algorithm as a filtering module placed before a heavy regular expression engine. We design a new approximate string-matching filter using multiple q-grams; this filter not only achieves better space compactness but also higher throughput than existing filters.
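
    The principle of the filter can be sketched in a few lines: literal q-grams mined from the patterns must appear in the input before the expensive regex engine runs (the paper's filter is approximate and uses multiple q values; this shows only the idea):

```python
import re

PATTERNS = [r"user=\w+;pass=\w+", r"GET /admin/\d+"]
REGEXES = [re.compile(p) for p in PATTERNS]
QGRAMS = {"user=", "pass=", "/admin"}    # literals mined from the patterns

def match(line):
    if not any(g in line for g in QGRAMS):
        return []                        # cheap filter: regex engine skipped
    return [r.pattern for r in REGEXES if r.search(line)]

print(match("GET /index.html"))          # [] without touching the regexes
print(match("GET /admin/42 HTTP/1.1"))   # ['GET /admin/\\d+']
```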

  • Gender Attribute Mining with Hand-Dorsa Vein Image Based on Unsupervised Sparse Feature Learning

    Jun WANG  Guoqing WANG  Zaiyu PAN  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2017/10/12
      Page(s):
    257-260

    Gender classification from hand-dorsa vein information, a new soft biometric trait, is addressed with the proposed unsupervised sparse feature learning model; state-of-the-art accuracy demonstrates the effectiveness of the model. Besides, we argue that the proposed data reconstruction model is also applicable to age estimation when a comprehensive database covering different ages is accessible.

  • Statistical Property Guided Feature Extraction for Volume Data

    Li WANG  Xiaoan TANG  Junda ZHANG  Dongdong GUAN  

     
    LETTER-Pattern Recognition

      Publicized:
    2017/10/13
      Page(s):
    261-264

    Feature visualization is of great significance in volume visualization, and feature extraction has become extremely popular in feature visualization. However, a precise definition of features is usually absent, which makes extraction difficult. This paper employs the probability density function (PDF) as a statistical property and proposes a statistical-property-guided approach to extract features from volume data. Based on feature matching, it combines simple linear iterative clustering (SLIC) with a Gaussian mixture model (GMM) and can perform extraction without an accurate feature definition. Further, the GMM is paired with a normality test to reduce time cost and storage requirements. We demonstrate the applicability and superiority of the approach by successfully applying it to homogeneous and non-homogeneous features.

  • A Simple and Effective Generalization of Exponential Matrix Discriminant Analysis and Its Application to Face Recognition

    Ruisheng RAN  Bin FANG  Xuegang WU  Shougui ZHANG  

     
    LETTER-Pattern Recognition

      Publicized:
    2017/10/18
      Page(s):
    265-268

    As an effective method, exponential discriminant analysis (EDA) has been proposed and widely used to solve the so-called small-sample-size (SSS) problem. In this paper, a simple and effective generalization of EDA, named GEDA, is presented. In GEDA, a general exponential function whose base is larger than Euler's number is used. Owing to this property, the distance between samples belonging to different classes is larger than in EDA, and the discrimination property is thus further emphasized. Experimental results on the Extended Yale and CMU-PIE face databases show that GEDA achieves better recognition performance than EDA.
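
    The generalization reduces to one identity of linear algebra: for a base B > e, the matrix power B^S equals expm(S ln B), so GEDA-style matrix exponentials can be computed with scipy; the toy symmetric matrix and the base are illustrative:

```python
import numpy as np
from scipy.linalg import expm

def general_matrix_exp(S, base):
    # B^S = e^{S ln B}; base = e recovers ordinary EDA
    return expm(np.log(base) * S)

S = np.array([[2.0, 0.5], [0.5, 1.0]])   # stand-in scatter matrix
print(general_matrix_exp(S, base=np.e))  # EDA case
print(general_matrix_exp(S, base=5.0))   # GEDA with a larger base
```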

  • Encoding Detection and Bit Rate Classification of AMR-Coded Speech Based on Deep Neural Network

    Seong-Hyeon SHIN  Woo-Jin JANG  Ho-Won YUN  Hochong PARK  

     
    LETTER-Speech and Hearing

      Publicized:
    2017/10/20
      Page(s):
    269-272

    A method for encoding detection and bit rate classification of AMR-coded speech is proposed. For each texture frame, 184 features consisting of short-term and long-term temporal statistics of speech parameters are extracted; these can effectively measure the amount of distortion due to AMR. A deep neural network then classifies the bit rate of the speech after analyzing the extracted features. We confirm that the proposed features provide better performance than conventional spectral features designed for bit rate classification of coded audio.

  • Learning Deep Relationship for Object Detection

    Nuo XU  Chunlei HUO  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2017/09/28
      Page(s):
    273-276

    Object detection has been a hot topic in image processing, computer vision, and pattern recognition. In recent years, training a model from labeled images using machine learning techniques has become popular. However, the relationship between training samples is usually ignored by existing approaches. To address this problem, a novel approach is proposed that trains a Siamese convolutional neural network on feature pairs and then fine-tunes the network driven by a small amount of training samples. Since the proposed method considers not only the discriminative information between objects and background but also the relationship between intraclass features, it outperforms the state of the art on real images.
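
    A minimal Siamese setup of the kind described, with shared twin weights and a contrastive-style loss that pulls intraclass feature pairs together; the architecture sizes and margin are illustrative assumptions:

```python
import torch
import torch.nn as nn

class Siamese(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(4), nn.Flatten(),
                                 nn.Linear(16 * 16, dim))

    def forward(self, a, b):
        return self.net(a), self.net(b)  # shared weights on both branches

def contrastive(fa, fb, same, margin=1.0):
    # pull same-class pairs together, push different pairs past the margin
    d = torch.nn.functional.pairwise_distance(fa, fb)
    return (same * d ** 2 +
            (1 - same) * torch.clamp(margin - d, min=0) ** 2).mean()

model = Siamese()
a, b = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)
same = torch.randint(0, 2, (8,)).float()  # 1 = pair from the same class
loss = contrastive(*model(a, b), same)
loss.backward()
```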
