Naoki YAMADA Yuji YAMAGATA Naoki FUKUTA
On an inference-enabled Linked Open Data (LOD) endpoint, query execution usually takes longer than on an LOD endpoint without an inference engine because of the reasoning it performs. Although two separate kinds of approaches, query modification and ontology modification, have been investigated in different contexts, there have been discussions about how they can be chosen or combined for various settings. In this paper, to reduce query execution time on an inference-enabled LOD endpoint, we compare these two promising methods, query rewriting and ontology modification, and also try to combine them in a cluster of such systems. We employ an evolutionary approach to rewrite queries and modify ontologies based on past-processed queries and their results. We show how these two approaches work well for implementing an inference-enabled LOD endpoint with a cluster of SPARQL endpoints.
Existing noise inference algorithms neglect the smooth characteristics of noise data, which makes noise inference slow. To address this problem, we present a noise inference algorithm based on fast context-aware tensor decomposition (F-CATD). F-CATD improves the noise inference algorithm based on context-aware tensor decomposition. It combines a smoothness constraint with context-aware tensor decomposition to speed up the decomposition process. Experiments with New York City 311 noise data show that the proposed method accelerates noise inference. Compared with the existing method, F-CATD reduces time consumption by a factor of 4 to 5 while keeping the effectiveness of the results.
This paper studies a novel iterative detection algorithm for data detection in orthogonal frequency division multiplexing systems in the presence of phase noise (PHN) and channel estimation errors. By simplifying the maximum a posteriori algorithm based on the theory of variational inference, an optimization problem over variational free energy is formulated. After that, the estimation of data, PHN and channel state information is obtained jointly and iteratively. The simulations indicate the validity of this algorithm and show a better performance compared with the traditional schemes.
Takayoshi SHOUDAI Kazuhide AIKOH Yusuke SUZUKI Satoshi MATSUMOTO Tetsuhiro MIYAHARA Tomoyuki UCHIDA
An efficient means of learning tree-structural features from tree-structured data would enable us to construct effective mining methods for tree-structured data. Here, a pattern representing rich tree-structural features common to tree-structured data and a polynomial time algorithm for learning important tree patterns are necessary for mining knowledge from tree-structured data. As such a tree pattern, we introduce a term tree pattern t such that any edge label of t belongs to a finite alphabet Λ, any internal vertex of t has ordered children and t has a new kind of structured variable, called a height-constrained variable. A height-constrained variable has a pair of integers (i, j) as constraints, and it can be replaced with a tree whose trunk length is at least i and whose height is at most j. This replacement is called height-constrained replacement. A sequence of consecutive height-constrained variables is called a variable-chain. In this paper, we present polynomial time algorithms for solving the membership problem and the minimal language (MINL) problem for term tree patterns having no variable-chain. The membership problem for term tree patterns is to decide whether or not a given tree can be obtained from a given term tree pattern by applying height-constrained replacements to all height-constrained variables in the term tree pattern. The MINL problem for term tree patterns is to find a term tree pattern t such that the language generated by t is minimal among languages, generated by term tree patterns, which contain all given tree-structured data. Finally, we show that the class, i.e., the set of all term tree patterns having no variable-chain, is polynomial time inductively inferable from positive data if |Λ| ≥ 2.
Keita KOBAYASHI Hiroyuki TSUJI Tomoaki KIMURA
In this paper, we propose a digital image enlargement method based on a fuzzy technique that improves half-pixel generation, especially for convex and concave signals. The proposed method is a modified version of the image enlargement scheme previously proposed by the authors, which achieves accurate half-pixel interpolation and enlarges the original image by convolution with the Lanczos function. However, the method causes impulse-like artifacts in the enlarged image. In this paper, therefore, we introduce a fuzzy set and fuzzy rule for generating half-pixels to improve the interpolation of convex and concave signals. Experimental results demonstrate that, in terms of image quality, the proposed method shows superior performance compared to bicubic interpolation and our previous method.
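The Lanczos-based half-pixel generation mentioned above can be illustrated in one dimension. The following is a minimal sketch, not the authors' method: the kernel support `a = 3`, the border clamping, the weight normalization, and the function names `lanczos` and `half_pixel` are all assumptions made for illustration, and the fuzzy rules of the proposed method are not modeled.

```python
import math


def lanczos(x: float, a: int = 3) -> float:
    """Lanczos kernel: sinc(x) * sinc(x / a) for |x| < a, otherwise 0."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    # sinc(x) * sinc(x / a) = a * sin(pi x) * sin(pi x / a) / (pi x)^2
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def half_pixel(signal, i, a=3):
    """Interpolate the sample midway between signal[i] and signal[i + 1].

    Assumption: indices outside the signal are clamped to the border, and
    the weights are normalized so that flat regions stay flat.
    """
    pos = i + 0.5
    total = 0.0
    weight_sum = 0.0
    for k in range(i - a + 1, i + a + 1):
        kk = min(max(k, 0), len(signal) - 1)  # clamp at the borders
        w = lanczos(pos - k, a)
        total += w * signal[kk]
        weight_sum += w
    return total / weight_sum
```

Because the kernel has negative lobes, plain Lanczos interpolation can overshoot near sharp convex or concave samples, which is consistent with the impulse-like artifacts the abstract attributes to the previous method.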
Ding XIAO Rui WANG Lingling WU
With the surge of social media platforms, users' profile information has become a valuable resource for enhancing social network services. However, the attribute information of most users is incomplete, so it is important to infer users' latent attributes. Contemporary attribute inference methods assume that there is enough labeled data to train a model. However, in social media, labeling a large amount of data is very expensive and difficult. In this paper, we study the latent attribute inference problem with very little labeled data and propose the SRW-COND solution. To cope with the scarcity of labeled data, SRW-COND first extends the labeled data with a simple but effective greedy algorithm. Then SRW-COND employs a supervised random walk process to effectively utilize the known attribute information and link structure of users. Experiments on two real datasets illustrate the effectiveness of SRW-COND.
Keisuke IMOTO Suehiro SHIMAUCHI
We propose a novel method for estimating acoustic scenes such as user activities, e.g., “cooking,” “vacuuming,” “watching TV,” or situations, e.g., “being on the bus,” “being in a park,” “meeting,” utilizing the information of acoustic events. There are some methods for estimating acoustic scenes that associate a combination of acoustic events with an acoustic scene. However, the existing methods cannot adequately express acoustic scenes, e.g., “cooking,” that have more than one subordinate category, e.g., “frying ingredients” or “plating food,” because they directly associate acoustic events with acoustic scenes. In this paper, we propose an acoustic scene estimation method based on a hierarchical probabilistic generative model of an acoustic event sequence taking into account the relation among acoustic scenes, their subordinate categories, and acoustic event sequences. In the proposed model, each acoustic scene is represented as a probability distribution over its unsupervised subordinate categories, called “acoustic sub-topics,” and each acoustic sub-topic is represented as a probability distribution over acoustic events. Acoustic scene estimation experiments with real-life sounds showed that the proposed method could correctly extract subordinate categories of acoustic scenes.
Zhikai XU Hongli ZHANG Xiangzhan YU Shen SU
Location-based services (LBSs) are useful for many applications in the Internet of Things (IoT). However, LBSs have raised serious concerns about users' location privacy. In this paper, we propose a new location privacy attack in LBSs called the hidden location inference attack, in which the adversary infers users' hidden locations based on the users' check-in histories. We discover three factors that influence individual check-in behaviors: geographic information, human mobility patterns and user preferences. We first separately evaluate the effects of each of these three factors on users' check-in behaviors. Next, we propose a novel algorithm that integrates the above heterogeneous factors and captures the probability of hidden location privacy leakage. Then, we design a novel privacy alert framework to warn users when their sharing behavior does not match their sharing rules. Finally, we use our experimental results to demonstrate the validity and practicality of the proposed strategy.
Shogo OKADA Mi HANG Katsumi NITTA
This study focuses on modeling the storytelling performance of the participants in a group conversation. Storytelling performance is one of the fundamental communication techniques for providing information and entertainment effectively to a listener. We present a multimodal analysis of the storytelling performance in a group conversation, as evaluated by external observers. A new multimodal data corpus is collected through this group storytelling task, which includes the participants' performance scores. We extract multimodal (verbal and nonverbal) features regarding storytellers and listeners from a manual description of spoken dialog and from various nonverbal patterns, including each participant's speaking turn, utterance prosody, head gesture, hand gesture, and head direction. We also extract multimodal co-occurrence features, such as head gestures, and interaction features, such as a storyteller's utterance overlapping with a listener's backchannel. In the experiment, we modeled the relationship between the performance indices and the multimodal features using machine-learning techniques. Experimental results show that the highest accuracy (R²) is 0.299 for the total storytelling performance (sum of index scores), obtained with a combination of verbal and nonverbal features in a regression task.
In this paper, a self optimization beamforming null control (SOBNC) scheme is proposed. Recent technologies such as Wi-Fi, Long Term Evolution (LTE) and Long Term Evolution Advanced (LTE-A) need to maintain a signal to interference plus noise ratio (SINR) threshold to control modulation and coding schemes (MCS). Selection of the MCS depends on the SINR threshold that allows key performance indices (KPIs) such as block error rate (BLER), bit error rate (BER) and throughput to be maintained at a certain level. The SOBNC is used to control the antenna pattern for SINR estimation and to improve the SINR performance of wireless communication systems. Nulling comes at a price: if wider nulls are introduced, i.e., a larger number of nulls is used, the 3 dB beamwidth and peak side lobe level (SLL) of the antenna pattern change critically. This paper proposes a method, based on an adaptive-network-based fuzzy inference system (ANFIS), that automatically controls the number of nulls in the antenna pattern according to the changing environment so as to keep the output SINR level higher than or equal to the required threshold. Finally, simulation results show the superior performance of the proposed SOBNC compared with a minimum mean square error (MMSE) based adaptive nulling control algorithm and a conventional fixed null scheme.
Yasunori ISHIHARA Yasuhiro USHIROZAKO Kengo MORI Jun FURUKAWA
In this letter, we propose a secrecy criterion for outsourcing encrypted databases. In encrypted databases, encryption schemes revealing some information are often used in order to manipulate encrypted data efficiently. The proposed criterion is based on inference analysis for databases: we simulate an attacker's inference on specified secret information with and without the information revealed by the encrypted database. When the two inference results are the same, secrecy of the specified information is preserved against outsourcing the encrypted database. We also show that the proposed criterion is decidable under a practical setting.
Resource Description Framework (RDF) access control suffers from an authorization conflict problem caused by RDF inference. When an access authorization is specified, it can lie in conflict with other access authorizations that have the opposite security sign as a result of RDF inference. In our former study, we analyzed the authorization conflict problem caused by subsumption inference, which is the key inference in RDF. The Rule Interchange Format (RIF) is a Web standard rule language recommended by W3C, and can be combined with RDF data. Therefore, as in RDF inference, an authorization conflict can be caused by RIF inference. In addition, this authorization conflict can arise as a result of the interaction of RIF inference and RDF inference rather than of RIF inference alone. In this paper, we analyze the authorization conflict problem caused by RIF inference and suggest an efficient authorization conflict detection algorithm. The algorithm exploits the graph labeling-based algorithm proposed in our earlier paper. Through experiments, we show that the performance of the graph labeling-based algorithm is outstanding for large RDF data.
Nozomi MIYA Tota SUKO Goki YASUDA Toshiyasu MATSUSHIMA
In this paper, sequential prediction is studied. Typical assumptions about the probabilistic model in sequential prediction fall into the following two cases. In one, a certain probabilistic model is given and its parameters are unknown. In the other, not a single probabilistic model but a class of probabilistic models is given and the parameters are unknown. If there exist some parameters and some models such that the distributions identified by them equal the source distribution, the assumed model or class of models can represent the source distribution. In this case, the specifiable condition is said to be satisfied. In this study, the decision based on the Bayesian principle is made for a class of probabilistic models (not for a certain probabilistic model). The case in which the specifiable condition is not satisfied is studied. Then, the asymptotic behaviors of the cumulative logarithmic loss for an individual sequence in the sense of almost sure convergence and of the expected loss, i.e., the redundancy, are analyzed, and the constant terms of the asymptotic equations are identified.
Analysis of the trust network proves beneficial to users of Online Social Networks (OSNs) for decision-making. Since constructing trust propagation paths that connect unfamiliar users precedes trust inference, it is vital to find appropriate trust propagation paths. Most existing trust network discovery algorithms apply classical exhaustive search approaches with low efficiency and/or consider only trust-related factors, disregarding the role of distrust relationships. To solve these issues, we first analyze the trust discounting operators with structural balance theory and validate the distribution characteristics of balanced transitive triads. Then, Maximum Indirect Referral Belief Search (MIRBS) and Minimum Indirect Functional Uncertainty Search (MIFUS) strategies are proposed, followed by the corresponding Optimal Trust Inference Path Search (OTIPS) algorithms based on bidirectional versions of Dijkstra's algorithm. Comparative experiments on path search, trust inference and edge sign prediction are performed on the Epinions data set. The experimental results show that the proposed algorithms find trust inference paths more efficiently and that the found paths are better suited to trust inference.
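To make the idea of searching for a trust propagation path with Dijkstra's algorithm concrete, here is a minimal sketch. It is not the MIRBS, MIFUS or OTIPS algorithms: it uses a plain unidirectional Dijkstra search, and it assumes, purely for illustration, that trust along a path is discounted by multiplying edge trust values in (0, 1]. The function name `most_trusted_path` and the graph layout are hypothetical.

```python
import heapq
import math


def most_trusted_path(graph, source, target):
    """Return (trust, path) for the path with the largest product of edge trust.

    graph maps node -> {neighbor: trust}, with trust in (0, 1] (assumption).
    Maximizing a product of trust values equals minimizing the sum of
    -log(trust), so ordinary Dijkstra applies.
    """
    best = {source: 0.0}  # best known cost, i.e. -log of the trust product
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        cost, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, trust in graph.get(u, {}).items():
            if not 0.0 < trust <= 1.0:
                continue  # skip edges outside the assumed trust range
            c = cost - math.log(trust)
            if c < best.get(v, math.inf):
                best[v] = c
                prev[v] = u
                heapq.heappush(heap, (c, v))
    if target not in best:
        return 0.0, []  # no propagation path exists
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return math.exp(-best[target]), path


# Toy network: a trusts b (0.9) and c (0.5); b and c both know d.
toy = {"a": {"b": 0.9, "c": 0.5}, "b": {"d": 0.8}, "c": {"d": 0.9}, "d": {}}
trust, path = most_trusted_path(toy, "a", "d")
```

In the toy network the path through `b` is chosen, with a propagated trust of about 0.72. Distrust edges, which the abstract highlights, would need a different operator and are deliberately left out of this sketch.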
Many kinds of data can be represented as a network or graph. It is crucial to infer the latent structure underlying such a network and to predict unobserved links in the network. Mixed Membership Stochastic Blockmodel (MMSB) is a promising model for network data. Latent variables and unknown parameters in MMSB have been estimated through Bayesian inference with the entire network; however, it is important to estimate them online for evolving networks. In this paper, we first develop online inference methods for MMSB through sequential Monte Carlo methods, also known as particle filters. We then extend them for time-evolving networks, taking into account the temporal dependency of the network structure. We demonstrate through experiments that the time-dependent particle filter outperformed several baselines in terms of prediction performance in an online condition.
Pablo MARTINEZ LERIN Daisuke YAMAMOTO Naohisa TAKAHASHI
Travel recommendation and travel diary generation applications can benefit significantly from methods that infer the durations and locations of visits from travelers' GPS data. However, conventional inference methods, which cluster GPS points on the basis of their spatial distance, are not suited to inferring visit durations. This paper presents a pace-based clustering method to infer visit locations and durations. The method contributes two novel techniques: (1) It clusters GPS points logged during visits by considering the speed and applying a probability density function for each trip. Consequently, it avoids clustering GPS points that are near but unrelated to visits. (2) It also includes additional GPS points in the clusters by considering their temporal sequence. As a result, it is able to complete the clusters with GPS points that are far from the visits but are logged during the visits, caused, for example, by GPS noise indoors. The results of an experimental evaluation comparing our proposed method with three published inference methods indicate that our proposed method infers the duration of a visit with an average error rate of 8.7%, notably outperforming the other methods.
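The core intuition of pace-based clustering, that slow movement rather than spatial proximity signals a visit, can be sketched as follows. This is a deliberately simplified illustration and not the published method: it replaces the per-trip probability density function with a fixed speed threshold (`max_speed`, in metres per second, an assumed parameter) and does not perform the second step of adding temporally adjacent points. The names `haversine_m` and `slow_segments` are hypothetical.

```python
import math


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    r = 6_371_000.0  # mean Earth radius in metres
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(h))


def slow_segments(track, max_speed=0.5):
    """Return (start_time, end_time) pairs for runs of slow movement.

    track is a time-sorted list of (t_seconds, lat, lon). A leg between two
    consecutive points counts as part of a visit when its speed is below
    max_speed (a fixed threshold, assumed here for simplicity).
    """
    visits = []
    start = None
    for (t0, la0, lo0), (t1, la1, lo1) in zip(track, track[1:]):
        dt = t1 - t0
        speed = haversine_m((la0, lo0), (la1, lo1)) / dt if dt > 0 else float("inf")
        if speed < max_speed:
            if start is None:
                start = t0  # a visit begins at the first slow leg
        elif start is not None:
            visits.append((start, t0))  # a fast leg closes the visit
            start = None
    if start is not None:
        visits.append((start, track[-1][0]))
    return visits


# Toy track: travel, linger for two minutes, then travel again.
track = [
    (0, 35.00000, 137.0),
    (60, 35.01000, 137.0),
    (120, 35.01001, 137.0),
    (180, 35.01001, 137.0),
    (240, 35.02000, 137.0),
]
visits = slow_segments(track)
```

For the toy track, the single inferred visit spans from t = 60 s to t = 180 s, so its duration is 120 s.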
Hirokazu KAMEOKA Misa SATO Takuma ONO Nobutaka ONO Shigeki SAGAYAMA
This paper deals with the problem of underdetermined blind source separation (BSS) where the number of sources is unknown. We propose a BSS approach that simultaneously estimates the number of sources, separates the sources based on the sparseness of speech, estimates the direction of arrival of each source, and performs permutation alignment. We confirmed experimentally that reasonably good separation was obtained with the present method without specifying the number of sources.
Zezhong LI Hideto IKEDA Junichi FUKUMOTO
In most phrase-based statistical machine translation (SMT) systems, the translation model relies on word alignment, which serves as a constraint for the subsequent building of a phrase table. Word alignment is usually inferred by GIZA++, which implements all the IBM models and the HMM model in the framework of Expectation Maximization (EM). In this paper, we present a fully Bayesian inference for word alignment. Unlike the EM approach, the Bayesian inference makes use of all possible parameter values rather than estimating a single parameter value, from which we expect a more robust inference. After inferring the word alignment, current SMT systems usually train the phrase table from the Viterbi word alignment, which is prone to learning incorrect phrases because of word alignment mistakes. To overcome this drawback, a new phrase extraction method is proposed based on multiple Gibbs samples from Bayesian inference for word alignment. Empirical results show promising improvements over baselines in alignment quality as well as in translation performance.
Hua FAN Quanyuan WU Jianfeng ZHANG
Despite improvements in the accuracy of RFID readers, erroneous readings such as missed reads and ghost reads still occur. In this letter, we propose two effective models, a Bayesian inference-based decision model and a path-based detection model, to increase the accuracy of RFID data cleaning in RFID-based supply chain management. In addition, the maximum entropy model is introduced for determining the sliding window size. Experimental results validate the performance of the proposed method and show that it is able to clean raw RFID data with higher accuracy.
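A Bayesian decision about whether a tag is really present, given the reads observed in a sliding window, can be sketched with Bayes' rule. This is an illustrative sketch under stated assumptions, not the decision model of the letter: read cycles are assumed independent, and the detection rate `p_detect`, the ghost-read rate `p_ghost`, the prior, and the function name `presence_posterior` are all hypothetical.

```python
def presence_posterior(reads, window, prior=0.5, p_detect=0.8, p_ghost=0.05):
    """Posterior probability that a tag is present in the reader's range.

    reads  : number of read cycles in the window that reported the tag
    window : number of read cycles in the sliding window
    Assumes independent cycles; the binomial coefficient cancels in the ratio.
    """
    if not 0 <= reads <= window:
        raise ValueError("reads must be between 0 and window")
    misses = window - reads
    like_present = p_detect ** reads * (1 - p_detect) ** misses
    like_absent = p_ghost ** reads * (1 - p_ghost) ** misses
    numerator = prior * like_present
    return numerator / (numerator + (1 - prior) * like_absent)


# Three reads out of five cycles: missed reads are tolerated.
p_mostly_read = presence_posterior(reads=3, window=5)
# No reads in five cycles: the tag has most likely left.
p_never_read = presence_posterior(reads=0, window=5)
```

Under these assumed rates, a tag read in three of five cycles is judged present with high probability despite two missed reads, while a tag never read in the window is judged absent. The window size matters because it sets how much evidence accumulates before a decision, which is the quantity the letter determines with a maximum entropy model.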
Chittaphone PHONHARATH Kenji HASHIMOTO Hiroyuki SEKI
We study a static analysis problem on k-secrecy, which is a metric for security against inference attacks on XML databases. Intuitively, k-secrecy means that the number of candidates for the sensitive data of a given database instance, or for the result of an unauthorized query, cannot be narrowed down to k-1 by using available information such as authorized queries and their results. In this paper, we investigate the decidability of the schema k-secrecy problem defined as follows: for a given XML database schema, an authorized query and an unauthorized query, decide whether every database instance conforming to the given schema is k-secret. We first show that the schema k-secrecy problem is undecidable for any finite k>1 even when queries are represented by a simple subclass of linear deterministic top-down tree transducers (LDTT). We next show that the schema ∞-secrecy problem is decidable for queries represented by LDTT. We give an algorithm for deciding the schema ∞-secrecy problem and analyze its time complexity. We show that the schema ∞-secrecy problem is EXPTIME-complete for LDTT. Moreover, we show similar results for LDTT with regular look-ahead.
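The intuition behind k-secrecy can be illustrated by brute force on a toy, finite set of database instances. This sketch is only an illustration of the definition as read here (an instance is k-secret when at least k candidate results of the unauthorized query remain consistent with the authorized query result); it has nothing to do with the tree-transducer techniques of the paper, and real schemas generally admit infinitely many instances. All names (`candidate_count`, `is_k_secret`, `schema_is_k_secret`) and the toy data are hypothetical.

```python
def candidate_count(instances, authorized, unauthorized, actual):
    """Count distinct unauthorized-query results the attacker cannot rule out.

    The attacker knows the set of possible instances and the authorized
    query result on the actual instance, and keeps every instance that
    produces the same authorized result.
    """
    observed = authorized(actual)
    return len({unauthorized(d) for d in instances if authorized(d) == observed})


def is_k_secret(instances, authorized, unauthorized, actual, k):
    """True if at least k candidates for the sensitive result remain."""
    return candidate_count(instances, authorized, unauthorized, actual) >= k


def schema_is_k_secret(instances, authorized, unauthorized, k):
    """True if every instance in the (finite, toy) schema is k-secret."""
    return all(
        is_k_secret(instances, authorized, unauthorized, d, k) for d in instances
    )


# Toy "schema": each instance is a (department, salary) record.
instances = [("sales", 100), ("sales", 200), ("sales", 300), ("dev", 400)]


def department(d):
    return d[0]  # authorized query: reveals the department


def salary(d):
    return d[1]  # unauthorized query: the sensitive value


sales_candidates = candidate_count(instances, department, salary, ("sales", 100))
dev_candidates = candidate_count(instances, department, salary, ("dev", 400))
```

In the toy data, three salary candidates remain for a sales record but only one for the dev record, so the toy schema is 1-secret but not 2-secret: a single weak instance is enough to break schema-level secrecy.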