Moeko YOSHIDA Hiromichi NASHIMOTO Teruyuki MIYAJIMA
This paper proposes a partial transmit sequences (PTS)-based PAPR reduction method and a phase factor estimation method without side information for OFDM systems with QPSK and 16QAM modulation. In the transmitter, an iterative algorithm that minimizes the p-norm of a transmitted signal determines phase factors to reduce PAPR. Unlike conventional methods, the phase factors are allowed to take continuous values in a limited range. In the receiver, the phase factor is blindly estimated by evaluating the phase differences between the equalizer's output and its closest constellation points. Simulation results show that the proposed PAPR reduction method is more computationally efficient than the conventional PTS. Moreover, the combined use of the two proposed methods achieves a satisfactory tradeoff between PAPR and BER by limiting the phase factors properly.
Shouhei FUKUNAGA Yoshimasa TAKABATAKE Tomohiro I Hiroshi SAKAMOTO
A grammar compression is a restricted context-free grammar (CFG) that derives a single string deterministically. The goal of a grammar compression algorithm is to develop a smaller CFG by finding and removing duplicate patterns, which is simply a frequent pattern discovery process. Any frequent pattern can be obtained in linear time; however, a huge working space is required for longer patterns, and the entire string must be preloaded into memory. We propose an online algorithm to address this problem approximately within compressed space. For an input sequence of symbols, a1,a2,..., let Gi be a grammar compression for the string a1a2…ai. In this study, an online algorithm is considered one that can compute Gi+1 from (Gi,ai+1) without explicitly decompressing Gi. Here, let G be a grammar compression for string S. We say that variable X approximates a substring P of S within approximation ratio δ iff for any interval [i,j] with P=S[i,j], the parse tree of G has a node labeled with X that derives S[l,r] for a subinterval [l,r] of [i,j] satisfying |[l,r]|≥δ|[i,j]|. Then, G solves the frequent pattern discovery problem approximately within δ iff for any frequent pattern P of S, there exists a variable that approximates P within δ. Here, δ is called the approximation ratio of G for S. Previously, the best approximation ratio obtained by a polynomial time algorithm was Ω(1/lg2|P|). The main contribution of this work is to present a new lower bound Ω(1/<*|S|lg|P|) that is smaller than the previous bound when lg*|S|
Mohd SHAHRIZAN OTHMAN Aleksandar SHURBEVSKI Hiroshi NAGAMOCHI
Given an edge-weighted bipartite digraph G=(A,B;E), the Bipartite Traveling Salesman Problem (BTSP) asks to find the minimum cost of a Hamiltonian cycle of G, or determine that none exists. When |A|=|B|=n, the BTSP can be solved using polynomial space in O*(42nnlog n) time by using the divide-and-conquer algorithm of Gurevich and Shelah (SIAM Journal of Computation, 16(3), pp.486-502, 1987). We adapt their algorithm for the bipartite case, and show an improved time bound of O*(42n), saving the nlog n factor.
Takayoshi SHOUDAI Yuta YOSHIMURA Yusuke SUZUKI Tomoyuki UCHIDA Tetsuhiro MIYAHARA
A cograph (complement reducible graph) is a graph which can be generated by disjoint union and complement operations on graphs, starting with a single vertex graph. Cographs arise in many areas of computer science and are studied extensively. With the goal of developing an effective data mining method for graph structured data, in this paper we introduce a graph pattern expression, called a cograph pattern, which is a special type of cograph having structured variables. Firstly, we show that a problem whether or not a given cograph pattern g matches a given cograph G is NP-complete. From this result, we consider the polynomial time learnability of cograph pattern languages defined by cograph patterns having variables labeled with mutually different labels, called linear cograph patterns. Secondly, we present a polynomial time matching algorithm for linear cograph patterns. Next, we give a polynomial time algorithm for obtaining a minimally generalized linear cograph pattern which explains given positive data. Finally, we show that the class of linear cograph pattern languages is polynomial time inductively inferable from positive data.
Zhenghang CUI Issei SATO Masashi SUGIYAMA
As the emergence and the thriving development of social networks, a huge number of short texts are accumulated and need to be processed. Inferring latent topics of collected short texts is an essential task for understanding its hidden structure and predicting new contents. A biterm topic model (BTM) was recently proposed for short texts to overcome the sparseness of document-level word co-occurrences by directly modeling the generation process of word pairs. Stochastic inference algorithms based on collapsed Gibbs sampling (CGS) and collapsed variational inference have been proposed for BTM. However, they either require large computational complexity, or rely on very crude estimation that does not preserve sufficient statistics. In this work, we develop a stochastic divergence minimization (SDM) inference algorithm for BTM to achieve better predictive likelihood in a scalable way. Experiments show that SDM-BTM trained by 30% data outperforms the best existing algorithm trained by full data.
Shujiao LIAO Qingxin ZHU Rui LIANG
Rough set theory is an important branch of data mining and granular computing, among which neighborhood rough set is presented to deal with numerical data and hybrid data. In this paper, we propose a new concept called inconsistent neighborhood, which extracts inconsistent objects from a traditional neighborhood. Firstly, a series of interesting properties are obtained for inconsistent neighborhoods. Specially, some properties generate new solutions to compute the quantities in neighborhood rough set. Then, a fast forward attribute reduction algorithm is proposed by applying the obtained properties. Experiments undertaken on twelve UCI datasets show that the proposed algorithm can get the same attribute reduction results as the existing algorithms in neighborhood rough set domain, and it runs much faster than the existing ones. This validates that employing inconsistent neighborhoods is advantageous in the applications of neighborhood rough set. The study would provide a new insight into neighborhood rough set theory.
Inbok LEE Victor C. VALGENTI Min S. KIM Sung-il OH
In this paper we show a simple heuristic for constructing smaller automata for a set of regular expressions, based on suffix sorting: finding common prefixes and suffixes in regular expressions and merging them. It is an important problem in network security. We applied our approach to random and real-world regular expressions. Experimental results showed that our approach yields up to 12 times enhancement in throughput.
Chao SUN Ling YANG Juan DU Fenggang SUN Li CHEN Haipeng XI Shenglei DU
In this paper, we first propose two batch blind source separation and equalization algorithms based on support vector regression (SVR) for linear time-invariant multiple input multiple output (MIMO) systems. The proposed algorithms combine the conventional cost function of SVR with error functions of classical on-line algorithm for blind equalization: both error functions of constant modulus algorithm (CMA) and radius directed algorithm (RDA) are contained in the penalty term of SVR. To recover all sources simultaneously, the cross-correlations of equalizer outputs are included in the cost functions. Simulation experiments show that the proposed algorithms can recover all sources successfully and compensate channel distortion simultaneously. With the use of iterative re-weighted least square (IRWLS) solution of SVR, the proposed algorithms exhibit low computational complexity. Compared with traditional algorithms, the new algorithms only require fewer samples to achieve convergence and perform a lower residual interference. For multilevel signals, the single algorithms based on constant modulus property usually show a relatively high residual error, then we propose two dual-mode blind source separation and equalization schemes. Between them, the dual-mode scheme based on SVR merely requires fewer samples to achieve convergence and further reduces the residual interference.
Dal-Jae YUN Jae-In LEE Ky-Ung BAE Won-Young SONG Noh-Hoon MYUNG
Three-dimensional (3-D) scattering center models use a finite number of point scatterers to efficiently represent complex radar target signature. Using the CLEAN algorithm, 3-D scattering center model is extracted from the inverse synthetic aperture radar (ISAR) image, which is generated based on the shooting and bouncing ray (SBR) technique. The conventional CLEAN extracts the strongest peak iteratively based on the assumption that the scattering centers are isolated. In a realistic target, however, both interference from the closely spaced points and additive noise distort the extraction process. This paper proposes a matched filter-based CLEAN algorithm to improve accuracy efficiently. Using the matched filtering of which impulse response is the known point spread function (PSF), a point most correlated with the PSF is extracted. Thus, the proposed method optimally enhances the accuracy in the presence of massive distortions. Numerical simulations using canonical and realistic targets demonstrate that the extraction accuracy is improved without loss of time-efficiency compared with the existing CLEAN algorithms.
Satoshi TAOKA Tadachika OKI Toshiya MASHIMA Toshimasa WATANABE
The k-edge-connectivity augmentation problem with multipartition constraints (kECAMP, for short) is defined by “Given a multigraph G=(V,E) and a multipartition π={V1,...,Vr} (r≥2) of V, that is, $V = igcup_{h = 1}^r V_h$ and Vi∩Vj=∅ (1≤i
Seongwook LEE Young-Jun YOON Seokhyun KANG Jae-Eun LEE Seong-Cheol KIM
In this paper, we propose a received signal interpolation method for enhancing the performance of multiple signal classification (MUSIC) algorithm. In general, the performance of the conventional MUSIC algorithm is very sensitive to signal-to-noise ratio (SNR) of the received signal. When array elements receive the signals with nonuniform SNR values, the resolution performance is degraded compared to elements receiving the signals with uniform SNR values. Hence, we propose a signal calibration technique for improving the resolution of the algorithm. First, based on original signals, rough direction of arrival (DOA) estimation is conducted. In this stage, using frequency-domain received signals, SNR values of each antenna element in the array are estimated. Then, a deteriorated element that has a relatively lower SNR value than those of the other elements is selected by our proposed scheme. Next, the received signal of the selected element is spatially interpolated based on the signals received from the neighboring elements and the DOA information extracted from the rough estimation. Finally, fine DOA estimation is performed again with the calibrated signal. Simulation results show that the angular resolution of the proposed method is better than that of the conventional MUSIC algorithm. Also, we apply the proposed scheme to actual data measured in the testing ground, and it gives us more enhanced DOA estimation result.
Atsushi OOKA Eum SUYONG Shingo ATA Masayuki MURATA
Information-centric networking (ICN) has received increasing attention from all over the world. The novel aspects of ICN (e.g., the combination of caching, multicasting, and aggregating requests) is based on names that act as addresses for content. The communication with name has the potential to cope with the growing and complicating Internet technology, for example, Internet of Things, cloud computing, and a smart society. To realize ICN, router hardware must implement an innovative cache replacement algorithm that offers performance far superior to a simple policy-based algorithm while still operating with feasible computational and memory overhead. However, most previous studies on cache replacement policies in ICN have proposed policies that are too blunt to achieve significant performance improvement, such as first-in first-out (popularly, FIFO) and random policies, or impractical policies in a resource-restricted environment, such as least recently used (LRU). Thus, we propose CLOCK-Pro Using Switching Hash-tables (CUSH) as the suitable policy for network caching. CUSH can identify and keep popular content worth caching in a network environment. CUSH also employs CLOCK and hash-tables, which are low-overhead data structure, to satisfy the cost requirement. We numerically evaluate our proposed approach, showing that our proposal can achieve cache hits against the traffic traces that simple conventional algorithms hardly cause any hits.
Kento SUZUKI Nobukazu TAKAI Yoshiki SUGAWARA Masato KATO
Automatic design of analog circuits using a programmed algorithm is in great demand because optimal analog circuit design in a short time is required due to the limited development time. Although an automatic design using equation-based method can design simple circuits fast and accurately, it cannot solve complex circuits. On the other hand, an automatic design using optimization algorithm such as Ant Colony Optimization, Genetic Algorithm, and so on, can design complex circuits. However, because these algorithms are based on the stochastic optimization technique and determine the circuit parameters at random, a lot of circuits which do not operate in principle are generated and simulated to find the circuit which meets specifications. In this paper, to reduce the search space and the redundant simulations, automatic design using both equation-based method and a genetic algorithm is proposed. The proposed method optimizes the bias circuit blocks using the equation-based method and signal processing blocks using Genetic Algorithm. Simulation results indicate that the evaluation value which considers the trade-off of the circuit specification is larger than the conventional method and the proposed method can design 1.4 times more circuits which satisfy the minimum requirements than the conventional method.
Yating GAO Guixia KANG Jianming CHENG Ningbo ZHANG
Wireless sensor networks usually deploy sensor nodes with limited energy resources in unattended environments so that people have difficulty in replacing or recharging the depleted devices. In order to balance the energy dissipation and prolong the network lifetime, this paper proposes a routing spanning tree-based clustering algorithm (RSTCA) which uses routing spanning tree to analyze clustering. In this study, the proposed scheme consists of three phases: setup phase, cluster head (CH) selection phase and steady phase. In the setup phase, several clusters are formed by adopting the K-means algorithm to balance network load on the basis of geographic location, which solves the randomness problem in traditional distributed clustering algorithm. Meanwhile, a conditional inter-cluster data traffic routing strategy is created to simplify the networks into subsystems. For the CH selection phase, a novel CH selection method, where CH is selected by a probability based on the residual energy of each node and its estimated next-time energy consumption as a function of distance, is formulated for optimizing the energy dissipation among the nodes in the same cluster. In the steady phase, an effective modification that counters the boundary node problem by adjusting the data traffic routing is designed. Additionally, by the simulation, the construction procedure of routing spanning tree (RST) and the effect of the three phases are presented. Finally, a comparison is made between the RSTCA and the current distributed clustering protocols such as LEACH and LEACH-DT. The results show that RSTCA outperforms other protocols in terms of network lifetime, energy dissipation and coverage ratio.
Takuma NAKAJIMA Masato YOSHIMI Celimuge WU Tsutomu YOSHINAGA
Cooperative caching is a key technique to reduce rapid growing video-on-demand's traffic by aggregating multiple cache storages. Existing strategies periodically calculate a sub-optimal allocation of the content caches in the network. Although such technique could reduce the generated traffic between servers, it comes with the cost of a large computational overhead. This overhead will be the cause of preventing these caches from following the rapid change in the access pattern. In this paper, we propose a light-weight scheme for cooperative caching by grouping contents and servers with color tags. In our proposal, we associate servers and caches through a color tag, with the aim to increase the effective cache capacity by storing different contents among servers. In addition to the color tags, we propose a novel hybrid caching scheme that divides its storage area into colored LFU (Least Frequently Used) and no-color LRU (Least Recently Used) areas. The colored LFU area stores color-matching contents to increase cache hit rate and no-color LRU area follows rapid changes in access patterns by storing popular contents regardless of their tags. On the top of the proposed architecture, we also present a new routing algorithm that takes benefit of the color tags information to reduce the traffic by fetching cached contents from the nearest server. Evaluation results, using a backbone network topology, showed that our color-tag based caching scheme could achieve a performance close to the sub-optimal one obtained with a genetic algorithm calculation, with only a few seconds of computational overhead. Furthermore, the proposed hybrid caching could limit the degradation of hit rate from 13.9% in conventional non-colored LFU, to only 2.3%, which proves the capability of our scheme to follow rapid insertions of new popular contents. Finally, the color-based routing scheme could reduce the traffic by up to 31.9% when compared with the shortest-path routing.
Yu SUZUKI Masato ITO Satoshi KANDA Kousuke IMAMURA Yoshio MATSUDA Tetsuya MATSUMURA
This paper describes the design and implementation of a real-time optical flow processor using a single field-programmable gate array (FPGA) chip. By introducing the modified initial flow generation method, the successive over-relaxation (SOR) method for both layers, the optimization of the reciprocal operation method, and the image division method, it is now possible to both reduce hardware requirements and improve flow accuracy. Additionally, by introducing a pipeline structure to this processor, high-throughput hardware implementation could be achieved. Total logic cell (LC) amounts and processer memory capacity are reduced by about 8% and 16%, respectively, compared to our previous hierarchical optical flow estimation (HOE) processor. The results of our evaluation confirm that this processor can perform 30 fps wide extended graphics array (WXGA) 175.7MHz real-time optical flow processing with a single FPGA.
Toru FUJITA Koji NAKANO Yasuaki ITO Daisuke TAKAFUJI
The main contribution of this paper is to present an efficient GPU implementation of bulk computation of the CKY parsing for a context-free grammar, which determines if a context-free grammar derives each of a lot of input strings. The bulk computation is to execute the same algorithm for a lot of inputs in turn or at the same time. The CKY parsing is to determine if a context-free grammar derives a given string. We show that the bulk computation of the CKY parsing can be implemented in the GPU efficiently using Bitwise Parallel Bulk Computation (BPBC) technique. We also show the rule minimization technique and the dynamic scheduling method for further acceleration of the CKY parsing on the GPU. The experimental results using NVIDIA TITAN X GPU show that our implementation of the bitwise-parallel CKY parsing for strings of length 32 takes 395µs per string with 131072 production rules for 512 non-terminal symbols.
Tso-Cho CHEN Erl-Huei LU Chia-Jung LI Kuo-Tsang HUANG
In this paper, a weighted multiple bit flipping (WMBF) algorithman for decoding low-density parity-check (LDPC) codes is proposed first. Then the improved WMBF algorithm which we call the efficient weighted bit-flipping (EWBF) algorithm is developed. The EWBF algorithm can dynamically choose either multiple bit-flipping or single bit-flipping in each iteration according to the log-likelihood ratio of the error probability of the received bits. Thus, it can efficiently increase the convergence speed of decoding and prevent the decoding process from falling into loop traps. Compared with the parallel weighted bit-flipping (PWBF) algorithm, the EWBF algorithm can achieve significantly lower computational complexity without performance degradation when the Euclidean geometry (EG)-LDPC codes are decoded. Furthermore, the flipping criterion does not require any parameter adjustment.
Takumi KIMURA Norikazu TAKAHASHI
Nonnegative Matrix Factorization (NMF) with sparseness and smoothness constraints has attracted increasing attention. When these properties are considered, NMF is usually formulated as an optimization problem in which a linear combination of an approximation error term and some regularization terms must be minimized under the constraint that the factor matrices are nonnegative. In this paper, we focus our attention on the error measure based on the Euclidean distance and propose a new iterative method for solving those optimization problems. The proposed method is based on the Hierarchical Alternating Least Squares (HALS) algorithm developed by Cichocki et al. We first present an example to show that the original HALS algorithm can increase the objective value. We then propose a new algorithm called the Gauss-Seidel HALS algorithm that decreases the objective value monotonically. We also prove that it has the global convergence property in the sense of Zangwill. We finally verify the effectiveness of the proposed algorithm through numerical experiments using synthetic and real data.
Xin JIANG Xiangyang LEI Lian ZENG Takahiro WATANABE
Recent Network on Chip (NoC) design must take the thermal issue into consideration due to its great impact on the network performance and reliability, especially for 3D NoC. In this work, we design a virtual channel based fully adaptive routing algorithm for the runtime 3D NoC thermal-aware management. To improve the network throughput and latency, we use two virtual channels for each horizontal direction and design a routing function which can not only avoid deadlock and livelock, but also ensure high adaptivity and routability in the throttled network. For path selection, we design a strategy that takes priority to the distance, but also considers path diversity and traffic state. For throttling information collection, instead of transmitting the topology information of the whole network, we use a 12 bits register to reserve the router state for one hop away, which saves the hardware cost largely and decreases the network latency. In the experiments, we test our proposed routing algorithm in different states with different sizes, and the proposed algorithm shows better network latency and throughput with low power compared with traditional algorithms.