Naoto ISHIDA Takashi ISHIO Yuta NAKAMURA Shinji KAWAGUCHI Tetsuya KANDA Katsuro INOUE
Defects in spacecraft software may result in loss of life and serious economic damage. To avoid such consequences, the software development process incorporates code review activities. A code review conducted by a third-party organization, independently of the software development team, can effectively identify defects in software. However, such reviews are difficult for third-party reviewers, because they need to understand the entire structure of the code within a limited time and without prior knowledge. In this study, we propose a tool to visualize inter-module dataflow in the source code of spacecraft software systems. To evaluate the method, an autonomous rover control program was reviewed using this visualization. While the tool does not decrease the time required for a code review, the reviewers considered the visualization effective for reviewing code.
Luo CHEN Ye WU Wei XIONG Ning JING
For spatial online aggregation, traditional stand-alone serial methods are increasingly inadequate. Although parallel computing is widely studied nowadays, little research has been conducted on index-based parallel online aggregation methods, especially for spatial data. In this letter, a parallel multilevel indexing method is proposed to accelerate spatial online aggregation analyses, and it consists of two steps. In the first step, a parallel aR-tree index is built to accelerate aggregate queries locally. In the second step, a multilevel sampling data pyramid structure is built on top of the parallel aR-tree index, which enables query results to be returned concurrently with a specified confidence level. Experimental and analytical results verify that the method is capable of handling billion-scale data.
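The aggregate-index idea behind the aR tree can be illustrated with a toy example. The sketch below is our own one-dimensional simplification, not the letter's parallel implementation, and all names are hypothetical; it shows only the core property: each node caches an aggregate (here, a count), so a range-aggregate query can use the cached value whenever a node's extent lies entirely inside the query range, without visiting the leaves.

```python
# One-dimensional "aggregate tree" sketch of the aR-tree idea: internal nodes
# cache an aggregate so fully-covered subtrees are answered without descent.

class Node:
    def __init__(self, lo, hi, points):
        self.lo, self.hi = lo, hi          # extent of this node
        self.count = len(points)           # cached aggregate
        if len(points) <= 2:               # small node: store points directly
            self.points = sorted(points)
            self.children = None
        else:
            pts = sorted(points)
            mid = len(pts) // 2
            self.points = None
            self.children = [Node(pts[0], pts[mid - 1], pts[:mid]),
                             Node(pts[mid], pts[-1], pts[mid:])]

def range_count(node, a, b):
    """Count points in [a, b], using cached aggregates where possible."""
    if b < node.lo or a > node.hi:
        return 0                           # disjoint: prune this subtree
    if a <= node.lo and node.hi <= b:
        return node.count                  # fully covered: cached aggregate
    if node.children is None:
        return sum(1 for p in node.points if a <= p <= b)
    return sum(range_count(c, a, b) for c in node.children)

tree = Node(0, 9, list(range(10)))
print(range_count(tree, 2, 7))  # -> 6
```

The same structure supports other distributive aggregates (sum, max) by caching them alongside the count.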
Toshiki SHIBAHARA Kohei YAMANISHI Yuta TAKATA Daiki CHIBA Taiga HOKAGUCHI Mitsuaki AKIYAMA Takeshi YAGI Yuichi OHSITA Masayuki MURATA
Drive-by download attacks have increased the number of infected hosts on enterprise networks. In these attacks, users of compromised popular websites are redirected to websites that exploit vulnerabilities in a browser and its plugins. To prevent damage, detection of infected hosts on the basis of proxy logs, rather than blacklist-based filtering, has begun to be studied. This is because blacklists have become difficult to create due to the short lifetime of malicious domains and the concealment of exploit code. To detect accesses to malicious websites from proxy logs, we propose a system for detecting malicious URL sequences on the basis of three key ideas: focusing on sequences of URLs that include artifacts of malicious redirections, designing new features related to software other than browsers, and generating new training data with data augmentation. To find an effective approach for classifying URL sequences, we compared three approaches: an individual-based approach, a convolutional neural network (CNN), and our new event de-noising CNN (EDCNN). Our EDCNN reduces the negative effects of benign URLs redirected from compromised websites included in malicious URL sequences. Evaluation results show that only our EDCNN with the proposed features and data augmentation achieved practical classification performance: a true positive rate of 99.1% and a false positive rate of 3.4%.
Two kinds of problems, multiterminal hypothesis testing and one-to-many lossy source coding, are investigated in a unified way. It is demonstrated that a simple key idea, developed by Iriyama for one-to-one source coding systems, can be applied to multiterminal source coding systems. In particular, general bounds on the error exponents for multiterminal hypothesis testing and one-to-many lossy source coding are given.
Masayuki ARAI Shingo INUYAMA Kazuhiko IWASAKI
As semiconductor device manufacturing technology evolves toward higher integration and smaller feature sizes, the gap between the defect level estimated at the design stage and that reported for fabricated devices has widened, making it more difficult to control the total manufacturing cost, including test cost and the cost of field failures. To estimate fault coverage more precisely by considering the occurrence probabilities of faults, we previously proposed weighted fault coverage estimation based on the critical area corresponding to each fault. Previously, different fault models were handled separately; thus, pattern compression efficiency and runtime were not optimized. In this study, we propose a fast test pattern generation scheme that considers weighted bridge and open fault coverage in an integrated manner. The proposed scheme applies two-step test pattern generation, wherein the test patterns generated in the second step, which target only bridge faults, are reordered with a search window of fixed size, achieving O(n) computational complexity. Experimental results indicate that, with 10% of the initial target fault size and a fixed, small window size, the proposed scheme achieves an approximately 100-fold runtime reduction compared to simple greedy-based reordering, in exchange for an increase of about 5% in pattern count.
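The fixed-window reordering step can be sketched abstractly. The toy below is our own abstraction, not the authors' ATPG flow; the fault identifiers and weights are hypothetical. Each pattern is modeled as a set of weighted faults it detects, and at each step only the next `w` unplaced patterns are examined, the one adding the most weighted coverage being placed next, so the number of candidate evaluations grows linearly in the pattern count for a fixed window size.

```python
# Fixed-window greedy reordering sketch: examine only the next w unplaced
# patterns and place the one with the largest incremental weighted coverage.

def reorder(patterns, weights, w=2):
    """patterns: list of sets of fault ids; weights: fault id -> weight."""
    remaining = list(patterns)
    covered, order = set(), []
    while remaining:
        window = remaining[:w]                     # fixed-size search window
        gains = [sum(weights[f] for f in p - covered) for p in window]
        best = gains.index(max(gains))             # best pattern in the window
        chosen = window[best]
        order.append(chosen)
        covered |= chosen
        remaining.pop(best)
    return order

weights = {"b1": 5.0, "b2": 1.0, "b3": 0.1}        # hypothetical fault weights
patterns = [{"b2"}, {"b1", "b2"}, {"b3"}]
print(reorder(patterns, weights, w=2))
```

With the window of size 2, the high-weight pattern {"b1", "b2"} is promoted ahead of {"b2"} because both fall inside the first window.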
Manabu KOBAYASHI Toshiyasu MATSUSHIMA Shigeichi HIRASAWA
F.P. Preparata et al. proposed a fault diagnosis model for finding all faulty units in a multicomputer system by using the outcomes of tests in which each unit tests some other units. In this paper, for probabilistic diagnosis models, we present an efficient diagnosis algorithm that obtains the posterior probability that each unit is faulty given the test outcomes. Furthermore, we propose a method to analyze the diagnostic error probability of this algorithm.
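To make the quantity being computed concrete, the following brute-force sketch (our own illustration, not the paper's efficient algorithm) evaluates posterior fault probabilities in a tiny PMC-style system by enumerating all fault patterns. The outcome model is assumed for illustration: each unit is faulty independently with prior probability p, a fault-free tester reports the true status of the tested unit, and a faulty tester reports 0 or 1 with probability 0.5.

```python
# Brute-force Bayesian posterior over fault patterns in a tiny test graph.
from itertools import product

def posteriors(n, tests, p):
    """tests: list of (tester, tested, outcome) with outcome 1 = 'faulty'."""
    joint = [0.0] * (1 << n)                   # unnormalized P(pattern, outcomes)
    for bits in product([0, 1], repeat=n):     # enumerate all fault patterns
        prob = 1.0
        for u in range(n):                     # prior over the pattern
            prob *= p if bits[u] else (1 - p)
        for tester, tested, outcome in tests:  # likelihood of the outcomes
            if bits[tester]:                   # faulty tester: coin flip
                prob *= 0.5
            elif outcome != bits[tested]:      # fault-free tester is truthful
                prob = 0.0
                break
        joint[sum(b << i for i, b in enumerate(bits))] = prob
    total = sum(joint)
    return [sum(pr for idx, pr in enumerate(joint) if (idx >> u) & 1) / total
            for u in range(n)]

# Three units in a ring: 0 tests 1, 1 tests 2, 2 tests 0; unit 1 is accused.
post = posteriors(3, [(0, 1, 1), (1, 2, 0), (2, 0, 0)], p=0.1)
print([round(q, 3) for q in post])
```

Enumeration costs O(2^n) and is only viable for toy systems; the point of an efficient diagnosis algorithm is to obtain these posteriors without such enumeration.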
Shinichi NISHIZAWA Hidetoshi ONODERA
This paper describes a design methodology for a process-variation-aware D-flip-flop (DFF) using regression analysis. We propose using regression analysis to model the worst-case delay characteristics of a DFF under process variation. We utilize the regression equation for transistor width tuning of the DFF to improve its worst-case delay performance. Regression analysis not only identifies the performance-critical transistors inside the DFF, but also quantifies their impact on DFF delay performance. The proposed design methodology is verified using Monte Carlo simulation. The results show that the proposed method yields a DFF with delay characteristics similar to or better than those of a DFF designed by an experienced cell designer.
Risa TAKEDA Yosei SHIBATA Takahiro ISHINABE Hideo FUJIKAKE
We examined single-crystal growth of benzothienobenzothiophene-based organic semiconductors by a solution coating method using liquid crystal and investigated their electrical characteristics. As a result, we found that the average mobility in the saturation region reached 2.08 cm2/Vs along the crystalline b-axis and 1.08 cm2/Vs along the crystalline a-axis.
Bo SUN Akinori FUJINO Tatsuya MORI Tao BAN Takeshi TAKAHASHI Daisuke INOUE
Analyzing a malware sample requires much more time and cost than creating it. To understand the behavior of a given malware sample, security analysts often make use of API call logs collected by dynamic malware analysis tools such as a sandbox. As the log generated for a malware sample can become tremendously large, inspecting it requires a time-consuming effort. Meanwhile, antivirus vendors usually publish malware analysis reports (vendor reports) on their websites. These reports are the results of careful analysis by security experts. The problem is that, even though such analyzed examples exist for malware samples, associating the vendor reports with the sandbox logs is difficult. This prevents security analysts from retrieving the useful information described in vendor reports. To address this issue, we developed a system called AMAR-Generator that aims to automate the generation of malware analysis reports based on sandbox logs by making use of existing vendor reports. Aiming at a convenient assistant tool for security analysts, our system employs techniques including template matching, API behavior mapping, and a malicious behavior database to produce concise human-readable reports that describe the malicious behaviors of malware programs. Through a performance evaluation, we first demonstrate that AMAR-Generator can generate human-readable reports that a security analyst can use as the first step of malware analysis. We also demonstrate that AMAR-Generator can identify the malicious behaviors that malware performs from the sandbox logs; the detection rates are up to 96.74%, 100%, and 74.87% on sandbox logs collected in 2013, 2014, and 2015, respectively. We also show that it can detect malicious behaviors from unknown types of sandbox logs.
Takuya WATANABE Mitsuaki AKIYAMA Tetsuya SAKAI Hironori WASHIZAKI Tatsuya MORI
Permission warnings and privacy policy enforcement are widely used to inform mobile app users of privacy threats. These mechanisms disclose information about the use of privacy-sensitive resources such as user location or the contact list. However, it has been reported that very few users pay attention to these mechanisms during installation. Instead, a user may focus on a more user-friendly source of information: the text description, which is written by a developer who has an incentive to attract user attention. When a user searches for an app in a marketplace, his/her query keywords are generally matched against the text descriptions of mobile apps. Then, users review the search results, often by reading the text descriptions; i.e., text descriptions are associated with user expectations. Given these observations, this paper aims to address the following research question: What are the primary reasons that text descriptions of mobile apps fail to refer to the use of privacy-sensitive resources? To answer this research question, we performed an empirical large-scale study on a huge volume of apps with our ACODE (Analyzing COde and DEscription) framework, which combines static code analysis and text analysis. We developed lightweight techniques so that we can handle hundreds of thousands of distinct text descriptions. We note that our text analysis technique does not require manually labeled descriptions; hence, it enables us to conduct a large-scale measurement study without requiring expensive labeling tasks.
Our analysis of 210,000 apps, including free and paid ones, and multilingual text descriptions collected from official and third-party Android marketplaces revealed four primary factors associated with the inconsistencies between text descriptions and the use of privacy-sensitive resources: (1) the existence of app-building services/frameworks that tend to add API permissions/code unnecessarily, (2) the existence of prolific developers who publish many applications that unnecessarily request permissions and include code, (3) the existence of secondary functions that tend to go unmentioned, and (4) the existence of third-party libraries that access privacy-sensitive resources. We believe that these findings will be useful for improving users' awareness of privacy on mobile software distribution platforms.
Automatic acquisition of game strategy data is important for realizing professional strategy analysis systems, as it provides evaluation values such as the team status and the efficacy of plays. The key factor that influences the performance of strategy data acquisition in volleyball games is the unknown player roles. A player role is the tactically meaningful position of each player in the team formation, such as setter, attacker, or blocker. Unknown player roles make individual player information unreliable and discard each player's contribution to the strategy analysis. This paper proposes a court-divisional team motion feature and a player performance curve to deal with unknown player roles in strategy data acquisition. Firstly, the court-divisional team motion feature is proposed for team tactical status detection. This feature reduces the influence of individual player information by summing up the ball-relative motion density of all the players in divided court areas, which correspond to the different plays. Secondly, the player performance curves are proposed for acquiring the efficacy variables of attack plays. Player role candidates are detected by three features that represent the entire process of a player starting to rush (or jump) to the ball and hitting it: the ball-relative distance, the ball approach motion, and the attack motion feature. With 3D ball trajectories and multiple players' positions tracked from multi-view volleyball game videos, the experimental detection rates of the team statuses (attack, defense-ready, offense-ready, and offense) are 75.2%, 84.2%, 79.7%, and 81.6%, respectively. For the acquisition of attack efficacy variables, the average precisions of the set zone, the number of available attackers, the attack tempo, and the number of blockers are 100%, 100%, 97.8%, and 100%, respectively, achieving an 8.3% average improvement over manual acquisition.
Hideaki ISHIBASHI Masayoshi ERA Tetsuo FURUKAWA
The aim of this work is to develop a method for the simultaneous analysis of multiple groups and their members based on hierarchical tensor manifold modeling. The method is particularly designed to analyze multiple teams, such as sports teams and business teams. The proposed method represents members' data using a nonlinear manifold for each team, and these manifolds are in turn modeled using another nonlinear manifold in the model space. For this purpose, the method estimates the role of each member in the team, and discovers correspondences between members who play similar roles in different teams. The proposed method was applied to basketball league data, where it demonstrated its ability to discover knowledge from players' statistics. We also demonstrated that the method can be used as a general tool for multi-level, multi-group analysis by applying it to marketing data.
Chaman WIJESIRIWARDANA Prasad WIMALARATNE
Mining software repositories allows software practitioners to improve the quality of software systems and to support maintenance based on historical data. Such data is scattered across autonomous and heterogeneous information sources, such as version control, bug tracking, and build automation systems. Despite having many tools to track and measure the data originating from such repositories, software practitioners often suffer from a scarcity of techniques for dynamically leveraging software repositories to fulfill their complex information needs. For example, answering a question such as “What is the number of commits between two successful builds?” requires tiresome manual inspection of multiple repositories. As a solution, this paper presents a conceptual framework and a proof-of-concept visual query interface to satisfy distinct software-quality-related information needs of software practitioners. The data originating from repositories is integrated and analyzed to perform systematic investigations, which helps to uncover hidden relationships between software quality and trends of software evolution. This approach has several significant benefits, such as the ability to perform real-time analyses, to combine data from various software repositories, and to generate queries dynamically. The framework was evaluated with 31 subjects using a series of questions categorized into three software evolution scenarios. The evaluation results clearly show that our framework surpasses state-of-the-art tools in terms of correctness, time, and usability.
Jingjie YAN Guanming LU Xiaodong BAI Haibo LI Ning SUN Ruiyu LIANG
In this letter, we propose a supervised bimodal emotion recognition approach based on two important human emotion modalities: facial expression and body gesture. An effective supervised feature fusion algorithm named supervised multiset canonical correlation analysis (SMCCA) is presented to establish the linear connection between three sets of matrices, which contain the feature matrices of the two modalities and their concurrent category matrix. The test results for bimodal emotion recognition on the FABO database show that the SMCCA algorithm achieves better or comparable performance relative to unsupervised feature fusion algorithms, including canonical correlation analysis (CCA), sparse canonical correlation analysis (SCCA), and multiset canonical correlation analysis (MCCA).
In this paper, we propose a novel primary user detection scheme for spectrum sensing in cognitive radio. Inspired by the conventional signal classification approach, spectrum sensing is translated into a classification problem. On the basis of feature-based classification, the spectral correlation of a second-order cyclostationary analysis is applied as the feature extraction method, whereas a stacked denoising autoencoder network is applied as the classifier. Two training methods for signal detection, interception-based detection and simulation-based detection, are considered for different prior information and implementation conditions. In the interception-based detection method, inspired by two-step sensing, we obtain training data from the interception of actual signals after a sophisticated sensing procedure, to achieve detection without prior information. In addition, benefiting from practical training data, interception-based detection is superior under actual transmission environment conditions. The alternative, simulation-based detection, utilizes some undisguised parameters of the primary user in the spectrum of interest. Owing to the diversified predetermined training data, simulation-based detection exhibits superior robustness against harsh noise environments, although it demands a more complicated classifier network structure. Additionally, for the above training methods, we discuss the classifier complexity under different implementation conditions and the trade-off between robustness and detection performance. The simulation results show the advantages of the proposed method over conventional spectrum-sensing schemes.
Kazuo AOYAMA Kazumi SAITO Tetsuo IKEDA
This paper presents an efficient acceleration algorithm for Lloyd-type k-means clustering that is suitable for large-scale, high-dimensional data sets with potentially numerous classes. The algorithm employs a novel projection-based filter (PRJ) to avoid unnecessary distance calculations, resulting in high-speed performance while producing exactly the same results as the standard Lloyd's algorithm. The PRJ exploits a summable lower bound on a squared distance, defined in a lower-dimensional space onto which the data points are projected. The summable lower bound can be tightened dynamically by the incremental addition of components in the lower-dimensional space within each iteration, whereas the existing lower bounds used in other acceleration algorithms act only once as a fixed filter. Experimental results on large-scale, high-dimensional real image data sets demonstrate that the proposed algorithm works at high speed and with low memory consumption for large values of k, compared with state-of-the-art algorithms.
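The flavor of summable lower-bound filtering can be shown with a minimal sketch. This is our own simplification, not the paper's PRJ filter: it uses the trivial fact that the squared distance accumulated over a prefix of the coordinates is already a valid lower bound on the full squared distance, so a centroid can be rejected as soon as the partial sum exceeds the best distance found so far, without computing the remaining coordinates.

```python
# Incremental lower-bound filtering in a nearest-centroid search: add squared
# components block by block and prune a centroid once the partial sum (a valid
# lower bound on the full squared distance) exceeds the current best.

def nearest_centroid(x, centroids, block=2):
    best_j, best_d = -1, float("inf")
    d = len(x)
    for j, c in enumerate(centroids):
        partial = 0.0
        pruned = False
        for start in range(0, d, block):            # add components incrementally
            for k in range(start, min(start + block, d)):
                diff = x[k] - c[k]
                partial += diff * diff
            if partial >= best_d:                    # lower bound already too big
                pruned = True
                break
        if not pruned and partial < best_d:          # partial is now the full distance
            best_j, best_d = j, partial
    return best_j, best_d

x = [1.0, 2.0, 3.0, 4.0]
centroids = [[0.0, 0.0, 0.0, 0.0], [1.0, 2.0, 3.0, 5.0], [9.0, 9.0, 9.0, 9.0]]
print(nearest_centroid(x, centroids))  # -> (1, 1.0)
```

Because the pruning test only ever skips centroids whose true distance is provably not the minimum, the assignment is identical to the exhaustive search, which is the same exactness property the paper claims for its filter.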
Namyong JUNG Hyeongboo BAEK Donghyouk LIM Jinkyu LEE
As real-time embedded systems are required to accommodate various tasks with different levels of criticality, scheduling algorithms for MC (Mixed-Criticality) systems have been widely studied in the real-time systems community. Most studies have focused on MC uniprocessor systems, whereas there have been only a few studies to support MC multiprocessor systems. In particular, although the ZL (Zero-Laxity) policy is known to be an effective technique for improving the schedulability performance of base scheduling algorithms on SC (Single-Criticality) multiprocessor systems, the effectiveness of the ZL policy on MC multiprocessor systems has not been investigated to date. In this paper, we focus on realizing the potential of the ZL policy for MC multiprocessor systems, which is, to our knowledge, the first such attempt. To this end, we design the ZL policy for MC multiprocessor systems, and apply the policy to EDF (Earliest Deadline First), yielding EDZL (Earliest Deadline first until Zero-Laxity) tailored for MC multiprocessor systems. Then, we develop a schedulability analysis for EDZL (as well as its base algorithm EDF) to support its timing guarantees. Our simulation results show a significant schedulability improvement of EDZL over EDF, demonstrating the effectiveness of the ZL policy for MC multiprocessor systems.
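The base EDZL priority rule (before the paper's mixed-criticality extension) can be sketched in a few lines. This toy is our own illustration with hypothetical job parameters: at time t, any job whose laxity (deadline minus current time minus remaining execution) has reached zero receives top priority, and the remaining jobs are ordered by earliest absolute deadline.

```python
# EDZL priority rule sketch: zero-laxity jobs first, then earliest deadline.

def edzl_order(jobs, t):
    """jobs: list of (name, absolute_deadline, remaining_execution).
    Returns job names ordered from highest to lowest priority at time t."""
    def key(job):
        name, deadline, remaining = job
        laxity = deadline - t - remaining
        return (0 if laxity <= 0 else 1,   # zero-laxity jobs come first
                deadline)                  # then earliest deadline first
    return [name for name, _, _ in sorted(jobs, key=key)]

# At t = 4, job B has laxity 12 - 4 - 8 = 0 and must run immediately,
# even though C and A have earlier deadlines.
jobs = [("A", 10, 3), ("B", 12, 8), ("C", 9, 2)]
print(edzl_order(jobs, 4))  # -> ['B', 'C', 'A']
```

The example shows why the policy helps on multiprocessors: plain EDF would delay B behind C and A and miss B's deadline, whereas the zero-laxity check promotes B exactly when further delay becomes fatal.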
The Jacobi-Davidson method and the Riccati method for eigenvalue problems are studied. In both methods, a nonlinear equation called the correction equation must be solved at each iteration, and the difference between the methods lies in how this equation is solved: the Jacobi-Davidson method solves the correction equation with linearization, whereas the Riccati method solves it without linearization. In the literature, avoiding the linearization is known to yield a better solution of the equation and bring faster convergence. In fact, the Riccati method has shown superior convergence behavior for some problems. Nevertheless, the advantage of the Riccati method remains unclear, because the correction equation is solved not exactly but with low accuracy. In this paper, we analyze the approximate solution of the correction equation and clarify that the Riccati method is specialized for computing particular solutions of eigenvalue problems. This result suggests that the two methods should be used selectively depending on the target solutions. Our analysis is verified by numerical experiments.
Hideaki YOSHINO Kenko OTA Takefumi HIRAGURI
Data aggregation, the process of summarizing a large amount of data, is an effective method for saving limited communication resources, such as radio frequency and sensor-node energy. Packet aggregation in wireless LANs and sensed-data aggregation in wireless sensor networks are typical examples. We propose and analyze two queueing models of fundamental statistical data aggregation schemes: constant interval and constant aggregation number. We represent each aggregation scheme by a tandem queueing network model with a gate at the aggregation process and a single-server queue at the transmission process. We analytically derive the stationary distribution and the Laplace-Stieltjes transform of the system time for each aggregation and transmission process and of the total system time. We then numerically evaluate the stationary mean system time characteristics and clarify that each model has an optimal aggregation parameter (i.e., an optimal aggregation interval or an optimal aggregation number) that minimizes the mean total system time. In addition, we derive the explicit optimal aggregation parameter for a D/M/1 transmission model with each aggregation scheme and clarify that it accurately approximates the optimal parameter of each aggregation model. The optimal aggregation interval is determined by the transmission rate alone, while the optimal aggregation number is determined by the arrival and transmission rates alone, with explicitly derived proportionality constants. These results can provide a theoretical basis and a guideline for designing aggregation devices, such as IoT gateways.
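The trade-off behind the optimal aggregation interval can be seen in a toy simulation. This is our own discrete-event sketch, not the paper's analytical model, and all rates are hypothetical: messages arrive as a Poisson process at a gate that opens every T time units and releases one aggregated batch into a single-server queue with exponential service times. A very small T overloads the server with many batches, while a very large T inflates the wait at the gate, so an intermediate T minimizes the mean total system time.

```python
# Constant-interval aggregation feeding a single-server queue (toy simulation).
import random

def mean_total_time(T, lam=5.0, mu=2.0, horizon=2000.0, seed=1):
    rng = random.Random(seed)
    next_arrival = rng.expovariate(lam)    # next Poisson arrival epoch
    arrivals = []                          # messages waiting at the gate
    server_free = 0.0                      # epoch when the server becomes idle
    total, count = 0.0, 0
    gate = T                               # next gate-opening epoch
    while gate < horizon:
        while next_arrival < gate:         # collect arrivals for this interval
            arrivals.append(next_arrival)
            next_arrival += rng.expovariate(lam)
        if arrivals:
            start = max(gate, server_free)           # the batch may queue
            finish = start + rng.expovariate(mu)     # one service per batch
            server_free = finish
            for a in arrivals:
                total += finish - a                  # per-message system time
                count += 1
            arrivals = []
        gate += T
    return total / count

for T in (0.2, 1.0, 5.0):
    print(T, round(mean_total_time(T), 2))
```

With these rates, T = 0.2 makes the batch stream faster than the server can handle (queueing delay grows over the run), T = 5.0 incurs a long gate wait, and T = 1.0 lies near the minimum, mirroring the existence of the optimal interval derived analytically in the paper.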
Akihide NAGAMINE Kanshiro KASHIKI Fumio WATANABE Jiro HIROKAWA
As one function of the wireless distributed network (WDN), which enables flexible wireless networks, dynamic spectrum access is assumed to be applied to OFDM systems for superior radio resource management. As a basic technology for such WDNs, our study deals with OFDM signal detection based on its cyclostationary feature. Previous relevant studies mainly relied on software simulations based on the Monte Carlo method. This paper analytically clarifies the relationship between the design parameters of the detector and its detection performance. The detection performance is formulated using multiple design parameters, including the transfer function of the receive filter. A hardware experiment with radio-frequency (RF) signals is also carried out using a detector consisting of an RF unit and an FPGA. It is thereby verified that the detection characteristics, represented by the false-alarm and non-detection probabilities calculated with the analytical formula, agree well with those obtained in the hardware experiment. Our analytical and experimental results are useful for designing the parameters of the signal detector to satisfy required performance criteria.