Bo SUN Akinori FUJINO Tatsuya MORI Tao BAN Takeshi TAKAHASHI Daisuke INOUE
Analyzing a malware sample requires much more time and cost than creating it. To understand the behavior of a given malware sample, security analysts often make use of API call logs collected by dynamic malware analysis tools such as sandboxes. Because the log generated for a single malware sample can become extremely large, inspecting it is time-consuming. Meanwhile, antivirus vendors usually publish malware analysis reports (vendor reports) on their websites. These reports are the results of careful analysis by security experts. The problem is that, even though such analyzed examples exist for malware samples, associating the vendor reports with the sandbox logs is difficult, which prevents security analysts from retrieving the useful information described in vendor reports. To address this issue, we developed a system called AMAR-Generator that aims to automate the generation of malware analysis reports from sandbox logs by making use of existing vendor reports. Designed as a convenient assistant tool for security analysts, our system employs techniques including template matching, API behavior mapping, and a malicious behavior database to produce concise human-readable reports that describe the malicious behaviors of malware programs. In our performance evaluation, we first demonstrate that AMAR-Generator can generate human-readable reports that a security analyst can use as the first step of malware analysis. We also demonstrate that AMAR-Generator can identify the malicious behaviors conducted by malware from the sandbox logs; the detection rates are up to 96.74%, 100%, and 74.87% on the sandbox logs collected in 2013, 2014, and 2015, respectively. We further show that it can detect malicious behaviors from previously unseen types of sandbox logs.
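As a rough illustration of how API behavior mapping against a malicious behavior database can yield template-based report sentences, the following Python sketch pairs hypothetical API-call patterns with report templates. The behavior entries, templates, and matching rule are illustrative assumptions, not AMAR-Generator's actual database or code.

import re

# Toy behavior database (hypothetical): each entry maps API-call patterns to a
# human-readable report template.
BEHAVIOR_DB = {
    "process_injection": {
        "apis": ["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"],
        "match": "all",
        "template": "The sample injects code into another process ({evidence}).",
    },
    "registry_persistence": {
        "apis": ["RegSetValueExA", "RegSetValueExW"],
        "match": "any",
        "template": "The sample modifies the registry, a common persistence technique ({evidence}).",
    },
}

def generate_report(api_log_lines):
    """Return human-readable sentences for behaviors matched in the API call log."""
    sentences = []
    for entry in BEHAVIOR_DB.values():
        hits = [api for api in entry["apis"]
                if any(re.search(rf"\b{api}\b", line) for line in api_log_lines)]
        matched = len(hits) == len(entry["apis"]) if entry["match"] == "all" else bool(hits)
        if matched:
            sentences.append(entry["template"].format(evidence=", ".join(hits)))
    return sentences

if __name__ == "__main__":
    log = ["VirtualAllocEx(hProcess=0x1a4, ...)",
           "WriteProcessMemory(hProcess=0x1a4, ...)",
           "CreateRemoteThread(hProcess=0x1a4, ...)"]
    print("\n".join(generate_report(log)))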
Lianqiang LI Kangbo SUN Jie ZHU
Knowledge distillation approaches can transfer information from a large network (teacher network) to a small network (student network) to compress and accelerate deep neural networks. This paper proposes a novel knowledge distillation approach called multi-knowledge distillation (MKD). MKD consists of two stages. In the first stage, it employs autoencoders to learn compact and precise representations of the feature maps (FM) from the teacher network and the student network; these representations can be treated as the essence of the FM, i.e., EFM. In the second stage, MKD utilizes multiple kinds of knowledge, i.e., the magnitude of each individual sample's EFM and the similarity relationships among several samples' EFMs, to enhance the generalization ability of the student network. Compared with previous approaches that employ FM or handcrafted features derived from FM, the EFM learned by autoencoders can be transferred more efficiently and reliably. Furthermore, the rich information provided by the multiple kinds of knowledge enables the student network to mimic the teacher network as closely as possible. Experimental results also show that MKD is superior to state-of-the-art approaches.
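The two kinds of transferred knowledge described above can be pictured with a short PyTorch-style sketch: an autoencoder encoder compresses each feature map into an EFM, and the student is trained to match both the teacher's per-sample EFM values and the pairwise similarity structure among EFMs in a batch. The loss terms, weights, and encoder interface are assumptions for illustration, not the authors' implementation.

import torch
import torch.nn.functional as F

def efm(autoencoder_encoder, feature_map):
    """Encode a feature map (B, C, H, W) into a compact per-sample EFM vector."""
    return autoencoder_encoder(feature_map.flatten(start_dim=1))

def mkd_loss(efm_teacher, efm_student, alpha=1.0, beta=1.0):
    # Knowledge 1: match each individual sample's EFM (magnitude/values).
    magnitude_loss = F.mse_loss(efm_student, efm_teacher)
    # Knowledge 2: match the pairwise similarity structure among samples in the batch.
    sim_t = F.normalize(efm_teacher, dim=1) @ F.normalize(efm_teacher, dim=1).T
    sim_s = F.normalize(efm_student, dim=1) @ F.normalize(efm_student, dim=1).T
    similarity_loss = F.mse_loss(sim_s, sim_t)
    return alpha * magnitude_loss + beta * similarity_loss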
Takuya WATANABE Mitsuaki AKIYAMA Fumihiro KANEI Eitaro SHIOJI Yuta TAKATA Bo SUN Yuta ISHII Toshiki SHIBAHARA Takeshi YAGI Tatsuya MORI
This paper reports a large-scale study that aims to understand how mobile application (app) vulnerabilities are associated with software libraries. We analyze both free and paid apps. Studying paid apps was particularly meaningful because it helped us understand how differences in app development and maintenance affect the vulnerabilities associated with libraries. We analyzed 30k free and paid apps collected from the official Android marketplace. Our extensive analyses revealed that approximately 70%/50% of the vulnerabilities of free/paid apps stem from software libraries, particularly from third-party libraries. Somewhat paradoxically, we found that more expensive/popular paid apps tend to have more vulnerabilities. This is because more expensive/popular paid apps tend to offer more functionality, i.e., more code and libraries, which increases the probability of vulnerabilities. Based on our findings, we provide suggestions to stakeholders of mobile app distribution ecosystems.
Jingbo SUN Yue WANG Jian YUAN Xiuming SHAN
Since most of the energy consumed by telecommunication infrastructure is due to Base Transceiver Stations (BTSs), switching off BTSs when the traffic load is low has been recognized as an effective way of saving energy. In this letter, an energy saving scheme is proposed to minimize the number of active BTSs based on the space-time structure of traffic loads as determined by principal component analysis. Compared with existing methods, our approach models traffic loads more accurately and has a much smaller input size. Because it is implemented in an off-line manner, our scheme also avoids excessive communication and computing overhead. Simulation results show that the proposed method achieves comparable energy savings.
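A minimal numerical sketch of the idea of using a principal-component (low-rank) model of the BTS traffic matrix to decide, off-line, which stations could be switched off in low-load hours. The synthetic traffic, capacity figure, and simple threshold rule below are assumptions for illustration; the paper's minimization of the number of active BTSs is more involved.

import numpy as np

def pca_reconstruct(traffic, k=3):
    """traffic: (num_bts, num_hours) load matrix. Return its rank-k PCA reconstruction."""
    mean = traffic.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(traffic - mean, full_matrices=False)
    return mean + u[:, :k] @ np.diag(s[:k]) @ vt[:k, :]

def sleep_plan(traffic, capacity, low_frac=0.2, k=3):
    """Mark (BTS, hour) pairs whose modeled load falls below low_frac of capacity."""
    return pca_reconstruct(traffic, k) < low_frac * capacity[:, None]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hours = np.arange(24)
    diurnal = 0.5 + 0.5 * np.sin(2 * np.pi * (hours - 6) / 24)     # shared daily pattern
    traffic = rng.uniform(5, 20, (10, 1)) * diurnal + rng.normal(0, 1, (10, 24))
    plan = sleep_plan(np.clip(traffic, 0, None), capacity=np.full(10, 25.0))
    print("Candidate BTS-hours for switch-off:", int(plan.sum()))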
Local discriminative regions play important roles in fine-grained image analysis tasks. How to locate local discriminative regions using only category labels, and how to learn discriminative representations from these regions, have been active research topics. In this work, we propose the Searching Discriminative Regions (SDR) and Learning Discriminative Regions (LDR) methods to search for and learn local discriminative regions in images. The SDR method adopts an attention mechanism to iteratively search for high-response regions in images and uses them as clues to locate local discriminative regions. Moreover, the LDR method is proposed to learn representations that are compact within categories and sparse between categories from both the raw image and the local images. Experimental results show that our proposed approach achieves excellent performance in both fine-grained image retrieval and classification tasks, which demonstrates its effectiveness.
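The attention-guided, iterative region search can be sketched roughly as follows: channel-averaged feature maps give a spatial attention map, its high-response cells define a crop, and the crop is fed back to the backbone. The thresholding and cropping rule and the backbone interface are assumptions for illustration; the SDR procedure in the paper may differ in detail.

import torch

def high_response_box(feature_map, thresh=0.6):
    """feature_map: (C, H, W). Bounding box around cells with high attention response."""
    attn = feature_map.mean(dim=0)                                  # spatial attention (H, W)
    attn = (attn - attn.min()) / (attn.max() - attn.min() + 1e-8)
    ys, xs = torch.nonzero(attn > thresh, as_tuple=True)
    if ys.numel() == 0:                                             # fall back to the full map
        return 0, attn.shape[0] - 1, 0, attn.shape[1] - 1
    return ys.min().item(), ys.max().item(), xs.min().item(), xs.max().item()

def search_regions(backbone, image, steps=2):
    """Iteratively crop the image (C, H, W) toward its most discriminative region."""
    crop, boxes = image, []
    for _ in range(steps):
        fmap = backbone(crop.unsqueeze(0)).squeeze(0)               # (C, h, w)
        y0, y1, x0, x1 = high_response_box(fmap)
        sy, sx = crop.shape[1] / fmap.shape[1], crop.shape[2] / fmap.shape[2]
        crop = crop[:, int(y0 * sy):int((y1 + 1) * sy), int(x0 * sx):int((x1 + 1) * sx)]
        boxes.append((y0, y1, x0, x1))
    return crop, boxes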
Denghui YAO Xiaoyong ZHANG Zhengbo SUN Dexiu HU
Long-term coherent integration can significantly improve a radar's ability to detect maneuvering targets; for weak targets in particular, longer integration times are needed to improve detection. For non-radially moving targets, however, the time-varying angle between the target's direction of motion and the radar line of sight causes non-linear range migration (NLRM) and non-linear Doppler frequency migration (NLDFM) during long-time coherent processing, which invalidates existing methods that ignore angle changes and seriously degrades the performance of coherent integration. To solve this problem, an efficient method based on the Radon Fourier transform (RFT) with a modified variant-angle model (ARFT) is proposed. In this method, a new angle parameter is introduced to refine the target motion model, and the NLRM and NLDFM are eliminated by a joint three-dimensional range-velocity-angle search in the ARFT. Compared with conventional algorithms, the proposed method can more accurately compensate for the NLRM and NLDFM, thus achieving better integration performance and detection probability for weak, non-radially moving targets. Numerical simulations verify the effectiveness and advantages of the proposed method.
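In simplified form, the joint range-velocity-angle search can be viewed as a three-dimensional grid search that coherently integrates echo samples along the range trajectory implied by each (initial range, speed, angle) hypothesis. The geometric range model and the brute-force search below are assumptions for illustration only, not the paper's ARFT formulation or its efficient implementation.

import numpy as np

def range_trajectory(r0, v, theta, t):
    """Instantaneous radial range for a constant-velocity target at angle theta to the LOS."""
    return np.sqrt(r0**2 + (v * t)**2 + 2.0 * r0 * v * t * np.cos(theta))

def arft_search(echo, fast_range_axis, slow_time, wavelength, r0_grid, v_grid, theta_grid):
    """echo: (num_pulses, num_range_bins) complex samples after pulse compression.
    Return the (r0, v, theta) hypothesis maximizing the coherent integration output."""
    best, best_hyp = -np.inf, None
    for r0 in r0_grid:
        for v in v_grid:
            for theta in theta_grid:
                r_t = range_trajectory(r0, v, theta, slow_time)
                # Pick, per pulse, the range bin closest to the hypothesized trajectory.
                bins = np.argmin(np.abs(fast_range_axis[None, :] - r_t[:, None]), axis=1)
                samples = echo[np.arange(len(slow_time)), bins]
                phase = np.exp(1j * 4 * np.pi * r_t / wavelength)   # Doppler phase compensation
                power = np.abs(np.sum(samples * phase))**2
                if power > best:
                    best, best_hyp = power, (r0, v, theta)
    return best_hyp, best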
Bo SUN Mitsuaki AKIYAMA Takeshi YAGI Mitsuhiro HATADA Tatsuya MORI
Modern web users may encounter a browser security threat called drive-by-download attacks while surfing the Internet. Drive-by-download attacks use exploit code to take control of the user's web browser, and many web users do not take this underlying threat into account when clicking URLs. URL blacklisting is one of the practical approaches to thwarting browser-targeted attacks. However, a URL blacklist cannot cope with previously unseen malicious URLs; therefore, to keep a URL blacklist effective, it is crucial to keep its URLs updated. Given these observations, we propose a framework called automatic blacklist generator (AutoBLG) that automates the collection of new malicious URLs starting from a given existing URL blacklist. The primary mechanism of AutoBLG is to expand the search space of web pages while reducing the number of URLs to be analyzed by applying several pre-filters, such as similarity search, to accelerate the process of generating blacklists. AutoBLG consists of three primary components: URL expansion, URL filtration, and URL verification. Through extensive analysis using a high-performance web client honeypot, we demonstrate that AutoBLG can successfully discover new and previously unknown drive-by-download URLs from the vast web space.
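The three components can be illustrated with a toy Python pipeline: URL expansion gathers candidates reachable from blacklisted seeds, a lightweight similarity pre-filter discards dissimilar candidates, and a verification oracle (standing in for the client honeypot) confirms the rest. The data structures, the string-similarity pre-filter, and the oracle interface are assumptions for illustration, not AutoBLG's actual implementation.

from difflib import SequenceMatcher

def expand(seed_urls, discovered_links):
    """URL expansion: gather candidate URLs reachable from blacklisted seeds."""
    candidates = set()
    for seed in seed_urls:
        candidates.update(discovered_links.get(seed, []))
    return candidates

def filter_candidates(candidates, seed_urls, threshold=0.6):
    """URL filtration: keep candidates structurally similar to known-bad URLs."""
    def similar(a, b):
        return SequenceMatcher(None, a, b).ratio()
    return [c for c in candidates
            if max(similar(c, s) for s in seed_urls) >= threshold]

def verify(candidates, is_malicious):
    """URL verification: confirm candidates with an analysis oracle (e.g., a honeypot)."""
    return [c for c in candidates if is_malicious(c)]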
Satoshi KAWATA Satoru SHOJI Hong-Bo SUN
Lasers have been established as a unique nanoprocessing tool owing to their intrinsic three-dimensional (3D) fabrication capability and excellent compatibility with various functional materials. Here we report two methods that have proved particularly promising for tailoring 3D photonic crystals (PhCs): pinpoint writing via two-photon photopolymerization and multibeam interferential patterning. In the two-photon fabrication, a finely quantified pixel writing scheme and a method of pre-compensating for the shrinkage induced by polymerization enable high-reproducibility, high-fidelity prototyping; well-defined diamond-lattice PhCs demonstrate the arbitrary 3D processing capability of the two-photon technology. In the interference patterning method, we propose and utilize a two-step exposure approach, which not only increases the number of achievable lattice types but also expands the freedom in tuning the lattice constant.
Taek Young YOUN Bo Sun KWAK Seungkwang LEE Hyun Sook RHEE
To support secure database management, a number of value-added encryption schemes have been studied, including order-revealing encryption (ORE) schemes. One of the outstanding features of ORE schemes is the efficiency of range queries over encrypted data. Compared with existing encryption methods, however, ORE leads to an increase in ciphertext length. To improve the efficiency of ORE schemes in terms of ciphertext length, a new ORE scheme with shorter ciphertexts was proposed by Kim. In this paper, we revisit Kim's ORE scheme and show that the ciphertext length is not as short as analyzed in that paper. We also introduce a simple modification that reduces the memory requirement compared with existing ORE schemes.
To cope with complicated interference scenarios in realistic acoustic environments, supervised deep neural networks (DNNs) have been investigated to estimate different user-defined targets. Such techniques can be broadly categorized into magnitude estimation and time-frequency mask estimation techniques. Furthermore, a mask such as the Wiener gain can be estimated directly or derived from the estimated interference power spectral density (PSD) or the estimated signal-to-interference ratio (SIR). In this paper, we propose to incorporate multi-task learning into DNN-based single-channel speech enhancement by using the speech presence probability (SPP) as a secondary target to assist the target estimation in the main task. Domain-specific information is shared between the two tasks to learn a more generalizable representation. Since the performance of a multi-task network is sensitive to the weight parameters of the loss function, the homoscedastic uncertainty is introduced to adaptively learn the weights, which is shown to outperform fixed weighting. Simulation results show that the proposed multi-task scheme improves overall speech enhancement performance compared with conventional single-task methods, and that joint direct mask and SPP estimation yields the best performance among all the considered techniques.
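A common way to realize the adaptive, uncertainty-based loss weighting mentioned above is the formulation of Kendall et al., in which a learnable log-variance per task scales each task loss. The sketch below is a generic PyTorch realization of that weighting; the network, task losses, and hyperparameters are placeholders rather than the paper's configuration.

import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Learns per-task log-variances s_i and combines losses as sum(exp(-s_i) * L_i + s_i)."""
    def __init__(self, num_tasks=2):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for i, loss in enumerate(task_losses):
            total = total + torch.exp(-self.log_vars[i]) * loss + self.log_vars[i]
        return total

# Usage: criterion = UncertaintyWeightedLoss()
#        loss = criterion([mse_loss(mask_pred, mask), bce_loss(spp_pred, spp)])
#        loss.backward()   # the log-variances are optimized jointly with the network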
The location and feature representation of an object's parts play key roles in fine-grained visual recognition. To improve the final recognition accuracy without any bounding-box/part annotations, many studies adopt object location networks to propose bounding boxes/part regions using only category labels, and then crop the images into partial images to help the classification network make the final decision. In this work, to propose more informative partial images and to effectively extract discriminative features from the original and partial images, we propose a two-stage approach that fuses the original features and partial features by evaluating and ranking the informativeness of the partial images. Experimental results show that our proposed approach achieves excellent performance on two benchmark datasets, which demonstrates its effectiveness.
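One way to picture the evaluation-and-ranking idea is the following sketch: each partial image is scored by the classifier's confidence on it, the top-ranked crops are kept, and their features are fused with the original-image feature. The confidence score and the simple average fusion are hypothetical choices for illustration, not the paper's two-stage design.

import torch
import torch.nn.functional as F

def rank_and_fuse(backbone, classifier, image, partial_images, top_k=2):
    """Return a fused feature from the original image and its most informative crops."""
    feats = [backbone(x.unsqueeze(0)).squeeze(0) for x in [image] + partial_images]
    # Informativeness score: maximum softmax probability on each partial image.
    scores = [F.softmax(classifier(f.unsqueeze(0)), dim=1).max().item()
              for f in feats[1:]]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    selected = [feats[0]] + [feats[i + 1] for i in order]
    return torch.stack(selected).mean(dim=0)        # simple average fusion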
Bing DENG Zhengbo SUN Le YANG Dexiu HU
A linear-correction method is developed for source position and velocity estimation using time difference of arrival (TDOA) and frequency difference of arrival (FDOA) measurements. The proposed technique first obtains an initial source location estimate using the first-step processing of an existing algebraic algorithm. It then refines the initial localization result by estimating its estimation error via weighted least-squares (WLS) optimization and subtracting it out. The new solution is shown to achieve the Cramer-Rao lower bound (CRLB) accuracy, and it offers better accuracy than several benchmark methods at relatively high noise levels.
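In generic form, the linear-correction step described above can be written as follows, where $\boldsymbol{\theta}_0$ is the initial position/velocity estimate, $\mathbf{r}$ is the TDOA/FDOA residual vector evaluated at $\boldsymbol{\theta}_0$, $\mathbf{G}$ is the corresponding Jacobian, and $\mathbf{W}$ is the inverse measurement covariance; the exact design matrix and weighting used in the paper may differ:

$$
\Delta\hat{\boldsymbol{\theta}} = \left(\mathbf{G}^{\mathsf{T}}\mathbf{W}\mathbf{G}\right)^{-1}\mathbf{G}^{\mathsf{T}}\mathbf{W}\,\mathbf{r},
\qquad
\hat{\boldsymbol{\theta}} = \boldsymbol{\theta}_0 - \Delta\hat{\boldsymbol{\theta}},
$$

where $\Delta\hat{\boldsymbol{\theta}}$ is the WLS estimate of the error in $\boldsymbol{\theta}_0$, which is subtracted out to obtain the refined solution $\hat{\boldsymbol{\theta}}$.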