Yuya DEGAWA Toru KOIZUMI Tomoki NAKAMURA Ryota SHIOYA Junichiro KADOMOTO Hidetsugu IRIE Shuichi SAKAI
One of the performance bottlenecks of a processor is the front-end that supplies instructions. Various techniques, such as cache replacement algorithms and hardware prefetching, have been investigated to facilitate smooth instruction supply at the front-end and to improve processor performance. In these approaches, one of the most important factors has been the reduction in the number of instruction cache misses. By using the number of instruction cache misses or derived factors, previous studies have explained the performance improvements achieved by their proposed methods. However, we found that the number of instruction cache misses does not always explain performance changes well in modern processors. This is because the front-end in modern processors handles subsequent instruction cache misses in overlap with earlier ones. Based on this observation, we propose a novel factor: the number of miss regions. We define a region as a sequence of instructions from one branch misprediction to the next, while we define a miss region as a region that contains one or more instruction cache misses. At the boundary of each region, the pipeline is flushed owing to a branch misprediction. Thus, cache misses after this boundary are not handled in overlap with cache misses before the boundary. As a result, the number of miss regions is equal to the number of cache misses that are processed without overlap. In this paper, we demonstrate that the number of miss regions can well explain the variation in performance through mathematical models and simulation results. The results show that the model explains cycles per instruction with an average error of 1.0% and maximum error of 4.1% when applying an existing prefetcher to the instruction cache. The idea of miss regions highlights that instruction cache misses and branch mispredictions interact with each other in processors with a decoupled front-end. 
We hope that considering this interaction will motivate the development of fast performance estimation methods and new microarchitectural methods.
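The definitions above (a region runs from one branch misprediction to the next, and a miss region is a region with at least one instruction cache miss) can be illustrated with a minimal sketch. The event labels and the flat trace format below are hypothetical and chosen only for illustration; they are not taken from the paper's simulator.

```python
from typing import Iterable, Tuple

# Hypothetical event labels; a real trace format will differ.
MISPREDICT = "branch_mispredict"
ICACHE_MISS = "icache_miss"


def count_misses_and_miss_regions(events: Iterable[str]) -> Tuple[int, int]:
    """Return (instruction cache misses, miss regions) for an event trace.

    A region runs from one branch misprediction to the next; a miss region
    is a region that contains at least one instruction cache miss.
    """
    misses = 0
    miss_regions = 0
    region_has_miss = False
    for event in events:
        if event == ICACHE_MISS:
            misses += 1
            if not region_has_miss:
                # First miss in this region: the region becomes a miss region.
                region_has_miss = True
                miss_regions += 1
        elif event == MISPREDICT:
            # The pipeline flush closes the current region.
            region_has_miss = False
    return misses, miss_regions


trace = [
    ICACHE_MISS, ICACHE_MISS, MISPREDICT,   # region 1: two misses
    "other", MISPREDICT,                    # region 2: no miss
    ICACHE_MISS, ICACHE_MISS, ICACHE_MISS,  # region 3: three misses
]
print(count_misses_and_miss_regions(trace))  # (5, 2)
```

In this toy trace, five misses collapse into two miss regions, which is the kind of gap between the two counts that the abstract argues matters for performance.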
Shunsuke TSUKADA Hikaru TAKAYASHIKI Masayuki SATO Kazuhiko KOMATSU Hiroaki KOBAYASHI
A hybrid memory architecture (HMA) that consists of several distinct memory devices is expected to achieve a good balance between high performance and large capacity. Unlike conventional memory architectures, the HMA needs metadata for data management since data are migrated between the memory devices during the execution of an application. The memory controller caches the metadata to avoid accessing the memory devices for metadata references. However, as the amount of metadata increases in proportion to the size of the HMA, the memory controller needs to handle a large amount of metadata. As a result, the memory controller cannot cache all the metadata, and the number of metadata references to the memory devices increases. This increases the access latency to reach the target data and degrades performance. To solve this problem, this paper proposes a metadata prefetching mechanism for HMAs. The proposed mechanism loads the metadata needed in the near future by prefetching. Moreover, to increase the effect of the metadata prefetching, the proposed mechanism predicts the metadata to be used in the near future based on the address difference, that is, the difference between two consecutive access addresses. The evaluation results show that the proposed metadata prefetching mechanism improves the instructions per cycle by up to 44% and by 9% on average.
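As a rough illustration of prediction from the difference between two consecutive access addresses, the sketch below prefetches the metadata entry for `addr + (addr - prev_addr)`. The class name, the 4 KiB entry granularity, and the unbounded metadata cache are all assumptions made for brevity; the paper's actual mechanism and cache organization are not described in the abstract.

```python
from typing import Optional, Set

PAGE_SIZE = 4096  # assumed: one metadata entry covers one 4 KiB page


class DeltaMetadataPrefetcher:
    """Toy model: prefetch the metadata entry at (address + last delta)."""

    def __init__(self) -> None:
        self.prev_addr: Optional[int] = None
        # Metadata entries held by the controller. Unbounded here for
        # simplicity; a real controller cache has limited capacity.
        self.cached_entries: Set[int] = set()
        self.metadata_misses = 0

    def access(self, addr: int) -> None:
        entry = addr // PAGE_SIZE
        if entry not in self.cached_entries:
            # Demand reference to metadata stored in the memory devices.
            self.metadata_misses += 1
            self.cached_entries.add(entry)
        if self.prev_addr is not None:
            delta = addr - self.prev_addr
            predicted = addr + delta
            if predicted >= 0:
                # Prefetch the metadata entry for the predicted address.
                self.cached_entries.add(predicted // PAGE_SIZE)
        self.prev_addr = addr


prefetcher = DeltaMetadataPrefetcher()
for address in range(0, 5 * 8192, 8192):  # constant stride of 8 KiB
    prefetcher.access(address)
print(prefetcher.metadata_misses)  # 2
```

With a constant stride, only the first two accesses miss, because no delta is available until the second access.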
The 2020 International Conference on Emerging Technologies for Communications (ICETC2020) was held online on December 2-4, 2020, and 213 research papers were accepted and presented in its sessions. It is expected that the accepted papers will contribute to the development and extension of research in multiple research areas. In this survey paper, all accepted research papers are classified into four research areas: Physical & Fundamental, Communications, Network, and Information Technology & Application, and then further classified by research topic. For each research area and topic, this survey paper briefly introduces the presented technologies and methods.
Lin CAO Kaixuan LI Kangning DU Yanan GUO Peiran SONG Tao WANG Chong FU
Face sketch synthesis refers to transforming facial photos into sketches. Recent research on face sketch synthesis has achieved great success due to the development of Generative Adversarial Networks (GANs). However, these generative methods are prone to neglecting detailed information and thus lose some individual-specific features, such as glasses and headdresses. In this paper, we propose a novel method called Feature Learning Generative Adversarial Network (FL-GAN) to synthesize detail-preserving, high-quality sketches. Specifically, the proposed FL-GAN consists of one Feature Learning (FL) module and one Adversarial Learning (AL) module. The FL module aims to learn the detailed information of the image in a latent space and to guide the AL module to synthesize detail-preserving sketches. The AL module aims to learn the structure and texture of sketches and to improve the quality of the synthetic sketches through an adversarial learning strategy. Quantitative and qualitative comparisons with seven state-of-the-art methods (LLE, MRF, MWF, RSLCR, RL, FCN, and GAN) on four facial sketch datasets demonstrate the superiority of this method.
Lin CAO Xibao HUO Yanan GUO Kangning DU
Sketch face recognition refers to matching photos with sketches and has been used effectively in applications ranging from law enforcement to digital entertainment. However, due to the large modality gap between photos and sketches, sketch face recognition remains a challenging task. To reduce the domain gap between sketches and photos, this paper proposes a cascaded transformation generation network that performs cross-modality image generation and sketch face recognition simultaneously. The proposed network is composed of a generation module, a cascaded feature transformation module, and a classifier module. The generation module aims to generate a high-quality cross-modality image, the cascaded feature transformation module extracts high-level semantic features for both generation and recognition, and the classifier module performs sketch face recognition. The proposed network is trained in an end-to-end manner, and it improves recognition accuracy by means of the generated images. The recognition performance is verified on the UoM-SGFSv2, e-PRIP, and CUFSF datasets; experimental results show that the proposed method outperforms other state-of-the-art methods.
Cache prefetching brings large performance benefits, but it comes at the cost of microarchitectural security in processors. In this letter, we take a deep dive into the internal workings of the DCUIP prefetcher, which is one of the prefetchers in Intel processors. We discover that the DCUIP table is shared among different execution contexts in hyperthreading-enabled processors, which leads to another microarchitectural vulnerability. By exploiting this vulnerability, we propose a DCUIP poisoning attack. We demonstrate that an AES encryption key can be extracted from an AES-NI implementation by mounting the proposed attack.
Jianli CAO Zhikui CHEN Yuxin WANG He GUO Pengcheng WANG
Like many processors, GPGPUs suffer from the memory wall. The traditional solution to this issue is to use efficient schedulers to hide long memory access latency or to use a data prefetch mechanism to reduce the latency caused by data transfer. In this paper, we study the instruction fetch stage of the GPU pipeline and analyze the relationship between the capacity of a GPU kernel and the instruction miss rate. We improve the next-line prefetch mechanism to fit the SIMT model of GPUs and determine the optimal parameters of the prefetch mechanism on a GPU through experiments. The experimental results show that the prefetch mechanism achieves a 12.17% performance improvement on average. Compared with the solution of enlarging the I-Cache, the prefetch mechanism has the advantages of more beneficiaries and lower cost.
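For readers unfamiliar with next-line prefetching, the following sketch models a conventional (non-SIMT) version: on each demand miss, the following `degree` sequential lines are also brought into an LRU instruction cache. The line size, the `degree` parameter, and the function name are assumptions for illustration; the SIMT-specific improvements from the paper are not modeled.

```python
from collections import OrderedDict
from typing import Iterable

LINE_SIZE = 128  # assumed instruction cache line size in bytes


def count_icache_misses(pcs: Iterable[int], capacity_lines: int, degree: int) -> int:
    """Count demand misses of an LRU instruction cache with next-line prefetch.

    degree is the number of sequential lines prefetched after each demand
    miss; degree=0 disables prefetching.
    """
    cache = OrderedDict()  # line index -> None, ordered from LRU to MRU

    def insert(line: int) -> None:
        cache[line] = None
        cache.move_to_end(line)
        while len(cache) > capacity_lines:
            cache.popitem(last=False)  # evict the least recently used line

    misses = 0
    for pc in pcs:
        line = pc // LINE_SIZE
        if line in cache:
            cache.move_to_end(line)  # hit: refresh recency only
            continue
        misses += 1
        insert(line)
        for k in range(1, degree + 1):
            insert(line + k)  # prefetch the next sequential lines
    return misses


# Straight-line code: 64 lines of 8-byte instructions, fetched once.
pcs = range(0, 64 * LINE_SIZE, 8)
print(count_icache_misses(pcs, capacity_lines=32, degree=0))  # 64
print(count_icache_misses(pcs, capacity_lines=32, degree=1))  # 32
```

In this toy model, prefetching is triggered only on demand misses, so a degree of 1 halves the misses for purely sequential code.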
Makoto NAKAMURA Hiroaki NISHIUCHI Jin NAKAZATO Konstantin KOSLOWSKI Julian DAUBE Ricardo SANTOS Gia Khanh TRAN Kei SAKAGUCHI
In this paper, a Proof-of-Concept (PoC) architecture is constructed, and the effectiveness of a mmWave overlay heterogeneous network (HetNet) with a mesh backhaul utilizing route-multiplexing and Multi-access Edge Computing (MEC) utilizing a prefetching algorithm is verified by measuring the throughput and the download time of real contents. The architecture can cope with intensive mobile data traffic since data delivery utilizes multiple backhaul routes based on the mesh topology, i.e., the route-multiplexing mechanism. In addition, based on pre-registered context information such as location, destination, and demanded application, MEC deploys the contents requested by nearby User Equipment (UE) to the network edge in advance, which is called the prefetching algorithm. Therefore, mmWave access can be fully exploited even with capacity-limited backhaul networks by introducing the proposed algorithm. These technologies solve the problems of conventional mmWave HetNets and reduce mobile data traffic on the backhaul networks to cloud networks. In addition, the proposed architecture is realized by introducing wireless Software Defined Network (SDN) and Network Function Virtualization (NFV). In our architecture, the network is dynamically controlled via wide-coverage microwave band links, by which the UE's context information is collected for optimizing the network resources and controlling network infrastructures to establish backhaul routes and MEC servers. In this paper, we develop the hardware equipment and middleware systems and introduce these algorithms, which are used as a driver of IEEE802.11ad and open source software. For 5G and beyond, the architecture integrating mmWave backhaul, MEC, and SDN/NFV will support some scenarios and use cases.
In this paper, we propose L0 norm optimization in a scrambled sparse representation domain and its application to an Encryption-then-Compression (EtC) system. We design a random unitary transform that conserves L0 norm isometry. The resulting encryption method provides a practical orthogonal matching pursuit (OMP) algorithm that allows computation in the encrypted domain. We prove that the proposed method theoretically has exactly the same estimation performance as the nonencrypted variant of the OMP algorithm. In addition, we demonstrate the security strength of the proposed secure sparse representation when applied to the EtC system. Even if the dictionary information is leaked, the proposed scheme protects the privacy information of observed signals.
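One way to see why OMP can run unchanged in a transformed domain is that an orthogonal transform applied to both the dictionary atoms and the signal preserves every inner product, so the atom-selection step picks the same index. The sketch below checks this for a key-seeded signed permutation, which is only one simple instance of an orthogonal matrix and is an assumption here; the paper's random unitary transform is not specified in the abstract. Integer data are used so that the comparison is exact.

```python
import random
from typing import List, Sequence, Tuple


def make_signed_permutation(n: int, key: int) -> Tuple[List[int], List[int]]:
    """Key-seeded signed permutation: a simple orthogonal transform."""
    rng = random.Random(key)
    perm = list(range(n))
    rng.shuffle(perm)
    signs = [rng.choice((-1, 1)) for _ in range(n)]
    return perm, signs


def apply_transform(x: Sequence[int], perm: Sequence[int], signs: Sequence[int]) -> List[int]:
    # y[i] = signs[i] * x[perm[i]]
    return [s * x[p] for p, s in zip(perm, signs)]


def dot(a: Sequence[int], b: Sequence[int]) -> int:
    return sum(x * y for x, y in zip(a, b))


def best_atom(dictionary: Sequence[Sequence[int]], residual: Sequence[int]) -> int:
    """OMP selection step: index of the atom with the largest |<atom, residual>|."""
    scores = [abs(dot(atom, residual)) for atom in dictionary]
    return max(range(len(scores)), key=scores.__getitem__)


dictionary = [[1, 0, 2, 0], [0, 3, 0, 1], [2, 2, 0, 0]]  # three atoms in R^4
signal = [1, 4, 0, 2]

perm, signs = make_signed_permutation(4, key=2024)
enc_dictionary = [apply_transform(atom, perm, signs) for atom in dictionary]
enc_signal = apply_transform(signal, perm, signs)

print(best_atom(dictionary, signal) == best_atom(enc_dictionary, enc_signal))  # True
```

Only the selection step is shown; a full OMP loop would also solve a least-squares problem at each iteration, which is likewise unaffected by an orthogonal change of basis.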
Takahide ITO Yuichi NAKAMURA Kazuaki KONDO Espen KNOOP Jonathan ROSSITER
This paper introduces a novel skin-stretcher device for gently urging head rotation. The device pulls and/or pushes the skin on the user's neck by using servo motors. The user is induced to rotate his/her head based on the sensation caused by the local stretching of skin. This mechanism informs the user when and how much head rotation is requested; however, it does not force head rotation, i.e., it allows the user to ignore the stimuli and to maintain voluntary movements. We implemented a prototype device and analyzed the performance of the skin stretcher as a human-in-the-loop system. Experimental results define its fundamental characteristics, such as input-output gain, settling time, and other dynamic behaviors. These features are analyzed; for example, the input-output gain is stable within the same installation condition but varies between users.
Yujian FENG Fei WU Yimu JI Xiao-Yuan JING Jian YU
Sketch face recognition aims to match sketch face images to photo face images. The main challenge of sketch face recognition is learning discriminative feature representations to ensure intra-class compactness and inter-class separability. However, traditional sketch face recognition methods encourage samples with the same identity to move closer and samples with different identities to move farther apart, but they do not consider the intra-class compactness of samples. In this paper, we propose a triplet-margin-center loss that addresses this problem by combining the triplet loss and the center loss. The triplet-margin-center loss enlarges the distance between inter-class samples and reduces intra-class sample variations simultaneously, thereby improving intra-class compactness. Moreover, the triplet-margin-center loss applies a hard triplet sample selection strategy, which aims to select hard samples effectively to avoid an unstable training phase and slow convergence. With our approach, photo and sketch samples of the same identity are closer, and photo and sketch samples of different identities are farther apart in the projected space. In extensive experiments and comparisons with state-of-the-art methods, our approach achieves marked improvements in most cases.
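The abstract does not give the exact form of the triplet-margin-center loss, so the sketch below only shows how a standard triplet loss and a standard center loss could be summed. The weighting factor `weight`, the function names, and the choice to apply the center term to the anchor and the positive are assumptions for illustration, not the authors' formulation; hard-sample selection is omitted.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor: Vector, positive: Vector, negative: Vector, margin: float) -> float:
    """Standard triplet loss: push the negative at least `margin` farther than the positive."""
    d_pos = math.sqrt(squared_distance(anchor, positive))
    d_neg = math.sqrt(squared_distance(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)


def center_loss(feature: Vector, center: Vector) -> float:
    """Standard center loss: pull a feature toward its class center."""
    return 0.5 * squared_distance(feature, center)


def combined_loss(anchor: Vector, positive: Vector, negative: Vector,
                  center: Vector, margin: float = 1.0, weight: float = 0.5) -> float:
    # Assumed combination: triplet term plus a weighted center term on the
    # two same-identity samples (anchor and positive).
    pull = center_loss(anchor, center) + center_loss(positive, center)
    return triplet_loss(anchor, positive, negative, margin) + weight * pull


loss = combined_loss(
    anchor=(0.0, 0.0), positive=(0.0, 3.0), negative=(4.0, 0.0),
    center=(0.0, 1.0), margin=2.0, weight=0.5,
)
print(loss)  # 2.25
```

Here the triplet term contributes 3 - 4 + 2 = 1, and the center term contributes 0.5 × (0.5 + 2.0) = 1.25.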
In this paper, we propose a secure computation of sparse coding and its application to Encryption-then-Compression (EtC) systems. The proposed scheme introduces secure sparse coding that allows computation of an Orthogonal Matching Pursuit (OMP) algorithm in an encrypted domain. We prove theoretically that the proposed method estimates exactly the same sparse representations that the OMP algorithm for non-encrypted computation does. This means that there is no degradation of the sparse representation performance. Furthermore, the proposed method can control the sparsity without decoding the encrypted signals. Next, we propose an EtC system based on the secure sparse coding. The proposed secure EtC system can protect the private information of the original image contents while performing image compression. It provides the same rate-distortion performance as that of sparse coding without encryption, as demonstrated on both synthetic data and natural images.
Yosei SHIBATA Ryosuke SAITO Takahiro ISHINABE Hideo FUJIKAKE
In this study, we examined the mechanical durability and self-recovery characteristics of liquid crystal (LC) gel films with a lysine-based gelator. The results indicated that the structural destruction in LC gel films is attributed to dissociation of the network structure. The cracked LC gel films can be recovered by forming sol-state films.
Kanghee KIM Wooseok LEE Sangbang CHOI
Hardware prefetching involves a sophisticated balance between accuracy, coverage, and timeliness while minimizing hardware cost. Recent prefetchers have achieved these goals, but they still require complex hardware and a significant amount of storage. In this paper, we propose an efficient Per-page Most-Offset Prefetcher (PMOP) that minimizes hardware cost and simultaneously improves accuracy while maintaining coverage and timeliness. We achieve these objectives using an enhanced offset prefetcher that performs well with a reasonable hardware cost. Our approach first addresses coverage and timeliness by allowing multiple Most-Offset predictions. To minimize offset interference between pages, the PMOP leverages a fine-grain per-page offset filter. This filter records the access history with page-IDs, which enables efficient mapping and tracking of multiple offset streams from diverse pages. Analysis results show that PMOP outperforms the state-of-the-art Signature Path Prefetcher while reducing storage overhead by a factor of 3.4.
Shun-ichiro OHMI Yuya TSUKAMOTO Rengie Mark D. MAILIG
In this paper, we have investigated the etching selectivity of HfN encapsulating layer for high quality PtHf-alloy silicide (PtHfSi) formation with low contact resistivity on Si(100). The HfN(10 nm)/PtHf(20 nm)/p-Si(100) stacked layer was in-situ deposited by RF-magnetron sputtering at room temperature. Then, silicidation was carried out at 500°C/20 min in N₂/4.9%H₂ ambient. Next, the HfN encapsulating layer was etched for 1-10 min by buffered-HF (BHF) followed by the unreacted PtHf metal etching. We have found that the etching duration of the 10-nm-thick HfN encapsulating layer should be shorter than 6 min to maintain the PtHfSi crystallinity. This is probably because the PtHf-alloy silicide was gradually etched by BHF especially for the Hf atoms after the HfN was completely removed. The optimized etching process realized the ultra-low contact resistivity of PtHfSi to p⁺/n-Si(100) and n⁺/p-Si(100) such as 9.4×10⁻⁹ Ω·cm² and 4.8×10⁻⁹ Ω·cm², respectively, utilizing the dopant segregation process. The control of etching duration of HfN encapsulating layer is important to realize the high quality PtHfSi formation with low contact resistivity.
This letter proposes a new face sketch recognition method. Given a query sketch and face photos in a database, the proposed method first synthesizes pseudo sketches by computing the locality sensitive histogram and dense illumination invariant features from the resized face photos, then extracts discriminative features by computing the histogram of averaged oriented gradients on the query sketch and pseudo sketches, and finally finds a match with the shortest cosine distance in the feature space. It achieves accuracy comparable to the state of the art while showing much more robustness than existing face sketch recognition methods.
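The final matching step (choosing the database entry with the shortest cosine distance to the query) can be sketched as follows. The feature vectors and identity labels are made up for illustration; in the method described above they would be the histogram of averaged oriented gradients features, which are not computed here.

```python
import math
from typing import Dict, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0 means the vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def best_match(query: Sequence[float], gallery: Dict[str, Sequence[float]]) -> str:
    """Return the gallery label whose feature has the shortest cosine distance."""
    return min(gallery, key=lambda label: cosine_distance(query, gallery[label]))


# Hypothetical pseudo-sketch features keyed by identity label.
gallery = {
    "person_a": [1.0, 0.0, 0.0],
    "person_b": [0.6, 0.8, 0.0],
    "person_c": [0.0, 0.0, 1.0],
}
query = [0.5, 0.9, 0.1]
print(best_match(query, gallery))  # person_b
```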
Tatsuya CHUMAN Kenta IIDA Warit SIRICHOTEDUMRONG Hitoshi KIYA
Encryption-then-Compression (EtC) systems have been proposed to securely transmit images through an untrusted channel provider. In this study, EtC systems were applied to social media like Twitter that carry out image manipulations. The block scrambling-based encryption schemes used in EtC systems were evaluated in terms of their robustness against image manipulation on social media. The aim was to investigate how five social networking service (SNS) providers, Facebook, Twitter, Google+, Tumblr and Flickr, manipulate images and to determine whether the encrypted images uploaded to SNS providers can avoid being distorted by such manipulations. In an experiment, encrypted and non-encrypted JPEG images were uploaded to various SNS providers. The results show that EtC systems are applicable to the five SNS providers.
Encryption-then-Compression (EtC) systems have been considered for the user-controllable privacy protection of social media like Twitter. The aim of this paper is to evaluate the security of block scrambling-based encryption schemes, which have been proposed to construct EtC systems. Even though these schemes have a key space large enough to resist brute-force attacks, each block in an encrypted image has almost the same correlation as that of the original image. Therefore, their security must be considered from viewpoints different from those of number theory-based encryption methods with provable security, such as RSA and AES. In this paper, we evaluate the security of encrypted images, including those with JPEG distortion, by using automatic jigsaw puzzle solvers.
Ryosuke SAITO Yosei SHIBATA Takahiro ISHINABE Hideo FUJIKAKE
In this study, we evaluated the electro-optical characteristics and the structural stability in the curved state of dye-doped liquid crystal (LC) gel films for stretchable displays. As a result, a maximum contrast ratio of 6.7:1 and suppression of LC flow were achieved by optimizing blend conditions such as the gelator and dye concentrations.
Tatsuya CHUMAN Kenta KURIHARA Hitoshi KIYA
The aim of this paper is to apply automatic jigsaw puzzle solvers, which are methods of assembling jigsaw puzzles, to the field of information security. Encryption-then-Compression (EtC) systems have been considered for the user-controllable privacy protection of digital images in social network services. Block scrambling-based encryption schemes, which have been proposed to construct EtC systems, have a key space large enough to protect against brute-force attacks. However, each block in an encrypted image has almost the same correlation as that of the original image. Therefore, their security must be considered from viewpoints different from those of number theory-based encryption methods with provable security, such as RSA and AES. In this paper, existing jigsaw puzzle solvers, which aim to assemble puzzles including only scrambled and rotated pieces, are first reviewed in terms of attacking strategies on encrypted images. Then, an extended jigsaw puzzle solver for block scrambling-based encryption schemes is proposed to solve encrypted images including inverted, negative-positive transformed, and color component shuffled blocks in addition to scrambled and rotated ones. In the experiments, the jigsaw puzzle solvers are applied to encrypted images to consider the security conditions of the encryption schemes.
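To make the block operations concrete, the sketch below applies two of the steps named above (key-dependent block scrambling and the negative-positive transform) to a grayscale image stored as a list of rows, and then inverts them. The function names, the use of Python's `random.Random` as the key stream, and the restriction to grayscale are assumptions for illustration; rotation, inversion, and color component shuffling are omitted, and this is not a secure implementation.

```python
import random
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixels in 0..255; sides are multiples of the block size


def split_blocks(img: Image, bs: int) -> List[Image]:
    h, w = len(img), len(img[0])
    return [[row[bx:bx + bs] for row in img[by:by + bs]]
            for by in range(0, h, bs) for bx in range(0, w, bs)]


def join_blocks(blocks: List[Image], h: int, w: int, bs: int) -> Image:
    img = [[0] * w for _ in range(h)]
    per_row = w // bs
    for idx, block in enumerate(blocks):
        by, bx = divmod(idx, per_row)
        for dy, row in enumerate(block):
            img[by * bs + dy][bx * bs:bx * bs + bs] = row
    return img


def key_stream(n: int, key: int) -> Tuple[List[int], List[bool]]:
    """Derive a block permutation and per-block negation flags from the key."""
    rng = random.Random(key)
    perm = list(range(n))
    rng.shuffle(perm)
    flips = [rng.random() < 0.5 for _ in range(n)]
    return perm, flips


def negate(block: Image) -> Image:
    return [[255 - p for p in row] for row in block]  # negative-positive transform


def encrypt(img: Image, bs: int, key: int) -> Image:
    blocks = split_blocks(img, bs)
    perm, flips = key_stream(len(blocks), key)
    out = [negate(blocks[perm[d]]) if flips[d] else blocks[perm[d]]
           for d in range(len(blocks))]
    return join_blocks(out, len(img), len(img[0]), bs)


def decrypt(img: Image, bs: int, key: int) -> Image:
    blocks = split_blocks(img, bs)
    perm, flips = key_stream(len(blocks), key)
    out: List[Image] = [[] for _ in blocks]
    for d, block in enumerate(blocks):
        out[perm[d]] = negate(block) if flips[d] else block
    return join_blocks(out, len(img), len(img[0]), bs)


image = [[16 * y + x for x in range(4)] for y in range(4)]  # 4x4 test image
cipher = encrypt(image, bs=2, key=7)
print(decrypt(cipher, bs=2, key=7) == image)  # True
```

Because each block keeps its internal pixel correlation (negation aside), reassembling the blocks resembles solving a jigsaw puzzle, which is the attack surface the abstract describes.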