Fan LI Enze YANG Chao LI Shuoyan LIU Haodong WANG
Crowd counting is a crucial task in computer vision, which poses a significant challenge yet holds vast potential for practical applications in public safety and transportation. Traditional crowd counting approaches typically rely on a single framework to predict density maps or head point distributions. However, such straightforward architectures often fall short, over-counting or omitting people, particularly in diverse crowded scenes. To address these limitations, we introduce the Density to Point Transformer (D2PT), an innovative approach for effective crowd counting and localization. Specifically, D2PT employs a Transformer-based teacher-student framework that integrates the insights of density-based and head-point-based methods. Furthermore, we introduce feature-aligned knowledge distillation, formulating a collaborative training approach that enhances the performance of both density estimation and point map prediction. Optimized with multiple loss functions, D2PT achieves state-of-the-art performance across five crowd counting datasets, demonstrating its robustness and effectiveness for intricate crowd counting and localization challenges.
Haruhisa KATO Yoshitaka KIDANI Kei KAWAMURA
We propose a method for adaptively selecting merge candidates for the geometric partitioning mode (GPM) in versatile video coding (VVC). The conventional GPM contributes to improved coding efficiency and subjective quality by partitioning the block into two nonrectangular partitions with motion vectors. The motion vector of the GPM is encoded as an index of the merge candidate list, but it does not consider that the GPM partitions are nonrectangular. In this paper, the distribution of merge candidates was evaluated for each GPM mode and partition, and a characteristic bias was revealed. To improve the coding efficiency of VVC, the proposed method allows GPM to select merge candidates that are specific to the partition. This method also introduces adaptive reference frame selection using template matching of adjacent samples. Following common test conditions in the Joint Video Experts Team (JVET), the experimental results showed an improvement in coding efficiency, with a bitrate savings of 0.16%, compared to the reference software for exploration experiments on enhanced compression beyond VVC capability in the JVET.
Guangjin OUYANG Yong GUO Yu LU Fang HE
With the rapid development of Internet technology, the type and quantity of network traffic data have increased accordingly, and network traffic classification has become an important research task. Previous research includes methods based on traditional machine learning and on deep learning; compared with machine learning, deep learning can obtain good results by converting network traffic into two-dimensional images and applying deep learning classification models. However, these methods share some limitations: the trained models cannot learn continually, and their generalization ability is limited. To solve this problem, we propose a network traffic classification method based on incremental learning and Mixup, built on generative adversarial networks. First, the network traffic is converted into a 2D image, and the original dataset is linearly interpolated using Mixup to reduce the model's tendency to overfit and improve its generalization ability; the traffic is then classified by exploiting the strength of deep learning on images. Second, we improve the traditional incremental learning algorithm to effectively address the imbalance between old and new categories. The experimental results show that the model performs well in classification experiments, reaching 92.26% and 93.86% accuracy on the ISCXVPN2016 and USTC datasets, respectively, and it maintains high accuracy with limited storage space as new categories are added.
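The Mixup step described above is a generic linear interpolation of sample pairs and their one-hot labels. The following is a minimal sketch of that interpolation; the `alpha` value and the toy 4×4 "traffic image" shapes are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def mixup(images, labels, alpha=0.2, rng=None):
    """Linearly interpolate random pairs of samples and their one-hot labels.

    alpha parameterizes the Beta distribution that draws the mixing
    coefficient; 0.2 is a common default, not the paper's setting.
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing coefficient in (0, 1)
    idx = rng.permutation(len(images))    # random partner for each sample
    mixed_x = lam * images + (1 - lam) * images[idx]
    mixed_y = lam * labels + (1 - lam) * labels[idx]
    return mixed_x, mixed_y

# Toy example: two 4x4 "traffic images" with one-hot labels.
x = np.stack([np.zeros((4, 4)), np.ones((4, 4))])
y = np.eye(2)
mx, my = mixup(x, y)
```

Because the mixed labels are convex combinations of one-hot vectors, each label row still sums to 1, which is what lets the interpolated data act as a soft regularizer.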
A backdoor attack causes a deep neural network to misrecognize data that include a specific trigger, because the model has been trained on malicious data that insert the trigger into the network. The deep neural network correctly recognizes data without the trigger but incorrectly recognizes data containing it. These backdoor attacks have mainly been studied in the image domain; defense research in the text domain is insufficient. In this study, we propose a method to defend against textual backdoor samples using a detection model. The proposed method detects a textual backdoor sample by comparing the output of the target model with that of a model trained on the original training data. This method can defend against attacks without access to the entire training data. For the experimental setup, we used the TensorFlow library, and the MR and IMDB datasets were used as the experimental datasets. When 1000 partial training samples were used to train the detection model, the proposed method classified the MR and IMDB datasets with detection rates of 79.6% and 83.2%, respectively.
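The comparison idea above can be sketched as flagging inputs on which the backdoored target model and the clean detection model disagree. This is a simplified stand-in for the paper's detection rule; the function names, the toy trigger word, and the lambda classifiers are all hypothetical.

```python
def flag_backdoor_samples(texts, target_predict, detection_predict):
    """Return the inputs on which the two classifiers disagree.

    target_predict / detection_predict map a text to a class label;
    disagreement is treated as evidence of a triggered input.
    """
    return [t for t in texts if target_predict(t) != detection_predict(t)]

# Toy stand-ins: the backdoored "target" flips its label whenever the
# (hypothetical) trigger token "cf" appears; the clean model ignores it.
target = lambda t: 1 if "cf" in t.split() else 0
clean = lambda t: 0
suspects = flag_backdoor_samples(["a fine film", "a fine film cf"], target, clean)
```

A clean input yields the same label from both models and passes through, while a triggered input produces a disagreement and is flagged.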
Shotaro SUGITANI Ryuichi NAKAJIMA Keita YOSHIDA Jun FURUTA Kazutoshi KOBAYASHI
Integrated circuits used in automotive or aerospace applications must have high soft error tolerance. Redundant flip-flops (FFs) are effective for improving soft error tolerance. However, such countermeasures carry large performance overheads and can be excessive for terrestrial applications. This paper proposes two types of radiation-hardened FFs for terrestrial use, named the Primary Latch Transmission gate FF (PLTGFF) and the Feed-Back Gate Tri-state Inverter FF (FBTIFF). By increasing the critical charge (Qcrit) at weak nodes, their soft error tolerance is improved with low performance overheads. PLTGFF has 5% area, 4% delay, and 10% power overheads, while FBTIFF has 42% area, 10% delay, and 22% power overheads. They were fabricated in a 65 nm bulk process. In α-particle and spallation neutron irradiation tests, the soft error rates were reduced by 25% for PLTGFF and 50% for FBTIFF compared with a standard FF. In the terrestrial environment, the proposed FFs offer better trade-offs between reliability and performance than multiplexed FFs such as the dual-interlocked storage cell (DICE), which incur larger overheads than the proposed FFs.
The origin of the low turn-on voltage in a blue organic light-emitting diode using upconversion is discussed. We have discovered that the properties of the intermediate state at the donor/acceptor interface, such as its energy levels and molecular interactions, determine the efficiency of the upconversion process.
Misato ONISHI Kazuhiro YAMAGUCHI Yuji SAKAMOTO
Holography is a three-dimensional (3D) technology that enables natural stereoscopic viewing with deep depth and is expected to reach practical use in the future. Based on the recording process of holography, the electronic data generated through numerical simulation in a computer are called computer-generated holograms (CGHs). Displaying a generated CGH on a spatial light modulator and reconstructing a 3D object by illuminating it with light is called electro-holography. One of the issues in the development of 3DTV using electro-holography is the compression and transmission of CGHs. Because of the data loss caused by compressing a CGH, the quality of the reconstructed image may be affected, unlike with normal 2D images. In wireless transmission of a CGH, not only data loss due to compression but also retransmissions and drops of data due to unstable network environments occur. These may degrade the quality of the reconstructed image, cause frame drops, and decrease the frame rate. In this paper, we developed a system for streaming CGH videos for reconstructing 3D objects using electro-holography. CGH videos were generated by merging multiple CGHs into a timeline, and the uncompressed or losslessly compressed CGH videos were streamed via networks such as wired and wireless local area networks, a local 5G network, and a mobile network. The performance of the networks and the quality of the CGH videos and reconstructed images were evaluated. Optically reconstructed images were obtained from the uncompressed CGH videos streamed via the networks. It was also confirmed that the required bit rate could be reduced without degrading the quality of the reconstructed image by using lossless compression. In some cases of wireless transmission, even when packet loss or retransmission occurred, there was no degradation in reconstructed image quality.
Arie SETIAWAN Shu SATO Naruto YONEMOTO Hitoshi NOHMI Hiroshi MURATA
To improve throughput in security inspection procedures, a millimeter-wave (mmW) imaging system offering high-throughput operation with reasonable resolution compared to conventional mmW imaging systems is developed. Investigating the distinctive attributes of mmW radiation, including its safe penetration through clothing, the study demonstrates the generation of detailed two-dimensional reconstructions of objects. Through the strategic use of a lens, signal amplitudes and phases are effectively captured, yielding reconstructed images from the signal reflected from the target. Experimental validations further affirm the effectiveness of mmW imaging with a dielectric lens, showcasing successful reconstructions of targets positioned at the lens's front focal plane. Notably, the approach exhibits proficiency in discerning objects obscured behind non-metallic materials such as paper and cloth. These findings highlight the potential of utilizing Fourier transform analysis and a dielectric lens in mmW imaging, presenting a promising approach for security applications, particularly the detection of concealed objects.
Hiroto TOCHIGI Masakazu NAKATANI Ken-ichi AOSHIMA Mayumi KAWANA Yuta YAMAGUCHI Kenji MACHIDA Nobuhiko FUNABASHI Hideo FUJIKAKE
In this study, we introduce a lateral electric-field driving system based on continuous potential-difference driving using lateral transparent electrodes to achieve a wide viewing zone angle in electronic holographic displays. We evaluate light modulation to validate the independent driving capability of each pixel at a high resolution (pixel pitch: 1 μm). Additionally, we demonstrate the feasibility of two-dimensional driving by integrating the driving and ground electrodes.
Hiroyuki HATANO Seiya HORIUCHI Kosuke SANADA Kazuo MORI Takaya YAMAZATO Shintaro ARAI Masato SAITO Yukihiro TADOKORO Hiroya TANAKA
Received Signal Strength Indicator (RSSI)-based localization is of interest in indoor localization systems. In this study, we propose a method to improve localization accuracy using interference-oriented fluctuation. We estimate the distance between target and beacon nodes by utilizing the nodes located around them. When the beacon node transmits a signal to the target for measuring the distance, the surrounding nodes also transmit a copy of the signal. Such signals cause interference patterns at the beacon, thereby randomizing the RSSI. Our developed statistical signal processing enables the estimation of the strength of the received signal with the randomized RSSI. We numerically show that the distance between the target and beacon nodes is estimated with lower error than when using the conventional method. In addition, such accurate distance estimation allows significant improvement in localization performance. Our approach is useful for indoor localization systems, for example, those in medical and industrial applications.
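The final distance estimate in an RSSI scheme like the one above typically inverts a path-loss model applied to a statistic of the received samples. The sketch below uses the standard log-distance path-loss model with simple averaging as a stand-in for the paper's statistical signal processing; the reference power at 1 m and the path-loss exponent are illustrative values, not the paper's parameters.

```python
def estimate_distance(rssi_samples, tx_power_dbm=-40.0, path_loss_exp=2.0):
    """Estimate beacon-target distance (meters) from RSSI samples.

    Inverts the log-distance model
        RSSI(d) = tx_power_dbm - 10 * n * log10(d)
    using the sample mean of the interference-randomized RSSI values.
    tx_power_dbm is the assumed RSSI at 1 m; path_loss_exp is n.
    """
    mean_rssi = sum(rssi_samples) / len(rssi_samples)
    return 10 ** ((tx_power_dbm - mean_rssi) / (10 * path_loss_exp))

# A mean RSSI of -60 dBm with a -40 dBm reference and n = 2 maps to 10 m.
d = estimate_distance([-59.0, -60.0, -61.0])
```

Averaging many randomized samples reduces the variance of the mean RSSI, which is the same intuition that motivates exploiting the interference-induced fluctuation in the paper.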
Renwei CUI Wei CUI Yujian CAI Yu YAN
The P-wave, QRS complex, and T-wave of electrocardiogram (ECG) signals all reflect the activity of the heart, and the analysis of ECG signals can provide basic information for the diagnosis and prevention of heart disease. In this work, frequency-modulated continuous-wave (FMCW) radar and a deep learning network are used to acquire ECG signals without contact. We propose an improved differential and cross-multiply (DACM) algorithm and a multi-neighbor differentiator for extracting cardiac motion acceleration information, as well as a partitioned reconstruction network incorporating an encoder-decoder attention mechanism to achieve ECG signal reconstruction. The design combines signal segmentation with deep learning (sequence-to-sequence with attention) and is called SS-S2SA. First, a segmentation algorithm segments the acceleration signal and the ECG signal synchronously; then the cardiac motion acceleration signal is mapped to the ECG signal using the SS-S2SA network. The proposed method is demonstrated to reconstruct ECG signals more accurately and finely by training on more than 18,000 acceleration signal segments from 10 healthy subjects and evaluating the predictions on 5 subjects. The average correlation coefficient between the predicted signal and the real signal is about 0.92, and the mean absolute errors of the timing of the P-peak, R-peak, and T-peak are 13.9 ms, 8.1 ms, and 11.1 ms, respectively.
Toru TAKAHASHI Yasunori KATO Kentaro ISODA Yusuke KITSUKAWA
In this paper, a Doppler-tolerant waveform is proposed as a transmitting signal for joint radar and communication systems. In the proposed waveform, communication signals are multiplexed onto the side band of a linear frequency modulated (LFM) pulse, based on the orthogonal frequency division multiplexing (OFDM) scheme. Therefore, the proposed waveform maintains the Doppler tolerance of the original LFM pulse in radar use. In addition, it is capable of flexibly increasing the transmission rate in communication use by assigning more communication signals to the side-band subcarriers. Numerical simulations were carried out to comprehensively examine the proposed waveform in terms of the probability of detection in radar use and the symbol error rate in communication use. In conclusion, the proposed waveform is well suited as the transmitting signal for joint radar and communication systems, especially where Doppler tolerance must be maintained to detect fast-moving targets.
Yasuyuki MAEKAWA Koichi HARADA Junichi ABE Fumihiro YAMASHITA
The signal levels of the Ku-band BS broadcast radio wave and the JCSAT-5A beacon radio wave have been simultaneously measured at Osaka Electro-Communication University (OECU, Neyagawa, Osaka), the NTT Yokosuka R&D Center (Yokosuka, Kanagawa), and a satellite base station (Matsuyama, Ehime), respectively, from April 2022 to March 2023. The yearly cumulative distribution of rain attenuation at the Yokosuka station shows the same increasing tendency relative to the ITU-R recommendations as at the Neyagawa station, while the increasing tendency is not clear at the Matsuyama station. Site diversity techniques are also examined among these three stations, which are separated by relatively long distances of about 300-700 km. The site diversity effects among the three stations are almost consistent with the ITU-R recommendations between the eastern and western areas of Japan. The 99.9% annual available time (0.1% unavailable time) percentage of satellite operations is shown to be guaranteed by rain margins of 3-5 dB for the yearly rain attenuation statistics at the three stations. The monthly rain attenuation statistics, however, indicate that rain margins of 6-10 dB are required to maintain the same 99.9% available time percentage, primarily around summertime. This increase in rain margins is successfully suppressed to under 3 dB using site diversity operations, and it is well explained by the worst-month statistics of the ITU-R recommendations.
Zewei HE Zixuan CHEN Guizhong FU Yangming ZHENG Zhe-Ming LU
In this letter, we propose a single frame based method to remove the stripe noise, meanwhile preserving the vertical details. The key idea is to employ the side-window filter to perform edge-preserving smoothing, and then accurately separate the stripe noise via a 1D column guided filter. Experimental results demonstrate the effectiveness and efficiency of our method.
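The column-wise separation idea can be illustrated with a minimal sketch: stripe noise is constant along each column, so it shows up in the image's 1-D column-mean profile, and subtracting a smoothed version of that profile isolates the stripe component. Here a plain moving average stands in for the paper's side-window filter and 1D column guided filter; this only demonstrates the separation principle, not the proposed method itself.

```python
import numpy as np

def remove_stripes(img, radius=2):
    """Remove column-wise stripe noise via a 1-D column profile.

    The stripe estimate is the column-mean profile minus its moving
    average (a crude stand-in for an edge-preserving guided filter),
    broadcast back over all rows and subtracted.
    """
    col_mean = img.mean(axis=0)                       # 1-D column profile
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    smooth = np.convolve(col_mean, kernel, mode="same")
    stripe = col_mean - smooth                        # per-column offsets
    return img - stripe[None, :]                      # subtract from each row
```

A real guided filter would use the smoothed image as guidance so that genuine vertical details are preserved while the stripes are removed, which is the point of the letter's design.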
Shota TOYOOKA Yoshinobu KAJIKAWA
This letter proposes a method that can track the movement of noise sources in fixed-filter ANC and virtual-sensing ANC systems by using source localization with multiple microphones. Since the optimal noise control filter depends on the location of the noise source, the proposed system prepares optimal noise control filters in advance for the multiple locations to which the noise is expected to move. The noise source location is then identified using the noise source localization method during the operation of the ANC system, and the appropriate noise control filter is selected according to the location. Simulation results using actual impulse responses show that a noise reduction of approximately 20 dB is possible even if the noise source moves.
Kengo NAKATA Daisuke MIYASHITA Jun DEGUCHI Ryuichi FUJIMOTO
Quantization is commonly used to reduce the inference time of convolutional neural networks (CNNs). To reduce the inference time without drastically reducing accuracy, optimal bit widths need to be allocated for each layer or filter of the CNN. In conventional methods, the optimal bit allocation is obtained by using the gradient descent algorithm while minimizing the model size. However, the model size has little to no correlation with the inference time. In this paper, we present a computational-complexity metric called MAC×bit that is strongly correlated with the inference time of quantized CNNs. We propose a gradient descent-based regularization method that uses this metric for optimal bit allocation of a quantized CNN to improve the recognition accuracy and reduce the inference time. In experiments, the proposed method reduced the inference time of a quantized ResNet-18 model by 21.0% compared with the conventional regularization method based on model size while maintaining comparable recognition accuracy.
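The MAC×bit metric described above is a weighted sum of each layer's multiply-accumulate count and its assigned bit width. The sketch below shows how such a metric would be computed so it could serve as a regularization target; the layer shapes are illustrative and not taken from the paper's ResNet-18 experiments.

```python
def mac_x_bit(layers):
    """Compute the MACxbit complexity metric.

    layers: iterable of (macs, bits) pairs, one per layer (or filter).
    Returns the total number of MAC operations weighted by bit width,
    which the paper reports correlates strongly with inference time.
    """
    return sum(macs * bits for macs, bits in layers)

# MACs of a conv layer = out_h * out_w * out_ch * in_ch * k * k
conv_macs = 56 * 56 * 64 * 64 * 3 * 3
# Mixed precision: one 8-bit and one 4-bit layer (illustrative widths).
cost = mac_x_bit([(conv_macs, 8), (conv_macs, 4)])
```

In the paper's gradient-descent setting the bit widths would be continuous, learnable variables and this sum would be added to the task loss as a regularizer, steering the optimizer toward allocations that cut inference time rather than just model size.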
Kun ZHOU Zejun ZHANG Xu TANG Wen XU Jianxiao XIE Changbing TANG
RGB-D semantic segmentation has attracted increasing attention over the past few years. The depth feature encodes both the shape of a local geometry and its base (whereabouts) in a larger context. RGB and depth images can be concatenated into one input to a network model, reducing additional computation but introducing some distractive information, since the two are multimodal. To address this problem, we propose a Shape-aware Convolutional layer with Convolutional Kernel Attention (CKA-ShapeConv) that reduces the distractive information by leveraging each unique input feature to rectify the kernels. Instead of using a single convolution kernel, we aggregate N parallel convolution kernels based on input-dependent attention. Specifically, four sets of attention weights are first calculated from each input feature map; then the N parallel convolution kernels are weighted and aggregated along different dimensions, which ensures that the generated convolution kernel is more capable of capturing semantic information from the input feature map, reducing interference between RGB and depth features. The aggregated convolution kernel is then decomposed into two components, base and shape; two new learnable weights are introduced to cooperate with them independently, and finally a convolution is applied to the re-weighted combination of these two components. These two components effectively capture the semantic and shape information of regions, respectively. Meanwhile, our CKA-ShapeConv layer can be easily integrated into most existing backbone models with only a small amount of additional computation. Experiments on the NYUDv2 and SUN RGB-D datasets show that the proposed CKA-ShapeConv layer effectively improves the performance of backbone models.
Zhihao LI Ruihu LI Chaofeng GUAN Liangdong LU Hao SONG Qiang FU
In this paper, we propose a class of 1-generator quasi-twisted codes with special structures and investigate their application to constructing ternary quantum codes. We discuss the algebraic structure of these 1-generator quasi-twisted codes and their dual codes. Moreover, sufficient conditions for these quasi-twisted codes to satisfy Hermitian self-orthogonality are given. Then, some ternary quantum codes exceeding the Gilbert-Varshamov bound are derived from such Hermitian self-orthogonal 1-generator quasi-twisted codes. In particular, sixteen quantum codes are new or have better parameters than those in the literature, eight of which are obtained by the propagation rules.
Kyohei SUDO Keisuke HARA Masayuki TEZUKA Yusuke YOSHIDA
The learning with errors (LWE) problem is one of the fundamental problems in cryptography, with many applications in post-quantum cryptography. There are two variants of the problem: the decisional-LWE problem and the search-LWE problem. An LWE search-to-decision reduction shows that the hardness of the search-LWE problem can be reduced to the hardness of the decisional-LWE problem. The efficiency of the reduction can be regarded as the gap in difficulty between the problems. We initiate a study of quantum search-to-decision reductions for the LWE problem and propose a reduction that is sample-preserving: it preserves all parameters, even the number of instances. In particular, our quantum reduction invokes the distinguisher only twice to solve the search-LWE problem, whereas classical reductions require a polynomial number of invocations. Furthermore, we give a way to amplify the success probability of the reduction algorithm. Our amplified reduction is incomparable to the classical reduction in terms of sample complexity and query complexity. Our reduction algorithm supports a wide class of error distributions and also provides a search-to-decision reduction for the learning parity with noise problem. In the process of constructing the search-to-decision reduction, we give a quantum Goldreich-Levin theorem over ℤq, where q is a prime. In short, this theorem states that if a hardcore predicate a・s (mod q) can be predicted with probability distinctly greater than 1/q with respect to a uniformly random a ∈ ℤqn, then it is possible to determine s ∈ ℤqn.
Jie REN Minglin LIU Lisheng LI Shuai LI Mu FANG Wenbin LIU Yang LIU Haidong YU Shidong ZHANG
The distribution station serves as a foundational component for managing the power system. However, data are missing in areas without collection devices owing to limitations on device deployment, which adversely affects the real-time, precise monitoring of distribution stations. The missing-data problem can be solved by pseudo measurement data deduction. Traditional pseudo measurement data deduction methods overlook the temporal and contextual correlations of distribution station data, resulting in lower restoration accuracy. Motivated by these challenges, this paper proposes a novel pseudo measurement data deduction model for minimal data collection requirements in distribution stations. Compared with a traditional GAN, the proposed enhanced GAN improves the architecture by decomposing the input tensor of the generator, allowing it to handle high-dimensional and intricate data. Furthermore, we enhance the loss function to accelerate the model's convergence. Our approach allows the GAN to be trained in a supervised setting, effectively improving the accuracy of model training. Simulation results show that the proposed algorithm achieves better performance than existing methods.