Jinkyu KANG Seongah JEONG Hoojin LEE
In this letter, we analyze the error rate performance of M-ary coherent free-space optical (FSO) communications under strong atmospheric turbulence. Specifically, we derive the exact error rates for M-ary phase shift keying (MPSK) and M-ary quadrature amplitude modulation (MQAM) based on the moment-generating function (MGF) under negative-exponential-distributed turbulence, where a maximum ratio combining (MRC) receiver is adopted to mitigate the turbulence effects. Additionally, by evaluating the asymptotic error rate in the high signal-to-noise ratio (SNR) regime, the error rate performance can be effectively investigated and predicted for various system configurations. The accuracy and effectiveness of our theoretical analyses are verified via numerical results.
Zhimin GUO Jianfei CHEN Sheng ZHANG
Millimeter-wave synthetic aperture interferometric radiometers (SAIRs) are powerful instruments that can effectively realize high-precision imaging detection. However, due to interference factors and complex near-field errors, the imaging quality of near-field SAIR is usually not ideal. To achieve better imaging results, a new fully connected imaging network (FCIN) is proposed for near-field SAIR. In FCIN, a fully connected network is first used to reconstruct the image directly from the visibility function, and then a residual dense network is used for image denoising and enhancement. The simulation results show that the proposed FCIN method achieves high imaging accuracy and shortens imaging time.
Xiang SHEN Dezhi HAN Chin-Chen CHANG Liang ZONG
Visual Question Answering (VQA) is a multi-task research problem that requires simultaneous processing of vision and text. Recent VQA models employ a co-attention mechanism to model the relationship between the context and the image. However, modeling the question features and the image regions forces irrelevant information to be calculated in the model, which degrades performance. This paper proposes a novel dual self-guided attention with sparse question networks (DSSQN) to address this issue. The aim is to avoid having irrelevant information calculated into the model when modeling the internal dependencies of both the question and the image. At the same time, it overcomes the coarse interaction between sparse question features and image features. First, the sparse question self-attention (SQSA) unit in the encoder calculates the feature with the highest weight. From the self-attention learning of question words, the question features with larger weights are retained. Second, sparse question features are utilized to guide the focus on image features to obtain fine-grained image features and to prevent irrelevant information from being calculated into the model. A dual self-guided attention (DSGA) unit is designed to improve modal interaction between questions and images. Third, the parameter δ of the sparse question self-attention is optimized to select the question-related object regions. Our experiments on the VQA 2.0 benchmark datasets demonstrate that DSSQN outperforms the state-of-the-art methods. For example, the accuracy of our proposed model on test-dev and test-std is 71.03% and 71.37%, respectively. In addition, we show through visualization results that our model pays more attention to important features than other advanced models. We also hope that it can promote the development of VQA in the field of artificial intelligence (AI).
Tatsumi KONISHI Hiroyuki NAKANO Yoshikazu YANO Michihiro AOKI
This letter proposes a transmission scheme called spatial vector (SV), which is effective for Nakagami-m fading multiple-input multiple-output (MIMO) channels. First, the analytical error rate of SV is derived for Nakagami-m fading MIMO channels. Next, an example of SV called integer SV (ISV) is introduced. The error performance is evaluated over Nakagami-m fading from m = 1 to m = 50 and compared with spatial modulation (SM), enhanced SM, and quadrature SM. The results show that for m > 1, ISV outperforms the SM schemes and is robust to variations in m.
Tingyao WU Zhisong BIE Celimuge WU
The newly proposed orthogonal time frequency space (OTFS) system exhibits excellent error performance on high-Doppler fading channels. However, the rectangular prototype window function (PWF) inherent in OTFS leads to high out-of-band emission (OOBE), which reduces the spectral efficiency in multi-user scenarios. To address this, this paper presents an OTFS system based on bi-orthogonal frequency division multiplexing (OTFS-BFDM) modulation. In OTFS-BFDM systems, PWFs with bi-orthogonal properties can be optimized to provide lower OOBE than OTFS, which is a special case with a rectangular PWF. We further show that the OTFS-BFDM system is sparsely connected, so that the low-complexity message passing (MP) decoding algorithm can be adopted. Moreover, the power spectral density, peak-to-average power ratio (PAPR), and bit error rate (BER) of the OTFS-BFDM system with different PWFs are compared. Simulation results show that: i) the use of BFDM modulation significantly suppresses the OOBE of the OTFS system; ii) the better the frequency-domain localization of the PWFs, the smaller the BER and PAPR of the OTFS-BFDM system.
Object contour detection is the task of extracting the shape created by the boundaries between objects in an image. Conventional methods limit the detection targets to specific categories, or mistakenly detect edges of patterns inside an object. We propose a new method to represent a contour image where the pixel value is the distance to the boundary. Contour detection becomes a regression problem that estimates this contour image. A deep convolutional network for contour estimation is combined with stereo vision to detect unspecified object contours. Furthermore, thanks to similar inference targets and a common network structure, we propose a network that simultaneously estimates both contour and disparity with fully shared weights. In experiments, the multi-task network produced a good precision-recall curve, and the F-measure was about 0.833 on the FlyingThings3D dataset. The L1 loss of disparity estimation on the dataset was 2.571. This network halves the amount of calculation and the memory capacity, and the accuracy drop compared to the dedicated networks is slight. We then quantize both the weights and activations of the network to 3 bits. We devise a dedicated hardware architecture for the quantized CNN and implement it on an FPGA. This circuit uses only internal memory to perform forward propagation calculations, which eliminates high-power external memory accesses. The circuit is a stall-free pixel-by-pixel pipeline and performs convolution calculations for 8 rows, 16 input channels, 16 output channels, and 3 by 3 pixels in parallel. The convolution calculation performance at the operating frequency of 250 MHz is 9 TOPS.
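The throughput figure quoted above can be checked with a short calculation. The sketch below multiplies out the stated parallelism (8 rows, 16 input channels, 16 output channels, a 3 by 3 kernel) at 250 MHz. Counting each multiply-accumulate as two operations is an assumption made here, not something the abstract states, and the function and constant names are illustrative.

```python
# Peak throughput of a fully parallel convolution datapath (illustrative arithmetic).
ROWS = 8                # output rows processed in parallel (from the abstract)
IN_CHANNELS = 16        # input channels in parallel (from the abstract)
OUT_CHANNELS = 16       # output channels in parallel (from the abstract)
KERNEL_TAPS = 3 * 3     # 3 by 3 pixel window (from the abstract)
CLOCK_HZ = 250e6        # operating frequency (from the abstract)
OPS_PER_MAC = 2         # assumption: one multiply plus one add per MAC


def peak_tera_ops_per_second(rows, in_ch, out_ch, taps, clock_hz, ops_per_mac):
    """Return peak tera-operations per second for a stall-free pipeline."""
    macs_per_cycle = rows * in_ch * out_ch * taps
    return macs_per_cycle * ops_per_mac * clock_hz / 1e12


macs_per_cycle = ROWS * IN_CHANNELS * OUT_CHANNELS * KERNEL_TAPS
peak_tops = peak_tera_ops_per_second(
    ROWS, IN_CHANNELS, OUT_CHANNELS, KERNEL_TAPS, CLOCK_HZ, OPS_PER_MAC
)
print(macs_per_cycle, round(peak_tops, 3))
```

Under that counting convention the datapath performs 18432 multiply-accumulates per cycle, which gives roughly 9.2 tera-operations per second and agrees with the rounded figure in the abstract.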
Toshiro NAKAHIRA Koichi ISHIHARA Motoharu SASAKI Hirantha ABEYSEKERA Tomoki MURAKAMI Takatsune MORIYAMA Yasushi TAKATORI
In this paper, we propose a novel centralized control method to handle multi-radio and terminal connections in an 802.11ax wireless LAN (802.11ax) mixed environment. The proposed control method can improve the throughput by applying 802.11ax Spatial Reuse in an environment hosting different terminal standards and mixed terminal communication quality. We evaluate the proposed control method by computer simulations assuming environments with mixed terminal standards, mixed communication quality, and both.
Yuuki FUJITA Akihiro FUJIMOTO Hideki TODE
With the increase in IoT devices, P2P-based IoT platforms have been attracting attention because of their ability to build and maintain their networks autonomously in a decentralized way. In particular, Skip Graph, which has a low network rebuilding cost and allows range search, is suitable for such platforms. However, when data observed at geographically close points have similar values (i.e., when data have strong spatial autocorrelation), the search performance of existing types of Skip Graph degrades. In this paper, we propose a query transfer method that enables efficient search even for spatially autocorrelated data by adaptively using two types of Skip Graph depending on the key distance to the target key. Simulation results demonstrate that the proposed method can reduce the query transfer distance compared to the existing method even for spatially autocorrelated data.
Wenjing ZHANG Peng SONG Wenming ZHENG
In this letter, we propose a novel transferable sparse regression (TSR) method for cross-database facial expression recognition (FER). In TSR, we first present a novel regression function to regress the data into a latent representation space instead of a strict binary label space. To further alleviate the influence of outliers and overfitting, we impose a row sparsity constraint on the regression term. In addition, a pairwise relation term is introduced to guide the feature transfer learning. Second, we design a global graph to transfer knowledge, which can well preserve the cross-database manifold structure. Moreover, we introduce a low-rank constraint on the graph regularization term to uncover additional structural information. Finally, several experiments are conducted on three popular facial expression databases, and the results validate that the proposed TSR method is superior to other non-deep and deep transfer learning methods.
Qin CHENG Linghua ZHANG Bo XUE Feng SHU Yang YU
As an emerging technology, device-free localization (DFL), which uses wireless sensor networks to detect targets not carrying any electronic devices, has spawned extensive applications, such as security safeguards and smart homes or hospitals. Previous studies formulate DFL as a classification problem, but challenges remain in terms of accuracy and robustness. In this paper, we exploit a generalized thresholding algorithm with parameter p as a penalty function to solve inverse problems with sparsity constraints for DFL. By reducing the value of p, the function applies less bias to the large coefficients and penalizes small coefficients. By exploiting the distinctive capability of the p-thresholding function to measure sparsity, the proposed approach can achieve accurate and robust localization performance in challenging environments. Extensive experiments show that the algorithm outperforms current alternatives.
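The abstract does not give the thresholding function itself. As a rough illustration of how a parameter p can reduce the bias on large coefficients, the sketch below uses one commonly used generalized p-shrinkage form, sign(x) * max(|x| - lam^(2-p) * |x|^(p-1), 0), which reduces to ordinary soft thresholding at p = 1. Whether the paper uses exactly this form is an assumption, and the names `p_threshold` and `lam` are illustrative.

```python
def p_threshold(x: float, lam: float, p: float) -> float:
    """Generalized p-shrinkage of a scalar x with threshold lam > 0.

    Assumed form (not taken from the abstract):
        sign(x) * max(|x| - lam**(2 - p) * |x|**(p - 1), 0)
    For p = 1 this is the usual soft-thresholding operator.
    """
    if x == 0.0:
        return 0.0  # avoid 0 ** negative exponent when p < 1
    magnitude = abs(x)
    shrunk = max(magnitude - lam ** (2.0 - p) * magnitude ** (p - 1.0), 0.0)
    return shrunk if x > 0 else -shrunk


# A large coefficient is shrunk less when p is reduced from 1 to 0.5,
# while a coefficient below the threshold is still set to zero.
large_soft = p_threshold(4.0, 1.0, 1.0)
large_p_half = p_threshold(4.0, 1.0, 0.5)
small_p_half = p_threshold(0.5, 1.0, 0.5)
print(large_soft, large_p_half, small_p_half)
```

In an iterative sparse-recovery solver, an operator of this kind would typically be applied elementwise to the coefficient vector after each gradient step.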
Zihao SONG Peng SONG Chao SHENG Wenming ZHENG Wenjing ZHANG Shaokai LI
Unsupervised feature selection is an important dimensionality reduction technique for coping with high-dimensional data. It does not require prior label information and has recently attracted much attention. However, it cannot fully utilize the discriminative information of samples, which may affect the feature selection performance. To tackle this problem, in this letter, we propose a novel discriminative virtual label regression method (DVLR) for unsupervised feature selection. In DVLR, we develop a virtual label regression function to guide the subspace-learning-based feature selection, which can select more discriminative features. Moreover, a linear discriminant analysis (LDA) term is used to make the model more discriminative. To further make the model more robust and select more representative features, we impose the ℓ2,1-norm on the regression and feature selection terms. Finally, extensive experiments are carried out on several public datasets, and the results demonstrate that our proposed DVLR achieves better performance than several state-of-the-art unsupervised feature selection methods.
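For readers unfamiliar with the ℓ2,1-norm mentioned above, it is the sum of the Euclidean norms of the rows of a matrix, so penalizing it drives entire rows toward zero. The minimal sketch below computes it for a small matrix; the matrix values and the name `l21_norm` are illustrative and not taken from the paper.

```python
import math


def l21_norm(matrix):
    """Return the l2,1-norm: the sum over rows of each row's Euclidean norm."""
    return sum(math.sqrt(sum(value * value for value in row)) for row in matrix)


# Illustrative projection matrix with one all-zero row. In row-sparse feature
# selection, a zero row corresponds to a feature that is not selected.
W = [
    [3.0, 4.0],   # row norm 5
    [0.0, 0.0],   # row norm 0 (feature dropped)
    [0.0, 5.0],   # row norm 5
]
total = l21_norm(W)
print(total)
```

Because each row contributes its whole norm, the penalty favors solutions in which a few rows carry all the weight, which is what makes it useful for ranking features.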
Akira JINGUJI Shimpei SATO Hiroki NAKAHARA
Convolutional neural networks (CNNs) achieve high recognition rates in image recognition and are used in embedded systems such as smartphones, robots, and self-driving cars. Low-end FPGAs are candidates for embedded image recognition platforms because they achieve real-time performance at a low cost. However, CNNs have a large number of parameters, called weights, and a large amount of internal data, called feature maps, which pose challenges for FPGAs in terms of performance and memory capacity. To solve these problems, we exploit a split-CNN and weight sparseness. The split-CNN reduces the memory footprint by splitting the feature map into smaller patches and allows the feature map to be stored in the FPGA's high-throughput on-chip memory. Weight sparseness reduces computational costs and achieves even higher performance. We designed a dedicated architecture for a sparse CNN and a memory buffering schedule for a split-CNN, and implemented them on the PYNQ-Z1 board, which carries a low-end FPGA. An experiment on classification using VGG16 shows that our implementation is 3.1 times faster than the GPU and 5.4 times faster than an existing FPGA implementation.
Lijun GAO Zhenyi BIAN Maode MA
DoS (Denial of Service) attacks are becoming one of the most serious security threats to global networks. We analyze the existing DoS detection methods and defense mechanisms in depth. In recent years, K-Means and its improved variants have been widely examined for security intrusion detection, but their detection accuracy is not satisfactory. In this paper, we propose a multi-dimensional space feature vector expansion K-Means model to detect threats in the network environment. The model uses a genetic algorithm to optimize the weights of the K-Means multi-dimensional space feature vector, which greatly improves the detection rate against six typical DoS attacks. To verify the correctness of the model, this paper conducts a simulation on the NSL-KDD data set. The results show that the multi-dimensional space feature vector expansion K-Means algorithm improves the recognition accuracy to 96.88%. Furthermore, the 41 kinds of feature vectors in NSL-KDD are analyzed in detail based on a large number of training experiments. The feature vectors that positively affect the probability of detecting security attacks are accurately extracted, and a comparison chart is provided to support subsequent research. Theoretical analysis and experimental results show that the multi-dimensional space feature vector expansion K-Means algorithm is well suited to the detection of DDoS attacks.
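The abstract does not specify how the feature weights enter the clustering. One plausible reading, sketched below purely as an assumption, is a weighted squared Euclidean distance in the K-Means assignment step. The weights here are fixed by hand, whereas the paper reports tuning them with a genetic algorithm, which this sketch does not attempt. All names and values are illustrative.

```python
def weighted_sq_distance(point, centroid, weights):
    """Squared Euclidean distance with a per-feature weight (assumed form)."""
    return sum(w * (x - c) ** 2 for x, c, w in zip(point, centroid, weights))


def assign_cluster(point, centroids, weights):
    """Return the index of the centroid closest to point under the weights."""
    return min(
        range(len(centroids)),
        key=lambda k: weighted_sq_distance(point, centroids[k], weights),
    )


# Two illustrative centroids and one sample with two features.
centroids = [[0.0, 0.0], [2.0, 3.0]]
sample = [0.0, 3.0]

uniform = assign_cluster(sample, centroids, [1.0, 1.0])      # plain K-Means
reweighted = assign_cluster(sample, centroids, [10.0, 0.1])  # first feature emphasized
print(uniform, reweighted)
```

With uniform weights the sample is assigned to the second centroid, but emphasizing the first feature moves it to the first centroid, which illustrates why the choice of weights can change detection results.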
Yoichi HINAMOTO Shotaro NISHIMURA
This paper investigates an adaptive notch digital filter that employs a normal state-space realization of a single-frequency second-order IIR notch digital filter. An adaptive algorithm is developed to minimize the mean-squared output error of the filter iteratively. This algorithm is based on a simplified form of the gradient-descent method. Stability and frequency-estimation bias are analyzed for the adaptive iterative algorithm. Finally, a numerical example is presented to demonstrate the validity and effectiveness of the proposed adaptive notch digital filter and of the frequency-estimation bias analysis.
Tatsuki ITASAKA Ryo MATSUOKA Masahiro OKUDA
We propose an algorithm for the constrained design of FIR filters with sparse coefficients. In general filter design approaches, as the length of the filter increases, the number of multipliers used to construct the filter increases. This is a serious problem, especially in two-dimensional FIR filter designs. The FIR filter coefficients designed by the least-squares method with a peak error constraint are optimal in the least-squares sense within a given order, but not necessarily optimal for constructing a filter that meets the design specification under constraints on the number of coefficients. That is, a higher-order filter with several zero coefficients can meet the specification with a smaller number of multipliers. We propose a two-step approach to design constrained sparse FIR filters. Our method minimizes the number of non-zero coefficients while ensuring that the frequency response of the filter meets the design specification. It achieves better performance in terms of peak error than conventional constrained least-squares designs with the same or a higher number of multipliers in both one-dimensional and two-dimensional filter designs.
Koji YAMAMOTO Takayuki NISHIO Masahiro MORIKURA Hirantha ABEYSEKERA
In this paper, a stochastic geometry analysis of the inversely proportional setting (IPS) of carrier sense threshold (CST) and transmission power for densely deployed wireless local area networks (WLANs) is presented. In densely deployed WLANs, CST adjustment is a crucial technology to enhance spatial reuse, but it can starve surrounding transmitters due to an asymmetric carrier sensing relationship. To make the carrier sensing relationship symmetric, the IPS of the CST and transmission power is a promising approach, i.e., each transmitter jointly adjusts its CST and transmission power so that their product is equal to that of the other transmitters. This setting is used for spatial reuse in IEEE 802.11ax. By assuming that the set of potential transmitters follows a Poisson point process, the impact of the IPS on throughput is formulated based on stochastic geometry in two scenarios: an adjustment at a single transmitter and an identical adjustment at all transmitters. The asymptotic expression of the throughput in dense WLANs is derived, and an explicit solution for the optimal CST is obtained using approximations as a function of the number of neighboring potential transmitters and the signal-to-interference power ratio. This solution was confirmed through numerical results, where the explicit solution achieved throughput penalties of less than 8% relative to the numerically evaluated optimal solution.
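The inversely proportional setting described above keeps the product of CST and transmission power constant. In decibel units that product becomes a constant sum, so lowering the power by some number of dB raises the CST by the same amount. The sketch below shows that bookkeeping. The reference values (20 dBm power, -82 dBm CST) are illustrative assumptions, and the function name is hypothetical.

```python
import math


def ips_cst_dbm(tx_power_dbm, ref_power_dbm, ref_cst_dbm):
    """Return the CST (dBm) that keeps CST x power equal to the reference product.

    In linear units: cst * power = ref_cst * ref_power.
    In dB units:     cst_dbm + power_dbm = ref_cst_dbm + ref_power_dbm.
    """
    return ref_cst_dbm + ref_power_dbm - tx_power_dbm


def dbm_to_mw(value_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (value_dbm / 10.0)


REF_POWER_DBM = 20.0   # assumed reference transmission power
REF_CST_DBM = -82.0    # assumed reference carrier sense threshold

reduced_power_dbm = 14.0
new_cst_dbm = ips_cst_dbm(reduced_power_dbm, REF_POWER_DBM, REF_CST_DBM)

# Check that the linear-scale product is unchanged by the joint adjustment.
ref_product = dbm_to_mw(REF_CST_DBM) * dbm_to_mw(REF_POWER_DBM)
new_product = dbm_to_mw(new_cst_dbm) * dbm_to_mw(reduced_power_dbm)
print(new_cst_dbm, math.isclose(ref_product, new_product))
```

Here a transmitter that reduces its power by 6 dB may raise its CST by 6 dB, to -76 dBm, which is the trade that keeps the carrier sensing relationship between two such transmitters symmetric.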
Enze YANG Shuoyan LIU Yuxin LIU Kai FANG
Crowd flow prediction in high-density urban scenes is involved in a wide range of intelligent transportation and smart city applications, and it has become a significant topic in urban computing. In this letter, a CNN-based framework called Pyramidal Spatio-Temporal Network (PSTNet) for crowd flow prediction is proposed. Spatial encoding is employed for the spatial representation of external factors, while a prior pyramid enhances the feature dependence across spatial scale distances and temporal spans. After that, a post pyramid is proposed to fuse the heterogeneous spatio-temporal features of multiple scales. Experimental results based on TaxiBJ and MobileBJ demonstrate that the proposed PSTNet outperforms the state-of-the-art methods.
Natsuki UENO Shoichi KOYAMA Hiroshi SARUWATARI
We propose a useful formulation for ill-posed inverse problems in Hilbert spaces with nonlinear clipping effects. Ill-posed inverse problems are often formulated as optimization problems, and nonlinear clipping effects may cause nonconvexity or nondifferentiability of the objective functions in the case of commonly used regularized least squares. To overcome these difficulties, we present a tractable formulation in which the objective function is convex and differentiable with respect to the optimization variables, on the basis of the Bregman divergence associated with the primitive function of the clipping function. By using this formulation in combination with the representer theorem, we need only deal with a finite-dimensional, convex, and differentiable optimization problem, which can be solved by well-established algorithms. We also show two practical examples of inverse problems where our theory can be applied, namely the estimation of band-limited signals and of time-harmonic acoustic fields, and evaluate the validity of our theory by numerical simulations.
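To make the phrase "Bregman divergence associated with the primitive function of the clipping function" concrete, the sketch below works through the scalar case under an assumed clipping range of [-1, 1]. The primitive of that clipping function is the Huber function, which is convex and differentiable, and the Bregman divergence it generates is D(u, v) = C(u) - C(v) - clip(v) * (u - v). How the paper builds its objective from this divergence is not stated in the abstract, so this shows only the standard definition. All names are illustrative.

```python
def clip(x: float) -> float:
    """Clipping function with an assumed range of [-1, 1]."""
    return max(-1.0, min(1.0, x))


def clip_primitive(x: float) -> float:
    """Primitive (antiderivative) of clip with C(0) = 0, i.e. the Huber function."""
    if abs(x) <= 1.0:
        return 0.5 * x * x
    return abs(x) - 0.5


def bregman_divergence(u: float, v: float) -> float:
    """Bregman divergence generated by clip_primitive, whose derivative is clip."""
    return clip_primitive(u) - clip_primitive(v) - clip(v) * (u - v)


# The divergence is zero when both arguments match and non-negative otherwise,
# because clip_primitive is convex. It is not symmetric in general.
d_same = bregman_divergence(0.5, 0.5)
d_forward = bregman_divergence(2.0, 0.0)
d_backward = bregman_divergence(0.0, 2.0)
print(d_same, d_forward, d_backward)
```

The useful property is that the divergence is convex and differentiable in its first argument even though the clipping function itself has kinks at the ends of its range.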
Toshiro NAKAHIRA Tomoki MURAKAMI Hirantha ABEYSEKERA Koichi ISHIHARA Motoharu SASAKI Takatsune MORIYAMA Yasushi TAKATORI
In this paper, we examine techniques for improving the throughput of unlicensed radio systems such as wireless LANs (WLANs) to take advantage of multi-radio access to mobile broadband, which will be important in 5G evolution and beyond. In WLANs, throughput is reduced due to mixed standards and the degraded quality of certain frequency channels, and thus control techniques and an architecture that provide efficient control over WLANs are needed to solve the problem. We have proposed a technique to control the terminal connection dynamically by using the multi-radio of the AP. Furthermore, we have proposed a new control architecture called WiSMA for efficient control of WLANs. Experiments show that the proposed method can solve those problems and improve the WLAN throughput.
Anis Ur REHMAN Ken KIHARA Sakuichi OHTSUKA
In daily life, people often pay attention to several objects that change positions while being observed. In the laboratory, this process is investigated using multiple object tracking (MOT), a task that evaluates attentive tracking performance. Recent findings suggest that the attentional set for multiple moving objects whose depth changes in three dimensions from one plane to another is influenced by the initial configuration of the objects. When tracking objects, it is difficult for people to expand their attentional set to multiple depth planes once attention has been focused on a single plane. However, less is known about people contracting their attentional set from multiple depth planes to a single depth plane. In two experiments, we examined tracking accuracy when four targets or four distractors, which were initially distributed on two planes, came together on one of the planes during an MOT task. The results from this study suggest that people have difficulty changing the depth range of their attention during attentive tracking, and that attentive tracking performance depends on the initial attentional set based on the configuration prior to attentive tracking.