Shijie LIN Chen DONG Zhiqiang WANG Wenzhong GUO Zhenyi CHEN Yin YE
A chaotic artificial bee colony algorithm based on a Lévy search strategy (LABC) is proposed in this paper. A chaotic sequence, a global optimal mechanism and a Lévy flight mechanism are introduced into the initialization, the employed bee search and the onlooker bee search, respectively. The experiments show that the proposed algorithm performs better in convergence speed, global search ability and optimization accuracy than other improved ABC algorithms.
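As a rough illustration of two of the ingredients named in this abstract, the sketch below shows a chaotic initialization and a Lévy-distributed step. It is not the authors' LABC code: the abstract does not say which chaotic map or which Lévy generator is used, so the logistic map, Mantegna's algorithm, and all function and parameter names here are assumptions.

```python
import math
import numpy as np

def logistic_map_init(n_sources, dim, lower, upper, mu=4.0, seed=0):
    """Chaotic initialization (assumed: logistic map x <- mu * x * (1 - x)).

    Each iterate of the map is scaled into [lower, upper] and used as one food source.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.01, 0.99, size=dim)  # random starting point inside (0, 1)
    population = np.empty((n_sources, dim))
    for i in range(n_sources):
        x = mu * x * (1.0 - x)             # one chaotic iteration per food source
        population[i] = lower + x * (upper - lower)
    return population

def levy_step(dim, beta=1.5, rng=None):
    """Draw a heavy-tailed step with Mantegna's algorithm (assumed generator)."""
    if rng is None:
        rng = np.random.default_rng()
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, size=dim)
    v = rng.normal(0.0, 1.0, size=dim)
    return u / np.abs(v) ** (1 / beta)
```

Most Lévy steps are small, with occasional long jumps, which is the usual motivation for using them to escape local optima in a search phase.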
Zhongqiang LUO Chaofu JING Chengjie LI
Nonnegative Matrix Factorization (NMF) is a promising data-driven matrix decomposition method and an active research topic in machine learning and blind source separation. NMF algorithms have been widely used in diverse applications, including image processing, anti-collision for Radio Frequency Identification (RFID) systems and audio signal analysis. However, typical NMF algorithms do not work well for underdetermined mixtures, i.e., when the number of observed signals is less than the number of source signals. In practical applications, fusing suitable constraints into the NMF algorithm can achieve remarkable decomposition results. Motivated by this, this paper proposes adding minimum volume and minimum correlation constraints (MCV) to the NMF algorithm, which makes the new algorithm, named MCV-NMF, suitable for underdetermined scenarios where the source signals satisfy the mutual independence assumption. Experimental simulation results validate that the MCV-NMF algorithm performs better in solving the RFID tag anti-collision problem than the nearest typical NMF method.
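For readers unfamiliar with the baseline, the following is a minimal sketch of a typical unconstrained NMF using the classical multiplicative updates for the Frobenius-norm objective. It does not include the minimum volume or minimum correlation constraints of MCV-NMF, whose exact form is not given in the abstract; the function name and parameters are illustrative.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (m x n) as W @ H with W, H >= 0.

    Uses the standard multiplicative updates; eps guards against division by zero.
    Returns W, H and the reconstruction error after each iteration.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    errors = []
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update source estimates
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update mixing estimates
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors
```

Because every update multiplies by a nonnegative ratio, W and H stay nonnegative without any explicit projection step.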
Xujie LI Weiwei XIA Lianfeng SHEN
This letter presents an analytical study of the reverse link Erlang capacity of 3G/Ad Hoc Integrated networks. In the considered integrated network, 3G networks and Ad Hoc networks operate over the same frequency band and hence cause interference to each other. The reverse link Erlang capacity is analyzed and discussed in two cases: Ad Hoc networks use and do not use power control.
Jinjie LIANG Zhenyu LIU Zhiheng ZHOU Yan XU
Federated learning is a promising strategy for indoor localization that can reduce the labor cost of constructing a fingerprint dataset in a distributed training manner without privacy disclosure. However, the traffic generated during the whole training process of federated learning is a burden on the uplink and downlink, which leads to huge energy consumption for mobile devices. Moreover, the non-independent and identically distributed (Non-IID) problem impairs the global localization performance during federated learning. This paper proposes a communication-efficient FedAvg method for federated indoor localization, which is improved by a layerwise asynchronous aggregation strategy and a layerwise swapping training strategy. Energy efficiency can be improved by performing asynchronous aggregation between the model layers to reduce the traffic cost in the training process. Moreover, the impact of the Non-IID problem on the localization performance can be mitigated by performing swapping training on the deep layers. Extensive experimental results show that the proposed methods reduce communication traffic and improve energy efficiency significantly while mitigating the impact of the Non-IID problem on the precision of localization.
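The aggregation step underlying FedAvg can be sketched as a data-size-weighted average of client parameters. The optional `layers` argument below is a hypothetical way to aggregate only some layers in a given round, which is one plausible reading of "layerwise" aggregation; the abstract does not specify the actual schedule, so treat this as an assumption rather than the proposed method.

```python
import numpy as np

def fedavg(client_models, client_sizes, layers=None):
    """Weighted average of client parameters (plain FedAvg).

    client_models: list of dicts mapping layer name -> np.ndarray
    client_sizes:  number of local training samples per client
    layers:        optional subset of layer names to aggregate this round
                   (hypothetical hook for layerwise schemes)
    """
    total = float(sum(client_sizes))
    names = layers if layers is not None else list(client_models[0].keys())
    return {
        name: sum(model[name] * (size / total)
                  for model, size in zip(client_models, client_sizes))
        for name in names
    }
```

Aggregating only a subset of layers means only those layers need to be transmitted in that round, which is where a traffic saving could come from.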
A wideband beamformer with mainlobe control is proposed. To make the beamformer robust against pointing errors, inequality rather than equality constraints are used to restrict the mainlobe response, so that more degrees of freedom are saved. The constraints involved are nonconvex and are therefore linearly approximated so that the beamformer can be obtained by iterating a second-order cone program. Moreover, the response variance element is introduced to achieve a frequency invariant beamwidth. The effectiveness of the technique is demonstrated by numerical examples.
Xujie LI Weiwei XIA Qiong YANG Lianfeng SHEN
This letter presents an analytical study of the outage probability of a 3G/Ad Hoc cooperative network. The considered cooperative network can improve the signal quality so as to decrease the outage probability. Meanwhile, it imposes additional interference on other ongoing users. On the whole, however, our analytical study and simulation results show that the cooperative network can still effectively overcome outage events and decrease the average outage probability.
Chuanyi LIU Jie LIN Binxing FANG
Cloud computing is broadly recognized as the prevalent trend in IT. However, in cloud computing mode, customers lose direct control of their data and applications hosted by the cloud providers, which raises the issue of the trustworthiness of the cloud providers and hinders the widespread use of cloud computing. This paper proposes a trustworthiness verification and audit mechanism for cloud providers called T-YUN. It introduces a trusted third party to cyclically attest the remote clouds, which are instrumented with a trusted chain covering the whole architecture stack. According to the main operations of the clouds, remote verification protocols are also proposed in T-YUN, with a dedicated key management scheme. This paper also implements a proof-of-concept emulator to validate the effectiveness and performance overhead of T-YUN. The experimental results show that T-YUN is effective and the extra overhead incurred by it is acceptable.
Jiabao GAO Yuchen YAO Zhengjie LI Jinmei LAI
A series of Binarized Neural Networks (BNNs) show acceptable accuracy in image classification tasks and achieve excellent performance on field programmable gate arrays (FPGAs). Nevertheless, we observe that existing BNN designs are quite time-consuming when changing the target BNN or accelerating a new BNN. Therefore, this paper presents FCA-BNN, a flexible and configurable accelerator, which employs a layer-level configurable technique to seamlessly execute each layer of the target BNN. First, to save resources and improve energy efficiency, hardware-oriented optimal formulas are introduced to design an energy-efficient computing array for different sizes of padded-convolution and fully-connected layers. Moreover, to accelerate the target BNNs efficiently, we exploit an analytical model to explore the optimal design parameters for FCA-BNN. Finally, our proposed mapping flow changes the target network by entering an order, and accelerates a new network by compiling and loading the corresponding instructions, without generating and loading a bitstream. The evaluations on three major structures of BNNs show that the differences between the inference accuracy of FCA-BNN and that of GPU are just 0.07%, 0.31% and 0.4% for LFC, VGG-like and Cifar-10 AlexNet, respectively. Furthermore, our energy efficiency reaches 0.8× that of existing customized FPGA accelerators for LFC and 2.6× for VGG-like. For Cifar-10 AlexNet, FCA-BNN achieves 188.2× and 60.6× better energy efficiency than CPU and GPU, respectively. To the best of our knowledge, FCA-BNN is the most efficient design for changing the target BNN and accelerating a new BNN, while keeping competitive performance.
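The reason BNNs map well to FPGA logic is that a dot product of two {-1, +1} vectors reduces to an XNOR followed by a population count. The sketch below illustrates that general identity in software; it says nothing about the FCA-BNN computing array itself, and the helper names are illustrative.

```python
def encode(values):
    """Pack a sequence of -1/+1 values into an integer, using bit 1 for +1 and bit 0 for -1."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    Positions where the bits agree contribute +1 and positions where they differ
    contribute -1, so the result is 2 * popcount(XNOR(a, b)) - n.
    """
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask
    return 2 * bin(xnor).count("1") - n
```

For example, `binary_dot(encode(a), encode(b), len(a))` should agree with `sum(x * y for x, y in zip(a, b))` for any two equal-length lists of -1/+1 values.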
Jiahong WANG Masatoshi MIYAZAKI Jie LI
In recent years, more emphasis has been placed on the performance of massive databases. It is often required not only that database systems provide high throughput with rapid response times, but also that they are fully available 24 hours a day, 7 days a week. Requirements for throughput and response time can be satisfied by upgrading the hardware. As a result, databases in the old hardware environment have to be moved to the new one. Moving a database, however, generally requires taking the database off line for a long time, which is unacceptable for numerous applications. In this paper, a very practical and important subject is addressed: how to upgrade the hardware on line, i.e., how to move a database from an old hardware environment to a new one concurrently with users' reading and writing of the database. A technique for this purpose is proposed. We have implemented a prototype based on this technique. Our experiments with the prototype showed that, compared with the conventional off-line approach, the proposed technique could give a performance improvement of more than 85% in the query-bound environment and 40% in the update-bound environment.
Zhengjie LI Jiabao GAO Jinmei LAI
In recent years, FPGAs have become popular for CNN acceleration, and many CNN-to-FPGA toolchains have been proposed to deploy CNNs on FPGAs quickly. However, for these toolchains, updating the CNN means regenerating the RTL code and re-implementing the design, which is time-consuming and may suffer from timing-closure problems. We therefore propose HBDCA: a toolchain and a corresponding accelerator. The CNN on HBDCA is defined by the content of BRAM. The toolchain integrates the UpdateMEM utility of Xilinx, which updates the content of BRAM without re-synthesis or re-implementation. The toolchain also integrates TensorFlow Lite, which provides high-accuracy quantization. HBDCA supports 8-bit per-channel quantization of weights and 8-bit per-layer quantization of activations. Upgrading the CNN on the accelerator means that the kernel size of the CNN may change. The flexible structure of HBDCA supports kernel-level parallelism with three different sizes (3×3, 5×5, 7×7). HBDCA implements four types of parallelism in the convolution layer and two types of parallelism in the fully-connected layer. To reduce the number of memory accesses, both spatial and temporal data-reuse techniques are applied to the convolution and fully-connected layers. In particular, temporal reuse is adopted at both the row and column level of an input feature map of the convolution layer, so data can be read just once from BRAM and reused in the following clock cycle. Experiments show that, by updating the BRAM content with a single UpdateMEM command, three CNNs with different kernel sizes (3×3, 5×5, 7×7) are implemented on HBDCA. Compared with the traditional design flow, UpdateMEM reduces development time by 7.6X-9.1X for different synthesis or implementation strategies. Compared with similar CNNs created by toolchains, HBDCA has smaller latency (9.97µs-50.73µs) and eliminates re-implementation when the CNN is updated. Compared with similar CNNs created by dedicated designs, HBDCA also has the smallest latency (9.97µs), the highest accuracy (99.14%) and the lowest power (1.391W). Compared with different CNNs created by similar toolchains that eliminate the re-implementation process, HBDCA achieves a higher speedup of 120.28X.
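To make "8-bit per-channel quantization of weights" concrete, here is a minimal NumPy sketch of symmetric per-channel quantization, where each output channel gets its own scale. It is an illustration only, not TensorFlow Lite's or HBDCA's implementation; the function names, the choice of the [-127, 127] range and the channel axis are assumptions.

```python
import numpy as np

def quantize_per_channel(weights, axis=0):
    """Symmetric int8 quantization with one scale per slice along `axis`."""
    w = np.moveaxis(weights, axis, 0)
    flat = w.reshape(w.shape[0], -1)
    max_abs = np.abs(flat).max(axis=1)
    # One scale per channel; fall back to 1.0 for all-zero channels.
    scales = np.where(max_abs > 0, max_abs / 127.0, 1.0)
    q = np.clip(np.round(flat / scales[:, None]), -127, 127).astype(np.int8)
    return np.moveaxis(q.reshape(w.shape), 0, axis), scales

def dequantize_per_channel(q, scales, axis=0):
    """Map int8 values back to floats using the per-channel scales."""
    shape = [1] * q.ndim
    shape[axis] = -1
    return q.astype(np.float32) * scales.reshape(shape)
```

Per-layer quantization of activations would use a single scale for the whole tensor instead of one scale per channel.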
A novel element is proposed for manipulating two orthogonally-polarized electromagnetic waves, resulting in a polarization-reconfigurable flat transmitarray. This element consists of four identical metallic patterns, including a square frame loaded with short stubs and an internal crossed dipole, which are printed on the two sides of three identical flat dielectric slabs, with no air gap among them. With a linearly-polarized (LP) feeder, the flat transmitarray can transform the LP incident wave into a circular, horizontal or vertical polarization wave in a convenient way. By rotating the LP feeder so that the polarization angle is 0°, 45°, 90° or 135°, the waves of linear horizontal, right-handed circular, linear vertical or left-handed circular polarization can be obtained alternately. Simulations and experiments are conducted to validate the performance. The measured axial ratio bandwidths for RHCP and LHCP transmitarrays are about 7.1% and 5.1%, respectively, the 3dB gain bandwidths are 16.19% and 22.4%, and the peak gains are 25.56dBi and 24.2dBi, respectively.
Xuefeng WU Jie LI Hisao KAMEDA
In this paper, we present an analytic model to study the reliability of some important disk array organizations that have been proposed by others in the literature. These organizations are based on the combination of two options for the data layout, regular RAID-5 and block designs, and three alternatives for sparing, hot sparing, distributed sparing and parity sparing. Uncorrectable bit errors have big effects on reliability but are ignored in traditional reliability analysis of disk arrays. We consider both disk failures and uncorrectable bit errors in the model. The reliability of disk arrays is measured in terms of MTTDL (Mean Time To Data Loss). A unified formula of MTTDL has been derived for these disk array organizations. The MTTDLs of these disk array organizations are also compared using the analytic model. By numerical experiments, we show that the data losses caused by uncorrectable bit errors may dominate the data losses of disk array systems though only the data losses caused by disk failures are traditionally considered. The consideration of uncorrectable bit errors provides a more realistic look at the reliability of the disk array systems.
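The effect described here can be illustrated with a first-order textbook approximation for a single RAID-5 group. This is not the unified formula derived in the paper, which is not given in the abstract; the model, function name and parameters below are assumptions. Data is taken to be lost if, after a first disk failure, either a second disk fails during the repair window or an uncorrectable bit error is hit while reading the surviving disks.

```python
import math

def mttdl_raid5(n_disks, mttf_hours, mttr_hours, disk_bits=0, bit_error_rate=0.0):
    """Approximate Mean Time To Data Loss (hours) for one RAID-5 group.

    With bit_error_rate == 0 this reduces to the classical
    MTTF^2 / (N * (N - 1) * MTTR).
    """
    # Probability that one of the N-1 surviving disks fails during the repair.
    p_second_failure = (n_disks - 1) * mttr_hours / mttf_hours
    # Probability of at least one uncorrectable bit error while reading N-1 disks.
    bits_read = (n_disks - 1) * disk_bits
    p_bit_error = -math.expm1(bits_read * math.log1p(-bit_error_rate))
    return mttf_hours / (n_disks * (p_second_failure + p_bit_error))
```

Under this approximation, for large disks the bit-error term can be far larger than the second-failure term, which matches the abstract's observation that uncorrectable bit errors may dominate data loss.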
Huimin LU Yujie LI Shota NAKASHIMA Seiichi SERIKAWA
Absorption, scattering, and color distortion are three major issues in underwater optical imaging. Light rays traveling through water are scattered and absorbed according to their wavelength. Scattering is caused by large suspended particles that degrade underwater optical images. Color distortion occurs because different wavelengths are attenuated to different degrees in water; consequently, images of ambient underwater environments are dominated by a bluish tone. In the present paper, we propose a novel underwater imaging model that compensates for the attenuation discrepancy along the propagation path. In addition, we develop a fast weighted guided normalized convolution domain filtering algorithm for enhancing underwater optical images. The enhanced images are characterized by a reduced noise level, better exposure in dark regions, and improved global contrast, by which the finest details and edges are enhanced significantly.
Jie LIU Zhuochen XIE Huijie LIU Zhengmin ZHANG
In this letter, a new non-uniform weight-updating scheme for adaptive digital beamforming (DBF) is proposed. The unique feature of the letter is that the effective working range of the beamformer is extended and the computational complexity is reduced by introducing robust DBF based on worst-case performance optimization. The robust parameter for each weight update is chosen by analyzing the rate of change of the Direction of Arrival (DOA) of the desired signal in LEO satellite communication. Simulation results demonstrate the improved performance of the new Non-Uniform Weight-Updating Beamformer (NUWUB).
Jie LI Sai LI Abdul Hayee SHAIKH
In this manuscript, we propose a joint channel and power assignment algorithm for an unmanned aerial vehicle (UAV) swarm communication system based on multi-agent deep reinforcement learning (DRL). Regarded as an agent, each UAV-to-UAV (U2U) link can choose the optimal channel and power according to the current situation after training is successfully completed. Further, a mixing network is introduced based on DRL, in which the Q values of every single agent are non-linearly mapped; we call this the QMIX algorithm. As it accesses state information, QMIX can learn to enrich the joint action value function. The proposed method can be used for both unicast and multicast scenarios. Experiments show that each U2U link can be trained to meet the constraints of UAV communication and minimize the interference to the system. For unicast communication, the communication rate is increased by up to 15.6% and 8.9% using the proposed DRL method compared with the well-known random and adaptive methods, respectively. For multicast communication, the communication rate is increased by up to 6.7% using the proposed QMIX method compared with the DRL method, and by up to 13.6% using the DRL method compared with the adaptive method. In addition, the successful transmission probability is maintained at a high level.
Jiahong WANG Jie LI Hisao KAMEDA
Parallel Transaction Processing (TP) systems have great potential to serve the ever-increasing demands for high transaction processing rates. This potential, however, may not be reached due to data contention and the widely-used two-phase locking (2PL) Concurrency Control (CC) method. In this paper, a distributed locking-based CC policy called LWDC (Local Wait-Depth Control) is proposed for dealing with this problem in the shared-nothing parallel TP system. On the basis of the LWDC policy, an algorithm called LWDCk is designed. Using simulation, LWDCk is compared with the 2PL and the base-line Distributed Wait-Depth Limited (DWDL) CC methods. Simulation studies show that the new algorithm offers better system performance than those compared.
In this letter, a novel and highly efficient algorithm is proposed for removing haze from a single input image. The proposed algorithm is built on the atmospheric scattering model. Firstly, the global atmospheric light is estimated and a coarse atmospheric veil is inferred based on statistics of the dark channel prior. Secondly, the coarse atmospheric veil is refined by using a fast Tri-Gaussian filter based on human retina properties. To avoid halo artefacts, we then redefine the scene albedo. Finally, the haze-free image is derived by inverting the atmospheric scattering model. Results on some challenging foggy images demonstrate that the proposed method can not only improve the contrast and visibility of the restored image but also expedite the process.
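The first and last steps of this pipeline rest on two standard building blocks: the dark channel of an image, and inversion of the atmospheric scattering model I = J·t + A·(1 − t), where I is the hazy image, J the scene radiance, t the transmission and A the atmospheric light. The NumPy sketch below shows only these generic pieces; the Tri-Gaussian refinement and the redefined scene albedo are not reproduced, and the function names, patch size and lower bound `t_min` are assumptions.

```python
import numpy as np

def dark_channel(image, patch=3):
    """Per-pixel minimum over colour channels and a patch x patch neighbourhood.

    image: float array of shape (H, W, 3); patch is assumed to be odd.
    """
    min_rgb = image.min(axis=2)
    r = patch // 2
    padded = np.pad(min_rgb, r, mode="edge")
    h, w = min_rgb.shape
    out = np.full((h, w), np.inf)
    for dy in range(patch):
        for dx in range(patch):
            out = np.minimum(out, padded[dy:dy + h, dx:dx + w])
    return out

def recover_scene(image, atmospheric_light, transmission, t_min=0.1):
    """Invert I = J * t + A * (1 - t) to obtain J, clamping t to avoid blow-up."""
    t = np.maximum(transmission, t_min)[..., None]
    return (image - atmospheric_light) / t + atmospheric_light
```

Clamping the transmission from below is a common safeguard: where t is close to zero, dividing by it would amplify noise in dense haze regions.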
Zijie LIU Can CHEN Yi CHENG Maomao JI Jinrong ZOU Dengyin ZHANG
Common schedulers for long-running services that perform task-level optimization fail to accommodate short-lived batch processing (BP) jobs. Thus, many efficient job-level scheduling strategies have been proposed for BP jobs. However, the existing scheduling strategies perform time-consuming objective optimization, which yields non-negligible scheduling delay. Moreover, they tend to assign BP jobs in a centralized manner to reduce monetary cost and synchronization overhead, which can easily cause resource contention due to task co-location. To address these problems, this paper proposes TEBAS, a time-efficient balance-aware scheduling strategy, which spreads all tasks of a BP job across the cluster according to the resource specifications of a single task, based on the observation that computing tasks of a BP job commonly possess similar features. The experimental results show the effectiveness of TEBAS in terms of scheduling efficiency and load balancing performance.
Xiaolong ZHENG Bangjie LI Daqiao ZHANG Di YAO Xuguang YANG
The ionospheric clutter in High Frequency Surface Wave Radar (HFSWR) is the reflection of electromagnetic waves from the ionosphere back to the receiver, which should be suppressed as much as possible for the primary purpose of target detection in HFSWR. However, ionospheric clutter contains vast quantities of ionospheric state information. By studying ionospheric clutter, some of the relevant ionospheric parameters can be inferred, especially during typhoons, when the ionospheric state changes drastically under the influence of typhoon-excited gravity waves. By utilizing the time-frequency characteristics of ionospheric clutter during a typhoon, information such as the trend of electron concentration changes in the ionosphere and the direction of the typhoon can be obtained. The processing results of the radar data show the effectiveness of this method.
ChangWoo PYO Jie LI Hisao KAMEDA
Personal communication service (PCS) networks support the delivery of communication services as the mobile user moves from one region to another. When a mobile user receives a call, the network has to quickly determine the user's current location. The existing approach suffers from high delay in locating the mobile since the location databases always have to be consulted for the mobile's current location. Caching the location of the remote mobile is useful to reduce this delay. However, the longer a useless record caused by the movement of the mobile remains in a cache, the more cache memory utilization is degraded. In this paper, we propose an efficient caching scheme in which a cached record is not allowed to remain in a cache longer than a predefined time, called a time-threshold. A long time-threshold may increase the obsoleteness of the cached record. In contrast, a short time-threshold may degrade memory utilization. This paper finds the optimal time-threshold to increase cache memory utilization. We also provide a unique solution for determining the optimal time-threshold, and study the effects of changing the important parameters of mobility, calling patterns, and network conditions on the optimal time-threshold. Furthermore, we compare the performance of the proposed caching call delivery scheme and the existing call delivery schemes.
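The basic mechanism of a time-threshold on cached location records can be sketched as follows. This is a minimal illustration only: the class name, method names and the explicit `now` argument are hypothetical, and the sketch does not compute the optimal threshold that the paper derives.

```python
class TimeThresholdCache:
    """Location cache whose records expire after a fixed time-threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self._records = {}  # mobile_id -> (location, time the record was cached)

    def put(self, mobile_id, location, now):
        """Store or refresh the cached location of a mobile."""
        self._records[mobile_id] = (location, now)

    def get(self, mobile_id, now):
        """Return the cached location, or None on a miss or an expired record.

        On None, the caller would fall back to querying the location databases.
        """
        record = self._records.get(mobile_id)
        if record is None:
            return None
        location, cached_at = record
        if now - cached_at > self.threshold:
            del self._records[mobile_id]  # evict the record that outlived the threshold
            return None
        return location
```

Passing the current time explicitly keeps the behaviour deterministic, which makes it easy to experiment with different threshold values against a recorded trace of calls and movements.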