This paper describes the analysis and design of low-noise analog circuits for Qpix, a readout LSI with a new architecture. In contrast to conventional readout LSIs that use the TOT method, Qpix measures the deposited charge directly, together with time information. A preamplifier with a two-stage op amp and current-copy output buffers is proposed to realize these functions. This preamplifier is configured to implement a charge sensitive amplifier (CSA) and a trans-impedance amplifier (TIA). Design issues related to the CSA are analyzed, including the gain requirement of the op amp, the stability and compensation of the two-stage cascode op amp, noise performance estimation, the required ADC resolution, and the time response. An offset calibration method for the TIA that improves the charge detection sensitivity is also presented, along with some design principles for these analog circuits. To verify the theoretical analysis, a 400-pixel high-speed readout LSI, Qpix v.1, has been designed and fabricated in a 180 nm CMOS process. Calculations and SPICE simulations show that the total output noise is about 0.31 mV (rms) at the output of the CSA and the offset voltage is less than 4 mV at the output of the TIA. This performance is attractive for experimental particle detectors that use the Qpix v.1 chip as their readout LSI.
Yufei LIN Xuejun YANG Xinhai XU Xiaowei GUO
Scaling up the system size has been the common approach to achieving high performance in parallel computing. However, designing and implementing a large-scale parallel system can be very costly in terms of money and time. When building a target system, it is desirable to first build a smaller version using processing nodes with the same architecture as those in the target system. The smaller system can then be used to predict the performance of the target system efficiently and scalably. Such scalability prediction is critical because it enables system designers to evaluate different design alternatives so that a given performance goal can be achieved. As the de facto standard for writing parallel applications, MPI is widely used in large-scale parallel computing. By categorizing the discrete event simulation methods for MPI programs and analyzing the characteristics of scalability prediction, we propose a novel simulation method, called virtual-actual combined execution-driven (VACED) simulation, to achieve scalable prediction for MPI programs. The basic idea is to predict the execution time of an MPI program on a target machine by running it on a smaller system, so that its communication time is predicted by virtual simulation and its sequential computation time is obtained by actual execution. We introduce a model for the VACED simulation as well as the design and implementation of VACED-SIM, a lightweight simulator based on fine-grained activity and event definitions. We have validated our approach on a sub-system of Tianhe-1A. Our experimental results show that VACED-SIM exhibits higher accuracy and efficiency than MPI-SIM. In particular, for a target system with 1024 cores, the relative errors of VACED-SIM are less than 10% and the slowdowns are close to 1.
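The split between actual execution and virtual simulation described in this abstract can be illustrated with a minimal Python sketch. It is not VACED-SIM: the function names, the latency-plus-bandwidth communication model, and the parameter values are all assumptions made for illustration, and the sketch ignores any overlap between computation and communication.

```python
import time


def simulated_comm_time(message_sizes, latency_s, bandwidth_bytes_per_s):
    """Virtual part: estimate communication time with an assumed cost of
    latency + size / bandwidth per message (this model is not from the paper)."""
    return sum(latency_s + size / bandwidth_bytes_per_s for size in message_sizes)


def predict_execution_time(compute_fn, message_sizes, latency_s, bandwidth_bytes_per_s):
    """Actual part: run and time the sequential computation, then add the
    simulated communication time for the (hypothetical) target machine."""
    start = time.perf_counter()
    compute_fn()
    compute_time = time.perf_counter() - start
    comm_time = simulated_comm_time(message_sizes, latency_s, bandwidth_bytes_per_s)
    return compute_time, comm_time, compute_time + comm_time


if __name__ == "__main__":
    # Hypothetical workload and network parameters, chosen only for the demo.
    compute, comm, total = predict_execution_time(
        lambda: sum(i * i for i in range(100_000)),
        message_sizes=[1000, 2000],
        latency_s=1e-6,
        bandwidth_bytes_per_s=1e9,
    )
    print(f"compute={compute:.6f}s comm={comm:.6f}s predicted={total:.6f}s")
```

Only the communication term depends on the target machine's parameters here, which is why a smaller system with the same node architecture can supply the computation term.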
Yifei LIU Yuan ZHAO Jun ZHU Bin TANG
A novel Nyquist Folding Receiver (NYFR) based passive localization algorithm with Sparse Bayesian Learning (SBL) is proposed to estimate the position of a spaceborne Synthetic Aperture Radar (SAR). Taking the geometry and kinematics of a satellite into consideration, this paper presents a surveillance geometry model that formulates the localization problem as a sparse vector recovery problem. NYFR technology is used to intercept the SAR signal. Then, a convergence algorithm with SBL is introduced to recover the sparse vector. Simulation results demonstrate the feasibility and performance of our algorithm.
Fei LI Zhizhong DING Yu WANG Jie LI Zhi LIU
In this paper, the problem of channel estimation in orthogonal frequency-division multiplexing systems over fast time-varying channels is investigated by using a Basis Expansion Model (BEM). To address the effects of the Gibbs phenomenon in the BEM, we propose a new method that alleviates it and reduces the modeling error. Theoretical analysis and detailed comparison results show that the proposed BEM method achieves a lower modeling error than other BEMs such as CE-BEM and GCE-BEM. In addition, instead of using the frequency-domain Kronecker delta structure, a new clustered pilot structure is proposed to further enhance the estimation performance. The new clustered pilot structure can effectively reduce the inter-carrier interference, especially in the case of high Doppler spreads.
In this letter, we propose a novel kind of uncertain query, the top (k1,k2) query. The x-tuple model and the possible world semantics are used to describe data objects in uncertain datasets. The top (k1,k2) query finds the k2 x-tuples with the largest probabilities of appearing in the result of a top k1 query in a possible world. First, we design a basic algorithm for the top (k1,k2) query based on dynamic programming. Then, some pruning strategies are designed to improve its efficiency. An improved initialization method is proposed for further acceleration. Experiments on real and synthetic datasets demonstrate the performance of our methods.
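The query semantics can be made concrete with a small brute-force Python sketch that enumerates every possible world. This is not the letter's dynamic-programming algorithm or its pruning strategies, and it is exponential in the number of x-tuples. The function names and the example data are hypothetical, and the sketch assumes that alternatives within an x-tuple are mutually exclusive, that different x-tuples are independent, and that any leftover probability mass means the x-tuple is absent.

```python
from itertools import product


def topk_membership_probabilities(x_tuples, k1):
    """For each x-tuple, sum the probabilities of the possible worlds in
    which one of its alternatives is among the k1 highest-scored tuples.

    x_tuples maps a name to a list of (score, probability) alternatives.
    """
    names = list(x_tuples)
    choices = []
    for name in names:
        options = list(x_tuples[name])
        absent = 1.0 - sum(p for _, p in options)
        if absent > 1e-12:
            options.append((None, absent))  # the x-tuple does not appear
        choices.append(options)

    probs = {name: 0.0 for name in names}
    for world in product(*choices):  # one alternative (or absence) per x-tuple
        world_prob = 1.0
        present = []
        for name, (score, p) in zip(names, world):
            world_prob *= p
            if score is not None:
                present.append((score, name))
        present.sort(reverse=True)  # highest score first
        for _, name in present[:k1]:
            probs[name] += world_prob
    return probs


def top_k1_k2(x_tuples, k1, k2):
    """Return the k2 x-tuples most likely to be in a top-k1 result."""
    probs = topk_membership_probabilities(x_tuples, k1)
    return sorted(probs, key=probs.get, reverse=True)[:k2]


if __name__ == "__main__":
    # Hypothetical data: A has two alternatives, C is absent with probability 0.4.
    data = {
        "A": [(90, 0.5), (40, 0.5)],
        "B": [(80, 1.0)],
        "C": [(70, 0.6)],
    }
    print(topk_membership_probabilities(data, k1=2))
    print(top_k1_k2(data, k1=2, k2=2))
```

In this example, B is in the top 2 of every world, A in worlds with total probability 0.7, and C in worlds with total probability 0.3, so the top (2,2) query returns B and A.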
In this paper, we present a joint multi-patch and multi-task convolutional neural networks (JMM-CNNs) framework to learn a more descriptive and robust face representation for face recognition. In the proposed JMM-CNNs, a set of multi-patch CNNs and a feature fusion network are constructed to learn and fuse global and local facial features. A multi-task learning algorithm, comprising a face recognition task and a pose estimation task, then operates on the fused feature to obtain a pose-invariant face representation for the face recognition task. To further enhance the pose insensitivity of the learned face representation, we also introduce a similarity regularization term on the features of the two tasks, which yields a regularization loss. Moreover, a simple but effective patch sampling strategy is applied to give the JMM-CNNs an end-to-end network architecture. Experiments on the Multi-PIE dataset demonstrate the effectiveness of the proposed method, and we achieve competitive performance compared with state-of-the-art methods on Labeled Faces in the Wild (LFW), YouTube Faces (YTF) and the MegaFace Challenge.
Yanming CHEN Bin LYU Zhen YANG Fei LI
In this letter, we propose an energy beamforming empowered relaying scheme for a batteryless IoT network, where wireless-powered relays are deployed between the hybrid access point (HAP) and batteryless IoT devices to assist the uplink information transmission from the devices to the HAP. In particular, the HAP first exploits energy beamforming to efficiently transmit radio frequency (RF) signals, which transfer energy to the relays and serve as the incident signals that enable the information backscattering of the batteryless IoT devices. Then, each relay uses the harvested energy to forward the decoded signals from its corresponding batteryless IoT device to the HAP, where maximum-ratio combining is used for further performance improvement. To maximize the network sum-rate, the joint optimization of the energy beamforming vectors at the HAP, the network time scheduling, the power allocation at the relays, and the reflection coefficient at the users is investigated. As the formulated problem is non-convex, we propose an alternating optimization algorithm with variable substitution and semi-definite relaxation (SDR) techniques to solve it efficiently. Specifically, we prove that the obtained energy beamforming matrices are always rank-one. Numerical results show that, compared to the benchmark schemes, the proposed scheme can achieve a significant sum-rate gain.
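The maximum-ratio combining step mentioned in this abstract can be sketched in a few lines of Python. This is the generic textbook form, not the letter's system model: the function name and channel values are hypothetical, the receiver is assumed to know the channel coefficients perfectly, and noise is left out so that the result is easy to check.

```python
def mrc_combine(received, channels):
    """Maximum-ratio combining: weight each branch observation
    r_i = h_i * s (+ noise) by conj(h_i) and normalise by the total
    channel power, which gives an estimate of the symbol s."""
    total_power = sum(abs(h) ** 2 for h in channels)
    weighted = sum(h.conjugate() * r for h, r in zip(channels, received))
    return weighted / total_power


if __name__ == "__main__":
    symbol = 1 - 1j                      # hypothetical transmitted symbol
    channels = [0.8 + 0.6j, 0.3 - 0.4j]  # hypothetical branch coefficients
    received = [h * symbol for h in channels]  # noiseless observations
    print(mrc_combine(received, channels))     # close to the transmitted symbol
```

Because each branch is weighted by its own conjugate channel, stronger branches contribute more to the combined estimate.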
Haibin KAN Xuefei LI Hong SHEN
In this letter, we discuss some properties of characteristic generators for a finite Abelian group code. We prove that no two characteristic generators can start (end) at the same position and have the same order of the starting (ending) components simultaneously, and that the number of all characteristic generators can be computed directly from the group code itself. These properties are exactly the generalization of the corresponding trellis properties of a linear code over a field.
In this paper we describe a multicast routing algorithm that builds upon the mobile multicast agents of an ad-hoc network. Mobile multicast agents (MMAs) form a virtual backbone of an ad-hoc network and provide multicast tree discovery, multicast tree maintenance and datagram delivery. First, we construct a cluster-spine hierarchy structure for an ad-hoc network. Second, we propose a multicast routing algorithm inspired by the Ad-hoc On-Demand Distance Vector (AODV) routing protocol. The results show that, in comparison with general AODV multicast operation, the MMA multicast algorithm can simplify multicast tree discovery, reduce the control overhead of the network, and increase the total network throughput. We also overcome the deficiency of CBRP multicast routing, which places a heavy burden on cluster heads.
Recent attempts to directly combine CMOS pixel readout chips with modern gas detectors open the possibility of taking full advantage of gas detectors. Conventional readout LSIs designed for hybrid semiconductor detectors show some issues when applied to gas detectors. Several newly proposed readout LSIs can improve the time and charge measurement precision. However, the widely used basic charge sensitive amplifier (CSA) has an almost fixed dynamic range, so there is a trade-off between the charge measurement resolution and the detectable input charge range. This paper presents a method for applying the folding integration technique to a basic CSA. As a result, the detectable input charge dynamic range is expanded while all the key merits of a basic CSA are maintained. Although the folding integration technique has already been successfully applied in CMOS image sensors, the working conditions and the signal characteristics are quite different for pixel readout LSIs for gas particle detectors. The related issues of the folding CSA for pixel readout LSIs, including the charge error due to the finite gain of the preamplifier, the calibration method for the charge error, and the dynamic range expansion efficiency, are addressed and analyzed. As a design example, this paper also demonstrates the application of the folding integration technique to a Qpix readout chip. This improves the charge measurement resolution and expands the detectable input dynamic range while maintaining all the key features. Calculations with SPICE simulations show that the dynamic range can be improved by 12 dB while the charge measurement resolution is improved by a factor of 10. The charge error during the folding operation can be corrected to less than 0.5%, which is sufficient for large input charge measurement.
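The general idea of folding integration can be illustrated with an idealised Python model. It is a sketch under stated assumptions, not the circuit in this paper: the integrator is assumed to subtract a fixed charge packet (`fold_charge`, a hypothetical name) whenever the accumulated charge reaches that packet size, a counter records the number of folds, and the finite-gain charge error analysed in the abstract is ignored. Charges are integers in arbitrary units.

```python
def folding_integrate(charge_packets, fold_charge):
    """Integrate incoming charge; each time the accumulated charge reaches
    fold_charge, subtract one fold_charge and count the fold."""
    residual = 0
    folds = 0
    for q in charge_packets:
        residual += q
        while residual >= fold_charge:
            residual -= fold_charge
            folds += 1
    return folds, residual


def reconstruct_charge(folds, residual, fold_charge):
    """Recover the total input charge from the fold count and the residual."""
    return folds * fold_charge + residual


if __name__ == "__main__":
    folds, residual = folding_integrate([30, 50, 45], fold_charge=40)
    print(folds, residual)                           # 3 5
    print(reconstruct_charge(folds, residual, 40))   # 125
```

The residual always stays below one fold, so the fine measurement only has to cover that small range while the fold count extends the total detectable charge.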
Xinhai XU Xuejun YANG Yufei LIN
As supercomputers increase in size, the mean time between failures (MTBF) of a system becomes shorter, and the reliability problem of supercomputers becomes more and more serious. MPI is currently the de facto standard used to build high-performance applications, and research on fault tolerance methods for MPI remains a hot topic. However, due to the characteristics of MPI programs, most current checkpointing methods for MPI programs need to modify the MPI library (or even the operating system), or implement a complicated protocol by logging many messages. In this paper, we carry forward the idea of Application-Level Checkpointing (ALC). Based on the general fact that programmers are familiar with the communication characteristics of their applications, we have developed BC-ALC, a new portable blocking coordinated ALC for MPI programs. BC-ALC neither modifies the MPI library (or the operating system) nor logs any message. It implements coordination using only Barrier operations instead of a complicated protocol. Furthermore, in order to reduce the cost of fault tolerance, we reduce the synchronization range of the barrier and design WBC-ALC, a weak blocking coordinated ALC that uses group synchronization instead of global synchronization, based on the communication relationship between processes. We also propose a fault-tolerance framework developed on top of WBC-ALC and discuss an implementation of it. Experimental results on NPB3.3-MPI benchmarks validate BC-ALC and WBC-ALC, and show that, compared with BC-ALC, the average coordination time and the average backup time of a single checkpoint in WBC-ALC are reduced by 44.5% and 5.7% respectively.
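Barrier-only coordination of this kind can be sketched without MPI. The Python example below is an illustration, not BC-ALC itself: threads stand in for MPI processes, `threading.Barrier` stands in for an MPI barrier, and the checkpoint interval, the in-memory checkpoint store and the second barrier after saving are all assumptions of this sketch.

```python
import threading


def run_with_checkpoints(num_ranks=4, steps=6, interval=3):
    """Run num_ranks workers; every `interval` steps all workers meet at a
    barrier, save their state, and meet again before continuing."""
    barrier = threading.Barrier(num_ranks)
    checkpoints = {}  # (rank, step) -> saved state; stand-in for stable storage
    lock = threading.Lock()

    def worker(rank):
        state = 0
        for step in range(1, steps + 1):
            state += rank + 1  # stand-in for one step of computation
            if step % interval == 0:
                barrier.wait()  # every rank has reached the checkpoint location
                with lock:
                    checkpoints[(rank, step)] = state
                barrier.wait()  # assumed: nobody continues until all have saved

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return checkpoints


if __name__ == "__main__":
    saved = run_with_checkpoints()
    for key in sorted(saved):
        print(key, saved[key])
```

A group-synchronization variant in the spirit of WBC-ALC would replace the single global barrier with one barrier per group of communicating workers. How such groups are derived is not specified in the abstract.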
To improve the detection performance of a reconnaissance receiver designed to detect non-cooperative MIMO-LFM radar signals under low SNR conditions, this letter proposes a novel signal detection method. The method is based on the Fractional Fourier Transform with entropy weight (FRFTE) and an autocorrelation algorithm. In addition, the flow chart and feasibility of the proposed algorithm are analyzed. Finally, applying our method to the Wigner Hough Transform (WHT), we demonstrate the superiority of this method through simulation results.
Yanming CHEN Bin LYU Zhen YANG Fei LI
In this paper, we investigate a wireless-powered relay assisted batteryless IoT network based on the non-linear energy harvesting model, in which there is an energy service provider constituted by the hybrid access point (HAP) and an IoT service provider constituted by multiple clusters. The HAP provides energy signals to the batteryless devices for information backscattering and to the wireless-powered relays for energy harvesting. The relays are deployed to assist the batteryless devices with the information transmission to the HAP by using the harvested energy. To model the energy interactions between the energy service provider and the IoT service provider, we propose a Stackelberg game based framework. We aim to maximize the respective utility values of the two providers. Since the utility maximization problem of the IoT service provider is non-convex, we employ fractional programming theory and propose a block coordinate descent (BCD) based algorithm with successive convex approximation (SCA) and semi-definite relaxation (SDR) techniques to solve it. Numerical simulation results confirm that, compared to the benchmark schemes, our proposed scheme can achieve larger utility values for both the energy service provider and the IoT service provider.
Shuyun LUO Wushuang WANG Yifei LI Jian HOU Lu ZHANG
Crowdsourcing has become a popular data-collection method that relieves the burden of high cost and latency in data gathering. Since the users involved in crowdsourcing are volunteers, incentives are needed to encourage them to provide data. However, current incentive mechanisms mostly pay attention to data quantity while ignoring data quality. In this paper, we design a Data-quality awaRe IncentiVe mEchanism (DRIVE) for collaborative tasks, based on the Stackelberg game, to motivate users to provide high-quality data. Its highlight is a dynamic reward allocation scheme based on the proposed data quality evaluation method. To guarantee a real-time data quality evaluation response, we introduce the mobile edge computing framework. Finally, a case study is given, and its real-data experiments demonstrate the superior performance of DRIVE.