Enze YANG Shuoyan LIU Yuxin LIU Kai FANG
Crowd flow prediction in high-density urban scenes underpins a wide range of intelligent transportation and smart city applications, and it has become a significant topic in urban computing. In this letter, a CNN-based framework for crowd flow prediction called the Pyramidal Spatio-Temporal Network (PSTNet) is proposed. Spatial encoding is employed to represent external factors spatially, while a prior pyramid enhances feature dependence across spatial scale distances and temporal spans. A post pyramid then fuses the heterogeneous spatio-temporal features of multiple scales. Experimental results on TaxiBJ and MobileBJ demonstrate that the proposed PSTNet outperforms state-of-the-art methods.
Shuoyan LIU Enze YANG Kai FANG
Abnormal behavior detection is now a widely studied research field, especially for crowded scenes. However, most traditional unsupervised approaches struggle when the normal events in a scene exhibit large visual variety. This paper proposes a self-learning probabilistic Latent Semantic Analysis model, which aims to take full advantage of high-level abnormal information to address this problem. We select informative observations from the training set to construct “reference events” that serve as a high-level guidance cue. Specifically, the training set is randomly divided into two separate subsets. One is used to learn the model and defines the initial sequence of “reference events”. The other is used to update the model, and its infrequent samples are added to the “reference events”. Finally, we define anomalies as the events that are least similar to the “reference events”. Experimental results demonstrate that the proposed model can detect anomalies accurately and robustly in real-world crowd environments.
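The reference-event procedure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pLSA model itself is omitted, and each event is assumed to already be a fixed-length feature vector (for example, a topic mixture). Cosine similarity, the reading of "infrequent" as "poorly covered by the current references", both thresholds, and all function names are assumptions made for illustration only.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def max_similarity(event, references):
    """Similarity between an event and its closest reference event."""
    return max(cosine(event, ref) for ref in references)


def build_reference_events(training, update_threshold=0.9, seed=0):
    """Randomly split the training set in two; one subset initializes the
    reference events, the other contributes only its poorly covered samples.
    update_threshold is a hypothetical parameter, not taken from the paper."""
    if not training:
        raise ValueError("training set must not be empty")
    shuffled = list(training)
    random.Random(seed).shuffle(shuffled)
    half = max(1, len(shuffled) // 2)
    references = shuffled[:half]            # initialization subset
    for event in shuffled[half:]:           # update subset
        if max_similarity(event, references) < update_threshold:
            references.append(event)        # infrequent sample joins the references
    return references


def is_anomaly(event, references, anomaly_threshold=0.5):
    """Flag an event whose best match among the references is still weak.
    anomaly_threshold is a hypothetical parameter, not taken from the paper."""
    return max_similarity(event, references) < anomaly_threshold
```

With two clusters of normal events, for instance vectors near `[1, 0, 0]` and near `[0, 1, 0]`, the references end up covering both clusters regardless of how the random split falls, so an event such as `[0, 0, 1]` is flagged while members of either cluster are not.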
Fan LI Enze YANG Chao LI Shuoyan LIU Haodong WANG
Crowd counting is a crucial task in computer vision, which poses a significant challenge yet holds vast potential for practical applications in public safety and transportation. Traditional crowd counting approaches typically rely on a single framework to predict density maps or head point distributions. However, such straightforward architectures are prone to over-counting or omission, particularly in diverse crowded scenes. To address these limitations, we introduce the Density to Point Transformer (D2PT), an innovative approach for effective crowd counting and localization. Specifically, D2PT employs a Transformer-based teacher-student framework that integrates the insights of density-based and head-point-based methods. Furthermore, we introduce feature-aligned knowledge distillation, formulating a collaborative training approach that enhances the performance of both density estimation and point map prediction. Optimized with multiple loss functions, D2PT achieves state-of-the-art performance across five crowd counting datasets, demonstrating its robustness and effectiveness for intricate crowd counting and localization challenges.