Author Search Result

[Author] Hongli ZHANG (6 hits)

1-6 of 6 hits
  • Efficient Distributed Web Crawling Utilizing Internet Resources

    Xiao XU  Weizhe ZHANG  Hongli ZHANG  Binxing FANG  

     
    PAPER-Data Engineering, Web Information Systems

    Vol: E93-D  No: 10  Page(s): 2747-2762

    Internet computing has been proposed to exploit personal computing resources across the Internet in order to build large-scale Web applications at lower cost. In this paper, we propose a DHT-based distributed Web crawling model based on the concept of Internet computing. We also propose two optimizations that reduce the download time and waiting time of Web crawling tasks in order to increase the system's throughput and update rate. Based on our contributor-friendly download scheme, the download time is improved by shortening the crawler-crawlee RTTs. To estimate the RTTs accurately, a network coordinate system is combined with the underlying DHT. The waiting time is improved by redirecting incoming crawling tasks to lightly loaded crawlers so that the queues on all crawlers stay equally sized. We also propose a simple Web site partition method that splits a large Web site into smaller pieces in order to reduce the task granularity. All the proposed methods are evaluated through real Internet tests and simulations, with satisfactory results.
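
    The two optimizations described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes, hypothetically, that each crawler and Web site has a Euclidean network coordinate whose distance approximates the RTT, and that a task is redirected to the shortest queue among the `k` nearest crawlers. The names `Crawler`, `estimated_rtt`, `assign`, and the parameter `k` are all illustrative.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Crawler:
    name: str
    coord: tuple  # hypothetical network coordinate of this crawler
    queue: list = field(default_factory=list)  # pending crawl tasks


def estimated_rtt(a, b):
    """Assumption: RTT is approximated by Euclidean distance between coordinates."""
    return math.dist(a, b)


def assign(task_url, site_coord, crawlers, k=2):
    """Pick the k crawlers with the lowest estimated RTT to the site,
    then redirect the task to the one with the shortest queue."""
    nearest = sorted(crawlers, key=lambda c: estimated_rtt(c.coord, site_coord))[:k]
    target = min(nearest, key=lambda c: len(c.queue))
    target.queue.append(task_url)
    return target


crawlers = [
    Crawler("A", (0.0, 0.0)),
    Crawler("B", (1.0, 0.0)),
    Crawler("C", (10.0, 10.0)),
]
site = (0.0, 0.0)
chosen = [assign(f"http://example.org/page{i}", site, crawlers).name for i in range(4)]
print(chosen)
```

    In this toy run the tasks alternate between the two nearby crawlers, so their queues stay the same length while the distant crawler is never used.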

  • Privacy-Aware Information Sharing in Location-Based Services: Attacks and Defense

    Zhikai XU  Hongli ZHANG  Xiangzhan YU  Shen SU  

     
    PAPER

    Publicized: 2016/05/31  Vol: E99-D  No: 8  Page(s): 1991-2001

    Location-based services (LBSs) are useful for many applications in the Internet of Things (IoT). However, LBSs have raised serious concerns about users' location privacy. In this paper, we propose a new location privacy attack in LBSs called the hidden location inference attack, in which the adversary infers users' hidden locations from their check-in histories. We identify three factors that influence individual check-in behaviors: geographic information, human mobility patterns, and user preferences. We first evaluate the effect of each of these three factors on users' check-in behaviors separately. Next, we propose a novel algorithm that integrates these heterogeneous factors and captures the probability of hidden location privacy leakage. Then, we design a novel privacy alert framework that warns users when their sharing behavior does not match their sharing rules. Finally, we use our experimental results to demonstrate the validity and practicality of the proposed strategy.

  • A Keypoint-Based Region Duplication Forgery Detection Algorithm

    Mahmoud EMAM  Qi HAN  Liyang YU  Hongli ZHANG  

     
    LETTER-Image Processing and Video Processing

    Publicized: 2016/06/13  Vol: E99-D  No: 9  Page(s): 2413-2416

    Copy-move, or region duplication, forgery is a very common type of image manipulation in which a region of an image is copied and pasted into the same image in order to hide some details. In this paper, a keypoint-based method for copy-move forgery detection is proposed. First, feature points are detected in the image using the Förstner operator. Second, the algorithm extracts features using the MROGH feature descriptor and then matches them. Finally, the affine transformation parameters are estimated using the RANSAC algorithm. Experimental results confirm that the proposed method is effective at locating altered regions that have undergone geometric transformations (rotation and scaling).
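
    The final step, estimating affine parameters with RANSAC, can be sketched as follows. This is a generic RANSAC loop on synthetic point matches, not the authors' implementation: the Förstner detection and MROGH matching steps are omitted, and the function names, tolerance, and iteration count are assumptions made for illustration.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares 3x2 matrix M such that [src, 1] @ M approximates dst."""
    A = np.hstack([src, np.ones((len(src), 1))])
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M


def ransac_affine(src, dst, n_iter=200, tol=1.0, seed=0):
    """Fit an affine map to matched points while ignoring outlier matches."""
    rng = np.random.default_rng(seed)
    A = np.hstack([src, np.ones((len(src), 1))])
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        # Three point pairs are the minimal sample for a 2-D affine map.
        idx = rng.choice(len(src), size=3, replace=False)
        M = fit_affine(src[idx], dst[idx])
        inliers = np.linalg.norm(A @ M - dst, axis=1) < tol
        if inliers.sum() > best.sum():
            best = inliers
    # Refit on all inliers of the best candidate model.
    return fit_affine(src[best], dst[best]), best


# Synthetic matches: rotation by 30 degrees, scaling by 1.2, plus a translation.
rng = np.random.default_rng(1)
theta, scale = np.deg2rad(30.0), 1.2
R = scale * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
M_true = np.vstack([R.T, [5.0, -3.0]])
src = rng.uniform(0.0, 100.0, size=(40, 2))
dst = np.hstack([src, np.ones((40, 1))]) @ M_true
dst[:10] += rng.uniform(20.0, 40.0, size=(10, 2))  # corrupt 10 matches

M_est, inlier_mask = ransac_affine(src, dst)
print(int(inlier_mask.sum()), "inliers out of", len(src))
```

    The recovered rotation and scale can be read from the top 2x2 block of `M_est`; the flagged outliers correspond to false matches.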

  • SNGR: Scalable Name-Based Geometric Routing for ICN

    Yanbin SUN  Yu ZHANG  Binxing FANG  Hongli ZHANG  

     
    PAPER-Network

    Vol: E99-B  No: 8  Page(s): 1835-1845

    Information-Centric Networking (ICN) treats content as a first-class citizen and adopts name-based routing for content distribution and retrieval. Content names, rather than IP addresses, are used directly for routing. However, because of location-independent naming and the huge namespace, name-based routing faces scalability and efficiency issues, including large routing tables and high path stretch. This paper proposes a universal Scalable Name-based Geometric Routing scheme (SNGR), which is a careful synthesis of geometric routing and name resolution. To provide scalable and efficient underlying routing, a universal geometric routing framework (GRF) is proposed. Any geometric routing scheme can be used directly for name resolution based on GRF. To implement an overlay name resolution system, SNGR utilizes a bi-level grouping design. With this design, a resolution node that is close to the consumer can always be found. Our theoretical analyses guarantee the performance of SNGR, and experiments show that SNGR outperforms similar routing schemes in terms of node state, path stretch, and reliability.

  • Exploring Web Partition in DHT-Based Distributed Web Crawling

    Xiao XU  Weizhe ZHANG  Hongli ZHANG  Binxing FANG  

     
    PAPER

    Vol: E93-D  No: 11  Page(s): 2907-2921

    The basic requirements of distributed Web crawling systems are short download time, low communication overhead, and balanced load, all of which largely depend on the systems' Web partition strategies. In this paper, we propose a DHT-based distributed Web crawling system and several DHT-based Web partition methods. First, a new system model based on a DHT method called the Content Addressable Network (CAN) is proposed. Second, based on this model, a network-distance-based Web partition is implemented to reduce the crawler-crawlee network distance in a fully distributed manner. Third, by utilizing locality in the link space, we propose the concept of link-based Web partition to reduce the communication overhead of the system. This method reduces not only the number of inter-links to be exchanged among the crawlers but also the cost of routing on the DHT overlay. To combine the benefits of these two Web partition methods, we then propose two distributed multi-objective Web partition methods. Finally, all the methods proposed in this paper are compared with existing system models in simulated experiments on different datasets and at different system scales. In most cases, the new methods prove superior.

  • Large-Scale Gaussian Process Regression Based on Random Fourier Features and Local Approximation with Tsallis Entropy

    Hongli ZHANG  Jinglei LIU  

     
    LETTER-Artificial Intelligence, Data Mining

    Publicized: 2023/07/11  Vol: E106-D  No: 10  Page(s): 1747-1751

    With the emergence of large quantities of data in science and industry, there is an urgent need to improve the prediction accuracy and reduce the high complexity of Gaussian process regression (GPR). However, traditional global and local approximations each have shortcomings: global approximation tends to ignore local features, while local approximation is prone to over-fitting. To solve these problems, a large-scale Gaussian process regression algorithm (RFFLT) combining random Fourier features (RFF) and local approximation is proposed. 1) To shorten the training time, we use a random Fourier feature map to map the input data into a random low-dimensional feature space for processing. The main innovation of the algorithm is to design features using existing fast linear processing methods, so that the inner product of the transformed data is approximately equal to the inner product in the feature space of the shift-invariant kernel specified by the user. 2) The generalized robust Bayesian committee machine (GRBCM) based on the Tsallis mutual information method is used in the local approximation, which enhances the flexibility of the model and generates a sparse representation of the expert weight distribution compared with previous work. RFFLT was tested on six real data sets, where it greatly shortened the regression prediction time and improved the prediction accuracy.
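
    The property stated in point 1), that inner products of the transformed data approximate a shift-invariant kernel, can be checked numerically. The sketch below uses the standard random Fourier feature construction for a Gaussian (RBF) kernel; it is not the RFFLT algorithm itself, and the function names, feature count, and lengthscale are illustrative assumptions.

```python
import numpy as np


def rff_map(X, n_features=5000, lengthscale=1.0, seed=0):
    """Random Fourier features z(x) = sqrt(2/D) * cos(x @ W + b) for an RBF kernel."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies are drawn from the kernel's spectral density (a Gaussian).
    W = rng.normal(0.0, 1.0 / lengthscale, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)


def rbf_kernel(X, Y, lengthscale=1.0):
    """Exact kernel k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2))."""
    sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * lengthscale ** 2))


X = np.random.default_rng(42).normal(size=(20, 3))
Z = rff_map(X)
K_approx = Z @ Z.T          # linear inner products in the random feature space
K_exact = rbf_kernel(X, X)
max_err = float(np.abs(K_approx - K_exact).max())
print("largest absolute kernel error:", max_err)
```

    Because `Z` has a fixed number of columns, a linear model trained on `Z` costs time linear in the number of samples, which is the usual motivation for using such features to speed up kernel methods.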
