Author Search Result

[Author] Yutong LU (4 hits)

  • A Configuration Management Study to Fast Massive Writing for Distributed NoSQL System

    Xianqiang BAO  Nong XIAO  Yutong LU  Zhiguang CHEN  

     
    PAPER-Data Engineering, Web Information Systems

    Publicized: 2016/06/20  Vol: E99-D No:9  Page(s): 2269-2282

    NoSQL systems have become vital components for delivering big data services due to their high horizontal scalability. However, existing NoSQL systems rely on experienced administrators to configure and tune the wide range of configurable parameters for optimized performance. In this work, we present a configuration management framework for NoSQL systems, called xConfig. With xConfig, users can first identify performance-sensitive parameters and capture the tuned parameters for different workloads as configuration policies. Next, based on the tuned policies, xConfig can be implemented as the corresponding configuration optimization system for a specific NoSQL system. It can also be used to analyze the range of configurable parameters that may impact the runtime performance of NoSQL systems. We implement a prototype called HConfig based on HBase, and the parameter tuning strategies for HConfig can generate tuned policies and enable HBase to run much more efficiently on both individual worker nodes and the entire cluster. The massive-writing-oriented evaluation results show that HBase under write-intensive policies outperforms both the default configuration and some existing configurations while offering significantly higher throughput.

  • Distributed and Scalable Directory Service in a Parallel File System

    Lixin WANG  Yutong LU  Wei ZHANG  Yan LEI  

     
    PAPER-Fundamentals of Information Systems

    Publicized: 2015/10/26  Vol: E99-D No:2  Page(s): 313-323

    One of the problems that the design of parallel file systems has to solve is the difficulty of handling the metadata-intensive I/O generated by parallel applications accessing a single large directory. We demonstrate a middleware design called SFS that supports existing parallel file systems with a distributed and scalable directory service. SFS distributes directory entries over data servers instead of metadata servers to offer increased scalability and performance. Firstly, SFS exploits adaptive directory partitioning based on extendible hashing to support concurrent and unsynchronized partition splitting. Secondly, SFS describes an optimization based on recursive split-ordering that emphasizes speeding up the splitting process. Thirdly, SFS applies a write-optimized index structure to convert slow, small, random metadata updates into fast, large, sequential writes. Finally, SFS gracefully tolerates stale mappings at the clients while maintaining the correctness and consistency of the system. Our performance results on a cluster of 32 servers show that our implementation can deliver more than 250,000 file creations per second on average.

  • RFS: An LSM-Tree-Based File System for Enhanced Microdata Performance

    Lixin WANG  Yutong LU  Wei ZHANG  Yan LEI  

     
    PAPER-Fundamentals of Information Systems

    Publicized: 2016/09/06  Vol: E99-D No:12  Page(s): 3035-3046

    File system workloads are increasingly write-heavy. The growing capacity of RAM in modern nodes allows many reads to be satisfied from memory, while writes must be persisted to disk. Today's sophisticated local file systems like Ext4, XFS and Btrfs optimize for reads but suffer under workloads dominated by microdata (including metadata and tiny files). In this paper we present an LSM-tree-based file system, RFS, which aims to take advantage of the write optimization of the LSM-tree to provide enhanced microdata performance, while offering matching performance for large files. RFS incrementally partitions the namespace into several metadata columns on a per-directory basis, preserving disk locality for directories and reducing the write amplification of LSM-trees. A write-ordered log-structured layout is used to store small files efficiently, rather than embedding the contents of small files into inodes. We also propose an optimization of global bloom filters for efficient point lookups. Experiments show that our library version of RFS can handle microwrite-intensive workloads 2-10 times faster than existing solutions such as Ext4, Btrfs and XFS.

  • An Efficient Method for Training Deep Learning Networks Distributed

    Chenxu WANG  Yutong LU  Zhiguang CHEN  Junnan LI  

     
    PAPER-Fundamentals of Information Systems

    Publicized: 2020/09/07  Vol: E103-D No:12  Page(s): 2444-2456

    Training deep learning (DL) networks is a computationally intensive process; as a result, training time can become so long that it impedes the development of DL. High performance computing clusters, especially supercomputers, are equipped with large amounts of computing and storage resources and efficient interconnects, which can train DL networks better and faster. In this paper, we propose a method to train DL networks in a distributed manner with high efficiency. First, we propose a hierarchical synchronous Stochastic Gradient Descent (SGD) strategy, which can make full use of hardware resources and greatly increase computational efficiency. Second, we present a two-level parameter synchronization scheme which can reduce communication overhead by transmitting parameters of the first-layer models in shared memory. Third, we optimize the parallel I/O by making each reader read data as contiguously as possible to avoid the high overhead of discontinuous data reading. Finally, we integrate the LARS algorithm into our system. The experimental results demonstrate that our approach has tremendous performance advantages relative to unoptimized methods. Compared with the native distributed strategy, our hierarchical synchronous SGD strategy (HSGD) can increase computing efficiency by about 20 times.
