Author Search Result

[Author] Susumu MATSUMAE(4hit)

1-4hit
  • Simulation Algorithms among Enhanced Mesh Models

    Susumu MATSUMAE  Nobuki TOKURA  

     
    PAPER-Algorithm and Computational Complexity

    Vol: E82-D  No: 10  Page(s): 1324-1337

    In this paper, we present simulation algorithms among enhanced mesh models. The enhanced mesh models considered here are the reconfigurable mesh and the mesh with multiple broadcasting. A reconfigurable mesh (RM) is a processor array that consists of processors arranged in a two-dimensional grid with a reconfigurable bus system. The bus system can be used to dynamically obtain various interconnection patterns among the processors during the execution of programs. A horizontal-vertical RM (HV-RM) is obtained from the general RM model by restricting the network topologies it can take to those in which each bus segment lies along a row or a column. A mesh with multiple broadcasting (MWMB) is an enhanced mesh that has an additional broadcasting bus on every row and column. We present two algorithms: 1) an algorithm that simulates an HV-RM of size n × n time-optimally in Θ(n) time on an MWMB of size n × n, and 2) an algorithm that simulates an RM of size n × n in Θ(log² n) time on an HV-RM of size n × n. Both algorithms use a constant amount of storage in each processor. Furthermore, we show that an RM of size n × n can be simulated in Θ((n/m)² log n log m) time on an HV-RM of size m × m, and in Θ((n/m)² m log n log m) time on an MWMB of size m × m (m < n). These simulations use Θ((n/m)²) storage in each processor, which is optimal.
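    The row-restricted bus segments described in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: the function name `row_broadcast`, the switch encoding `connect_east`, and the exclusive-write rule (at most one writer per bus segment) are assumptions made only for illustration. It models one bus step on a single row of an HV-RM, where each processor either joins its bus to its eastern neighbour or leaves the bus cut at that point.

```python
def row_broadcast(connect_east, writers):
    """Simulate one bus step on a single row of n processors.

    connect_east[i] is True when the bus between processor i and
    processor i + 1 is connected (so len(connect_east) == n - 1).
    writers maps a processor index to the value it writes on its bus.
    Returns the value each processor reads (None if its segment is idle).
    Assumption: exclusive write, i.e. at most one writer per segment.
    """
    n = len(connect_east) + 1
    result = [None] * n
    start = 0  # index of the first processor in the current segment
    for i in range(n):
        # A segment ends at the last processor or at an open switch.
        if i == n - 1 or not connect_east[i]:
            values = [writers[j] for j in range(start, i + 1) if j in writers]
            if len(values) > 1:
                raise ValueError("more than one writer on a bus segment")
            if values:
                for j in range(start, i + 1):
                    result[j] = values[0]
            start = i + 1
    return result


# Five processors; the bus is cut between processors 2 and 3.
readings = row_broadcast([True, True, False, True], {0: "a", 4: "b"})
```

    In this sketch, processors 0 to 2 share one segment and processors 3 and 4 share another, so a value written on a segment is read by every processor on that segment and by no one else.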

  • An Efficient Scaling-Simulation Algorithm of Reconfigurable Meshes by Meshes with Statically Partitioned Buses

    Susumu MATSUMAE  

     
    PAPER

    Vol: E88-D  No: 1  Page(s): 82-88

    This paper presents an efficient scaling-simulation algorithm that simulates operations of the reconfigurable mesh (RM) of size n × n using the mesh with multiple partitioned buses (MMPB) of size m × m (m < n). The RM and the MMPB are two-dimensional mesh-connected computers equipped with broadcasting buses. The broadcasting buses of the RM can be used to dynamically obtain various interconnection patterns among the processors during the execution of programs, while those of the MMPB are placed only along every row and column and are statically partitioned in advance into segments of a fixed length. We show that the RM of size n × n can be simulated in steps by the MMPB of size m × m (m < n), where L is the number of broadcasting buses in each row/column of the simulating MMPB. Although our algorithm is less time-efficient than the fastest RM scaling-simulation algorithm, its simulating model is the MMPB, in which bus reconfiguration is not allowed.

  • Scheduling for Gather Operation in Heterogeneous Parallel Computing Environments

    Fukuhito OOSHITA  Susumu MATSUMAE  Toshimitsu MASUZAWA  

     
    PAPER-Algorithms and Data Structures

    Vol: E86-A  No: 4  Page(s): 908-918

    A heterogeneous parallel computing environment consisting of different types of workstations and communication links plays an important role in parallel computing. In many applications on such a system, collective communication operations are commonly used as communication primitives. Thus, the design of efficient collective communication operations is key to achieving high-performance parallel computing, but the heterogeneity of the system complicates the design. In this paper, we consider the design of an efficient gather operation, one of the most important collective operations. We show that an optimal gather schedule can be found in O(n^(2k-1)) time for a heterogeneous parallel computing environment with n processors of k distinct types, and that a nearly optimal schedule can be found in O(n) time if k = 2.

  • Scheduling for Independent-Task Applications on Heterogeneous Parallel Computing Environments under the Unidirectional One-Port Model

    Fukuhito OOSHITA  Susumu MATSUMAE  Toshimitsu MASUZAWA  

     
    PAPER-Parallel and Distributed Computing

    Vol: E90-D  No: 2  Page(s): 403-417

    For the execution of computation-intensive applications, one of the most important paradigms is to divide the application into a large number of small independent tasks and execute them on heterogeneous parallel computing environments (abbreviated as HPCEs). In this paper, we aim to execute independent tasks efficiently on HPCEs. We consider the problem of finding a schedule that maximizes the throughput of task execution for a huge number of independent tasks. First, for HPCEs where the network forms a directed acyclic graph, we show that a schedule that attains the optimal throughput can be found in polynomial time. Second, for arbitrary HPCEs, we propose an (+ε)-approximation algorithm for any constant ε (ε > 0). In addition, we show that the framework of our approximation algorithm can be applied to other collective communications such as the gather operation.
