Yukinori SATO Ken-ichi SUZUKI Tadao NAKAMURA
High power consumption and slow access of enlarged, multiported register files make it difficult to design high-performance superscalar processors. The clustered architecture, in which the conventional monolithic register file is partitioned into several smaller register files, is expected to overcome these register file issues. In the clustered architecture, the more finely a monolithic register file is partitioned, the lower the power and the faster the access of the resulting register files. However, partitioning causes losses of IPC (instructions per clock cycle) due to communication among register files. Therefore, the degree of partitioning has a strong impact on the trade-off between power consumption and performance. In addition, the organization of the partitioned register files also affects the trade-off. In this paper, we investigate appropriate degrees of partitioning and organizations of partitioned register files in a clustered architecture to assess the trade-off. From the results of execution-driven simulation, we find that the organization of register files and the degree of partitioning have a strong impact on the IPC, and that the configuration with non-consistent register files can make use of the partitioned resources more effectively. From the results of register file access time and energy modeling, we find that configurations with the highly partitioned non-consistent register file organization benefit from the partitioning in terms of operating frequency and access energy of register files. Further, we examine the relationship between IPS (instructions per second) and the product of IPC and operating frequency of register files. The results suggest that highly partitioned non-consistent configurations tend to gain more advantage in performance and power.
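The relationship between IPS, IPC, and operating frequency mentioned in this abstract can be illustrated with a minimal Python sketch. The function name, the configuration labels, and every numeric value below are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def instructions_per_second(ipc: float, frequency_hz: float) -> float:
    """Estimate IPS as the product of IPC and operating frequency."""
    return ipc * frequency_hz


# Hypothetical configurations (illustrative values, not measured data):
# partitioning lowers IPC slightly but permits a higher clock frequency.
configs = {
    "monolithic": {"ipc": 2.0, "frequency_hz": 1.0e9},
    "partitioned": {"ipc": 1.8, "frequency_hz": 1.5e9},
}

ips = {name: instructions_per_second(**cfg) for name, cfg in configs.items()}
```

With these assumed numbers, the partitioned configuration loses IPC but still comes out ahead in IPS, which is the kind of trade-off the abstract describes.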
Yu NAKAYAMA Ken-Ichi SUZUKI Jun TERADA Akihiro OTAKA
Ring aggregation networks are widely employed for metro access networks. A layer-2 ring with Ethernet Ring Protection is a popular topology for carrier services. Since frames are forwarded along ring nodes, a fairness scheme is required to achieve throughput fairness. Although per-node fairness algorithms have been developed for the Resilient Packet Ring, per-node fairness is insufficient if the flow distribution is biased. To achieve per-flow fairness, N Rate N+1 Color Marking (NRN+1CM) was proposed. However, NRN+1CM can achieve fairness only when a sufficient number of bits is available in the frame header, and it cannot be employed if the frame header cannot be overwritten. Therefore, the application range of NRN+1CM is limited. This paper proposes a Signaling based Discard with Flags (SDF) scheme for per-flow fairness. The objective of SDF is to eliminate this drawback of NRN+1CM. The key idea is to attach a flag to frames according to the input rate and to discard them selectively based on the flags and a dropping threshold. The flag is removed before the frame is transmitted to another node. The dropping threshold is cyclically updated by signaling between ring nodes and a master node. The performance of SDF was confirmed through theoretical analysis and computer simulations, and it was comparable to that of NRN+1CM. It was verified that SDF can achieve per-flow throughput fairness without using a frame header in ring aggregation networks.
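The flag-and-threshold idea in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the SDF algorithm as specified in the paper: the byte-band flagging rule, the names `assign_flag` and `forward`, and all parameter values are hypothetical, and the cyclic threshold update through signaling with the master node is not modeled.

```python
def assign_flag(bytes_so_far: int, band_bytes: int, max_flag: int) -> int:
    """Assumed rule: the flag is the index of the rate band a frame falls in,
    based on how many bytes its flow has already offered in this interval."""
    return min(bytes_so_far // band_bytes, max_flag)


def forward(frames, band_bytes: int, max_flag: int, threshold: int):
    """frames: list of (flow_id, size_bytes) arriving in one interval.
    Returns the frames that leave the node."""
    offered = {}  # bytes offered per flow so far (input rate proxy)
    out = []
    for flow_id, size in frames:
        flag = assign_flag(offered.get(flow_id, 0), band_bytes, max_flag)
        offered[flow_id] = offered.get(flow_id, 0) + size
        if flag <= threshold:
            # The flag is local to the node: only the frame itself is sent on.
            out.append((flow_id, size))
    return out


# Hypothetical traffic: flow "A" offers four frames, flow "B" offers one.
arrivals = [("A", 100), ("A", 100), ("A", 100), ("A", 100), ("B", 100)]
sent = forward(arrivals, band_bytes=100, max_flag=3, threshold=1)
```

In this sketch the heavier flow "A" has its excess frames dropped while the lighter flow "B" passes untouched, and no flag is ever written into a forwarded frame.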
Masamichi FUJIWARA Ken-Ichi SUZUKI Naoto YOSHIMOTO
Multi-stage splitter configurations are often utilized in passive optical network (PON) systems to effectively accommodate widely dispersed users. This paper introduces two more effective user accommodation approaches that place bidirectional optical amplifiers in several branches of the splitter inside the central office (CO); this allows a single optical line terminal (OLT) to support the coexistence of normal- and extended-distance areas and to be shared by large numbers of optical network units (ONUs). To ease the issue of amplified spontaneous emission (ASE) noise, which is inherent in these system configurations, we propose using a semiconductor optical amplifier (SOA)-based burst-mode optical amplifier with a fast automatic level control (ALC) circuit for upstream amplification.
Ken-ichi SUZUKI Yoshiyuki KAERIYAMA Kazuhiko KOMATSU Ryusuke EGAWA Nobuyuki OHBA Hiroaki KOBAYASHI
Ray tracing is one of the most popular techniques for generating photo-realistic images. Extensive research and development work has made interactive rendering of static scenes feasible. This paper deals with interactive dynamic scene rendering, in which not only the eye point but also the objects in the scene change their 3D locations every frame. To realize interactive dynamic scene rendering, we introduce RTRPS (Ray Tracing based on Ray Plane and Bounding Sphere), which utilizes the coherency in rays, objects, and grouped rays. RTRPS uses bounding spheres as the spatial data structure, which exploits the coherency in objects. By using bounding spheres, RTRPS can ignore the rotation of moving objects within a sphere and shorten the update time between frames. RTRPS utilizes the coherency in rays by merging rays into a ray-plane, assuming that the secondary rays and shadow rays are shot through an aligned grid. Since a pair of ray-planes shares an original ray, the intersection for the ray can be completed using the coherency in the ray-planes. Because of these three kinds of coherency, RTRPS can significantly reduce the number of intersection tests for ray tracing. Further acceleration techniques for ray-plane-sphere and ray-triangle intersection are also presented. A parallel projection technique converts a 3D vector inner product operation into a 2D operation and reduces the number of floating-point operations. Techniques based on frustum culling and binary-tree structured ray-planes optimize the order of intersection tests between ray-planes and a sphere, resulting in a 50% to 90% reduction in intersection tests. Two ray-triangle intersection techniques are also introduced, which are effective when a large number of rays are packed into a ray-plane. Our performance evaluations indicate that RTRPS achieves a 13- to 392-fold speedup in comparison with a ray tracing algorithm without organized rays and spheres.
We also found that RTRPS provides competitive performance even when only primary rays are used.
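For readers unfamiliar with bounding-sphere tests, the following minimal Python sketch shows the standard intersection test between a single ray and a sphere. It is the textbook per-ray test, not the ray-plane-sphere technique of RTRPS; the function names and the example coordinates are assumptions made for illustration.

```python
import math


def dot(a, b):
    """Inner product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def ray_hits_sphere(origin, direction, center, radius):
    """Return True if the ray origin + t * direction (t >= 0) touches the sphere."""
    norm = math.sqrt(dot(direction, direction))
    d = tuple(x / norm for x in direction)           # unit direction
    oc = tuple(c - o for c, o in zip(center, origin))
    dist2 = dot(oc, oc)
    if dist2 <= radius * radius:
        return True                                  # origin is inside the sphere
    t = dot(oc, d)                                   # projection of center onto the ray
    if t < 0:
        return False                                 # sphere lies behind the origin
    # Squared distance from the sphere center to the ray line.
    return dist2 - t * t <= radius * radius


hit_front = ray_hits_sphere((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0)
miss_side = ray_hits_sphere((0, 0, 0), (0, 0, 1), (3, 0, 5), 1.0)
miss_behind = ray_hits_sphere((0, 0, 0), (0, 0, 1), (0, 0, -5), 1.0)
```

Because the test depends only on the sphere's center and radius, an object rotating inside its bounding sphere leaves the result unchanged, which is consistent with the abstract's remark that rotation can be ignored.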
Shingo KAWAI Katsumi IWATSUKI Ken-ichi SUZUKI Shigendo NISHI Masatoshi SARUWATARI
Timing jitter reduction with differently shaped optical bandpass filters is discussed, and the transmission distance achievable in the presence of timing jitter is evaluated for soliton transmission at several tens of Gb/s using optical bandpass filters. Timing jitter reduction with optical bandpass filters is confirmed experimentally in 10 Gb/s optical soliton recirculating loop experiments by measuring the timing jitter and the bit error rates.