This paper presents M-channel (M = 2^n, n ∈ N) integer discrete cosine transforms (IntDCTs) based on the fast Hartley transform (FHT) for lossy-to-lossless image coding, which offers image quality scalability from lossy to lossless data. Many IntDCTs with lifting structures have already been presented to achieve lossy-to-lossless image coding. Recently, an IntDCT based on direct lifting of DCT/IDCT, i.e., the direct use of the DCT and inverse DCT (IDCT) in lifting blocks, has been proposed. Although this IntDCT achieves more efficient coding performance than any conventional IntDCT, it entails a high computational cost due to the extra information that is the key to realizing its direct-lifting structure. On the other hand, almost all conventional IntDCTs without extra information cannot easily be extended to sizes larger than the standard size M = 8, and those that do realize an arbitrary size still need improvement for efficient coding performance. The proposed IntDCT requires no extra information, can be applied to size M = 2^n for arbitrary n, and shows better coding performance than conventional IntDCTs without extra information by applying direct lifting to the pre- and post-processing blocks of the DCT. Moreover, the proposed IntDCT is implemented with half the computational cost of the IntDCT based on direct lifting of DCT/IDCT, even though the latter shows the best coding performance.
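The exact invertibility that makes an IntDCT lossless comes from rounding inside lifting steps, not from the particular transform. As a minimal sketch of that mechanism (a generic two-point lifting pair, not the paper's FHT-based construction), the rounded forward steps below are undone exactly by applying the same rounded values in reverse order:

```python
import math

def lifting_forward(x0, x1):
    # Rounded predict/update pair: rounding loses precision locally, but
    # the identical rounded value can always be subtracted back, so the
    # map is exactly invertible on integers.
    x1 = x1 + math.floor(0.5 * x0 + 0.5)   # predict step
    x0 = x0 - math.floor(0.25 * x1 + 0.5)  # update step
    return x0, x1

def lifting_inverse(x0, x1):
    # Undo the steps in reverse order with the same rounded values.
    x0 = x0 + math.floor(0.25 * x1 + 0.5)
    x1 = x1 - math.floor(0.5 * x0 + 0.5)
    return x0, x1

a, b = 37, -12
print(lifting_inverse(*lifting_forward(a, b)) == (a, b))  # True: lossless
```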
Motoyuki SUZUKI Shozo MAKINO Akinori ITO Hirotomo ASO Hiroshi SHIMODAIRA
Many methods have been proposed for constructing context-dependent phoneme models using Hidden Markov Models (HMMs) to improve performance. These conventional methods require previously defined contextual factors; if these factors are deficient, the methods exhibit poor recognition performance. In this paper, we propose a new construction algorithm for the HMnet which does not require pre-defined contextual factors. Experiments demonstrated that the new algorithm could construct the HMnet even in cases where the Successive State Splitting (SSS) algorithm could not, and that it produced better phoneme recognition performance than the SSS algorithm.
Iterative patterns can be described by the component pattern of the iteration, whose shape usually also determines how copies are connected in the iteration. Based on the parallel translation group of each iterative pattern, it is shown that the shape of the component pattern can be taken to be a rectangle for any iterative pattern; this rectangle is called a tile and is related to a basis of the parallel translation group. It is also shown that the component pattern can be identified by a "tessera," which is defined as a non-degenerate colored tile. Thus a tessera can be regarded as a compact description of an iterative pattern. Some properties of tesseras and tiles are discussed. The number of tesseras for a given tile is evaluated, which enumerates the number of distinct iterative patterns for each shape of component pattern.
Hiroki MORI Hirotomo ASO Shozo MAKINO
A new postprocessing method for Japanese documents using an interpolated n-gram model is proposed. The method has advantages over conventional approaches in enabling high-speed, knowledge-free processing. When estimating the parameters of an n-gram model over a large vocabulary, it is difficult to obtain sufficient training samples. To overcome this scarcity of samples, two smoothing methods for a Japanese character trigram model are evaluated, and the superiority of the deleted interpolation method is shown in terms of perplexity. A document recognition system based on the trigram model is constructed, which finds maximum-likelihood solutions with the Viterbi algorithm. Experimental results for three kinds of documents show that performance is highest when the deleted interpolation method is used for smoothing: 90% of OCR errors are corrected for documents similar to the training text data, and 75% for documents less similar to the training text data.
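As a rough illustration of the smoothing being compared (not the paper's estimation procedure: the interpolation weights below are placeholders, whereas deleted interpolation estimates them on held-out data), an interpolated character trigram probability mixes trigram, bigram, and unigram relative frequencies:

```python
from collections import Counter

def train_counts(text):
    # Raw uni/bi/trigram counts from a training character sequence.
    uni = Counter(text)
    bi = Counter(zip(text, text[1:]))
    tri = Counter(zip(text, text[1:], text[2:]))
    return uni, bi, tri

def interp_trigram_prob(c1, c2, c3, uni, bi, tri, lambdas=(0.6, 0.3, 0.1)):
    # P(c3 | c1 c2) smoothed by linear interpolation; unseen higher-order
    # contexts fall back gracefully onto the lower-order statistics.
    n = sum(uni.values())
    p3 = tri[(c1, c2, c3)] / bi[(c1, c2)] if bi[(c1, c2)] else 0.0
    p2 = bi[(c2, c3)] / uni[c2] if uni[c2] else 0.0
    p1 = uni[c3] / n
    l3, l2, l1 = lambdas
    return l3 * p3 + l2 * p2 + l1 * p1

uni, bi, tri = train_counts("abcabdabcabe")
print(interp_trigram_prob("a", "b", "c", uni, bi, tri))
```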
Ruiqi GUO Shinichiro OMACHI Hirotomo ASO
Segmenting a shape into parts is an important problem in shape representation and analysis. In this paper we propose a novel framework for shape segmentation using deformation models learned from multiple shapes. The deformation model from the target image to every other image is estimated. Finally, normalized-cut graph partitioning is applied to a graph constructed from the similarity of local patches in the target image, and a segmentation of the shape is obtained. Experimental results on images from the MPEG7 shape database show the effectiveness of the proposed method.
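Only the final step of the pipeline is easy to sketch compactly. Assuming an affinity matrix W over local patches has already been built (the learned deformation models that produce it are omitted), a two-way normalized cut can be computed from the second generalized eigenvector of (D - W)y = λDy, as in Shi and Malik's formulation:

```python
import numpy as np

def normalized_cut(W):
    # Two-way normalized cut: threshold the second-smallest generalized
    # eigenvector of (D - W) y = lambda * D y at zero, computed via the
    # symmetrically normalized Laplacian.
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = D_inv_sqrt @ (np.diag(d) - W) @ D_inv_sqrt
    vals, vecs = np.linalg.eigh(L_sym)
    y = D_inv_sqrt @ vecs[:, 1]   # back-transform to the generalized problem
    return y >= 0                  # boolean partition labels

# Toy affinity matrix: two loosely connected cliques of "patches".
W = np.array([[0, 1, 1, .05, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 0, .05, 0],
              [.05, 0, 0, 0, 1, 1],
              [0, 0, .05, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], float)
print(normalized_cut(W))  # e.g. [ True  True  True False False False]
```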
A parallel overlapping preconditioner is applied to the ICCG method, and the effect of the parallel preconditioning on convergence is investigated by solving large-scale block tridiagonal linear systems arising from the discretization of Poisson's equation. Compared with the original ICCG method, the parallel preconditioned ICCG method can solve the problems with high parallelism at the cost of only a slight increase in the number of iterations. Furthermore, the speedup and efficiency of the parallel preconditioned ICCG method are evaluated by substituting the experimental results into complexity formulae. For example, when the simulation domain is discretized on a 250×250 rectangular grid and the preconditioner is divided into 249 smaller ones, the speedup is 146.3 with an efficiency of 0.59.
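For orientation, the surrounding iteration that any preconditioner plugs into is the standard preconditioned conjugate gradient loop. The sketch below uses a simple Jacobi stand-in for M_solve; in ICCG this step applies the incomplete Cholesky factors, and a parallel overlapping preconditioner would replace it with independent solves on overlapping subdomains:

```python
import numpy as np

def pcg(A, b, M_solve, tol=1e-8, max_iter=1000):
    # Generic preconditioned conjugate gradient: the preconditioner enters
    # only through M_solve(r), which approximates A^{-1} r.
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_solve(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_solve(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# 1-D Poisson matrix as a small stand-in for the block tridiagonal systems.
n = 50
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = pcg(A, b, M_solve=lambda r: r / np.diag(A))  # Jacobi stand-in
print(np.linalg.norm(A @ x - b))                 # small residual
```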
Shinichiro OMACHI Shunichi MEGAWA Hirotomo ASO
A practical optical character reader is required to deal with not only common fonts but also elaborately designed fonts. However, recognizing various kinds of decorative character images is still a challenging problem in the field of document image analysis. Since the appearances of such decorative characters are complicated, most general character recognition systems cannot perform well on them. In this paper, an algorithm that recognizes decorative characters by structural analysis using a graph-matching technique is proposed. Character structure is extracted using topographical features of multi-scale images, and the extracted structure is represented by a graph. A character image is recognized by matching the graphs of the input and standard patterns. Experimental results show the effectiveness of the proposed algorithm.
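The matching step can be illustrated in a much-reduced form. The sketch below scores two graphs only by an optimal one-to-one node assignment over hypothetical node feature vectors (Hungarian method); the paper's structural matching is richer, since it also exploits the graph topology:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_graphs(feat_a, feat_b):
    # Crude graph-matching score: minimum-cost one-to-one node assignment
    # by feature distance. A real structural matcher would also penalize
    # edges that disagree between assigned node pairs.
    cost = np.linalg.norm(feat_a[:, None, :] - feat_b[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum()

# Hypothetical node features (e.g. position + a topographical label).
input_nodes = np.array([[0.10, 0.20, 1.0], [0.80, 0.30, 2.0], [0.50, 0.90, 1.0]])
template_nodes = np.array([[0.12, 0.18, 1.0], [0.79, 0.33, 2.0], [0.48, 0.88, 1.0]])
print(match_graphs(input_nodes, template_nodes))  # low score = good match
```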
Masato SUZUKI Nei KATO Hirotomo ASO Yoshiaki NEMOTO
In the recognition of handprinted characters, it is important to resolve distortions of characters caused by writers' habits. Many image conversion methods have been proposed to remove such distortions and obtain better features, but some distortions cannot be removed by these methods; one example is the case of parallel strokes that spread out in a fan shape. In this paper, we propose a new image conversion method, "Transformation based on Partial Inclination Detection (TPID)," which is applied just before normalization and is intended to remove several kinds of distortions in each character image. TPID constructs transformation functions from inclination angles detected in subregions of the character image, and converts the image using these functions. TPID is especially suitable for correcting the inclinations of the horizontal and vertical strokes of a character, which has a powerful impact on the quality of the extracted features. In recognition experiments using ETL9B, the largest database of handprinted characters in Japan, we obtained a recognition rate of 99.08%, the best reported to our knowledge.
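A heavily simplified illustration of the idea (not TPID itself): if an inclination angle has been detected for each horizontal band of the image, each band can be sheared back by its own angle. The angle detection step, which is the core of TPID, is omitted here:

```python
import numpy as np

def shear_correct(img, angles, band_edges):
    # Piecewise shear in the spirit of TPID: each horizontal band is
    # sheared by its own detected inclination angle (inverse mapping,
    # nearest-neighbour sampling, zero padding at the borders).
    h, w = img.shape
    out = np.zeros_like(img)
    for (top, bottom), theta in zip(band_edges, angles):
        for y in range(top, bottom):
            shift = int(round(np.tan(theta) * (y - h / 2)))
            for x in range(w):
                xs = x + shift
                if 0 <= xs < w:
                    out[y, x] = img[y, xs]
    return out

img = np.zeros((8, 8), int)
img[:, 3] = 1                                   # a vertical stroke
print(shear_correct(img, [0.2, -0.2], [(0, 4), (4, 8)]))
```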
A large number of techniques have been proposed for accelerating the Hough Transform, because the transformation is in general computationally very expensive. It is known that the sampling interval in parameter space strongly affects the computation cost, and the precision of the transformation and the processing speed are in a trade-off relationship. No fair comparison of processing speed between methods has been made in previous work, because no criterion had been given for the sampling interval of a parameter and the precision of the parameter was not equal across methods. We first derive the relationship between the sampling interval and the precision of a parameter. We then derive a framework for comparing computation cost under equal precision, regarding the total number of sampling points of a parameter as the computation cost. We define the transformation error of the Hough Transform and regard this error as transformation noise. We also propose a design method called "Noise-level Shaping," by which the transformation noise can be set to an arbitrary level; the noise level can be varied according to the value of a parameter. Noise-level Shaping makes it possible to find an efficient parameterization and an efficient sampling interval for a specific application of the Hough Transform.
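The quantities being traded off are easy to see in a plain (ρ, θ) line Hough Transform where the sampling intervals are explicit arguments; the accumulator size they induce is the computation cost discussed above. A minimal sketch:

```python
import numpy as np

def hough_lines(points, d_theta, d_rho, rho_max):
    # Standard line Hough Transform: each point votes for the (theta, rho)
    # cells of lines through it. Finer d_theta/d_rho means more precision
    # and proportionally more cells to fill, i.e. more computation.
    thetas = np.arange(0.0, np.pi, d_theta)
    n_rho = int(2 * rho_max / d_rho) + 1
    acc = np.zeros((len(thetas), n_rho), int)
    for x, y in points:
        for i, t in enumerate(thetas):
            rho = x * np.cos(t) + y * np.sin(t)
            acc[i, int(round((rho + rho_max) / d_rho))] += 1
    return acc, thetas

# Points on the line y = x; the accumulator peaks near theta = 135 deg, rho = 0.
pts = [(k, k) for k in range(10)]
acc, thetas = hough_lines(pts, d_theta=np.pi / 180, d_rho=1.0, rho_max=20.0)
i, j = np.unravel_index(acc.argmax(), acc.shape)
print(np.degrees(thetas[i]), j)
```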
Shinhaeng LEE Shin'ichiro OMACHI Hirotomo ASO
Linear programming techniques are useful in many diverse applications, such as production planning and energy distribution. Finding an optimal solution of a linear programming problem requires repeated computations and takes a lot of processing time, so special-purpose hardware for high-speed linear programming has been sought. This paper proposes a systolic array for solving linear programming problems using the revised simplex method, a typical algorithm of linear programming. A modified systolic array that can solve very large linear programming problems is also proposed.
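For reference, one iteration of the revised simplex method consists of the matrix-vector products that a systolic array can pipeline: simplex multipliers, reduced costs, the entering column, and a ratio test. A compact software sketch (using explicit basis-matrix inversion for clarity, rather than the product-form updates a hardware design would use):

```python
import numpy as np

def revised_simplex(A, b, c, basis):
    # Textbook revised simplex (min c'x, Ax = b, x >= 0) from a given
    # feasible basis; each loop body is one pivot of the method.
    m, n = A.shape
    while True:
        B_inv = np.linalg.inv(A[:, basis])
        x_B = B_inv @ b
        y = c[basis] @ B_inv                 # simplex multipliers
        reduced = c - y @ A                  # reduced costs
        if np.all(reduced >= -1e-9):
            x = np.zeros(n)
            x[basis] = x_B
            return x
        j = int(np.argmin(reduced))          # entering variable
        d = B_inv @ A[:, j]
        if np.all(d <= 1e-9):
            raise ValueError("unbounded")
        ratios = np.full(m, np.inf)          # ratio test
        mask = d > 1e-9
        ratios[mask] = x_B[mask] / d[mask]
        basis[int(np.argmin(ratios))] = j    # leaving variable swap

# min -x0 - 2*x1  s.t.  x0 + x1 <= 4,  x1 <= 3  (slacks x2, x3 as basis)
A = np.array([[1.0, 1.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]])
b = np.array([4.0, 3.0])
c = np.array([-1.0, -2.0, 0.0, 0.0])
print(revised_simplex(A, b, c, [2, 3]))      # -> [1. 3. 0. 0.]
```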
Many numerical simulation problems of natural phenomena are formulated as large tridiagonal or block tridiagonal linear systems. In this paper, an efficient parallel algorithm for solving a tridiagonal linear system is proposed. The algorithm, named the bi-recurrence algorithm, has inherent parallelism that is suitable for parallel processing. Its time complexity is 8N - 4 for a tridiagonal linear system of order N, only slightly more than that of the Gaussian elimination algorithm. For a parallel implementation with two processors, the time complexity is 4N - 1. Based on the bi-recurrence algorithm, a VLSI-oriented tridiagonal solver is designed with the architecture of a 1-D linear systolic array with three processing cells. The systolic tridiagonal solver finds the solution of a tridiagonal linear system in 3N + 6 time units. A highly parallel systolic tridiagonal solver is also presented, characterized by the high parallelism that originates in its divide-and-conquer strategy and the high cost performance that originates in its systolic architecture. This solver finds the solution in 10(N/p) + 6p + 23 time units, where p is the number of partitions of the system.
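For comparison, the sequential baseline mentioned above (Gaussian elimination specialized to tridiagonal systems, i.e., the Thomas algorithm) looks as follows; the paper's bi-recurrence algorithm reorganizes this work into two recurrences that can run on two processors:

```python
def solve_tridiagonal(a, b, c, d):
    # Thomas algorithm for a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i]:
    # one forward elimination sweep, one back-substitution sweep.
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                    # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):           # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# -x[i-1] + 2x[i] - x[i+1] = 1 with zero boundaries (1-D Poisson).
n = 5
print(solve_tridiagonal([-1.0] * n, [2.0] * n, [-1.0] * n, [1.0] * n))
```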
Fang SUN Shin'ichiro OMACHI Hirotomo ASO
In this paper, a new algorithm for selecting candidates for handwritten character recognition is presented. Since we adopt the concept of the marginal radius to examine the confidence of candidates, the evaluation function must describe the pattern distribution correctly. For this reason, we propose the Simplified Mahalanobis distance and observe its behavior by simulation. In the proposed algorithm, for each character, two types of feature regions (multi-dimensional and one-dimensional) are first estimated statistically from training samples. Then, by referring to these feature regions, candidates are selected and verified. Using two types of feature regions is a principal characteristic of our method. If the parameters are estimated accurately, the multi-dimensional feature region is extremely effective for character recognition; in general, however, estimation errors occur in the parameters, especially with a small number of sample patterns. Although the recognition ability of the one-dimensional feature region is not as high, it expresses the distribution comparatively precisely in a one-dimensional space. By combining the two types of feature regions, they work concurrently to overcome each other's defects. The effectiveness of the method is shown by experimental results.
Shinichiro OMACHI Masako OMACHI Hirotomo ASO
In statistical pattern recognition, it is important to estimate the distribution of patterns precisely in order to achieve high recognition accuracy. In general, precise estimation of the parameters of the distribution requires a great number of sample patterns, especially when the feature vector obtained from the pattern is high-dimensional. For some pattern recognition problems, such as face recognition or character recognition, very high-dimensional feature vectors are necessary, and there are never enough sample patterns for estimating the parameters. In this paper, we focus on estimating the distribution of high-dimensional feature vectors from a small number of sample patterns. First, we define a function called the simplified quadratic discriminant function (SQDF). SQDF can be estimated from a small number of sample patterns and approximates the quadratic discriminant function (QDF); it has fewer parameters and requires less computation time than QDF. The effectiveness of SQDF is confirmed by three types of experiments. Next, as an application of SQDF, we propose an algorithm for estimating the parameters of a normal mixture. The proposed algorithm is applied to face recognition and character recognition problems, which require high-dimensional feature vectors.
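The general idea of such a simplification can be sketched as follows (a hypothetical variant, not necessarily the paper's exact SQDF definition): keep the k dominant eigenvalues and eigenvectors of the sample covariance, which are relatively reliable, and replace the remaining eigenvalues, which are poorly estimated from few samples, by a single constant:

```python
import numpy as np

def sqdf_like(x, mean, cov, k, d_const=None):
    # QDF-style score using only k + 2 spectral parameters: the k largest
    # eigenpairs plus one constant for the residual subspace. (Illustrative
    # simplification; the paper's SQDF may differ in detail.)
    vals, vecs = np.linalg.eigh(cov)
    vals, vecs = vals[::-1], vecs[:, ::-1]        # descending eigenvalues
    lam = vals[:k]
    const = d_const if d_const is not None else vals[k:].mean()
    diff = x - mean
    proj = vecs[:, :k].T @ diff
    maj = np.sum(proj**2 / lam)                   # dominant subspace term
    res = (diff @ diff - proj @ proj) / const     # residual subspace term
    logdet = np.sum(np.log(lam)) + (len(x) - k) * np.log(const)
    return maj + res + logdet

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10)) * np.linspace(3, 0.1, 10)  # few samples, 10-D
mu, cov = X.mean(axis=0), np.cov(X.T)
print(sqdf_like(X[0], mu, cov, k=3))
```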
Screen patterns used in offset-printed documents have been one of the major obstacles in developing document recognition systems that handle color documents. This paper proposes a selective smoothing method for filtering out screen patterns/noise in high-resolution color document images. Experimental results show that the method yields significant improvements in character pattern extraction.
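A minimal stand-in for such a filter (the paper's actual selection rule is not reproduced here) gates a mean filter on local variance: low-variance neighbourhoods, where screen texture dominates, are smoothed, while high-contrast neighbourhoods containing character strokes are left untouched:

```python
import numpy as np

def selective_smooth(img, var_thresh, radius=1):
    # Variance-gated mean filter: smooth a pixel only if its neighbourhood
    # variance is below the threshold, so stroke edges survive while the
    # fine screen texture in flat regions is averaged away.
    h, w = img.shape
    out = img.astype(float).copy()
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            win = img[y - radius:y + radius + 1, x - radius:x + radius + 1]
            if win.var() < var_thresh:
                out[y, x] = win.mean()
    return out

rng = np.random.default_rng(1)
page = np.full((16, 16), 200.0) + rng.normal(0, 5, (16, 16))  # screened bg
page[4:12, 7] = 0.0                                           # a stroke
print(selective_smooth(page, var_thresh=60.0).round(0))
```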