Byeoung-su KIM Cho-il LEE Seong-hwan JU Whoi-Yul KIM
3D display systems without glasses are preferred because of the inconvenience of wearing special glasses while viewing 3D content. In general, non-glass type 3D displays work by sending the left and right views of the content to the corresponding eyes depending on the user's position with respect to the display. Accurate user position estimation is therefore a critical task for non-glass type 3D displays; however, most existing systems require additional hardware or suffer from low accuracy. In this paper, an accurate user position estimation method using a single camera for non-glass type 3D displays is proposed. As the inter-pupillary distance is utilized for the estimation, the face is first detected and then tracked using an Active Appearance Model. The face pose is then estimated to compensate for pose variations. To estimate the user position, a simple perspective mapping function that uses the average inter-pupillary distance is applied. For higher accuracy, the personal inter-pupillary distance can also be used. Experimental results show that the proposed method successfully estimates the user position using a single camera, with an average position estimation error small enough for viewing 3D content.
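The perspective mapping idea above can be sketched with a pinhole-camera model: because the real inter-pupillary distance (IPD) is roughly constant across adults (about 63 mm on average), the viewer's distance follows from the pixel distance between the detected pupils. A minimal sketch, assuming an ideal pinhole camera; the focal length, principal point, and IPD values are illustrative, not the paper's calibrated parameters:

```python
import math

AVG_IPD_MM = 63.0  # population-average inter-pupillary distance (assumption)

def estimate_distance_mm(pupil_px_dist, focal_px, ipd_mm=AVG_IPD_MM):
    """Pinhole-camera depth estimate: Z = f * IPD / d_px."""
    return focal_px * ipd_mm / pupil_px_dist

def estimate_position_mm(left_pupil, right_pupil, focal_px, cx, cy,
                         ipd_mm=AVG_IPD_MM):
    """Estimate the 3D user position (in camera coordinates, mm) from
    the two pupil centers given in pixel coordinates.

    (cx, cy) is the principal point; focal_px is the focal length in pixels.
    """
    d_px = math.hypot(right_pupil[0] - left_pupil[0],
                      right_pupil[1] - left_pupil[1])
    z = estimate_distance_mm(d_px, focal_px, ipd_mm)
    # back-project the midpoint between the eyes to camera coordinates
    mx = (left_pupil[0] + right_pupil[0]) / 2.0
    my = (left_pupil[1] + right_pupil[1]) / 2.0
    x = (mx - cx) * z / focal_px
    y = (my - cy) * z / focal_px
    return x, y, z
```

Substituting a measured personal IPD for `AVG_IPD_MM` is what the abstract describes as the higher-accuracy variant.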
Determining the rotation angle between two images is essential when comparing images that may include rotational variation. While there are three representative methods that utilize the phases of Zernike moments (ZMs) to estimate rotation angles, very little work has been done to compare their performances. In this paper, we compare the performances of these three methods and propose a new method based on the angular radial transform (ART). Our method extends Revaud et al.'s method [1] and uses the phases of ART coefficients instead of ZMs. Using the MPEG-7 shape dataset, we show that our proposed method outperforms the ZM-based methods in terms of both computation time and root mean square error vs. coverage.
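The phase-based recovery that these methods share can be illustrated as follows: when an image is rotated by θ, a moment of angular order m (ZM or ART alike) is multiplied by a phase factor e^{jmθ}, so θ can be recovered from the phase difference of corresponding coefficients. A hypothetical sketch of that principle, not Revaud et al.'s exact estimator; the sign convention of the phase factor varies in the literature:

```python
import cmath
import math

def rotation_from_phases(coef_ref, coef_rot, m):
    """Recover a rotation angle from the phase difference between two
    corresponding moments of angular order m.

    Assumes the convention coef_rot = coef_ref * exp(j*m*theta); the
    result is inherently ambiguous up to 2*pi/m, so the principal
    value in [0, 2*pi/m) is returned.
    """
    dphi = cmath.phase(coef_rot) - cmath.phase(coef_ref)
    return (dphi / m) % (2 * math.pi / m)
```

In practice such estimators combine several orders m to resolve the 2π/m ambiguity and average out noise.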
Shape is one of the primary low-level image features in content-based image retrieval. In this paper we propose a new shape description method based on a rotationally invariant angular radial transform descriptor (IARTD). The IARTD is a feature vector that combines the magnitudes and aligned phases of the angular radial transform (ART) coefficients. A phase correction scheme is employed to produce the aligned phases so that the IARTD is invariant to rotation. The distance between two IARTDs is defined by combining the differences in magnitudes and aligned phases. In an experiment using the MPEG-7 shape dataset, the proposed method outperforms existing methods; the average bull's eye performance (BEP) of the proposed method is 57.69%, while the average BEPs of the invariant Zernike moments descriptor and the traditional ART are 41.64% and 36.51%, respectively.
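The phase-alignment idea can be sketched: if a rotation by θ multiplies the order-m ART coefficient by e^{jmθ}, then subtracting m times the phase of a fixed reference coefficient of order 1 cancels θ from every phase, leaving a vector of magnitudes and aligned phases that is unchanged under rotation. A minimal sketch under that assumed convention, not the paper's exact correction scheme:

```python
import cmath
import math

def aligned_descriptor(coeffs):
    """coeffs: dict mapping (n, m) -> complex ART coefficient.

    Returns a dict (n, m) -> (magnitude, aligned phase), where each
    phase is aligned using the phase of the order-1 coefficient (0, 1)
    as the reference, making the result rotation-invariant.
    """
    ref_phase = cmath.phase(coeffs[(0, 1)])
    out = {}
    for (n, m), c in coeffs.items():
        phase = (cmath.phase(c) - m * ref_phase) % (2 * math.pi)
        out[(n, m)] = (abs(c), phase)
    return out
```

A matching distance would then combine magnitude differences with circular (mod 2π) phase differences, as the abstract describes.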
The noise in digital images acquired by image sensors has complex characteristics due to the variety of noise sources. However, most noise reduction methods assume that an image contains additive white Gaussian noise (AWGN) with a constant standard deviation, and thus such methods are not effective for use in image signal processors (ISPs). To efficiently reduce noise in an ISP, we estimate a unified noise model for the image sensor that handles shot noise, dark-current noise, and fixed-pattern noise (FPN) together, and then we adaptively reduce the image noise using an adaptive Smallest Univalue Segment Assimilating Nucleus (SUSAN) filter based on the unified noise model. Since our noise model is affected only by the image sensor gain, its parameters do not need to be re-configured depending on the image content. Therefore, the proposed noise model is suitable for use in an ISP. Our experimental results indicate that the proposed method reduces image sensor noise efficiently.
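The structure of such a model can be illustrated: shot noise variance grows linearly with signal level, dark-current noise and FPN contribute a signal-independent floor, and both scale with sensor gain, so σ²(I) can be written as g·(a·I + b). A SUSAN-style filter then weights each neighbor by its intensity difference from the center, with the threshold tied to the local noise estimate. A hypothetical sketch, with illustrative parameter values rather than the paper's estimated ones:

```python
import numpy as np

def noise_sigma(intensity, gain, a=0.01, b=2.0):
    """Unified noise model: shot noise term (a*I) plus a
    signal-independent dark-current/FPN floor (b), scaled by gain."""
    return np.sqrt(gain * (a * intensity + b))

def adaptive_susan_denoise(img, gain, radius=2):
    """SUSAN-style filter: each neighbor is weighted by
    exp(-(dI/t)^2), with threshold t tied to the noise model at the
    center pixel's intensity, then a weighted average is taken."""
    h, w = img.shape
    out = np.empty((h, w), dtype=np.float64)
    pad = np.pad(img.astype(np.float64), radius, mode='reflect')
    for y in range(h):
        for x in range(w):
            center = pad[y + radius, x + radius]
            win = pad[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            t = 3.0 * noise_sigma(center, gain)  # adaptive threshold
            wts = np.exp(-((win - center) / t) ** 2)
            out[y, x] = np.sum(wts * win) / np.sum(wts)
    return out
```

Because the threshold depends only on intensity and gain, the same parameters serve any scene content, which is the property the abstract highlights for ISP use.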
We propose a statistical method for counting pedestrians. Previous pedestrian counting methods are not applicable to highly crowded areas because they rely on the detection and tracking of individuals, and the performance of detection-and-tracking methods degrades easily in highly crowded scenes in terms of both accuracy and computation time. The proposed method employs feature-based regression in the spatiotemporal domain to count pedestrians. Because it does not involve detecting or tracking objects, the proposed method is accurate and requires little computation time even for large crowds. Our test results from four hours of video of a highly crowded shopping mall reveal that the proposed method measures human traffic with an accuracy of 97.2% while requiring only 14 ms per frame.
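Feature-based regression counting can be sketched: extract simple crowd features per frame or region (e.g., foreground area, edge-pixel count), fit a regressor mapping features to pedestrian counts offline, and at run time reduce counting to a dot product, which is why the cost stays flat regardless of crowd size. A minimal least-squares sketch with hypothetical features, not the paper's exact feature set or regressor:

```python
import numpy as np

def fit_count_regressor(features, counts):
    """Least-squares fit of counts ~= features @ w + b.
    features: (n_frames, n_features); counts: (n_frames,).
    Returns the stacked coefficients [w..., b]."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(X, counts, rcond=None)
    return coef

def predict_count(coef, feature_vec):
    """Per-frame prediction: one dot product, independent of crowd size."""
    return float(np.append(feature_vec, 1.0) @ coef)
```

For example, if the annotated count grows linearly with foreground area, the fitted coefficients recover that relationship and each new frame costs only a feature extraction plus one inner product.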