Depth-based action recognition has been attracting the attention of researchers because of the advantages of depth cameras over standard RGB cameras. One of these advantages is that depth data can provide richer information from multiple projections. In particular, multiple projections can be used to extract discriminative motion patterns that would not be discernible from one fixed projection. However, high computational costs have meant that recent studies have exploited only a small number of projections, such as front, side, and top. Thus, a large number of projections, which may be useful for discriminating actions, are discarded. In this paper, we propose an efficient method to exploit pools of multiple projections for recognizing actions in depth videos. First, we project 3D data onto multiple 2D planes from different viewpoints sampled on a geodesic dome to obtain a large number of projections. Then, we train and test action classifiers independently for each projection. To reduce the computational cost, we propose a greedy method to select a small yet robust combination of projections. The idea is that the most complementary projections are considered first when searching for the optimal combination. We conducted extensive experiments to verify the effectiveness of our method on three challenging benchmarks: MSR Action 3D, MSR Gesture 3D, and 3D Action Pairs. The experimental results show that our method outperforms other state-of-the-art methods while using a small number of projections.
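The greedy projection-selection step summarized in the abstract can be sketched as follows. This is a minimal illustration only, assuming score-level (average) fusion of per-projection classifier outputs on a validation set; the function name, the fusion rule, and the stopping criterion are our assumptions, not the paper's exact formulation.

```python
import numpy as np

def greedy_select(per_proj_scores, labels, max_projections):
    """Greedy forward selection over per-projection classifiers.

    per_proj_scores: list of (num_samples, num_classes) score arrays,
        one per projection (e.g. validation-set classifier outputs).
    labels: (num_samples,) ground-truth class ids.
    At each step, add the projection whose score-averaged fusion with
    the current selection most improves validation accuracy; stop when
    no remaining projection improves it.
    """
    remaining = list(range(len(per_proj_scores)))
    selected, best_acc = [], 0.0
    for _ in range(max_projections):
        best_cand, best_cand_acc = None, best_acc
        for i in remaining:
            # fuse the candidate combination by averaging class scores
            fused = np.mean([per_proj_scores[j] for j in selected + [i]],
                            axis=0)
            acc = float(np.mean(fused.argmax(axis=1) == labels))
            if acc > best_cand_acc:
                best_cand, best_cand_acc = i, acc
        if best_cand is None:  # no remaining projection helps: stop early
            break
        selected.append(best_cand)
        remaining.remove(best_cand)
        best_acc = best_cand_acc
    return selected, best_acc
```

Because each round fuses every remaining candidate with the current selection, strongly complementary projections are picked first, and the early-stopping test keeps the final combination small.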
Chien-Quang LE
Graduate University for Advanced Studies (SOKENDAI)
Sang PHAN
National Institute of Informatics (NII)
Thanh Duc NGO
University of Information Technology
Duy-Dinh LE
National Institute of Informatics (NII)
Shin'ichi SATOH
National Institute of Informatics (NII)
Duc Anh DUONG
University of Information Technology
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Chien-Quang LE, Sang PHAN, Thanh Duc NGO, Duy-Dinh LE, Shin'ichi SATOH, Duc Anh DUONG, "Human Action Recognition from Depth Videos Using Pool of Multiple Projections with Greedy Selection," IEICE Transactions on Information and Systems,
vol. E99-D, no. 8, pp. 2161-2171, August 2016, doi: 10.1587/transinf.2015EDP7430.
URL: https://globals.ieice.org/en_transactions/information/10.1587/transinf.2015EDP7430/_p
@ARTICLE{e99-d_8_2161,
author={Chien-Quang LE and Sang PHAN and Thanh Duc NGO and Duy-Dinh LE and Shin'ichi SATOH and Duc Anh DUONG},
journal={IEICE Transactions on Information and Systems},
title={Human Action Recognition from Depth Videos Using Pool of Multiple Projections with Greedy Selection},
year={2016},
volume={E99-D},
number={8},
pages={2161-2171},
abstract={Depth-based action recognition has been attracting the attention of researchers because of the advantages of depth cameras over standard RGB cameras. One of these advantages is that depth data can provide richer information from multiple projections. In particular, multiple projections can be used to extract discriminative motion patterns that would not be discernible from one fixed projection. However, high computational costs have meant that recent studies have exploited only a small number of projections, such as front, side, and top. Thus, a large number of projections, which may be useful for discriminating actions, are discarded. In this paper, we propose an efficient method to exploit pools of multiple projections for recognizing actions in depth videos. First, we project 3D data onto multiple 2D planes from different viewpoints sampled on a geodesic dome to obtain a large number of projections. Then, we train and test action classifiers independently for each projection. To reduce the computational cost, we propose a greedy method to select a small yet robust combination of projections. The idea is that the most complementary projections are considered first when searching for the optimal combination. We conducted extensive experiments to verify the effectiveness of our method on three challenging benchmarks: MSR Action 3D, MSR Gesture 3D, and 3D Action Pairs. The experimental results show that our method outperforms other state-of-the-art methods while using a small number of projections.},
keywords={},
doi={10.1587/transinf.2015EDP7430},
ISSN={1745-1361},
month={August}
}
TY - JOUR
TI - Human Action Recognition from Depth Videos Using Pool of Multiple Projections with Greedy Selection
T2 - IEICE Transactions on Information and Systems
SP - 2161
EP - 2171
AU - Chien-Quang LE
AU - Sang PHAN
AU - Thanh Duc NGO
AU - Duy-Dinh LE
AU - Shin'ichi SATOH
AU - Duc Anh DUONG
PY - 2016
DO - 10.1587/transinf.2015EDP7430
JO - IEICE Transactions on Information and Systems
SN - 1745-1361
VL - E99-D
IS - 8
JA - IEICE Transactions on Information and Systems
Y1 - 2016/08
AB - Depth-based action recognition has been attracting the attention of researchers because of the advantages of depth cameras over standard RGB cameras. One of these advantages is that depth data can provide richer information from multiple projections. In particular, multiple projections can be used to extract discriminative motion patterns that would not be discernible from one fixed projection. However, high computational costs have meant that recent studies have exploited only a small number of projections, such as front, side, and top. Thus, a large number of projections, which may be useful for discriminating actions, are discarded. In this paper, we propose an efficient method to exploit pools of multiple projections for recognizing actions in depth videos. First, we project 3D data onto multiple 2D planes from different viewpoints sampled on a geodesic dome to obtain a large number of projections. Then, we train and test action classifiers independently for each projection. To reduce the computational cost, we propose a greedy method to select a small yet robust combination of projections. The idea is that the most complementary projections are considered first when searching for the optimal combination. We conducted extensive experiments to verify the effectiveness of our method on three challenging benchmarks: MSR Action 3D, MSR Gesture 3D, and 3D Action Pairs. The experimental results show that our method outperforms other state-of-the-art methods while using a small number of projections.
ER -