Object detection is an essential and computationally expensive process in many computer vision systems. Standard off-the-shelf embedded processors struggle to balance performance and power when running object detection applications. In this work, we explore an Application Specific Instruction set Processor (ASIP) for object detection using the Histogram of Oriented Gradients (HOG) feature. Algorithm simplifications are adopted to reduce memory bandwidth requirements and mathematical complexity without losing reliability. In addition, a parallel histogram generation architecture and an on-the-fly Support Vector Machine (SVM) calculation architecture are employed to reduce the cycle count. The HOG algorithm runs 63x faster on the proposed ASIP than in a pure software implementation. The ASIP was synthesized with a standard 90nm CMOS library, occupying a silicon area of 1.31mm² and consuming 47.8mW at 200MHz. Our object detection processor achieves 42 frames per second (fps) on VGA video. The evaluation and implementation results show that the proposed ASIP is both area-efficient and power-efficient while remaining competitive with commercial CPUs/DSPs. Furthermore, its performance is comparable even to that of hard-wired designs.
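The reported figures imply a per-frame cycle and energy budget, which the short sketch below works out. It is not taken from the paper: it assumes that "VGA" means 640 x 480 pixels and that the 42 fps throughput is measured at the same 200MHz, 47.8mW operating point. The function and variable names are illustrative only.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 200e6      # 200 MHz clock frequency
POWER_W = 47.8e-3     # 47.8 mW power consumption
FPS = 42              # frames per second on VGA video

# Assumption: "VGA" refers to 640 x 480 pixels.
WIDTH, HEIGHT = 640, 480


def frame_budget(clock_hz, power_w, fps, width, height):
    """Return (cycles per frame, cycles per pixel, energy per frame in joules)."""
    cycles_per_frame = clock_hz / fps
    cycles_per_pixel = cycles_per_frame / (width * height)
    # Assumption: power is constant while processing at the stated frame rate.
    energy_per_frame_j = power_w / fps
    return cycles_per_frame, cycles_per_pixel, energy_per_frame_j


cycles_per_frame, cycles_per_pixel, energy_per_frame_j = frame_budget(
    CLOCK_HZ, POWER_W, FPS, WIDTH, HEIGHT
)

print(f"cycles per frame: {cycles_per_frame:,.0f}")
print(f"cycles per pixel: {cycles_per_pixel:.2f}")
print(f"energy per frame: {energy_per_frame_j * 1e3:.3f} mJ")
```

Under these assumptions the processor has roughly 15.5 clock cycles per input pixel, which gives a sense of how tight the budget is for gradient, histogram, and SVM computation.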
Shanlin XIAO
Tokyo Institute of Technology
Tsuyoshi ISSHIKI
Tokyo Institute of Technology
Dongju LI
Tokyo Institute of Technology
Hiroaki KUNIEDA
Tokyo Institute of Technology
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Shanlin XIAO, Tsuyoshi ISSHIKI, Dongju LI, Hiroaki KUNIEDA, "HOG-Based Object Detection Processor Design Using ASIP Methodology" in IEICE TRANSACTIONS on Fundamentals,
vol. E100-A, no. 12, pp. 2972-2984, December 2017, doi: 10.1587/transfun.E100.A.2972.
URL: https://globals.ieice.org/en_transactions/fundamentals/10.1587/transfun.E100.A.2972/_p
@ARTICLE{e100-a_12_2972,
author={Shanlin XIAO and Tsuyoshi ISSHIKI and Dongju LI and Hiroaki KUNIEDA},
journal={IEICE TRANSACTIONS on Fundamentals},
title={HOG-Based Object Detection Processor Design Using ASIP Methodology},
year={2017},
volume={E100-A},
number={12},
pages={2972--2984},
keywords={},
doi={10.1587/transfun.E100.A.2972},
ISSN={1745-1337},
month={December},}
TY - JOUR
TI - HOG-Based Object Detection Processor Design Using ASIP Methodology
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 2972
EP - 2984
AU - Shanlin XIAO
AU - Tsuyoshi ISSHIKI
AU - Dongju LI
AU - Hiroaki KUNIEDA
PY - 2017
DO - 10.1587/transfun.E100.A.2972
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E100-A
IS - 12
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - December 2017
ER -