Localized deep extreme learning machines for efficient RGB-D object recognition


Bibliographic Details
Main Authors: Mohd Zaki, Hasan Firdaus; Shafait, Faisal; Mian, Ajmal S.
Format: Conference or Workshop Item
Language: English
Published: Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2015
Online Access: http://irep.iium.edu.my/64704/
http://irep.iium.edu.my/64704/7/64704%20Localized%20Deep%20Extreme%20Learning.pdf
http://irep.iium.edu.my/64704/8/64704%20Localized%20Deep%20Extreme%20Learning%20SCOPUS.pdf
Description
Summary: Existing RGB-D object recognition methods either use channel-specific handcrafted features or learn features with deep networks. The former lack representation ability, while the latter require large amounts of training data and learning time. In real-time robotics applications involving RGB-D sensors, we do not have the luxury of both. In this paper, we propose Localized Deep Extreme Learning Machines (LDELM) that efficiently learn features from RGB-D data. By using localized patches, not only is the problem of data sparsity solved, but the learned features are also robust to occlusions and viewpoint variations. LDELM learns deep localized features in an unsupervised way from random patches of the training data. Each image is then fed forward, patch-wise, through the LDELM to form a cuboid of features. The cuboid is divided into cells and pooled to get the final compact image representation, which is then used to train an ELM classifier. Experiments on the benchmark Washington RGB-D and 2D3D datasets show that the proposed algorithm is not only significantly faster to train but also outperforms state-of-the-art methods in terms of accuracy and classification time.
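
The pipeline outlined in the summary (unsupervised features from random patches, a patch-wise forward pass into a feature cuboid, cell pooling, then an ELM classifier) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. Several choices are assumptions not stated in the summary: the stacked ELM autoencoder formulation, the tanh activation, max pooling, treating RGB and depth as one 4-channel array, all function names, and all sizes. The images are synthetic random arrays used only to exercise the code.

```python
import numpy as np

def extract_patches(img, size, stride):
    """Return a (rows, cols, size*size*channels) grid of flattened patches."""
    h, w = img.shape[:2]
    rows = range(0, h - size + 1, stride)
    cols = range(0, w - size + 1, stride)
    return np.array([[img[r:r + size, c:c + size].ravel() for c in cols]
                     for r in rows])

def elm_autoencoder(X, n_hidden, reg, rng):
    """One ELM autoencoder layer: random hidden weights, closed-form output weights."""
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)
    # Ridge-regularized least squares for H @ beta ~= X (reconstruct the input).
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ X)
    return beta.T  # (n_features, n_hidden), reused as the learned projection

def fit_deep_encoder(patches, layer_sizes, reg, rng):
    """Stack autoencoder layers; each layer is fit on the previous layer's output."""
    weights, X = [], patches
    for n_hidden in layer_sizes:
        P = elm_autoencoder(X, n_hidden, reg, rng)
        weights.append(P)
        X = np.tanh(X @ P)
    return weights

def encode(X, weights):
    for P in weights:
        X = np.tanh(X @ P)
    return X

def image_descriptor(img, weights, size, stride, cells):
    """Encode every patch, then max-pool the feature cuboid over cells x cells regions."""
    grid = extract_patches(img, size, stride)
    R, C, D = grid.shape
    cuboid = encode(grid.reshape(R * C, D), weights).reshape(R, C, -1)
    row_groups = np.array_split(np.arange(R), cells)
    col_groups = np.array_split(np.arange(C), cells)
    pooled = [cuboid[np.ix_(rg, cg)].max(axis=(0, 1))
              for rg in row_groups for cg in col_groups]
    return np.concatenate(pooled)

def fit_elm_classifier(F, labels, n_classes, n_hidden, reg, rng):
    """Supervised ELM: random hidden layer, ridge regression onto one-hot targets."""
    W = rng.standard_normal((F.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(F @ W + b)
    T = np.eye(n_classes)[labels]
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def predict(F, model):
    W, b, beta = model
    return np.argmax(np.tanh(F @ W + b) @ beta, axis=1)

# Synthetic stand-ins for 16x16 RGB-D images (4 channels); not real data.
rng = np.random.default_rng(0)
images = rng.random((6, 16, 16, 4))
labels = np.array([0, 1, 2, 0, 1, 2])

# Learn the encoder from a random sample of training patches.
all_patches = np.concatenate(
    [extract_patches(im, 4, 2).reshape(-1, 4 * 4 * 4) for im in images])
sample = all_patches[rng.choice(len(all_patches), size=200, replace=False)]
weights = fit_deep_encoder(sample, layer_sizes=[32, 16], reg=1e-2, rng=rng)

# Build one pooled descriptor per image and train the classifier on them.
features = np.stack([image_descriptor(im, weights, 4, 2, cells=2) for im in images])
model = fit_elm_classifier(features, labels, n_classes=3, n_hidden=50, reg=1e-2, rng=rng)
predictions = predict(features, model)
```

Every layer here is fit with a single linear solve rather than iterative gradient descent, which is the general property of extreme learning machines that makes fast training plausible. The descriptor length is fixed by the cell grid and the last layer width (2 x 2 cells x 16 features in this sketch), regardless of image size.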