Recent research on Neocognitron-like feed-forward neural architectures, which were previously applied successfully to the recognition of artificial stimuli such as paperclip objects, now also opens up applications to more natural stimuli. Such networks exhibit high recognition performance with respect to translation, rotation, scaling and cluttered surroundings. In this contribution, we introduce a new type of hierarchical model, which is trained using a non-negative matrix factorization algorithm. In contrast to previous work, our approach can not only classify objects but is also capable of rapid object detection in natural scenes. Thus, the time-consuming and conceptually unsatisfying split into a localization stage (e.g., using segmentation) and a subsequent classification stage can be avoided. The network consists of alternating layers of simple and complex cell planes and incorporates nonlinear processing schemes that have been proposed in recent literature. The receptive field profiles for the lower layers of the network are learned without supervision, whereas a final classification layer is trained with supervision. This final layer is then utilized for detection. We test the classification performance of the network on images of natural objects that are systematically distorted. To test the ability to detect objects, cluttered natural backgrounds are used.
Recent research on Neocognitron-like feed-forward neural
architectures, which were previously applied successfully to the
recognition of artificial stimuli such as paperclip objects, shows
promise for application to more natural stimuli. Several authors have
shown high recognition performance of such networks with respect to
translation, rotation, scaling and cluttered surroundings. In this
contribution, we introduce a variation of existing hierarchical
models that is trained using a non-negative matrix factorization
algorithm. In contrast to previous work, our approach can not only
classify objects but is also capable of rapid object detection in
natural scenes. Thus, the time-consuming and conceptually
unsatisfying split into a localization stage (e.g., using
segmentation) and a subsequent classification stage can be avoided.
Although in principle an exhaustive search is performed by
classifying every sub-window of an image, the process is nevertheless
highly efficient. The network consists of alternating layers of
simple and complex cell planes and incorporates nonlinear processing
schemes that have been proposed in recent literature. The
receptive field profiles for the lower layers of the network are
learned without supervision, whereas a final classification layer
is trained with supervision. Detection is achieved by attaching an
additional network layer, whose simple cell profiles are learned
from the final classification units that were acquired during the
training phase. We test the classification performance of the
network on images of natural objects that are systematically
distorted. To test the ability to detect objects, cluttered natural
backgrounds are used.
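
The abstract does not state which non-negative matrix factorization variant is used to learn the receptive field profiles. As a minimal sketch only, assuming the widely used multiplicative updates of Lee and Seung for the squared-error objective, the following factorizes a non-negative matrix V (columns standing in for image patches) into a basis W and activations H. The function names, the toy `patches` data, the rank, and the iteration count are all illustrative assumptions, not taken from the paper.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b]
            for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize V ~ W H with W, H >= 0 via multiplicative updates.

    Assumption: Lee-Seung updates for the squared error; the paper's
    actual algorithm may differ.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialization keeps all factors non-negative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical toy data: each column is a flattened 4-pixel "patch".
patches = [
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
]

if __name__ == "__main__":
    w, h = nmf(patches, rank=2)
    print("reconstruction error:", frobenius_error(patches, w, h))
```

In a hierarchical network of the kind described, the columns of W would play the role of learned receptive field profiles for simple cells. Because every update multiplies by a non-negative ratio, the factors stay non-negative without an explicit projection step.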