Sparse Coding and Compressed Sensing: Locally Competitive Algorithms and Random Projections
- Date Issued:
- 2016
- Summary:
- For an 8-bit grayscale image patch of size n x n, the number of distinguishable signals is 256^(n^2). Natural images (e.g., photographs of a natural scene) comprise a very small subset of these possible signals. Traditional image and video processing relies on band-limited or low-pass signal models. In contrast, we will explore the observation that most signals of interest are sparse, i.e., in a particular basis most of the expansion coefficients will be zero. Recent developments in sparse modeling and L1 optimization have allowed for extraordinary applications such as the single-pixel camera, as well as computer vision systems that can exceed human performance. Here we present a novel neural network architecture combining a sparse filter model and locally competitive algorithms (LCAs), and demonstrate the network's ability to classify human actions from video. Sparse filtering is an unsupervised feature learning algorithm designed to optimize the sparsity of the feature distribution directly, without needing to model the data distribution. LCAs are defined by a system of differential equations where the initial conditions define an optimization problem and the dynamics converge to a sparse decomposition of the input vector. We applied this architecture to train a classifier on categories of motion in human action videos. Inputs to the network were small 3D patches taken from frame differences in the videos. Dictionaries were derived for each action class, and then activation levels for each dictionary were assessed during reconstruction of a novel test patch. We discuss how this sparse modeling approach provides a natural framework for multi-sensory and multimodal data processing including RGB video, RGBD video, hyper-spectral video, and stereo audio/video streams.
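
To make the summary's description of LCAs concrete, the following is a minimal sketch of a locally competitive algorithm in its commonly used soft-threshold form, integrated with simple Euler steps. It is not taken from the dissertation: the random dictionary `Phi`, the parameters `lam`, `tau`, and `n_steps`, and the synthetic test signal are all assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(u, lam):
    """Map internal states u to sparse activations (shrink toward zero by lam)."""
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

def lca(x, Phi, lam=0.1, tau=10.0, n_steps=500):
    """Sketch of LCA dynamics: tau * du/dt = b - u - (Phi^T Phi - I) a.

    x   : input vector, shape (m,)
    Phi : dictionary with unit-norm columns, shape (m, k)
    All parameter values are illustrative assumptions.
    """
    b = Phi.T @ x                              # feed-forward drive of each unit
    G = Phi.T @ Phi - np.eye(Phi.shape[1])     # lateral inhibition between units
    u = np.zeros(Phi.shape[1])                 # internal states start at rest
    for _ in range(n_steps):
        a = soft_threshold(u, lam)             # only units above threshold compete
        u += (b - u - G @ a) / tau             # Euler step of the dynamics
    return soft_threshold(u, lam)

def lasso_objective(x, Phi, a, lam):
    """L1-regularized reconstruction cost that the dynamics aim to reduce."""
    return 0.5 * np.sum((x - Phi @ a) ** 2) + lam * np.sum(np.abs(a))

# Hypothetical example: a random overcomplete dictionary and a 3-sparse signal.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((64, 128))
Phi /= np.linalg.norm(Phi, axis=0)             # normalize each dictionary column
a_true = np.zeros(128)
a_true[[5, 40, 90]] = [1.5, -2.0, 1.0]
x = Phi @ a_true
a_hat = lca(x, Phi)
```

In this sketch the input enters through the drive `b`, active units inhibit one another in proportion to the overlap of their dictionary elements, and the thresholded states form the sparse code. A per-class variant of the same idea would run the dynamics against each class dictionary and compare the resulting activation levels, as the summary describes.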
Title: | Sparse Coding and Compressed Sensing: Locally Competitive Algorithms and Random Projections. |
---|---|---|
Name(s): | Hahn, William E., author; Barenholtz, Elan, Thesis advisor; Florida Atlantic University, Degree grantor; Charles E. Schmidt College of Science; Center for Complex Systems and Brain Sciences | |
Type of Resource: | text | |
Genre: | Electronic Thesis Or Dissertation | |
Date Created: | 2016 | |
Date Issued: | 2016 | |
Publisher: | Florida Atlantic University | |
Place of Publication: | Boca Raton, Fla. | |
Physical Form: | application/pdf | |
Extent: | 287 p. | |
Language(s): | English | |
Identifier: | FA00004713 (IID) | |
Degree granted: | Dissertation (Ph.D.)--Florida Atlantic University, 2016. | |
Collection: | FAU Electronic Theses and Dissertations Collection | |
Note(s): | Includes bibliography. | |
Subject(s): | Artificial intelligence; Expert systems (Computer science); Image processing -- Digital techniques -- Mathematics; Sparse matrices | |
Held by: | Florida Atlantic University Libraries | |
Sublocation: | Digital Library | |
Links: | http://purl.flvc.org/fau/fd/FA00004713 | |
Persistent Link to This Record: | http://purl.flvc.org/fau/fd/FA00004713 | |
Use and Reproduction: | Copyright © is held by the author, with permission granted to Florida Atlantic University to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. | |
Use and Reproduction: | http://rightsstatements.org/vocab/InC/1.0/ | |
Host Institution: | FAU | |
Is Part of Series: | Florida Atlantic University Digital Library Collections. |