Context-based Image Concept Detection and Annotation

Date Issued:
2016
Summary:
Scene understanding attempts to produce a textual description of the visible and latent concepts in an image in order to convey the real meaning of the scene. Concepts are objects, events, or relations depicted in an image. To recognize concepts, the decision of an object detection algorithm must be extended from visual similarity to semantic compatibility. Semantically relevant concepts convey the most consistent meaning of the scene. Object detectors analyze visual properties (e.g., pixel intensities, texture, color gradient) of sub-regions of an image to identify objects. The initially assigned object names must be further examined to ensure that they are compatible with each other and with the scene. By enforcing inter-object dependencies (e.g., co-occurrence, spatial, and semantic priors) and object-to-scene constraints as background information, a concept classifier predicts the most semantically consistent set of names for the discovered objects. The additional background information that describes concepts is called context. In this dissertation, a framework for building context-based concept detection is presented that uses a combination of multiple contextual relationships to refine the result of underlying feature-based object detectors and produce the most semantically compatible concepts. In addition to lacking the ability to capture semantic dependencies, object detectors suffer from the high dimensionality of the feature space. Variations in the image (i.e., quality, pose, articulation, illumination, and occlusion) can also result in low-quality visual features that reduce the accuracy of detected concepts. The object detectors used to build the context-based framework experiments in this study are based on state-of-the-art generative and discriminative graphical models. Graphical models make it easy to describe the relationships between model variables and to characterize their dependencies precisely.
The generative context-based implementations are extensions of Latent Dirichlet Allocation, a leading topic modeling approach that is very effective at reducing the dimensionality of the data. The discriminative context-based approach extends Conditional Random Fields, which allow efficient and precise construction of the model by specifying and including only the cases that are related to it and influence it. The dataset used for training and evaluation is MIT SUN397. The experiments show an overall 15% increase in annotation accuracy and a 31% improvement in the semantic saliency of the annotated concepts.
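The idea of refining detector output with a co-occurrence prior can be illustrated with a small sketch. This is not the dissertation's LDA or CRF model; it is a simplified brute-force search, and the function name `refine_labels`, the `weight` parameter, and all scores and co-occurrence values are hypothetical and chosen only for illustration.

```python
import itertools
import math


def refine_labels(detector_scores, cooccurrence, weight=1.0):
    """Return the joint labeling with the best detector-plus-context score.

    detector_scores: one dict per image region, mapping label -> probability (> 0).
    cooccurrence: dict mapping frozenset({a, b}) -> co-occurrence probability (> 0).
    weight: how strongly the pairwise context term counts (0 ignores context).
    """
    regions = [list(scores.items()) for scores in detector_scores]
    best_labels, best_score = None, -math.inf
    # Enumerate every joint assignment; feasible only for a few regions and labels.
    for combo in itertools.product(*regions):
        labels = [label for label, _ in combo]
        # Visual term: log-probabilities from the per-region detectors.
        score = sum(math.log(p) for _, p in combo)
        # Context term: log co-occurrence for every pair of assigned labels.
        for a, b in itertools.combinations(labels, 2):
            prior = cooccurrence.get(frozenset((a, b)), 1e-6)
            score += weight * math.log(prior)
        if score > best_score:
            best_labels, best_score = labels, score
    return best_labels


# Hypothetical detector output for two regions of one image.
detector_scores = [
    {"car": 0.55, "boat": 0.45},
    {"water": 0.60, "road": 0.40},
]
# Hypothetical co-occurrence priors between labels.
cooccurrence = {
    frozenset(("car", "road")): 0.50,
    frozenset(("boat", "water")): 0.50,
    frozenset(("car", "water")): 0.02,
    frozenset(("boat", "road")): 0.02,
}

print(refine_labels(detector_scores, cooccurrence, weight=0.0))  # visual evidence only
print(refine_labels(detector_scores, cooccurrence))              # with context
```

With the context term switched off, each region keeps its visually strongest label, so the pair is "car" and "water". With the co-occurrence prior enabled, the jointly consistent pair "boat" and "water" scores higher, which is the kind of correction the abstract attributes to context.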
Title: Context-based Image Concept Detection and Annotation.
Name(s): Zolghadr, Esfandiar, author
Furht, Borko, Thesis advisor
Florida Atlantic University, Degree grantor
College of Engineering and Computer Science
Department of Computer and Electrical Engineering and Computer Science
Type of Resource: text
Genre: Electronic Thesis Or Dissertation
Date Created: 2016
Date Issued: 2016
Publisher: Florida Atlantic University
Place of Publication: Boca Raton, Fla.
Physical Form: application/pdf
Extent: 111 p.
Language(s): English
Identifier: FA00004745 (IID)
Degree granted: Dissertation (Ph.D.)--Florida Atlantic University, 2016.
Collection: FAU Electronic Theses and Dissertations Collection
Note(s): Includes bibliography.
Subject(s): Computer vision--Mathematical models.
Pattern recognition systems.
Information visualization.
Natural language processing (Computer science)
Multimodal user interfaces (Computer systems)
Latent structure analysis.
Expert systems (Computer science)
Held by: Florida Atlantic University Libraries
Sublocation: Digital Library
Links: http://purl.flvc.org/fau/fd/FA00004745
Persistent Link to This Record: http://purl.flvc.org/fau/fd/FA00004745
Use and Reproduction: Copyright © is held by the author with permission granted to Florida Atlantic University to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Use and Reproduction: http://rightsstatements.org/vocab/InC/1.0/
Host Institution: FAU
Is Part of Series: Florida Atlantic University Digital Library Collections.