The performance of the proposed CBIR system is assessed using 2000 images from the Corel image dataset. The images are divided into 10 semantic groups and described by a number of features. The proposed feature selection is then compared with other techniques such as Gain Ratio, Genetic Algorithm, Information Gain, Isomap, Kernel PCA, OneR, Principal Component Analysis (PCA) and Relief-F. The results of the experiments conducted in this thesis show that the proposed feature selection and classifiers improve the semantic performance of the proposed CBIR system. Retrieval accuracy with Fuzzy Rough feature selection is 91.06% for normal images, and 90.31%, 91.28% and 90.42% for images with Gaussian noise, Salt & Pepper noise and Poisson noise, respectively.
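To make the noisy-image evaluation concrete, the following is a minimal sketch of how the three noise types could be applied to a flat list of 8-bit grey levels. It is not the procedure used in the thesis: the function names, the `sigma` and `density` parameters, and the toy pixel values are all assumptions chosen for illustration.

```python
import math
import random


def clamp(value):
    """Keep a grey level inside the 8-bit range 0..255."""
    return min(255, max(0, value))


def gaussian_noise(pixels, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma` (assumed parameter)."""
    return [clamp(round(p + rng.gauss(0, sigma))) for p in pixels]


def salt_and_pepper(pixels, density, rng):
    """Set roughly `density` of the pixels to 0 (pepper) or 255 (salt)."""
    noisy = []
    for p in pixels:
        r = rng.random()
        if r < density / 2:
            noisy.append(0)
        elif r < density:
            noisy.append(255)
        else:
            noisy.append(p)
    return noisy


def poisson_sample(lam, rng):
    """Draw one Poisson variate with mean `lam` using Knuth's multiplication method."""
    limit = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def poisson_noise(pixels, rng):
    """Replace each pixel by a Poisson draw whose mean is the original intensity."""
    return [clamp(poisson_sample(p, rng)) for p in pixels]


# Hypothetical 8-pixel "image" and a seeded generator for repeatable runs.
rng = random.Random(0)
image = [0, 32, 64, 96, 128, 160, 192, 255]
noisy_gaussian = gaussian_noise(image, sigma=10, rng=rng)
noisy_sp = salt_and_pepper(image, density=0.2, rng=rng)
noisy_poisson = poisson_noise(image, rng=rng)
```

Under these assumptions, each noisy copy would be passed through the same feature extraction and retrieval steps as the clean image, so that accuracy can be compared across the four conditions.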
The significance of this research is threefold. Firstly, it proposes an improved pre-processing phase to solve CBIR problems. Secondly, it proposes an integrated framework that uses Rough Set with one-versus-one (1-v-1) Support Vector Machine and Rough Set with one-versus-rest (1-v-r, also called one-versus-all) Support Vector Machine classifiers in CBIR systems. Rough Set theory, as the feature selection method in this pre-processing phase, can address the problem of very large numbers of image features by narrowing the search space. Through its upper and lower approximations, the theory can also deal with the vague and incomplete areas in image descriptions. As such, the accuracy of the CBIR system can be improved. Compared with earlier systems, the proposed approach also gives the confidence and deviation of the estimation, which traditional methods cannot provide. Finally, the semantic gap problem can be reduced by the Fuzzy Rough Set semantic rules.
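The role of the lower and upper approximations can be illustrated with a small sketch of classical Rough Set theory. The lower approximation collects images that certainly belong to a concept given the chosen attributes, and the upper approximation collects images that possibly belong to it. The toy image table, the attribute names and the `beach` concept below are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Return the (lower, upper) approximations of `target`.

    Objects with identical values on `attributes` are indiscernible
    and fall into the same equivalence class.
    """
    classes = defaultdict(set)
    for name, values in objects.items():
        key = tuple(values[a] for a in attributes)
        classes[key].add(name)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # the whole class lies inside the concept
            lower |= block
        if block & target:    # the class overlaps the concept
            upper |= block
    return lower, upper


# Hypothetical discretised descriptors for four images.
images = {
    "img1": {"colour": "red", "texture": "smooth"},
    "img2": {"colour": "red", "texture": "smooth"},
    "img3": {"colour": "blue", "texture": "rough"},
    "img4": {"colour": "blue", "texture": "smooth"},
}
beach = {"img1", "img4"}  # hypothetical semantic concept

lower, upper = approximations(images, ["colour", "texture"], beach)
boundary = upper - lower  # images whose membership is uncertain
```

Here `img1` and `img2` share the same descriptors but only `img1` is labelled as the concept, so both land in the boundary region rather than in the lower approximation. This is the sense in which the approximations express vagueness instead of forcing a crisp decision.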
This research proposes an improved approach to selecting significant features from the huge image feature vector. The concept behind this research is that it is possible to extract relational patterns among image features from an image feature vector database. These relational patterns are then used to generate rules and improve the retrieval results of a CBIR system. In addition, this research proposes a CBIR system that utilises the Rough Set instead of deterministic and crisp methods, and the Rough Set rules are evaluated with noisy images. To obtain a more accurate classifier in the CBIR system, the proposed classifier is based on the Rough Set and the Support Vector Machine (SVM).
The committee considered Dr. Li's dissertation, titled "Content-based visual search learned from social media", worthy of the award because it substantially extends the boundaries for developing content-based multimedia indexing and retrieval solutions. In particular, it provides fresh insights into the possibilities for realizing image retrieval solutions in the presence of the vast information that can be drawn from social media.
Each image in a Content Based Image Retrieval (CBIR) system is represented by its features, such as colour, texture and shape. These three groups of features are stored in the feature vector, so each image managed by the CBIR system is associated with one or more feature vectors. As a result, the storage space required for feature vectors is proportional to the number of images in the database. In addition, when comparing the similarities among images, the CBIR system needs to compare these feature vectors. Researchers therefore still face problems when working with a huge image database: comparing huge feature vectors takes considerable time, and a large amount of memory is required to run the CBIR system. Due to this problem, feature reduction and selection techniques are employed to alleviate the storage and time requirements of large feature vectors. There are many feature reduction techniques, including linear projection techniques such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), and metric embedding techniques (both linear and non-linear). However, these methods have limitations in the CBIR system and cannot efficiently improve CBIR performance (retrieval accuracy) or reduce the semantic gap. Therefore, we need a feature selection method that can deal with image features efficiently and has the ability to deal with uncertainties.
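The cost argument above can be sketched in a few lines: retrieval compares the query's feature vector with every stored vector, so the work grows with both the number of images and the vector length, and selecting a subset of features shortens every comparison. The Euclidean distance, the `selected` index list and the toy four-dimensional vectors are assumptions for illustration only. The thesis does not specify them here.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def retrieve(query, database, k=2, selected=None):
    """Return the names of the k stored images closest to `query`.

    `selected` is an optional list of feature indices to keep; it stands
    in for the output of a feature selection step.
    """
    def project(vector):
        return vector if selected is None else [vector[i] for i in selected]

    q = project(query)
    ranked = sorted(database, key=lambda name: euclidean(q, project(database[name])))
    return ranked[:k]


# Hypothetical vectors: [colour_1, colour_2, texture, shape]
database = {
    "a": [0.9, 0.1, 0.2, 0.5],
    "b": [0.8, 0.2, 0.1, 0.4],
    "c": [0.1, 0.9, 0.7, 0.5],
}
query = [0.85, 0.15, 0.2, 0.5]

full_ranking = retrieve(query, database, k=2)
reduced_ranking = retrieve(query, database, k=1, selected=[0, 2])
```

In this toy case the reduced vector still returns the same nearest image while halving the work per comparison. Whether that holds on real data depends on which features are selected, and this is the question a feature selection method has to answer.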