Covid-19 Search Engine
Monday, October 31, 2022
Content-based image retrieval (CBIR) applies computer vision and image processing methods to search for digital images in large databases. Unlike traditional approaches that rely on tags, keywords, and titles, CBIR matches images on visual features such as shape, texture, and color. In the medical field, image retrieval has become increasingly important because hospitals accumulate vast collections of scans, and manually processing and annotating them is time-consuming for healthcare professionals. Medical imaging provides detailed information about the human body through different modalities, of which radiography (X-ray imaging) is the most common. Other modalities include CT, MRI, and PET, and the resulting images are stored in formats such as DICOM and NIfTI. However, existing content-based image retrieval systems cannot accommodate datasets like the SARS-CoV-2 CT-scan dataset, which contains CT scans that are positive and negative for COVID-19.
A content-based image retrieval system extracts features from every image in an image database and stores them in a feature database, often referred to as a bag-of-features. When a user submits a query image, its features are extracted and compared with those in the feature database; the most similar images are then ranked and returned to the user. Retrieval is computationally intensive, and each queried image is logged with a timestamp. Both the image and feature databases are kept offline because of their size, with Node.js used to store and retrieve features and images.
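The query flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: the post does not say which similarity measure was used, so cosine similarity is assumed here, and the names `feature_db`, `retrieve`, and `query_log`, as well as the tiny three-value feature vectors, are hypothetical stand-ins for features produced by a real model.

```python
import math
from datetime import datetime, timezone


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve(query_features, feature_db, top_k=3, query_log=None):
    """Rank stored images by similarity to the query and return the top_k."""
    scored = [
        (image_id, cosine_similarity(query_features, features))
        for image_id, features in feature_db.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    results = scored[:top_k]
    if query_log is not None:
        # Log each query with a timestamp, as the post describes.
        query_log.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "results": [image_id for image_id, _ in results],
        })
    return results


# Hypothetical feature database: image id -> precomputed feature vector.
feature_db = {
    "scan_001": [0.9, 0.1, 0.0],
    "scan_002": [0.0, 1.0, 0.0],
    "scan_003": [0.8, 0.2, 0.1],
}

query_log = []
ranked = retrieve([1.0, 0.0, 0.0], feature_db, top_k=2, query_log=query_log)
for image_id, score in ranked:
    print(image_id, round(score, 3))
```

Comparing the query against every stored vector, as done here, is the simplest approach and is also why retrieval gets expensive as the feature database grows.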
In this project, four models pretrained on ImageNet (VGG-16, VGG-19, Inception-V3, and Xception) were implemented and compared. Their performance was evaluated on computation time, similarity metrics, and size.
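A comparison of this kind could be organized with a small timing harness such as the one below. It is a sketch only: the `benchmark` function and the two toy extractors are hypothetical stand-ins, used so the example runs without downloading any pretrained weights. In practice each entry would wrap one of the four pretrained models. The harness records time per image and the length of the output feature vector; it does not measure model size on disk or any similarity metric.

```python
import time


def benchmark(extractors, images, repeats=3):
    """Time each feature extractor over the images; keep the best of several runs."""
    report = {}
    for name, extract in extractors.items():
        best = float("inf")
        features = []
        for _ in range(repeats):
            start = time.perf_counter()
            features = [extract(image) for image in images]
            best = min(best, time.perf_counter() - start)
        report[name] = {
            "seconds_per_image": best / len(images),
            "feature_dim": len(features[0]),
        }
    return report


# Toy stand-in extractors operating on flattened 8-bit pixel lists.
def mean_intensity(pixels):
    return [sum(pixels) / len(pixels)]


def histogram_4bin(pixels):
    counts = [0, 0, 0, 0]
    for p in pixels:
        counts[min(p // 64, 3)] += 1
    return [c / len(pixels) for c in counts]


images = [[0, 64, 128, 255], [10, 20, 30, 40]]
report = benchmark(
    {"mean_intensity": mean_intensity, "histogram_4bin": histogram_4bin},
    images,
)
for name, stats in report.items():
    print(name, stats)
```

Taking the best of several repeats reduces the effect of one-off delays such as cache warm-up, which matters when the differences between models are small.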