Head and neck cancer (HNC) is among the most common cancers worldwide, ranking sixth in incidence in 2022 according to GLOBOCAN. HNC refers to cancers of the oral cavity, sinonasal cavity, pharynx, and larynx; because of where these tumors arise, both the cancers themselves and their treatments often result in dysphagia (difficulty swallowing).
The current gold standards, the videofluoroscopic swallowing study (VFSS) and fiberoptic endoscopic evaluation of swallowing (FEES), rely on subjective interpretation, which often leads to unreliable diagnoses. High-resolution impedance manometry (HRIM) could provide a more quantitative approach to diagnosing dysphagia, but the altered anatomy of HNC patients introduces abnormalities into the recordings, making HRIM data difficult to analyze. To address this limitation, HRIM can be combined with VFSS: the catheter sensor locations are extracted from the VFSS images and overlaid on the HRIM plots, as sketched below. Doing this manually, however, is very time-consuming.
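To make the overlay concept concrete, the following minimal sketch marks sensor positions (as they might be recovered from a calibrated VFSS frame) on a spatiotemporal HRIM pressure plot. The data, the 36-sensor count, and the 100 Hz sampling rate are illustrative assumptions, not values from this study.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for an HRIM recording: rows are sensors, columns are
# time samples. A real recording would come from the manometry system.
n_sensors, n_samples = 36, 500
pressure = np.random.default_rng(0).normal(20, 5, (n_sensors, n_samples))

# Sensor positions along the catheter (cm from the proximal end), as they
# might be measured on a VFSS frame; evenly spaced here as a placeholder.
sensor_pos_cm = np.linspace(0, 35, n_sensors)

fig, ax = plt.subplots()
# Spatiotemporal pressure plot, proximal sensors at the top.
ax.imshow(pressure, aspect="auto", cmap="jet",
          extent=[0, n_samples / 100, sensor_pos_cm[-1], sensor_pos_cm[0]])
# Overlay the extracted sensor positions along the spatial axis.
ax.scatter(np.zeros(n_sensors), sensor_pos_cm, marker=">", color="k", s=10)
ax.set_xlabel("Time (s)")
ax.set_ylabel("Position along catheter (cm)")
plt.show()
```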
In this study, we aim to develop an algorithm that automatically extracts the HRIM sensor locations from a VFSS video frame to aid the diagnosis of dysphagia in HNC patients. While deep-learning-based segmentation has been applied with great success in medical imaging, this study focuses on traditional segmentation methods.
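As a minimal illustration of what a traditional pipeline might look like, the sketch below uses thresholding, morphological filtering, and connected-component analysis (via OpenCV) to recover candidate sensor centroids from a single frame. The file name, Otsu thresholding, and area bounds are assumptions for illustration, not the algorithm developed in this study.

```python
import cv2

# The radiopaque HRIM sensors typically appear as small dark markers on a
# fluoroscopic frame, so classical binarization can isolate them.
frame = cv2.imread("vfss_frame.png", cv2.IMREAD_GRAYSCALE)  # illustrative path

# Invert so the dark markers become bright foreground, then binarize
# with Otsu's automatic threshold.
_, binary = cv2.threshold(255 - frame, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Morphological opening removes small speckle noise.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)

# Keep components whose area is plausible for a sensor marker and record
# their centroids as candidate sensor locations.
n_labels, _, stats, centroids = cv2.connectedComponentsWithStats(binary)
sensor_locations = [
    tuple(centroids[i])
    for i in range(1, n_labels)                 # label 0 is the background
    if 5 < stats[i, cv2.CC_STAT_AREA] < 200     # illustrative area bounds
]
print(sensor_locations)
```

A practical pipeline would also need to handle the low contrast and anatomical variability of real VFSS frames, which is precisely the challenge the proposed algorithm addresses.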