Abstract:
Monocular depth estimation is an important topic in minimally invasive surgery, providing valuable information for downstream applications such as navigation systems. Deep learning for this task requires a large amount of training data to obtain an accurate and robust model. In the medical field in particular, acquiring ground truth depth information is rarely possible due to patient safety and technical limitations. Many approaches tackle this problem, including the use of synthetic data. This raises the question of how well synthetic data allows the prediction of depth information on clinical data. To evaluate this, synthetic data is used to train and optimize a U-Net, including hyperparameter tuning and augmentation. The trained model is then used to predict depth on clinical images, and the predictions are analyzed with respect to quality and to consistency over the same scene, time, and color. The results demonstrate that synthetic data sets can be used for training, with an accuracy of over 77% and an RMSE below 10 mm on the synthetic data set. The model does well on clinical data that resembles the synthetic data, but also has limitations due to the complexity of clinical environments. Synthetic data sets are a promising approach for enabling monocular depth estimation in fields that otherwise lack data.
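The two figures quoted above (accuracy and RMSE) can be illustrated with a minimal sketch. The abstract does not say how accuracy is defined, so the threshold accuracy below (share of pixels with max(pred/truth, truth/pred) below 1.25) is an assumption borrowed from common depth-estimation practice, and the sample values are made up for illustration.

```python
import math

def rmse_mm(pred, truth):
    """Root mean squared error between two equally long depth lists (in mm)."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("inputs must be non-empty and equally long")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))

def threshold_accuracy(pred, truth, delta=1.25):
    """Fraction of pixels whose depth ratio max(p/t, t/p) is below delta.

    Assumption: the abstract does not name its accuracy definition; this is
    one common choice. All depths must be strictly positive.
    """
    hits = sum(1 for p, t in zip(pred, truth) if max(p / t, t / p) < delta)
    return hits / len(pred)

# Hypothetical per-pixel depths in mm (not data from the paper).
pred = [50.0, 80.0, 120.0, 30.0]
truth = [52.0, 75.0, 100.0, 45.0]

print(f"RMSE: {rmse_mm(pred, truth):.2f} mm")
print(f"threshold accuracy: {threshold_accuracy(pred, truth):.2f}")
```

In practice both metrics would be computed over all valid pixels of every test image, typically after masking pixels without ground truth.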
Abstract:
In laparoscopic liver surgery, image-guided navigation systems provide crucial support to surgeons by supplying information about tumor and vessel positions. To this end, information from a preoperative CT or MRI scan is overlaid onto the laparoscopic video. One option is to register the preoperative 3D data to 3D-reconstructed laparoscopic data. Robust registration is challenging due to factors such as a limited field of view, liver deformations, and 3D reconstruction errors. Since in practice these influencing factors always intertwine, it is crucial to analyze their combined effects. This paper assesses registration accuracy under various synthetically simulated influences: patch size, spatial displacement, Gaussian deformations, holes, and downsampling. The objective is to provide insights into the required quality of the intraoperative 3D surface patches. LiverMatch serves as the feature descriptor, and registration employs the RANSAC algorithm. The results show that a large field of view covering at least 15-20% of the liver surface is necessary, and that such a field of view allows tolerance for less accurate depth estimation.
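Three of the simulated influences listed above (Gaussian deformations, holes, and downsampling) can be sketched on a toy point cloud. The abstract does not describe how the perturbations were implemented, so the function names, the Gaussian-weighted displacement model, and all parameter values below are illustrative assumptions.

```python
import math
import random

def gaussian_deform(points, center, direction, amplitude, sigma):
    """Shift each point along `direction`, weighted by a Gaussian of its distance to `center`."""
    out = []
    for p in points:
        d2 = sum((pi - ci) ** 2 for pi, ci in zip(p, center))
        w = amplitude * math.exp(-d2 / (2.0 * sigma ** 2))
        out.append(tuple(pi + w * di for pi, di in zip(p, direction)))
    return out

def cut_hole(points, center, radius):
    """Remove all points closer than `radius` to `center`."""
    return [p for p in points if math.dist(p, center) >= radius]

def downsample(points, keep_fraction, seed=0):
    """Keep a random subset of the points (fixed seed for repeatability)."""
    rng = random.Random(seed)
    k = max(1, round(keep_fraction * len(points)))
    return rng.sample(points, k)

# Toy planar patch: 10 x 10 grid with 1 mm spacing at z = 0 (hypothetical data).
patch = [(float(x), float(y), 0.0) for x in range(10) for y in range(10)]
deformed = gaussian_deform(patch, center=(4.5, 4.5, 0.0),
                           direction=(0.0, 0.0, 1.0), amplitude=2.0, sigma=2.0)
with_hole = cut_hole(deformed, center=(0.0, 0.0, 0.0), radius=1.5)
sparse = downsample(with_hole, keep_fraction=0.5)

print(len(patch), len(with_hole), len(sparse))
```

Chaining the perturbations in this way mirrors the idea of studying combined rather than isolated effects; a perturbed patch would then be passed to the descriptor and registration stage.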
Abstract:
The lack of labeled, intraoperative patient data in medical scenarios poses a relevant challenge for machine learning applications. Given the apparent power of machine learning, this study examines how synthetically generated data can help to reduce the amount of clinical data needed for robust liver surface segmentation in laparoscopic images. Here, we report the results of three experiments using 525 annotated clinical images from 5 patients alongside 20,000 synthetic photo-realistic images from 10 patient models. The effectiveness of synthetic data is compared to that of data augmentation, a traditional performance-enhancing technique. For training, a supervised approach employing the U-Net architecture was chosen. The results of these experiments show a progressive increase in accuracy. Our base experiment on clinical data yielded an F1 score of 0.72. Applying data augmentation to this model increased the F1 score to 0.76. Our model pre-trained on synthetic data and fine-tuned with augmented data achieved an F1 score of 0.80, a further increase of 0.04. Additionally, a model evaluation involving k-fold cross-validation highlighted the dependency of the result on the test set. These results demonstrate that leveraging synthetic data can limit the need for additional patient data when increasing segmentation performance.
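For reference, the F1 score reported above can be computed from a predicted and an annotated binary mask as sketched below. Whether the study evaluates F1 per pixel, per image, or over the pooled test set is not stated in the abstract, so the pixel-wise variant and the toy masks here are assumptions.

```python
def f1_score(pred_mask, true_mask):
    """Pixel-wise F1 score for two flattened binary masks of equal length."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of pixels")
    tp = fp = fn = 0
    for p, t in zip(pred_mask, true_mask):
        if p and t:
            tp += 1          # liver predicted, liver annotated
        elif p and not t:
            fp += 1          # liver predicted, background annotated
        elif t and not p:
            fn += 1          # liver missed
    denom = 2 * tp + fp + fn
    # Convention chosen here: two empty masks count as a perfect match.
    return 2 * tp / denom if denom else 1.0

# Hypothetical flattened masks (1 = liver surface, 0 = background).
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 1, 1, 0]

print(f"F1: {f1_score(pred_mask, true_mask):.3f}")
```

F1 is the harmonic mean of precision and recall, so it penalizes both over-segmentation and missed liver regions.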
Abstract:
Objective. 3D localization of gamma sources has the potential to improve the outcome of radio-guided surgery. The goal of this paper is to analyze the localization accuracy for point-like sources with a single coded aperture camera. Approach. We both simulated and measured a point-like 241Am source at 17 positions distributed within the field of view of an experimental gamma camera. The setup includes a 0.11 mm thick tungsten sheet with a MURA mask of rank 31 and pinholes of 0.08 mm in diameter, and a detector based on the photon-counting readout circuit Timepix3. Two methods were compared to estimate the 3D source position: an iterative search including either a symmetric Gaussian fitting or an exponentially modified Gaussian (EMG) fitting, and a center of mass method. Main results. Considering the decreasing axial resolution with source-to-mask distance, the EMG fitting improved the results by a factor of 4 compared to the Gaussian fitting on the simulated data. Overall, we obtained a mean localization error of 0.77 mm on the simulated and 2.64 mm on the experimental data in the imaging range of 20–100 mm. Significance. This paper shows that, despite the low axial resolution, point-like sources in the near field can be localized as well as with more sophisticated imaging devices such as stereo cameras. The influence of the source size and the photon count on the imaging and localization accuracy remains an important issue for further research.
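The center of mass method mentioned above can be sketched as an intensity-weighted centroid over a reconstructed 3D image stack. The abstract does not specify which data the method operates on or any preprocessing (thresholding, background removal), so the stack layout `stack[z][y][x]`, the voxel spacing, and the toy values below are assumptions.

```python
def center_of_mass(stack, spacing=(1.0, 1.0, 1.0)):
    """Intensity-weighted centroid of a 3D stack indexed as stack[z][y][x].

    `spacing` is the voxel size (dx, dy, dz) in mm; the result is given in mm
    relative to the center of voxel (0, 0, 0).
    """
    total = sx = sy = sz = 0.0
    for z, plane in enumerate(stack):
        for y, row in enumerate(plane):
            for x, value in enumerate(row):
                total += value
                sx += value * x
                sy += value * y
                sz += value * z
    if total <= 0:
        raise ValueError("stack must contain positive total intensity")
    dx, dy, dz = spacing
    return (sx / total * dx, sy / total * dy, sz / total * dz)

# Hypothetical 3 x 3 x 3 stack with two equally bright voxels.
stack = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
stack[1][1][1] = 4.0
stack[1][1][2] = 4.0

position = center_of_mass(stack, spacing=(0.5, 0.5, 2.0))
print(position)
```

A centroid of this kind is cheap to compute but sensitive to background counts, which is one reason fitting-based approaches are often considered alongside it.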
Abstract:
Purpose: Handheld gamma cameras with coded aperture collimators are under investigation for intraoperative imaging in nuclear medicine. Coded apertures are a promising collimation technique for applications such as lymph node localization due to their high sensitivity and the possibility of 3D imaging. We evaluated the axial resolution and computational performance of two reconstruction methods. Methods: An experimental gamma camera was set up consisting of the pixelated semiconductor detector Timepix3 and a MURA mask of rank 31 with round holes of 0.08 mm in diameter in a 0.11 mm thick tungsten sheet. A set of measurements was taken in which a point-like gamma source was placed centrally at 21 different positions within the range of 12–100 mm. For each source position, the detector image was reconstructed in 0.5 mm steps around the true source position, resulting in an image stack. The axial resolution was assessed by the full width at half maximum (FWHM) of the contrast-to-noise ratio (CNR) profile along the z-axis of the stack. Two reconstruction methods were compared: MURA Decoding and a 3D maximum likelihood expectation maximization algorithm (3D-MLEM). Results: While taking 4400 times longer in computation, 3D-MLEM yielded a smaller axial FWHM and a higher CNR. The axial resolution degraded from 5.3 mm and 1.8 mm at 12 mm to 42.2 mm and 13.5 mm at 100 mm for MURA Decoding and 3D-MLEM, respectively. Conclusion: Our results show that the coded aperture enables the depth estimation of single point-like sources in the near field. Here, 3D-MLEM offered a better axial resolution but was computationally much slower than MURA Decoding, whose reconstruction time is compatible with real-time imaging.
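The FWHM assessment of a sampled profile along the z-axis can be sketched as follows. This is a minimal illustration assuming a single-peaked profile with a zero baseline and linear interpolation between samples; the abstract does not state how the half-maximum crossings were determined, and the profile values are made up.

```python
def fwhm(z, profile):
    """Full width at half maximum of a single-peaked sampled profile.

    The half-maximum crossings on both sides of the peak are located by
    linear interpolation between neighboring samples. The half maximum is
    measured from zero, i.e. a zero baseline is assumed.
    """
    peak = max(profile)
    i_peak = profile.index(peak)
    half = peak / 2.0

    # Walk left from the peak until the profile drops to or below half maximum.
    i = i_peak
    while i > 0 and profile[i] > half:
        i -= 1
    if profile[i] > half:
        raise ValueError("profile does not reach half maximum on the left")
    left = z[i] + (half - profile[i]) * (z[i + 1] - z[i]) / (profile[i + 1] - profile[i])

    # Walk right from the peak in the same way.
    j = i_peak
    while j < len(profile) - 1 and profile[j] > half:
        j += 1
    if profile[j] > half:
        raise ValueError("profile does not reach half maximum on the right")
    right = z[j - 1] + (profile[j - 1] - half) * (z[j] - z[j - 1]) / (profile[j - 1] - profile[j])

    return right - left

# Hypothetical triangular CNR profile sampled in 0.5 mm steps.
z = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
profile = [0.0, 2.0, 4.0, 6.0, 4.0, 2.0, 0.0]

print(f"FWHM: {fwhm(z, profile):.2f} mm")
```

For noisy profiles, fitting a peak model before measuring the width is a common alternative to direct interpolation.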