A. Wachter and W. Nahm. Workflow Augmentation of Video Data for Event Recognition with Time-Sensitive Neural Networks. In eprint, 2021
Supervised training of neural networks requires large, diverse and well-annotated data sets. In the medical field, this is often difficult to achieve due to constraints in time, expert knowledge and prevalence of an event. Artificial data augmentation can help to prevent overfitting and improve the detection of rare events as well as overall performance. However, most augmentation techniques use purely spatial transformations, which are not sufficient for video data with temporal correlations. In this paper, we present a novel methodology for workflow augmentation and demonstrate its benefit for event recognition in cataract surgery. The proposed approach increases the frequency of event alternation by creating artificial videos. The original video is split into event segments and a workflow graph is extracted from the original annotations. Finally, the segments are assembled into new videos based on the workflow graph. Compared to the original videos, the frequency of event alternation in the augmented cataract surgery videos increased by 26%. Further, a 3% higher classification accuracy and a 7.8% higher precision were achieved compared to a state-of-the-art approach. Our approach is particularly helpful to increase the occurrence of rare but important events and can be applied to a large variety of use cases.
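The three steps named in the abstract (splitting into event segments, extracting a workflow graph, assembling new videos) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the frame-wise label list, the event names, the function names and the random-walk assembly rule are all assumptions chosen for illustration, and segments are represented only by frame index ranges rather than by actual video data.

```python
import random
from collections import defaultdict
from itertools import groupby


def split_into_segments(labels):
    """Split frame-wise labels into (event, start, end) runs; end is exclusive."""
    segments, start = [], 0
    for event, run in groupby(labels):
        length = len(list(run))
        segments.append((event, start, start + length))
        start += length
    return segments


def build_workflow_graph(segments):
    """Map each event to the set of events observed directly after it."""
    graph = defaultdict(set)
    for (current, _, _), (following, _, _) in zip(segments, segments[1:]):
        graph[current].add(following)
    return dict(graph)


def assemble_video(segments, graph, start_event, n_segments, rng):
    """Random walk over the workflow graph (an assumed assembly rule).

    At each step one original segment of the current event is picked,
    then the walk moves to a randomly chosen observed successor event.
    """
    by_event = defaultdict(list)
    for segment in segments:
        by_event[segment[0]].append(segment)

    event, video = start_event, []
    for _ in range(n_segments):
        video.append(rng.choice(by_event[event]))
        successors = sorted(graph.get(event, ()))
        if not successors:  # event was only seen at the end of the video
            break
        event = rng.choice(successors)
    return video


# Hypothetical frame-wise annotation of one short video.
labels = ["idle"] * 3 + ["incision"] * 2 + ["idle"] * 2 + ["phaco"] * 4 + ["idle"]
segments = split_into_segments(labels)
graph = build_workflow_graph(segments)
video = assemble_video(segments, graph, "idle", 6, random.Random(0))
print(video)
```

Because every transition in the assembled sequence is an edge of the extracted graph, the artificial video only contains event orderings that were observed in the original annotation, while the number of alternations per video can be chosen freely through `n_segments`.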
A. Wachter, J. Kost, and W. Nahm. Simulation-Based Estimation of the Number of Cameras Required for 3D Reconstruction in a Narrow-Baseline Multi-Camera Setup. In Journal of Imaging, vol. 7(5), pp. 87, 2021
Graphical visualization systems are a common clinical tool for displaying digital images and three-dimensional volumetric data. These systems provide a broad spectrum of information to support physicians in their clinical routine. For example, the field of radiology enjoys unrestricted options for interaction with the data, since information is pre-recorded and available entirely in digital form. However, some fields, such as microsurgery, do not benefit from this yet. Microscopes, endoscopes, and laparoscopes show the surgical site as it is. To allow free data manipulation and information fusion, 3D digitization of surgical sites is required. We aimed to find the number of cameras needed to add this functionality to surgical microscopes. For this, we performed in silico simulations of the 3D reconstruction of representative models of microsurgical sites with different numbers of cameras in narrow-baseline setups. Our results show that eight independent camera views are preferable, while at least four are necessary for a digital surgical site. In most cases, eight cameras allow the reconstruction of over 99% of the visible part. With four cameras, over 95% can still be achieved. This answers one of the key questions for the development of a prototype microscope. In the future, such a system can provide functionality which is unattainable today.
Cranio-maxillofacial surgery often alters the aesthetics of the face, which can make the decision whether or not to undergo surgery a heavy burden for patients. Today, physicians can predict the post-operative face using surgery planning tools to support the patient’s decision-making. While these planning tools allow a simulation of the post-operative face, the facial texture must usually be captured by another 3D texture scan and subsequently mapped onto the simulated face. This approach often results in face predictions that do not appear realistic or lifelike and are therefore ill-suited to guide the patient’s decision-making. Instead, we propose a method using a generative adversarial network to modify a facial image according to a 3D soft-tissue estimation of the post-operative face. To circumvent the lack of available data pairs between pre- and post-operative measurements, we propose a semi-supervised training strategy using cycle losses that only requires paired open-source data of images and 3D surfaces of the face’s shape. After training on “in-the-wild” images, we show that our model can realistically manipulate local regions of a face in a 2D image based on a modified 3D shape. We then test our model on four clinical examples where we predict the post-operative face according to a 3D soft-tissue prediction of surgery outcome, which was simulated by a surgery planning tool. With these results, we aim to demonstrate the potential of our approach to predict realistic post-operative images of faces without the need for paired clinical data, physical models, or 3D texture scans.
This contribution is part of a project concerning the creation of an artificial dataset comprising 3D head scans of craniosynostosis patients for a deep-learning-based classification. To conform to real data, both head and neck are required in the 3D scans. However, during patient recording, the neck is often covered by medical staff. Simply pasting an arbitrary neck leaves large gaps in the 3D mesh. We therefore use a publicly available statistical shape model (SSM) for neck reconstruction. However, most SSMs of the head are constructed using healthy subjects, so the full head reconstruction loses the craniosynostosis-specific head shape. We propose a method to recover the neck while keeping the pathological head shape intact. To this end, we use a Laplace-Beltrami-based refinement step to deform the posterior mean shape of the full head model towards the pathological head. The artificial neck is created using the publicly available Liverpool-York-Model. We apply our method to construct artificial necks for head scans of 50 scaphocephaly patients. Our method reduces mean vertex correspondence error by approximately 1.3 mm compared to the ordinary posterior mean shape, preserves the pathological head shape, and creates a continuous transition between neck and head. The presented method showed good results for reconstructing a plausible neck for craniosynostosis patients. Because it generalizes easily, it might also be applicable to other pathological shapes.
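For readers unfamiliar with the term, the "ordinary posterior mean shape" of a PCA-based SSM given a partial observation is commonly written as below. This is the standard probabilistic-PCA formulation and only an assumption about the baseline; the abstract does not state which formulation or noise model the authors used. The symbols are introduced here for illustration: $\bar{s}$ is the mean shape, $Q$ the scaled principal components, $\sigma^2$ an assumed isotropic noise variance, and the subscript $g$ selects the rows belonging to the observed head vertices (the neck being unobserved).

```latex
\begin{align}
  s &= \bar{s} + Q\alpha + \varepsilon,
      \qquad \alpha \sim \mathcal{N}(0, I),
      \quad \varepsilon \sim \mathcal{N}(0, \sigma^2 I), \\
  \hat{\alpha} &= \left( Q_g^{\top} Q_g + \sigma^2 I \right)^{-1}
      Q_g^{\top} \left( s_g - \bar{s}_g \right), \\
  \hat{s} &= \bar{s} + Q \hat{\alpha}.
\end{align}
```

The reconstructed full shape $\hat{s}$ includes the neck, but because it lies in the span of a model built from healthy subjects, it can only approximate the pathological head; this is the gap the refinement step described in the abstract is meant to close.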