Contact
+49-9131-85-27775
+49-9131-85-27270
Secretary
Monday | 8:00 - 12:15
Tuesday | 8:00 - 16:45
Wednesday | 8:00 - 16:45
Thursday | 8:00 - 16:45
Friday | 8:00 - 12:15
Address
Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)
Lehrstuhl für Informatik 5 (Mustererkennung)
Martensstr. 3
91058 Erlangen
Germany
Ph.D. Gallery
Real-time Respiratory Motion Analysis Using GPU-accelerated Range Imaging
Respiratory motion analysis and management are crucial issues for a variety of medical applications. Of particular scientific concern are methods that analyze the patient’s breathing non-invasively, in real time, and without ionizing radiation exposure. For this purpose, range imaging technologies that dynamically acquire three-dimensional body surface data have been proposed in recent years. A particular challenge with such methods is the fully automatic investigation and assessment of the body surface data, as well as computation times that comply with real-time constraints. This dissertation is concerned with the application of range imaging principles to real-time automatic respiratory motion analysis. The focus is on the development of efficient methods for data preprocessing and fusion as well as machine learning and surface registration techniques. A particular emphasis of this thesis is the design of the proposed algorithms for GPU architectures to enable real-time computation.
The first part of this thesis covers the general challenges and requirements for respiratory motion analysis in diagnostic and therapeutic applications. Furthermore, the range imaging technologies that are relevant for this thesis are introduced, and the suitability of GPU architectures for accelerating several tasks inherent to range-imaging-based respiratory motion analysis is investigated.
The second part of this work is concerned with pre-processing and fusion techniques for range data. To account for the low signal-to-noise ratio that is common with range data, this work proposes a processing pipeline that reconstructs the ideal data with an error (trueness) of less than 1.0 mm at run times of 2 ms. For fusing range image data in a multi-camera setup, as required for the simultaneous acquisition of the frontal and lateral body surface, this thesis proposes a novel framework that enables the computation of a body surface model with 180° coverage, consisting of more than 3.0 × 10⁵ points, within a computation time of less than 5 ms.
The third part of this work is concerned with patient-specific respiratory motion models. The thesis proposes machine learning techniques to generate a continuous motion model that automatically differentiates between thoracic and abdominal breathing and quantitatively analyzes the patient’s respiration magnitude. Using purposely developed surface registration schemes, these models are then brought into congruence with body surface data acquired by range imaging sensors. This allows for respiratory-motion-compensated patient positioning that reduces the alignment error observed with conventional approaches by a factor of 3, to less than 4.0 mm. Furthermore, this approach automatically derives a multi-dimensional respiration surrogate that yields a correlation coefficient greater than 0.97 compared to commonly employed invasive or semi-automatic approaches and that can be computed in 20 ms.
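As a purely illustrative aside, the sketch below shows the simplest form such a range-based respiration surrogate can take: averaging the depth values over a chest region of interest and correlating the resulting 1-D curve with a reference signal. The ROI, frame rate and synthetic data are assumptions of the example, not the model-based surrogate of the thesis.
```python
import numpy as np

def respiration_surrogate(depth_frames, roi):
    """Reduce a (T, H, W) stack of range images to a 1-D breathing curve.

    depth_frames : depth values in mm; roi : (y0, y1, x0, x1) over the chest.
    This is an illustrative amplitude surrogate, not the thesis' learned model.
    """
    y0, y1, x0, x1 = roi
    chest = depth_frames[:, y0:y1, x0:x1]
    signal = chest.reshape(chest.shape[0], -1).mean(axis=1)
    # Zero-mean, unit-variance normalization so that amplitudes are comparable
    return (signal - signal.mean()) / (signal.std() + 1e-9)

def pearson_corr(a, b):
    """Correlation against a reference surrogate (e.g., a spirometry trace)."""
    a = (a - a.mean()) / (a.std() + 1e-9)
    b = (b - b.mean()) / (b.std() + 1e-9)
    return float(np.mean(a * b))

if __name__ == "__main__":
    # Synthetic example: 200 frames of a breathing torso plus sensor noise.
    t = np.arange(200) / 30.0                   # assumed 30 fps
    torso = 5.0 * np.sin(2 * np.pi * 0.25 * t)  # 15 breaths/min, 5 mm amplitude
    frames = 800 + torso[:, None, None] + np.random.randn(200, 64, 64)
    surrogate = respiration_surrogate(frames, roi=(16, 48, 16, 48))
    print("correlation with ground truth:", pearson_corr(surrogate, torso))
```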
The fourth part concludes this thesis with a summary of the presented methods and results, as well as an outlook regarding future research directions and challenges towards clinical translation.
Dominik Neumann | 24.06.2019
Robust Personalization of Cardiac Computational Models
Heart failure (HF) is a major cause of morbidity and mortality in the Western world, yet early diagnosis and treatment remain a major challenge. As computational cardiac models are becoming more mature, they are slowly evolving into clinical tools to better stratify HF patients, predict risk and perform treatment planning. A critical prerequisite, however, is their ability to precisely capture an individual patient’s physiology. The process of fitting a model to patient data is called personalization, which is the overarching topic of this thesis.
An image-based, multi-physics 3D whole-heart model is employed in this work. It consists of several components covering anatomy, electrophysiology, biomechanics and hemodynamics. Building upon state-of-the-art personalization techniques, the first goal was to develop an automated pipeline for personalizing all components of the model in a streamlined and reproducible fashion, based on routinely acquired clinical data. Evaluation was performed on a multi-clinic cohort of 113 patients, the largest cohort in any comparable study to date. The goodness of fit between personalized models and ground-truth clinical data was mostly below clinical variability, while a full personalization was completed within only a few hours. This showcases the ability of the proposed pipeline to extract advanced biophysical parameters robustly and efficiently.
Designing such personalization algorithms is a tedious, model- and data-specific process. The second goal was to investigate whether artificial intelligence (AI) concepts can be used to learn this task, inspired by how humans perform it manually. A self-taught artificial agent based on reinforcement learning (RL) is proposed, which first learns how the model behaves and then computes an optimal strategy for personalization. The algorithm is model-independent; applying it to a new model requires adjusting only a few hyper-parameters. The results obtained for two different models suggest that a goodness of fit equivalent to, if not better than, standard methods can be achieved, with greater robustness and a faster convergence rate. AI approaches could thus make personalization algorithms generalizable and self-adaptable to any patient and any model.
Due to limited data, uncertainty in the clinical measurements, parameter non-identifiability, and modeling assumptions, various combinations of parameter values may exist that yield the same quality of fit. The third goal of this work was therefore to quantify the uncertainty (UQ) of the estimated parameters and to ascertain the uniqueness of the solution found. A stochastic method based on Bayesian inference and fast surrogate models is proposed, which estimates the posterior of the model while taking into account uncertainties due to measurement noise. Experiments on the biomechanics model showed that not only could a goodness of fit equivalent to the standard methods be achieved, but the non-uniqueness of the problem could also be demonstrated and uncertainty estimates reported, crucial information for subsequent clinical assessment of the personalized models.
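For readers unfamiliar with surrogate-assisted Bayesian personalization, the following minimal sketch illustrates the general principle only: a cheap surrogate stands in for the expensive forward model inside a Metropolis-Hastings sampler, and the spread of the posterior samples indicates parameter uncertainty and non-uniqueness. The toy forward model and parameter names are invented for the example and are not those of the thesis.
```python
import numpy as np

def surrogate_forward(theta):
    """Stand-in for a fast surrogate of the cardiac model: parameters -> predicted
    measurements. A toy nonlinear function with two hypothetical parameters."""
    stiffness, contractility = theta
    return np.array([2.0 * stiffness + contractility,
                     np.tanh(stiffness) * contractility])

def log_posterior(theta, y_obs, noise_std, prior_std=5.0):
    """Gaussian measurement-noise likelihood times a broad Gaussian prior."""
    resid = surrogate_forward(theta) - y_obs
    log_lik = -0.5 * np.sum((resid / noise_std) ** 2)
    log_prior = -0.5 * np.sum((theta / prior_std) ** 2)
    return log_lik + log_prior

def metropolis(y_obs, noise_std, n_steps=20000, step=0.1, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    lp = log_posterior(theta, y_obs, noise_std)
    samples = []
    for _ in range(n_steps):
        prop = theta + step * rng.standard_normal(2)
        lp_prop = log_posterior(prop, y_obs, noise_std)
        if np.log(rng.random()) < lp_prop - lp:   # accept/reject
            theta, lp = prop, lp_prop
        samples.append(theta.copy())
    return np.array(samples[n_steps // 2:])       # discard burn-in

if __name__ == "__main__":
    true_theta = np.array([1.2, 0.8])
    y_obs = surrogate_forward(true_theta) + 0.05 * np.random.default_rng(1).standard_normal(2)
    post = metropolis(y_obs, noise_std=0.05)
    print("posterior mean:", post.mean(axis=0))
    print("posterior std :", post.std(axis=0))    # uncertainty / identifiability hint
```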
Respiratory Motion Compensation in X-Ray Fluoroscopy
Fluoroscopy is a common imaging modality in medicine for the guidance of minimally invasive interventions due to its high temporal and spatial resolution and the good visibility of interventional devices and bones. To counteract its lack of 3-D information and soft-tissue contrast, the X-ray images can be enhanced with overlays, which constitutes a medical application of augmented reality technology. Most commonly, the overlays are static. Due to the inevitable respiratory and cardiac motion of the patient when imaging the chest or abdomen, the images and the overlays are frequently inconsistent. In this thesis, two methods for compensating this involuntary motion are presented.
In the first method, a respiratory signal is estimated from a 2-D+t X-ray image sequence using unsupervised learning. The robustness of respiratory signal extraction to common disturbances occurring in interventional imaging is increased by patch-based processing and illumination invariance. The respiratory signal is used as prior information in the subsequent motion estimation step. Due to transparency effects in X-ray, conventional registration methods are not applicable for motion estimation. Thus, a novel surrogate-driven layered motion model is proposed to estimate the respiratory motion from the X-ray sequence. The motion model is incorporated in an energy formulation that is solved with an efficient graphics processing unit (GPU) implementation of primal-dual optimization.
In the second method, pre-operative magnetic resonance imaging (MRI) enables 3-D imaging with good soft-tissue contrast. Real-time 2-D+t slice images are acquired and stacked into 3-D+t volumes using a Markov random field (MRF). The MRF enforces similar surrogate signals in slices that are assigned to each other as well as temporal smoothness of the assignment. The surrogate signal for respiratory motion is estimated from the MRI images, while the cardiac surrogate signal is derived from electrocardiography (ECG). In the MRI volumes, conventional 3-D/3-D registration is used to estimate the patient motion. In both methods, the surrogate signals and the estimated motions are combined in a motion model, which is then used for motion compensation in X-ray.
For evaluation, experiments are conducted on pig and patient data. Compared to static overlays, the residual apparent motion in the X-ray images is reduced by 13% using the X-ray-based motion model and by 40% using the MRI-based motion model. The runtime of applying the motion model during the procedure is sufficient for real-time processing at common fluoroscopy frame rates. The two proposed methods have different properties: X-ray-based motion compensation requires no pre-procedural processing, but is more complex during the procedure and is limited to respiratory motion; MRI-based motion compensation can also handle cardiac motion, but needs to be transferred from MRI to X-ray. The choice of motion model depends on the requirements of the clinical application. To this end, the X-ray-based motion model is implemented in a prototype and evaluated clinically.
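As a simple illustration of extracting a respiratory surrogate from a 2-D+t sequence with unsupervised learning, the sketch below projects each frame onto the first principal component of the pixel time series. This is a common PCA baseline, not the patch-based, illumination-invariant method of the thesis; the synthetic sequence is an assumption of the example.
```python
import numpy as np

def respiratory_signal_pca(frames):
    """Estimate a 1-D respiratory surrogate from a (T, H, W) fluoroscopy sequence
    by projecting each frame onto the first principal component of the
    pixel-intensity time series. Illustrative baseline only.
    """
    T = frames.shape[0]
    X = frames.reshape(T, -1).astype(np.float64)
    X -= X.mean(axis=0, keepdims=True)           # remove the static background
    # PCA via SVD of the small T x T Gram matrix instead of a huge covariance
    U, _, _ = np.linalg.svd(X @ X.T)
    signal = U[:, 0]                              # temporal scores of the first PC
    # The sign of a principal component is arbitrary; fix it for reproducibility
    if signal[np.argmax(np.abs(signal))] < 0:
        signal = -signal
    return signal

if __name__ == "__main__":
    # Synthetic sequence: a bright band whose vertical position "breathes"
    T, H, W = 120, 64, 64
    frames = np.zeros((T, H, W))
    for i in range(T):
        center = int(32 + 8 * np.sin(2 * np.pi * i / 40.0))
        frames[i, center - 3:center + 3, :] = 1.0
    sig = respiratory_signal_pca(frames)
    crossings = np.sum(np.diff(np.sign(sig - sig.mean())) > 0)
    print("estimated breathing period (frames):", T / max(1, crossings))
```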
Chris Schwemmer | 29.03.2019
3-D Imaging of Coronary Vessels Using C-arm CT
Cardiovascular disease has become the number one cause of death worldwide. For the diagnosis and therapy of coronary artery disease, interventional C-arm-based fluoroscopy is an imaging method of choice. While these C-arm systems are also capable of rotating around the patient and thus allow a CT-like 3-D image reconstruction, their long rotation time of about five seconds leads to strong motion artefacts in 3-D coronary artery imaging. In this work, a novel method is introduced that is based on a 2-D–2-D image registration algorithm. It is embedded in an iterative algorithm for motion estimation and compensation and does not require any complex segmentation or user interaction. It is thus fully automatic, which is a very desirable feature for interventional applications. The method is evaluated on simulated and human clinical data. Overall, it could be shown that the method can be successfully applied to a large set of clinical data without user interaction or parameter changes, and with high robustness against the initial 3-D image quality, while delivering results that are at least on par with the current state of the art, and better in many cases.
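The basic building block, 2-D–2-D registration, can be illustrated with a toy example: an exhaustive integer-translation search that maximizes normalized cross-correlation between two projections. The thesis embeds a more capable registration inside an iterative motion estimation and compensation loop; the sketch below only conveys the idea.
```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized images."""
    a = (a - a.mean()) / (a.std() + 1e-9)
    b = (b - b.mean()) / (b.std() + 1e-9)
    return float(np.mean(a * b))

def register_translation(fixed, moving, max_shift=10):
    """Exhaustive integer-translation search maximizing NCC (toy 2-D/2-D registration)."""
    best = (-np.inf, (0, 0))
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(moving, dy, axis=0), dx, axis=1)
            score = ncc(fixed, shifted)
            if score > best[0]:
                best = (score, (dy, dx))
    return best[1], best[0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fixed = rng.random((64, 64))
    moving = np.roll(np.roll(fixed, -4, axis=0), 7, axis=1)    # known shift
    shift, score = register_translation(fixed, moving)
    print("recovered shift:", shift, "NCC:", round(score, 3))  # expect (4, -7)
```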
Artificial Intelligence for Medical Image Understanding
Robust and fast detection and segmentation of anatomical structures in medical image data represents an important component of medical image analysis technologies. Current solutions are based on machine learning techniques that exploit large annotated image databases in order to learn the appearance of the captured anatomy. These solutions are subject to several limitations: the use of suboptimal image feature engineering methods and, most importantly, the use of computationally suboptimal search schemes for anatomy parsing, e.g., exhaustive hypothesis scanning. In particular, these techniques do not effectively address cases of incomplete data, i.e., scans acquired with a partial field-of-view.
To address these challenges, we introduce in this thesis marginal space deep learning, a framework for medical image parsing which exploits the automated feature design of deep learning models and an efficient object parametrization scheme in hierarchical marginal spaces. To support the efficient evaluation of solution hypotheses under complex transformations, such as rotation and anisotropic scaling, we propose a novel cascaded network architecture, called sparse adaptive neural network. Experiments on detecting and segmenting the aortic root in 2891 3D ultrasound volumes from 869 patients demonstrate a high level of robustness, with an accuracy increase of 30-50% over the state of the art. Nevertheless, using a scanning routine to explore large parameter subspaces results in high computational complexity, false-positive predictions and limited scalability to high-resolution volumetric data.
To deal with these limitations, we propose a novel paradigm for medical image parsing, based on principles of cognitive modeling and behavior learning. The anatomy detection problem is reformulated as a behavior learning task for an intelligent artificial agent. Using deep reinforcement learning, agents are taught how to search for an anatomical structure. This amounts to learning to navigate optimal search trajectories through the image space that converge to the locations of the sought anatomical structures. To support the effective parsing of high-resolution volumetric data, we apply elements from scale-space theory and enhance our framework to support the learning of multi-scale search strategies through the scale-space representation of medical images. Finally, to enable accurate recognition of whether certain anatomical landmarks are missing from the field-of-view, we exploit prior knowledge about the anatomy and ensure the spatial coherence of the agents by using statistical shape modeling and robust estimation theory.
Comprehensive experiments demonstrate a high level of accuracy compared to state-of-the-art solutions, without failures of clinical significance. In particular, our method achieves 0% false positive and 0% false negative rates at detecting whether anatomical structures are captured in the field-of-view (excluding border cases). The dataset contains 5043 3D computed tomography volumes from over 2000 patients, totaling over 2,500,000 image slices. A significant increase in accuracy compared to reference solutions is also achieved on additional 2D ultrasound and 2D/3D magnetic resonance datasets containing up to 1000 images. Most importantly, this paradigm improves the detection speed of previous solutions by 2-3 orders of magnitude, achieving unmatched real-time performance on high-resolution volumetric scans.
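To convey the idea of detection as behavior learning, the sketch below trains a drastically simplified tabular Q-learning agent that navigates a small 2-D grid toward a fixed "landmark", with the reward defined as the per-step reduction in distance. The grid size, target and hyper-parameters are illustrative; the thesis uses deep, multi-scale agents on volumetric images.
```python
import numpy as np

# Tabular Q-learning on a tiny 2-D grid: the agent learns to walk toward a
# fixed landmark position. Reward = decrease in Euclidean distance per step.
GRID = 16
TARGET = np.array([11, 4])
ACTIONS = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])  # down, up, right, left

def step(pos, a):
    new = np.clip(pos + ACTIONS[a], 0, GRID - 1)
    reward = np.linalg.norm(pos - TARGET) - np.linalg.norm(new - TARGET)
    return new, reward

def train(episodes=3000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((GRID, GRID, len(ACTIONS)))
    for _ in range(episodes):
        pos = rng.integers(0, GRID, size=2)
        for _ in range(4 * GRID):                       # episode length cap
            if rng.random() < eps:
                a = int(rng.integers(len(ACTIONS)))     # explore
            else:
                a = int(np.argmax(Q[pos[0], pos[1]]))   # exploit
            new, r = step(pos, a)
            td_target = r + gamma * Q[new[0], new[1]].max()
            Q[pos[0], pos[1], a] += alpha * (td_target - Q[pos[0], pos[1], a])
            pos = new
            if np.array_equal(pos, TARGET):
                break
    return Q

if __name__ == "__main__":
    Q = train()
    pos = np.array([1, 14])                             # greedy rollout from a corner
    for _ in range(64):
        pos, _ = step(pos, int(np.argmax(Q[pos[0], pos[1]])))
        if np.array_equal(pos, TARGET):
            break
    print("converged to landmark:", bool(np.array_equal(pos, TARGET)))
```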
Felix Lugauer-Breunig | 14.12.2018
Iterative Reconstruction Methods for Accelerating Quantitative Abdominal MRI
Vincent Christlein | 30.11.2018
Handwriting Analysis with Focus on Writer Identification and Writer Retrieval
In the course of the mass digitization of historical as well as contemporary sources, an individual examination by historical or forensic experts is no longer feasible. A solution could be an automatic handwriting analysis that determines or suggests script attributes, such as the writer or the date of a document. In this work, several novel techniques based on machine learning are presented to obtain these attributes from a single document image. The focus lies on writer recognition, for which a novel pipeline is developed that identifies the correct writer of a given sample in over 99 % of all tested contemporary datasets, numbering between 150 and 310 writers each, with four to five samples per writer. On a large historical dataset, consisting of 720 writers and five samples per writer, an identification rate of close to 90 % is achieved.
Robust local descriptors play a major role in the success of this pipeline. Shape- and histogram-based descriptors prove to be very effective. Furthermore, novel deep-learning-based features are developed using deep convolutional neural networks, which are trained with writer information from the training set. While these features achieve very good results on contemporary data, they lack distinctiveness on the evaluated historical dataset. Therefore, a novel feature learning technique is presented that solves this by learning robust writer-independent script features in an unsupervised manner. The computation of a global descriptor from the local descriptors is the next step. For this encoding procedure, various techniques from the speech and computer vision communities are investigated and thoroughly evaluated. It is important to counter several effects, such as feature correlation and the over-counting of local descriptors. Overall, methods based on aggregating first-order statistics of residuals are the most effective approaches. Common writer recognition methods use the global descriptors directly for comparison. In contrast, exemplar classifiers are introduced in this thesis, allowing sample-individual similarities to be learned, which are shown to be very effective for improved writer recognition.
This writer recognition pipeline is adapted to other tasks related to digital paleography. Medieval papal charters are automatically dated up to an error range of 17 years. Furthermore, an adapted pipeline is among the best at classifying medieval Latin manuscripts into twelve different script types. This information can then be used for pre-sorting documents or as a preprocessing step for handwritten text recognition. It turns out that counteracting different illumination and contrast effects is an important factor for deep-learning-based approaches. The observation that script has tubular structures similar to blood vessels is exploited for improved text block segmentation in historical data by means of a well-known medical filtering technique. This work sets new recognition standards for several tasks, allowing the automatic document analysis of large corpora with low error rates. These methods are also applicable to other fields, such as forensics or paleography, to determine writers, script types or other metadata of contemporary or historical documents.
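The "aggregation of first-order statistics of residuals" mentioned above corresponds to encodings in the spirit of VLAD. Below is a minimal sketch, assuming scikit-learn for the codebook and synthetic descriptors in place of real SIFT or learned features; it is not the exact configuration of the thesis.
```python
import numpy as np
from sklearn.cluster import KMeans

def train_codebook(descriptors, n_clusters=64, seed=0):
    """Learn a visual vocabulary from local descriptors of the training set."""
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(descriptors)

def vlad_encode(descriptors, codebook):
    """Aggregate first-order residual statistics (VLAD) into one global descriptor."""
    centers = codebook.cluster_centers_
    assignments = codebook.predict(descriptors)
    k, d = centers.shape
    enc = np.zeros((k, d))
    for i in range(k):
        members = descriptors[assignments == i]
        if len(members):
            enc[i] = (members - centers[i]).sum(axis=0)   # residuals to the center
    enc = enc.ravel()
    enc = np.sign(enc) * np.sqrt(np.abs(enc))             # power normalization
    norm = np.linalg.norm(enc)
    return enc / norm if norm > 0 else enc

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train_desc = rng.standard_normal((5000, 64))   # stand-in for local script features
    doc_desc = rng.standard_normal((300, 64))      # descriptors of one document image
    cb = train_codebook(train_desc, n_clusters=16)
    g = vlad_encode(doc_desc, cb)
    print("global descriptor length:", g.shape[0])  # 16 * 64
```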
Tanja Kurzendorfer | 15.10.2018
Fully Automatic Segmentation of Anatomy and Scar from LGE-MRI
Cardiovascular diseases are the leading cause of death worldwide, and the number of patients suffering from heart failure is rising. The underlying cause of heart failure is often a myocardial infarction. For diagnosis in clinical routine, cardiac magnetic resonance imaging is used, as it provides information about morphology, blood flow, perfusion, and tissue characterization. In particular, the analysis of tissue viability is very important for diagnosis, procedure planning, and guidance, e.g., for the implantation of a bi-ventricular pacemaker. The clinical gold standard for viability assessment is 2-D late gadolinium enhanced magnetic resonance imaging (LGE-MRI). In recent years, imaging quality has continuously improved and LGE-MRI has been extended to a 3-D whole-heart scan. This scan enables an accurate quantification of the myocardium and the extent of myocardial scarring. The main challenge lies in the accurate segmentation and analysis of such images. In this work, novel methods for the segmentation of LGE-MRI data sets, both 2-D and 3-D, are proposed. One important goal is the direct segmentation of the LGE-MRI and independence from an anatomical scan, to avoid errors from anatomical-scan contour propagation. For the 2-D LGE-MRI segmentation, the short-axis stack of the left ventricle (LV) is used. First, the blood pool is detected and a rough outline is obtained by a morphological active-contours-without-edges approach. Afterwards, the endocardial and epicardial boundaries are estimated by either a filter-based or a learning-based method in combination with a minimal cost path search in polar space. For the endocardial contour refinement, an additional scar exclusion step is added. For the 3-D LGE-MRI, the LV is detected within the whole-heart scan. In the next step, the short-axis view is estimated using principal component analysis. For the endocardial and epicardial boundary estimation, a filter-based or learning-based approach can likewise be applied in combination with dynamic programming in polar space. Furthermore, because of the high resolution, the papillary muscles are also segmented. In addition to the fully automatic LV segmentation approaches, a generic semi-automatic method based on Hermite radial basis function interpolation is introduced in combination with a smart brush. Effective interactions with a smaller number of equations accelerate the computation, so that real-time, intuitive, and interactive segmentation of 3-D objects is supported effectively. After the segmentation of the left ventricle's myocardium, the scar tissue is quantified. In this thesis, three approaches are investigated. The full-width-at-half-max algorithm and the x-standard-deviation methods are implemented in a fully automatic manner. Furthermore, a texture-based scar classification algorithm is introduced. Subsequently, the scar tissue can be visualized, either in 3-D as a surface mesh or in 2-D projected onto the 16-segment bull's eye plot of the American Heart Association. However, for precise procedure planning and guidance, information about the scar transmurality is very important. Hence, a novel scar layer visualization is introduced: the scar tissue is divided into three layers depending on the location of the scar within the myocardium. With this novel visualization, an easy distinction between endocardial, mid-myocardial, and epicardial scar is possible.
The scar layers can also be visualized in 3-D as surface meshes or in 2-D projected onto the 16 segment bull’s eye plot.
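The full-width-at-half-max criterion mentioned above is straightforward to state in code: myocardial voxels brighter than half of the maximum myocardial intensity are labeled as scar. A minimal 2-D sketch, assuming the myocardium mask is already given:
```python
import numpy as np

def fwhm_scar_mask(lge_slice, myocardium_mask):
    """Full-width-at-half-maximum scar segmentation on one LGE-MRI slice.

    Voxels inside the myocardium whose intensity exceeds half of the maximum
    myocardial intensity are labeled as scar.
    """
    myo_values = lge_slice[myocardium_mask]
    threshold = 0.5 * myo_values.max()
    return myocardium_mask & (lge_slice >= threshold)

def scar_burden(scar_mask, myocardium_mask):
    """Scar burden as a percentage of the myocardial area/volume."""
    return 100.0 * scar_mask.sum() / myocardium_mask.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.normal(100, 10, size=(128, 128))
    myo = np.zeros((128, 128), dtype=bool)
    myo[40:90, 40:90] = True                 # toy "myocardium"
    img[60:70, 60:70] += 150                 # toy hyper-enhanced scar region
    scar = fwhm_scar_mask(img, myo)
    print("scar burden: %.1f %%" % scar_burden(scar, myo))
```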
Matthias Weidler | 14.09.2018
3-D/2-D Registration of Left Atria Surface Models to Fluoroscopic Images for Cardiac Ablation Procedures
Atrial fibrillation is one of the most common cardiac arrhythmias. It can be treated minimally invasively by catheter ablation. For guidance during the intervention, augmented fluoroscopy systems are gaining more and more popularity. These systems allow pre-operatively acquired data, e.g., from computed tomography, to be fused with intra-operative patient data. This facilitates navigation during the procedure by overlaying a 3-D model of the patient's left atrium onto the fluoroscopic images. Moreover, if X-ray images are acquired from two views, 2-D image annotations can be displayed with respect to the 3-D patient model. Image fusion and annotation require an accurate registration of pre-operative and intra-operative data, which is mostly performed manually. This thesis is primarily concerned with methods for automating both the registration process and the steps required for registration. Thus, we also propose methods for reconstructing the 3-D shape of catheters from 2-D X-ray images, as this is needed later for registration.
In the first part of this thesis, we present methods for fast 3-D annotation of catheters. The first method is able to annotate whole line-shaped catheters in 2-D X-ray images based on a single seed point. To this end, catheter-like image regions are transformed into a graph-like structure which serves as a reduced search space for the catheter detection method. Resulting annotations in two X-ray images from different views can then be used to compute a 3-D reconstruction of the catheters. Our proposed method establishes point correspondences based on epipolar geometry. We define an optimality criterion that makes this approach robust with respect to spurious and missing point correspondences. Both methods are then used to establish a method for automatic cryoballoon catheter reconstruction.
The second part investigates registration methods based on devices placed at certain anatomical structures. We present two different methods, one for thermal ablation and one for cryoablation. The first method relies on line-shaped devices placed outside the left atrium in the oesophagus and the coronary sinus. Their 3-D shape can be reconstructed using the algorithms presented in the first part and can then be aligned to their corresponding 3-D structures segmented from the pre-operative data. The second method uses the pulmonary vein ostium, a structure inside the left atrium in which cryoballoons are placed. A registration is established by relating the ostium position to a reconstruction of the cryoballoon computed using the approach presented in the first part. We use a skeletonization of the 3-D left atrium model to extract potential ostia from the model.
In the last part, we consider an automatic registration method based on injected contrast agent. In this context, we present a method for the classification of contrasted frames. Moreover, we present a novel similarity measure for 3-D/2-D registration that takes into account how plausible a registration is. Plausibility is determined with respect to a reconstructed contrast agent distribution within the 3-D left atrium and the contrast agent in the 2-D images. We show that combining this similarity measure with a measure that relates edge information from the contrast agent in 2-D images to edges of the 3-D model increases accuracy substantially. As a final step, the frame-wise registration results are post-processed by means of a Markov chain model of the cardiac motion. This temporal filtering reduces outliers and improves registration quality significantly.
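The 3-D catheter reconstruction from two views rests on standard epipolar geometry. As an illustration of that building block only (the thesis' contribution lies in the robust correspondence selection), the following sketch triangulates a single 3-D point from two known projection matrices by linear (DLT) triangulation; the camera parameters are invented for the example.
```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3-D point from two views.

    P1, P2 : 3x4 projection matrices of the two C-arm views.
    x1, x2 : corresponding 2-D image points (in pixels).
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

if __name__ == "__main__":
    # Two synthetic views: a reference view and a view translated along x.
    K = np.array([[1000.0, 0, 256], [0, 1000.0, 256], [0, 0, 1]])
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([np.eye(3), np.array([[-50.0], [0.0], [0.0]])])
    X_true = np.array([10.0, -5.0, 400.0])
    X_est = triangulate_point(P1, P2, project(P1, X_true), project(P2, X_true))
    print("reconstruction error:", np.linalg.norm(X_est - X_true))
```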
Automated Glaucoma Detection with Optical Coherence Tomography
The number of patients suffering from glaucoma will increase in the future. Further automation of parts of the diagnostic routine is inevitable in order to use limited examination times more efficiently. Optical coherence tomography (OCT) has become a widespread tool for glaucoma diagnosis, and data collections built up in the clinics in recent years now allow data mining and pattern recognition approaches to be applied to the diagnostic challenge. A complete pattern recognition pipeline to automatically discriminate glaucomatous from normal eyes with OCT data is proposed, implemented and evaluated. A data collection of 1024 Spectralis HRA+OCT circular scans around the optic nerve head from 565 subjects forms the basis for this work. The data collection is labeled with 4 diagnoses: 453 healthy (H), 179 ocular hypertension (OHT), 168 preperimetric glaucoma (PPG), and 224 perimetric glaucoma (PG) eyes.
In a first step, 6 retinal layer boundaries are automatically segmented by edge detection and the minimization of a custom energy functional, which was established in preceding work by the author. The segmentation algorithm is evaluated on a subset consisting of 120 scans. The automatically segmented layer boundaries are compared to a gold standard (GS) created from manual corrections of the automated results by 5 observers. The mean absolute difference of the automated segmentation to the GS for the outer nerve fiber layer boundary is 2.84 µm. The other layers show less or almost no segmentation error. No significant correlation between the segmentation error and scans of bad quality or glaucomatous eyes could be found for any layer boundary. The difference of the automated segmentation to the GS is not much worse than a single observer’s manual correction difference to the GS. In a second step, the thickness profiles generated by the segmentation are used in a classification system: in total, 762 features are generated, including novel ratio and principal component analysis features. “Forward selection and backward elimination” selects the best performing features with respect to the class-wise averaged classification rate (CR) on the training data. The segmentations of the complete dataset were manually corrected so that the classification experiments could be run on either manually corrected or purely automated segmentations. Three classifiers were compared. The support vector machine (SVM) classifier performed best in a 10-fold cross-validation and differentiated non-glaucomatous (H and OHT) from glaucomatous (PPG and PG) eyes with a CR of 0.859 on manually corrected data. The classification system adapts to the less reliable purely automated segmentations by choosing features of a more global scale. Training with manually corrected and testing with purely automated data, and vice versa, shows that it is advantageous to use manually corrected data for training, no matter what the type of test data is. The distance of the feature vectors to the SVM decision boundary is used as the basis for a novel glaucoma probability score based on OCT data, the OCT-GPS.
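The last stage, turning the distance to the SVM decision boundary into a score, can be sketched as follows with scikit-learn and synthetic feature vectors; the logistic squashing shown is only one possible mapping and is not necessarily the definition of the OCT-GPS.
```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Synthetic stand-ins for thickness-profile features:
# class 0 = non-glaucomatous (H/OHT), class 1 = glaucomatous (PPG/PG)
X0 = rng.normal(loc=0.0, scale=1.0, size=(200, 20))
X1 = rng.normal(loc=0.8, scale=1.0, size=(200, 20))
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

clf = SVC(kernel="linear", C=1.0)
print("10-fold CV accuracy:", cross_val_score(clf, X, y, cv=10).mean())

clf.fit(X, y)
# Signed distance to the decision boundary, squashed to [0, 1] as a score
margin = clf.decision_function(X)
score = 1.0 / (1.0 + np.exp(-margin))
print("example scores:", np.round(score[:3], 3), np.round(score[-3:], 3))
```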
Peter Fürsattel | 04.05.2018
Accurate Measurements with Off-the-Shelf Range Cameras
Distance cameras have gained great popularity in recent years. With more than 24 million Microsoft Kinect units sold and the proliferation of 3-D sensors for biometric authentication, these cameras have reached the mass market. Distance cameras capture an image in which each pixel encodes the distance to its corresponding point in the scene. This opens up new application possibilities which are difficult or even impossible to implement with normal gray-level or color cameras. These new applications are particularly useful if they can be implemented with low-cost consumer 3-D cameras. However, this is problematic, as these sensors have only limited accuracy compared to professional measurement systems and are thus not yet sufficient for many applications. In this thesis, several aspects that affect the accuracy of time-of-flight and structured light cameras are discussed.
The calibration of cameras, i.e., the calculation of an exact camera model, is of major importance. The estimation of these models requires point correspondences between the scene and the camera image. Whenever high accuracy is required, it is recommended to use calibration patterns such as checkerboards. This thesis introduces two methods which find checkerboards more reliably and accurately than existing algorithms. The evaluation of the measurement errors of distance cameras requires reference values that are considerably more accurate than those of the camera. This thesis presents a method that allows a terrestrial laser scanner to be used to acquire such data. However, before the reference data can be used for error analysis, it is necessary to transform the measurements of both sensors into a common coordinate system. For this purpose, an automatic method was developed that reliably calculates the unknown transformation based on a single calibration scene. The accuracy of this approach is confirmed in several experiments and clearly exceeds that of the competing state-of-the-art method. In addition, it is possible to generate reference distance images with this method, which can subsequently be used for the evaluation of distance cameras.
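Once corresponding points in both sensors' coordinate systems are known, the transformation between them is a rigid least-squares (Kabsch) fit. The sketch below shows that final step on synthetic correspondences; automatically finding the correspondences from a single calibration scene is the actual contribution described above.
```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rigid transform (R, t) mapping src points onto dst points.

    src, dst : (N, 3) arrays of corresponding 3-D points
    (e.g., range-camera vs. laser-scanner coordinates).
    """
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    t = c_dst - R @ c_src
    return R, t

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.random((100, 3)) * 2.0
    angle = np.deg2rad(30)
    R_true = np.array([[np.cos(angle), -np.sin(angle), 0],
                       [np.sin(angle),  np.cos(angle), 0],
                       [0, 0, 1]])
    dst = src @ R_true.T + np.array([0.1, -0.3, 1.5]) \
        + 0.001 * rng.standard_normal((100, 3))
    R, t = rigid_align(src, dst)
    rmse = np.sqrt(np.mean(np.sum((src @ R.T + t - dst) ** 2, axis=1)))
    print("alignment RMSE (input units):", rmse)
```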
Time-of-flight (ToF) cameras have some error sources that are characteristic of this measurement principle. In order to better compensate for these errors, it is first necessary to investigate their nature. This thesis also presents a comprehensive, standardized evaluation of the systematic errors of ToF cameras. For this purpose, six experiments are defined, which are subsequently carried out with eight ToF cameras. The evaluation of these experiments shows that the characteristic errors are pronounced to different degrees in the investigated cameras, but can nonetheless be observed even with the most recent models. Finally, a new calibration method for structured-light sensors is proposed, as well as an algorithm which refines parametric models of objects. The calibration method allows a complete sensor model to be calculated from two or more images, even if the projected pattern is unknown. This is particularly necessary if the sensor does not use a regular projector, but emits the pattern with a diffractive optical element. This part of the thesis also presents a novel refinement method for parametric models that uses the developed camera model. The evaluation results show that the proposed method computes more accurate model parameters than state-of-the-art fitting algorithms.
Oliver Taubmann | 19.04.2018
Dynamic Cardiac Chamber Imaging in C-arm Computed Tomography
Cardiovascular diseases, i.e. disorders pertaining to the heart and blood vessels, are a major cause of mortality in developed countries. Many of these disorders, such as stenoses and some cases of valvular dysfunction, can be diagnosed and treated minimally invasively in percutaneous, catheter-based interventions. Navigation of the catheters as well as assessment and guidance of these procedures rely on interventional X-ray projection imaging performed using an angiographic C-arm device.
From rotational angiography acquisitions, during which the C-arm rotates on a circular trajectory around the patient, volumetric images can be reconstructed similarly to conventional computed tomography (CT). A three-dimensional representation of the beating heart allowing for a comprehensive functional analysis during the intervention would be useful for clinicians. However, due to the slow rotational speed of the C-arm and the resulting inconsistency of the raw data, imaging dynamic objects is challenging. More precisely, only small, substantially undersampled subsets of the data, which correspond to the same cardiac phases, are approximately consistent. This causes severe undersampling artifacts in the images unless sophisticated reconstruction algorithms are employed. The goal of this thesis is to develop and evaluate such methods in order to improve the quality of dynamic imaging of cardiac chambers in C-arm CT.
One of the two approaches that is investigated in this work aims to mitigate raw data inconsistencies by compensating for the heart motion. It relies on a non-rigid motion estimate obtained from a preliminary reconstruction by means of image registration. We develop a pipeline for artifact reduction and denoising of these preliminary images that increases the robustness of motion estimation and thus removes artificial motion patterns in the final images. We also propose an iterative scheme alternating motion estimation and compensation combined with spatio-temporal smoothing to further improve both image quality and accuracy of motion estimation. Furthermore, we design an open-source tool for comparing motion-compensated reconstruction methods in terms of edge sharpness.
The other approach formulates reconstruction as an optimization problem and introduces prior models of the image appearance in order to find a suitable solution. In particular, sparsity-based regularization as suggested by compressed sensing theory proves beneficial. We investigate and compare temporal regularizers, which yield considerable image quality improvements. In a task-based evaluation concerned with functional analysis of the left ventricle, we study how spatio-temporally regularized reconstruction, carried out with a state-of-the-art proximal algorithm, degrades when the number of projection views is reduced. Finally, we devise a correction scheme that enables dynamic reconstruction of a volume of interest in order to reduce computational effort.
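The principle of sparsity-based regularization can be illustrated with the simplest proximal method, ISTA, applied to a generic undersampled l1-regularized least-squares problem. This is only a didactic stand-in for the spatio-temporally regularized reconstruction and the state-of-the-art proximal algorithm used in the thesis.
```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, b, lam, n_iter=500):
    """Proximal gradient (ISTA) for  min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - grad / L, lam / L)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 64, 256                          # fewer measurements than unknowns
    x_true = np.zeros(m)
    x_true[rng.choice(m, size=10, replace=False)] = rng.standard_normal(10)
    A = rng.standard_normal((n, m)) / np.sqrt(n)
    b = A @ x_true + 0.01 * rng.standard_normal(n)
    x_hat = ista(A, b, lam=0.05)
    print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```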
Compared to one another, the approaches exhibit differences with regard to the appearance of the reconstructed images in general and the cardiac motion in particular. A straightforward combination of the methods yields a trade-off between these properties. All in all, both the hybrid and the individual approaches are able to reconstruct dynamic cardiac images with good quality in light of the challenges of rotational angiography.
Multi-Frame Super-Resolution Reconstruction with Applications to Medical Imaging
The optical resolution of a digital camera is one of its most crucial parameters with broad relevance for consumer electronics, surveillance systems, remote sensing, or medical imaging. However, resolution is physically limited by the optics and sensor characteristics. In addition, practical and economic reasons often stipulate the use of out-dated or low-cost hardware. Super-resolution is a class of retrospective techniques that aims at high-resolution imagery by means of software. Multi-frame algorithms approach this task by fusing multiple low-resolution frames to reconstruct high-resolution images. This work covers novel super-resolution methods along with new applications in medical imaging.
The first contribution of this thesis concerns computational methods to super-resolve image data of a single modality. The emphasis lies on motion-based algorithms that are derived from a Bayesian statistics perspective, where subpixel motion of low-resolution frames is exploited to reconstruct a high-resolution image. More specifically, we introduce a confidence-aware Bayesian observation model to account for outliers in the image formation, e.g. invalid pixels. In addition, we propose an adaptive prior for sparse regularization to model natural images appropriately. We then develop a robust optimization algorithm for super-resolution using this model that features a fully automatic selection of latent hyperparameters. The proposed approach is capable of meeting the requirements regarding robustness of super-resolution in real-world systems including challenging conditions ranging from inaccurate motion estimation to space variant noise. For instance, in case of inaccurate motion estimation, the proposed method improves the peak-signal-to-noise ratio (PSNR) by 0.7 decibel (dB) over the state-of-the-art.
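A bare-bones multi-frame fusion can be sketched with a forward model of integer shift, Gaussian blur and downsampling, fitted by gradient descent on the data term alone. The confidence-aware observation model, sparse prior and hyperparameter selection of the thesis are deliberately omitted; blur width, scale factor and shifts below are assumptions of the toy example.
```python
import numpy as np
from scipy.ndimage import gaussian_filter

SCALE, SIGMA = 2, 1.0   # downsampling factor and blur width of the toy forward model

def forward(hr, dy, dx):
    """Warp (integer shift), blur, downsample: one low-resolution observation."""
    warped = np.roll(np.roll(hr, dy, axis=0), dx, axis=1)
    return gaussian_filter(warped, SIGMA)[::SCALE, ::SCALE]

def adjoint(lr, dy, dx, hr_shape):
    """Adjoint of the forward operator (zero-fill upsampling, blur, inverse shift)."""
    up = np.zeros(hr_shape)
    up[::SCALE, ::SCALE] = lr
    blurred = gaussian_filter(up, SIGMA)
    return np.roll(np.roll(blurred, -dy, axis=0), -dx, axis=1)

def super_resolve(lr_frames, shifts, hr_shape, n_iter=200, step=0.5):
    """Least-squares multi-frame fusion by gradient descent (no prior term here)."""
    x = np.zeros(hr_shape)
    for _ in range(n_iter):
        grad = np.zeros(hr_shape)
        for lr, (dy, dx) in zip(lr_frames, shifts):
            grad += adjoint(forward(x, dy, dx) - lr, dy, dx, hr_shape)
        x -= step * grad / len(lr_frames)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hr_true = gaussian_filter(rng.random((64, 64)), 1.5)       # smooth ground truth
    shifts = [(0, 0), (1, 0), (0, 1), (1, 1)]                  # sub-(LR-)pixel offsets
    lr_frames = [forward(hr_true, dy, dx) + 0.001 * rng.standard_normal((32, 32))
                 for dy, dx in shifts]
    hr_est = super_resolve(lr_frames, shifts, hr_true.shape)
    err = np.linalg.norm(hr_est - hr_true) / np.linalg.norm(hr_true)
    print("relative reconstruction error:", round(err, 3))
```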
The second contribution concerns super-resolution of multiple modalities in the area of hybrid imaging. We introduce novel multi-sensor super-resolution techniques and investigate two complementary problem statements. For super-resolution in the presence of a guidance modality, we introduce a reconstruction algorithm that exploits guidance data for motion estimation, feature driven adaptive regularization, and outlier detection to reliably super-resolve a second modality. For super-resolution in the absence of guidance data, we generalize this approach to a reconstruction algorithm that jointly super-resolves multiple modalities. These multi-sensor methodologies boost accuracy and robustness compared to their single-sensor counterparts. The proposed techniques are widely applicable for resolution enhancement in a variety of multi-sensor vision applications including color-, multispectral- and range imaging. For instance in color imaging as a classical application, joint super-resolution of color channels improves the PSNR by 1.5 dB compared to conventional channel-wise processing.
The third contribution transfers super-resolution to workflows in healthcare. As one use case in ophthalmology, we address retinal video imaging to gain spatio-temporal measurements on the human eye background non-invasively. In order to enhance the diagnostic usability of current digital cameras, we introduce a framework to gain high-resolution retinal images from low-resolution video data by exploiting natural eye movements. This framework enhances the mean sensitivity of automatic blood vessel segmentation by 10 % when using super-resolution for image preprocessing. As a second application in image-guided surgery, we investigate hybrid range imaging. To overcome resolution limitations of current range sensor technologies, we propose multi-sensor super-resolution based on domain-specific system calibrations and employ high-resolution color images to steer range super-resolution. In ex-vivo experiments for minimally invasive and open surgery procedures using Time-of-Flight (ToF) sensors, this technique improves the reliability of surface and depth discontinuity measurements compared to raw range data by more than 24 % and 68 %, respectively.
Mathias Unberath | 29.05.2017
Signal Processing for Interventional X-ray-based Coronary Angiography
Rotational angiography using C-arm scanners enables intra-operative 3-D imaging that has proved beneficial for diagnostic assessment and interventional guidance. Despite previous efforts, rotational angiography has not yet been successfully established in clinical practice for coronary artery imaging and remains a subject of intensive academic research. 3-D reconstruction of the coronary vasculature is impeded by severe lateral truncation of the thorax, as well as substantial intra-scan respiratory and cardiac motion. Reliable and fully automated solutions to all of the aforementioned problems are required to pave the way for clinical application of rotational angiography and, hence, sustainably change the state of care. Within this thesis, we identify shortcomings of existing approaches and devise algorithms that effectively address non-recurrent object motion, severe angular undersampling, and the dependency on projection domain segmentations. The proposed methods build upon virtual digital subtraction angiography (vDSA), which avoids image truncation and enables prior-reconstruction-free respiratory motion compensation using both epipolar consistency conditions (ECC) and auto-focus measures (AFMs). The motion-corrected geometry is then used in conjunction with a novel 4-D iterative algorithm that reconstructs images at multiple cardiac phases simultaneously. The method allows for communication among 3-D volumes by regularizing the temporal total variation (tTV) and thus implicitly addresses the problem of insufficient data very effectively. Finally, we consider symbolic coronary artery reconstruction from very few observations and develop generic extensions that consist of symmetrization, outlier removal, and projection-domain-informed topology recovery. When applied to two state-of-the-art reconstruction algorithms, the proposed methods substantially reduce problems due to incorrect 2-D centerlines, promoting improved performance. Given that all methods proved effective on the same in silico and in vivo data sets, we are confident that the proposed algorithms bring rotational coronary angiography one step closer to clinical applicability.
Johannes Jordan | 15.05.2017
Interactive Analysis of Multispectral and Hyperspectral Image Data
A multispectral or hyperspectral sensor captures images of high spectral resolution by dividing the light spectrum into many narrow bands. With the advent of affordable and flexible sensors, the modality is constantly widening its range of applications. This necessitates novel tools that allow general and intuitive analysis of the image data. In this work, a software framework is presented that bundles interactive visualization techniques with powerful analysis capabilities and is accessible through efficient computation and an intuitive user interface. Towards this goal, several algorithmic solutions to open problems are presented in the fields of edge detection, clustering, supervised segmentation and visualization of hyperspectral images.
In edge detection, the structure of a scene can be extracted by finding discontinuities between image regions. The high dimensionality of hyperspectral data poses specific challenges for this task. A solution is proposed based on a data-driven pseudometric. The pseudometric is computed through a fast manifold learning technique and outperforms established metrics and similarity measures in several edge detection scenarios.
Another approach to scene understanding in the hyperspectral or a derived feature space is data clustering. Through pixel-cluster assignment, a global segmentation of an image is obtained based on reflectance effects and materials in the scene. An established mode-seeking method provides high-quality clustering results, but is slow to compute in the hyperspectral domain. Two methods of speedup are proposed that allow computations in interactive time. A further method is proposed that finds clusters in a learned topological representation of the data manifold. Experimental results demonstrate a quick and accurate clustering of the image data without any assumptions or prior knowledge, and that the proposed methods are applicable for the extraction of material prototypes and for fuzzy clustering of reflectance effects.
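The mode-seeking method referred to is mean shift; as a small illustration (without the acceleration schemes contributed by the thesis), the sketch below clusters the per-pixel spectra of a synthetic hyperspectral cube with scikit-learn's MeanShift.
```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

# Synthetic hyperspectral cube: 32x32 pixels, 30 bands, three "materials".
rng = np.random.default_rng(0)
H, W, B = 32, 32, 30
endmembers = rng.random((3, B))
labels_true = rng.integers(0, 3, size=(H, W))
cube = endmembers[labels_true] + 0.02 * rng.standard_normal((H, W, B))

# Flatten to (n_pixels, n_bands) and run mean shift in the spectral domain.
X = cube.reshape(-1, B)
bandwidth = estimate_bandwidth(X, quantile=0.2, n_samples=300, random_state=0)
ms = MeanShift(bandwidth=bandwidth).fit(X)

segmentation = ms.labels_.reshape(H, W)      # pixel-to-cluster assignment
print("number of clusters found:", len(ms.cluster_centers_))
```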
For supervised image analysis, an algorithm for seed-based segmentation is introduced to the hyperspectral domain. Specific segmentations can be quickly obtained by giving cues about regions to be included in or excluded from a segment. The proposed method builds on established similarity measures and the proposed data-driven pseudometric. A new benchmark is introduced to assess its performance.
The aforementioned analysis methods are then combined with capable visualization techniques. A method for non-linear false-color visualization is proposed that distinguishes captured spectra in the spatial layout of the image. This facilitates the finding of relationships between objects and materials in the scene. Additionally, a visualization for the spectral distribution of an image is proposed. Raw data exploration becomes more feasible through manipulation of this plot and its link to traditional displays. The combination of false-color coding, spectral distribution plots, and traditional visualization enables a new workflow in manual hyperspectral image analysis.
Christoph Forman | 28.03.2017
Iterative Reconstruction Methods to Reduce Respiratory Motion Artifacts in Cartesian Coronary MRI
Cardiovascular diseases and coronary artery disease (CAD) in particular are the leading cause of death in most developed countries worldwide. Although CAD progresses slowly over several years, it often remains unnoticed and may lead to myocardial infarction in a sudden event. For this reason, there is a strong clinical need for the development of non-invasive and radiation-free screening methods allowing for an early diagnosis of these diseases. In this context, magnetic resonance imaging (MRI) represents a promising imaging modality. However, the slow acquisition process and the consequent susceptibility to artifacts due to cardiac and respiratory motion are major challenges for coronary MRI and have so far hindered its routine application in clinical examinations. Commonly, respiratory motion is addressed during free-breathing acquisitions by gating the scan to a consistent respiratory phase in end-expiration with a navigator monitoring the patient's diaphragm. Acceptance rates below 40% lead to a prolonged total acquisition time that is also not predictable in advance. In this work, a novel variable-density spiral phyllotaxis pattern is introduced for free-breathing whole-heart coronary MRI. It provides an incoherent sub-sampling of the Cartesian phase-encoding plane and allows for highly accelerated data acquisition when combined with compressed sensing reconstruction. With this sampling pattern, sub-sampling rates up to 10.2 enable a significant reduction of the total acquisition time. Furthermore, this sampling pattern is well-prepared for respiratory self-navigation, which performs respiratory motion compensation solely relying on the acquired imaging data and promises full scan efficiency. In this context, a novel motion detection approach is proposed that provides a robust tracking of respiration. However, 1-D motion compensation based on respiratory self-navigation was found to be not always sufficient for Cartesian imaging. Hence, an alternative method is presented to reduce the effects of respiration following the concept of weighted iterative reconstruction. In this technique, inconsistencies introduced by respiratory motion during data acquisition are addressed by weighting the least squares optimization according to a data consistency measure that is obtained from respiratory self-navigation. This approach forms the basis for an extension to a motion-compensated reconstruction with a dense, non-rigid motion model. With the proposed method, motion-compensated reconstruction was enabled for the first time in 3-D whole-heart imaging without the need for either additional calibration data or manual user interaction. The techniques presented in this thesis were fully integrated in a clinical MR scanner and tested in 14 volunteers. These results were compared to a navigator-gated reference acquisition. The acquisition time of the navigator-gated reference acquisition of 10.1 ± 2.3 min was reduced by one third to 6.3 ± 0.9 min with the proposed method utilizing respiratory self-navigation. After motion-compensated reconstruction, assessment of image quality revealed no significant differences compared to the images of the reference acquisition. Vessel sharpness was measured as 0.44 ± 0.05 mm⁻¹ and 0.45 ± 0.05 mm⁻¹ for RCA, and 0.39 ± 0.04 mm⁻¹ and 0.40 ± 0.05 mm⁻¹ for LAD, respectively.
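The weighted iterative reconstruction idea can be illustrated on a 2-D, single-coil toy problem: undersampled Cartesian k-space samples are weighted by a consistency measure (here, randomly chosen "motion-corrupted" samples simply receive a lower weight) and the image is recovered by gradient descent on the weighted least-squares data term. The phantom, sampling mask and weights are assumptions of the example.
```python
import numpy as np

def fft2c(x):
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(x), norm="ortho"))

def ifft2c(k):
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(k), norm="ortho"))

def weighted_recon(kspace, mask, weights, n_iter=100, step=1.0):
    """Gradient descent on 0.5 * || sqrt(W) * (M*F(x) - y) ||^2 (single coil, 2-D)."""
    x = np.zeros(kspace.shape, dtype=complex)
    for _ in range(n_iter):
        resid = mask * (fft2c(x) - kspace)
        x = x - step * ifft2c(weights * resid)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = np.zeros((128, 128)); img[40:90, 50:80] = 1.0       # toy phantom
    full_k = fft2c(img)
    mask = rng.random((128, 128)) < 0.4                        # 40% random Cartesian samples
    kspace = mask * full_k
    # Down-weight samples assumed to be motion-corrupted (a random 20% of them)
    weights = np.where(rng.random((128, 128)) < 0.2, 0.2, 1.0) * mask
    recon = np.abs(weighted_recon(kspace, mask, weights))
    err = np.linalg.norm(recon - img) / np.linalg.norm(img)
    print("relative reconstruction error:", round(err, 3))
```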
Motion Correction and Signal Enhancement in Optical Coherence Tomography
Optical Coherence Tomography (OCT) is a non-invasive optical imaging modality with micron-scale resolution and the ability to generate 2D and 3D images of the human retina. OCT has found widespread use in ophthalmology. However, motion artifacts induced by its scanning nature limit reliable quantification of OCT images. Furthermore, OCT suffers from speckle noise and signal quality issues.
This work addresses these issues by treating the motion correction problem as a special image registration problem. Two or more 3D-OCT volumes with orthogonal scan patterns are acquired. A custom objective function is used to register the input volumes. As opposed to standard image registration, there is no reference volume, as all volumes are assumed to be distorted by motion artifacts. To improve the robustness of the correction algorithm, multi-stage and multi-resolution optimization, illumination and tilt correction, and custom similarity measures and regularization are employed. After registration, the corrected volumes are merged and a single volume with less noise is constructed by adaptively combining the registered data.
A large-scale quantitative evaluation was performed using data acquired from 73 healthy and glaucomatous eyes. Three independent orthogonal volume pairs for each location of both the optic nerve head and the macula region were acquired. The results of two motion correction algorithm profiles were compared with performing no motion correction. The evaluation measured registration performance, reproducibility performance and signal improvement using mutual information, error maps based on the difference of automatic segmentation of retinal features and a no-reference image quality assessment. In all three of these aspects, the proposed algorithm leads to major improvements, in accordance with visual inspection. For example, the mean blood vessel map reproducibility error over all data is reduced to 47% of the uncorrected error.
The algorithm has been deployed to multiple clinical sites so far. In addition, the technique has been commercialized. The main application is structural imaging for clinical practice and research. The removal of motion artifacts enables high quality en face visualization of features. The technique has also been applied to hand held OCT imaging and small animal imaging. Furthermore, applications in functional imaging in the form of intensity based angiography and Doppler OCT have been demonstrated.
Overall, the motion correction algorithm can improve both the visual appearance and the reliability of quantitative measurements derived from 3D-OCT data substantially. This promises to improve diagnosis and tracking of retinal diseases using OCT data.
Automatic Assessment of Prosody in Second Language Learning
The present thesis studies methods for automatically assessing the prosody of non-native speakers for the purpose of computer-assisted pronunciation training. We study the detection of word accent errors and the general assessment of the appropriateness of a speaker’s rhythm. We propose a flexible, generic approach that is (a) very successful on these tasks, (b) competitive with other state-of-the-art results, and at the same time (c) flexible and easily adapted to new tasks.
For word accent error detection, we derive a measure for the probability of acceptable pronunciation which is ideal for a well-grounded decision whether or not to provide error feedback to the learner. Our best system achieves a true positive rate (TPR) of 71.5 % at a false positive rate (FPR) of 5 %, which is a result very competitive to the state-of-the art, and not too far away from human performance (TPR 61.9 % at 3.2 % FPR).
For scoring general prosody, we obtain a Spearman correlation of ρ = 0.773 to the human reference scores on the C-AuDiT database (sentences read by non-native speakers); this is slightly better than the average labeller on that data (comparable quality measure for machine performance: r = 0.71 vs. 0.66 for human performance). On speaker level, performance is more stable with ρ = 0.854. On AUWL (non-native speakers practising dialogues), the task is much harder for both human and machine. Our best system achieves a correlation of ρ = 0.619 to the reference scores; here, humans are better than the system (quality measure for humans: r = 0.58 vs. 0.51 for machine performance). On speaker level, correlation rises to ρ = 0.821. On both databases, the obtained results are competitive to the state-of-the-art.
Bharath Navalpakkam | 15.12.2016
MR-Based Attenuation Correction for PET/MR Hybrid Imaging
The recent and successful integration of positron emission tomography (PET) and magnetic resonance imaging (MRI) in one device has gained wide attention. This new hybrid imaging modality makes it possible to image the functional metabolism from PET in conjunction with MRI and its excellent soft tissue contrast. Besides providing specific anatomical detail, MRI also eliminates the ionizing radiation from, e.g., computed tomography (CT) examinations that is otherwise incurred in standard PET/CT hybrid imaging systems. However, an unsolved problem is the question of how to correct for the PET attenuation in a PET/MR system. In this respect, knowledge of the spatial distribution of the linear attenuation coefficients (LAC) of the patient at the PET energy level of 511 keV is required. In standalone PET systems, transmission scans using radioactive sources were used for PET attenuation correction (AC) and, if needed, were scaled to the PET photon energy level, while in PET/CT systems the CT information was scaled to PET energies for the same purpose. In PET/MR hybrid imaging systems, however, this approach is not feasible, as MR and CT measure aspects of proton and electron densities, respectively. Therefore, alternative approaches to extract attenuation information have to be pursued. One such approach is to use MR information to estimate the distribution of attenuation coefficients within the imaged subject. This is done by using a simple limited-class segmentation procedure to delineate air, soft tissue, fat and lung classes, and a subsequent assignment of their respective attenuation coefficients at the PET energy of 511 keV. This way of generating attenuation maps (μ-maps) is, however, far from ideal, as the most attenuating medium, cortical bone, is ignored; it is instead replaced by the attenuation coefficient of soft tissue. While this approximation has been widely accepted for PET quantification in whole-body research, it has severe underestimation effects for brain studies.
In this thesis, we propose an improved MR-based μ-map generation approach. We demonstrate that dedicated MR sequences such as ultrashort echo time (UTE) sequences are useful for the purpose of attenuation correction. From a multitude of MR images, we generate μ-maps that include cortical bone and contain continuous Hounsfield units (HU) akin to a patient CT. These are then compared against segmentation-based approaches. The efficacy of continuous-valued μ-maps for PET quantification is analyzed against different μ-maps such as the patient CT, the segmented patient CT with bone, and the segmented patient CT without bone. Results indicate that the proposed MR-based μ-maps yield less than 5% error in PET quantification, lower than any segmentation-based μ-map, for brain studies.
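The segmentation-based baseline discussed above amounts to assigning each tissue class a predefined linear attenuation coefficient at 511 keV. The sketch below illustrates that baseline with a toy threshold "segmentation" and approximate literature LAC values (both purely illustrative); note that cortical bone is deliberately absent, which is exactly the limitation the thesis addresses.
```python
import numpy as np

# Approximate linear attenuation coefficients at 511 keV (1/cm); illustrative values.
LAC = {"air": 0.0, "lung": 0.018, "fat": 0.086, "soft_tissue": 0.096}

def mu_map_from_classes(class_volume):
    """Map a volume of integer tissue labels to attenuation coefficients."""
    lut = np.array([LAC["air"], LAC["lung"], LAC["fat"], LAC["soft_tissue"]])
    return lut[class_volume]

def naive_mr_segmentation(mr_volume):
    """Toy 4-class segmentation by intensity thresholds (stand-in for the real method).

    0 = air, 1 = lung, 2 = fat, 3 = soft tissue. Cortical bone is not a class here,
    which is exactly the limitation discussed above.
    """
    labels = np.zeros(mr_volume.shape, dtype=int)
    labels[mr_volume > 50] = 1
    labels[mr_volume > 150] = 2
    labels[mr_volume > 300] = 3
    return labels

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mr = rng.integers(0, 600, size=(8, 64, 64))      # fake MR intensities
    mu_map = mu_map_from_classes(naive_mr_segmentation(mr))
    print("unique attenuation values (1/cm):", np.unique(mu_map))
```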
Cell Culture Monitoring with Novel Bright Field Miniature Microscopy Prototypes
Cell cultures are monitored to develop new drugs, to find efficient ways to produce vaccines and to perform toxicity tests. The cells are cultivated in an incubator and the monitoring steps such as the acquisition of images and the counting of cells are often done outside. As part of a research project, novel bright field miniature microscopy prototypes were developed. These prototypes were designed to work inside the incubator, and hence, they need to be very small. In this thesis, image processing methods for these systems (at different development stages) are presented. These methods made the systems usable for cell monitoring in an incubator. This is a main contribution of the thesis. Our analyses of the system and its components helped to improve the development of the systems. A calibration procedure and algorithms for adjusting the illumination and the focus position of these systems are introduced. Moreover, the proposed preprocessing steps such as illumination correction and contrast enhancement improved the image quality. An image processing library and a cell monitoring software using the library were developed. An algorithm for counting cells in images of the prototype system was included in the image processing library. Features for viability determination were investigated and also included in the library. Another main contribution is related to all bright field microscopes. They have the following effect in common: Focusing of very thin (phase) objects differs from focusing of objects that are thicker and less transparent for light. This effect is investigated in detail, explained, and the calculation of different useful focus positions for phase objects is derived. The optical focus position can be used for applications such as phase retrieval. Slightly defocused cell images with a maximum in contrast at small details can be useful for applications such as cell segmentation or cell analysis. Strongly defocused cell images with a maximum in contrast for the cell borders can be used for applications such as cell detection.
|
|
|
Automatic Unstained Cell Detection in Bright Field Microscopy
Bright field microscopy is preferred over other microscopic imaging modalities whenever ease of implementation and minimization of expenditure are main concerns. This simplicity in hardware comes at the cost of image quality yielding images of low contrast. While staining can be employed to improve the contrast, it may complicate the experimental setup and cause undesired side effects on the cells. In this thesis, we tackle the problem of automatic cell detection in bright field images of unstained cells. The research was done in the context of the interdisciplinary research project COSIR. COSIR aimed at developing a novel microscopic hardware having the following feature: the device can be placed in an incubator so that cells can be cultivated and observed in a controlled environment. In order to cope with design difficulties and manufacturing costs, the bright field technique was chosen for implementing the hardware. The contributions of this work are briefly outlined in the text which follows. An automatic cell detection pipeline was developed based on supervised learning. It employs Scale Invariant Feature Transform (SIFT) keypoints, random forests, and agglomerative hierarchical clustering (AHC) in order to reliably detect cells. A keypoint classifier is first used to classify keypoints into cell and background. An intensity profile is extracted between each two nearby cell keypoints and a profile classifier is then utilized to decide whether the two keypoints belong to the same cell (inner profile) or to different cells (cross profile). This two-classifier approach has been used in the literature. The proposed method, however, compares to the state-of-the-art as follows: 1) It yields high detection accuracy (at least 14% improvement compared to baseline bright field methods) in a fully-automatic manner with short runtime on the low-contrast bright field images. 2) Standard pixel-based features from the literature are adapted to a keypoint-based extraction scheme: this scheme is sparse, scale-invariant, and orientation-invariant, and feature parameters can be tailored in a meaningful way based on the relevant keypoint scale and orientation. 3) The pipeline is highly invariant with respect to illumination artifacts, noise, scale and orientation changes. 4) The probabilistic output of the profile classifier is used as input for an AHC step which improves detection accuracy. A novel linkage method was proposed which incorporates the information of SIFT keypoints into the linkage. This method was proven to be combinatorial, and thus, it can be computed efficiently in a recursive manner. Due to the substantial difference in contrast and visual appearance between suspended and adherent cells, the above-mentioned pipeline attains higher accuracy in separate learning of suspended and adherent cells compared to joint learning. Separate learning refers to the situation when training and testing are done either only on suspended cells or only on adherent cells. On the other hand, joint learning refers to training the algorithm to detect cells in images which contain both suspended and adherent cells. Since these two types of cells coexist in cell cultures with shades of gray between the two terminal cases, it is of practical importance to improve joint learning accuracy.
We showed that this can be achieved using two types of phase-based features: 1) physical light phase obtained by solving the transport of intensity equation, 2) monogenic local phase obtained from a low-passed axial derivative image. In addition to the supervised cell detection discussed so far, a cell detection approach based on unsupervised learning was proposed. Technically speaking, supervised learning was utilized in this approach as well. However, instead of training the profile classifier using manually-labeled ground truth, a self-labeling algorithm was proposed with which ground truth labels can be automatically generated from cells and keypoints in the input image itself. The algorithm learns from extreme cases and applies the learned model on the intermediate ones. SIFT keypoints were successfully employed for unsupervised structure-of-interest measurements in cell images such as mean structure size and dominant curvature direction. Based on these measurements, it was possible to define the notion of extreme cases in a way which is independent from image resolution and cell type.
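The two-classifier idea described above can be pictured with the following Python sketch based on OpenCV SIFT keypoints and scikit-learn random forests; the feature extraction is reduced to a toy intensity-profile sampler and all parameters are illustrative, so this is a schematic outline rather than the implementation evaluated in the thesis.

import numpy as np
import cv2
from sklearn.ensemble import RandomForestClassifier

def detect_keypoints(image_u8):
    """Detect SIFT keypoints in an 8-bit grayscale bright field image."""
    sift = cv2.SIFT_create()
    return sift.detect(image_u8, None)

def sample_profile(image, p, q, n_samples=32):
    """Sample an intensity profile between two keypoint locations p=(x,y) and q=(x,y)."""
    xs = np.linspace(p[0], q[0], n_samples)
    ys = np.linspace(p[1], q[1], n_samples)
    return image[ys.round().astype(int), xs.round().astype(int)].astype(np.float32)

# Two classifiers, as in the pipeline sketched above:
#   keypoint_clf : cell keypoint vs. background keypoint
#   profile_clf  : inner profile (same cell) vs. cross profile (different cells)
keypoint_clf = RandomForestClassifier(n_estimators=100)
profile_clf = RandomForestClassifier(n_estimators=100)

# Training would use labeled keypoint descriptors and labeled profiles, e.g.
#   keypoint_clf.fit(keypoint_features, keypoint_labels)
#   profile_clf.fit(profile_features, profile_labels)
# At test time, keypoints classified as "cell" are linked pairwise, each pair is
# classified via its profile, and the resulting probabilities feed the AHC step.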
|
|
|
Motion-Corrected Reconstruction in Cone-Beam Computed Tomography for Knees under Weight-Bearing Condition
Medical imaging plays an important role in diagnosis and grading of knee conditions such as osteoarthritis. In current clinical practice, 2-D radiography is regularly applied under weight-bearing conditions, which is known to improve diagnostic accuracy. However, 2-D images cannot fully cover the complexity of a knee joint, whereas current 3-D imaging modalities are inherently limited to a supine, unloaded patient position. Recently, cone-beam computed tomography (CBCT) scanners for 3-D weight-bearing imaging have been developed. Their specialized acquisition trajectory poses several challenges for image reconstruction. Patient motion caused by standing or squatting positions can substantially deteriorate image quality, such that the motion has to be corrected during reconstruction. Initial work on motion correction is based on fiducial markers, yet, the approach prolonged image acquisition and required a large amount of manual interaction. The goal of this thesis was to further develop innovative motion correction methods for weight-bearing imaging of knees.
Within the course of this thesis, the marker-based motion correction was steadily enhanced. Manual annotation of markers has been replaced by a robust, fully automatic detection of markers and their correspondences. A clear disadvantage of markers is the often tedious attachment, which decreases patient comfort and interferes with the acquisition protocol. Also, the method is limited to rigid motion and an extension to nonrigid deformations is nontrivial. To alleviate these drawbacks, we introduce a novel motion estimation approach that makes use of a prior, motion-free reference reconstruction. The motion of femur and tibia is determined individually by rigid 2-D/3-D registration of bone segmentations from the prior scan to each of the acquired weight-bearing projection images. Reliability of the registration is greatly influenced by the large amount of overlapping structures, especially for lateral view directions. We compare two different similarity measures used for 2-D/3-D registration and also introduce a temporal smoothness regularizer to improve registration accuracy. A common evaluation of the marker- and registration-based approaches yields superior image quality using 2-D/3-D registration, particularly in the presence of severe, nonrigid motion. Further reduction of the algorithm’s runtime and an automation of bone segmentations could allow for a complete replacement of marker-based motion correction in future applications.
In case the clinical setup prohibits acquisition of a prior scan, motion correction relies solely on the acquired projection images. We derived a new motion correction method based on Fourier consistency conditions (FCC) which is independent of surrogates or prior acquisitions. So far, FCC have not been used for motion correction and were typically limited to fan-beam geometries. We first introduced the motion estimation for the fan-beam geometry, followed by a practical extension to CBCT. Numerical phantom simulations revealed a particularly accurate estimation of high-frequency motion and of motion collinear to the scanner’s rotation axis. FCC are currently limited to nontruncated, full-scan projection data, and thus, not yet applicable to real weight-bearing acquisitions. However, a dynamic apodization technique is introduced to account for axial truncation, allowing application to a squatting knee phantom with realistic motion. Given the large improvements in image quality, we are confident that FCC are a future candidate for a completely self-contained motion correction approach in CBCT weight-bearing imaging of knees.
|
|
|
Evaluation Methods for Stereopsis Performance
Stereopsis is one mechanism of visual depth perception, which gains 3D information from the displaced images of both eyes. Depth is encoded by disparity, the offset between the corresponding projections of one point in both retinas. Players in ball sports, who are adapted to highly competitive environments, can be assumed to develop improved stereopsis performance, as they are constantly required, and thus trained, to rapidly and accurately estimate the distance of the ball. However, the literature provides conflicting results on the impact of stereopsis on sports such as baseball or soccer. The standard method to quantify stereopsis is to evaluate near static stereo acuity only, which denotes a subject’s minimum perceivable disparity from a near distance with stationary visual targets. These standard methods fail to reveal potential contributions of further components of stereopsis such as recognition speed, distance stereo acuity, and dynamic stereopsis, which the literature identifies as important for describing the performance of stereopsis in sports. Therefore, this thesis contributes to the literature by introducing the Stereo Vision Performance (StereoViPer) test, which combines distance stereo acuity and response time analyses for static and dynamic stereopsis by using 3D stereo displays. The first purpose was to provide a proof of concept for the static test. Experiments analyzed the response time measurements, compared the test with traditional methods and evaluated the ability of the test to discriminate between clear and known differences in stereopsis performance, i.e. normal and defective stereopsis. The second purpose was to provide investigations of stereopsis in highly competitive ball sports. Therefore, the test was extended by a dynamic part and a gesture-driven input interface to support the connection between visual perception and motor reaction. The method was used to evaluate stereopsis in soccer by comparing professional, amateur, and inexperienced subjects. This thesis contributes to the evaluation of stereopsis in soccer by speed measurements and dynamic stimuli. The third purpose was to evaluate the influence of the used 3D stereo displays on the conducted stereopsis measurements. As 3D displays provide unnatural viewing conditions, a zone of comfortable viewing has been introduced in the literature to avoid discomfort during the consumption of simulated 3D content. This thesis contributes by investigating whether the zone is sufficient to obtain natural stereopsis performance results and which further limitations due to artificial 3D content might apply. As the method could successfully discriminate between normal and defective stereopsis and produced results that were in agreement with the literature, the proof of concept could be shown. However, soccer players did not show superior stereopsis performance compared to inexperienced subjects, although they demonstrated significantly (p <= 0.01) lower monocular choice reaction times. The zone of comfortable viewing did not preserve natural stereopsis performance. Therefore, disparities need to be selected as low as possible for stereopsis performance measurements. In conclusion, the StereoViPer test produced results that are in agreement with the literature and extended the evaluation of stereopsis by static and dynamic stereo acuity measurements in combination with response time analyses. The test provides a finer discrimination of stereopsis performance than traditional methods.
This thesis contributed to the investigation of stereopsis in competitive sports by introducing an extensive testing battery, which meets the requirements suggested in the literature.
|
|
|
Hybrid RGB/Time-of-Flight Sensors in Minimally Invasive Surgery
Nowadays, minimally invasive surgery is an essential part of medical interventions. In a typical clinical workflow, procedures are planned preoperatively with 3-dimensional (3-D) computed tomography (CT) data and guided intraoperatively by 2-dimensional (2-D) video data. However, accurate preoperative data acquired for diagnosis and operation planning often fails to deliver valid information for orientation and decisions within the intervention due to issues like organ movements and deformations. Therefore, innovative interventional tools are required to aid the surgeon and improve safety and speed for minimally invasive procedures. Augmenting 2-D color information with 3-D range data allows to use an additional dimension for developing novel surgical assistance systems. Here, Time-of-Flight (ToF) is a promising low-cost and real-time capable technique that exploits reflected near-infrared light to estimate the radial distances of points in a dense manner. This thesis covers the entire implementation pipeline of this new technology into a clinical setup, starting from calibration to data preprocessing up to medical applications.
The first part of this work covers a novel automatic calibration scheme for hybrid data acquisition based on barcodes as recognizable feature points. The common checkerboard pattern is overlaid by a marker that includes unique 2-D barcodes. The prior knowledge about the barcode locations allows to detect only valid feature points for the calibration process. Based on detected feature points seen from different points of view, a sensor data fusion for the complementary modalities is estimated. The proposed framework achieved subpixel reprojection errors and barcode identification rates above 90% for both the ToF and the RGB sensor.
As range data of low-cost ToF sensors is typically error-prone due to different issues, e.g. specular reflections and low signal-to-noise ratio (SNR), preprocessing is a mandatory step after acquiring photometric and geometric information in a common setup. This work proposes the novel concept of hybrid preprocessing to exploit the benefits of one sensor to compensate for weaknesses of the other sensor. Here, we extended established preprocessing concepts to handle hybrid image data. In particular, we propose a nonlocal means filter that takes an entire sequence of hybrid image data into account to improve the mean absolute error of range data by 20%. A different concept estimates a high-resolution range image by means of super-resolution techniques that take advantage of geometric displacements by the optical system. This technique improved the mean absolute error by only 12% but improved the spatial resolution simultaneously. In order to tackle the issue of specular highlights that cause invalid range data, we propose a multi-view scheme for highlight correction. We replace invalid range data at a specific viewpoint with valid data of another viewpoint. This reduced the mean absolute error by 33% compared to a basic interpolation.
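The multi-view highlight correction can be illustrated schematically: invalid range values in one view are replaced by valid values from a second, already registered view, with a crude interpolation as fallback. The Python sketch below assumes the registration and the validity masks are given and is not the exact scheme developed in the thesis.

import numpy as np

def correct_highlights(range_a, valid_a, range_b_registered, valid_b):
    """Replace invalid range values in view A with valid values from a registered view B.
    range_* are 2-D float arrays, valid_* are boolean masks of the same shape."""
    corrected = range_a.copy()
    fill = (~valid_a) & valid_b
    corrected[fill] = range_b_registered[fill]
    still_invalid = (~valid_a) & (~valid_b)
    if np.any(still_invalid):
        # Fallback: fill remaining holes with the mean of valid pixels (crude interpolation).
        corrected[still_invalid] = corrected[valid_a | valid_b].mean()
    return corrected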
Finally, this thesis introduces three novel medical applications that benefit from hybrid 3-D data. First, a collision avoidance module is introduced that exploits range data to ensure a safety margin for endoscopes within a narrow operation site. Second, an endoscopic tool localization framework is described that exploits hybrid range data to improve tool localization and segmentation. Third, a data fusion framework is proposed to extend the narrow field of view and reconstruct the entire situs.
This work shows that hybrid image data of ToF and RGB sensors allows to improve image-based assistance systems with more reliable and intuitive data for better guidance within a minimally invasive intervention.
|
|
|
Region-of-Interest Imaging with C-arm Computed Tomography
C-arm based flat-detector computed tomography (FDCT) is a promising approach for neurovascular diagnosis and intervention since it facilitates proper analysis of surgical implants and intra-procedural guidance. In the majority of endovascular treatments, intra-procedural updates of the imaged object are often restricted to a small diagnostic region of interest (ROI). Such a targeted ROI is often the region of intervention that contains device- or vessel-specific information such as stent expansion or arterial wall apposition. Following the principle of as low as reasonably achievable (ALARA), it is highly desirable to reduce unnecessary peripheral doses outside an ROI by using physical X-ray collimation, leading to a substantial reduction of patient dose. However, such a technique gives rise to severely truncated projections from which conventional reconstruction algorithms generally yield images with strong truncation artifacts.
The primary research goal of this thesis, therefore, lies in the algorithmic development of various truncation artifact reduction techniques that are dedicated to different imaging scenarios. First, a new data completion method is proposed that utilizes sinogram consistency conditions to estimate the missing sinogram. Although it is only extended to a 2D fan-beam geometry, preliminary results suggest the method is promising regarding truncation artifact reduction and attenuation coefficient recovery. Thereafter, three algorithms are presented, which either follow the analytic filtered backprojection (FBP) framework or are iterative by construction. They are capable of generating a 3D image from transaxially truncated data and thus appear to be closer to clinical applications. The first approach is the refinement of an existing truncation-robust algorithm – ATRACT, which is implicitly effective with respect to severely truncated data. In this thesis, ATRACT is modified into more practically useful reconstruction methods by expressing its expensive non-local filter as an efficient 1D/2D analytic convolution. The second approach is targeted to particular imaging applications that require an ROI with high image quality for diagnosis, and also a surrounding region with relatively low resolution for orientation. To accomplish this task, an interleaved acquisition strategy that acquires both a sparse set of global non-truncated data and a dense set of truncated data is presented, along with three associated algorithms. The third approach is an attempt to exploit low-dose patient-specific prior knowledge for the extrapolation of truncated projections. The comparative evaluation clearly depicts the algorithmic performance of all investigated 3D methods under a uniform evaluation framework. In general, ATRACT appears to be more robust than the explicit water cylinder extrapolation in the case of severe truncation. Contrary to the heuristic methods, the techniques that come with either a sparse set of global data or prior knowledge achieve the ROI reconstructions in a more accurate and robust manner. The decision on which method should be selected relies on multiple factors, but the presented results can serve as a first indicator to ease such a selection.
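To illustrate why truncated projections must be completed before filtered backprojection, the following Python sketch pads each detector row with a smoothly decaying extension; this is a generic heuristic for illustration only, not the ATRACT filter or the water cylinder extrapolation discussed above.

import numpy as np

def extrapolate_row(row, pad):
    """Pad a truncated detector row with a smoothly decaying extension on both sides.
    Generic heuristic for illustration, not a method proposed in the thesis."""
    taper = 0.5 * (1.0 + np.cos(np.linspace(0.0, np.pi, pad)))  # 1 -> 0
    left = row[0] * taper[::-1]    # ramps up from ~0 to the left boundary value
    right = row[-1] * taper        # decays from the right boundary value to ~0
    return np.concatenate([left, row, right])

# Usage: extrapolate every row of a truncated projection before ramp filtering.
projection = np.ones((8, 16), dtype=np.float32)
padded = np.stack([extrapolate_row(r, pad=8) for r in projection])
print(padded.shape)  # (8, 32)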
|
|
|
Design Considerations and Application Examples for Embedded Classification Systems
Wearable athlete support systems are a popular technology for performance enhancement in sports. The complex signal and data analysis on these systems is often tackled with pattern recognition techniques like classification. The implementation of classification algorithms on mobile hardware is called embedded classification. Thereby, technical challenges arise from the restricted computational power, battery capacity and size of such mobile systems.
The accuracy-cost tradeoff describes the two conflicting design goals of embedded classification systems; the accuracy and the computational cost of a classifier. A data analysis system should be as accurate as possible and, at the same time, as cheap as required for an implementation on a mobile system with restricted resources. Thus, accuracy and cost have to be simultaneously considered in the design phase of the algorithms. Furthermore, an accurate modeling of classification systems, a precise estimation of the expected accuracy and energy efficient algorithms are needed. The first main goal of this thesis was to develop design considerations to support the solution of the accuracy-cost tradeoff that occurs during embedded classification system design.
The success of wearable technology in sports originates in several application opportunities. A wearable system can collect data in the field without the restrictions of a lab environment or capture volume. Data with realistic variation and a high number of trials can be used to analyze the true field performance as well as its long-term progress over time. Real-time data processing allows to support athletes and coaches with augmented feedback in the field. Therefore, wearable systems enable new opportunities for the analysis of athletic performance. The second main goal of this thesis was to illustrate the benefits and opportunities of wearable athlete support systems with three application examples.
This thesis presents applications for plyometric training, golf putting and swimming. The applications were realized as body sensor networks (BSNs) and analyzed kinematic data that were acquired with inertial measurement units (IMUs).
The plyometric training application targeted the ground contact time calculation in drop jump exercises. The presented algorithm used a hidden Markov model to calculate the ground contact time with high accuracy. The ground contact time is an important training parameter for assessing athlete performance and cannot be determined visually.
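A hidden Markov model segmentation of this kind can be sketched as follows in Python, here with the hmmlearn package and the acceleration magnitude as the only feature; both choices are assumptions for illustration and not the exact model of the thesis.

import numpy as np
from hmmlearn.hmm import GaussianHMM

def ground_contact_time(accel_norm, fs):
    """Estimate ground contact time from the acceleration magnitude of a drop jump.
    A two-state Gaussian HMM segments the signal into contact-like and flight-like
    states; the contact time is the duration of the longest run of the high-energy
    state. Illustrative sketch only, not the model from the thesis."""
    X = accel_norm.reshape(-1, 1)
    hmm = GaussianHMM(n_components=2, covariance_type="diag", n_iter=50, random_state=0)
    hmm.fit(X)
    states = hmm.predict(X)
    contact_state = int(np.argmax(hmm.means_.ravel()))  # assume contact = higher acceleration
    longest, current = 0, 0
    for s in states:
        current = current + 1 if s == contact_state else 0
        longest = max(longest, current)
    return longest / fs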
The golf putt application realized a system for technique training in the field. It featured an algorithm for automatic putt detection and parameter extraction as a basis for augmented feedback training applications. Long-term kinematic golf training data provided new insights into the progress of golf putting.
The swimming application realized an unobtrusive swimming exercise tracker. The system was able to classify the four swimming styles, turns and breaks based on the head kinematics. Furthermore, an algorithm to classify long-term fatigue that occurs in a 90 min swimming exercise was presented. The swimming style, turn and break classification was implemented as embedded classification system on a BSN sensor node. This application underlines the applicability of the presented design considerations for solving the accuracy-cost tradeoff.
The contributions of this thesis are considerations for the design of embedded classification systems and application examples that illustrate the benefits of wearable athlete support systems. The focus was to provide practical support for the solution of the accuracy-cost tradeoff and to present the technical realization of wearable athlete support systems. The findings of this thesis can be beneficial for research as well as industrial applications in the sport, health and medical domain.
|
|
Juan Rafael Orozco-Arroyave
|
27.11.2015
|
|
Analysis of speech of people with Parkinson's disease
The analysis of speech of people with Parkinson's disease is an interesting and highly relevant topic that has been addressed by the research community for several years.
There are important contributions on this topic based on perceptual and/or semi-automatic analyses. Those contributions mainly rely on detailed observations performed by clinicians; however, the automatic analysis of signals, enabled by the rapid development of technological and mathematical tools, has motivated the research community to work on computational tools for the automatic analysis of speech.
There are also several contributions on this topic; however, most of them are focused on sustained phonation of vowels and only consider recordings of one language. This thesis addresses two problems considering recordings of sustained phonations of vowels and continuous speech signals: (1) the automatic classification of Parkinson's patients vs healthy speakers, and (2) the prediction of the neurological state of the patients according to the motor section of the Unified Parkinson's Disease Rating Scale (UPDRS).
The classification experiments are performed with recordings of three languages: Spanish, German, and Czech. German and Czech data were provided by other researchers, and Spanish data were recorded in Medellín, Colombia, during the development of this work. The analyses performed upon the recordings are divided into three speech dimensions: phonation, articulation, and prosody.
Several classical approaches to assess the three speech dimensions are tested, and additionally a new method to model articulation deficits of Parkinson's patients is proposed. This new articulation modeling approach proves to be more accurate and robust than others at discriminating between Parkinson's patients and healthy speakers in the three considered languages. Additionally, articulation and phonation seem to be the most suitable speech dimensions to predict the neurological state of the patients.
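The classification experiments can be pictured with a standard scikit-learn setup, where each recording is represented by a feature vector of phonation, articulation and prosody measures; the placeholder data, classifier choice and parameters below are illustrative only and not the configuration of the thesis.

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# X: one row per recording with phonation/articulation/prosody features (placeholder data),
# y: 1 = Parkinson's patient, 0 = healthy control (placeholder labels).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
y = rng.integers(0, 2, size=60)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
scores = cross_val_score(clf, X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f}")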
|
|
|
Reconstruction Techniques for Dynamic Radial MRI
Today, magnetic resonance imaging (MRI) is an essential clinical imaging modality and routinely used for orthopedic, neurological, cardiovascular, and oncological diagnosis. The relatively long scan times lead to two limitations in oncological MRI. Firstly, in dynamic contrast-enhanced MRI (DCE-MRI), spatial and temporal resolution have to be traded off against each other. Secondly, conventional acquisition techniques are highly susceptible to motion artifacts. As an example, in DCE-MRI of the liver, the imaging volume spans the whole abdomen and the scan must take place within a breath-hold to avoid respiratory motion. Dynamic imaging is achieved by performing multiple breath-hold scans before and after the injection of contrast agent. In practice, this requires patient cooperation, exact timing of the contrast agent injection, and limits the temporal resolution to about 10 seconds. This thesis addresses both challenges by combining a radial k-space sampling technique with advanced reconstruction algorithms for higher temporal resolution and improved respiratory motion management.
A novel reconstruction technique, golden-angle radial sparse parallel MRI (GRASP), enables performing DCE-MRI at simultaneously high spatial and temporal resolution. Iterative gradient-based and alternating optimization techniques were implemented and evaluated. GRASP is based on a single, continuous scan during free breathing, allowing for a simplified and more patient-friendly examination workflow. The technique is augmented by an automatic detection of the contrast agent bolus arrival and by incorporating variable temporal resolution. These proposed extensions reduce the number of generated image volumes, resulting in faster reconstruction and post-processing.
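In the notation commonly used in the literature, GRASP solves a compressed-sensing problem of the following general form, with F the non-uniform Fourier transform along the golden-angle radial trajectory, C the coil sensitivities, y the sorted multi-coil k-space data and T_t a temporal finite-difference (total variation) transform; the exact functional used in the thesis may differ in details:

\hat{x} = \arg\min_{x} \; \| F C x - y \|_2^2 + \lambda \, \| T_t x \|_1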
The radial trajectory also allows to extract a respiratory signal directly from the scan data. This self-gating property can be used for dynamic imaging in such a way that different, time-averaged phases of respiration are retrospectively reconstructed from a free-breathing scan. Automated algorithms for deriving, processing, and applying the self-gating signal were developed. The clinical relevance of self-gating was demonstrated by generating a motion model to correct for respiratory motion in a simultaneous positron emission tomography (PET) examination on hybrid PET/MRI scanners. This approach reduces the motion blur and, thus, improves tracer uptake quantification in moving lesions, while avoiding the increased noise level that would occur with conventional gating techniques.
In conclusion, the presented advanced reconstruction techniques help to improve the spatio-temporal resolution as well as the robustness with respect to motion of dynamic radial MRI. The effectiveness of the proposed methods was supported by numerous studies in patient settings, showing that non-Cartesian k-space sampling can be advantageous in a variety of applications.
|
|
Michal Cachovan
|
21.05.2015
|
|
Motion Corrected Quantitative Imaging in Multimodal Emission Tomography
Nuclear medicine has been using single photon emission computed tomography (SPECT) for several decades in order to diagnose and enable the treatment of patients for various clinical applications. Traditionally, routine SPECT has been used for diagnosis based on qualitative image interpretation, which was not backed up with data on the quantitative assessment of the encountered disease. However, recent research and development have introduced the novel and as yet unexplored feature of quantitative measurement into clinical practice. With the introduction of new quantitative reconstruction techniques, many technological questions have to be answered. This thesis presents novel methods for enhancing quantitative iterative SPECT reconstruction by means of runtime improvements and motion correction. These methods are evaluated with clinical practice and protocols in mind.
Quantitative iterative reconstruction in SPECT is a computationally intensive process with clinical runtime burden. It becomes even more demanding when more precise system modeling is performed in order to improve quantitative accuracy of the reconstructed data. The latest graphics computational hardware of the year 2012 was successfully employed in this dissertation and a novel approach to tomographic reconstruction was proposed. The introduced method uses dedicated graphics hardware. It outperforms any currently known competing implementation by a factor of eight and can therefore reconstruct high resolution quantitative images in clinically acceptable time. Patient motion during SPECT acquisition is a significant degrading factor in quantitative lesion detection, evaluation and therapy planning, yet few clinical products are available in the field that can correct for this effect. A contribution to the field of multimodal motion corrected emission tomography is proposed in this work, which is integrated within the reconstruction. This concept has the potential to improve lesion detectability by correcting motion induced defects and to reduce quantitative errors in the clinical environment. An implementation of the method on graphics cards is introduced, which achieves clinically acceptable computation times. A quantitative evaluation of bone SPECT is introduced in this thesis and findings are reported that are new in the field of musculoskeletal clinical research. Correlations between bone turnover and bone density, as well as the dependence of these biological variables on patients’ age, are reported. These results can potentially contribute to a better understanding of bone repair mechanisms in human anatomy and physiology.
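For orientation, quantitative iterative SPECT reconstruction is typically built on the MLEM/OSEM update, given here in textbook notation with a_{ij} the system matrix element linking voxel j to projection bin i and y_i the measured counts (this is the standard formula, not one taken from the thesis):

x_j^{(k+1)} = \frac{x_j^{(k)}}{\sum_i a_{ij}} \sum_i a_{ij} \, \frac{y_i}{\sum_l a_{il} \, x_l^{(k)}}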
|
|
Michael Manhart
|
24.10.2014
|
|
Dynamic Interventional Perfusion Imaging: Reconstruction Algorithms and Clinical Evaluation
Acute ischaemic stroke is a major cause of death and disability with increasing prevalence in aging societies. Novel interventional stroke treatment procedures have the potential to improve the clinical outcome of certain stroke-affected patients. In any case, prompt diagnosis and treatment are required. Brain perfusion imaging with computed tomography (CT) or magnetic resonance imaging (MRI) is a routine method for stroke diagnosis. However, in the interventional room usually only CT imaging with flat detector C-arm systems is available, which do not yet support dynamic perfusion imaging. Enabling flat detector CT perfusion (FD-CTP) in clinical practice could support optimized stroke management: stroke diagnosis directly in the interventional room could save precious time until the start of treatment.
Recently, first promising clinical results for FD-CTP imaging under laboratory conditions have been presented. Based on this work, this dissertation introduces and evaluates novel technical contributions for noise reduction, artifact reduction and dynamic reconstruction in FD-CTP. Furthermore, the feasibility of FD-CTP imaging in clinical practice is demonstrated for the first time using data acquired during interventional stroke treatments.
CT perfusion imaging requires measurement of dynamic contrast agent attenuation over time. The contrast agent signal in the brain tissue is very low and noise is a major problem. Thus a novel computationally fast noise reduction technique for perfusion data is introduced.
Currently available C-arm systems have a comparably low rotation speed, which makes it challenging to reconstruct the dynamic change of contrast agent concentration over time. Therefore, a dynamic iterative reconstruction algorithm is proposed to utilize the high temporal resolution in the projection data for improved reconstruction of the contrast agent dynamics.
Novel robotic C-arm systems (Artis zeego, Siemens Healthcare, Germany) provide a high speed rotation protocol (HSP) to improve the temporal acquisition of the contrast agent dynamics. However, the HSP suffers from angular under-sampling, which can lead to severe streak artifacts in the reconstructed perfusion maps. Thus a novel, computationally fast noise and streak artifact reduction approach for FD-CTP data is proposed. The feasibility of FD-CTP using the HSP is demonstrated with clinical data acquired during interventional treatment of two stroke cases.
Furthermore, the design of a digital brain perfusion phantom for the thorough numerical evaluation of the proposed techniques is discussed. The quality of the perfusion maps acquired and reconstructed using the introduced novel approaches suggests that FD-CTP could be clinically available in the near future.
|
|
|
High Performance Iterative X-Ray CT with Application in 3-D Mammography and Interventional C-arm Imaging Systems
Medical image reconstruction is a key component of a broad range of medical imaging technologies. For classical computed tomography systems, the number of measured signals per second has increased exponentially over the last four decades, whereas the computational complexity of the majority of utilized algorithms has not changed significantly.
A major interest and challenge is to provide optimal image quality at the lowest patient dose possible. One solution, and an active research field towards solving that problem, are iterative reconstruction methods. Their complexity is a multiple of that of the classical analytical methods which were used in nearly all commercially available systems. In this thesis, the application of graphics cards in the field of iterative medical image reconstruction is investigated. The major contributions are the demonstrated fast implementations for off-the-shelf hardware as well as the motivation of graphics card usage in upcoming generations of medical systems. The first realization describes the implementation of a commonly used analytical cone-beam reconstruction method for C-arm CT, before covering iterative reconstruction methods. Both analytical and iterative reconstruction methods share the compute-intensive back-projection step. In addition, iterative reconstruction methods require a forward-projection step with similarly high computational cost. The introduced Compute Unified Device Architecture (CUDA) builds the basis for the presented GPU implementation of both steps. Different realization schemes are presented by combining both steps and applying minor modifications. The implementations of SART, SIRT as well as OS-SIRT illustrate the realization of algebraic reconstruction methods. Further, a realization for the more advanced statistical reconstruction methods is described, introducing a GPU-accelerated implementation of a maximum likelihood reconstruction using a concave objective function.
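The SIRT-type updates realized by these forward- and back-projection steps can be summarized in the usual matrix form, with A the system matrix, b the measured projections, R and C diagonal matrices of inverse row and column sums, and λ a relaxation factor (textbook notation, not a formula quoted from the thesis):

x^{(k+1)} = x^{(k)} + \lambda \, C A^{T} R \left( b - A x^{(k)} \right)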
The achieved reconstruction performance is based on different detailed optimizations and the exploitation of various technical features. In addition, the performance results are evaluated for different hardware platforms – like the CPU – and for the proposed algorithms. The results indicate that for all presented reconstruction methods a significant speedup compared to a CPU realization is achieved. For example, we achieve at least a speedup factor of 10 for the presented OS-SIRT, comparing an NVIDIA QuadroFX 5600 graphics card with a workstation equipped with two Intel Xeon Quad-Core E5410 processors. This is additionally supported by the comparison of the presented implementations to the CUDA alternative OpenCL, underpinning the performance lead of GPUs using CUDA.
A further contribution of this thesis is the exemplary clinical application of the proposed algorithms to two different modalities: C-arm CT and 3-D mammography. These applications demonstrate the potential and importance of GPU-accelerated iterative medical image reconstruction. This thesis is concluded with a summary and an outlook on the future of GPU-accelerated medical image processing.
|
|
Sebastian Bauer
|
24.09.2014
|
|
Rigid and Non-Rigid Surface Registration for Range Imaging Applications in Medicine
The introduction of low-cost range imaging technologies that are capable of acquiring the three-dimensional geometry of an observed scene in an accurate, dense, and dynamic manner holds great potential for manifold applications in health care. Over the past few years, the use of range imaging modalities has been proposed for guidance in computer-assisted procedures, monitoring of interventional workspaces for safe robot-human interaction and workflow analysis, touch-less user interaction in sterile environments, and for application in early diagnosis and elderly care, among others. This thesis is concerned with the application of range imaging technologies in computer-assisted and image-guided interventions, where the geometric alignment of range imaging data to a given reference shape – either also acquired with range imaging technology or extracted from tomographic planning data – poses a fundamental challenge. In particular, we propose methods for both rigid and non-rigid surface registration that are tailored to cope with the specific properties of range imaging data.
In the first part of this work, we focus on rigid surface registration problems. We introduce a point-based alignment approach based on matching customized local surface features and estimating a global transformation from the set of detected correspondences. The approach is capable of handling gross initial misalignments and the multi-modal case of aligning range imaging data to tomographic shape data. We investigate its application in image-guided open hepatic surgery and automatic patient setup in fractionated radiation therapy. For the rigid registration of surface data that exhibit only slight misalignments, such as with on-the-fly scene reconstruction using a hand-guided moving range imaging camera, we extend the classical iterative closest point algorithm to incorporate both geometric and photometric information. In particular, we investigate the use of acceleration structures for efficient nearest neighbor search to achieve real-time performance, and quantify the benefit of incorporating photometric information in endoscopic applications with a comprehensive simulation study.
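A single ICP iteration with a k-d tree for correspondence search and an SVD-based rigid update can be sketched as follows in Python; appending colour values to the coordinates with a weight is just one simple way to include photometric information and not necessarily the variant developed in this thesis.

import numpy as np
from scipy.spatial import cKDTree

def rigid_fit(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst (Kabsch/SVD)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    return R, mu_d - R @ mu_s

def icp_step(src_xyz, dst_xyz, src_rgb=None, dst_rgb=None, color_weight=0.1):
    """One ICP iteration; colours (if given) are appended to the coordinates for matching."""
    if src_rgb is not None and dst_rgb is not None:
        tree = cKDTree(np.hstack([dst_xyz, color_weight * dst_rgb]))
        _, idx = tree.query(np.hstack([src_xyz, color_weight * src_rgb]))
    else:
        tree = cKDTree(dst_xyz)
        _, idx = tree.query(src_xyz)
    R, t = rigid_fit(src_xyz, dst_xyz[idx])
    return src_xyz @ R.T + t, R, t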
The emphasis of the second part of this work is on variational methods for non-rigid surface registration. Here, we target respiratory motion management in radiation therapy. The proposed methods estimate dense surface motion fields that describe the elastic deformation of the patient’s body. Such a motion field can serve as a high-dimensional respiration surrogate that reflects the complexity of human respiration substantially better than conventionally used low-dimensional surrogates. We propose three methods for different range imaging sensors and thereby account for the particular strengths and limitations of the individual modalities. For dense but noisy range imaging data, we propose a framework that solves the intertwined tasks of range image denoising and its registration with an accurate planning shape in a joint manner. For accurate but sparse range imaging data, we introduce a method that aligns sparse measurements with a dense reference shape while simultaneously reconstructing a dense displacement field describing the non-rigid deformation of the body surface. For range imaging sensors that additionally capture photometric information, we investigate the estimation of surface motion fields driven by this complementary source of information.
|
|
Davide Piccini
|
23.06.2014
|
|
Respiratory Self-Navigation for Whole-Heart Coronary Magnetic Resonance Imaging
As the average life span of the world population increases, cardiovascular diseases firmly establish themselves as the most frequent cause of death in many of the developed countries. Coronary artery disease (CAD) is responsible for more than half of these cases and there is, hence, a strong need for a non-invasive and radiation-free test that could be reliably adopted for its assessment in clinical routine. Although coronary magnetic resonance imaging (MRI) has always been regarded with high expectations, it is still not considered for clinical assessment of CAD. This is mainly due to several limitations of current coronary MRI examinations. The complex anatomy of the coronary arteries requires extensive scout-scanning to precisely plan the actual data acquisition. The current speed limitations of the MRI scanners and the contribution of cardiac and respiratory motion do not allow the high resolution acquisitions to be performed within the fraction of a single heartbeat. Consequently, data acquisition must be split into multiple heartbeats and usually performed during free-breathing. At the same time, gating with respect to a consistent respiratory position is applied using an interleaved navigated scan which monitors the position of the subject's diaphragm. Major improvements in standard navigator-gated free-breathing coronary MRI have been achieved in recent years, but a number of important intrinsic limitations, such as the prolonged and unknown acquisition times, the non-linearity of the motion compensation, and the complexity of the examination setup have so far hindered the clinical usage of this technique. In contrast, a technique known as self-navigation, which performs motion detection and correction solely based on imaging data of the heart, promises a priori knowledge of the duration of the acquisition with improved accuracy of the motion compensation and requires minimal expertise for the planning of the examination. In this work, a novel acquisition and motion correction strategy for free-breathing self-navigated whole-heart coronary MRA was introduced, analyzed and implemented to be entirely integrated into a clinical MR scanner. The proposed acquisition method consists of a novel interleaved 3D radial trajectory, mathematically constructed on the basis of a spiral phyllotaxis pattern, which intrinsically minimizes the eddy current artifacts of the balanced steady-state free precession acquisition, while ensuring a complete and uniform coverage of k-space. The self-navigated respiratory motion detection is performed on imaging readouts oriented along the superior-inferior axis and is based on a method for the isolation and automatic segmentation of the bright signal of the blood pool. Motion detection of the segmented blood pool is then performed using a cross-correlation technique. This fully automated respiratory self-navigated method offers an easy and robust solution for coronary MR imaging that can also be integrated into a regular clinical routine examination. The technique was tested in volunteers, compared to the standard navigator-gating approach, and, for the first time to the author's knowledge, allowed self-navigation to be successfully applied to a large patient study in an advanced clinical setting.
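A generic spiral phyllotaxis point set on the unit hemisphere, which defines the endpoints of 3D radial readouts, can be generated with the golden angle as sketched below in Python; the interleaving and reordering that are essential to the proposed trajectory are omitted, so this is only a schematic illustration.

import numpy as np

def phyllotaxis_endpoints(n_readouts):
    """Schematic spiral phyllotaxis sampling of readout endpoints on the upper hemisphere."""
    golden_angle = np.pi * (3.0 - np.sqrt(5.0))      # ~137.5 degrees in radians
    n = np.arange(n_readouts)
    azimuth = n * golden_angle
    z = 1.0 - n / float(n_readouts)                  # from the pole (z = 1) towards the equator
    r = np.sqrt(1.0 - z ** 2)
    return np.stack([r * np.cos(azimuth), r * np.sin(azimuth), z], axis=1)

points = phyllotaxis_endpoints(1000)  # unit vectors defining the 3-D radial readouts
print(points.shape)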
|
|
|
Boosting Methods for Automatic Segmentation of Focal Liver Lesions
Over the past decades, huge progress has been made in treatment of cancer, decreasing fatality rates despite a growing number of cases. Technical achievements had a big share in this development. With modern image acquisition techniques, most types of tumors can be made visible. Automatic processing of these images to support diagnosis and therapy, on the other hand, is still very basic. Marking lesions for volume measurements, intervention planning or tracking over time requires a lot of manual interaction, which is both tedious and error prone. The work at hand therefore aims at providing tools for the automatic segmentation of liver lesions. A system is presented that receives a contrast enhanced CT image of the liver as input and, after several preprocessing steps, decides for each image voxel inside the liver whether it belongs to a tumor or not. That way, tumors are not only detected in the image but also precisely delineated in three dimensions. For the decision step, which is the main target of this thesis, we adopted the recently proposed Probabilistic Boosting Tree. In an offline learning phase, this classifier is trained using a number of example images. After training, it can process new and previously unseen images. Such automatic segmentation systems are particularly valuable when it comes to monitoring tumors of a patient over a longer period of time. Therefore, we propose a method for learning a prior model to improve segmentation accuracy for such follow-up examinations. It is learned from a number of series of CT images, where each series contains images of one patient. Two different ways of incorporating the model into the segmentation system are investigated. When acquiring an image of a patient, the system can use the model to calculate a patient specific lesion prior from images of the same patient acquired earlier and thus guide the segmentation in the current image. The validity of this approach is shown in a set of experiments on clinical images. When comparing the points of 90% sensitivity in these experiments, incorporating the prior improved the precision of the segmentation from 82.7% to 91.9%. This corresponds to a reduction of the number of false positive voxels per true positive voxel by 57.8%. Finally, we address the issue of long processing times of classification based segmentation systems. During training, the Probabilistic Boosting Tree builds up a hierarchy of AdaBoost classifiers. In order to speed up classification during application phase, we modify this hierarchy so that simpler and thus faster AdaBoost classifiers are used in higher levels. To this end, we introduce a cost term into AdaBoost training that trades off discriminative power and computational complexity during feature selection. That way the optimization process can be guided to build less complex classifiers for higher levels of the tree and more complex and thus stronger ones for deeper levels. Results of an experimental evaluation on clinical images are presented, which show that this mechanism can reduce the overall cost during application phase by up to 76% without degrading classification accuracy. It is also shown that this mechanism could be used to optimize arbitrary secondary conditions during AdaBoost training.
|
|
|
Accelerated Non-contrast-enhanced Morphological and Functional Magnetic Resonance Angiography
Cardiovascular diseases such as stroke, stenosis, peripheral or renal artery disease require accurate angiographic visualization techniques both for diagnosis and treatment planning. Besides morphological imaging, the in-vivo acquisition of blood flow information has gained increasing clinical importance in recent years. Non-contrast-enhanced Magnetic Resonance Angiography (nceMRA) provides techniques for both fields. For morphological imaging, Time of Flight (TOF) and magnetization-prepared balanced Steady State Free Precession (mp-bSSFP) offer non-invasive, ionizing-radiation free and user independent alternatives to clinically established methods such as Digital Subtraction Angiography, Computed Tomography or Ultrasound. In the field of functional imaging, unique novel possibilities are given with three-directional velocity fields, acquired simultaneously to the morphological information using Phase Contrast Imaging (PCI). However, the wider clinical use of nceMRA is still hampered by long acquisition times. Thus, accelerating nceMRA is a problem of high relevance and with great potential clinical impact. In this thesis, acceleration strategies based on k-space sampling below the Nyquist limit and adapted reconstruction techniques, combining parallel MRI (pMRI) methods with Compressed Sensing (CS), are developed for both types of nceMRA methods. This includes contributions to all relevant parts of the reconstruction algorithms, the sampling strategy, the regularization technique and the optimization method. For morphological imaging, a novel analytical pattern combining aspects of pMRI and CS, called the MICCS pattern, is proposed in combination with an adapted Split Bregman algorithm. This allows for a reduction in the acquisition time for peripheral TOF imaging of the entire lower vasculature from over 30 minutes to less than 8 minutes. Further acceleration is achieved for 3-D free-breathing renal angiography using mp-bSSFP, where the entire volume can be acquired in less than 1 minute instead of over 8 minutes. In addition, organ-based evaluations including the vessel sharpness at important positions show the diagnostic usability and the increased accuracy over clinically established acceleration methods. For PCI, advances are achieved with a dedicated novel sampling strategy, called I-VT sampling, including interleaved variations for all dimensions. Furthermore, two novel regularization techniques for PCI are developed in this thesis. First, a novel temporally masked and weighted strategy focusing on enhanced temporal fidelity, referred to as TMW, is developed. This fully automatic approach uses dynamic and static vessel masks to guide the influence specifically to the static areas. Second, the low-rank and sparse decomposition model is extended to PCI, combined with adapted sparsity assumptions and the unconstrained Split Bregman algorithm. These methods are successfully applied to the carotid bifurcation, a region with a huge demand for significant acceleration as well as high spatial and temporal accuracy of the flow values. However, all algorithmic contributions exploit inherent properties of the acquisition technique, and thus can be applied to further applications. In summary, the main contribution of this thesis is the significant acceleration of nceMRA achieved with novel sampling, regularization and optimization elements.
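The reconstructions described above follow the generic compressed-sensing formulation, here in standard notation with F_u the undersampled (possibly non-Cartesian) Fourier and coil operator, y the acquired k-space data and Ψ a sparsifying transform; the Split Bregman method then splits the l1 and l2 terms via auxiliary variables. This is the textbook form, not the exact functional of the thesis:

\hat{x} = \arg\min_{x} \; \tfrac{1}{2} \| F_u x - y \|_2^2 + \lambda \, \| \Psi x \|_1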
|
|
Kerstin Müller
|
09.05.2014
|
|
3-D Imaging of the Heart Chambers with C-arm CT
Nowadays, angiography is the gold standard for the visualization of the morphology of the cardiac vasculature and cardiac chambers in the interventional suite. Up to now, high resolution 2-D X-ray images are acquired with a C-arm system in standard views and the diagnosis of the cardiologist is based on the observations in the planar X-ray images. No dynamic analysis of the cardiac chambers can be performed in 3-D. In recent years, cardiac imaging in 3-D using a C-arm system has become of more and more interest in the interventional catheter laboratory. Furthermore, the analysis of the 3-D motion would provide valuable information with respect to functional cardiac imaging. However, cardiac motion is a challenging problem in 3-D imaging, which leads to severe imaging artifacts in the 3-D image. Therefore, the main research goal of this thesis was the visualization and extraction of dynamic and functional parameters of the cardiac chambers in 3-D using an interventional angiographic C-arm system.
In this thesis, two different approaches for cardiac chamber motion-compensated reconstruction have been developed and evaluated. The first technique addresses the visualization of the left ventricle. Therefore, a whole framework for left ventricular tomographic reconstruction and wall motion analysis has been developed. Dynamic surface models are generated from the 2-D X-ray images acquired during a short scan of a C-arm scanner using the 2-D bloodpool information. The acquisition time is about 5 s and the patients have normal sinus rhythm. Due to the acquisition time of about 5 s of the C-arm, no valuable retrospective ECG-gated reconstructions are possible. The dynamic surface LV model comprises a sparse motion vector field on the surface, which can be used for functional wall motion analysis. Furthermore, applying various interpolation schemes, dense motion vector fields can be generated for a tomographic motion-compensated reconstruction. In this thesis, linear interpolation methods and spline-based methods have been compared. The combination of the wall motion analysis and the motion-compensated reconstruction is of great value to the diagnosis of pathological regions in cardiac interventions.
The second motion-compensated reconstruction approach uses volume-based motion estimation algorithms for the reconstruction of two (left atrium and left ventricle) to four heart chambers. A longer C-arm acquisition and contrast protocol allows for the generation of initial images at various heart phases. However, the initial image quality is not sufficient for motion estimation. Therefore, different pre-processing techniques to improve the image quality, e.g., bilateral filtering or iterative reconstruction techniques, were tested in combination with different motion estimation techniques.
Overall, the results of this thesis clearly demonstrate the feasibility of dynamic and functional cardiac chamber imaging using data from an interventional angiographic C-arm system for clinical applications.
|
|
|
Magnetic Resonance Imaging for Percutaneous Interventions
The fundamental motivation for all percutaneous interventions is to improve patient care by reducing the invasiveness of the procedure. An increasing number of percutaneous interventions, ranging from biopsies and targeted drug delivery to thermal ablations, are performed under magnetic resonance (MR) guidance. Its excellent soft-tissue contrast and multiplanar imaging capabilities make MRI an attractive alternative to computed tomography or ultrasound for real-time image-guided needle placement, in particular for targets requiring a highly angulated approach and non-axial scan planes. MRI further provides the unique ability to monitor spatial temperature changes in real-time.
The research efforts of this dissertation were focused on improving and simplifying the workflow of MR-guided percutaneous procedures by introducing novel image-based methods that do not require any additional equipment. For safe and efficient MR-guided percutaneous needle placement, a set of methods was developed that allows the user to: 1) plan an entire procedure, 2) directly apply this plan to skin entry site localization without further imaging, and 3) place a needle under real-time MR guidance with automatic image plane alignment along a planned trajectory with preference to the principal patient axes. Methods for enhanced MR thermometry visualization and treatment monitoring were also developed to support an effective thermal treatment, facilitating the ablation of tumor tissue without damaging adjacent healthy structures.
To allow for an extensive in-vitro and in-vivo validation, the proposed methods for both needle guidance and MR thermometry were implemented in an integrated prototype. The clinical applicability was demonstrated for a wide range of MR-guided percutaneous interventions emphasizing the relevance and impact of the conducted research.
|
|
|
Automatic Classification of Cerebral Gliomas by Means of Quantitative Emission Tomography and Multimodal Imaging
Cerebral gliomas represent a common type of cancer of the human brain, with many tumor grades that express a huge diversity in growth characteristics and have a highly varying malignancy. The optimal treatment for a cerebral glioma is only ensured if the underlying tumor grade is known. One very common grading scheme is the World Health Organization (WHO) classification of tumors of the central nervous system, which differentiates four grades. The de facto standard of grading a glioma is based on bioptic samples obtained in invasive interventions. These interventions pose significant risks for the patients and add further delays between the initial evidence of the tumor, usually found by X-ray computed tomography (CT) or magnetic resonance imaging (MRI), and the initiation of a treatment. On the other hand, versatile imaging modalities such as CT, MRI and, from the field of nuclear medicine, positron emission tomography (PET) cover various aspects of the morphology and physiology of a tumor. The information gained from medical imaging thus can indicate the grade of a cerebral glioma without any invasive intervention. However, multimodal imaging often results in a high complexity that makes it difficult to diagnose the malignancy solely based on the visual interpretation of medical images. In this thesis, we present approaches for an extensive pattern recognition pipeline for the grading of cerebral gliomas based on tomographic datasets from MRI, CT, and PET. More specifically, we use gadolinium contrast-enhanced T1-weighted MRI, T2-weighted fluid attenuated inversion recovery MRI, diffusion-weighted MRI, non contrast-enhanced low-dose X-ray CT, and dynamic (multiple acquired time frames) [18F]-Fluor-Ethyl-Tyrosine (FET) PET. Our setup includes image preprocessing, feature extraction and calculation, feature normalization, and finally fully automatic classification. We propose the imaging modalities and the classifiers which performed best for our patient population and show that inter-dataset normalization as a preprocessing step helps to improve the classification rate for cerebral gliomas. As the PET data are acquired over a lengthy time period, which can lead to substantial patient motion, we present a retrospective motion correction technique based on image registration, which improves the image quality of the PET data. The presented approaches underline that diagnostic statements can be gained from highly complex, multimodal image data in an automated fashion. We can differentiate not only low- and high-grade tumors, but also aid in distinguishing between the four WHO grades within some limitations.
|
|
|
C-arm Computed Tomography with Extended Axial Field-of-View
C-arm computed tomography (CT) is an innovative imaging technique in the interventional room that enables a C-arm system to generate 3D images like a CT system. Clinical reports demonstrate that this technique can help reduce treatment-related complications and may improve interventional efficacy and safety. However, C-arm CT is currently only capable of imaging axially short objects, because it employs a single circular data acquisition geometry. This shortcoming can be a problem in some intraoperative cases when imaging a long object, e.g., the entire spine, is crucial. A new technique, C-arm CT for axially long objects, namely extended-volume C-arm CT, has to be developed. This thesis aims at achieving this development. In particular, this thesis designs and analyzes data acquisition geometries as well as develops and implements reconstruction algorithms for extended-volume C-arm CT.
The thesis consists of three parts. In the first part, we studied three data acquisition geometries and invented two of them. For these geometries, we investigated their feasibility on a C-arm system and analyzed their suitability for efficient, theoretically-exact and -stable (TES) reconstruction algorithms. We observed that the reverse helical trajectory is a good starting point for real data tests and that the novel ellipse-line-ellipse trajectory is a good candidate for efficient TES image reconstruction. In the second part, we developed and implemented geometry-specific reconstruction algorithms. For the reverse helix, we designed three Feldkamp-Davis-Kress (FDK)-type reconstruction methods. Among the three methods, the Fusion-RFDK and Fusion-HFDK methods are preferred as they are more practical and produce acceptable images for extended-volume C-arm CT. For the ellipse-line-ellipse trajectory, we established an efficient TES reconstruction scheme, which makes proficient use of the geometry of this trajectory. In the third part, we conducted the first experiment for extended-volume C-arm CT on a laboratory Artis zeego system. In this experiment, cone-beam data were reliably acquired using the reverse helical trajectory and 3D images were successfully reconstructed by the Fusion-RFDK method. The consistency among theoretical understanding, simulation results and the image quality achieved with a real system strongly demonstrates the feasibility of extended-volume C-arm CT in the interventional room.
|
|
Stefan Wenhardt
|
17.06.2013
|
|
Ansichtenauswahl für die 3-D-Rekonstruktion statischer Szenen (View Selection for the 3-D Reconstruction of Static Scenes)
The problem of 3-D reconstruction is one of the main topics in computer vision. If certain imaging parameters can be modified to improve the 3-D reconstruction result, the question of how to select these parameters belongs to a domain called active vision. The active parameters in our case are the focal length of the camera, which can be controlled by a zoom lens, and the pose, i.e. the translation and rotation of the camera. The camera is mounted on a robot, so its position can be controlled.
Usually, active vision for 3-D reconstruction means either scene exploration or the most accurate estimation of the 3-D structure of an object. Of course, there are approaches that try to find a trade-off between the two aspects. This thesis focuses only on the aspect of highly accurate estimates. For this purpose, feature points are extracted from the images to estimate their 3-D coordinates. Two different approaches are developed and evaluated: a geometric approach for stereo camera systems and a probabilistic approach.
The geometric approach considers only stereo camera systems, i.e. systems that consist of exactly two cameras. The influence of the active parameters (translation, rotation and focal length) is evaluated and, where possible, analytically proven.
The probabilistic approach determines the next best view to increase the accuracy of the current estimate. For this, it is necessary to describe the problem of 3-D reconstruction as a state estimation problem. The state estimation is solved by the extended Kalman filter, so the current state estimate of the 3-D coordinates can be improved by additional observations. This thesis derives a modification of the Kalman filter which drastically reduces the computational complexity. Only a few simple assumptions are necessary for this modification, and it is discussed why these assumptions are meaningful in the application of 3-D reconstruction. The modification is exact, i.e. no approximation is required.
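To make the state-estimation view concrete, the following is a minimal sketch, not the thesis's actual derivation, of refining a 3-D point estimate from noisy 2-D observations with an extended-Kalman-style update; the pinhole projection model, the numerical Jacobian and all parameter names are illustrative assumptions.

```python
import numpy as np

def project(X, K, R, t):
    """Pinhole projection of a 3-D point X with intrinsics K and pose (R, t)."""
    Xc = R @ X + t                      # point in camera coordinates
    u = K @ Xc
    return u[:2] / u[2]

def projection_jacobian(X, K, R, t):
    """Numerical Jacobian (2x3) of the projection w.r.t. the 3-D point."""
    eps = 1e-6
    J = np.zeros((2, 3))
    for i in range(3):
        dX = np.zeros(3)
        dX[i] = eps
        J[:, i] = (project(X + dX, K, R, t) - project(X - dX, K, R, t)) / (2 * eps)
    return J

def ekf_update(x, P, z, K, R, t, meas_var=1.0):
    """One extended Kalman filter measurement update for a static 3-D point."""
    H = projection_jacobian(x, K, R, t)          # linearized measurement model
    S = H @ P @ H.T + meas_var * np.eye(2)       # innovation covariance
    Kg = P @ H.T @ np.linalg.inv(S)              # Kalman gain
    x_new = x + Kg @ (z - project(x, K, R, t))   # state correction
    P_new = (np.eye(3) - Kg @ H) @ P             # covariance update
    return x_new, P_new
```

Calling ekf_update once per additional view shrinks the covariance P, which is exactly the quantity a next-best-view criterion can aim to minimize.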
A 3-D point to be reconstructed may be invisible, e.g. because it is occluded by the object itself or its projection lies outside the field of view of the camera. Therefore, next best view planning has to consider whether the point is visible from a certain view. We show how the probability of visibility of a 3-D point can be calculated and, further, how the visibility issue can be integrated into the closed-form optimization criterion for next best view planning.
Another aspect of next best view planning is moving the camera to the desired position. The question is therefore which positions are reachable by the robot device the camera is mounted on. In previous publications, this aspect is either ignored or it is assumed that the camera can move only on a (part-)sphere around the object. This thesis instead describes the reachable workspace by means of the Denavit-Hartenberg matrix. This allows the complete workspace of the robot to be considered in the next best view planning, without any unnecessary limitation to a (part-)sphere.
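For reference, a small sketch of the standard Denavit-Hartenberg link transform and its chaining into a camera pose; the joint parameters are placeholders, and the thesis's concrete robot model is not reproduced here.

```python
import numpy as np

def dh_matrix(theta, d, a, alpha):
    """Standard Denavit-Hartenberg transform of one robot link."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(joint_params):
    """Chain per-joint DH transforms into the camera pose in the robot base frame."""
    T = np.eye(4)
    for theta, d, a, alpha in joint_params:
        T = T @ dh_matrix(theta, d, a, alpha)
    return T
```

Any candidate view whose pose cannot be reached by some joint configuration of this chain is simply excluded from the planning.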
|
|
|
Automated Evaluation of Three Dimensional Ultrasonic Datasets
Non-destructive testing has become necessary to ensure the quality of materials and components, either in service or at the production stage. This requires the use of a rapid, robust and reliable testing technique. As a main testing technique, ultrasound technology has unique abilities to assess discontinuity location, size and shape. Such information plays a vital role in the acceptance criteria, which are based on the safety and quality requirements of manufactured components. Consequently, ultrasound is used extensively, especially in the inspection of large-scale composites manufactured in the aerospace industry.
Significant technical advances have contributed to optimizing ultrasound acquisition techniques such as the sampling phased array technique. However, acquisition systems need to be complemented with an automated data analysis procedure to avoid the time-consuming manual interpretation of all produced data. Such a complement would accelerate the inspection process and improve its reliability.
The objective of this thesis is to propose an analysis chain dedicated to automatically processing the 3D ultrasound volumes obtained using the sampling phased array technique. First, a detailed study of the speckle noise affecting the ultrasound data was conducted, as speckle reduces the quality of ultrasound data. Afterwards, an analysis chain was developed, composed of a segmentation procedure followed by a classification procedure. The proposed segmentation methodology is adapted to ultrasound 3D data and aims to detect all potential defects inside the input volume. While the detection of defects is vital, one main difficulty is the high number of false alarms produced by the segmentation procedure. The correct distinction of false alarms is necessary to reduce the rejection ratio of safe parts. This has to be done without risking missing true defects. Therefore, there is a need for a powerful classifier which can efficiently distinguish true defects from false alarms. This is achieved using a specific classification approach based on data fusion theory.
The chain was tested on several volumetric ultrasound measurements of Carbon Fiber Reinforced Polymer components. Experimental results revealed high accuracy and reliability in detecting, characterizing and classifying defects.
|
|
|
Computerized Automatic Modeling of Medical Prostheses
In this thesis we study artificial intelligence methods, rule-based expert systems in particular, for the task of automatically designing customized medical prostheses. Here, the term design denotes the shaping or modeling of the prostheses and not their functional design. The challenge of the task at hand lies in designing prostheses that fit perfectly to the anatomy of the patient, and in many cases have to support additional functionality. Hence, each prosthesis has to be unique. Therefore, medical prostheses are usually designed starting with a template of the patient’s anatomy, e. g. acquired using CT data or scanned and digitized molds. In this thesis we assume that the template data is given as a triangle mesh in 3-D.
To address the challenge of automatically designing medical prostheses, we develop an expert system framework consisting of an expert system shell, a knowledge base and a feature detection unit. The framework is integrated into an existing modeling software. In the following, we denote the complete system as Expert System for Automatic Modeling (ESAM). The architecture of ESAM is generic and can be used for all kinds of design tasks. The specialization for the application in mind can be achieved by providing the necessary design rules and by adjusting the feature detection algorithms.
Our expert system specializes in monitoring and controlling a CAD software. Thus, it defines the parameters of the CAD tools, executes the tools and monitors the results by constantly analyzing the current shape. As part of the expert system we develop a knowledge representation language to structure and store the expert knowledge. The language is easy to understand, flexible and can be extended as required. The knowledge base is created in interaction with experts of the field. In addition, we study methods to extend, improve and maintain the knowledge base. We evaluate two methods for rule creation and rule critique. On the one hand, we apply genetic programming as a rule learning technique. On the other hand, we use a heuristic method based on data generated by ESAM. For the latter, we develop a tool that generates statistics about rule performance, rule relationships and user interaction. The knowledge gained in this way is integrated into the knowledge base, resulting in a higher performance of ESAM, e.g. the completion rate increased by about 30 %.
We apply two types of methods for the detection of surface features on the given templates. The first method analyzes the surface of the given template for peaks, depressions, ridges and combinations of these generic features. The generality of the detected features allows a simple adjustment to different anatomies. The second method uses registration in order to copy features from a labeled template to an unlabeled one. As a first step, it applies clustering techniques to identify a suitable set of representative templates. In the second step, these templates are labeled by a domain expert. Subsequently, the labels can be transferred based on the result of an ICP registration. Our experiments show that the second approach results in a higher quality of the detected features, e.g. the mean deviation is reduced by about 30 %, from approximately 3.8 mm to approximately 2.6 mm.
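The label-transfer step can be illustrated with a minimal point-to-point ICP followed by nearest-neighbor label copying; this sketch assumes the templates are given as NumPy point arrays and omits the clustering of representative templates described above.

```python
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Least-squares rigid transform (Kabsch) mapping src points onto dst points."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:             # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return R, t

def icp(src, dst, iterations=30):
    """Minimal point-to-point ICP aligning src to dst."""
    tree = cKDTree(dst)
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iterations):
        _, idx = tree.query(cur)                  # closest-point correspondences
        R, t = best_rigid_transform(cur, dst[idx])
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total

def transfer_labels(labeled_pts, labels, unlabeled_pts):
    """Copy per-point labels from an aligned, labeled template to an unlabeled one."""
    R, t = icp(labeled_pts, unlabeled_pts)
    aligned = labeled_pts @ R.T + t
    _, idx = cKDTree(aligned).query(unlabeled_pts)
    return labels[idx]
```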
ESAM is verified using the example of customized in-the-ear hearing aid design. An industry partner provided the domain knowledge necessary to create the knowledge base as well as the possibility to verify the system in a real production environment. In order to ensure the quality of the designed and manufactured in-the-ear hearing aids, the system is verified while running in a semi-automatic mode, which allows a modeling expert to monitor and correct the system if necessary. During the verification and practical usage of ESAM, thousands of customized in-the-ear hearing aid shells were manufactured. Compared to the manual approach, the design consistency improves by about 10 % and the design time is reduced by about 30 %. The overall acceptance rate of an expert system rule is 76 %. In addition, ESAM provides a framework which guides the modeler through the complex design process, thereby reducing the number of design errors and avoiding unnecessary process steps. As a consequence of these positive evaluation results, our industry partner continues to apply ESAM on its production floor.
|
|
Christian Riess
|
21.12.2012
|
|
Physics-based and Statistical Features for Image Forensics
The objective of blind image forensics is to determine whether an image is authentic or captured with a particular device. In contrast to other security-related fields, like watermarking, it is assumed that no supporting pattern has been embedded into the image. Thus, the only available cues for blind image forensics are either a) based on inconsistencies in expected (general) scene and camera properties or b) artifacts from particular image processing operations that were performed as part of the manipulation.
In this work, we focus on the detection of image manipulations. The contributions can be grouped into two categories: techniques that exploit the statistics of forgery artifacts and methods that identify inconsistencies in high-level scene information. The two categories complement each other. The statistical approaches can be applied to the majority of digital images in batch processing. If a particular, single image is to be investigated, high-level features can be used for a detailed manual examination. Besides providing an additional, complementary testing step for an image, high-level features are also more resilient to intentional disguise of the manipulation operation.
Hence, the first part of this thesis focuses on methods for the detection of statistical artifacts introduced by the manipulation process. We propose improvements to the detection of so-called copy-move forgeries. We also develop a unified, extensively evaluated pipeline for copy-move forgery detection. To benchmark different detection features within this pipeline, we create a novel framework for the controlled creation of semi-realistic forgeries. Furthermore, if the image under investigation is stored in the JPEG format, we develop an effective scheme to expose inconsistencies in the JPEG coefficients.
The second part of this work aims at the verification of scene properties. Within this class of methods, we propose a preprocessing approach to assess the consistency of the illumination conditions in the scene. This algorithm makes existing work applicable to a broader range of images. The main contribution in this part is a demonstration of how illuminant color estimation can be exploited as a forensic cue. In the course of developing this method, we extensively study color constancy algorithms, which is the classical research field for estimating the color of the illumination. In this context, we investigate extensions of classical color constancy algorithms to the new field of non-uniform illumination. As part of this analysis, we create a new, highly accurate ground truth dataset and propose a new algorithm for multi-illuminant estimation based on conditional random fields.
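As a simple illustration of classical color constancy, the sketch below computes block-wise gray-world illuminant estimates; the conditional-random-field multi-illuminant estimation developed in the thesis is considerably more involved and is not reproduced here.

```python
import numpy as np

def gray_world_illuminant(rgb):
    """Classical gray-world estimate: the average scene color approximates the illuminant."""
    est = rgb.reshape(-1, 3).mean(axis=0)
    return est / np.linalg.norm(est)          # only the chromaticity direction matters

def blockwise_illuminants(rgb, block=64):
    """Local estimates on a grid of blocks, a crude stand-in for spatially varying illumination."""
    h, w, _ = rgb.shape
    ests = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            ests.append(gray_world_illuminant(rgb[y:y + block, x:x + block]))
    return np.array(ests)
```

Strong disagreement between the estimate for an inserted region and the estimates for the rest of the scene is the kind of inconsistency a forensic analysis can flag.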
|
|
Korbinian Riedhammer
|
03.08.2012
|
|
Interactive Approaches to Video Lecture Assessment
A growing number of universities and other educational institutions record videos of regularly scheduled classes and lectures to provide students with additional resources for their studies. However, such a recording is not necessarily the same as a carefully prepared educational video. The main issue is that the videos are typically not post-processed in an editorial sense: they often contain long periods of silence or inactivity, unnecessary repetitions, spontaneous interaction with students, or even corrections of prior false statements or mistakes. Furthermore, there is often no summary or table of contents of the video, unlike with educational videos that supplement a certain curriculum and are well scripted and edited. Thus, the plain recording of a lecture is a good start but far from a good e-learning resource.
This thesis describes a system that can close the gap between a plain video recording and a useful e-learning resource by producing automatic summaries and providing an interactive lecture browser that visualizes automatically extracted key phrases and their importance on an augmented time line. The lecture browser depends on four tasks: automatic speech recognition, automatic extraction and ranking of key phrases, extractive speech summarization, and the visualization of the phrases and their salience. These tasks, as well as the contributions to the state of the art, are described in detail and evaluated on a newly acquired corpus of academic spoken English, the LMELectures. A first user study shows that students using the lecture browser can solve a topic localization task about 29% faster than students who are provided with the video only.
|
|
|
Efficient and Trainable Detection and Classification of Radio Signals
A strong need for robust and efficient radio signal detection and classification exists in both civil and non-civil applications. Most current state-of-the-art systems treat detection and classification as separate stages. Within this work, a new integrated approach is presented. The proposed system combines insights from communications intelligence, cognitive radio and modern pattern recognition.
Because the number of modems and waveforms is continuously growing, the proposed framework is designed to be easily adaptable to new situations. It therefore builds on machine learning methods in all processing stages. Some of these stages use unsupervised learning and some use supervised learning methods. The latter need a training data set to estimate their working parameters. For this reason, the generic radio scenario corpus, which contains more than 3730 MB of wideband scenarios with more than 875 labeled narrowband emissions, was established.
The supervised learning blocks are supported by an improved version of an AdaBoost classifier. To make a decision, the classifier can fall back on a large feature pool consisting of traditional features, e.g., spectral form, and more modern features, e.g., Haar-like or cepstral-based features. The applied features are extracted on demand and combined in a specific way to improve the classification accuracy and to increase the system performance.
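The boosting principle behind this classifier can be sketched with scikit-learn's stock AdaBoost over a generic feature matrix; the improved AdaBoost variant and the radio-specific feature extraction of the thesis are not reproduced, and all data below are placeholders.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# X: one row per narrowband emission, columns drawn from a feature pool
# (e.g. spectral-form, Haar-like or cepstral features); y: waveform class labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 24))                      # placeholder feature matrix
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)       # placeholder labels

# Default weak learners are depth-1 decision stumps, each picking one feature,
# so boosting implicitly selects informative features from the pool.
clf = AdaBoostClassifier(n_estimators=100)
clf.fit(X, y)
print(clf.predict(X[:5]))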
A prototype was evaluated by two institutions, the MEDAV GmbH and a European agency. The evaluation showed the high accuracy of the framework and successfully demonstrated processing under real-world conditions.
|
|
Alexander Brost
|
07.05.2012
|
|
Image Processing for Fluoroscopy Guided Atrial Fibrillation Ablation Procedures
Atrial fibrillation is a common heart arrhythmia and is associated with an increased risk of stroke. The current state-of-the-art treatment option is the minimally invasive catheter ablation. During such procedures, the four pulmonary veins attached to the left atrium are electrically isolated. New methods to guide these procedures are presented in this work.
Two methods for catheter reconstruction from two views are presented and evaluated. The first method focuses on the circumferential mapping catheter and the second on the cryo-balloon catheter. The result of the mapping catheter reconstruction is later used for the motion compensation methods.
As there is currently no planning support for single-shot devices like the cryo-balloon catheter, a planning tool is presented, the Atrial Fibrillation Ablation Planning Tool (AFiT). AFiT provides direct visual feedback about the fit of a cryo-balloon to the patient's anatomy. Another tool to provide intra-procedural support is the tracking of cryo-balloon catheters. Visual feedback about the position and dimensions of the balloon catheter can be superimposed onto live fluoroscopy.
In order to provide overlay images in sync with live fluoroscopic images, cardiac and respiratory motion must be taken into account. Therefore, several novel approaches for motion compensation are presented. The methods differ in their targeted application. A novel method, particularly designed for monoplane image acquisition, facilitates motion compensation by model-based 2-D/2-D registration. Another novel method focuses on simultaneous biplane image acquisition, requiring a 3-D catheter model of the circumferential mapping catheter. Motion compensation is then achieved by using a model-based 2-D/3-D registration to simultaneously acquired biplane images. As simultaneous biplane acquisition is rarely used in clinical practice, a new approach for a constrained model-based 2-D/3-D registration is presented to facilitate motion compensation using sequentially acquired biplane images. The search space of the registration is restricted to be parallel to the image plane. To further extend this approach, a novel method is proposed that involves a patient-specific motion model. A biplane training phase is used to generate this motion model, which is afterwards used to constrain the model-based registration. Overall, our motion compensation approaches achieve a tracking accuracy of less than 2.00 mm in 97.90 % of the frames.
As the circumferential mapping catheter needs to be moved during the procedure, a novel method to detect this motion is introduced. This approach requires tracking of the mapping catheter and of a virtual reference point on the coronary sinus catheter. As soon as the relative distance between the circumferential mapping catheter and the reference point changes by more than 5 %, non-physiological motion is assumed. We also investigated an option to provide motion compensation when the circumferential mapping catheter is not available and propose a novel method that uses the coronary sinus catheter and requires a training phase. Our method outperforms a similar method reported in the literature. We conclude that motion compensation using the coronary sinus catheter is possible, but it is not as accurate as motion compensation using the circumferential mapping catheter.
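The 5 % criterion can be read as a simple relative-distance test; the sketch below is an illustrative simplification with hypothetical variable names, not the implementation used in the thesis.

```python
import numpy as np

def mapping_catheter_moved(cath_pos, ref_pos, baseline_dist, threshold=0.05):
    """Flag non-physiological motion when the catheter-to-reference distance
    changes by more than `threshold` (5 %) relative to the baseline distance."""
    dist = np.linalg.norm(np.asarray(cath_pos) - np.asarray(ref_pos))
    return abs(dist - baseline_dist) / baseline_dist > threshold
```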
|
|
Ahmed El-Rafei
|
13.03.2012
|
|
Diffusion Tensor Imaging Analysis of the Visual Pathway with Application to Glaucoma
Glaucoma is an optic neuropathy affecting the entire visual system. The worldwide prevalence of glaucoma is estimated at 60.5 million people. The visual disorder caused by glaucoma can progress to complete blindness if untreated. Various treatment approaches exist that can largely prevent visual disability and limit the vision loss due to glaucoma if the disease is diagnosed in its early phases. Nevertheless, the slow progression of the disease, along with the lack of clear symptoms, results in the late identification of glaucoma. Moreover, the pathophysiology of glaucoma and its biological foundation and factors are not yet fully determined or understood. Therefore, novel directions are essential for improving the diagnostic workflow and the understanding of the glaucoma mechanism.
Most of the glaucoma diagnostic methods analyze the eye with a main focus on the retina, despite the transsynaptic nature of the fiber degeneration caused by glaucoma. Thus, they ignore a significant part of the visual system represented by the visual pathway in the brain. The advances in neuroimaging, especially diffusion tensor imaging (DTI), enable the identification and characterization of white matter fibers. It has been reported that glaucoma affects different parts of the visual system. Optic nerve and optic radiation were shown to have abnormalities measured by DTI-derived parameters in the presence of glaucoma. These outcomes suggest the significance of visual pathway analysis in the diagnosis.
In this work, we propose visual pathway analysis using DTI for glaucoma diagnosis to complement the existing retina-based techniques. A system is proposed to automatically identify the optic radiation in the DTI images. The segmentation algorithm is applied to healthy and glaucoma subjects and shows high accuracy in segmenting such a complicated fiber structure. The automation eliminates the need for intervention by medical experts and facilitates studies with a large number of subjects. This algorithm was incorporated into a framework for the determination of the local changes of the optic radiation due to glaucoma using DTI. The framework can aid further studies and the understanding of the pathophysiology of glaucoma. Moreover, the framework is applied to normal and glaucoma groups to provide localization maps of the glaucoma effect on the optic radiation. Finally, we propose a system that extracts different aspects of the visual pathway fibers from the diffusion tensor images for detecting and discriminating different glaucoma entities. The classification results indicate the superior performance of the system compared to many state-of-the-art retina-based glaucoma detection systems.
The proposed approach utilizes visual pathway analysis rather than the conventional eye analysis, which represents a new direction in glaucoma diagnosis. Analyzing the entire visual system could provide significant information that can improve the glaucoma examination workflow and treatment.
|
|
Johannes Feulner
|
05.03.2012
|
|
Machine Learning Methods in Computed Tomography Image Analysis
Lymph nodes have high clinical relevance because they are often affected by cancer, and also play an important role in all kinds of infections and inflammations in general. Lymph nodes are commonly examined using computed tomography (CT).
Manually counting and measuring lymph nodes in CT volume images is not only cumbersome but also introduces the problem of inter-observer and even intra-observer variability. Automatic detection is, however, challenging as lymph nodes are hard to see due to low contrast, irregular shape, and clutter. In this work, a top-down approach for lymph node detection in 3-D CT volume images is proposed. The focus is put on lymph nodes that lie in the region of the mediastinum.
CT volumes that show the mediastinum are typically scans of the thorax or even of the whole thoracic and abdominal region. Therefore, the first step of the method proposed in this work is to determine the visible portion of the body from a CT volume. This allows pruning the search space for mediastinal lymph nodes and also for other structures of interest. Furthermore, it can tell whether the mediastinum is actually visible. The visible body region of an unknown test volume is determined by 1-D registration along the longitudinal axis with a number of reference volumes whose body regions are known. A similarity measure for axial CT slices is proposed that has its origin in scene classification. An axial slice is described by a spatial pyramid of histograms of visual words, which are code words of a quantized feature space. The similarity of two slices is measured by comparing their histograms. As features, Speeded Up Robust Features (SURF) descriptors are used. This work proposes an extension of the SURF descriptors to an arbitrary number of dimensions (N-SURF). Here, we make use of 2-SURF and 3-SURF descriptors.
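The slice comparison can be illustrated with a plain bag-of-visual-words histogram and histogram intersection; the spatial pyramid levels and the SURF/N-SURF descriptors themselves are omitted in this sketch, and the codebook is assumed to be given.

```python
import numpy as np

def bag_of_words_histogram(descriptors, codebook):
    """Assign each local descriptor to its nearest code word and count occurrences."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)

def slice_similarity(desc_a, desc_b, codebook):
    """Histogram intersection between the visual-word histograms of two axial slices."""
    ha = bag_of_words_histogram(desc_a, codebook)
    hb = bag_of_words_histogram(desc_b, codebook)
    return np.minimum(ha, hb).sum()
```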
The mediastinal body region contains a number of structures that can be confused with lymph nodes. One of them is the esophagus. Its attenuation coefficient is usually similar, and at the same time it is often surrounded by lymph nodes. Therefore, knowing the outline of the esophagus both helps to reduce false alarms in lymph node detection and allows focusing on its neighborhood. In the second part of this work, a fully automatic method for segmenting the esophagus in 3-D CT images is proposed. Esophagus segmentation is a challenging problem due to limited contrast to surrounding structures and a versatile shape and appearance. Here, a multi-step method is proposed: First, a detector that is trained to learn a discriminative model of the appearance is combined with an explicit model of the distribution of respiratory and esophageal air. In the next step, prior shape knowledge is incorporated using a Markov chain model and used to approximate the esophagus shape. Finally, the surface of this approximation is non-rigidly deformed to better fit the boundary of the organ.
The third part of this work is a method for automatic detection and segmentation of mediastinal lymph nodes. Because lymph nodes have low contrast to neighboring structures, it is vital to incorporate as much anatomical knowledge as possible to achieve good detection rates. Here, a prior on the spatial distribution is proposed to model this knowledge, and different variants of this prior are compared to each other. This prior is combined with a discriminative model that detects lymph nodes from their appearance. It first generates a set of possible lymph node center positions. Two model variants are compared: given a detected center point, either the bounding box of the lymph node is directly detected, or the lymph node is segmented. A feature set extracted from this segmentation is introduced, and a classifier is trained on this feature set and used to reject false detections.
|
|
Christoph Schmalz
|
05.03.2012
|
|
Robust Single Shot Structured Light
In this thesis a new robust approach for Single-Shot Structured Light 3D scanning is developed. As the name implies, this measurement principle requires only one image of an object, illuminated with a suitable pattern, to reconstruct the shape and distance of the object. This technique has several advantages. It can be used to record 3D video with a moving sensor or of a moving scene. Since the required hardware is very simple, the sensor can also be easily miniaturized. Single-Shot Structured Light, thus, has the potential to be the basis of a versatile and inexpensive 3D scanner.
One focus of the work is the robustness of the method. Existing approaches are mostly limited to simple scenes, that is, smooth surfaces with neutral color and no external light. In contrast, the proposed method can work with almost any close-range scene and produces reliable range images even for very low-quality input images. An important consideration in this respect is the design of the illumination pattern. We show how suitable color stripe patterns for different applications can be created. A major part of the robustness is also due to the graph-based decoding algorithm for the pattern images. There are several reasons for this. Firstly, any color assessments are based on ensembles of pixels instead of single pixels. Secondly, disruptions in the observed pattern can be sidestepped by finding alternative paths in the graph. Thirdly, the graph makes it possible to apply inference techniques to obtain better approximations of the projected colors from the observed colors. For a typical camera resolution of 780×580, the whole decoding and reconstruction algorithm runs at 25 Hz on current hardware and generates up to 50,000 3D points per frame.
The accuracy of the recovered range data is another important aspect. We implemented a new calibration method for cameras and projectors, which is based on active targets. The calibration accuracy was evaluated using the reprojection error for single camera calibrations as well as the 3D reconstruction errors for complete scanner calibrations. The accuracy with active targets compares favorably to calibration results with classic targets. In a stereo triangulation test, the root-mean-square error could be reduced to a fifth. The accuracy of the combined Structured Light setup of camera and projector was also tested with simulated and real test scenes. For example, using a barbell-shaped reference object, its known length of 80.0057 mm could be determined with a mean absolute error of 42 µm and a standard deviation of 74 µm.
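For completeness, a minimal sketch of the root-mean-square reprojection error used to compare calibrations, assuming a generic pinhole model with intrinsics K and extrinsics (R, t); the active-target calibration itself is not reproduced here.

```python
import numpy as np

def reproject(points_3d, K, R, t):
    """Project 3-D points with intrinsics K and extrinsics (R, t) into the image."""
    Xc = points_3d @ R.T + t
    uvw = Xc @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def rms_reprojection_error(points_3d, points_2d, K, R, t):
    """Root-mean-square distance between observed and reprojected image points."""
    residuals = reproject(points_3d, K, R, t) - points_2d
    return np.sqrt((residuals ** 2).sum(axis=1).mean())
```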
The runtime performance, the robustness and the accuracy of the proposed approach are very competitive in comparison with previously published methods. Finally, endoscopic 3D scanning is a showcase application that is hard to replicate without Single-Shot Structured Light. Building on a miniature sensor head designed by Siemens, we developed calibration algorithms and applied the graph-based pattern decoding to generate high-quality 3D cavity reconstructions.
|
|
|
Model-Constrained Non-Rigid Registration in Medicine
The aim of image registration is to compute a mapping from one image's frame of reference to another's, such that both images are well aligned. Even when the mapping is assumed to be rigid (only rotation and translation) this can be a quite challenging task to accomplish between different image modalities. Noise and other imaging artifacts like bias fields in magnetic resonance (MR) imaging or streak artifacts in computed tomography (CT) can pose additional problems. In non-rigid image registration these problems are further compounded by the additional degrees of freedom in the transform.
Another problem is that the non-rigid registration problem is usually ambiguous: different deformation fields can lead to equally well aligned images. Nevertheless, one would prefer deformations that coincide with medical or physiological expectations. For instance, in MR images, low intensity values can indicate bone as well as air. We would prefer a registration result that only maps bone to bone and air to air, even though matching air to bone might lead to a visually similar result.
This work strives to address some of these problems. In a first step we provide a solid non-rigid registration algorithm. We compare several optimization algorithms to ensure that the registration result is at least numerically as good as possible. We also explore how the parameter determining the global stiffness of the computed transform can be specified in a way that yields predictable results. In a second step we integrate prior information about the desired deformation into this registration algorithm. Two types of prior information are considered in this work:
The first type consists of known point correspondences that explicitly specify the desired deformation for some parts of the images. This provides a very straightforward way for a user to interact with the registration algorithm. The known correspondences are efficiently integrated into the registration algorithm, which allows the specification of an arbitrary number of correspondences and the application of the approach in 2-D and 3-D. As the landmarks are treated as hard constraints, it is guaranteed that they are matched exactly. It is shown that this additional information can immensely benefit the registration result, especially in difficult cases such as the registration of relatively unrelated imaging modalities like positron emission tomography (PET) and CT.
The second type of information is provided in the form of training deformations reflecting the kinds of deformation usually encountered in an application. These are used to generate a model which can guide the registration to a result that is similar to the training data. We consider two variants of statistical deformation models: the model is generated and applied either on the deformations themselves or on their Laplacians. The latter has the advantage of being inherently invariant to remaining rigid misalignments in the training data. The models are applied in the context of atlas registration for MR/PET attenuation correction. A template CT image is registered with the patient MR to generate a pseudo-CT of the patient that can be used for the PET attenuation correction. However, the different intensity distributions in CT and MR, effects like bias fields, and the low inter-slice resolution common in MR imaging make the multi-modal registration prone to errors. The deformation model, learned from a set of mono-modal registrations, is used to constrain and thus improve the multi-modal registration. The algorithm is evaluated on a set of patient data for which the ground-truth CT scan is available. This allows the evaluation of the atlas registration results through a direct comparison with the ground-truth CT data. Our experiments show that the registration employing the statistical deformation models yields generally improved results.
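A statistical deformation model of this kind can be sketched as a PCA over vectorized training deformation fields; applying the same construction to the Laplacians of the fields (the second variant) only changes what is stacked into the data matrix. The sketch below is illustrative and not the thesis's exact formulation.

```python
import numpy as np

def build_deformation_model(training_fields, n_modes=5):
    """PCA model of training deformation fields (each field flattened to one row)."""
    D = np.stack([f.ravel() for f in training_fields])
    mean = D.mean(axis=0)
    U, s, Vt = np.linalg.svd(D - mean, full_matrices=False)
    return mean, Vt[:n_modes], s[:n_modes]      # mean, principal modes, singular values

def project_onto_model(deformation, mean, modes):
    """Constrain an arbitrary deformation to the subspace spanned by the model."""
    coeffs = modes @ (deformation.ravel() - mean)
    return (mean + coeffs @ modes).reshape(deformation.shape)
```

During registration, restricting the estimated deformation to this low-dimensional subspace is one way to keep the multi-modal result close to the behavior seen in the mono-modal training registrations.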
|
|
Christian Schaller
|
14.09.2011
|
|
Time-of-Flight - A New Modality for Radiotherapy
In this work, one of the first approaches utilizing so-called Time-of-Flight cameras for medical applications is presented. Using Time-of-Flight cameras, it is feasible to acquire a 3-D model in real-time with a single sensor. Several systems for managing motion within radiotherapy are presented. There are five major contributions in this work: a method to verify internal tumor movement with an external respiratory signal on-line, the application of a novel technology to medical image processing, and the introduction of three novel systems, one to measure respiratory motion and two others to position patients.
The algorithm to correlate external and internal motion is an image-based synchronization procedure that automatically labels pre-treatment fluoroscopic images with corresponding 4-D CT phases. It is designed as an optimization process and finds the optimal mapping between both sequences by maximizing the image similarity between the corresponding pairs while preserving temporal coherency. It was evaluated on both synthetic and patient data, and an average of 93% correctly labeled frames could be achieved.
The Time-of-Flight based respiratory motion system enables the simultaneous measurement of different regions. We evaluate the system using a novel body phantom. Tests showed that the system signal and the ground truth signal of the phantom have a reliable correlation of more than 80% for amplitudes greater than 5 mm. The correlation of both systems is independent of the respiratory frequency (always more than 80%). Furthermore, the measured signals were compared with a well-established external gating system, the Anzai belt. These experiments were performed on human subjects, and a correlation of about 88% between our system and the Anzai system could be shown.
The first positioning system is able to position a C-arm-like device with respect to the patient. To this end, a Time-of-Flight camera acquires the whole body of the patient and segments it into meaningful anatomical regions, such as head, thorax, abdomen and legs. The system computes 3-D bounding boxes of the anatomical regions and the isocenter of the boxes. Using this information, the C-arm system can automatically position itself and perform a scan. The system was evaluated using a body phantom, and an accuracy within the patient table accuracy of 1 cm could be shown. The second system deals with surface-based positioning of a patient with respect to a previously acquired surface of the same patient. Such systems are necessary, e.g., in radiotherapy or multi-modal imaging. The method uses an Iterative-Closest-Point algorithm tailored to Time-of-Flight cameras. It is evaluated using a body phantom and obtains an overall accuracy of 0.74 mm ± 0.37 mm for translations within 10 mm in all three spatial directions.
|
|
|
Quantitative Computed Tomography
Computed Tomography (CT) is a widespread medical imaging modality. Traditional CT yields information on a patient's anatomy in the form of slice images or volume data. Hounsfield Units (HU) are used to quantify the imaged tissue properties. Due to the polychromatic nature of X-rays in CT, the HU values for a specific tissue depend on its density and composition but also on CT system parameters and settings and on the surrounding materials. The main objective of Quantitative CT (QCT) is to measure characteristic physical tissue or material properties quantitatively. These characteristics can, for instance, be the density of contrast agents or the local X-ray attenuation. Quantitative measurements enable specific medical applications such as perfusion diagnostics or attenuation correction for Positron Emission Tomography (PET).
This work covers three main topics of QCT. After a short introduction to the physical and technological basics of QCT, we focus on spectral X-ray detection for CT. Here, we introduce two simulation concepts for spectral CT detectors, one for integrating scintillation detectors and one for directly converting counting detectors. These concepts are tailored specifically to the examined detector type and are supported by look-up tables. They enable whole-scan simulations about 200 times faster than standard particle interaction simulations without sacrificing the desired precision. These simulations can be used to optimize detector parameters with respect to the quality of the reconstructed final result. The results were verified with data from real detectors, prototypes and measuring stations.
The second topic is QCT algorithms which use spectral CT data to realize QCT applications. The core concept introduced here is Local Spectral Reconstruction (LSR). LSR is an iterative reconstruction scheme which yields an analytic characterization of local spectral attenuation properties in object space. From this characterization, various quantitative measures can be computed. Within this theoretical framework, various QCT applications can be formulated. This is demonstrated for quantitative beam-hardening correction, PET and SPECT attenuation correction and material identification.
The final part is dedicated to noise reduction for QCT. In CT, noise reduction is directly linked to patient dose savings. Here, we introduce two novel techniques. The first is an image-based noise reduction based on joint histograms of multi-energy data sets. This method explicitly incorporates the typical signal properties of multi-spectral data, and we demonstrate a dose saving potential of 20% on real and synthetic data. The second method is a non-linear filter applied to projection data. It uses a point-based projection model to identify and preserve structures in the projection domain. This principle is applied to a modified bilateral filter, in which the photometric similarity measure is replaced with a structural similarity measure derived from this concept. We examine the properties of this filter on synthetic and real patient data and obtain a noise reduction potential of about 15% without sacrificing image sharpness. This is verified on synthetic data as well as real phantom and patient scans.
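For orientation, a minimal sketch of a plain bilateral filter on a 2-D projection image; the thesis's modification, replacing the photometric weight with a structural similarity term from a point-based projection model, is not reproduced here, and the filter parameters are placeholders.

```python
import numpy as np

def bilateral_filter(img, radius=3, sigma_s=2.0, sigma_r=0.1):
    """Plain bilateral filter: each output pixel is a weighted average of its
    neighbors, with weights combining spatial distance and intensity difference."""
    img = img.astype(np.float64)
    pad = np.pad(img, radius, mode="reflect")
    out = np.zeros_like(img)
    norm = np.zeros_like(img)
    h, w = img.shape
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = pad[radius + dy: radius + dy + h, radius + dx: radius + dx + w]
            w_spatial = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2))
            w_photo = np.exp(-((shifted - img) ** 2) / (2.0 * sigma_r ** 2))
            weight = w_spatial * w_photo
            out += weight * shifted
            norm += weight
    return out / norm
```

In the modified filter described above, the `w_photo` term would be replaced by a structure-aware similarity, which is where the dose-saving behavior comes from.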
|
|
Andreas Fieselmann
|
09.09.2011
|
|
Interventional Perfusion Imaging Using C-arm Computed Tomography: Algorithms and Clinical Evaluation
A stroke is a medical emergency which requires immediate diagnosis and treatment. For several years, image-based stroke diagnosis has been assisted using perfusion computed tomography (CT) and perfusion magnetic resonance imaging (MRI). A contrast agent bolus is injected and time-resolved imaging, at typically one frame per second, is used to measure the contrast agent flow. However, these two modalities are not accessible in the interventional suite where catheter-guided stroke treatment actually takes place. Thus, interventional perfusion imaging, which could lead to optimized stroke management, is currently not available.
In this thesis, a novel approach is developed that makes interventional perfusion imaging possible. It uses a C-arm angiography system capable of CT-like imaging (C-arm CT). This system can acquire projection images during a rotation around the object, which are then used to reconstruct 3-D data sets. The comparatively low C-arm rotation speed (typically 3–5 seconds per 200°) is the main technical challenge of this approach.
One of the major contributions of this thesis lies in the development and evaluation of a novel combined scanning and reconstruction method. It uses several interleaved scanning sequences to increase the temporal sampling of the dynamic perfusion signals. A dedicated reconstruction scheme is applied to process the data from this protocol. For the first time, in vivo C-arm CT perfusion studies have been carried out and the results have been compared to those from a reference perfusion CT exam. Promising correlation values ranging from 0.63 to 0.94 were obtained.
An additional contribution was made in the field of image reconstruction theory by deriving a theoretical model for image reconstruction artifacts due to time-varying attenuation values. The attenuation values in C-arm CT perfusion imaging vary due to the contrast agent flow during the long C-arm rotation time. It was shown that the magnitude of these artifacts can be reduced when using optimized reconstruction parameters.
Furthermore, investigations regarding special injection protocols were carried out and fundamental image quality measurements were made.
Through the methods developed, the measurements conducted and results obtained, this thesis made a number of significant and original contributions, both on a practical and on a theoretical level, to the novel and highly relevant research field of interventional C-arm CT perfusion imaging.
|
|
Martin Spiegel
|
04.07.2011
|
|
Patient-Specific Cerebral Vessel Segmentation with Application in Hemodynamic Simulation
Cerebral 3-D rotational angiography has become the state-of-the-art imaging modality in modern angio suites for diagnosis and treatment planning of cerebrovascular diseases, e.g. intracranial aneurysms. Among other reasons, it is believed that the incidence of aneurysms is due to the locally prevalent hemodynamic pattern. To study such hemodynamic behavior, the 3-D vessel geometry has to be extracted from 3-D DSA data. Since 3-D DSA data may be influenced by beam hardening, inhomogeneous contrast agent distribution, patient movement or the applied reconstruction kernel, this thesis describes a novel vessel segmentation framework that seamlessly combines 2-D and 3-D vessel information to overcome the aforementioned factors of influence. The main purpose of this framework is to validate 3-D segmentation results based on 2-D information and to increase the accuracy of 3-D vessel geometries by incorporating additional 2-D vessel information into the 3-D segmentation process. Three major algorithmic contributions are made within this framework: (1) a classification-based summation algorithm for 2-D DSA series that makes 2-D vessel segmentation feasible, (2) a 3-D ellipsoid-based vessel segmentation method which allows for local adaptations driven by 2-D vessel segmentations and (3) a mesh size evaluation study investigating the influence of different mesh element types and resolutions on hemodynamic simulation results. Moreover, this work is complemented by a simulation study which evaluates the impact of different vessel geometries on the simulation result, where the vessel geometries are computed by different segmentation techniques working on the same patient dataset. The evaluation of each framework component revealed high accuracy and the algorithmic stability required for application in a clinical environment.
|
|
Christoph Gütter
|
28.01.2011
|
|
Statistical Intensity Prior Models with Applications in Multimodal Image Registration
Deriving algorithms that automatically align images being acquired from different sources (multimodal image registration) is a fundamental problem that is of importance to several active research areas in image analysis, computer vision, and medical imaging. In particular, the accurate estimation of deformations in multimodal image data perpetually engages researchers while playing an essential role in several clinical applications that are designed to improve available healthcare. Since the field of medical image analysis has been rapidly growing for the past two decades, the abundance of clinical information that is available to medical experts inspires more automatic processing of medical images.
Registering multimodal image data is a difficult task due to the tremendous variability of possible image content and diverse object deformations. Motion patterns in medical imaging mostly originate from cardiac, breathing, or patient motion (i.e. highly complex motion patterns), and the involved image data may be noisy, affected by image reconstruction artifacts, or contain occluded image information resulting from imaged pathologies. A key problem with methods reported in the literature is that they rely purely on the quality of the available images and therefore have difficulties in reliably finding an accurate alignment when the underlying multimodal image information is noisy or corrupted.
In this research, we leverage prior knowledge about the intensity distributions of accurate image alignments for robust and accurate registration of medical image data. The following contributions to the field of multimodal image registration are made. First, we developed a prior model, called the integrated statistical intensity prior model, that incorporates both current image information and prior knowledge. It shows an increased capture range and robustness on degenerate clinical image data compared to traditional methods. Second, we developed a generalization of the first model that allows for modeling all available prior information and achieves greater accuracy in aligning clinical multimodal image data. The models are formulated in a unifying Bayesian framework that is embedded in the statistical foundations of information-theoretic similarity measures. Third, we applied the proposed models to two clinical applications and validated their performance on a database of approximately 100 patient data sets. The validation is performed using a systematic framework, and we further developed a criterion for assessing the quality of non-rigid or deformable registrations.
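The underlying idea of an intensity prior can be sketched by blending a prior joint intensity histogram, learned from previously aligned image pairs, with the joint histogram of the current image pair before evaluating mutual information; the mixing weight alpha and this particular formulation are illustrative simplifications, not the Bayesian model of the thesis.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information of a joint intensity histogram (counts or probabilities)."""
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

def prior_weighted_mi(fixed, moving, prior_joint, bins=32, alpha=0.5):
    """Similarity that blends the observed joint histogram with a prior histogram.
    prior_joint is assumed to use the same binning as the observed histogram."""
    joint, _, _ = np.histogram2d(fixed.ravel(), moving.ravel(), bins=bins)
    mix = alpha * prior_joint / prior_joint.sum() + (1 - alpha) * joint / joint.sum()
    return mutual_information(mix)
```

When the current images are noisy or corrupted, the prior term keeps the similarity landscape close to that of known good alignments, which is the intuition behind the increased capture range reported above.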
The experiments on synthetic and real clinical images demonstrate the superior performance, in terms of robustness and accuracy, of statistical intensity prior models compared to traditional registration methods. This suggests that fully automatic multimodal registration (rigid and non-rigid) is achievable for clinical applications. Statistical intensity prior models deliver high accuracy from a relatively small amount of prior knowledge when compared to traditional machine learning approaches, which is appealing both in theory and in practice.
|
|
Christopher Rohkohl
|
22.12.2010
|
|
Motion Estimation and Compensation for Interventional Cardiovascular Image Reconstruction
The minimally invasive interventional treatment of cardiac diseases is of high importance in modern society. Catheter-based procedures are becoming increasingly complex, and novel tools for planning and guiding the interventions are required. In recent years, intraprocedural 3-D imaging has found its way into the clinics: based on 2-D X-ray images from C-arm systems, a 3-D image with high spatial resolution can be computed. Cardiac vessels are small and move fast and thus pose a problem for standard reconstruction algorithms. In this thesis, the issues of existing approaches are investigated and novel algorithms are developed that mitigate today's problems in terms of image quality, runtime and assumptions on the cardiac motion. One major contribution is the development of an optimized ECG-gated reconstruction algorithm compensating for non-periodic motion. A cost function inspired by iterative reconstruction algorithms is used to assess the reconstruction quality of an analytic reconstruction algorithm. This key concept is utilized to derive a motion estimation algorithm. The efficient and compact problem formulation allows, for the first time, the application of ECG-gating in the case of non-periodic motion patterns, which cannot be reconstructed with previous methods. This significant finding is incorporated into a novel B-spline based motion estimation algorithm which can cope with flexible 3-D motion over time and uses all the projection data. It again takes advantage of an analytic reconstruction algorithm to arrive at a highly efficient, well parallelizable and stable method. The evaluation shows that the developed algorithms allow the reconstruction of clinically challenging cases at high image quality in under 10 minutes, thus combining the desirable properties of reconstruction algorithms in the interventional environment that no previous algorithm provided.
|
|
Johannes Zeintl
|
22.12.2010
|
|
Optimizing Application Driven Multimodality Spatio-Temporal Emission Imaging
Single Photon Emission Computed Tomography (SPECT) is a widely used nuclear medicine imaging technique with many applications in diagnosis and therapy. With the introduction of hybrid imaging systems, integrating a SPECT and a Computed Tomography (CT) system in one gantry, diagnostic accuracy of nuclear procedures has been improved. Current imaging protocols in clinical practice take between 15 and 45 minutes and Filtered Backprojection (FBP) is still widely used to reconstruct nuclear images. Routine clinical diagnosis is based on reconstructed image intensities which do not represent the true absolute activity concentration of the target object, due to various effects inherent to SPECT image formation.
In this thesis, we present approaches for the optimization of current clinical SPECT/CT imaging for selected applications.
We develop analysis tools for the image quality assessment of commonly used static and dynamic cardiac image quality phantoms. We use these tools for the optimization of cardiac imaging protocols with the specific goal of reducing scan time and, at the same time, maintaining diagnostic accuracy. We propose a time-optimized protocol which uses iterative image reconstruction and offers a time reduction by a factor of two, compared to conventional FBP-driven protocols. The optimized protocol shows good agreement with the conventional protocol in terms of perfusion and functional parameters when tested on a normal phantom database and in prospective clinical studies.
In addition to optimizing image acquisition, we propose a calibration method for improved image interpretation that allows deriving absolute quantitative activity concentration values from reconstructed clinical SPECT images. In this method, we specifically take the non-stationarity of iterative reconstruction into account. In addition, we estimate the imprecision of our quantitative results caused by errors from the measurement instrumentation and accumulated through the course of the calibration. We show that accurate quantification in a clinical setup is possible in phantoms and also in vivo in patients.
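As a small illustration of the calibration idea described above, the following sketch converts reconstructed counts into an absolute activity concentration using a system calibration factor and propagates independent relative errors in quadrature. All numbers, units and variable names are hypothetical and not taken from the thesis.

import numpy as np

def to_activity_concentration(recon_counts, cal_factor_cps_per_bq, acq_time_s, voxel_vol_ml):
    # cal_factor_cps_per_bq: system sensitivity in counts per second per Bq,
    # obtained from a phantom calibration scan (hypothetical definition).
    activity_bq = recon_counts / (cal_factor_cps_per_bq * acq_time_s)
    return activity_bq / voxel_vol_ml          # Bq per ml

def combined_relative_error(*relative_errors):
    # First-order (quadrature) propagation of independent relative errors,
    # e.g. dose calibrator, volume measurement and counting statistics.
    return float(np.sqrt(np.sum(np.square(relative_errors))))

# Example with made-up numbers:
conc_bq_per_ml = to_activity_concentration(1.2e4, cal_factor_cps_per_bq=1.0e-2,
                                            acq_time_s=600.0, voxel_vol_ml=0.11)
rel_err = combined_relative_error(0.03, 0.02, 0.05)   # about 6.2 % combined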
We use the proposed calibration method for the quantitative assessment of dynamic processes by using time-contiguous SPECT acquisitions in combination with co-registered CT images and three-dimensional iterative reconstruction. We develop a physical dynamic phantom and establish a baseline for dual-headed SPECT systems by varying time-activity input function and rotation speed of the imaging system. We could show that, using state-of-the-art SPECT/CT systems, an accurate estimation of dynamic parameters is possible for processes with peak times of 30 seconds.
|
|
|
Novel Techniques for Spatial Orientation in Natural Orifice Translumenal Endoscopic Surgery (NOTES)
With a novel approach, abdominal surgery can be performed without skin incisions. The natural orifices provide the entry point, followed by an incision in the stomach, colon, vagina or bladder. “Natural Orifice Translumenal Endoscopic Surgery” (NOTES) is assumed to offer significant benefits to patients, such as less pain, reduced trauma and collateral damage, faster recovery, and better cosmesis. Particular improvements can be expected for obese patients, burn injury patients, and children. However, the potential advantages of this new technology can only be exploited through safe and standardized operation methods. Several barriers identified for the clinical practicability of flexible intra-abdominal endoscopy can be addressed with computer-assisted surgical systems. In order to assist the surgeon during the intervention and to enhance visual perception, some of these systems are able to additionally provide 3-D information of the intervention site; for others, 3-D information is even mandatory.
In this context, the question of whether on-line 3-D information can be obtained in real time had to be answered. One approach in this work to face this challenge is the acquisition of 3-D information directly via the endoscope with a hybrid imaging system: a Time-of-Flight (ToF) sensor is integrated in parallel to the CCD camera. ToF cameras actively illuminate the scene with an intensity-modulated optical reference signal. For each ToF pixel, a distance value is estimated from the phase shift between the reflected optical wave and the electrical reference signal. To compensate for the high optical attenuation of endoscopic systems, a much more efficient illumination unit based on laser diodes was designed. The 3-D depth information obtained by this “MuSToF endoscope” can furthermore be registered with preoperatively acquired 3-D volumetric datasets such as CT or MRI. These enhanced or augmented 3-D data volumes can then be used to find the transgastric, transcolonic, transvaginal, transurethral or transvesical entry point to the abdomen. Furthermore, the acquired endoscopic depth data can be used to provide better orientation within the abdomen. Moreover, it can also be used to prevent intra-operative collisions and provide an optimized field of view with the possibility for off-axis viewing.
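For reference, the standard relation between the measured modulation phase shift and the estimated distance in continuous-wave ToF imaging (a textbook relation, not a detail specific to this thesis) is, with c the speed of light, f_mod the modulation frequency and Δφ the measured phase shift:

\[
d \;=\; \frac{c}{2 f_{\mathrm{mod}}} \cdot \frac{\Delta\varphi}{2\pi}
\;=\; \frac{c\,\Delta\varphi}{4\pi f_{\mathrm{mod}}}
\]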
Furthermore, providing a stable horizon on video-endoscopic images, especially within non-rigid endoscopic surgery scenarios (particularly within NOTES), is still an open issue. Hence, this work's “ENDOrientation” approach for automated image orientation rectification contributes to a great extent to advancing the clinical establishment of NOTES. It works with a tiny MEMS tri-axial inertial sensor that is placed on the distal tip of an endoscope. By measuring the impact of gravity on each of the three orthogonal axes and filtering the data with several subsequent algorithms, the rotation angle can be estimated from these three acceleration values. The result can be used to automatically rectify the endoscopic images using image processing methods. The achievable repetition rate is above the usual endoscopic video frame rate of 30 Hz, and the accuracy is about one degree. The image rotation is performed by digitally rotating a captured frame of the analog endoscopic video signal, which can be realized in real time. Improvements and benefits have been evaluated in animal studies: coordination of different instruments and estimation of tissue behavior with respect to gravity-related deformation and movement was considered to be much more intuitive with a stable horizon in the endoscopic images.
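A minimal sketch of this idea follows: the roll of the endoscope tip around its viewing axis is estimated from the gravity components measured on the two perpendicular accelerometer axes, lightly filtered, and used to digitally rotate the video frame. The axis convention, the exponential smoothing and its constant are assumptions for illustration, not details taken from the thesis.

import numpy as np
from scipy.ndimage import rotate

class HorizonRectifier:
    """Estimate endoscope roll from accelerometer data and level the image."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha      # smoothing factor (assumed, not from the thesis)
        self.angle = None       # filtered roll angle in degrees

    def update(self, ax, ay):
        # Roll around the viewing axis from the gravity components on the two
        # perpendicular sensor axes (axis convention assumed).
        raw = np.degrees(np.arctan2(ax, ay))
        if self.angle is None:
            self.angle = raw
        else:
            self.angle = self.alpha * raw + (1.0 - self.alpha) * self.angle
        return self.angle

    def rectify(self, frame, ax, ay):
        # Digitally rotate the captured frame so the horizon stays level.
        return rotate(frame, angle=-self.update(ax, ay), reshape=False)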
Having additional 3-D data or a fixed horizon will not be an indispensable precondition for performing NOTES. But it will help to utilize robotic devices and to support surgeons who are novices in the field of flexible endoscopy. Since gastroenterologists and surgeons are not yet fully familiar with the NOTES approach, they will benefit from and appreciate such new technologies.
|
|
|
Evaluation of Modern Hardware Architectures for Fast CT Reconstruction from Cone-Beam Projections (original title: Evaluation moderner Hardwarearchitekturen zur schnellen CT Rekonstruktion aus Kegelstrahlprojektionen)
|
|
|
Normalization of Magnetic Resonance Images and its Application to the Diagnosis of the Scoliotic Spine
Due to its excellent soft tissue contrast and novel, innovative acquisition sequences, Magnetic Resonance Imaging has become one of the most popular imaging modalities in health care. However, the associated acquisition artifacts can significantly reduce image quality. Consequently, these imperfections can disturb the assessment of the acquired images. In the worst case, they may even lead to false decisions by the physician. Moreover, they can negatively influence the automatic processing of the data, e.g., image segmentation or registration. The most commonly observed artifacts are intensity inhomogeneities and the lack of a general, sequence-dependent intensity scale.
In this thesis, several novel techniques for the correction of these intensity variations are introduced. Furthermore, we demonstrate their advantages in a clinical application. Many state-of-the-art approaches for the correction of inhomogeneities lack either generalizability, efficiency, or accuracy. We present novel methods that overcome these drawbacks by introducing prior knowledge into the objective function and by mapping the optimization process onto a divide-and-conquer strategy. The experiments show that we can increase the average separability of tissue classes in clinically relevant 3-D angiographies by approximately 18.2%, whereas state-of-the-art methods achieve only 11.6%. The mapping of the intensities of a newly acquired image to a general intensity scale has to preserve the structural characteristics of the image's histogram. Further, it has to be invertible. Hence, many standardization approaches estimate a rather coarse intensity transformation. We propose several methods for standardization that are closely related to image registration techniques. These methods compute a per-intensity mapping. In addition, the presented methods are the only known ones that perform a joint standardization and that can handle images with a very large field-of-view. The experiments show that our method achieves an average intensity overlap of the major tissue classes of T1w images of about 86.2%, whereas the most commonly used state-of-the-art method results in only 70.1% overlap.
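To make the notion of a per-intensity mapping concrete, the following sketch standardizes an image onto a reference intensity scale by matching cumulative histograms. This is plain histogram matching, shown only to illustrate an invertible per-intensity mapping; the thesis instead derives its mappings with registration-like techniques, which are not reproduced here.

import numpy as np

def standardize_intensities(image, reference_intensities, reference_cdf, n_bins=1024):
    # Build the cumulative histogram (CDF) of the input image.
    hist, edges = np.histogram(image.ravel(), bins=n_bins)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf /= cdf[-1]
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Monotone per-intensity mapping: input intensity -> reference intensity.
    mapped_values = np.interp(cdf, reference_cdf, reference_intensities)
    return np.interp(image, centers, mapped_values)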
In order to illustrate the applicability and importance of the proposed normalization techniques, we introduce a system for the computer-aided assessment of anomalies of the scoliotic spine. It is based on the segmentation of the spinal cord using Markov random field theory. All required steps are presented, from pre-processing to the visualization of the results. In order to evaluate the system, we use the angle between automatically computed planes through the vertebrae and planes estimated by medical experts. This results in a mean angle difference of less than six degrees, which is accurate enough to be applicable in a clinical environment.
|
|
|
Probabilistic Modeling for Segmentation in Magnetic Resonance Images of the Human Brain
This thesis deals with the fully automatic generation of semantic annotations for medical imaging data by means of medical image segmentation and labeling. In particular, we focus on the segmentation of the human brain and related structures from magnetic resonance imaging (MRI) data. We present three novel probabilistic methods from the field of database-guided knowledge-based medical image segmentation. We apply each of our methods to one of three MRI segmentation scenarios: 1) 3-D MRI brain tissue classification and intensity non-uniformity correction, 2) pediatric brain cancer segmentation in multi-spectral 3-D MRI, and 3) 3-D MRI anatomical brain structure segmentation. All the newly developed methods make use of domain knowledge encoded by probabilistic boosting-trees (PBT), which is a recent machine learning technique. For all the methods we present uniform probabilistic formalisms that group the methods into the broader context of probabilistic modeling for the purpose of image segmentation. We show by comparison with other methods from the literature that in all the scenarios our newly developed algorithms in most cases give more accurate results and have a lower computational cost. Evaluation on publicly available benchmarking data sets ensures reliable comparability of our results to those of other current and future methods. We also document the participation of one of our methods in the ongoing online caudate segmentation challenge (www.cause07.org), where we rank among the top five methods for this particular segmentation scenario.
|
|
Björn Eskofier
|
26.04.2010
|
|
Application of Pattern Recognition Methods in Biomechanics (external: University of Calgary, Canada)
Biomechanical studies often attempt to identify differences between groups. Several scientific methods are available for identifying such differences. Traditional methods often focus on the analysis of single variables and do not take high-dimensional dependencies into account. Moreover, the analysis procedures are often biased by the expectations of the researcher. Pattern recognition based methods provide data-driven analysis, often conducted simultaneously in multiple dimensions. Such algorithms have recently been applied to biomechanical analysis tasks. However, the use of pattern recognition algorithms is still not well understood in the biomechanical community. The contribution of this thesis is therefore a better understanding of how pattern recognition tools can be applied to biomechanical group differentiation tasks.
Two main application scenarios were addressed. In the first part of the thesis, questions of human gait classification were examined. Existing studies with respect to this task had two main shortcomings. First, the features used for classification were often specific to the input measurements, derived from specific time points and thus not directly transferable to different tasks. Second, frequently only information from single variables was analyzed and high-dimensional dependencies neglected. Therefore, techniques for running and walking gait pattern classification were developed that overcame these shortcomings. They employed generic features that used a more complete representation of the available information compared to traditional methods. Moreover, high-dimensional dependencies were accounted for. Several group classification tasks were successfully solved using the developed methodology. The techniques are general and applicable to different group classification tasks without adaptation.
In the second part of the thesis, the implementation of pattern recognition algorithms on embedded systems was considered. Such systems allow, for instance, the application of pattern recognition systems outside the lab for sports biomechanics as well as for many other domains. General considerations for the implementation of pattern recognition algorithms on this specific hardware environment were still missing in the literature. A general methodology for embedded classification was therefore developed. The ability of this approach to produce acceptable results in sports biomechanics related classification tasks was shown. Furthermore, the applicability of embedded solutions for data collection in sports classification studies was demonstrated.
|
|
|
Practical Approaches to Multilingual and Non-Native Speech Recognition (original title: Praktikable Ansätze für mehrsprachige und nicht-muttersprachliche Spracherkennung)
|
|
|
Adaptive Filtering for Noise Reduction in X-Ray Computed Tomography
The projection data measured in computed tomography (CT) and, consequently, the slices reconstructed from these data are noisy. This thesis investigates methods for structure-preserving noise reduction in reconstructed CT datasets. The goal is to improve the signal-to-noise ratio without increasing the radiation dose or losing spatial resolution. Due to the close relation between noise and radiation dose, this improvement at the same time opens up a possibility for dose reduction. Two different original approaches, which automatically adapt themselves to the non-stationary and non-isotropic noise in CT, were developed, implemented and evaluated. The first part of the thesis concentrates on wavelet-based noise reduction methods. They are based on the idea of using reconstructions from two disjoint subsets of projections as input to the noise reduction algorithm. Correlation analysis between the wavelet coefficients of the input images and noise estimation in the wavelet domain are used to differentiate between structures and noise. In the second part, an original approach based on noise propagation through the reconstruction algorithm is presented. A new method for estimating the local noise variance and correlation in the image from the noise estimates of the measured data is proposed. Based on this additional information about the image noise, an adaptive bilateral filter is introduced. The proposed methods are all evaluated with respect to the obtained noise reduction rate, but also in terms of their ability to preserve structures. A contrast-dependent resolution analysis is performed to estimate the dose reduction potential of the different methods. The achieved noise reduction of about 60% can lead to dose reduction rates between 40% and 80%, depending on the clinical task.
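The first part's idea, using two reconstructions from disjoint projection subsets and separating structure from noise by how well their wavelet coefficients agree, can be sketched as follows. The similarity-based weighting below is a simplified stand-in for the correlation analysis and wavelet-domain noise estimation of the thesis.

import numpy as np
import pywt

def correlation_weighted_denoise(recon_a, recon_b, wavelet="db2", level=3):
    # recon_a, recon_b: reconstructions from two disjoint projection subsets.
    coeffs_a = pywt.wavedec2(recon_a, wavelet, level=level)
    coeffs_b = pywt.wavedec2(recon_b, wavelet, level=level)
    out = [0.5 * (coeffs_a[0] + coeffs_b[0])]          # keep the approximation band
    for da, db in zip(coeffs_a[1:], coeffs_b[1:]):
        bands = []
        for ca, cb in zip(da, db):
            # Close to 1 where both inputs agree (structure), near zero or
            # negative where they do not (uncorrelated noise).
            similarity = (ca * cb) / (0.5 * (ca ** 2 + cb ** 2) + 1e-12)
            weight = np.clip(similarity, 0.0, 1.0)
            bands.append(weight * 0.5 * (ca + cb))
        out.append(tuple(bands))
    return pywt.waverec2(out, wavelet)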
|
|
|
One-to-one Edge Based Registration and Segmentation Based Validations in Hybrid Imaging
During the past decade, image registration has become an essential tool for medical treatment in clinics: it finds the spatial mapping between two images, reveals changes in anatomical structure and merges the information from different modalities. At the same time, the matching of appropriately selected features is becoming more and more important for the further improvement of registration methods, as well as for the qualitative validation of registration. The purpose of this thesis is to solve the following two problems: How can feature detection be integrated into a non-rigid registration framework so that a high-quality spatial mapping is achieved? How can the quality of multi-modal registration be measured systematically by automatically segmenting the corresponding features? For the first problem, we develop a general approach based on the Mumford-Shah model for simultaneously detecting the edge features of two images and jointly estimating a consistent set of transformations to match them. The entire variational model is realized in a multi-scale framework of the finite element approximation. The optimization process is guided by an EM-type algorithm and an adaptive generalized gradient flow to guarantee a fast and smooth relaxation. This one-to-one edge matching is a general registration method, which has been successfully adapted to solve image registration problems in several medical applications, for example mapping inter-subject MR data or aligning retina images from different cameras. For the second problem, we propose a new method for validating hybrid functional and morphological image fusion, especially for the SPECT/CT modality. It focuses on measuring the deviation between corresponding anatomical structures. Two kinds of anatomical structures are investigated as validation markers: (1) the hot spot in the functional image and its counterpart in the morphological image, and (2) the kidneys in both modalities. A series of dedicated methods is developed to segment these structures in both modalities with minimal user interaction. The accuracy of the validation methods has been confirmed by experiments with real clinical data sets. The inaccuracies of the hot-spot-based validation for neck regions are 0.7189 ± 0.6298 mm in the X-direction, 0.9250 ± 0.4535 mm in the Y-direction and 0.9544 ± 0.6981 mm in the Z-direction, while the inaccuracies of the kidney-based validation for abdominal regions are 1.3979 ± 0.8401 mm in the X-direction, 1.9992 ± 1.3920 mm in the Y-direction and 2.7823 ± 2.0672 mm in the Z-direction. In the end, we also discuss a new interpolation-based method to effectively improve the SPECT/CT fusion and present preliminary results.
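A minimal sketch of the deviation measurement used for validation: after fusion, the centroid of a corresponding structure (e.g. a hot spot or a kidney) is computed in both modalities and the per-axis differences are reported. The centroid comparison below is purely illustrative and does not reproduce the segmentation pipeline of the thesis.

import numpy as np

def centroid_mm(mask, voxel_spacing_mm):
    # Center of mass of a binary segmentation, converted to millimetres.
    # (Axis order follows the array layout of the volumes.)
    return np.argwhere(mask).mean(axis=0) * np.asarray(voxel_spacing_mm)

def per_axis_deviation_mm(mask_spect, mask_ct, spacing_spect, spacing_ct):
    # Signed per-axis deviation between corresponding structures after fusion;
    # averaging the absolute values over many cases yields per-axis error
    # figures comparable to those quoted above.
    return centroid_mm(mask_spect, spacing_spect) - centroid_mm(mask_ct, spacing_ct)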
|
|
|
Statistical Medical Image Registration with Applications in Epilepsy Diagnosis and Shape-Based Segmentation
The advances in scanner technologies over the past years and a growing number of modalities in medical imaging result in an increased amount of patient data. Physicians are faced with an overwhelming amount of information when comparing different scans. Therefore, automatic image processing algorithms are necessary to facilitate everyday clinical workflows. The present work focuses on automatic, statistical image registration approaches and applications in epilepsy diagnosis and shape-based segmentation. Registration algorithms based on image intensity statistics are currently the state of the art for automatically computing an alignment between multi-modal images. Their parameters, however, are sensitive to the input data. In the present work, we study the mutual influences of these parameters on the intensity statistics and present data-driven estimation schemes to optimize them with respect to the input images. This is necessary to register large sets of images both accurately and reliably. The presented evaluation results, which are based on a database with an established gold standard, confirm that individually optimized parameters lead to improved results compared to standard settings found in the literature. Besides spatial accuracy, the reduction of the computation time for the registration is equally important. In this thesis, we present an approach to reduce the search space for the optimization of a rigid registration transform by a nonlinear projection scheme, which is closely related to the concept of marginalization of random variables. Within each projection, a disjoint subset of the transform parameters is optimized with greatly reduced computational complexity. With a good choice of the projection geometry, the search space can be separated into disjoint subsets. In the case of rigid 3-D image registration, the nonlinear projection onto a cylinder surface allows for an optimization of the rotation around the cylinder axis and a translation along its direction without the need for a reprojection. Sub-volume registration problems are supported by fitting the projection geometry into the overlap domain of the input images. The resulting objective functions are constrained by systems of linear inequalities and optimized by means of constrained, nonlinear optimization techniques. A statistical framework is proposed to measure the accuracy of the registration algorithms with respect to manual segmentation results. The aforementioned data-driven density estimators are adopted for the estimation of spatial densities of the segmented labels in order to model the observer reliability. The accuracy of the spatial registration transform is measured between the estimated distributions of the segmented labels in both input images using the Kullback-Leibler divergence. The proposed algorithms are evaluated by registering a database of morphological and functional images with an established gold standard based on fiducial marker implants. Applications are presented for the subtraction of single photon emission computed tomography scans for epilepsy diagnosis, where the intensity distributions are estimated both for the registration and for the normalization of the images. Finally, the registration is utilized for shape-based image segmentation to establish a model for the variability within a collective of segmented training shapes.
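The registration accuracy measure described above compares the spatial densities of segmented labels in both images with the Kullback-Leibler divergence. A discrete sketch of this comparison follows; the Gaussian kernel density estimate stands in for the data-driven estimators of the thesis, and the warp of the moving labels is application-specific and therefore omitted.

import numpy as np
from scipy.ndimage import gaussian_filter

def label_density(label_mask, sigma=2.0):
    # Smooth a (possibly rater-averaged) label mask into a spatial density.
    d = gaussian_filter(label_mask.astype(np.float64), sigma=sigma)
    return d / d.sum()

def kl_divergence(p, q, eps=1e-12):
    # D_KL(p || q) for two discrete spatial densities defined on the same grid.
    p = p / (p.sum() + eps)
    q = q / (q.sum() + eps)
    mask = p > 0
    return float(np.sum(p[mask] * np.log((p[mask] + eps) / (q[mask] + eps))))

# accuracy = kl_divergence(label_density(warped_moving_labels),
#                          label_density(fixed_labels))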
|
|
Marcus Prümmer
|
02.09.2009
|
|
Cardiac C-Arm Computed Tomography: Motion Estimation and Dynamic Reconstruction
Generating three-dimensional images of the heart during interventional procedures is a significant challenge. In addition to real-time fluoroscopy, angiographic C-arm systems can also be used to generate 3-D/4-D CT images on the same system. One protocol for cardiac Computed Tomography (CT) uses electrocardiogram (ECG) triggered multi-sweep scans. A 3-D volume of the heart at a particular cardiac phase is reconstructed with the Feldkamp, Davis and Kress (FDK) algorithm using projection images with retrospective ECG gating. In this thesis we introduce a unified framework for heart motion estimation and dynamic cone-beam reconstruction using motion correction. Furthermore, theoretical considerations about dynamic filtered backprojection (FBP) as well as dynamic algebraic reconstruction techniques (ART) are presented, discussed and evaluated. Dynamic CT reconstruction improves temporal resolution and image quality by means of image processing, beyond the limits imposed by the C-arm hardware, such as the rotation speed. The benefits of motion correction are: (1) increased temporal and spatial resolution by removing cardiac motion that may still exist in the ECG-gated data sets, and (2) increased signal-to-noise ratio (SNR) by using more projection data than standard ECG-gated methods. Three signal-enhancing reconstruction methods are introduced that make use of all of the acquired projection data to generate a time-resolved 3-D reconstruction. The first averages all motion-corrected backprojections; the second and third perform a weighted averaging according to (1) intensity variations and (2) temporal distance to a time-resolved and motion-corrected reference FDK reconstruction. In a study, seven methods are compared: non-gated FDK, ECG-gated FDK, ECG-gated and motion-corrected FDK, the three signal-enhancing approaches, and temporally aligned and averaged ECG-gated FDK reconstructions. The quality measures used for comparison are spatial resolution and SNR. Additionally, new dynamic algebraic reconstruction techniques are introduced, compared to dynamic FBP methods and evaluated. In ART we model the object's motion either using a dynamic projector model or a dynamic grid of the object, defining the spatial sampling of the reconstructed density values. Both methods are compared to each other as well as to dynamic FBP. Spatial and temporal interpolation issues in dynamic ART and FBP and the computational complexity of the algorithms are addressed. The subject-specific motion estimation is performed using standard non-rigid 3-D/3-D and novel 3-D/2-D registration methods that have been specifically developed for the cardiac C-arm CT reconstruction environment. In addition, theoretical considerations about fast shift-invariant filtered backprojection methods for affine, ray-affine and non-rigid motion models are presented. Evaluation is performed using phantom data and several animal models. We show that data-driven and subject-specific motion estimation combined with motion correction can decrease motion-related blurring substantially. Furthermore, the SNR can be increased by up to 70% while maintaining spatial resolution at the level provided by the ECG-gated FDK. The presented framework provides excellent image quality for cardiac C-arm CT. The thesis contributes to an improved image quality in cardiac C-arm CT and provides several methods for dynamic FBP and ART reconstruction.
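Retrospective ECG gating, the starting point of the reconstruction framework above, can be sketched as a weighting of the projections by the distance of their cardiac phase to the target phase; the weighted, filtered projections are then backprojected with an FDK-type algorithm. The cosine window and its width below are assumptions for illustration, not the gating function of the thesis.

import numpy as np

def ecg_gating_weights(projection_phases, target_phase, window=0.2):
    # projection_phases: relative cardiac phase in [0, 1) of every projection,
    # derived from the recorded ECG.
    d = np.abs(np.asarray(projection_phases) - target_phase)
    d = np.minimum(d, 1.0 - d)                     # cyclic phase distance
    return np.where(d < window / 2.0,
                    0.5 * (1.0 + np.cos(2.0 * np.pi * d / window)),
                    0.0)

# The weights multiply the filtered projections before backprojection; motion
# correction additionally warps each backprojection according to the motion
# estimated for that projection's acquisition time.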
|
|
|
Speech of children with Cleft Lip and Palate: Automatic Assessment
This work investigates the use of automatic speech processing techniques for the automatic assessment of children’s speech disorders. The target group were children with cleft lip and palate (CLP). The speech processing techniques are applied to evaluate the children’s speech intelligibility and their articulation. Another goal of this work is to visualize the kind and degree of the pathology in the children’s speech. Tracking of the children’s therapy progress is also within the reach of the system.
Cleft lip and palate is the most common orofacial alteration. Even after adequate surgery, speech and hearing are still affected. The articulation or speech disorders of the children consist of typical misarticulations such as backing of consonants and enhanced nasal air emission.
State-of-the-art evaluation of speech disorders is performed perceptively by human listeners. This method, however, is hampered by inter- and intra-individual differences. Therefore, an automatic evaluation is desirable.
We developed PEAKS — the Program for the Evaluation of All Kinds of Speech disorders. With PEAKS one can record and evaluate speech data via the Internet. It runs in any web browser and features security concepts such as secure transmission and user level access control.
The agreement of PEAKS with different human experts is measured with different correlation coefficients, Kappa, and Alpha. The evaluation procedures for intelligibility employ Support Vector Machines and regression. Furthermore, dimensionality reduction techniques such as LDA, PCA, and Sammon mapping are used for visualization and feature reduction. As input for these algorithms, typical speech processing features such as MFCCs as well as specialized feature sets for prosody, pronunciation, and hypernasalization are employed. Another approach of this work is to use a children's speech recognizer to model a naïve listener. If the recording conditions are kept constant, the speaker is the only varying factor. Hence, the recognition rate should reflect the intelligibility of the speaker.
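As an illustration of how such an intelligibility evaluation can be set up, the sketch below trains a support vector regression on per-speaker features (for example the word recognition rate of a children's speech recognizer together with prosodic and MFCC-derived statistics; the feature layout is hypothetical) and reports the correlation of its predictions with the human expert scores.

import numpy as np
from scipy.stats import pearsonr
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

def fit_and_correlate(X_train, y_train, X_test, y_test):
    # X: one row of features per speaker; y: mean perceptual intelligibility
    # score assigned by the human raters.
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    r, _ = pearsonr(pred, y_test)      # agreement with the human raters
    return model, r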
Collection of patient speech data was performed in Erlangen from 2002 until 2008. 312 children with CLP were recorded. Control groups were gathered in four major cities of Germany to cover several regions of dialect. 726 control data sets were acquired.
The experimental results showed that the automatic system yields a high and significant agreement with the human raters for global parameters such as intelligibility as well as for single articulation disorders. The system performs in the same range as the human raters. The intelligibility assessment was shown to be independent of the region of dialect. The visualization of the speech data also showed high agreement with perceptually rated criteria. Artifacts caused by the use of multiple microphones were removed.
|
|
Frank Dennerlein
|
04.12.2008
|
|
Image Reconstruction from Fan-Beam and Cone-Beam Projections
This thesis addresses the problem of reconstructing static objects in 2D and 3D transmission computed tomography (CT). After reviewing the classical CT reconstruction theory, we discuss and thoroughly evaluate various novel reconstruction methods, two of which are original. Our first original approach is for 2D CT reconstruction from full-scan fan-beam data, i.e., for 2D imaging in the geometry of diagnostic medical CT scanners. Compared to conventional methods, our approach is computationally more efficient and also yields results with an overall reduction of image noise at comparable spatial resolution, as demonstrated in detailed evaluations based on simulated fan-beam data and on data collected with a Siemens Somatom CT scanner. Part two of this thesis discusses the problem of 3D reconstruction in the short-scan circular cone-beam (CB) geometry, i.e., the geometry of medical C-arm systems. We first present a detailed comparative evaluation of innovative methods recently suggested in the literature for reconstruction in this geometry and of the approach applied on many existing systems. This evaluation involves various quantitative and qualitative figures-of-merit to assess image quality. We then derive an original short-scan CB reconstruction method that is based on a novel, theoretically-exact factorization of the 3D reconstruction problem into a set of independent 2D inversion problems, each of which is solved iteratively and yields the object density on a single plane. In contrast to the state-of-the-art methods discussed earlier in this thesis, our factorization approach does not involve any geometric approximations during its derivation and enforces all reconstructed values to be positive; it thus provides quantitatively very accurate results and effectively reduces CB artifacts in the reconstructions, as illustrated in the numerical evaluations based on computer-simulated CB data and also real CB data acquired with a Siemens Axiom Artis C-arm system.
|
|
|
Accurate Cone-Beam Image Reconstruction in C-Arm Computed Tomography
The goal of this thesis is the robust implementation of an accurate cone-beam image reconstruction algorithm such that it is able to process real C-arm data from a circle-plus-arc trajectory. This trajectory is complete and especially well suited for C-arm systems, since it can be performed purely by rotating the C-arm around the patient without the need to move the patient table. We observed two major challenges: i) non-ideal acquisition geometry and ii) data truncation. To account for deviations from the ideal description of the data acquisition geometry, we developed a novel calibration procedure for the circle-plus-arc trajectory. For the second problem, we developed two novel truncation correction methods that approximately but effectively handle data truncation problems. For image reconstruction, we adapted the accurate M-line algorithm. In particular, we applied a novel and numerically stable technique to compute the view-dependent derivative with respect to the source trajectory parameter, and we developed an efficient way to compute the PI-line backprojection intervals via a polygonal weighting mask. We chose the M-line algorithm because it does not presume an ideal description of the data acquisition geometry. We acquired projection data of a physical phantom of a human thorax on a medical C-arm scanner. Reconstructed images exhibit strong cone-beam artifacts along the bones of the spine when the conventional Feldkamp algorithm is applied. These results are compared to those obtained with our implementation of the M-line algorithm. Finally, we demonstrate that cone-beam artifacts can be completely eliminated by applying the M-line algorithm to a Tuy-complete set of data.
|
|