Big Data in Radiology: Quantitative Assessment of 3D Computed Tomography Image Databases

Machine learning and algorithmic data analysis have become one of the most influential research areas of modern computer science. The ability to enable a machine to learn how to relate data characteristics to semantically meaningful quantities has opened up new opportunities for science: for the first time, it is becoming possible to (semi-)automatically obtain phenomenological descriptions of complex patterns in real-world systems that evade traditional modeling from “first principles”. Digital sensors and the availability of large computational resources allow a tireless machine to sift through large amounts of data and find subtle, complex patterns that might escape a human scientist’s attention without these new digital tools.
The project “Big Data in Radiology” aims to develop new methods for recognizing the relation between geometric structures in 3D computed tomography images and medical information such as conditions, diagnoses, or prognoses. We envision a two-level design for our analysis pipeline: First, the raw 3D voxel data is annotated with a structural, anatomically guided description that identifies anatomical features at a higher level of abstraction. Second, regression analysis is used to relate these anatomical quantities to medical information. We aim to perform this analysis on large databases so that, in the long run, we will be able to identify even complex and subtle patterns in a statistically meaningful way. We will apply the developed machinery to a set of open medical questions in order to evaluate whether a computational pipeline can contribute new, concrete medical insights. As a long-term perspective for this kind of work, we envision a system that generates hypotheses on how medically relevant conditions manifest in geometry, in order to assist research and possibly, at some point, clinical screening.
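The two-level design described above can be illustrated with a minimal sketch in Python with NumPy. It is not the project's method: the source does not specify the descriptors or the regression model. The geometric descriptors (voxel count and bounding-box diagonal), the synthetic spherical "volumes", the invented target values, and all function names (`make_sphere`, `describe_volume`, `fit_linear`, `predict`) are assumptions made purely for illustration.

```python
import numpy as np


def make_sphere(radius, size=32):
    """Synthetic stand-in for a segmented CT volume: a solid ball of voxels."""
    z, y, x = np.ogrid[:size, :size, :size]
    c = (size - 1) / 2.0
    inside = (x - c) ** 2 + (y - c) ** 2 + (z - c) ** 2 <= radius ** 2
    return inside.astype(float)


def describe_volume(volume, threshold=0.5):
    """Level 1 (placeholder): reduce raw voxels to two geometric descriptors.

    Returns [voxel count, bounding-box diagonal in voxels]. A real pipeline
    would use an anatomically guided description instead of a threshold.
    """
    coords = np.argwhere(volume > threshold)
    if coords.shape[0] == 0:
        return np.zeros(2)
    extent = coords.max(axis=0) - coords.min(axis=0) + 1
    return np.array([coords.shape[0], np.linalg.norm(extent)], dtype=float)


def fit_linear(features, targets):
    """Level 2: ordinary least squares with an intercept term."""
    design = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef


def predict(coef, features):
    """Apply the fitted linear model to a matrix of descriptors."""
    design = np.column_stack([features, np.ones(len(features))])
    return design @ coef


# Hypothetical "database": nine synthetic volumes of increasing size.
radii = np.arange(4, 13)
features = np.array([describe_volume(make_sphere(r)) for r in radii])

# Invented target that depends linearly on the voxel count (not real data).
targets = 0.01 * features[:, 0] + 2.0

coef = fit_linear(features, targets)
predicted = predict(coef, features)
```

Keeping the two levels as separate functions mirrors the separation in the text: the descriptor step can be replaced by a richer anatomical annotation, and the linear model by another regression method, without changing the other level.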