Data Mining for Indoor and Outdoor Atmospheric Chemistry

There are hundreds of thousands of volatile organic compounds (VOCs) in the air that we breathe. Although they occur only at trace concentrations, these organic species are known to play key roles in atmospheric chemistry: they supply reactive carbon, nitrogen, and radicals to the remote troposphere and thus influence the global tropospheric ozone budget. The sources, distribution, sinks, and chemistry of these species are the subject of much current research. Recent technical advances in this field provide a vast amount of data on VOCs. So far, these data have been analyzed primarily by hand, yet their sheer volume makes it hard to extract all relevant information in this way. Data mining, and computational methods for the analysis of large volumes of complex data more generally, therefore have great potential to facilitate further progress in this area. The overall goal of this project is to develop the first dedicated data management and mining framework for the analysis of indoor and outdoor atmospheric VOC data. As a first step, data collected continuously in a cinema over a period of one month are used for a proof-of-concept analysis of VOC data. We will build on the experience gained from this study and extend the methods towards a more general framework for the analysis of data from both indoor and outdoor atmospheric chemistry.