Research Aims
Aim 1: Characterizing semantic content in multimodal data
We will study how classical notions of information can be integrated with Bayesian models of multimodal data to characterize semantic content (Aim 1.1). We will also develop novel measures of semantic information content that generalize principles from information physics (Aim 1.2).
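As a point of reference for the classical notions of information mentioned in Aim 1, the following minimal sketch computes a plug-in estimate of Shannon mutual information between two discrete modalities. The function name `mutual_information` and the toy pairing of an image label with a caption word are illustrative assumptions and are not taken from the project's methods.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in bits from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)                 # counts of each (x, y) pair
    px = Counter(x for x, _ in pairs)      # marginal counts of x
    py = Counter(y for _, y in pairs)      # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

# Hypothetical toy data: an image label paired with a caption keyword.
dependent = [("cat", "meow"), ("dog", "bark")] * 50
independent = [("cat", "meow"), ("cat", "bark"),
               ("dog", "meow"), ("dog", "bark")] * 25

print(mutual_information(dependent))    # one modality determines the other
print(mutual_information(independent))  # modalities carry no shared information
```

In the dependent case each modality fully determines the other, so the estimate equals the one bit of entropy in the label; in the independent case it is zero. Plug-in estimates of this kind are biased for small samples, which is one reason a sketch like this should not be read as a proposed measure.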
Aim 2: Impact of data transformations on information content
We will develop methods for mitigating the impact of data transformations and nuisance factors on the information measures described in Aim 1. For certain types of data transformations, we will show that one can derive minimal, information-preserving representations of multimodal data (Aim 2.1), whose computation will require the development of efficient marginalization techniques (Aim 2.2). Since exact marginalization of nuisances will not be possible for general transformations, we will also develop domain adaptation techniques that mitigate the impact of nuisance factors on information content via optimal transformations to a latent space (Aim 2.3). We will also study generalization bounds for the proposed methods in the PAC-Bayes framework (Aim 2.4).
Aim 3: Most informative representations for classification and perception
We will develop a mathematical framework for learning the most informative representations of multimodal data, including methods for feature selection and dimensionality reduction for classification and perception. The new measures of information to be developed in Aim 1 will lead to nonconvex formulations of these problems (Aim 3.1), for which we will develop new theory and algorithms to guarantee global optimality (Aim 3.2) and convergence to the global optimum (Aim 3.3). We will also develop scalable algorithms for handling large volumes of complex multimodal data (Aim 3.4).
Aim 4: Characterizing uncertainty in multimodal representations
We will develop a statistical framework for characterizing the uncertainty of the information representations derived in Aim 3 using both frequentist (Aim 4.1) and Bayesian (Aim 4.2) approaches. We will also develop efficient statistical sampling methods (Aim 4.3), which will be useful both for characterizing uncertainty and for performing inference in the information pursuit framework.
Aim 5: Semantic information pursuit for multimodal data
We will develop a mathematical framework for integrating information representations obtained from different data modalities. For simple multimodal classification tasks, we will develop novel information fusion techniques (Aim 5.1). For more complex semantic interpretation tasks, we will develop a novel information-theoretic framework called information pursuit (Aim 5.2), which will use the novel measures of information and uncertainty to integrate information representations.
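The aim above does not specify an algorithm, but one way to picture an information-theoretic pursuit strategy is a greedy loop that repeatedly asks the query expected to reduce uncertainty about the label the most, then conditions on the observed answer. The sketch below illustrates that idea with classical Shannon entropy on a tiny hand-made dataset. The names `greedy_pursuit`, `SAMPLES`, and `TRUTH`, as well as the queries themselves, are hypothetical, and the project's actual measures of information and uncertainty may differ.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of the empirical label distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def conditional_entropy(samples, query):
    """Estimate H(label | answer to query) from (answers, label) samples."""
    n = len(samples)
    groups = {}
    for answers, label in samples:
        groups.setdefault(answers[query], []).append(label)
    return sum(len(g) / n * entropy(g) for g in groups.values())

def greedy_pursuit(samples, observe, max_queries=10):
    """Ask the most informative remaining query, condition on its answer, repeat."""
    remaining = set(samples[0][0])
    history = []
    while remaining and len(history) < max_queries:
        if entropy([label for _, label in samples]) <= 0:
            break  # the label is already determined by the history
        # Minimizing H(label | query) maximizes mutual information with the label.
        query = min(sorted(remaining), key=lambda q: conditional_entropy(samples, q))
        answer = observe(query)
        consistent = [(a, y) for a, y in samples if a[query] == answer]
        if not consistent:
            break  # the observed answer is not covered by the samples
        samples = consistent
        remaining.discard(query)
        history.append((query, answer))
    best = Counter(label for _, label in samples).most_common(1)[0][0]
    return history, best

# Hypothetical toy model: each sample maps binary queries to answers, plus a label.
SAMPLES = [
    ({"has_fur": 1, "barks": 1, "swims": 0}, "dog"),
    ({"has_fur": 1, "barks": 0, "swims": 0}, "cat"),
    ({"has_fur": 0, "barks": 0, "swims": 1}, "fish"),
    ({"has_fur": 0, "barks": 0, "swims": 0}, "bird"),
]
TRUTH = {"has_fur": 0, "barks": 0, "swims": 1}  # answers for the unseen item

history, label = greedy_pursuit(SAMPLES, lambda q: TRUTH[q])
print(history, label)
```

On this toy data the loop first asks `has_fur`, which splits the four labels evenly, then asks `swims`, which separates the two remaining candidates, so two of the three queries suffice.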
Aim 6: Integration and validation
The theoretical methods developed in the project will be applied to and tested on large, real-world datasets. This involves combining the methods developed in each subproject and applying them to a variety of complex multimodal datasets, such as biometrics, text, images, videos, smartphone sensor signals (Wi-Fi, GPS, accelerometer, pressure, app usage), and data from body-worn cameras.