List of Work Packages (WPs)
The work within the project will be organized into four main work packages. The first will develop efficient methods for structured output prediction (SOP) in a batch-learning setting, exploring combinations with semi-supervised learning (SSL). The second will develop methods for SOP on data streams (DS), also considering combinations with SSL. The third will be concerned with SOP in network data, considering combinations with SSL and mining DS. The fourth will address showcase applications of the developed methods in the areas of computational biology, sensor networks, multimedia annotation and retrieval, and social networks.
WP1 – Methods for efficient SOP in a batch learning setting
WP1 will develop methods for efficient SOP in a batch learning setting, addressing the following tasks: learning tree and rule ensembles for SOP, feature ranking for SOP, semi-supervised learning for SOP, and decomposition of the output space for SOP. The development and implementation of efficient tree and rule-based ensemble methods for SOP has received some attention recently but, despite the progress made, has left the issue of the trade-off between predictive and explanatory power unresolved: while ensembles have better predictive performance than single models, they are too large to inspect and understand, and consequently provide little explanation. To address this issue, we will develop methods for learning option trees for SOP, as well as methods for learning rule ensembles for SOP.
WP2 – Methods for SOP from data streams
WP2 will develop methods for learning models for SOP from data streams. It will comprise the tasks of learning trees and rules, feature ranking, semi-supervised learning, and event/change detection, all in the context of SOP and data streams. We will develop methods for learning trees and rules for SOP in data streams, considering different types of structured outputs and different types of models (including tree and rule-based ensembles). To begin with, we will consider the task of regression, which has only recently been addressed by tree-based models on data streams: the approaches and measures used in learning regression models can be extended to different types of structured outputs much more easily than those used for classification.
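As an illustration of why regression measures carry over to structured outputs, consider the variance-reduction heuristic commonly used to evaluate candidate splits in regression trees. The following is a minimal sketch, assuming that the structured output is a fixed-length vector of real-valued targets and that the per-target variances are simply summed; the function names are hypothetical, and this is not necessarily the heuristic that will be adopted in WP2.

```python
from statistics import pvariance


def total_variance(targets):
    """Sum of the per-target population variances of a list of target vectors."""
    if not targets:
        return 0.0
    # zip(*targets) turns a list of target vectors into one column per target.
    return sum(pvariance(column) for column in zip(*targets))


def variance_reduction(parent, left, right):
    """Variance reduction obtained by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * total_variance(left)
        + (len(right) / n) * total_variance(right)
    )
    return total_variance(parent) - weighted_children


# Single-target regression is the special case of vectors of length one.
single = variance_reduction([(1.0,), (3.0,)], [(1.0,)], [(3.0,)])

# The same code scores a split for a two-target output without modification.
multi = variance_reduction(
    [(1.0, 10.0), (3.0, 30.0)], [(1.0, 10.0)], [(3.0, 30.0)]
)
```

In this sketch, the single-target heuristic is obtained as a special case of the multi-target one. When the targets are measured on different scales, they would typically be normalized before their variances are summed.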
WP3 – Methods for analysis of network data
WP3 will focus on developing methods for the analysis of network data, covering a variety of network learning settings and learning tasks (including SOP and SSL). More specifically, WP3 will address the following tasks: learning models for SOP in the presence of network autocorrelation, learning models for link prediction in multi-type networks, distributed learning on streaming network data, and learning ranking models in network data. The task of learning models for SOP in the presence of network autocorrelation will consider different types of structured output, e.g., hierarchies of labels for hierarchical multi-label classification (MLC) and real-valued time series. It will also consider different types of networks (and thus of autocorrelation), such as spatial or spatio-temporal networks or networks describing interactions between nodes.
WP4 – Applications of the developed methods
WP4 will be concerned with applications of the developed methods to practically relevant problems of analyzing data from the areas of computational biology, multimedia, social networks, and sensor networks. The computational biology applications will include relating genotype and phenotype data, linking the composition of microbiomes to the host’s health and nutrition, and different types of quantitative structure-activity relationship (QSAR) modelling, which links the structural properties of chemical compounds to various aspects of their biological activity. Multimedia applications will include the analysis (both retrieval and annotation) of multimedia data, such as images/videos and text/web documents. The third application domain will be the analysis of social networks. We will use approaches developed in this project to track the evolution of dynamic networks, detecting and characterizing changes in their structure. Furthermore, we will design and develop a social media aggregator for data coming from different social services and the user’s history. Using SSL/SOP, we will create personalized social/semantic contexts for the user’s social data and thus help users deal with cognitive overload. Finally, we will use methods developed within the project to analyze sensor network data, addressing the task of monitoring electrical power plants in the context of SmartGrids.