ComiRNet – The Database of Predicted miRNAs Regulatory Networks

ComiRNet is a database of miRNA target predictions and predicted miRNA regulatory networks. ComiRNet stores approximately 5 million predicted interactions between 934 human miRNAs and 30,875 gene transcripts (mRNAs) which are exploited in the construction of the hierarchies of overlapping biclusters representing potential miRNA regulatory networks.

Clus: A Predictive Clustering System

Clus is a decision tree and rule induction system that implements the predictive clustering framework. This framework unifies unsupervised clustering and predictive modelling and allows for a natural extension to more complex prediction settings such as multi-task learning and multi-label classification. Clus is co-developed by the Declarative Languages and Artificial Intelligence group of the Katholieke Universiteit Leuven, Belgium, and the Department of Knowledge Technologies of the Jožef Stefan Institute, Ljubljana, Slovenia.


A theoretical framework that unifies different data mining tasks on different types of data can help to formalize knowledge about the domain of data mining and provide a basis for future research, unification, and standardization. It can directly support the development of a general framework for data mining, support the representation of the process of mining structured data, and allow the representation of the complete process of knowledge discovery.

Web-based system for retrieval and modality classification of medical images

The system uses multimodal features, both textual and visual. For the visual features we used state-of-the-art opponent SIFT features, whereas for the textual features we used the standard bag-of-words representation. We applied query expansion to further improve the text-based retrieval. Finally, we included the medical modality of the images as an input to the retrieval.
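As a rough illustration of the text side only, the following minimal sketch ranks image captions against a query using a bag-of-words representation and a simple form of query expansion. The synonym table, the function names, and the use of cosine similarity are assumptions made for this example; the description above does not say how the system expands queries or scores matches.

```python
import math
from collections import Counter

# Hypothetical synonym table used for query expansion (illustrative only).
SYNONYMS = {"xray": ["radiograph"], "ct": ["tomography"]}


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count term occurrences."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_query(query):
    """Append known synonyms of each query term to the query."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        expanded.extend(SYNONYMS.get(term, []))
    return " ".join(expanded)


def rank(query, captions):
    """Return (score, caption) pairs sorted by decreasing similarity."""
    q = bag_of_words(expand_query(query))
    scored = [(cosine(q, bag_of_words(c)), c) for c in captions]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Without expansion, the query "xray" shares no term with a caption such as "chest radiograph of adult patient"; after expansion the added term "radiograph" produces a non-zero score.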

DiatomSearch: Diatom identification system

DiatomSearch is a hierarchical multi-label classification (HMC) system for diatom image classification. Our approach to HMC exploits the classification hierarchy by building a single predictive clustering tree (PCT) that can simultaneously predict all different levels in the hierarchy of taxonomic ranks: genus, species, variety, and form. The system can be used by taxonomists to annotate new diatom images.
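To make the hierarchical multi-label setting concrete, the sketch below shows one common way to encode a taxonomic annotation so that every level of the hierarchy becomes a label of its own. The path encoding, the function names, and the placeholder taxon names are assumptions for illustration; they are not taken from DiatomSearch or from the PCT implementation.

```python
# Taxonomic ranks named in the description, from most general to most specific.
RANKS = ("genus", "species", "variety", "form")


def expand_label(path):
    """Turn a taxonomic path into the set of all its prefixes.

    An image annotated at species level is thereby also labelled
    with its genus, which is what makes the task multi-label.
    """
    return {"/".join(path[:i]) for i in range(1, len(path) + 1)}


def rank_of(label):
    """Return the taxonomic rank that a path-encoded label belongs to."""
    return RANKS[label.count("/")]


# Placeholder names, not real taxa.
labels = expand_label(("GenusA", "speciesB", "varietyC"))
```

Here `labels` contains "GenusA", "GenusA/speciesB", and "GenusA/speciesB/varietyC", so a single model can be trained to predict all levels at once.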

Interpolative Clustering Tree Learner

Interpolative Clustering Tree (ICT) is a data mining algorithm that summarizes data sampled over space for a number of geophysical variables by leveraging a spatially aware clustering algorithm. ICT determines a descriptive and interpolative model of georeferenced data sampled for the set of fields under examination.

TweetViz: Twitter Data Visualization

TweetViz is a web tool for visualizing Twitter data. TweetViz offers several kinds of visualizations that can pertain to a Twitter user or to any keyword or hashtag entered through the interface. TweetViz also includes a so-called Streamgraph visualization that represents the topic distribution in a set of tweets. The topic distributions are created using LDA (Latent Dirichlet Allocation).

NewsTweetSentiment – system for sentiment analysis of news-related social media responses

NewsTweetSentiment is a system for sentiment analysis of social media responses on Twitter and for matching them with news articles from several sources. In this way, users can get a better understanding of the reactions a news article is receiving. Currently, we support news feeds from BBC and The Guardian on several topics, including worldwide news, business, sport, and technology news. The articles presented in the news feed are no older than 24 hours.

HOCCLUS: Hierarchical and Overlapping Co-Clustering of mRNA:miRNA Interactions

HOCCLUS is a method for the extraction of co-clusters of miRNAs and messenger RNAs (mRNAs). Unlike several existing co-clustering algorithms, our approach efficiently extracts a set of possibly overlapping, exhaustive, and hierarchically organized co-clusters.

AMRules: Rules for regression data streams

The volume and velocity of data are increasing at astonishing rates. In order to extract knowledge from this huge amount of information, there is a need for efficient on-line learning algorithms. Three rule-based algorithms offer state-of-the-art results for mining regression streams. The algorithms are implemented in and available from MOA (Massive Online Analysis).
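The sketch below illustrates what on-line learning means for a regression rule: each example is seen once, the rule's statistics are updated in constant time, and no data is stored. It is a deliberately simplified toy with a fixed rule condition and a running mean as the prediction; the class and its parameters are hypothetical and do not reproduce AMRules or the MOA API.

```python
class RunningMeanRule:
    """A single rule 'IF x[attr] <= threshold THEN predict mean(y)'.

    The mean of the covered targets is maintained incrementally,
    so memory use does not grow with the length of the stream.
    """

    def __init__(self, attr, threshold):
        self.attr = attr            # index of the tested attribute
        self.threshold = threshold  # fixed split point (assumed given)
        self.n = 0                  # number of covered examples seen so far
        self.mean = 0.0             # running mean of their targets

    def covers(self, x):
        return x[self.attr] <= self.threshold

    def update(self, x, y):
        """Process one (x, y) example from the stream."""
        if self.covers(x):
            self.n += 1
            self.mean += (y - self.mean) / self.n

    def predict(self, x):
        """Return the running mean, or None if the rule does not apply."""
        return self.mean if self.covers(x) and self.n else None


rule = RunningMeanRule(attr=0, threshold=5.0)
for x, y in [((1.0,), 10.0), ((7.0,), 100.0), ((3.0,), 20.0)]:
    rule.update(x, y)
```

Only the first and third examples satisfy the rule condition, so the rule ends up predicting their mean for any covered instance.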

DAMRules: Distributed Adaptive Model rules for Regression

This is the first distributed streaming algorithm to learn decision rules for regression tasks. The algorithm is available in SAMOA (Scalable Advanced Massive Online Analysis), an open-source platform for mining big data streams. It uses a hybrid of vertical and horizontal parallelism to distribute Adaptive Model Rules (AMRules) on a cluster.
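For readers unfamiliar with the two terms, the following sketch contrasts them in the general sense: horizontal parallelism splits the instances (rows) among workers, while vertical parallelism splits the attributes (columns). The function names and the round-robin assignment are assumptions for illustration only and say nothing about how DAMRules or SAMOA actually assign work.

```python
def horizontal_partition(rows, n_workers):
    """Split instances round-robin: each worker gets whole rows."""
    return [rows[i::n_workers] for i in range(n_workers)]


def vertical_partition(rows, n_workers):
    """Split attributes round-robin: each worker sees every instance,
    but only a subset of its attribute values."""
    n_attrs = len(rows[0])
    return [
        [tuple(row[j] for j in range(w, n_attrs, n_workers)) for row in rows]
        for w in range(n_workers)
    ]


rows = [(1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12)]
by_instance = horizontal_partition(rows, 2)
by_attribute = vertical_partition(rows, 2)
```

A hybrid scheme combines the two, for example by distributing instances across some components and attribute statistics across others.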