Knowledge Bases

Knowledge Corroboration

Current knowledge bases suffer from either low coverage or low accuracy, yet user feedback can greatly improve the quality of automatically extracted knowledge bases. User feedback could help quantify the uncertainty associated with the stored statements and would enable mechanisms for searching, ranking and reasoning at the entity-relationship level. Most importantly, a principled model for exploiting user feedback to learn the truth values of statements in the knowledge base would be a major step forward in addressing the issue of knowledge base curation. We present a family of probabilistic graphical models that builds on user feedback and logical inference rules derived from the popular Semantic-Web formalism of RDFS. Through internal inference and belief propagation, these models can learn both the truth values of the statements in the knowledge base and the reliabilities of the users who give feedback. We demonstrate the viability of our approach in extensive experiments on real-world datasets, with feedback collected from Amazon Mechanical Turk.
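
To make the idea of jointly learning statement truth values and user reliabilities concrete, here is a minimal sketch. It is not the model from the paper: it uses no RDFS rules and no belief propagation, and instead alternates two simple estimation steps under an assumed one-parameter reliability per user. The function name `corroborate`, the initial reliability of 0.7 and the smoothing constants are all illustrative assumptions.

```python
def corroborate(votes, iterations=20, prior_true=0.5):
    """Toy joint estimate of statement beliefs and user reliabilities.

    votes: list of (user, statement, vote) with vote a bool.
    Assumption: each user is correct with one fixed probability,
    independent of the statement (not the paper's model).
    """
    users = {u for u, _, _ in votes}
    statements = {s for _, s, _ in votes}
    reliability = {u: 0.7 for u in users}  # assumed starting guess
    belief = {s: prior_true for s in statements}

    for _ in range(iterations):
        # Step 1: update P(statement is true) from the current reliabilities.
        for s in statements:
            p_true, p_false = prior_true, 1.0 - prior_true
            for u, s2, vote in votes:
                if s2 != s:
                    continue
                r = reliability[u]
                p_true *= r if vote else 1.0 - r
                p_false *= 1.0 - r if vote else r
            belief[s] = p_true / (p_true + p_false)

        # Step 2: update each reliability as the expected share of correct votes.
        for u in users:
            agree, total = 1.0, 2.0  # smoothing keeps reliabilities inside (0, 1)
            for u2, s, vote in votes:
                if u2 != u:
                    continue
                agree += belief[s] if vote else 1.0 - belief[s]
                total += 1.0
            reliability[u] = agree / total

    return belief, reliability


# Hypothetical feedback: alice and bob agree with each other, carol contradicts them.
votes = [
    ("alice", "s1", True), ("bob", "s1", True), ("carol", "s1", False),
    ("alice", "s2", False), ("bob", "s2", False), ("carol", "s2", True),
    ("alice", "s3", True), ("bob", "s3", True), ("carol", "s3", False),
]
belief, reliability = corroborate(votes)
```

In this toy data the majority view wins, so the statements backed by alice and bob end up with belief above 0.5 and carol's estimated reliability falls below theirs.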


Kasneci, Gjergji; Gael, Jurgen Van; Herbrich, Ralf; Graepel, Thore

Bayesian Knowledge Corroboration with Logical Rules and User Feedback Proceedings Article

In: Proceedings of European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 1–18, 2010.



Paquet, Ulrich; Gael, Jurgen Van; Stern, David; Kasneci, Gjergji; Herbrich, Ralf; Graepel, Thore

Vuvuzelas & Active Learning for Online Classification Proceedings Article

In: Proceedings of Computational Social Science and the Wisdom of Crowds Workshop, 2010.


Features from Knowledge Bases

The prediction accuracy of learning algorithms depends heavily on the quality of the selected features, but the task of feature construction and selection is often tedious and non-scalable. Over the years, however, there have been numerous projects with the goal of constructing general-purpose or domain-specific knowledge bases with entity-relationship-entity triples extracted from various Web sources or collected from user communities, e.g., YAGO, DBpedia, Freebase, UMLS. We introduce an expressive graph-based language for extracting features from such knowledge bases and a theoretical framework for constructing feature vectors from the extracted features. The experimental evaluation on different learning scenarios provides evidence that the features derived through our framework can considerably improve the prediction accuracy, especially when the labeled data at hand is sparse.
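
As a rough illustration of turning entity-relationship-entity triples into feature vectors, the sketch below builds one binary feature per (relation, object) pair reachable in a single hop. This is only the simplest conceivable case and is an assumption for illustration; it is not the graph-based language or the framework from the paper. The function name `one_hop_features` and the example triples are hypothetical.

```python
def one_hop_features(triples, entities):
    """Map each entity to a binary vector over (relation, object) pairs.

    triples: iterable of (subject, relation, object) strings.
    entities: the subjects for which feature vectors are wanted.
    """
    # The feature vocabulary is every (relation, object) pair seen for the entities.
    vocab = sorted({(r, o) for s, r, o in triples if s in entities})
    index = {feature: i for i, feature in enumerate(vocab)}

    vectors = {e: [0] * len(vocab) for e in entities}
    for s, r, o in triples:
        if s in vectors:
            vectors[s][index[(r, o)]] = 1
    return vocab, vectors


# Hypothetical triples in the style of a general-purpose knowledge base.
triples = [
    ("Berlin", "locatedIn", "Germany"),
    ("Munich", "locatedIn", "Germany"),
    ("Berlin", "type", "Capital"),
]
vocab, vectors = one_hop_features(triples, ["Berlin", "Munich"])
```

Here `vocab` lists the two features in sorted order, and the vectors for Berlin and Munich differ only in the `("type", "Capital")` position.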


Cheng, Weiwei; Kasneci, Gjergji; Graepel, Thore; Stern, David H; Herbrich, Ralf

Automated Feature Generation From Structured Knowledge Proceedings Article

In: Proceedings of the 20th ACM Conference on Information and Knowledge Management, pp. 1395–1404, 2011.


Efficient Graph Matching

The θ-subsumption problem, finding a match of a sub-graph with variables in node or edge labels, is crucial to the efficiency of learning systems on structured knowledge bases. We present two θ-subsumption algorithms based on strategies for preselecting suitable matching literals for the variables. We further map the general problem of θ-subsumption to a certain problem of finding a clique of fixed size in a graph, and in turn show that a specialization of the pruning strategy of the Carraghan and Pardalos clique algorithm provides a dramatic reduction of the subsumption search space.
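
The mapping to a fixed-size clique can be sketched as follows. Each way of matching one literal of clause C onto a literal of clause D is a node carrying its substitution, two nodes are adjacent when they belong to different literals of C and their substitutions agree on shared variables, and C θ-subsumes D exactly when one node per literal of C can be chosen so that all are pairwise adjacent. The code below is a minimal sketch under assumed conventions (literals as `(predicate, args)` tuples, variables as strings starting with an uppercase letter) and uses plain backtracking; it does not reproduce the preselection strategies or the Carraghan and Pardalos pruning from the paper.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()


def match_literal(c_lit, d_lit):
    """Return the substitution mapping c_lit onto d_lit, or None if there is none."""
    (c_pred, c_args), (d_pred, d_args) = c_lit, d_lit
    if c_pred != d_pred or len(c_args) != len(d_args):
        return None
    theta = {}
    for c_arg, d_arg in zip(c_args, d_args):
        if is_var(c_arg):
            # A repeated variable must be bound to the same term each time.
            if theta.setdefault(c_arg, d_arg) != d_arg:
                return None
        elif c_arg != d_arg:
            return None
    return theta


def compatible(t1, t2):
    """Two substitutions are adjacent if they agree on every shared variable."""
    return all(t2.get(var, val) == val for var, val in t1.items())


def subsumes(c, d):
    """Return theta such that every literal of c, under theta, occurs in d; else None."""
    # One group of candidate nodes per literal of c.
    candidates = [
        [t for t in (match_literal(cl, dl) for dl in d) if t is not None]
        for cl in c
    ]
    if any(not group for group in candidates):
        return None  # some literal of c has no partner in d

    def extend(i, chosen):
        if i == len(c):
            merged = {}
            for theta in chosen:
                merged.update(theta)
            return merged
        for theta in candidates[i]:
            if all(compatible(theta, other) for other in chosen):
                result = extend(i + 1, chosen + [theta])
                if result is not None:
                    return result
        return None

    return extend(0, [])


# Hypothetical clauses: a path of length two, and a two-cycle, matched against a chain.
chain = [("edge", ("a", "b")), ("edge", ("b", "c"))]
path = [("edge", ("X", "Y")), ("edge", ("Y", "Z"))]
cycle = [("edge", ("X", "Y")), ("edge", ("Y", "X"))]
path_theta = subsumes(path, chain)
cycle_theta = subsumes(cycle, chain)
```

Pairwise agreement is enough here because any conflict between substitutions involves only two of them, so a set of pairwise compatible nodes always merges into one consistent substitution.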


Scheffer, Tobias; Herbrich, Ralf; Wysotzki, Fritz

Efficient Θ-Subsumption Based on Graph Algorithms Proceedings Article

In: Lecture Notes in Artificial Intelligence: 6th International Workshop on Inductive Logic Programming, pp. 212–228, 1996.
