Current knowledge bases suffer from either low coverage or low accuracy, yet user feedback can greatly improve the quality of automatically extracted knowledge bases. User feedback could help quantify the uncertainty associated with the stored statements and would enable mechanisms for searching, ranking, and reasoning at the entity-relationship level. Most importantly, a principled model for exploiting user feedback to learn the truth values of statements in the knowledge base would be a major step forward in addressing the issue of knowledge base curation. We present a family of probabilistic graphical models that builds on user feedback and logical inference rules derived from the popular Semantic-Web formalism of RDFS. Through internal inference and belief propagation, these models can learn both the truth values of the statements in the knowledge base and the reliabilities of the users who give feedback. We demonstrate the viability of our approach in extensive experiments on real-world datasets, with feedback collected from Amazon Mechanical Turk.
Bayesian Knowledge Corroboration with Logical Rules and User Feedback Proceedings Article
In: Proceedings of European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 1–18, 2010.
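As an illustrative sketch only, not the paper's graphical model: the core corroboration idea of jointly estimating statement truth values and user reliabilities from feedback can be written as a small iterative scheme in the spirit of Dawid-Skene-style aggregation. All statement and user names below are invented for the example.

```python
# Toy corroboration sketch (hypothetical data, not from the paper):
# (statement, user) -> vote, where True means "the statement holds".
feedback = {
    ("s1", "u1"): True,  ("s1", "u2"): True,  ("s1", "u3"): False,
    ("s2", "u1"): False, ("s2", "u2"): False, ("s2", "u3"): True,
}
statements = {"s1", "s2"}
users = {"u1", "u2", "u3"}

truth = {s: 0.5 for s in statements}   # belief that statement is true
reliab = {u: 0.7 for u in users}       # P(user votes correctly)

for _ in range(20):
    # Update truth beliefs given current user reliabilities.
    for s in statements:
        p_true, p_false = 1.0, 1.0
        for (st, u), vote in feedback.items():
            if st != s:
                continue
            r = reliab[u]
            p_true *= r if vote else (1 - r)
            p_false *= (1 - r) if vote else r
        truth[s] = p_true / (p_true + p_false)
    # Update reliabilities: how often a user agrees with current beliefs.
    for u in users:
        agreements = [(truth[s] if v else 1 - truth[s])
                      for (s, uu), v in feedback.items() if uu == u]
        reliab[u] = sum(agreements) / len(agreements)

# The majority voters u1, u2 end up more reliable than the outlier u3,
# and s1 is believed true while s2 is believed false.
```

The paper's models additionally propagate beliefs along RDFS inference rules, which this toy scheme omits.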
Vuvuzelas & Active Learning for Online Classification Proceedings Article
In: Proceedings of Computational Social Science and the Wisdom of Crowds Workshop, 2010.
Features from Knowledge Bases
The prediction accuracy of learning algorithms depends heavily on the quality of the selected features; yet the task of feature construction and selection is often tedious and non-scalable. Over the years, however, there have been numerous projects with the goal of constructing general-purpose or domain-specific knowledge bases with entity-relationship-entity triples extracted from various Web sources or collected from user communities, e.g., YAGO, DBpedia, Freebase, UMLS. We introduce an expressive graph-based language for extracting features from such knowledge bases and a theoretical framework for constructing feature vectors from the extracted features. The experimental evaluation on different learning scenarios provides evidence that the features derived through our framework can considerably improve prediction accuracy, especially when the labeled data at hand is sparse.
Automated Feature Generation From Structured Knowledge Proceedings Article
In: Proceedings of the 20th ACM Conference on Information and Knowledge Management, pp. 1395–1404, 2011.
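As a rough illustration of the general idea only (the paper's graph-based feature language is far more expressive), one can derive binary features for an entity by walking relation paths in a triple store. The toy triples and the `path_features` helper below are invented for this example.

```python
from collections import defaultdict

# Hypothetical entity-relationship-entity triples, as found in
# knowledge bases such as YAGO or DBpedia.
triples = [
    ("Berlin", "locatedIn", "Germany"),
    ("Germany", "partOf", "Europe"),
    ("Paris", "locatedIn", "France"),
    ("France", "partOf", "Europe"),
]

def path_features(entity, triples, max_len=2):
    """Binary features of the form (relation-path, reached value),
    built by walking outgoing edges up to max_len hops."""
    out = defaultdict(list)
    for s, r, o in triples:
        out[s].append((r, o))
    feats = set()
    frontier = [((), entity)]
    for _ in range(max_len):
        nxt = []
        for path, node in frontier:
            for r, o in out[node]:
                p = path + (r,)
                feats.add(("/".join(p), o))
                nxt.append((p, o))
        frontier = nxt
    return feats

# path_features("Berlin", triples) contains, e.g.,
# ("locatedIn", "Germany") and ("locatedIn/partOf", "Europe").
```

Features like `("locatedIn/partOf", "Europe")` are exactly the kind of signal that can help a classifier when labeled data is sparse, since they generalize across entities sharing the same relational neighborhood.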
Efficient Graph Matching
The θ-subsumption problem, that of finding a match for a subgraph whose node or edge labels may contain variables, is crucial to the efficiency of learning systems operating on structured knowledge bases. We present two θ-subsumption algorithms based on strategies for preselecting suitable matching literals for the variables. We further map the general θ-subsumption problem to a certain problem of finding a clique of fixed size in a graph, and in turn show that a specialization of the pruning strategy of the Carraghan and Pardalos clique algorithm provides a dramatic reduction of the subsumption search space.
Efficient θ-Subsumption Based on Graph Algorithms Proceedings Article
In: Lecture Notes in Artificial Intelligence: 6th International Workshop on Inductive Logic Programming, pp. 212–228, 1996.
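The clique reduction described above can be sketched as follows. This is an illustrative implementation with a naive backtracking clique search, not the paper's pruning algorithm; the literal encoding (predicate name plus argument tuple, uppercase strings as variables) is an assumption made for the example. A clause C θ-subsumes a clause D iff some substitution maps every literal of C onto a literal of D, which holds iff the compatibility graph below contains a clique of size |C|.

```python
def unify(lit_c, lit_d, subst):
    """Try to extend subst so that lit_c maps onto lit_d.
    Uppercase argument strings are variables, others constants."""
    pred_c, args_c = lit_c
    pred_d, args_d = lit_d
    if pred_c != pred_d or len(args_c) != len(args_d):
        return None
    s = dict(subst)
    for a, b in zip(args_c, args_d):
        if a[0].isupper():          # variable: must bind consistently
            if s.get(a, b) != b:
                return None
            s[a] = b
        elif a != b:                # constant: must match exactly
            return None
    return s

def theta_subsumes(clause_c, clause_d):
    """C theta-subsumes D iff the compatibility graph has a clique of
    size |C|. Nodes are (index of C-literal, partial substitution)
    pairs; edges connect pairs covering different C-literals whose
    substitutions agree on shared variables."""
    nodes = []
    for i, lc in enumerate(clause_c):
        for ld in clause_d:
            s = unify(lc, ld, {})
            if s is not None:
                nodes.append((i, s))

    def compatible(n1, n2):
        i1, s1 = n1
        i2, s2 = n2
        return i1 != i2 and all(s1[v] == s2[v] for v in s1.keys() & s2.keys())

    # Naive clique search; the paper specializes the Carraghan-Pardalos
    # pruning strategy to cut this search space down dramatically.
    def extend(chosen, rest):
        if len(chosen) == len(clause_c):
            return True
        return any(extend(chosen + [n], [m for m in rest if compatible(n, m)])
                   for n in rest)

    return extend([], nodes)
```

For example, `p(X, Y), q(Y, Z)` θ-subsumes `p(a, b), q(b, c)` via the substitution {X→a, Y→b, Z→c}, but not `p(a, b), q(c, d)`, where no consistent binding for Y exists.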