Macalester WikiBrain development team, Summer 2013
Brief Introduction
WikiBrain’s busy thinking up its first public release. Please be patient while we fine-tune our APIs and complete our documentation. Ask us questions at the WikiBrain Google group!
The WikiBrain Java library enables researchers and developers to incorporate state-of-the-art Wikipedia-based algorithms and technologies in a few lines of code.
If you’d like to cite WikiBrain, please use: Sen, Shilad, Toby Jia-Jun Li, WikiBrain Team, and Brent Hecht. “WikiBrain: Democratizing computation on Wikipedia.” In Proceedings of The International Symposium on Open Collaboration, p. 27. ACM, 2014.
WikiBrain Features
WikiBrain is easy to use. Wikipedia data can be downloaded, parsed, and imported into a database by running a single command. WikiBrain allows you to incorporate state-of-the-art algorithms in your Java projects in just a few lines of code.
WikiBrain is multilingual. WikiBrain supports all 267 Wikipedia language editions, and builds a concept map that connects an article in one language to the same article in another language.
WikiBrain is fast. WikiBrain uses single-machine parallelization (i.e. multi-threading) for all computationally intensive features. While it imports data into standard SQL databases (H2 or Postgres), it builds optimized local caches for critical data.
WikiBrain integrates a variety of specific algorithms and datasets in one framework, including:
Semantic-relatedness algorithms that measure the strength of association between two concepts such as “racecar” and “engine.”
Geospatial algorithms for Wikipedia pages about places, such as Minnesota and the Eiffel Tower.
Wikidata: Support for structured Wikidata “facts” about articles.
Pageviews: Public data about how often Wikipedia pages are viewed with hourly precision.
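To make the idea of semantic relatedness more concrete, here is a minimal, self-contained sketch of one common way to score the association between two concepts: represent each concept as a sparse vector of the articles it links to, then take the cosine of the two vectors. This is only an illustration and not WikiBrain’s API or its actual algorithms; the class name RelatednessSketch, the helper methods, and the article ids are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a toy relatedness measure, NOT part of WikiBrain.
public class RelatednessSketch {

    // Builds a binary sparse vector from hypothetical article ids
    // (each id stands for an article the concept links to).
    static Map<Integer, Double> vector(int... articleIds) {
        Map<Integer, Double> v = new HashMap<>();
        for (int id : articleIds) {
            v.put(id, 1.0);
        }
        return v;
    }

    // Euclidean length of a sparse vector.
    static double norm(Map<Integer, Double> v) {
        double sum = 0.0;
        for (double x : v.values()) {
            sum += x * x;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity: 0.0 for vectors with no shared articles,
    // close to 1.0 for vectors pointing in the same direction.
    static double cosine(Map<Integer, Double> a, Map<Integer, Double> b) {
        double dot = 0.0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        if (na == 0.0 || nb == 0.0) {
            return 0.0; // an empty vector is related to nothing
        }
        return dot / (na * nb);
    }

    public static void main(String[] args) {
        // Made-up link sets: the two concepts share articles 2 and 3.
        Map<Integer, Double> racecar = vector(1, 2, 3);
        Map<Integer, Double> engine = vector(2, 3, 4);
        System.out.println("relatedness: " + cosine(racecar, engine));
    }
}
```

With these made-up link sets, the two concepts share two of their three links, so the score works out to 2/3. A real system would weight links and draw on far larger vectors, but the overall shape of the computation is similar.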
An example program
Once you have imported data, you can write programs that analyze Wikipedia. Here’s a simple example you can find in the Cookbook:
// Prepare the environment
When you run this program, you’ll see output:
resolution of apple
Developer
WikiBrain development is led by Shilad Sen at Macalester College and Brent Hecht at the University of Minnesota. WikiBrain has been generously supported by the National Science Foundation, Macalester College, the Howard Hughes Medical Institute, and the University of Minnesota. WikiBrain is licensed under the Apache License, version 2.
WikiBrain has been made possible through substantial contributions by many students, including: Alan Morales Blanco, Margaret Giesel, Rebecca Gold, Becca Harper, Ben Hillman, Sam Horlbeck, Aaron Jiang, Matthew Lesicko, Toby Li, Yulun Li, Huy Mai, Ben Mathers, Sam Naden, Jesse Russell, Laura Sousa Vonessen, Zixiao Wang, and Ari Weilland.