The ScienceBlog corpus is a selection of posts and comments from the website. Publication dates range from 2006 to the beginning of 2014. Posts and their related comments share a common doc.title attribute. The corpus is tagged with TreeTagger using the Penn tagset.
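Because a post and its comments carry the same doc.title value, that attribute can serve as a key for reassembling discussion threads. The following is a minimal sketch of that idea, not the corpus's actual access method: the record layout (the keys `title`, `kind`, and `tokens`), the helper names `group_by_title` and `tag_counts`, and the sample records are all invented for illustration. Tokens are modelled as (word, tag, lemma) triples with Penn-style tags.

```python
from collections import defaultdict

# Hypothetical records. The keys "title", "kind", and "tokens" are assumptions
# made for this sketch; they are not the corpus's real storage format.
# Each token is a (word, Penn-style tag, lemma) triple.
documents = [
    {"title": "Example post", "kind": "post",
     "tokens": [("Cells", "NNS", "cell"), ("divide", "VBP", "divide")]},
    {"title": "Example post", "kind": "comment",
     "tokens": [("Interesting", "JJ", "interesting")]},
    {"title": "Another post", "kind": "post",
     "tokens": [("Stars", "NNS", "star"), ("shine", "VBP", "shine")]},
]


def group_by_title(docs):
    """Collect a post and its comments under their shared title."""
    threads = defaultdict(list)
    for doc in docs:
        threads[doc["title"]].append(doc)
    return dict(threads)


def tag_counts(docs):
    """Count how often each part-of-speech tag occurs across the documents."""
    counts = defaultdict(int)
    for doc in docs:
        for _word, tag, _lemma in doc["tokens"]:
            counts[tag] += 1
    return dict(counts)


threads = group_by_title(documents)
for title, members in threads.items():
    kinds = [doc["kind"] for doc in members]
    print(title, kinds, tag_counts(members))
```

Grouping on the title alone assumes that titles are unique per post; if two posts shared a title, their comments would be merged into one thread.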

ScienceBlog was prepared in 2014 by Akshay Minocha (