The ScienceBlog corpus is a selection of posts and comments from the scienceblogs.com website. Publication dates range from 2006 to early 2014. Each post and its related comments share the same doc.id and doc.title attributes. The corpus is part-of-speech tagged using TreeTagger with the Penn Treebank tagset.
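
Because a post and its comments share a doc.id, a discussion thread can be rebuilt by grouping on that attribute. The following is a minimal sketch of that idea, not the corpus's actual access method: it assumes the documents have already been loaded as Python dictionaries, and the keys "kind" and "text", the function name group_by_doc, and the sample records are all hypothetical placeholders. Only doc_id and doc_title correspond to the doc.id and doc.title attributes described above.

```python
from collections import defaultdict

# Hypothetical in-memory records. Only doc_id and doc_title mirror the
# corpus attributes doc.id and doc.title; "kind" and "text" are assumed
# names, and the values are invented placeholders, not corpus data.
records = [
    {"doc_id": "1", "doc_title": "Example post", "kind": "post", "text": "..."},
    {"doc_id": "1", "doc_title": "Example post", "kind": "comment", "text": "..."},
    {"doc_id": "2", "doc_title": "Another post", "kind": "post", "text": "..."},
]


def group_by_doc(records):
    """Group records by doc_id so each post sits with its comments."""
    threads = defaultdict(list)
    for rec in records:
        threads[rec["doc_id"]].append(rec)
    return dict(threads)


threads = group_by_doc(records)

# Report the title and number of comments for each thread.
for doc_id, items in threads.items():
    n_comments = sum(1 for r in items if r["kind"] == "comment")
    print(doc_id, items[0]["doc_title"], n_comments)
```

Grouping on doc_id rather than doc_title avoids merging distinct posts that happen to have the same title.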

ScienceBlog was prepared in 2014 by Akshay Minocha (akshayminocha5@gmail.com).