Yoruba Web as Corpus. The corpus was compiled in June 2015, is encoded in UTF-8, and has not yet been tagged. It contains 2.8 million words.