Week 12: Final Paper

Workshop: Monday 11 May 2026, in class · Paper due: Friday 5 June 2026, 23:59 (Brightspace)

The final paper is a short research report (2,500–6,000 words) using one corpus from the curated dataset menu, supported by a public, FAIR-structured GitHub replication repository. You can do the analysis in Orange Data Mining or in R — your choice; R is an option, not a requirement. The paper is marked out of 10.

Brief (PDF, 4 pp) · Rubric (PDF, 1 pp) · Dataset Menu (GitHub, 11 corpora)

Workshop (Monday 11 May)

Come prepared. Before class, pick one corpus from the dataset menu and draft a research question, phrased as a single sentence, that the corpus can answer. The question and the dataset need to fit together.

In class, you will present your dataset and question to me for review. After that, you will workshop your analysis plan with classmates working on different corpora.

If you would prefer a corpus from scdenney/nlp_corpora that is not on the menu, or another corpus entirely, email me before the workshop. I will consider it, but you need my approval before using it.

What you submit

By 23:59 on Friday 5 June 2026, on Brightspace: your paper as a single PDF (2,500–6,000 words). There is no separate field for a repository URL — embed the link to your public, FAIR-structured replication repository as a footnote on the paper’s title, using the suggested replication-package wording in the brief.

Full requirements, structure, and grading detail are in the brief and rubric.