Week 12: Final Paper
The final paper is a short research report using one corpus from the curated dataset menu, supported by a public, FAIR-structured GitHub replication repository. It must be 2,500–6,000 words, excluding references, figures/tables, and appendix material. The 2,500-word minimum is firm; papers may exceed 6,000 words by up to 10% (that is, up to 6,600 words) before a length penalty applies. You can do the analysis in Orange Data Mining or in R; the choice is yours, and R is an option, not a requirement. The paper is marked out of 10.
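If it helps to check a draft against the length rule, the arithmetic can be sketched as below. This is only an illustration, not an official checker: the function name and the category labels are hypothetical, the word count passed in is assumed to already exclude references, figures/tables, and appendix material, and treating a paper of exactly 6,600 words as still penalty-free is an assumption about how "up to 10%" is read.

```python
# Hypothetical helper illustrating the length rule; not an official tool.
MIN_WORDS = 2500          # firm minimum
MAX_WORDS = 6000          # nominal maximum
TOLERANCE_PERCENT = 10    # allowed overshoot before a penalty applies


def check_length(word_count: int) -> str:
    """Classify a word count that already excludes references,
    figures/tables, and appendix material."""
    # Integer arithmetic avoids floating-point rounding: 6000 * 110 // 100 = 6600
    upper_with_tolerance = MAX_WORDS * (100 + TOLERANCE_PERCENT) // 100
    if word_count < MIN_WORDS:
        return "below minimum"
    if word_count <= MAX_WORDS:
        return "within range"
    # Assumption: exactly 6,600 words still counts as within the tolerance.
    if word_count <= upper_with_tolerance:
        return "within tolerance"
    return "length penalty"


if __name__ == "__main__":
    for n in (2400, 4800, 6300, 6700):
        print(n, check_length(n))
```

The brief and rubric remain the authority on how length is counted and penalised.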
Workshop (Monday 11 May)
Come prepared. Before class, pick one corpus from the dataset menu and draft a research question — phrased as a single sentence — that the corpus can answer. Question and dataset need to fit together.
In class, you will present your dataset and question to me for review. After that, you will workshop your analysis plan with classmates working on different corpora.
If you would prefer a corpus from scdenney/nlp_corpora that is not on the menu, or another corpus entirely, email me before the workshop. I will consider it, but it needs approval.
What you submit
By 23:59 on Friday 5 June 2026, submit one PDF on Brightspace. The PDF must contain the paper itself and a footnote on the title with the URL of your public, FAIR-structured replication repository. There is no separate Brightspace field for the repository URL, so the title footnote is where the link must appear. Use the suggested replication-package wording in the brief.
Full requirements, structure, and grading detail are in the brief and rubric.