Week 2 Deliverable: From Raw Text to Visual Patterns

Course: Topical Reading: Digital Humanities (BA3 Korean Studies)
Date: October 17
Instructors: Aron van der Pol & Steven Denney


Objective

Recreate the in-class DEMO (Oct 17) workflow to understand how preprocessing affects text representation and visualization in Orange Data Mining (ODM).


Tasks

  1. Replicate the DEMO workflow
    • Import the provided .csv corpus.
    • Apply basic preprocessing and the other steps from the DEMO in ODM.
    • Create a Word Cloud and a Bar Chart of Word Frequencies.
    • For the bar chart, select one word of personal or analytical interest and explore its relative frequency.
  2. Document your workspace
    • Screenshot your Orange canvas showing the full workflow.
    • Screenshot or export both the Word Cloud and Bar Chart outputs.
  3. Write your reflection (README.md)
    • Explain briefly what you did, why you did it, and what you learned about preprocessing and visualization.
    • Around 200 words is fine. You can write more.
  4. Save your .ows file (this is your ODM file/workflow).
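
For intuition about what the preprocessing and word-frequency steps in Task 1 do, here is a minimal Python sketch using only the standard library. It is not the Orange workflow and does not call Orange's API. The sample texts, the stop-word list, and the function names are all hypothetical and chosen for illustration only.

```python
import re
from collections import Counter

# Hypothetical stop-word list (an assumption for this sketch);
# Orange's own preprocessing uses its own lists.
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "to", "is"}

def preprocess(text, remove_stop_words=True):
    """Lowercase, split into letter-only tokens, optionally drop stop words."""
    # [^\W\d_]+ matches runs of Unicode letters, so Hangul is kept as well.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens

def relative_frequency(word, tokens):
    """Share of all tokens that equal `word` (0.0 for an empty list)."""
    if not tokens:
        return 0.0
    return Counter(tokens)[word] / len(tokens)

# Hypothetical two-document corpus standing in for the provided .csv.
docs = [
    "The history of Korea and the history of Seoul.",
    "Seoul is the capital of Korea.",
]

raw = [t for d in docs for t in preprocess(d, remove_stop_words=False)]
clean = [t for d in docs for t in preprocess(d)]

# The counts behind a bar chart or word cloud.
print(Counter(clean).most_common(3))

# The same word, before and after stop-word removal.
print(relative_frequency("korea", raw), relative_frequency("korea", clean))
```

The word "korea" appears twice in both token lists, but its relative frequency is higher after stop-word removal because the total number of tokens shrinks. This is one concrete way preprocessing changes what a frequency chart shows.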

Deliverables

Place the following files in your GitHub repo folder for the week02 assignment:

Folder structure:

/week02/
│
├── workflow_screenshot.png
├── wordcloud.png
├── barchart.png
├── week02.ows
└── README.md
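
Before pushing, you may want to confirm that all five files are in place. The following is a small optional Python sketch, assuming it is run from the root of your repo and that the folder is named week02. The function name missing_files is hypothetical.

```python
from pathlib import Path

# File names taken from the folder structure above.
REQUIRED = [
    "workflow_screenshot.png",
    "wordcloud.png",
    "barchart.png",
    "week02.ows",
    "README.md",
]

def missing_files(folder):
    """Return the required file names that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in REQUIRED if not (folder / name).is_file()]

if __name__ == "__main__":
    # Assumes the current directory is the repo root.
    missing = missing_files("week02")
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All files present.")
```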

Optional (R Track)

For students in the R Programming extension, complete:
DataCamp – Introduction to Text Analysis in R

R track assignments are due before the next class.