Overview

Course: BA2 Korean Studies, Leiden University
Instructor: Dr. Steven Denney
Time: Mondays, 15:15-17:00
Location: Huizinga 0.09 (DH Lab) & Arsenaal B0.05
Duration: 12 sessions (February 02 - May 18)


Brief Course Description

This course introduces computational text analysis as a research method in Korean and area studies. You will learn to treat text as data, transforming written sources into formats that can be analyzed computationally. The course emphasizes Korean-language primary sources, which you may supplement with materials in other languages; all sources will be organized and analyzed as text corpora. The course also reviews recent research that applies digital tools and methods in the digital humanities and computational social sciences.
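As a rough illustration of what "treating text as data" can mean, the sketch below turns three short texts into a document-term matrix, a table of word counts per document. It is written in Python only for illustration (the course itself uses Orange Data Mining and R), and the sample sentences, the function names, and the whitespace tokenizer are all assumptions made for this example rather than course materials. Korean text normally needs morphological analysis instead of simple whitespace splitting.

```python
from collections import Counter

# Hypothetical mini-corpus; these sentences are illustrative only.
documents = [
    "한국 경제 성장",
    "한국 문화 와 역사",
    "경제 정책 과 경제 성장",
]


def tokenize(text):
    """Split on whitespace (a simplification; see the note above)."""
    return text.split()


def document_term_matrix(docs):
    """Return (vocabulary, rows); rows[i][j] counts vocabulary[j] in docs[i]."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocabulary = sorted({term for c in counts for term in c})
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


vocabulary, matrix = document_term_matrix(documents)

# Print one row per document: each number is a word count.
print(vocabulary)
for row in matrix:
    print(row)
```

Each row of the matrix represents one document and each column one word, which is the kind of structured format that later steps such as clustering or classification can take as input.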

While this course is designed primarily for students in the Korean Studies program at Leiden University, it welcomes students from other programs and will support the use of primary source materials in languages other than Korean.

Using Orange Data Mining, a visual platform that makes computational methods accessible without advanced programming, you will work through the complete text analysis pipeline: text preprocessing, descriptive analysis, clustering, classification, and topic modeling.

You will also develop foundational R programming skills through guided tutorials. No prior programming experience is required.

The course culminates in a Research Methods Project applying text analysis to Korean-language materials.

Before the first class, please complete the software installation steps in the Getting Started guide.


Learning Objectives

By the end of this course, you will be able to:

  1. Apply text preprocessing, descriptive analysis, clustering, classification, and topic modeling
  2. Apply best practices in data management and transparency
  3. Establish a foundation in the R programming language
  4. Reflect on the strengths and limitations of computational methods in research

Weekly Schedule

Week  Date     Topic
1     Feb. 02  Introduction & Getting Started
2     Feb. 09  Foundations of Computational Text Analysis
3     Feb. 16  Text Preprocessing Basics
4     Feb. 23  Text Preprocessing Practice
5     Mar. 02  Descriptive Patterns in Text
6     Mar. 09  Midterm Review & Assessment
7     Mar. 16  Clustering
8     Mar. 30  Classification I - Dictionary & Rule-Based
9     Apr. 13  Classification II - Machine Learning (SVM)
10    Apr. 20  Topic Modeling (LDA)
11    May 11   Final Review & Assessment
12    May 18   Research Methods Project Workshop

See the Syllabus for detailed weekly content, readings, and assignments.


Resources

Textbook

Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). Text as Data: A New Framework for Machine Learning and the Social Sciences. Princeton University Press.

Tools & Platforms


This course is part of the Korean Studies program in the Humanities Faculty at Leiden University.