Word Clouds: TF–IDF vs. Word Embeddings
(Files: cluster_wordclouds_tfidf.png and cluster_wordclouds_wordembeddings.png)
Due: Thursday, November 20, 2025 by 17:00
For this task, examine the two provided images:
- cluster_wordclouds_tfidf.png: word cloud based on TF–IDF weights.
- cluster_wordclouds_wordembeddings.png: word cloud based on word-embedding (vector) similarity.
Your Task
Write a short response that clearly addresses:
- What each image is showing.
  Describe what you see in each cloud and what it suggests about the “modern (South) Korea” cluster.
- Why the metrics themselves are different.
  Explain, as technically and specifically as you can,
  - what TF–IDF measures,
  - what word embeddings measure,
  - and why these represent fundamentally different mathematical approaches to text (e.g., frequency-based vs. distributional/semantic vector representations).
Keep it concise but technically accurate. The goal is to show that you understand the difference in what the metrics quantify, not just the visual outcome.
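As a point of reference for the frequency-based vs. vector-based distinction, the two quantities can be sketched on toy data. This is only an illustration under stated assumptions: the three-document corpus, the 3-dimensional vectors in `emb`, and the helper names `tf_idf` and `cosine` are all hypothetical and are not the data or pipeline used to produce the provided images. The TF–IDF variant shown is the plain `tf * log(N / df)` form; real libraries often apply smoothing or normalization.

```python
import math
from collections import Counter

# Hypothetical toy corpus (NOT the assignment's data): each document is a token list.
docs = [
    "seoul economy export growth".split(),
    "seoul kpop culture export".split(),
    "weather rain forecast".split(),
]

def tf_idf(term, doc, corpus):
    """Plain TF-IDF: relative frequency in `doc` times log(N / document frequency)."""
    tf = Counter(doc)[term] / len(doc)
    df = sum(1 for d in corpus if term in d)
    # A term that appears in no document gets weight 0 (avoids division by zero).
    return tf * math.log(len(corpus) / df) if df else 0.0

def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-d "embeddings"; real ones are learned from co-occurrence
# patterns and typically have hundreds of dimensions.
emb = {
    "seoul": [0.9, 0.1, 0.0],
    "korea": [0.8, 0.2, 0.1],
    "rain":  [0.0, 0.1, 0.9],
}

# TF-IDF is a count-based weight of a term *within a document*, relative to the corpus.
tfidf_seoul = tf_idf("seoul", docs[0], docs)      # appears in 2 of 3 documents
tfidf_economy = tf_idf("economy", docs[0], docs)  # appears in 1 of 3 documents

# Embedding similarity is a geometric relation *between two words*.
sim_seoul_korea = cosine(emb["seoul"], emb["korea"])
sim_seoul_rain = cosine(emb["seoul"], emb["rain"])

print(tfidf_seoul, tfidf_economy)
print(sim_seoul_korea, sim_seoul_rain)
```

In this sketch, "economy" outweighs "seoul" in the first document because it occurs in fewer documents, while "korea" never occurs in the corpus at all and so gets a TF–IDF weight of zero, yet it can still sit close to "seoul" in the vector space. That contrast is the kind of difference the response should explain in its own words.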