How Latin America is Perceived in UN General Debates

Author

Wendy Chavez Tacuri

Published

May 6, 2024

1 Introduction

Latin America is often discussed in global politics, but it’s not always clear how much attention the region actually gets at the international level, especially in formal settings like the United Nations General Assembly (UNGA). This project looks at how often Latin American countries are mentioned in UNGA speeches, using a dataset that covers speeches from 1970 to today.

The idea behind this was to see which countries get the most attention and whether this can tell us something about their political or global relevance. I used Python to clean and analyze the text data, and then used R to visualize the results. Later on, I also explored sentiment analysis, to see not just how often countries are mentioned, but in what tone.

1.1 Visualization with ggplot2

On the map, darker shades mean more mentions.

To visualize how often Latin American countries are mentioned in UN General Debate speeches, I prepared the data in Python and plotted it in R with ggplot2.

I used Python to:

  • Load and clean the dataset of UN speeches

  • Define a list of Latin American countries

  • Count how many times each country was mentioned in the text of all speeches

  • Export this data into a CSV file

This resulted in a file:

latin_america_mentions.csv, containing two columns: country and mentions.
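The counting steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual script: the shortened country list, the sample speeches, and the helper names `count_mentions` and `mentions_to_csv` are assumptions made for the example.

```python
import csv
import io
import re

# Shortened list for illustration; the project uses 19 countries.
LATIN_AMERICA = ["Cuba", "Panama", "Mexico", "Costa Rica"]


def count_mentions(speeches, countries):
    """Count case-insensitive whole-word occurrences of each country name."""
    counts = {}
    for country in countries:
        pattern = re.compile(r"\b" + re.escape(country) + r"\b", re.IGNORECASE)
        counts[country] = sum(len(pattern.findall(text)) for text in speeches)
    return counts


def mentions_to_csv(counts):
    """Serialize the counts as CSV text with the columns country, mentions."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["country", "mentions"])
    # Most-mentioned countries first, as in the mentions table.
    for country, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        writer.writerow([country, n])
    return buffer.getvalue()


# Made-up sample speeches, not taken from the UN corpus.
sample_speeches = [
    "Cuba and Panama raised the canal question; Cuba spoke twice.",
    "Mexico supports the treaty, as does Costa Rica.",
]
mention_counts = count_mentions(sample_speeches, LATIN_AMERICA)
csv_text = mentions_to_csv(mention_counts)
print(csv_text)
```

Matching on whole words avoids counting a name inside a longer word, but it still counts every occurrence, so a speech that repeats a country name several times contributes several mentions.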

1.2 Mentions table

The following table shows each Latin American country sorted by the number of mentions.

country          mentions
Cuba             2023
Panama           1867
Mexico           1635
Nicaragua        1518
Chile            1479
Guatemala        1386
Brazil           1304
Uruguay          1213
Argentina        1206
El Salvador      1103
Bolivia          1095
Ecuador          1089
Venezuela        1034
Honduras         1005
Colombia         995
Peru             963
Paraguay         832
Costa Rica       786
Dominican Rep.   524

1.3 Mentions Bar Plot

This bar chart helps visualize the rankings of countries by how often they’re mentioned.

2 Textual Analysis

2.1 Word Frequency

Here, I identified the most commonly used words in UN speeches that mention Latin America to uncover dominant narratives or themes.

This chart shows the 30 most common words used in UN speeches that mention Latin America.

  • The most frequent words are general terms like “nations”, “international”, and “countries”, which makes sense since these are speeches from the United Nations.

  • We also see words like “peace”, “development”, “economic”, and “security”. These suggest that Latin America is often talked about in connection with global cooperation, economic growth, and peace efforts.

Other words like “people”, “support”, and “efforts” show a focus on helping communities or promoting shared values.
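A word-frequency count of this kind can be sketched as below. It is only an illustration under stated assumptions: the tiny stopword list, the sample sentences, and the function name `top_words` are hypothetical, and the project may have used a different tokenizer and stopword list.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; an assumption, not the list used in the project.
STOPWORDS = {"the", "of", "and", "in", "to", "a", "for", "is", "are", "we"}


def top_words(texts, n=30, stopwords=STOPWORDS):
    """Return the n most frequent non-stopword tokens across all texts."""
    counter = Counter()
    for text in texts:
        # Lowercase the text and keep alphabetic tokens only.
        tokens = re.findall(r"[a-z]+", text.lower())
        counter.update(t for t in tokens if t not in stopwords)
    return counter.most_common(n)


# Made-up sentences, not taken from the UN corpus.
sample_texts = [
    "The nations of Latin America seek peace and development.",
    "Peace in the region is a goal for all nations.",
    "International cooperation supports peace.",
]
top = top_words(sample_texts, n=3)
print(top)
```

Removing stopwords matters here: without it, the top of the list would be dominated by words like “the” and “of” rather than by terms that carry a theme.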

2.2 Wordclouds

To complement the frequency chart, I decided to create a wordcloud to offer a visual representation of the most frequently used words.

This wordcloud shows the most common terms found in UN speeches that mention Latin America.

Words like “nations”, “international”, “countries”, and “peace” are some of the largest, meaning they were used a lot.

This tells us that when Latin America is mentioned at the UN, it’s often in the context of international cooperation, global peace, and development.

It’s also interesting that words like “security”, “economic”, and “development” appear often. These point to the main topics and concerns being discussed.

2.3 Sentiment analysis

In this section, I took a different approach from the one we used in class. To keep things efficient, I used TextBlob, a Python library that assigns a basic polarity score to each text. This method doesn’t capture the full complexity of political language, but it still provides a useful overview of how the tone of speeches mentioning Latin America has evolved over time.

Later in the project, I did use the GPT API to interpret the topics generated by LDA, since I felt that was a moment where a more interpretive model was really needed. But for this first sentiment trend, a lighter approach felt like a better fit.

This is a line plot showing the average sentiment score of UN speeches that mention Latin America, for each year.

As mentioned above, I used TextBlob. It automatically analyzes the text and gives a polarity score:

  • +1 = very positive

  • 0 = neutral

  • –1 = very negative

In this chart:

Each dot is the average polarity of all speeches that mention Latin America in that year.

A score around 0.11 means speeches were generally slightly positive in tone.
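The yearly averaging can be sketched in plain Python as follows. The polarity values here are made up for illustration (real ones would come from TextBlob), and the helper names are hypothetical. The second function mirrors the centered 3-year rolling average that the appendix plot overlays on the annual line.

```python
from collections import defaultdict
from statistics import mean


def yearly_average(scored_speeches):
    """Average polarity per year from (year, polarity) pairs, sorted by year."""
    by_year = defaultdict(list)
    for year, polarity in scored_speeches:
        by_year[year].append(polarity)
    return {year: mean(values) for year, values in sorted(by_year.items())}


def centered_rolling_average(values, window=3):
    """Centered rolling mean for an odd window; edges without a full window get None."""
    half = window // 2
    result = []
    for i in range(len(values)):
        if i - half < 0 or i + half >= len(values):
            result.append(None)
        else:
            result.append(mean(values[i - half : i + half + 1]))
    return result


# Made-up polarity scores in [-1, 1], one pair per speech.
sample = [(1970, 0.10), (1970, 0.14), (1971, 0.08), (1972, 0.12), (1973, 0.16)]
averages = yearly_average(sample)
rolling = centered_rolling_average(list(averages.values()))
print(averages)
print(rolling)
```

The first and last years have no rolling value because a centered window needs a neighbor on both sides.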

2.4 Choosing the Number of Topics

To decide how many topics to use in my LDA model, I tested different values and calculated their semantic coherence.

As shown in the chart below, coherence peaked at 4 topics.

# Python: coherence scores for each candidate number of topics
import pandas as pd

data = [
    {"num_topics": 2, "coherence": 0.305},
    {"num_topics": 3, "coherence": 0.317},
    {"num_topics": 4, "coherence": 0.332},
    {"num_topics": 5, "coherence": 0.325},
    {"num_topics": 6, "coherence": 0.319},
    {"num_topics": 7, "coherence": 0.312}
]
coherence_df = pd.DataFrame(data)

# R: plot the coherence scores
library(dplyr)
library(ggplot2)

# Use dataframe from Python
coherence_df <- reticulate::py$coherence_df

ggplot(data = coherence_df) +
  # linewidth replaces size for lines in ggplot2 3.4.0 and later
  geom_line(aes(x = num_topics, y = coherence), color = "darkgreen", linewidth = 1.2) +
  geom_point(aes(x = num_topics, y = coherence), color = "lightgreen", size = 2) +
  scale_x_continuous(breaks = seq(min(coherence_df$num_topics), max(coherence_df$num_topics), by = 1)) +
  theme_bw() +
  theme(legend.position = "bottom") +
  labs(
    title = "Semantic Coherence by Number of Topics",
    x = "Number of Topics", y = "Coherence Score"
  )
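The choice of 4 topics can also be read directly from the scores rather than from the chart. The short sketch below reuses the coherence values listed above; the variable names are my own, and the margin over the runner-up is included only to show how close the candidates are.

```python
# Coherence scores copied from the list above.
coherence_scores = [
    {"num_topics": 2, "coherence": 0.305},
    {"num_topics": 3, "coherence": 0.317},
    {"num_topics": 4, "coherence": 0.332},
    {"num_topics": 5, "coherence": 0.325},
    {"num_topics": 6, "coherence": 0.319},
    {"num_topics": 7, "coherence": 0.312},
]

# Rank candidates from highest to lowest coherence.
ranked = sorted(coherence_scores, key=lambda row: row["coherence"], reverse=True)
best, runner_up = ranked[0], ranked[1]
margin = round(best["coherence"] - runner_up["coherence"], 3)

print(best["num_topics"], runner_up["num_topics"], margin)
```

The gap between 4 and 5 topics is small (0.007), so the peak is best treated as a reasonable choice rather than a sharp optimum.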

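2.5 Topic Modeling with LDA

While sentiment analysis helped capture the general tone of the speeches, I was also interested in uncovering the main topics that appeared repeatedly across the texts. To do this, I used Latent Dirichlet Allocation (LDA), a topic model that groups words that tend to occur together into topics.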

This graph shows the top 10 words for each topic discovered by the LDA model.

Each row represents a topic found in the speeches that mention Latin America.

The words inside each row are the most important for that topic, meaning they have the highest probability (beta) within that topic.

The colors show how strong or dominant each word is within its topic: brighter colors = more importance.

Some examples:

Topic 1 includes words like union, democracy, terrorism, and climate — which suggests this topic could be about international cooperation and global challenges.

Topic 2 includes latin, american, powers, and soviet — possibly connected to geopolitical influence and Cold War themes.

Topic 3 has terms like israel, arab, treaty, and reform — hinting at topics involving Middle East politics or peace processes.

Topic 4 contains democracy, regime, apartheid, and debt — suggesting a theme related to political systems and human rights.

This kind of analysis helps us move beyond word counts: it shows which themes or narratives are being discussed when Latin America comes up in UN speeches.

2.6 GPT based interpretation

After identifying the main topics in the speeches using LDA, I wanted to better understand what each of those topics actually meant. Just looking at the top words is helpful, but not always easy to interpret clearly.

I therefore used the GPT API to analyze the top 10 words from each topic and summarize what each topic is about. I asked GPT to tell me the overall tone (positive, negative, or neutral), and whether each topic focused more on opportunity, crisis, development, or conflict.

This step helped me translate the model output into plain language and see the different ways Latin America is framed across UN speeches.
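The exact prompt is not reproduced in this report, so the sketch below only illustrates how such a request could be assembled from a topic's top words. The wording of the prompt and the function name `build_topic_prompt` are hypothetical, and the call to the GPT API itself is deliberately left out.

```python
TONES = ("positive", "negative", "neutral")
FRAMES = ("opportunity", "crisis", "development", "conflict")


def build_topic_prompt(topic_id, top_words):
    """Compose a plain-text prompt asking for a topic summary, tone, and frame."""
    word_list = ", ".join(top_words)
    return (
        f"These are the top words for topic {topic_id} from UN speeches "
        f"that mention Latin America: {word_list}.\n"
        "Summarize what the topic is about.\n"
        f"State the overall tone ({', '.join(TONES)}).\n"
        f"State whether it focuses more on {', '.join(FRAMES[:-1])}, or {FRAMES[-1]}."
    )


# Words taken from the Topic 4 description above; the prompt wording is hypothetical.
prompt = build_topic_prompt(4, ["democracy", "regime", "apartheid", "debt"])
print(prompt)
```

Keeping the tone and frame options fixed in the prompt makes the four answers easier to compare with each other.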

Topic 1

 Based on the key words provided, the topic of the speeches about Latin America is likely related to democracy and regional unity in the context of addressing issues such as terrorism, climate change, aggression, and possibly discussions about the Americas region. The references to Namibia, the Arab world, and the Non-Aligned Movement (NAM) suggest a broader international perspective on these issues.

In terms of tone, it seems to be a mix of both positive and negative elements. While the emphasis on democracy and union may suggest positive intentions, the presence of words like terrorism, aggression, and climate signals a recognition of challenges and threats within the region.

The focus of this topic appears to be on addressing crises and conflicts, with an emphasis on cooperation among nations to tackle common challenges such as terrorism and climate change. Overall, the topic likely revolves around the need for regional unity and international cooperation to address pressing issues in Latin America.

Topic 2

 Based on the top words provided, the topic is likely about the political dynamics and interactions within Latin America, while also referencing external actors such as the Soviet Union, Europe, Israel, and Arab countries. The presence of words like "powers," "terrorism," and "third" suggest a discussion on power dynamics, security issues, and potentially the Non-Aligned Movement or the developing world (often referred to as the Third World).

The tone suggested by these words leans towards a more serious and possibly tense atmosphere, given the mentions of terrorism and references to major global actors like the Soviet Union and key regions like Europe and the Middle East.

In terms of the narrative, this collection of words indicates a mix of regional power struggles and external influences. While there are references to cooperation (e.g., union), the mention of Cold War-related terms like "Soviet" and the focus on powers and potential tensions (terrorism, Israel, Arab) suggest a narrative that includes elements of both regional power struggles and Cold War framing.

Topic 3

 Based on the top words from this topic in UN speeches about Latin America, it suggests that Latin America is often framed in a global context, with references to global issues such as Israel, terrorism, European countries, and Soviet aggression. This framing implies that Latin America's role and standing in international affairs are seen as interconnected with larger global dynamics.

The inclusion of words like "treaty" and "reform" could indicate efforts towards regional cooperation and the importance of international agreements in addressing regional challenges. However, the presence of words such as "Israel", "Arab", "terrorism", and "aggression" suggests that external influences and conflicts also play a significant role in shaping the discourse around Latin America.

Overall, it appears that Latin America is often viewed through a lens that considers both external influences and regional cooperation as key factors in shaping the region's standing and relationships within the global community.

Topic 4

 Based on the top words from this topic, it appears that the speeches are focusing on issues related to democracy, regime change, challenges faced by Latin America, apartheid (likely in reference to South Africa), debt, and terrorism. 

The mention of democracy and regime suggests a discussion around political systems and governance, with a possible emphasis on promoting democratic principles and addressing challenges related to authoritarian regimes. The reference to apartheid indicates a discussion on issues of racial discrimination and human rights violations. Additionally, the mention of debt points to economic struggles and financial challenges faced by countries in Latin America.

Overall, the framing of these words suggests that the speeches are highlighting a mix of human rights issues, economic struggles, and international justice concerns in Latin America. It indicates a focus on promoting democracy, addressing human rights violations and economic inequalities, while also addressing security challenges such as terrorism.

This approach helped me understand more clearly how Latin America is framed in UN speeches. The region is often perceived as marked by political and social challenges, but also as a space for cooperation and international engagement. Sometimes it is framed within broader global dynamics, like the Cold War or international security, while in other cases it appears as a region in need of attention on issues such as democracy, human rights, and development.

3 Who is Framing Latin America?

Now I wanted to understand which countries are most engaged in discussing Latin America.

3.1 Bar Plot of Speaker Countries

The following bar plot shows the five countries whose speeches mention Latin America most often, that is, who is shaping the international narrative about the region.

After identifying which countries talk the most about Latin America, I wanted to explore how this attention has evolved over time. Is the region consistently discussed, or are there specific moments when it receives more focus in UN speeches?

3.2 How Has This Changed Over Time?

Now, to see how this attention has shifted over the years, I grouped the speeches by year and looked at how often Latin America was mentioned each year.

We can now observe how Latin America fades in and out of global debates.

This line chart shows how frequently Latin America was mentioned in speeches at the United Nations General Debate, year by year from the early 1970s through 2015. The data suggest that Latin America was a prominent topic during the Cold War era (e.g., due to U.S.–Soviet interest in the region, revolutions, debt crisis). Mentions dropped in later decades, possibly due to changing geopolitical priorities.

4 Conclusion

This project gave me the chance to explore how Latin America is perceived in UN General Debate speeches. I didn’t just look at how often the region is mentioned, but also how it’s talked about and what kind of narratives are associated with it.

What I learned is that Latin America is often seen as a region dealing with political and economic challenges, like issues of governance, debt, or security. But it’s also described as a place for international cooperation, especially when it comes to topics like peace, democracy, and development. In some speeches, the region is connected to broader global tensions, such as Cold War dynamics or international conflicts, which shows how external powers can influence the way Latin America is represented.

5 Appendix

5.1 Visualization

library(tidyverse)
library(maps)

mentions <- read_csv("C:/Users/rubi2/OneDrive - John Cabot University/Desktop/PL-CS/Final project/new dataset/latin_america_mentions.csv")

# Here, I adjusted Dominican Republic to match the map's naming format 
mentions$country[mentions$country == "Dominican Republic"] <- "Dominican Rep."

latin_america <- c(
  "Argentina", "Bolivia", "Brazil", "Chile", "Colombia", "Costa Rica", "Cuba",
  "Dominican Rep.", "Ecuador", "El Salvador", "Guatemala", "Honduras",
  "Mexico", "Nicaragua", "Panama", "Paraguay", "Peru", "Uruguay", "Venezuela"
)

world_map <- map_data("world")

map_mentions <- world_map %>%
  filter(region %in% latin_america) %>%
  left_join(mentions, by = c("region" = "country")) %>%
  mutate(mentions = replace_na(mentions, 0))

ggplot(map_mentions, aes(x = long, y = lat, group = group, fill = mentions)) +
  geom_polygon(color = "white", linewidth = 0.3) +
  coord_fixed(1.3) +
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  theme_void() +
  labs(title = "Mentions of Latin American Countries in UN Speeches",
       fill = "Mentions")

5.2 Mentions Bar Plot

library(tidyverse)

mentions <- read_csv("C:/Users/rubi2/OneDrive - John Cabot University/Desktop/PL-CS/Final project/new dataset/latin_america_mentions.csv")
mentions$country[mentions$country == "Dominican Republic"] <- "Dominican Rep."

ggplot(mentions, aes(x = reorder(country, mentions), y = mentions, fill = mentions)) +
  geom_col() +
  geom_text(aes(label = country), hjust = 1.1, color = "white", size = 3.5) +
  coord_flip() +
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  theme_void() +
  labs(title = "Mentions of Latin American Countries in UN Speeches", fill = "Mentions")

5.3 Top 30 most common words in UN speeches about Latin America

library(tidyverse)

top_words <- read_csv("C:/Users/rubi2/OneDrive - John Cabot University/Desktop/PL-CS/Final project/new dataset/top_words_latin_speeches.csv")

ggplot(top_words, aes(x = reorder(word, frequency), y = frequency, fill = frequency)) +
  geom_col() +
  coord_flip() +
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  theme_light() +
  labs(title = "Top 30 Words in Speeches About Latin America",
       x = "Word", y = "Frequency")

5.4 Wordcloud

library(tidyverse)
library(quanteda)
library(quanteda.textplots)

# Load filtered speeches
latin_df <- read_csv("latin_df_filtered.csv")

# Combine all text into one string
latin_text <- paste(latin_df$text, collapse = " ")

# Create corpus
corpus_latam <- corpus(latin_text)

# Tokenize and clean
tokens_latam <- tokens(corpus_latam,
                       remove_punct = TRUE,
                       remove_numbers = TRUE) %>%
  tokens_remove(pattern = stopwords("english"))

dfm_latam <- dfm(tokens_latam)

# Plot wordcloud
textplot_wordcloud(dfm_latam, max_words = 100)

5.5 Sentiment over time

from textblob import TextBlob
import pandas as pd

# Load the filtered speeches
latin_df = pd.read_csv("latin_df_filtered.csv")

# Apply TextBlob sentiment polarity
latin_df["sentiment"] = latin_df["text"].apply(lambda text: TextBlob(text).sentiment.polarity)

latin_df.to_csv("latin_df_with_sentiment.csv", index=False)

# R: load the scored speeches and plot the yearly averages
library(tidyverse)

sentiment_data <- read_csv("latin_df_with_sentiment.csv")

# Average sentiment per year
sentiment_by_year <- sentiment_data %>%
  group_by(year) %>%
  summarise(avg_sentiment = mean(sentiment, na.rm = TRUE)) %>%
  arrange(year) %>%
  mutate(
    rolling_avg = (lag(avg_sentiment) + avg_sentiment + lead(avg_sentiment)) / 3
  )

ggplot(sentiment_by_year, aes(x = year)) +
  # linewidth replaces size for lines in ggplot2 3.4.0 and later
  geom_line(aes(y = avg_sentiment, color = "Annual Sentiment"), linewidth = 1.2) +
  geom_point(aes(y = avg_sentiment, color = "Annual Sentiment"), size = 2) +
  geom_line(aes(y = rolling_avg, color = "3-Year Rolling Average"), linewidth = 1.3) +
  scale_color_manual(values = c("Annual Sentiment" = "#2a9134", "3-Year Rolling Average" = "steelblue")) +
  theme_light() +
  labs(
    title = "Sentiment Over Time: UN Speeches on Latin America",
    x = "Year", y = "Average Sentiment Score",
    color = "Legend"
  )

5.6 Topic Modeling with LDA

import pandas as pd
import re
from gensim import corpora
from gensim.models.ldamodel import LdaModel

# Load filtered speeches
latin_df = pd.read_csv("latin_df_filtered.csv")

# Preprocess text
latin_df["processed"] = latin_df["text"].apply(lambda x: re.sub(r'[^a-z\s]', '', x.lower()).split())

# Build dictionary and corpus
dictionary = corpora.Dictionary(latin_df["processed"])
dictionary.filter_extremes(no_below=10, no_above=0.5)
corpus = [dictionary.doc2bow(text) for text in latin_df["processed"]]

# Train LDA model
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=4, random_state=42)

# Get top terms per topic
top_terms = []
for topic_id in range(lda.num_topics):
    for term, prob in lda.show_topic(topic_id, topn=10):
        top_terms.append((topic_id + 1, term, prob))

# Format and save to CSV
top_terms_df = pd.DataFrame(top_terms, columns=["topic", "term", "beta"])
# Rank terms within each topic by probability (1 = highest beta)
top_terms_df["rank"] = top_terms_df.groupby("topic")["beta"].rank(method="dense", ascending=False)

top_terms_df.to_csv("top_terms_lda.csv", index=False)

# R: load the top terms and plot them as a tile chart
library(tidyverse)

result_df <- read_csv("top_terms_lda.csv")

ggplot(result_df, aes(y = reorder(topic, -topic), x = rank)) +
  geom_tile(aes(fill = beta)) +
  scale_fill_viridis_c() +
  geom_label(aes(label = term), fill = "white", size = 3) +
  labs(title = "Top Words per Topic from LDA Model",
       y = "Topic", x = "Top 10 Words") +
  theme_light() +
  theme(
    legend.position = "bottom",
    legend.key.width = unit(2, "cm"),
    legend.title = element_text(size = 10),
    legend.text = element_text(size = 9)
  )

5.7 Bar Plot of Speaker Countries

library(tidyverse)

speakers <- read_csv("C:/Users/rubi2/OneDrive - John Cabot University/Desktop/PL-CS/Final project/new dataset/speakers_about_latam.csv")

top_speakers <- speakers %>%
  arrange(desc(mentions)) %>%
  slice_max(order_by = mentions, n = 5)

ggplot(top_speakers, aes(x = reorder(country, mentions), y = mentions, fill = mentions)) +
  geom_col(width = 0.7) +
  geom_label(aes(label = country),
             fill = "white", color = "black",
             size = 2.7, hjust = 1.05) +
  coord_flip() +
  scale_fill_gradient(low = "#a8e6a1", high = "#2a9134") +
  theme_void() +
  labs(title = "Top 5 Countries Talking About Latin America in UN Debates", fill = "Mentions")

5.8 Change over time

# Python: count the filtered speeches per year (latin_df is the data frame loaded in 5.5 and 5.6)
mentions_by_year = latin_df.groupby("year").size().reset_index(name="mentions")
mentions_by_year.to_csv("C:/Users/rubi2/OneDrive - John Cabot University/Desktop/PL-CS/Final project/new dataset/mentions_over_time.csv", index=False)

# R: load the yearly counts
library(tidyverse)

mentions_time <- read_csv("C:/Users/rubi2/OneDrive - John Cabot University/Desktop/PL-CS/Final project/new dataset/mentions_over_time.csv")

mentions_time <- mentions_time %>%
  arrange(year) %>%
  mutate(
    rolling_avg = (lag(mentions, 1) + mentions + lead(mentions, 1)) / 3
  )


ggplot(mentions_time, aes(x = year)) +
  # linewidth replaces size for lines in ggplot2 3.4.0 and later
  geom_line(aes(y = mentions, color = "Annual Mentions"), linewidth = 1.2) +
  geom_point(aes(y = mentions, color = "Annual Mentions"), size = 2) +
  geom_line(aes(y = rolling_avg, color = "3-Year Rolling Average"), linewidth = 1.3) +
  scale_color_manual(values = c("Annual Mentions" = "#2a9134", "3-Year Rolling Average" = "steelblue")) +
  theme_light() +
  labs(
    title = "Mentions of Latin America in UN Speeches Over Time",
    x = "Year",
    y = "Number of Speeches",
    color = "Legend"
  )