Text Analysis
  • Adam Smith and John Keynes: A Textual Exploration with R

    Introduction

    Adam Smith’s The Wealth of Nations and John Maynard Keynes’s The General Theory of Employment, Interest, and Money are two of the most influential works in economics. Written 160 years apart (1776 and 1936), they present contrasting perspectives on core economic concepts. In this presentation, I conduct a text analysis of both works using R. By comparing the ideas and topics put forth by Smith and Keynes, I seek insight into their differing views on market dynamics, government intervention, employment, interest rates, and the money supply.

    The objective of this presentation is to highlight the contrasting viewpoints of Smith and Keynes on market dynamics and their implications for economic growth and development. I focus only on the two works named above. The analysis is meant to contribute to a deeper understanding of how economic thought has evolved and what that means for contemporary economic theories and policies.

    Methods

    To compare the texts, I used a text analysis approach in R built on the tidyverse ecosystem. Among other packages, I used dplyr, tidytext, ggplot2, and wordcloud to clean, analyze, and visualize the textual data. The steps included tokenization, stop word removal, and frequency analysis. Next, I used sentiment analysis to understand the overall tone and polarity of the texts. Finally, I used topic modeling, specifically Latent Dirichlet Allocation (LDA), to discover and compare the main themes and concepts addressed by Smith and Keynes in their respective works.
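    To make the cleaning steps concrete, here is a minimal sketch of tokenization, stop word removal, and counting on two made-up sentences. The `toy` data and its sentences are assumptions for illustration only and are not taken from either book; the functions are the standard dplyr and tidytext ones.

```r
library(dplyr)
library(tidytext)

# Toy input (hypothetical): two short lines standing in for a book's text column
toy <- tibble(text = c("The price of corn rises with the price of labour.",
                       "Labour and stock produce the wealth of a nation."))

toy_counts <- toy %>%
  unnest_tokens(word, text) %>%           # one lowercase word per row, punctuation dropped
  anti_join(stop_words, by = "word") %>%  # drop common words such as "the" and "of"
  count(word, sort = TRUE)                # frequency table, most frequent first

toy_counts
```

    Because unnest_tokens lowercases by default, “Labour” and “labour” are counted as the same word.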

    Frequency Analysis

    After tidying, tokenizing, and removing stop words from the text, below are the top 25 most frequently used words in The Wealth of Nations. Not surprisingly, when the book was published in 1776 the industrial revolution had barely started, so words like “stock”, “corn”, “land”, and “produce” reflect what still dominated the economy at the time. “Price” is the most used word in the book, since Smith discusses it in the context of supply, demand, competition, scarcity, and the allocation of resources, which are the book’s major themes. Another major theme is the division of labour and how it saves time and increases productivity, which is why “labour” and “time” rank as the 3rd and 13th most frequently used words, respectively.

    # install.packages(c("gutenbergr", "tidytext"))  # run once if not installed
    library(gutenbergr)
    library(dplyr)
    library(tidytext)
    library(ggplot2)

    # Download The Wealth of Nations (Project Gutenberg ID 3300)
    adam_smith <- gutenberg_download(3300)

    # One word per row, with common stop words removed
    tidy_adam_smith <- adam_smith %>%
      unnest_tokens(word, text) %>%
      anti_join(stop_words, by = "word")

    # Bar chart of the 25 most frequent words
    tidy_adam_smith %>%
      count(word, sort = TRUE) %>%
      filter(n > 200) %>%
      slice_max(n, n = 25) %>%
      mutate(word = reorder(word, n)) %>%
      ggplot(aes(word, n)) +
      geom_col() +
      xlab(NULL) +
      coord_flip()

    Below are the top 25 most frequently used words in The General Theory of Employment, Interest, and Money. Compared to the previous book, new words appear, such as “rate”, “marginal”, and “investment”. “Rate” refers to the interest rate and is the most used word in the book. It is used extensively in the context of macroeconomic theory, the branch of economics that the book revolutionized. “Marginal” refers to the incremental change in utility, efficiency, productivity, cost, and so on. This reflects how economics changed between Smith’s period and Keynes’s (1776 to 1936), becoming more complex and technical. The word “investment”, which refers to investing in equipment, machinery, infrastructure, and the like, also signals the gap between the two books: it reflects the economy of 1936, when the book was published, decades after the industrial revolution had taken place.

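    # install.packages("readr")  # run once if not installed
    # dplyr, tidytext, and ggplot2 are already loaded above
    library(readr)
    library(stringr)

    # Plain-text copy of The General Theory, read in as one long string
    file_path <- "thegeneraltheorypdf.txt"
    John_Keyenes <- read_file(file_path)

    # Split into lines and store them in a one-column tibble (column "value")
    John_Keyenes <- John_Keyenes %>%
      str_split("\n") %>%
      unlist()
    Tidy_John_Keyenes <- tibble(value = John_Keyenes)

    # One word per row, with common stop words removed
    Tidy_John_Keyenes <- Tidy_John_Keyenes %>%
      unnest_tokens(word, value) %>%
      anti_join(stop_words, by = "word")

    # Bar chart of the 25 most frequent words
    Tidy_John_Keyenes %>%
      count(word, sort = TRUE) %>%
      filter(n > 200) %>%
      slice_max(n, n = 25) %>%
      mutate(word = reorder(word, n)) %>%
      ggplot(aes(word, n)) +
      geom_col() +
      xlab(NULL) +
      coord_flip()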

    The term frequency (tf) statistic measures how often a term appears in a document relative to the document’s length. The inverse document frequency (idf) statistic down-weights terms that appear in many documents of a collection and up-weights terms that appear in few. Multiplied together, they give a term’s tf-idf, which measures how important a term is to one document within a collection of documents. Here the collection is the two books.
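    As a rough worked illustration of these definitions, the base R sketch below computes tf-idf by hand for two tiny made-up documents. The counts, the book labels "A" and "B", and the column names are assumptions chosen for the example, not values from either book. The idf here is the natural log of (number of documents / number of documents containing the term), which is the convention tidytext’s bind_tf_idf follows.

```r
# Hypothetical word counts for two tiny "books" (one row per book-word pair)
counts <- data.frame(
  book = c("A", "A", "A", "B", "B"),
  word = c("price", "corn", "labour", "price", "investment"),
  n    = c(4, 3, 3, 5, 5),
  stringsAsFactors = FALSE
)

# tf: share of a book's words accounted for by each term
counts$tf <- counts$n / ave(counts$n, counts$book, FUN = sum)

# idf: natural log of (number of books / number of books containing the term)
n_books <- length(unique(counts$book))
books_with_word <- ave(rep(1, nrow(counts)), counts$word, FUN = sum)
counts$idf <- log(n_books / books_with_word)

# tf-idf: the product of the two
counts$tf_idf <- counts$tf * counts$idf
counts
```

    In this toy corpus, “price” appears in both documents, so its idf is ln(2/2) = 0 and its tf-idf is 0 no matter how often it is used. The same holds in any two-document comparison: only words that appear in one document but not the other receive a nonzero tf-idf.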

    # Label each book, then count words per book
    Tidy_John_Keyenes <- Tidy_John_Keyenes %>% mutate(book = "Theory of Employment")
    tidy_adam_smith <- tidy_adam_smith %>% mutate(book = "Wealth of Nations")
    both_books <- bind_rows(Tidy_John_Keyenes, tidy_adam_smith) %>%
      count(book, word, sort = TRUE)

    # Total word count per book
    total_words <- both_books %>%
      group_by(book) %>%
      summarize(total = sum(n))
    both_books1 <- left_join(both_books, total_words, by = "book")

    # Add tf, idf, and tf_idf columns
    both_books1 <- both_books1 %>%
      bind_tf_idf(word, book, n)

    # Plot the 15 highest tf-idf words in each book
    both_books1 %>%
      arrange(desc(tf_idf)) %>%
      mutate(word = factor(word, levels = rev(unique(word)))) %>%
      group_by(book) %>%
      slice_max(tf_idf, n = 15) %>%
      ungroup() %>%
      ggplot(aes(word, tf_idf, fill = book)) +
      geom_col(show.legend = FALSE) +
      labs(x = NULL, y = "tf-idf") +
      facet_wrap(~book, ncol = 2, scales = "free") +
      coord_flip() +
      scale_fill_manual(values = c("Theory of Employment" = "#CC6666", "Wealth of Nations" = "#CC6666"))

    Sentiment Analysis

    Using the wordcloud package, we can again visualize the most common words in The Wealth of Nations, this time including sentiment. The size of a word indicates how often it is used, and its color indicates whether it is positive or negative.