Skip to contents

plot a network of co-occurrence of terms, as returned by extract_graph and then by dplyr::count(). The size of words and compound words means the individual frequency of each word/compound word. The thickness of the links indicates how often the pair occur together. Pay attention that if the words doesn't appear in different sizes when plotted, maybe the relative differences in their frequency can be very low.

Usage

net_wordcloud(
  DF,
  text,
  head_n = 30,
  lower = TRUE,
  edge_color = "lightblue",
  edge_alpha = 0.5,
  edge_cut = 2,
  text_color = "black",
  text_contour_color = NA,
  edge_norm = TRUE,
  layout = "graphopt"
)

Arguments

DF

a dataframe of co-occurrence, extracted with `extract_graph()` and `count(n1, n2)`

text

the original text used to extract the graph. It is necessary to calculate the individual frequency of the words.

head_n

number of nodes to show - the more frequent. Dedault = 30. To display all, use `n_head = ""`

lower

Convert words to lowercase. If the text is passed in all lowercase, it can return false sentence and paragraph tokenization.

edge_color

color of the links

edge_alpha

transparency of the links. Values between 0 and 1.

edge_cut

in mm, how much you want that the edge stop before reach the node. Can improve readability.

text_color

color of the text in nodes

text_contour_color

if empty (default) no contour is used. If the color is specified, so the contour is used.

layout

the layout of the plot. Options are bipartite, star, circle, nicely, dh, gem, graphopt, grid, mds, sphere, randomly, fr, kk, drl, lgl. More info at ggraph documentation

Examples


# stopwords:
my_sw <- c(stopwords::stopwords(
  language = "en",
  source = "snowball", simplify = TRUE
), "lol")

txt_wiki |> # text available in the package
  cooccur_words(sw = my_sw) |>
  net_wordcloud(txt_wiki, DF = _, head_n = 50) # plotting
#> You provided a vector of 45 elements instead of one. No problem, but these will be collapsed into a single element, with a final punctuation mark added to each, to ensure it is treated as different sentences in the process of tokenization.
#> tokenizing sentences...
#> tokenizing words...
#> You provided a vector of 45 elements instead of one. These will be collapsed into  a single element, with a final punctuation mark added to each.
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |==                                                                    |   2%
  |                                                                            
  |===                                                                   |   5%
  |                                                                            
  |=====                                                                 |   7%
  |                                                                            
  |=======                                                               |   9%
  |                                                                            
  |========                                                              |  12%
  |                                                                            
  |==========                                                            |  14%
  |                                                                            
  |===========                                                           |  16%
  |                                                                            
  |=============                                                         |  19%
  |                                                                            
  |===============                                                       |  21%
  |                                                                            
  |================                                                      |  23%
  |                                                                            
  |==================                                                    |  26%
  |                                                                            
  |====================                                                  |  28%
  |                                                                            
  |=====================                                                 |  30%
  |                                                                            
  |=======================                                               |  33%
  |                                                                            
  |========================                                              |  35%
  |                                                                            
  |==========================                                            |  37%
  |                                                                            
  |============================                                          |  40%
  |                                                                            
  |=============================                                         |  42%
  |                                                                            
  |===============================                                       |  44%
  |                                                                            
  |=================================                                     |  47%
  |                                                                            
  |==================================                                    |  49%
  |                                                                            
  |====================================                                  |  51%
  |                                                                            
  |=====================================                                 |  53%
  |                                                                            
  |=======================================                               |  56%
  |                                                                            
  |=========================================                             |  58%
  |                                                                            
  |==========================================                            |  60%
  |                                                                            
  |============================================                          |  63%
  |                                                                            
  |==============================================                        |  65%
  |                                                                            
  |===============================================                       |  67%
  |                                                                            
  |=================================================                     |  70%
  |                                                                            
  |==================================================                    |  72%
  |                                                                            
  |====================================================                  |  74%
  |                                                                            
  |======================================================                |  77%
  |                                                                            
  |=======================================================               |  79%
  |                                                                            
  |=========================================================             |  81%
  |                                                                            
  |===========================================================           |  84%
  |                                                                            
  |============================================================          |  86%
  |                                                                            
  |==============================================================        |  88%
  |                                                                            
  |===============================================================       |  91%
  |                                                                            
  |=================================================================     |  93%
  |                                                                            
  |===================================================================   |  95%
  |                                                                            
  |====================================================================  |  98%
  |                                                                            
  |======================================================================| 100%


txt_wiki |> # text available in the package
  # because it is a vector, let's collapse it into a single element:
  cooccur_words(sw = my_sw) |>
  net_wordcloud(txt_wiki, DF = _) # plotting
#> You provided a vector of 45 elements instead of one. No problem, but these will be collapsed into a single element, with a final punctuation mark added to each, to ensure it is treated as different sentences in the process of tokenization.
#> tokenizing sentences...
#> tokenizing words...
#> You provided a vector of 45 elements instead of one. These will be collapsed into  a single element, with a final punctuation mark added to each.
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |==                                                                    |   3%
  |                                                                            
  |=====                                                                 |   7%
  |                                                                            
  |=======                                                               |  10%
  |                                                                            
  |=========                                                             |  13%
  |                                                                            
  |============                                                          |  17%
  |                                                                            
  |==============                                                        |  20%
  |                                                                            
  |================                                                      |  23%
  |                                                                            
  |===================                                                   |  27%
  |                                                                            
  |=====================                                                 |  30%
  |                                                                            
  |=======================                                               |  33%
  |                                                                            
  |==========================                                            |  37%
  |                                                                            
  |============================                                          |  40%
  |                                                                            
  |==============================                                        |  43%
  |                                                                            
  |=================================                                     |  47%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |=====================================                                 |  53%
  |                                                                            
  |========================================                              |  57%
  |                                                                            
  |==========================================                            |  60%
  |                                                                            
  |============================================                          |  63%
  |                                                                            
  |===============================================                       |  67%
  |                                                                            
  |=================================================                     |  70%
  |                                                                            
  |===================================================                   |  73%
  |                                                                            
  |======================================================                |  77%
  |                                                                            
  |========================================================              |  80%
  |                                                                            
  |==========================================================            |  83%
  |                                                                            
  |=============================================================         |  87%
  |                                                                            
  |===============================================================       |  90%
  |                                                                            
  |=================================================================     |  93%
  |                                                                            
  |====================================================================  |  97%
  |                                                                            
  |======================================================================| 100%