
Given a data frame with a list column containing POS, returns a list with three elements: 1) a tibble with the frequency of graphs; 2) a tibble with the isolated nodes, i.e. nodes without any connection; 3) a tibble with the individual frequency of each node. A list element that contains only one item is removed.
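The idea of counting co-occurrences from a list column can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the input `entities`, the helper objects `kept`, `pairs_df`, `edge_freq` and `node_freq`, and the column names `n1`, `n2` and `n` are all hypothetical.

```r
# Hypothetical input: one character vector of entities per document.
entities <- list(
  c("the_police", "London", "officer"),
  c("London", "officer"),
  c("Paris")  # only one item, so it cannot form a pair
)

# Deduplicate within each document, then drop elements with fewer than two items.
entities <- lapply(entities, unique)
kept <- Filter(function(x) length(x) >= 2, entities)

# Build every unordered pair of entities within each document.
pairs <- do.call(rbind, lapply(kept, function(x) t(combn(sort(x), 2))))
pairs_df <- data.frame(n1 = pairs[, 1], n2 = pairs[, 2])

# Frequency of each pair (edge) across documents.
edge_freq <- aggregate(n ~ n1 + n2, data = cbind(pairs_df, n = 1), FUN = sum)

# Frequency of each individual node across the kept documents.
node_freq <- as.data.frame(table(node = unlist(kept)))
```

In this toy input the pair London/officer appears in two documents, so its count in `edge_freq` is 2, while "Paris" never enters the pair table because its element has a single item.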

Usage

graph_from_coocurrence_list(coocurrence_list, strip_rgx = "^the_", freq = TRUE)

Arguments

coocurrence_list

a dataframe generated by get_pairs()

strip_rgx

regex to strip. Default: "^the_". To erase nothing, use "".
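To see what the default pattern matches, the following sketch applies it with base R's `sub()`. It assumes the regex is applied to node names; the vector `nodes` is hypothetical, and the function itself may strip the pattern differently.

```r
# Hypothetical node names.
nodes <- c("the_police", "the_officer", "London", "breathe_in")

# "^the_" is anchored at the start, so "breathe_in" is left untouched.
stripped <- sub("^the_", "", nodes)
# stripped is c("police", "officer", "London", "breathe_in")

# An empty pattern removes nothing.
unchanged <- sub("", "", nodes)
```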

freq

If TRUE (default), returns a data frame with frequencies. If FALSE, returns only the pairs.

Examples

pos <- txt_wiki |>
  filter_by_query("Police") |>
  parsePOS()
entities_by_txt <- pos |>
  dplyr::group_by(doc_id) |>
  dplyr::summarise(entities = list(unique(entity)))
graph_from_coocurrence_list(entities_by_txt)
#> Error in graph_from_coocurrence_list(entities_by_txt): could not find function "graph_from_coocurrence_list"