Extracts an undirected graph based on co-occurrence within a token. Two entities are connected only if they are mentioned in the same token (a sentence or a paragraph).
Usage
extract_graph_rb(
text,
using = "sentences",
connect = connectors("misc"),
sw = c("of", "the"),
count = TRUE,
loop = FALSE
)
Arguments
- text
an input text
- using
the unit used to tokenize the text: sentences or paragraphs.
- connect
lowercase connectors, like the "von" in "John von Neumann".
- sw
stopwords vector.
- count
if TRUE (default), counts how often each connection occurs and returns the result ordered by frequency.
- loop
if TRUE, loops (a node pointing to itself) are kept; if FALSE (default), they are removed.
Examples
text <- "John Does lives in New York in United States of America. He is a passionate jazz musician, often playing in local clubs."
extract_graph_rb(text)
#> Tokenizing by sentences
#> # A tibble: 10 × 3
#>    n1            n2                n
#>    <chr>         <chr>         <int>
#>  1 America       He                1
#>  2 John_Does     America           1
#>  3 John_Does     He                1
#>  4 John_Does     New_York          1
#>  5 John_Does     United_States     1
#>  6 New_York      America           1
#>  7 New_York      He                1
#>  8 New_York      United_States     1
#>  9 United_States America           1
#> 10 United_States He                1
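
The co-occurrence rule described above can be illustrated with a minimal base R sketch. This is not the package's implementation: `cooccurrence_edges()` and its `entities_by_token` argument are hypothetical names introduced here, and the sketch assumes the entities have already been extracted and grouped by token (one character vector per sentence or paragraph). The `count` and `loop` arguments only mimic the behaviour documented for `extract_graph_rb()`.

```r
# Hypothetical helper, not part of the package: builds undirected
# co-occurrence edges from entities that are already grouped by token.
cooccurrence_edges <- function(entities_by_token, count = TRUE, loop = FALSE) {
  pairs <- lapply(entities_by_token, function(ents) {
    if (length(ents) < 2) return(NULL)        # a lone entity has no partner
    idx <- utils::combn(length(ents), 2)      # every pair of positions
    a <- ents[idx[1, ]]
    b <- ents[idx[2, ]]
    # Sort each pair so that A-B and B-A are the same undirected edge.
    data.frame(n1 = pmin(a, b), n2 = pmax(a, b), stringsAsFactors = FALSE)
  })
  edges <- do.call(rbind, pairs)
  if (is.null(edges)) {
    return(data.frame(n1 = character(), n2 = character()))
  }
  # Drop loops (an entity paired with itself) unless they are requested.
  if (!loop) edges <- edges[edges$n1 != edges$n2, , drop = FALSE]
  if (!count || nrow(edges) == 0) return(unique(edges))
  # Count each edge and order the result by decreasing frequency.
  edges$n <- 1L
  counted <- aggregate(n ~ n1 + n2, data = edges, FUN = sum)
  counted <- counted[order(-counted$n, counted$n1, counted$n2), ]
  rownames(counted) <- NULL
  counted
}

# Three tokens: Ann and Bob co-occur in two of them, and the third
# token mentions Cid twice, which only yields a loop.
tokens <- list(c("Ann", "Bob", "Cid"), c("Bob", "Ann"), c("Cid", "Cid"))
res <- cooccurrence_edges(tokens)
res_loops <- cooccurrence_edges(tokens, loop = TRUE)
```

Sorting each pair before counting is what makes the graph undirected in this sketch: without it, "Ann, Bob" and "Bob, Ann" would be counted as two different edges.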