Extract an undirected graph based on co-occurrence within a token and return it as a tibble. Two entities are linked only if they are mentioned in the same token (sentence or paragraph).
Source: R/entity_link.R
extract_graph_df.Rd
Usage
extract_graph_df(
df,
column_id,
column_text,
using = "sentences",
connect = connectors("misc"),
sw = c("of", "the"),
loop = FALSE
)
Arguments
- df
a data frame with two columns: text and id
- column_id
name of the column with the id
- column_text
name of the column with the text from which to extract the graph
- using
unit used to tokenize the text: sentence or paragraph (default "sentences")
- connect
lowercase connectors, like the "von" in "John von Neumann".
- sw
stopwords vector.
- loop
if TRUE, loops (a node pointing to itself) are kept; with the default FALSE they are removed.
Examples
# creating a dataframe with text and id
DF <- data.frame(
  text = c(
    "John Does lives in New York in United States of America. He is a passionate jazz musician, often playing in local clubs.",
    r"(John Michael "Ozzy" Osbourne (3 December 1948 – 22 July 2025) was an English singer, songwriter, and media personality. He co-founded the pioneering heavy metal band Black Sabbath in 1968, and rose to prominence in the 1970s as their lead vocalist. During this time, he adopted the title "Prince of Darkness".[3][4] He performed on the band's first eight albums, most notably including Black Sabbath, Paranoid (both 1970) and Master of Reality (1971), before he was fired in 1979 due to his problems with alcohol and other drugs.)"
  )
) |>
  dplyr::mutate(id = paste0("id_", dplyr::row_number()))
extract_graph_df(DF, "id", "text")
#> Error in extract_graph_df(DF, "id", "text"): object 'DF' not found
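To illustrate the co-occurrence idea described above, here is a minimal base R sketch. It is not the package's implementation: `cooccurrence_edges()` is a hypothetical helper, the sentence splitter and the entity detector (runs of capitalized words) are deliberately naive assumptions, and connectors and stopwords are ignored. It only shows how entities found in the same sentence can be paired into undirected edges.

```r
# Hypothetical helper, NOT part of the package: links capitalized
# word runs that appear together in the same sentence.
cooccurrence_edges <- function(text) {
  # Naive sentence split: break after ., ! or ? followed by whitespace
  sentences <- unlist(strsplit(text, "(?<=[.!?])\\s+", perl = TRUE))

  edges <- lapply(sentences, function(s) {
    # Naive entity detection: one or more consecutive capitalized words
    # (a sentence-initial word such as "The" would also match)
    m <- gregexpr("[A-Z][a-z]+(?: [A-Z][a-z]+)*", s, perl = TRUE)
    ents <- unique(regmatches(s, m)[[1]])
    if (length(ents) < 2) return(NULL)

    # Sorting makes each pair order-independent (undirected);
    # unique() above means no entity is paired with itself (no loops)
    pairs <- t(combn(sort(ents), 2))
    data.frame(n1 = pairs[, 1], n2 = pairs[, 2])
  })

  # Stack the per-sentence edge lists into one data frame
  do.call(rbind, edges)
}

txt <- "Ada Lovelace worked with Charles Babbage. Alan Turing admired Ada Lovelace."
edges <- cooccurrence_edges(txt)
print(edges)
```

In this toy input, "Charles Babbage" and "Alan Turing" are never linked to each other because they do not share a sentence, which is the same restriction the function description states for tokens.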