Tokenizes the text and keeps only the sentences or paragraphs that contain more than one entity.

extract_relation(
  text,
  using = "sentences",
  connect = connectors("misc"),
  sw = gen_stopwords("en")
)

Arguments

text

an input text

using

whether to tokenize by sentence or by paragraph; the default is "sentences"

connect

lowercase connectors, such as the "von" in "John von Neumann". To use prebuilt connectors, use `connectors()`.

sw

a vector of stopwords. To use prebuilt stopwords, use `gen_stopwords()`.

Examples

"John Does lives in New York in United States of America." |> extract_relation()
#> Tokenizing by sentences
#> [[1]]
#> [1] "John Does"                "New York"                
#> [3] "United States of America"
#> 
"João Ninguém mora em São José do Rio Preto. Ele foi para o Rio de Janeiro." |> extract_relation(connect = connectors("pt"))
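
The selection rule described at the top of this page (keep only the units that contain more than one entity) can be approximated in base R. The sketch below is not the package's implementation: `sketch_entities()` and `sketch_relation()` are hypothetical helpers, the short `connect` and `sw` defaults are made up for illustration, and the way `sw` is applied here (dropping matches that are stopwords) is an assumption. It treats an entity as a run of capitalized words, optionally joined by lowercase connectors.

```r
# Hypothetical helpers for illustration only; not part of the package.

# Find runs of capitalized words, optionally joined by lowercase connectors.
sketch_entities <- function(sentence,
                            connect = c("of", "von", "de"),
                            sw = c("he", "she", "the")) {
  cap <- "[[:upper:]][[:alpha:]]*"
  conn <- paste0("(", paste(connect, collapse = "|"), ")\\s+")
  # one capitalized word, then any number of "[connector ...] Capitalized" pieces
  pattern <- paste0(cap, "(\\s+(", conn, ")*", cap, ")*")
  found <- regmatches(sentence, gregexpr(pattern, sentence, perl = TRUE))[[1]]
  # assumption: a match that is itself a stopword (e.g. "He") is not an entity
  found[!tolower(found) %in% sw]
}

# Split into sentences, extract entities, keep sentences with more than one.
sketch_relation <- function(text, ...) {
  # naive sentence split: whitespace that follows ., ! or ?
  sentences <- unlist(strsplit(text, "(?<=[.!?])\\s+", perl = TRUE))
  entities <- lapply(sentences, sketch_entities, ...)
  Filter(function(e) length(e) > 1, entities)
}

res <- sketch_relation(
  "John Does lives in New York in United States of America. He left."
)
print(res)
```

In this sketch the second sentence is dropped because it yields no entity once "He" is filtered out. The character classes used here are enough for the English example; accented text such as the Portuguese example may need Unicode-aware classes (for instance `\\p{Lu}` and `\\p{L}`).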