Group a POS of proper names — group

Groups a data.frame generated by spacyr::spacy_parse() wherever there is a sequence of tokens whose entity is PERS* / PERSON. It works similarly to spacyr::entity_extract(), but preserves the dep_rel column.

Usage

group_ppn(DF)

Arguments

DF: A data.frame generated by spacyr::spacy_parse().

Examples

# Example in Portuguese
# spacyr::spacy_finalize() # Run this first if spaCy was previously initialized with another model.
spacyr::spacy_initialize(model = "pt_core_news_lg")
"Maria Jana ama John Smith e Maria é amada por Joaquim de Souza" |>
  spacyr::spacy_parse(dependency = TRUE) |>
  group_ppn()
#> # A tibble: 13 × 10
#> # Groups:   name [2]
#>    doc_id sentence_id token_id token   lemma  pos   head_token_id dep_rel entity
#>    <chr>        <int>    <int> <chr>   <chr>  <chr>         <dbl> <chr>   <chr> 
#>  1 text1            1        1 Maria   Maria  PROPN             2 compou… "PERS…
#>  2 text1            1        2 Jana    Jana   PROPN             3 compou… "PERS…
#>  3 text1            1        3 ama     ama    NOUN              3 ROOT    ""    
#>  4 text1            1        4 John    John   PROPN             5 compou… "PERS…
#>  5 text1            1        5 Smith   Smith  PROPN             3 appos   "PERS…
#>  6 text1            1        6 e       e      PROPN            13 compou… ""    
#>  7 text1            1        7 Maria   Maria  PROPN            13 nmod    "NORP…
#>  8 text1            1        8 é       é      PROPN            13 compou… ""    
#>  9 text1            1        9 amada   amada  PROPN            13 compou… ""    
#> 10 text1            1       10 por     por    PROPN            13 compou… ""    
#> 11 text1            1       11 Joaquim Joaqu… PROPN            13 compou… "PERS…
#> 12 text1            1       12 de      de     PROPN            13 compou… "PERS…
#> 13 text1            1       13 Souza   Souza  PROPN             5 appos   "PERS…
#> # ℹ 1 more variable: name <chr>
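
The grouping idea from the description can be illustrated without spaCy installed. The following is a minimal base R sketch, not the implementation of group_ppn(): the helper label_person_runs() is hypothetical, and the input is a hand-written stand-in for spacyr::spacy_parse() output with made-up dep_rel and entity values. It labels each run of consecutive person-entity tokens with the joined name and leaves dep_rel untouched.

```r
# Toy stand-in for spacyr::spacy_parse() output.
# Hand-written values for illustration only, not real parser output.
parsed <- data.frame(
  doc_id   = "text1",
  token_id = 1:5,
  token    = c("Maria", "Jana", "ama", "John", "Smith"),
  dep_rel  = c("nsubj", "flat:name", "ROOT", "obj", "flat:name"),
  entity   = c("PERSON_B", "PERSON_I", "", "PERSON_B", "PERSON_I"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (an assumption, not part of spacyr or this package):
# adds a `name` column holding the joined tokens of each run of
# consecutive person-entity rows; other rows get NA.
label_person_runs <- function(df) {
  is_person <- grepl("^PERS", df$entity)
  # A run starts at a person token whose predecessor is not a person token.
  starts <- is_person & !c(FALSE, head(is_person, -1))
  run_id <- cumsum(starts)
  df$name <- NA_character_
  for (id in unique(run_id[is_person])) {
    rows <- is_person & run_id == id
    df$name[rows] <- paste(df$token[rows], collapse = " ")
  }
  df
}

grouped <- label_person_runs(parsed)
grouped[, c("token", "dep_rel", "entity", "name")]
```

Because the sketch only adds a column, every original row and its dep_rel value remain available for later dependency analysis. This simplified version treats any adjacent person tokens as one name and does not use the _B / _I suffixes to split two names that directly follow each other.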