
From a POS data frame (produced by `filter_by_query() |> parsePOS()`), get the pairs of co-occurrences.

Usage

get_cooc(pos_df, pos_cat = c("NOUN", "ENTITY"))

Arguments

pos_df

A POS data frame, as produced by `parsePOS()`.

pos_cat

The POS categories to extract. Default: `NOUN` and `ENTITY`.
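
To illustrate what filtering by `pos_cat` and counting co-occurrence pairs can look like, here is a minimal base R sketch on a toy data frame. It is not the implementation of `get_cooc()`: the column names (`sentence_id`, `token`, `pos`) and the choice of the sentence as the co-occurrence unit are assumptions made for this example only.

```r
# Toy stand-in for a POS data frame.
# Column names and the sentence-level unit are assumptions, not the package's schema.
toy <- data.frame(
  sentence_id = c(1, 1, 1, 1, 2, 2),
  token = c("police", "arrest", "shooter", "city", "police", "shooter"),
  pos = c("NOUN", "VERB", "NOUN", "NOUN", "NOUN", "NOUN"),
  stringsAsFactors = FALSE
)

# Keep only the requested POS categories (mirrors the role of `pos_cat`).
pos_cat <- c("NOUN", "ENTITY")
kept <- toy[toy$pos %in% pos_cat, ]

# For each sentence, build all unordered pairs of distinct tokens.
pairs_per_sentence <- lapply(split(kept$token, kept$sentence_id), function(tokens) {
  tokens <- sort(unique(tokens))
  if (length(tokens) < 2) return(NULL)  # a single token has no partner
  pairs <- t(combn(tokens, 2))
  data.frame(n1 = pairs[, 1], n2 = pairs[, 2], stringsAsFactors = FALSE)
})
all_pairs <- do.call(rbind, pairs_per_sentence)

# Count how often each pair appears across sentences, most frequent first.
cooc <- aggregate(freq ~ n1 + n2, data = transform(all_pairs, freq = 1L), FUN = sum)
cooc <- cooc[order(-cooc$freq, cooc$n1, cooc$n2), ]
cooc
```

In this toy data the verb "arrest" is dropped by the category filter, and the pair "police"/"shooter" is counted twice because it appears in both sentences.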

Examples

x <- txt_wiki |> filter_by_query("Police")
x <- x |> parsePOS(only_entities = FALSE)
#> successfully initialized (spaCy Version: 3.8.5, language model: en_core_web_sm)
get_cooc(x)
#> $graphs
#> # A tibble: 505 × 3
#>    n1    n2          freq
#>    <chr> <chr>      <int>
#>  1 3D    Manhattan      2
#>  2 3D    New_Jersey     2
#>  3 3D    claim          2
#>  4 3D    driver         2
#>  5 3D    hostel         2
#>  6 3D    license        2
#>  7 3D    name           2
#>  8 3D    one            2
#>  9 3D    police         2
#> 10 3D    shooter        2
#> # ℹ 495 more rows
#> 
#> $isolated_nodes
#> [1] node
#> <0 rows> (or 0-length row.names)
#> 
#> $nodes
#> # A tibble: 99 × 2
#>    node                               freq
#>    <chr>                             <int>
#>  1 police                                8
#>  2 Mangione                              6
#>  3 Police                                5
#>  4 shooter                               4
#>  5 3D                                    2
#>  6 Industrial_Society_and_Its_Future     2
#>  7 New_Jersey                            2
#>  8 Pennsylvania                          2
#>  9 Ted_Kaczynski_'s                      2
#> 10 city                                  2
#> # ℹ 89 more rows
#>
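
The output above shows that the result is a list with the elements `graphs`, `isolated_nodes`, and `nodes`. The following sketch shows one way to look up the co-occurrence partners of a given node in a table shaped like `graphs`. To keep it runnable without spaCy, it uses a small hand-built stand-in (`res`, a hypothetical name) that copies a few rows from the output above, rather than calling `get_cooc()`.

```r
# Hand-built stand-in shaped like the list shown above.
# Rows are copied from the printed output; `res` is a hypothetical name.
res <- list(
  graphs = data.frame(
    n1 = c("3D", "3D", "3D", "3D"),
    n2 = c("Manhattan", "New_Jersey", "police", "shooter"),
    freq = c(2L, 2L, 2L, 2L),
    stringsAsFactors = FALSE
  ),
  nodes = data.frame(
    node = c("police", "Mangione", "Police", "shooter", "3D"),
    freq = c(8L, 6L, 5L, 4L, 2L),
    stringsAsFactors = FALSE
  )
)

# Edges that involve a given node, whichever side of the pair it is on.
target <- "police"
edges <- res$graphs[res$graphs$n1 == target | res$graphs$n2 == target, ]

# The partner is whichever end of the edge is not the target itself.
partners <- ifelse(edges$n1 == target, edges$n2, edges$n1)
partners
```

With a real result, `res` would be replaced by the value returned from `get_cooc(x)`.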