All functions |
|
---|---|
Returns a vector with the words of a language. It is intended for testing regex patterns. |
|
Lowercase connectors between two proper names |
|
Counts the elements of a vector, optionally arranges the result, and returns a tibble |
|
Extract abbreviations from text |
|
A rule-based entity extractor that extracts entities from a text using regex. The regex captures all-uppercase words and words that begin with an uppercase letter. If several such words occur in sequence, the function captures the whole sequence. For proper names with common lowercase connectors, like "Wwwww of Wwwww", the function also captures the connector and the subsequent uppercase words. |
|
Extracts an undirected graph based on co-occurrence within a token. Two entities are linked only if they are mentioned in the same token (sentence or paragraph). |
|
Extracts a graph from text, using the matches of a custom regex pattern as nodes. |
|
Extracts proper names from strings using regular expressions. The function assumes that any sequence of an uppercase letter followed by lowercase letters is a proper name. It is expected to return more than wanted, so post-processing is needed. |
|
Tokenizes the text and selects only the sentences or paragraphs that contain more than one entity |
|
Easily pastes and collapses character objects into one string. A wrapper for glue::glue. |
|
Generates a dictionary of specialized words. Given a regex, the function checks the dictionary of the language and returns the matched words. It is also useful for testing your regex pattern. |
|
Generates a list of stopwords for a given language using grammatical categories. |
|
A grep to be used with the native pipe `\|>`. |
|
A grepl to be used with the native pipe `\|>`. |
|
Transforms a string so that the first letter of each word is capitalized, except joining words*. A gsub to be used easily with the native pipe `\|>`; gsub2 is just a wrapper around gsub. |
|
Installs libraries from a string. |
|
Loads libraries from a string. |
|
Converts a list of strings into a list of vectors. |
|
An empty function. |
|
Plots a co-occurrence network of terms |
|
A regex pattern to capture Brazilian names, like "Fulano de Tal", "Ciclano dos Santos" |
|
Converts a string into a proper name |
|
Converts a string into a vector of elements. |
|
Extracts all characters. |
|
Shows all stopword categories of a language |
|
Easily generates shuffled times, useful in web scraping |
|
Substitutes the spaces inside proper names/entities in the text with underscores. |
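
The rule-based entity extraction described in the table above can be illustrated with a minimal base R sketch. This is not the package's implementation: the function name `extract_entities_sketch`, the connector list, and the pattern itself are assumptions made for illustration only. It shows how a capitalized-word regex with optional lowercase connectors captures names like "Fulano de Tal", and why the result usually needs post-processing.

```r
# Hypothetical sketch, NOT the package's actual code.
# A capitalized word: one uppercase ASCII letter followed by any letters
# (this also covers all-uppercase words such as acronyms).
cap_word <- "[A-Z][A-Za-z]*"

# Lowercase connectors allowed between capitalized words (assumed list).
connector <- "(?:of|the|de|da|do|dos|das)"

# One capitalized word, then any number of
# (optional connectors + another capitalized word).
entity_pattern <- paste0(
  "\\b", cap_word,
  "(?:\\s+(?:", connector, "\\s+)*", cap_word, ")*\\b"
)

# Return every match of the pattern in a single string.
extract_entities_sketch <- function(text) {
  regmatches(text, gregexpr(entity_pattern, text, perl = TRUE))[[1]]
}

txt <- "We met Fulano de Tal at the University of Oxford."
entities <- extract_entities_sketch(txt)
entities
# "We" "Fulano de Tal" "University of Oxford"

# Joining the words of each entity with underscores, as in the last row.
gsub(" ", "_", entities, fixed = TRUE)
# "We" "Fulano_de_Tal" "University_of_Oxford"
```

Note that the sentence-initial "We" is also captured, which matches the caveat that this kind of pattern returns more than wanted and that the output should be filtered afterwards.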