{networds} - a package to build graphs from text

THIS PACKAGE IS NOW UNDER DEVELOPMENT

Extracting co-occurrences and relations in text

This package extracts graphs from text and builds and visualizes text networks as static and dynamic graphs.

It extracts graphs from plain text using:

1. Rule based: regex to extract proper names and build a co-occurrence network.
2. (Under development) Part-of-speech tagging to extract proper names and nouns and their co-occurrences.
3. Extraction of relations (verbs, in most cases), as in {rsyntax} and {semgram}, using Universal Dependencies. The authors of "Universal Stanford Dependencies: A cross-linguistic typology" “propose an improved taxonomy to capture grammatical relations across languages, including morphologically rich ones”.
4. (Under development) Relation extraction using Large Language Models running locally with {rollama}.
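
The rule-based approach can be illustrated with a few lines of base R. This is only a minimal sketch of the idea, not the {networds} API: the function name `extract_cooccurrences()` is hypothetical, sentences are split naively on punctuation, and the regex treats any run of capitalized words as a proper name (so sentence-initial words such as "The" would also match).

```r
# Hypothetical helper, not part of {networds}: find capitalized word
# sequences in each sentence and count how often each pair shares a sentence.
extract_cooccurrences <- function(text) {
  # Naive sentence split on ., ! or ? followed by whitespace
  sentences <- unlist(strsplit(text, "[.!?]+\\s+"))
  # One or more consecutive capitalized words, e.g. "John Doe"
  pattern <- "\\b[A-Z][a-z]+(?:\\s+[A-Z][a-z]+)*\\b"
  pairs <- lapply(sentences, function(s) {
    found <- unique(regmatches(s, gregexpr(pattern, s, perl = TRUE))[[1]])
    if (length(found) < 2) return(NULL)
    # Sort so that each undirected pair has a single canonical order
    t(combn(sort(found), 2))
  })
  edges <- do.call(rbind, pairs)
  if (is.null(edges)) {
    return(data.frame(from = character(), to = character(), n = integer()))
  }
  edges <- data.frame(from = edges[, 1], to = edges[, 2], n = 1L)
  # Sum repeated pairs to obtain edge weights
  aggregate(n ~ from + to, data = edges, FUN = sum)
}

txt <- "John Doe gave Mary a flower. Mary thanked John Doe."
extract_cooccurrences(txt)
```

For this input the result is a single edge between "John Doe" and "Mary" with weight 2, because the two names appear together in both sentences. The resulting edge list can be passed to any graph package for plotting.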

Method 1 is quick and easy to understand and explain, but it has limitations. Methods 2 and 3 are still under development; they are more powerful and can solve more complex problems. One such problem is disambiguation: the same word can have different meanings depending on the context. Another problem, which methods 2 and 3 can address, is anaphora resolution, where the same entity is referred to repeatedly with different words. For example, in the sentence “John Doe gave Mary a flower and she loved it,” the pronoun “she” is an anaphor whose antecedent is “Mary”, and “it” is an anaphor whose antecedent is “flower”. The opposite case, in which a pronoun precedes its referent, is called cataphora. We are working on implementing this feature to reduce the number of redundant nodes.
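
To make the dependency-based idea more concrete, the sketch below turns a parsed sentence into subject-verb-object edges. It assumes a token table with one row per token and columns `token_id`, `token`, `head_id`, and `dep_rel` holding Universal Dependencies labels; the table is written by hand here, and both the column names and the helper `extract_relations()` are hypothetical, not the API of {networds}, {rsyntax}, or {semgram}. In practice the table would come from a dependency parser.

```r
# Hand-written parse of "John gave Mary a flower." using Universal
# Dependencies relation labels (assumed column layout, for illustration only)
tokens <- data.frame(
  token_id = 1:6,
  token    = c("John", "gave", "Mary", "a", "flower", "."),
  head_id  = c(2L, 0L, 2L, 5L, 2L, 2L),
  dep_rel  = c("nsubj", "root", "iobj", "det", "obj", "punct")
)

# Hypothetical helper: link every subject to every object that shares
# the same head verb, and label the edge with that verb
extract_relations <- function(tokens) {
  subj <- tokens[tokens$dep_rel == "nsubj", c("token", "head_id")]
  obj  <- tokens[tokens$dep_rel %in% c("obj", "iobj"), c("token", "head_id")]
  verb <- tokens[, c("token_id", "token")]
  names(subj) <- c("from", "head_id")
  names(obj)  <- c("to", "head_id")
  names(verb) <- c("head_id", "relation")
  # Join subjects and objects on their shared head, then attach the verb
  edges <- merge(merge(subj, obj, by = "head_id"), verb, by = "head_id")
  edges[, c("from", "relation", "to")]
}

extract_relations(tokens)
```

Here the verb "gave" becomes the label of two edges leaving "John", one to "Mary" and one to "flower". Pronouns would still appear as separate nodes in such a table, which is where anaphora resolution would come in.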

Installation

You can install the development version of networds from GitHub with:

# install.packages("pak")
pak::pak("SoaresAlisson/networds")

Example

Check the vignettes:
- 01 - Proper name extraction with regex
- 02 - Extract entity co-occurrences with POS

Similar Projects

textnet - “textNet is a set of tools in the R language that uses part-of-speech tagging and dependency parsing to generate semantic networks from text data. It is compatible with Universal Dependencies and has been tested on English-language text data”.
textnets from Chris Bail.
