Tokenizes a vector of texts by sentence or paragraph and returns a list, with one element per text, containing only the tokens that match the query.
Arguments
- txt
character vector of texts
- query
query used to filter the text; only tokens that contain it are kept
- by_sentence
if TRUE (the default), tokenize by sentence. If FALSE, tokenize by paragraph.
- unlist
if TRUE, returns a vector instead of a list object (default FALSE).
- msg
if TRUE (the default), displays a message when the query is not found.
- ic
if TRUE (the default), ignore case when matching the query
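The arguments above describe a tokenize-then-filter step. The following is a minimal base R sketch of that idea, not the txtnet implementation: the function name `filter_sketch`, the sentence-splitting regular expression, the paragraph delimiter, and the treatment of the query as a regular expression are all assumptions made for illustration.

```r
# Hypothetical sketch; the real filter_by_query() may tokenize and match differently.
filter_sketch <- function(txt, query, by_sentence = TRUE, ic = TRUE) {
  # Assumed tokenization: sentences end at ., ! or ? followed by whitespace;
  # paragraphs are separated by one or more newlines.
  pattern <- if (by_sentence) "(?<=[.!?])\\s+" else "\\n+"
  tokens <- strsplit(txt, pattern, perl = TRUE)
  # Keep, for each text, only the tokens in which the query is found.
  lapply(tokens, function(x) x[grepl(query, x, ignore.case = ic)])
}

txt <- c("Police arrived. Nothing happened.", "No match here.")
res <- filter_sketch(txt, "police")
print(res)
```

As in the documented function, the result keeps one element per input text, so a text without a match yields `character(0)` rather than being dropped.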
Examples
# list the datasets available in txtnet
data(package = "txtnet")
# sample
txt_wiki[2:3]
#> [1] " "
#> [2] "Brian Thompson, the 50-year-old CEO of the American health insurance company UnitedHealthcare, was shot and killed in Midtown Manhattan, New York City, on December 4, 2024. The shooting occurred early in the morning outside an entrance to the New York Hilton Midtown hotel.[4] Thompson was in the city to attend an annual investors' meeting for UnitedHealth Group, the parent company of UnitedHealthcare. Prior to his death, he faced criticism for the company's rejection of insurance claims, and his family reported that he had received death threats in the past. The suspect, initially described as a white man wearing a mask, fled the scene.[1] On December 9, 2024, authorities arrested 26-year-old Luigi Mangione in Altoona, Pennsylvania, and charged him with Thompson's murder in a Manhattan court.[5][6][7]"
txt_wiki |> filter_by_query("York")
#> [[1]]
#> character(0)
#>
#> [[2]]
#> character(0)
#>
#> [[3]]
#> [1] "Brian Thompson, the 50-year-old CEO of the American health insurance company UnitedHealthcare, was shot and killed in Midtown Manhattan, New York City, on December 4, 2024."
#> [2] "The shooting occurred early in the morning outside an entrance to the New York Hilton Midtown hotel.["
#>
#> [[4]]
#> [1] "8][9][10] Authorities also said his fingerprints matched those that investigators found near the New York shooting scene.["
#> [2] "12] Mangione also has an arrest warrant with five felony counts in New York, including second-degree murder.["
#>
#> [[5]]
#> character(0)
#>
#> [[6]]
#> character(0)
#>
#> [[7]]
#> character(0)
#>
#> [[8]]
#> character(0)
#>
#> [[9]]
#> [1] "The suspect arrived in New York City on November 24, 2024, on a Greyhound bus."
#> [2] "27][22] He checked into the HI New York City Hostel on the Upper West Side of Manhattan on November 24, 2024, with a falsified New Jersey identification card and paid in cash.["
#> [3] "28] He stayed all but one night of the 10 days he was in New York City at the hostel, checking out on December 3, 2024.["
#>
#> [[10]]
#> [1] "Thompson was in New York City for an annual UnitedHealth Group investors' meeting, having arrived in the city on December 2, 2024.["
#> [2] "EST (UTC−5), Thompson was walking along West 54th Street toward the New York Hilton Midtown hotel that was hosting the meeting.["
#>
#> [[11]]
#> character(0)
#>
#> [[12]]
#> character(0)
#>
#> [[13]]
#> character(0)
#>
#> [[14]]
#> character(0)
#>
#> [[15]]
#> [1] "The New York City Police Department offered a reward up to $10,000 for information about the shooter on December 4, 2024.["
#>
#> [[16]]
#> character(0)
#>
#> [[17]]
#> character(0)
#>
#> [[18]]
#> character(0)
#>
#> [[19]]
#> [1] "71][72] Altoona is about 280 miles (450 km) west of New York City.["
#>
#> [[20]]
#> [1] "80] He was denied bail for the second time on December 10, 2024, and through his Pennsylvania attorney, he indicated his intention to fight a prospective interstate extradition to New York.["
#> [2] "6][81] Mangione hired Karen Friedman Agnifilo, former prosecutor at the Manhattan District Attorney's Office and former legal analyst with CNN, as his New York case defense attorney on December 13.["
#>
#> [[21]]
#> [1] "10] According to The New York Times, the mention of CAD apparently relates to the process of 3D-printing the ghost gun's plastic frame.["
#>
#> [[22]]
#> character(0)
#>
#> [[23]]
#> [1] "New York Police Chief of Detectives Joseph Kenny believes Mangione may have targeted them because of the company's size.["
#>
#> [[24]]
#> character(0)
#>
#> [[25]]
#> character(0)
#>
#> [[26]]
#> character(0)
#>
#> [[27]]
#> character(0)
#>
#> [[28]]
#> character(0)
#>
#> [[29]]
#> character(0)
#>
#> [[30]]
#> character(0)
#>
#> [[31]]
#> character(0)
#>
#> [[32]]
#> character(0)
#>
#> [[33]]
#> [1] "Zeynep Tufekci, a professor of sociology and public affairs at Princeton University and New York Times columnist, said that the public reaction to Thompson's murder \"should ring all the alarm bells\" and resembled the reaction to the very high levels of corporate greed, exploitation, and economic inequality during the American Gilded Age, a period characterized by violent \"political movements that targeted corporate titans, politicians, judges and others\".["
#>
#> [[34]]
#> character(0)
#>
#> [[35]]
#> character(0)
#>
#> [[36]]
#> character(0)
#>
#> [[37]]
#> character(0)
#>
#> [[38]]
#> character(0)
#>
#> [[39]]
#> character(0)
#>
#> [[40]]
#> character(0)
#>
#> [[41]]
#> character(0)
#>
#> [[42]]
#> character(0)
#>
#> [[43]]
#> [1] "Independent journalist Ken Klippenstein stated that numerous major media outlets refused to publish Mangione's alleged manifesto despite being in possession of it, writing \"My queries to The New York Times, CNN and ABC to explain their rationale for withholding the manifesto, while gladly quoting from it selectively, have not been answered.\""
#> [2] "Klippenstein also alleged that The New York Times directed their staff to \"dial back\" on showing photographs containing Mangione's face.["
#>
#> [[44]]
#> character(0)
#>
#> [[45]]
#> character(0)
#>
txt_wiki |> filter_by_query("Police")
#> [[1]]
#> character(0)
#>
#> [[2]]
#> character(0)
#>
#> [[3]]
#> character(0)
#>
#> [[4]]
#> [1] "11] Mangione was held without bail in Pennsylvania on charges of possession of an unlicensed firearm, forgery, and providing false New Jersey-resident identification to police.["
#> [2] "12] Police believe that he was inspired by Ted Kaczynski's essay Industrial Society and Its Future (1995), and motivated by his personal views on health insurance.["
#>
#> [[5]]
#> character(0)
#>
#> [[6]]
#> character(0)
#>
#> [[7]]
#> character(0)
#>
#> [[8]]
#> character(0)
#>
#> [[9]]
#> character(0)
#>
#> [[10]]
#> character(0)
#>
#> [[11]]
#> character(0)
#>
#> [[12]]
#> [1] "39] According to the police, he then left the city from the George Washington Bridge Bus Station farther uptown in Upper Manhattan.["
#>
#> [[13]]
#> [1] "49] Accordingly, police stated they are investigating whether the words suggest the killer's motive.["
#>
#> [[14]]
#> [1] "50] Police said they believed they found the shooter's backpack in Central Park on December 6, 2024.["
#>
#> [[15]]
#> [1] "The New York City Police Department offered a reward up to $10,000 for information about the shooter on December 4, 2024.["
#>
#> [[16]]
#> [1] "The shooter was described by police as a white man, approximately 6 ft 1 in (185 cm) tall, wearing a light brown or cream-colored hooded jacket, dark pants, and black sneakers with white soles."
#> [2] "31][39][57][58] Police said the suspect appeared to be proficient in the use of firearms[30] and was described as being \"extremely camera savvy.\"["
#>
#> [[17]]
#> character(0)
#>
#> [[18]]
#> [1] "69] Mangione's mother contacted the San Francisco Police Department, as she believed that Mangione lived in San Francisco and had a job in the area.["
#>
#> [[19]]
#> [1] "Local police in Altoona, Pennsylvania, arrested Mangione on December 9, 2024, at a McDonald's restaurant in the city."
#> [2] "An employee there called the police to say that a customer recognized the suspect from images released by the NYPD.["
#> [3] "63] In his bag they found a 3D-printed gun and a 3D-printed suppressor, which the police claim are consistent with the weapon used in the shooting, and a falsified New Jersey driver's license with the same name as the one used by the alleged shooter to check into the Manhattan hostel.["
#> [4] "8][73][3][74] The police also said that when they arrested Mangione, they found a three-page,[74] 262-word handwritten document about the American healthcare system, which they characterized as a manifesto.["
#>
#> [[20]]
#> character(0)
#>
#> [[21]]
#> character(0)
#>
#> [[22]]
#> character(0)
#>
#> [[23]]
#> [1] "85] Police believe the motive was related to an injury that Mangione had suffered that caused him to visit the emergency room in July 2023."
#> [2] "New York Police Chief of Detectives Joseph Kenny believes Mangione may have targeted them because of the company's size.["
#>
#> [[24]]
#> [1] "Police believe that Mangione was inspired by Ted Kaczynski's Industrial Society and Its Future.["
#>
#> [[25]]
#> character(0)
#>
#> [[26]]
#> character(0)
#>
#> [[27]]
#> character(0)
#>
#> [[28]]
#> character(0)
#>
#> [[29]]
#> character(0)
#>
#> [[30]]
#> character(0)
#>
#> [[31]]
#> character(0)
#>
#> [[32]]
#> character(0)
#>
#> [[33]]
#> character(0)
#>
#> [[34]]
#> character(0)
#>
#> [[35]]
#> character(0)
#>
#> [[36]]
#> character(0)
#>
#> [[37]]
#> character(0)
#>
#> [[38]]
#> character(0)
#>
#> [[39]]
#> character(0)
#>
#> [[40]]
#> character(0)
#>
#> [[41]]
#> character(0)
#>
#> [[42]]
#> character(0)
#>
#> [[43]]
#> character(0)
#>
#> [[44]]
#> character(0)
#>
#> [[45]]
#> character(0)
#>
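Most elements in the results above are `character(0)`, because the list keeps one slot per input text. A short base R sketch of how such a result might be condensed follows. The object `res` is a small mock standing in for the real output, and whether `unlist = TRUE` produces the same flat vector as `unlist()` here is an assumption that has not been checked against the package.

```r
# Mock result shaped like the output above: one element per text,
# character(0) where nothing matched.
res <- list(
  character(0),
  c("First match.", "Second match."),
  character(0),
  "Third match."
)

# Keep only the texts that have at least one matching token.
hits <- res[lengths(res) > 0]

# Positions of those texts in the original vector.
idx <- which(lengths(res) > 0)

# Flatten everything into a single character vector.
flat <- unlist(res)

print(idx)
print(flat)
```

Keeping the positions (`idx`) makes it possible to trace each match back to the text it came from, which is lost once the list is flattened.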