All Questions
123
questions
2
votes
1
answer
64
views
Extract *all* possible patterns in a variable
I have a large variable containing strings (words). I need to extract all substrings that contain any of the patterns listed in a separate vector.
library(tidyverse)
df <- data.frame(Word = c("...
0
votes
3
answers
52
views
Flawed logic with RegEx and numeric ranges
I'm trying to create a new variable called 'group' in a dataset called 'data'. The variable 'group' should take the value "A" or "B" depending on how another variable in the ...
1
vote
1
answer
49
views
Tailor one string vector to exact size of other
I have text data in column Segments and their corresponding text-tag combinations in Q_c7_collpsd. However, Q_c7_collpsd is longer than Segments. My task is to trim Q_c7_collpsd to the exact length ...
1
vote
2
answers
60
views
Find the first matching word from a vector in a string column
I need to know which of the words in a vector comes first in a string. I need to run this code on a large data frame with millions of records.
df is my sample data
df <- data.frame(ID = c(1,2,3),
...
0
votes
2
answers
89
views
How to filter a table based on email address suffix
I have a table of over 100K names and addresses. I would like to filter the table to keep only those emails I think are not spam.
For example, I have addresses such as:
[email protected]
[email protected]
...
0
votes
2
answers
57
views
Recombine character strings and separate initials with a period using R and regex
I have a list of authors that are all formatted slightly differently. My goal is to extract the different components of every character string. The different components are:
initials (usually all ...
0
votes
1
answer
45
views
Keep separator using Regex in separate_rows()
How can I keep the parentheses from Q11 in the data below? This column comes from a Google Form in which people could choose as many Brazilian regions as they wished; now I have to split the region. ...
3
votes
2
answers
87
views
str_extract_all not listing all possible matches (R, stringr)
Setting the Scene
I have this string:
string <- "Apples/Bananas/Grapes/"
I am trying to find all possible substring matches with this list:
pattern <- "Apples/Bananas/|Bananas/...
0
votes
0
answers
19
views
How to extract all occurrences of specific expressions using regular expressions in R? [duplicate]
I am trying to extract all occurrences of the expressions between "[XX]\n-----\n" and "\n-", for all XX. Here is the code I have come up with.
temp <- "\\[\\d+]\n-(.*)\n-&...
0
votes
2
answers
57
views
Extract leading numbers from string, but length varies R
I have a column which contains a character string containing letters and numbers. The string always starts with one or two numbers, followed by multiple characters. I am trying to separate the string ...
1
vote
2
answers
78
views
How to move characters inside brackets to another part of a string?
I'm trying to detect the characters of a string that are inside brackets, take those exact characters, and insert them into a different string. These are in multiple columns, ref1:ref4.
eg. The values ...
1
vote
3
answers
75
views
Regex for at least one instance of each of a list of letters?
I'm trying to sharpen my skills with regular expressions by coming up with some R code to solve the NY Times Spelling Bee game.
I've done that, but now I'm going one step further and trying to ...
0
votes
1
answer
53
views
multiple string detections
I need to detect a pattern in a list of character strings using a negative lookbehind:
I have this type of data:
w <- " HGQX0080 **HJFA0120** HGMA0030 ZCQH0010 **HGSA0010** ZZQX1880"
x <- "...
-1
votes
2
answers
174
views
Extract digits from strings in R
I have a dataframe which contains a text string like the one below that shows the ingredients and the proportion of each ingredient. What I would like to achieve is to extract the proportion of each ingredient ...
0
votes
2
answers
64
views
Places after decimal points discarded when extracting numbers from strings
I'd like to extract weight values from strings with the unit and the time of measurement using tidyverse.
My dataset looks like this:
df <- tibble(ID = c("A","B","C"),...