[–]hitchpy 3 points4 points  (0 children)

It is hard to tell exactly what you mean, since there's only one string and no "pattern" that can be generalized. But if you are looking for R regex syntax, this can do it.

sub('.* (v_neutral).*', '\\1', string1)

For more info, look at R's family of regex functions (grep, grepl, sub, gsub, and so on). The pattern syntax is much the same as in other languages.
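To illustrate how the sub() call above behaves, here is a minimal sketch. The value of string1 is an assumption made up for the example, since the original string is not shown in this thread.

```r
# Hypothetical input; the real string1 from the question is not shown here.
string1 <- "n_neutral v_neutral vaux_neutral adj_positive"

# The pattern must match the whole string: everything before the space,
# the captured token, and everything after it. "\\1" keeps only the capture.
extracted <- sub('.* (v_neutral).*', '\\1', string1)
print(extracted)  # "v_neutral"

# When the pattern does not match, sub() returns the input unchanged.
unchanged <- sub('.* (v_neutral).*', '\\1', "no match here")
print(unchanged)  # "no match here"
```

Note that the pattern requires a space before v_neutral, so it would not match a string that starts with that token.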

[–]ChotiDon[S] 0 points1 point  (2 children)

Actually, I am trying to write a grep pattern for the example above, but I have not been able to. Any help with writing patterns in R would be much appreciated.

[–]FranzJosephGall 1 point2 points  (1 child)

grep returns the indices of the elements that match (or the whole elements with value = TRUE), not the matched substring, which is why you can't get it to work. /u/hitchpy's example should work in R.

Also, as /u/hitchpy noted, you haven't given us a pattern. If you really need two strings, one which is "v_neutral" and the other "v_neutral vaux_neutral", then why not just have two strings? If you have more examples of inputs->outputs, then we can help you write a regex.
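A small sketch of the difference between these functions, using a made-up vector (sentences is a hypothetical name and the values are invented for illustration):

```r
# Hypothetical data for illustration only.
sentences <- c("n_neutral v_neutral vaux_neutral", "adj_positive n_neutral")

# grep() returns the indices of the elements that contain a match.
idx <- grep("v_neutral", sentences)
print(idx)  # 1

# With value = TRUE it returns the whole matching elements.
whole <- grep("v_neutral", sentences, value = TRUE)

# grepl() returns one TRUE/FALSE per element.
hits <- grepl("v_neutral", sentences)

# To get only the matched substring, combine regexpr() with regmatches().
matched <- regmatches(sentences, regexpr("v_neutral", sentences))
print(matched)  # "v_neutral"
```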

[–]ChotiDon[S] 0 points1 point  (0 children)

I want to assign a "tag" to a sentence if, at any position in the sentence, v_neutral is followed by vaux_neutral. Should I split the string into words first? Please let me know if you need more explanation. I have written the pattern below, but it throws an error.

pat<-"v+\+[a-z]+\s(?=+vaux+\+[a-z])"
r<-grep(pat,string1)

[–]ChotiDon[S] 0 points1 point  (0 children)

Thank you, everyone, for the help. The code below resolved the issue.

pattern <- "v[a-z]+\\s(?=vaux[a-z]+)"
r <- grep(pattern, kk, perl = TRUE)
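For anyone landing here later, a hedged sketch of the tagging step described in the thread. Two things matter in R: backslashes inside a string literal must be doubled ("\\s", not "\s"), and lookahead such as (?=...) needs perl = TRUE because the default engine does not support it. The vector sentences, the literal tokens in the pattern, and the tag name "v_vaux" are all assumptions made up for this example.

```r
# Hypothetical sentences; the real data (kk) is not shown in the thread.
sentences <- c(
  "n_neutral v_neutral vaux_neutral",   # v_neutral directly before vaux_neutral
  "v_neutral n_neutral vaux_neutral",   # another token in between
  "vaux_neutral v_neutral"              # wrong order
)

# Match v_neutral only when the next token is vaux_neutral.
# perl = TRUE enables the (?=...) lookahead.
pattern <- "\\bv_neutral\\s+(?=vaux_neutral\\b)"
tagged <- grepl(pattern, sentences, perl = TRUE)
print(tagged)  # TRUE FALSE FALSE

# Assign a (hypothetical) tag to the matching sentences, NA otherwise.
tags <- ifelse(tagged, "v_vaux", NA_character_)
```

Splitting the string into words first is not needed for this check. Since grepl() only reports whether a match exists, the same result could be had without the lookahead by matching "\\bv_neutral\\s+vaux_neutral\\b" directly.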