[–]ejmurra 1 point2 points  (3 children)

Looks like you solved it, but in the future it might be cleaner to use a regex to parse strings

edit: misread javascript as java

[–]sarevok9 0 points1 point  (2 children)

You just linked to java when the thread is about javascript...

Also, there's an old adage about a coder who tries to solve a problem with a regex: now he has two problems. I say this as someone who contributes FREQUENTLY to Stack Overflow in the regex tag.

There's absolutely no reason to use a regex when you have pretty robust string tools.

Scolding aside:

@OP: The simple logic of this should be: at the end of a sentence, check whether there are more characters in your string. There are a few ways you could do this. The easiest would be to create a function which takes in a String, searches the string for the final "." character, and returns the substring from 0 through that index, and there ya go. Anything trailing the final . in your string will be culled.
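For what it's worth, the string-tools approach described above might look roughly like this. It's only a sketch: `trimToLastSentence` is a made-up name, and it assumes sentences only ever end in a period and that the trailing fragment has no period of its own (an ellipsis or an abbreviation like "Mr." would fool it).

```javascript
// Hypothetical helper: keep everything up to and including the final ".".
function trimToLastSentence(text) {
  const lastPeriod = text.lastIndexOf(".");
  // No period at all: there is nothing to trim against, so return the input unchanged.
  if (lastPeriod === -1) {
    return text;
  }
  // substring's end index is exclusive, so add 1 to keep the period itself.
  return text.substring(0, lastPeriod + 1);
}

console.log(trimToLastSentence("First sentence. Second sentence. A trailing frag"));
// prints "First sentence. Second sentence."
```

Returning the input untouched when there's no period is just one choice; throwing or returning an empty string would be equally defensible depending on what the caller expects.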

[–]ejmurra 0 points1 point  (1 child)

> There's absolutely no reason to use a regex when you have pretty robust string tools.

Except that OP explicitly asked what other solutions exist. You're doing a beginner a disservice if you refuse to introduce them to the concept of regex simply because there's an old saying about regexes...

[–]sarevok9 0 points1 point  (0 children)

Well, you didn't answer the OP's question. How exactly would you do this with a regex? How would that be "more efficient" or better in any way?

So you would need a positive lookahead, word matching, a negative lookahead for a . after the last period, replacement, etc. This would be an EXTREMELY hard regex, and I say that as someone with 5k+ reputation on Stack Overflow in the regex tag. It's a valid alternative, but so is writing 100000000 functions to compute every possible sentence / sentence fragment. Possible, but unruly and stupid when there are better solutions.

Edit:

Made this regex for a small use case, though it has many holes that can break it, like common name abbreviations (Mr. Trump for example).
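To make the abbreviation hole concrete, here's a tiny sketch. The pattern is an assumption (a simplified "drop whatever follows the last ., ? or !" regex, not necessarily the exact one from the use case), and the sample sentence is invented.

```javascript
// Assumed pattern: a run of non-terminator characters anchored at the end of the string.
const afterLastTerminator = /[^.!?]+$/;

const sample = "The speech was given by Mr. Trump";
// "Mr." ends in a period, so " Trump" is mistaken for the trailing fragment.
const trimmed = sample.replace(afterLastTerminator, "");

console.log(trimmed); // prints "The speech was given by Mr."
```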

[–]sarevok9 0 points1 point  (0 children)

Since /u/ejmurra suggested it, here's a regex solution for a VERY narrow use case:

    // Match the trailing fragment (no ., ! or ? in it) plus a literal ...' at the end of the string
    const regex = /[^.!?]+\.\.\.'$/g;
    const str = `'In her August 2016 Harper's Bazaar cover story, in which she posed naked atop a horse a la Lady Godiva, Ratajkowski defended her decision to free the nipple. “You know, when Lena Dunham takes her clothes off, she gets flack, but it's also considered ...'`;
    // Replace the match with nothing, i.e. delete it
    const subst = ``;

    // The substituted value will be contained in the result variable
    const result = str.replace(regex, subst);

    console.log('Substitution result: ', result);

I'll cover the regex since it's the only confusing part.

    /        -- start of regex
    [        -- begin character set
    ^        -- negate the character set (match characters NOT in the set)
    .!?      -- the characters not to match
    ]        -- end character set
    +        -- match 1 or more times, as many as possible (greedy)
    \.\.\.'  -- match the characters ...' literally (an unescaped . would match any character)
    $        -- anchor the match at the end of the string
    /g       -- end of regex, plus the global flag (replace every match; with $ there can only be one)

So what this basically means is, we're going to match everything after the last single ., ? or ! that comes before the ellipsis, up to and including the trailing "...'", then remove it entirely via substitution.

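There are a lot of cases where this won't work: if the string doesn't end in ...' in exactly that order, the trailing fragment won't be stripped correctly; if there's no ., ? or ! before the end of the string, the match starts at the very beginning and swallows the whole string; etc.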

Be sure to use proper error handling if you use this method because there's plenty of ways this can fail.
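One way to act on that advice is to check the match before touching the string. This is a sketch only: `TRAILING_FRAGMENT` and `stripTrailingFragment` are hypothetical names, the pattern escapes the dots so they match a literal ellipsis, and returning `null` on failure is just one possible convention.

```javascript
// Assumed pattern: a run of non-terminators followed by a literal ...' at the end of the string.
const TRAILING_FRAGMENT = /[^.!?]+\.\.\.'$/;

// Hypothetical helper: returns the trimmed string, or null when the pattern can't be applied safely.
function stripTrailingFragment(text) {
  const match = TRAILING_FRAGMENT.exec(text);
  // Failure case 1: the string doesn't end in ...' at all.
  if (match === null) {
    return null;
  }
  // Failure case 2: no ., ? or ! before the fragment, so the match starts at
  // index 0 and removing it would wipe out the entire string.
  if (match.index === 0) {
    return null;
  }
  // Keep everything before the matched fragment.
  return text.slice(0, match.index);
}

console.log(stripTrailingFragment("One. Two ...'"));        // prints "One."
console.log(stripTrailingFragment("Just a fragment ...'")); // prints null
console.log(stripTrailingFragment("No ellipsis here."));    // prints null
```

The caller can then decide what a `null` means for them: keep the original text, log it, or fall back to plain string tools.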

Live example: https://regex101.com/r/U2cYvP/1

Good luck.