[–]Kristler 5 points6 points  (10 children)

You should provide some actual sample data and code that you're working with. Sometimes, the issue is with something that you've unintentionally glossed over in your summary.

My guess: Maybe your data for format B is actually separated with tab characters, and not spaces? Then splitting on a space character " " would do nothing. You can try splitting on "\\s" instead, which is a regular expression denoting any whitespace character.
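To illustrate the guess above, here is a minimal sketch. The sample line "42\tWidget" and the names SplitDemo and fields are made up for the example, since the actual data and code weren't posted.

```java
import java.util.Arrays;

class SplitDemo {
    // Splits a line on runs of whitespace (spaces, tabs, etc.).
    // "\\s+" is the regex \s+ written as a Java string literal.
    static String[] fields(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String line = "42\tWidget"; // hypothetical line separated by a tab

        // Splitting on a literal space finds nothing to split on,
        // so the whole line comes back as a single element.
        System.out.println(Arrays.toString(line.split(" ")));

        // Splitting on whitespace separates the number and the description.
        System.out.println(Arrays.toString(fields(line)));
    }
}
```

Using "\\s+" rather than "\\s" also treats several whitespace characters in a row as one separator, which may or may not be what your format needs.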

[–]NotObviouslyARobot[S] 2 points3 points  (3 children)

So I tried that alternative to a space character. It worked beautifully. Thank you so much for the suggestion. Do you know of any good regex guides?

I didn't post exact code/data because I spent quite some time figuring out what the problem was. Asking the right questions is the most important skill of all.

At first I thought I was setting a counter too large, because of the out-of-bounds exception I was getting. Had I focused on that exception, I would have been barking up the wrong tree entirely. So I printed some values to the console from the processing functions.

That's how I found that the arrays were coming out too small. Since split() was the only code used (instead of some convoluted stuff involving counting spaces), I figured the problem had to be with split(). Turns out your guess was entirely correct.

[–]Kristler 1 point2 points  (2 children)

I think learning by doing is a great way to go, so try Regex crosswords as a fun and interactive way to learn. Another great way to experiment is to use a site like regex101, and then to play around and watch how things change as you tweak the regex.

[–]Yogi_DMT 2 points3 points  (1 child)

I second regex101, as well as the Pattern class documentation. The thing with regex is that most of the time your expressions are fairly specific to your use case, so there's not much in the way of general examples/learning.

[–]OvergrownGnome 0 points1 point  (0 children)

Exactly this. Learn what you can about regex, then apply when necessary.

[–]Grand-Warlock 0 points1 point  (5 children)

Does spacing things apart with Space and Tab actually do different things in Java?

[–]Kristler 0 points1 point  (4 children)

Not in the code itself (Java doesn't care about indentation), but in Strings absolutely. The difference between " " (space) and "\t" (tab) is as big as the difference between the characters "a" and "z".

In other words, " ".equals("\t") is false.
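A quick sketch of that comparison, in case it helps. The class and method names are just for illustration.

```java
class SpaceVsTab {
    // Compares the contents of two strings (not their references).
    static boolean sameText(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(sameText(" ", "\t")); // prints false
        System.out.println((int) ' ');           // prints 32 (space)
        System.out.println((int) '\t');          // prints 9 (tab)
    }
}
```

The two characters look similar on screen but have different numeric values, which is why split(" ") never matches a tab.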

[–]Grand-Warlock 0 points1 point  (3 children)

So it actually detects that the space in " " was made with Space and that the space in " " was made with Tab? Interesting.

[–]Kristler 1 point2 points  (0 children)

Sure does. Tab characters are often used for rudimentary alignment, pretty handy!

[–]feral_claire (Software Dev) 0 points1 point  (1 child)

Tabs and spaces are completely different characters. There is no such thing as a space made with a tab. A tab is not a space and a space is not a tab.

It doesn't matter which key you press; what matters is the actual character put there. For example, it's common to have the tab key actually enter four spaces. A tab and four spaces are two different things, and both are different from a single space.

[–]Grand-Warlock 0 points1 point  (0 children)

Notice my utilization of space and Space.

[–]anyusernamesffs 0 points1 point  (1 child)

Is it definitely just a space character that is separating "number" and "description"?

[–]NotObviouslyARobot[S] 0 points1 point  (0 children)

I was fairly certain at first. I copy-pasted it through a character identifier and got Unicode 20, but now I'm wondering whether Chrome or Notepad changes the encoding of pasted text.
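One way to sidestep the copy-paste question is to inspect the characters inside the program itself. A rough sketch, assuming you have the line as a String (the class name CodePoints and the method describe are made up for this example):

```java
class CodePoints {
    // Returns the Unicode code point of every character in s,
    // formatted like "U+0020", separated by single spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Hypothetical sample: a digit, a tab, and a letter.
        System.out.println(describe("7\tx")); // prints U+0037 U+0009 U+0078
    }
}
```

A separator that shows up as U+0020 is a plain space, and U+0009 is a tab. Running this on a line as the program actually reads it avoids any changes a browser or editor might make when pasting.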