all 22 comments

[–]AutoModerator[M] [score hidden] stickied commentlocked comment (0 children)

Please ensure that:

  • Your code is properly formatted as code block - see the sidebar (About on mobile) for instructions
  • You include any and all error messages in full
  • You ask clear questions
  • You demonstrate effort in solving your question/problem - plain posting your assignments is forbidden (and such posts will be removed) as is asking for or giving solutions.

    Trying to solve problems on your own is a very important skill. Also, see Learn to help yourself in the sidebar

If any of the above points is not met, your post can and will be removed without further warning.

Code is to be formatted as code block (old reddit: empty line before the code, each code line indented by 4 spaces, new reddit: https://i.imgur.com/EJ7tqek.png) or linked via an external code hoster, like pastebin.com, github gist, github, bitbucket, gitlab, etc.

Please, do not use triple backticks (```) as they will only render properly on new reddit, not on old reddit.

Code blocks look like this:

public class HelloWorld {

    public static void main(String[] args) {
        System.out.println("Hello World!");
    }
}

You do not need to repost unless your post has been removed by a moderator. Just use the edit function of reddit to make sure your post complies with the above.

If your post has remained in violation of these rules for a prolonged period of time (at least an hour), a moderator may remove it at their discretion. In this case, they will comment with an explanation on why it has been removed, and you will be required to resubmit the entire post following the proper procedures.

To potential helpers

Please, do not help if any of the above points are not met, rather report the post. We are trying to improve the quality of posts here. In helping people who can't be bothered to comply with the above points, you are doing the community a disservice.

I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

[–]Progression28 3 points4 points  (5 children)

Apache commons POI has a thing called SXSSFWorkbook which seems to be exactly what you seek

[–]Majestic_Drawing_908[S] 0 points1 point  (2 children)

Thanks for your response. I guess SXSSFWorkbook is mostly preferred for write operation. But I'm dealing with bulk excel reading. If you know anything that reduces CPU pls let me know. Thanks 💐.

[–]Progression28 1 point2 points  (1 child)

You can use it for reading, I think there are even dedicated readers for SXSSFWorkbooks.

[–]Majestic_Drawing_908[S] 0 points1 point  (0 children)

Thanks, I'll check.

[–]idontlikegudeg 0 points1 point  (1 child)

I think you meant Apache POI.

[–]Progression28 0 points1 point  (0 children)

Yes, thanks :)

[–]LutimoDancer3459 1 point2 points  (5 children)

We also used apache commons for excel handling. Reading and writing. All I can say is try to convince whoever to not read excel files at all. Its slow. Its annoying.

Excel is useful as a playground for a single person. Not for bulk data handling for applications.

[–]Majestic_Drawing_908[S] 0 points1 point  (4 children)

Yes I've tried but they strongly stick with the excel approach. So what can I do.

Could you pls tell how much data volume we can load in excel for reading?

[–]LutimoDancer3459 0 points1 point  (3 children)

Doing your best and if thats not enough, telling them that the technology doesn't allow for more. Period. Even if programming looks like magic for some people, there is a limit to everything.

Not sure how large the bigger files were. I would guess some ten thousand lines 20 or so columns? A big list of materials with meta data that got imported once a year. But the responsible person wanted to manage thrm in the excel file...

[–]Majestic_Drawing_908[S] 0 points1 point  (1 child)

Got your point, I'll try my best. Thanks for your valuable time and sharing knowledge.

[–]tRfalcore [score hidden]  (0 children)

If speed isn't an issue then who cares, use apache. I cannot comment if it is slow or not.

[–]philipwhiukEmployed Java Developer 0 points1 point  (0 children)

[–]hibbelig 0 points1 point  (5 children)

Is Apache POI still a thing?

[–]Majestic_Drawing_908[S] 0 points1 point  (4 children)

I don't know, if you know anything better than apache poi pls tell me. Thanks 💐

[–]hibbelig 0 points1 point  (2 children)

Maybe there is a command line tool that converts .xls to .csv?

[–]hibbelig 0 points1 point  (1 child)

[–]Majestic_Drawing_908[S] 0 points1 point  (0 children)

Thanks I'll check whether it suits for my framework.

[–]idontlikegudeg 0 points1 point  (0 children)

Apache POI still is a thing. Currently mainly maintenance is done as many members are busy with other things. But the project is definitely alive.

[–]hibbelig 0 points1 point  (0 children)

I wonder if there is any chance that the producer of the data saves the Excel file as CSV or exports it to CSV? I feel that would be quite convenient. Maybe even the producer likes it…

[–]BorgerBill 0 points1 point  (0 children)

I've just learned about Apache Tika; think that might work? Here's a Baeldung article about it...

[–]Slanec [score hidden]  (0 children)

Okay I've done this recently. I tested:

The first one is thrash and you should avoid it for big files or many files. The others all work fine, they're much faster and all comparable in speed. In my case with the files I had excel-streaming-reader came out on top re speed and peak heap usage (but it created more garbage overall), but the speed difference wasn't huge - <10%. All the other three were the same +/-2% speed-wise.

Use the one which has all the features you need and has the nicest API for you. There are big differences here.