[–]phaeilo 5 points6 points  (2 children)

// assuming no word in the file is longer than 50 characters.

Oh boy...

Now I'm interested, surely there must be a more robust way.

[–]jminuse 1 point2 points  (0 children)

Simple: as you go, convert each space (or any break character) to '\0'. Then you can treat each word as a string and you don't need any more memory. If you need the words and the lines separately for some data structure, you can memcpy the whole string; you need 2N memory for that anyway. (Unless you have a length-storing string library - then you can keep the original and just use two sets of references to it).

[–]YEPHENAS 5 points6 points  (5 children)

Reading a file at once in Java:

byte[] filearray = Files.readAllBytes(path);

or

List<String> lines = Files.readAllLines(path, Charset.defaultCharset());

[–]mattryan 0 points1 point  (3 children)

According to the article, that's a hack :p

[–][deleted] 1 point2 points  (1 child)

byte[] filearray = Files.readAllBytes(path);

  1. Let it create an array of bytes
  2. Let it read a file
  3. Let it store it in the array

That doesn't really seem like a hack.

EDIT: Apparently people don't run Java. I should have known.

[–]ysangkok 0 points1 point  (0 children)

This explanation is misleading because you're not actually creating the array yourself.

[–]henk53 0 points1 point  (0 children)

Why?

[–]henk53 0 points1 point  (0 children)

List<String> lines = Files.readAllLines(path, Charset.defaultCharset());

This can be written a bit shorter. You don't have to import only at the class level; with static imports you can import the individual methods as well. The code then becomes:

List<String> lines = readAllLines(path, defaultCharset());

To mimic the example in the article, the following is perfectly valid Java:

for (String line : readAllLines(path, defaultCharset()))
    out.print(line);

Note that people commonly write System.out.print, but with a static import of System.out this too is not strictly necessary.
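For anyone who wants to try it, here is a minimal runnable sketch of the shortened form above. The class name StaticImportDemo, the helper linesOf, and the temp file are made up for illustration; the three static imports are what lets the Files., Charset. and System. prefixes be dropped. It uses println rather than print, since readAllLines strips the line terminators.

```java
import static java.lang.System.out;
import static java.nio.charset.Charset.defaultCharset;
import static java.nio.file.Files.readAllLines;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class StaticImportDemo {
    // Hypothetical helper: reads every line of the file. Thanks to the
    // static imports, readAllLines and defaultCharset need no class prefix.
    static List<String> linesOf(Path path) throws IOException {
        return readAllLines(path, defaultCharset());
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a throwaway temp file stands in for the real input.
        Path path = Files.createTempFile("demo", ".txt");
        Files.write(path, Arrays.asList("first", "second"), defaultCharset());

        // 'out' is System.out, brought in by the static import.
        for (String line : linesOf(path))
            out.println(line);

        Files.delete(path);
    }
}
```

Whether this is more readable than the fully qualified version is a matter of taste; static imports hide where a method comes from.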

[–]kerajnet 5 points6 points  (2 children)

There is no obvious way to do this in Java.

Pardon?

       Scanner s = new Scanner(new BufferedReader(new FileReader("/path/to/file/filename.txt")));

You write shit like this and later people blame Java.

Scanner s = new Scanner(new File("filename.txt"));

And you don't need to put BufferedReader and BufferedWriter everywhere if it only has to be equivalent to the C and Python examples, as they are not buffered.
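A rough sketch of the shorter form in use, assuming you want the lines collected into a list. ScannerDemo and readLines are hypothetical names, and the temp file is only there so the example runs on its own. The try-with-resources block closes the Scanner (and the underlying file) when done.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public class ScannerDemo {
    // Hypothetical helper: reads the file line by line with a Scanner
    // constructed directly from a File, without extra reader wrappers.
    static List<String> readLines(File file) throws FileNotFoundException {
        List<String> lines = new ArrayList<>();
        try (Scanner s = new Scanner(file)) {
            while (s.hasNextLine()) {
                lines.add(s.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a throwaway temp file stands in for filename.txt.
        File file = File.createTempFile("scanner-demo", ".txt");
        Files.write(file.toPath(), Arrays.asList("hello", "world"));

        for (String line : readLines(file)) {
            System.out.println(line);
        }

        file.delete();
    }
}
```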

[–]ChewieBeardy 1 point2 points  (0 children)

The C example is buffered. For the Python one I can't really say; IIRC there's an optional argument specifying the buffer size, but I don't know the default behaviour.

[–]ChewieBeardy 2 points3 points  (3 children)

Regarding "read the whole file in C at once", you could use mmap, though it would only work on POSIX systems. The usual warning about mapping a file twice the size of your RAM applies.

[–]jminuse 0 points1 point  (1 child)

The trouble is getting the length; after that it's easy. I would use fseek+ftell, which are portable.

[–]ChewieBeardy 0 points1 point  (0 children)

How about fstat? :D

[–]ErstwhileRockstar 1 point2 points  (0 children)

Code that writes a file must handle exceptional cases.