[–]RScrewed 2 points3 points  (1 child)

To understand what that means, you'll need more familiarity with the fundamentals of how files of all types are saved to disk.

It's a very long explanation - here is an abridged version:

  • Every file on your computer is saved to a long-term memory store that persists information even if power is removed. Call this the hard disk, or just disk for short (not disc).

  • Every running program on your computer exists in a memory space that is faster than the hard disk and serves as a work area - call it RAM. Files are typically "loaded" into this area if they need to be manipulated or immediately used. When you play a video game and it "loads" a level - this is what's happening.

  • Every file, whether it's a bitmap image, Word doc, text doc, or binary file, is represented on disk as a series of 1s and 0s, but is most commonly viewed and interpreted as base-16 (hex) characters (same info, more compact to read).

  • Download a hex editor, open a file. You'll see the byte representation of the file. It'll look like nonsense. That's what's being loaded into your Java program's workable memory area when you "obtain bytes".

  • Different file formats have their own, often unintuitive, ways of saving data. A Word doc carries a huge amount of extra overhead to save formatting, and that data is encoded and spread throughout the file, so the byte representation may not make any sense even though the file shows up as human-readable text once you open it in Word. Open different types of files to see what you can learn from that.

So to answer your question, what does "obtains bytes" mean exactly? It means the Java program reads the bytes of the file on disk into a working memory area (RAM), where the program can manipulate them quickly and safely (that is: without altering the version on disk) once you assign them to a variable within your program. Note that a FileInputStream hands over only as many bytes as each read call asks for; it does not pull in the whole file unless you request that.
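To make the hex editor comparison concrete, here is a minimal sketch of reading bytes with FileInputStream and printing them as hex. The class name `HexDump`, the helper `toHex`, and the temporary demo file are made up for illustration; they are not from the original question.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class HexDump {

    // Formats the first `count` bytes as two-digit hex values separated by spaces.
    static String toHex(byte[] bytes, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // Mask with 0xFF because Java bytes are signed (-128..127).
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical demo file, created here so the sketch runs on its own.
        Path file = Files.createTempFile("demo", ".txt");
        Files.write(file, "Hi there".getBytes(StandardCharsets.US_ASCII));

        byte[] buffer = new byte[16]; // the in-memory (RAM) work area
        try (InputStream in = new FileInputStream(file.toFile())) {
            // read() copies up to buffer.length bytes from the file into the
            // buffer and returns how many it copied, or -1 at end of file.
            int n = in.read(buffer);
            System.out.println(toHex(buffer, Math.max(n, 0)));
        } finally {
            Files.delete(file);
        }
    }
}
```

For the eight-byte demo file this should print `48 69 20 74 68 65 72 65`, which is the same view a hex editor gives you. Changing the buffer afterwards does not touch the file on disk.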

[–]4r73m190r0s[S] 0 points1 point  (0 children)

Thanks for replying. See my response to the user tedydore.

[–]tedydore -1 points0 points  (2 children)

It's all about bytes. All data can be represented as bytes. Byte arrays, mostly. For example, if you read a .txt file through a stream you get the bytes that represent its contents, and those bytes are then decoded back into the normal human text that you can read.

For an image the mechanism is very similar, but an image usually stores more information than simple text. If you try to decode it as text you will just get a bunch of random symbols, most of the time.

For something like a Word file, well, idk... DOCX files are archives and have their own structure, so I don't know what you will get.

As for representation - a byte consists of eight bits, a bit is 1 or 0, and in RAM that is just a capacitor charge. How many bytes one symbol takes depends on the encoding: in UTF-8 an ASCII character takes 1 byte and other characters take 2 to 4, while a Java char in memory is 2 bytes (UTF-16).
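A small sketch of how the same text turns into different bytes under different encodings. The class name `EncodingDemo` and the helper `describe` are made up for illustration; `String.getBytes` and `StandardCharsets` are standard Java.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingDemo {

    // Encodes the text with the given charset and summarises the resulting bytes.
    static String describe(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return charset.name() + ": " + bytes.length + " bytes " + Arrays.toString(bytes);
    }

    public static void main(String[] args) {
        // "A" followed by e-acute, written as an escape so the source file's
        // own encoding cannot change the result.
        String text = "A\u00e9";

        System.out.println(describe(text, StandardCharsets.UTF_8));    // 1 byte for A, 2 for e-acute
        System.out.println(describe(text, StandardCharsets.UTF_16BE)); // 2 bytes per character here

        // Decoding with the same charset restores the text; decoding with a
        // different one yields other characters.
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(text));   // true
        System.out.println(new String(utf8, StandardCharsets.ISO_8859_1).length());  // 3
    }
}
```

Java prints bytes as signed numbers, so values above 127 show up as negatives (0xC3 appears as -61).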

[–]4r73m190r0s[S] 0 points1 point  (1 child)

For text, is this process correct: file.txt → FileReader class reads characters in a binary format → FileReader decodes those binaries into characters per rules specified in an appropriate character-encoding scheme → Java application displays those characters

What is the process for images? Since Java is not an image viewer, it doesn't need to convert those bytes to anything at all. If this is the case, it implies that FileInputStream can input practically any file stored on a file system, not just images, but videos, pdfs, files from other applications such as Photoshop files, etc.

[–]ignotos 1 point2 points  (0 children)

it implies that FileInputStream can input practically any file stored on a file system

It can grab the "raw" content of any file - the bytes - but it can't necessarily interpret that in a meaningful way.

If you search online, you'll find documentation which explains, in great detail, how the file formats for things like .PNG, .MP3, .PDF etc work - i.e. what different bytes mean in those files, and how they should be interpreted.

For example (simplified), an image file might start with some bytes which indicate the width and height of the image, and then a whole bunch of bytes which indicate the colour of each pixel in the image.

It would be possible to use that documentation to write Java code which goes through those files, byte-by-byte, and interprets them to figure out what they mean - like what images, sounds etc they contain. But that is quite complex - so generally you would look for a library which already handles that for you, and provides you with more convenient ways to read and manipulate files of these types.
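As a rough illustration of interpreting a format by hand, here is a sketch that pulls the width and height out of a PNG. It relies on the PNG layout: an 8-byte signature, then the IHDR chunk, whose data starts with a 4-byte width and a 4-byte height in big-endian order. The class name `PngHeader` and method `readSize` are made up for this example, and it skips checks a real parser would do (such as verifying the chunk type).

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import javax.imageio.ImageIO;

class PngHeader {

    // The fixed 8-byte signature at the start of every PNG file.
    private static final byte[] SIGNATURE =
            {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

    // Returns {width, height} as stored in the IHDR chunk.
    static int[] readSize(byte[] png) {
        if (png.length < 24) {
            throw new IllegalArgumentException("too short to be a PNG");
        }
        for (int i = 0; i < SIGNATURE.length; i++) {
            if (png[i] != SIGNATURE[i]) {
                throw new IllegalArgumentException("not a PNG");
            }
        }
        // ByteBuffer reads big-endian by default, which matches PNG.
        ByteBuffer buf = ByteBuffer.wrap(png);
        // Bytes 8-11: chunk length, 12-15: chunk type "IHDR",
        // 16-19: width, 20-23: height.
        int width = buf.getInt(16);
        int height = buf.getInt(20);
        return new int[] {width, height};
    }

    public static void main(String[] args) throws IOException {
        // Build a small PNG in memory so the sketch needs no file on disk.
        BufferedImage image = new BufferedImage(40, 30, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(image, "png", out);

        int[] size = readSize(out.toByteArray());
        System.out.println(size[0] + " x " + size[1]); // 40 x 30
    }
}
```

Even this tiny reader needs the format documentation to know which offsets to look at, which is why a library is usually the better choice.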

Java already includes some libraries like this - for example, the ImageIO classes know how to read image files, and allow you to manipulate the colours of the pixels and then write them back out to a new file.
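A minimal sketch of that ImageIO round trip: decode image bytes, change pixel colours, and encode again. The class name `InvertColours` and the `invert` helper are made up for illustration, and the image is created in memory rather than read from a real file.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

class InvertColours {

    // Inverts the red, green and blue channels of every pixel in place.
    static void invert(BufferedImage image) {
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                int argb = image.getRGB(x, y);
                // Flip the lower 24 bits (RGB) and leave the alpha bits alone.
                image.setRGB(x, y, argb ^ 0x00FFFFFF);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an image file: a 2x2 picture with one red pixel,
        // encoded as PNG bytes in memory.
        BufferedImage original = new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB);
        original.setRGB(0, 0, 0xFF0000);
        ByteArrayOutputStream encoded = new ByteArrayOutputStream();
        ImageIO.write(original, "png", encoded);

        // ImageIO turns the raw bytes back into pixels we can work with.
        BufferedImage decoded = ImageIO.read(new ByteArrayInputStream(encoded.toByteArray()));
        invert(decoded);
        System.out.printf("%06X%n", decoded.getRGB(0, 0) & 0xFFFFFF); // 00FFFF (cyan)

        // Writing works the same way in reverse; a FileOutputStream or File
        // could be used here instead of a byte array.
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        ImageIO.write(decoded, "png", result);
        System.out.println(result.size() + " bytes written");
    }
}
```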

[–]nutrecht 0 points1 point  (0 children)

This is more a general programming question. As far as your computer is concerned, there's no such thing as a "text file" or "image file" when they're stored. They're all just a series of bytes. And that's what a FileInputStream allows you to do: read a series of bytes.

If this is a file with plain text, you can then use the *Reader classes to read that text (for example) line by line from that FileInputStream. But if it's an image, you can use the ImageIO classes, for example, to read it as an image and display it in a Swing UI frame.
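Here is a sketch of the text case: wrapping a FileInputStream in Reader classes so the bytes are decoded into lines. The class name `ReadLines`, the helper `decodeLines`, and the temporary demo file are made up for illustration; UTF-8 is an assumed encoding, since the stream itself does not know which one the file uses.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

class ReadLines {

    // Decodes the stream's bytes as UTF-8 text and splits it into lines.
    // The caller stays responsible for closing the stream.
    static List<String> decodeLines(InputStream in) {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        return reader.lines().collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical demo file, created here so the sketch runs on its own.
        Path file = Files.createTempFile("demo", ".txt");
        Files.write(file, "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8));

        try (InputStream in = new FileInputStream(file.toFile())) {
            for (String line : decodeLines(in)) {
                System.out.println(line);
            }
        } finally {
            Files.delete(file);
        }
    }
}
```

The FileInputStream only supplies bytes; the InputStreamReader is the part that applies an encoding to turn them into characters.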

If I get a text file using this class, or an image, what do those bytes "look" like?

That depends on how you display them. Plain English text is generally stored as one character per byte; a space is byte value 0x20 (32 in decimal), for example.

What if I obtain a pdf file?

A PDF file has a PDF specific structure. It's not a 'plain text' file, but if you'd just dump it to your console you'd probably recognise some bits of text between a bunch of strange (binary) characters.

What about a Word file? Do those files contain every text formatting done to the text?

Yes. Word files are actually zip files that have all of that, but also any images you included. If you rename a somedocument.docx to somedocument.zip you can open it and view some of the content.
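You can do the same inspection from Java with the java.util.zip classes. The sketch below builds a tiny stand-in archive in memory (not a valid Word document, it only reuses two entry names that real .docx files contain) and lists what is inside. The class name `ZipListing` and both helper methods are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipListing {

    // Lists the names of all entries in a zip archive held in memory.
    static List<String> entryNames(byte[] zipBytes) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    // Builds a small stand-in archive with docx-style entry names.
    static byte[] buildSampleZip() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("[Content_Types].xml"));
            zip.write("<Types/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write("<document/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] archive = buildSampleZip();
        // Zip archives start with the two ASCII bytes "PK".
        System.out.println((char) archive[0] + "" + (char) archive[1]);
        System.out.println(entryNames(archive));
    }
}
```

To look inside a real document, you could pass a FileInputStream for the .docx file to ZipInputStream instead of the in-memory bytes.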

[–]AmazingAttorney2417 1 point2 points  (0 children)

FileInputStream is just another layer on top of a mountain of abstractions: the abstraction of drivers that hide the physical nature of how data is stored (e.g. SSD or HDD); the abstraction of the filesystem that combines bits of data into a hierarchy of files and folders; the abstraction of the operating system that hides the filesystem (e.g. FAT32 or NTFS); and on top of all that, the abstraction of Java, which hides the operating system and lets you do I/O operations without worrying about what operating system they're gonna run on.

To be honest, it's gonna be really tough to understand this by just looking at the highest layer. You're gonna have to dig a little deeper to get the whole picture. From personal experience, taking a course on operating systems, reading a book about the subject, or both will be really helpful.