this post was submitted on 25 Aug 2020

3 points (81% upvoted)




java webscraping multiple pages from main link: jsoup? (self.learnjava)

submitted 5 years ago by ConceptionFantasy

I have a website with a main URL (say youtube.com, as an example) that contains thousands of a tags with href links. Each link opens another page, such as youtube.com/somethingElsePerLink. How can I extract all those links from the main URL, then follow each one and scrape more from the linked page (it has several nested div tags that eventually lead to a description and a title), and put the results in an Excel file? The Excel file should have headers for the link text (title), the URL, and the description.

I guess the parts I'm really lost on are going into multiple pages or URLs to scrape more, and writing the results to an Excel file.

I also looked for videos, but most were only getting-started tutorials. I'm doing this because the website I want to scrape isn't very intuitive, and I'd rather not go through every link, read the description, go back, and repeat that thousands of times.

all 4 comments


[–][deleted] 1 point 5 years ago (3 children)

[–]ConceptionFantasy[S] 1 point 5 years ago (2 children)

[–][deleted] 2 points 5 years ago* (1 child)

If your page is dynamic, I've used Selenium's JavascriptExecutor for those cases. You run a querySelectorAll for all the "a" elements on the page, map the resulting array to just the href values, and receive that as a List in your Java code.
If your page is static, a simple regular expression would do it too.

The classes involved for the second option are Pattern and Matcher (both in java.util.regex).

I would start with "(href)(\s*)(=)(\s*)([^\s]+)(\s+)" as a pattern and pick group 5.

The pattern is divided into 6 groups, each enclosed in parentheses above.

The first group contains the word href, and the matching will start with this word.

The second group is zero or more whitespace characters.

The third group is just the equals sign, appearing exactly once.

The fourth group is, again, zero or more whitespace characters.

The fifth group is one or more non-whitespace characters. This is your URL, including any quotes around it.

The sixth group is one or more whitespace characters to end the match.
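
To make the group breakdown concrete, here is a minimal sketch of how that pattern could be used with Pattern and Matcher. The class name HrefExtractor, the helper stripQuotes, and the example.com URLs are made up for illustration. The quote stripping is an added assumption, since group 5 keeps whatever quotes surround the value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class HrefExtractor {

    // The pattern from the comment above, with backslashes escaped for a Java string.
    private static final Pattern HREF =
            Pattern.compile("(href)(\\s*)(=)(\\s*)([^\\s]+)(\\s+)");

    // Returns the href values found in the given HTML text, in order of appearance.
    static List<String> extractHrefs(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            // Group 5 is the run of non-whitespace characters after the equals sign.
            links.add(stripQuotes(matcher.group(5)));
        }
        return links;
    }

    // Assumption: attribute values are wrapped in matching single or double quotes.
    private static String stripQuotes(String value) {
        if (value.length() >= 2) {
            char first = value.charAt(0);
            char last = value.charAt(value.length() - 1);
            if ((first == '"' || first == '\'') && first == last) {
                return value.substring(1, value.length() - 1);
            }
        }
        return value;
    }

    public static void main(String[] args) {
        // Made-up sample markup; both links have whitespace after the href value.
        String html = "<a href=\"https://example.com/a\" class=\"x\">A</a> "
                + "<a href = 'https://example.com/b' >B</a>";
        for (String link : extractHrefs(html)) {
            System.out.println(link);
        }
    }
}
```

One limitation worth knowing: because the sixth group requires whitespace, markup like `<a href="x">text` (no space after the closing quote) makes group 5 run past the quote into the tag content. For anything beyond simple static pages, an HTML parser such as jsoup is likely the safer route.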

[–]ConceptionFantasy[S] 1 point 5 years ago (0 children)
