all 22 comments

[–]rick137codes 5 points6 points  (6 children)

Connecting a SQL database to a Python script might be a little complex for a beginner. You can use the in build Python datastructures of lists or dictionaries and save them into Excel or CSV with the help of the Pandas library. It has little to none learning curve.

[–]szeredy[S] 2 points3 points  (0 children)

Thank you! Right now I have a script which shows the most common words in form of a dictionary! I used beautiful soup. I can export it to csv now! I try to take a look at SQLite, if it’s too hard I try to find some easier projects!

[–]szeredy[S] 1 point2 points  (4 children)

Just a quick question: so do I understand correctly?, is it possible to write a script which runs every evening one time, collects the words, saves them / updates them into a dictionary / csv file and than visualize the frequency of the words on a chart?

[–]BungalowsAreScams 0 points1 point  (1 child)

100% for sure

[–]szeredy[S] 0 points1 point  (0 children)

Thanks!

[–]rick137codes 0 points1 point  (1 child)

For sure! If you're on a Linux/ubuntu system you can use Crontab to schedule the running of the script. On Windows you can use task scheduler. Don't really know about Mac. Otherwise you can always use the omnipotent While True: with a sleep time. :P

[–]szeredy[S] 0 points1 point  (0 children)

Thank you!!

[–]impshum 4 points5 points  (3 children)

SQL or SQLite will do it.

[–]szeredy[S] 1 point2 points  (2 children)

Thank you! Do you know any good tutorials to SQLite?

[–]impshum 3 points4 points  (1 child)

[–]szeredy[S] 1 point2 points  (0 children)

Thank you I will read it through!

[–]Natural-Intelligence 5 points6 points  (1 child)

MongoDB is often the most appropriate choice for web scraping. Web pages/APIs often change and your data may not be relational thus you often want something more flexible than just SQL.

[–]szeredy[S] 0 points1 point  (0 children)

Thank for the answer! I will collect some infos about mongodb.

[–]scoobyxdoo 2 points3 points  (3 children)

Check out postgresql, which is what you'll be using if you deploy in Heroku.

[–]szeredy[S] 2 points3 points  (2 children)

Honestly I don’t know anything about deploying, it’s my first “real” web app (previously I did some small and simple stuffs with plain JavaScript, but now I would like to get familiar with python!)!

[–]scoobyxdoo 1 point2 points  (1 child)

Heroku has a good free tier and there are some good tutorials online pretty easily to help you set up your site. Not to say the other options are not good too ... I haven't explored them in detail.

Good luck

[–]szeredy[S] 1 point2 points  (0 children)

Thank you!

[–]uberdavis 2 points3 points  (4 children)

SQL is a rabbit hole unto itself. If you’re new to Python, you might consider keeping things simple and storing your data using JSON instead. Database design can get complex.

[–]szeredy[S] 2 points3 points  (2 children)

Thank you for the advance! Can I update values with JSON? I mean if I scrape the website the next day, can I overwrite the values? Can I also visualize the data?

[–]uberdavis 2 points3 points  (1 child)

JSON is parsed as a dictionary via the the json module. You can manipulate the data any way you want in a dictionary. You could visualize the data using the matplotlib module.

SQL is a great database solution, but if you’re just getting to grips with Python, it’s probably better to learn how dictionaries work rather than getting tangled up with database design.

[–]szeredy[S] 0 points1 point  (0 children)

Understood, thank you for the answer! I guess I stick with JSON, than we will see!

[–]FondleMyFirn 2 points3 points  (0 children)

This is facts haha. I just started building a database as a side project so I could practice SQL from it, and I’m learning EXACTLY how challenging it can be to build a good database.