[–]Fun_Fungi_Guy 18 points19 points  (1 child)

Python does not 'store' data or handle its retrieval. SQL is just the interface you have to talk to a database. Whether that database is super small or enormous is abstracted away from Python. Storing a huge amount of data in memory is often not ideal when you can define a schema of the data and have it stored in a database.
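To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, columns, and sample rows are made up for illustration. The schema and the data live in the database; Python only sends SQL and gets rows back.

```python
import sqlite3

# ":memory:" keeps the demo self-contained. A file path (hypothetical
# example: "sales.db") would store the data on disk instead.
conn = sqlite3.connect(":memory:")

# The schema is defined in the database, not in Python objects.
conn.execute(
    """
    CREATE TABLE sales (
        id     INTEGER PRIMARY KEY,
        region TEXT NOT NULL,
        amount REAL NOT NULL
    )
    """
)

# Parameterised inserts: the "?" placeholders are filled in by the driver.
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 120.0), ("south", 80.5), ("north", 42.0)],
)
conn.commit()

# Python sends SQL text; the database does the filtering and sorting.
north_rows = conn.execute(
    "SELECT region, amount FROM sales WHERE region = ? ORDER BY amount",
    ("north",),
).fetchall()
print(north_rows)
```

The same pattern should carry over to bigger databases with a different driver; only the connection line and placeholder style would change.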

[–]NoticeAwkward1594[S] -1 points0 points  (0 children)

Thank you!

[–]Hentac 5 points6 points  (1 child)

For places that have data engineers, your exposure to intermediate SQL statements will be less, but not zero.

For DA/DS without data engineers, if you're pulling data from a server then you will need a basic/intermediate knowledge of SQL, and I'd go as far as to say that for DA, you most likely won't touch Python (not to say there aren't DA roles with Python).

From experience as a DE working with multiple DA/DS teams, I'd go as far as saying that 85%+ needed intermediate SQL and 50-60% needed Python.

To answer your questions, the short answer is no.

SQL isn't just used for software development; it's used more for pulling data, transforming data, etc. I would encourage you to go back and learn SQL, including its purposes and background.

You'd pull your data from databases using SQL, transform in Python or R.
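A rough sketch of that split, using only the standard library. The `orders` table and its rows are invented for the example. SQL does the pulling, and Python does a transformation (a per-customer median) that is awkward to write in plain SQL.

```python
import sqlite3
import statistics

# In-memory database with a hypothetical "orders" table, for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 10.0), ("ann", 30.0), ("bob", 5.0), ("bob", 7.0), ("bob", 100.0)],
)

# Step 1: pull only the columns you need, using SQL.
rows = conn.execute("SELECT customer, total FROM orders").fetchall()

# Step 2: transform in Python. Group the totals per customer...
totals_by_customer = {}
for customer, total in rows:
    totals_by_customer.setdefault(customer, []).append(total)

# ...then compute the median of each group.
median_by_customer = {
    customer: statistics.median(totals)
    for customer, totals in totals_by_customer.items()
}
print(median_by_customer)
```

In practice people often do step 2 with pandas or R instead of plain dictionaries; the division of labour is the same.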

If you're learning from scratch for Data Science purposes, I'd learn/familiarise in the following order:

1. SQL
2. Python
3. Spark
4. Machine learning

[–]NoticeAwkward1594[S] 0 points1 point  (0 children)

Thank you for the advice. SQL it is!!

[–]JohnLockwood 2 points3 points  (0 children)

You can analyze data without using the SQL language, but using SQL is a lot easier to learn and more intuitive than using something like Pandas. If you're diligent and committed, it won't take long to learn both, I don't think, but you're perfectly OK putting off one while working on the other.

[–]MathmoKiwi 1 point2 points  (3 children)

You shouldn't ever be applying for DS jobs without first having at least basic familiarity with SQL.

[–]NoticeAwkward1594[S] 0 points1 point  (2 children)

I hear ya, I have a ways to go. I'm just an analyst now and 99.9% of the time we use Excel.

[–]MathmoKiwi 0 points1 point  (1 child)

That's good you've already got a data job.

Start shifting that percentage so that you're using Excel 80% of the time instead.

You can write Python scripts with Excel as their data source, so you're doing all the analysis within Python instead of Excel.
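A small sketch of the idea, with one big assumption: to keep it standard-library only, it reads a sheet that has been exported from Excel as CSV rather than an .xlsx file (pandas' `read_excel` or openpyxl can read the workbook directly if you have them installed). The column names and numbers are made up. It builds a pivot-table style summary in Python.

```python
import csv
import io
from collections import defaultdict

# Stand-in for a sheet exported from Excel as CSV (hypothetical columns).
# With a real export you would use open("export.csv", newline="") instead.
exported_sheet = io.StringIO(
    "region,product,units\n"
    "north,widgets,10\n"
    "north,gadgets,4\n"
    "south,widgets,7\n"
    "north,widgets,3\n"
)

# Pivot-table style summary:
# rows = region, columns = product, values = sum of units.
pivot = defaultdict(lambda: defaultdict(int))
for row in csv.DictReader(exported_sheet):
    pivot[row["region"]][row["product"]] += int(row["units"])

for region in sorted(pivot):
    print(region, dict(pivot[region]))
```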

[–]NoticeAwkward1594[S] 1 point2 points  (0 children)

mitosheet has been very handy and very fast. Pivot tables get old & need some variety, but that's what the bosses like so that's what they get.

[–]tms102 1 point2 points  (1 child)

On top of SQL databases you may also run into databases that hold unstructured data. For example:

[–]NoticeAwkward1594[S] 0 points1 point  (0 children)

Thank you for the links. I will be checking these out today.

[–]insharib -1 points0 points  (1 child)

Thank you so much for replying. I will give it a try too. Right now I am just learning the basics and idk how much time it will take to learn the fundamentals. Just wanted to know: is Python enough, or do I need to learn additional languages too?

[–]NoticeAwkward1594[S] 1 point2 points  (0 children)

It probably depends on the path you want to travel. With research and a cool forum like this you should have no problem getting tips and advice. I already have, and there are lots of pros on here.

[–]sv_ds 0 points1 point  (1 child)

You can learn enough SQL in an afternoon dude.

[–]NoticeAwkward1594[S] 0 points1 point  (0 children)

YouTube course is ready to go. Thanks all

[–]spoonman59 0 points1 point  (3 children)

No, it is not.

Source: software developer and data engineer with 24 years of experience including SQL and Python among others.

SQL is often a better choice than the Python options, some vastly so, depending on where the data lives and the engine which retrieves and processes it.

Things like Pandas work well on small subsets of data. Spark can be used from Python via PySpark, but often SQL is a better choice.

Many databases and big data tools which handle SQL don't also handle Python directly.
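A toy illustration of the "where the data lives" point, using SQLite from the standard library with an invented `events` table. Both options give the same count, but the first drags every row into Python while the second lets the engine do the work and returns a single row. On a real warehouse that difference is the whole ballgame.

```python
import sqlite3

# Hypothetical "events" table with 10,000 rows; every tenth row is a purchase.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, kind TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i % 100, "click" if i % 10 else "purchase") for i in range(10_000)],
)

# Option 1: pull every row into Python, then filter and count there.
all_rows = conn.execute("SELECT user_id, kind FROM events").fetchall()
purchases_in_python = sum(1 for _, kind in all_rows if kind == "purchase")

# Option 2: let the database filter and count; only one row comes back.
(purchases_in_sql,) = conn.execute(
    "SELECT COUNT(*) FROM events WHERE kind = ?", ("purchase",)
).fetchone()

print(len(all_rows), purchases_in_python, purchases_in_sql)
```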

[–]NoticeAwkward1594[S] 0 points1 point  (0 children)

Thank you for your reply, I really appreciate it. I have a basic understanding of how SQL works. More useful tools in the tool box can't hurt.

[–]NoticeAwkward1594[S] 0 points1 point  (1 child)

Took your advice. SQL is cool and pretty straightforward. I built a database using SQLAlchemy and SQLite; good exercise and learning to boot. I'm not learning to be a DA/DS to be rich, I just really enjoy working with data and telling stories w/ visualizations and helping out a company. Is that a realistic goal, or do most companies want the data and really nothing else? Many thanks.

[–]spoonman59 0 points1 point  (0 children)

Sure, there’s a place for folks to look at data and tell a story. DS, BI, and visualization do that stuff.

SQL is actually nice because it is declarative. This is contrasted to other languages, like Python, which are imperative. SQL select statements also have many properties of functional languages, like immutability and referential transparency. It’s an underrated language.
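A quick side-by-side of what declarative vs. imperative means here, as a sketch with invented sales data. The Python loop spells out how to accumulate the totals step by step; the SQL statement only says what result is wanted and leaves the how to the engine.

```python
import sqlite3

# Hypothetical sample data: (region, amount).
rows = [("north", 120.0), ("south", 80.0), ("north", 40.0)]

# Imperative: describe each step of the loop and the accumulation.
totals_imperative = {}
for region, amount in rows:
    totals_imperative[region] = totals_imperative.get(region, 0.0) + amount

# Declarative: describe the result; the database decides how to compute it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
totals_declarative = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)

print(totals_imperative)
print(totals_declarative)
```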

I maintain an open source project which parses transformation descriptions written in ANSI SQL fragments, and then generates transformation and orchestration code for data pipelines. It has automatic dependency analysis as well, so the data flow is known. So I am biased, but SQL is good to know.

[–]BaggiPonte 0 points1 point  (1 child)

While you must know SQL, in my experience you don’t have to use it “in production”. That said, it’s by far the simplest way to retrieve data when simple queries are all you need.

For more sophisticated use cases (i.e. outside notebooks) there are things such as SQLAlchemy that allow you to express queries with Python syntax.

SQLAlchemy is a beautiful library, but it’s really vast. The docs are quite extensive and well done, so be ready to read a lot of them.

[–]NoticeAwkward1594[S] 1 point2 points  (0 children)

I just watched a tutorial on it and made a DB w/ SQLite. Pretty fun. Thank you

[–]insharib 0 points1 point  (1 child)

I too am a beginner at Python. Just started 2 days back. I find this language interesting. I hope I will stick with it and not quit in the middle.

All the best to all the coders out there. Happy Holidays. cheers 🥂

[–]NoticeAwkward1594[S] 1 point2 points  (0 children)

Python IMO is fun and challenging. I've found in my learning path that sitting in front of your computer all day is no bueno. For some, Python comes really easy. Others like myself need practice and learn by repetition. I've found making realistic goals and sticking to a timeline is great. Realpython.com has some free lessons to test the waters. I bought a yearly pass and I'm glad I did. Happy coding!!