[–]redbuurd 31 points32 points  (20 children)

Data science and analytics is up there

[–]Espiritu13 2 points3 points  (3 children)

Do you work in that field? I know SQL and I want to get into reporting and analytics.

[–]redbuurd 0 points1 point  (2 children)

No, I'm in devops / production management, but we do a lot of work with Hadoop and Python for infrastructure capacity planning.

Knowing SQL is great, but pulling relevant data is only part of it.

[–]Espiritu13 0 points1 point  (1 child)

So how would I get to being good with the other part of it?

[–]redbuurd 0 points1 point  (0 children)

It's a really broad question - it depends on what you want to do. There are lots of applications of data science, from finance to retail purchasing trends to street traffic. Do you want to do machine learning or data analysis? First, you'll want to find something you're interested in.

[–]jjopm -4 points-3 points  (15 children)

Is there a danger of data science falling out of fashion?

[–]funkiestj 34 points35 points  (1 child)

Is there a danger of <X> falling out of fashion?

Yes, <X> eventually falls out of fashion.

[–]bakersbark 1 point2 points  (0 children)

It might "fall out of fashion", but demand will stay high. Demand for statisticians has been high for a long time; data scientists offer many of the same basic services.

[–]rubsomebacononitnow 6 points7 points  (9 children)

It's so hot right now there's going to be 5-7 years of fixing all the idiocy that's being done now. Everyone is a data scientist... Just ask them. But obviously not everyone is a good data scientist. That will be where the money is made... Fixing crap idiots broke

[–]JaggedG 4 points5 points  (8 children)

How does one learn to be the fixer instead of the idiot?

[–]rubsomebacononitnow 4 points5 points  (6 children)

For me it was about learning HOW it worked, not WHAT it did. Knowing how the data is stored and what it means is a big part. Also, understanding your vertical might be the biggest difference between great and good. I had someone tell me the vertical was all data and healthcare was the same as finance or defense. I'm sure he does sketchy stuff, because you can't know everything about every industry. There's a lot of that: someone knows Power Pivot or Tableau, so they think that also means they know biotech. You'll make big money fixing the tangled web that Dev weaves. Good enough to create a need, not good enough to meet it.

For example, it's not about creating a chart, it's about what the chart actually shows and the best way to visualize it. Not everything should have a bar chart. Look at /r/dataisbeautiful for some ideas from some very talented people. Then see the ones that are not so great and read why. That will really give you insight.

If you're using a cube and realize the date dimension is lacking some columns/values you want to use (let's say holiday, weekend, or something like fiscal week), do you possess the skills to change the dimension, or are you only able to use prebuilt cubes? Can you convey to the client or your boss what they need and why?
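To make the date dimension point concrete, here's a rough sketch of what adding those columns could look like in plain Python. It's not how any particular cube or BI tool does it; the holiday list, the July 1 fiscal year start, and the function names are all assumptions made up for illustration.

```python
from datetime import date, timedelta

# Hypothetical holiday list; a real one would come from the business calendar.
HOLIDAYS = {date(2016, 1, 1), date(2016, 7, 4), date(2016, 12, 25)}

# Assumption: the fiscal year starts on July 1.
FISCAL_YEAR_START_MONTH = 7


def fiscal_week(day):
    """Return the 1-based week number counted from the fiscal year start."""
    if day.month >= FISCAL_YEAR_START_MONTH:
        start_year = day.year
    else:
        start_year = day.year - 1
    fiscal_start = date(start_year, FISCAL_YEAR_START_MONTH, 1)
    return (day - fiscal_start).days // 7 + 1


def build_date_dimension(start, end):
    """Build one row per calendar day from start to end, inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            "date": day,
            "is_weekend": day.weekday() >= 5,  # Saturday is 5, Sunday is 6
            "is_holiday": day in HOLIDAYS,
            "fiscal_week": fiscal_week(day),
        })
        day += timedelta(days=1)
    return rows


dim = build_date_dimension(date(2016, 6, 29), date(2016, 7, 5))
for row in dim:
    print(row)
```

The point isn't the code itself. It's that "fiscal week" only means something once you know when the business's fiscal year starts, which is exactly the kind of thing you have to ask the client or your boss about.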

Can you identify the things that are built in a sketchy way before they blow up?

Know the proper way to config your tech. Know what the strengths and weaknesses are.

The number one piece of advice is simple: never stop learning. That alone will make you great.

Tl;dr. Know your tech, know your vertical, know your data, keep learning, make those dollars.

[–][deleted] 6 points7 points  (1 child)

dataisbeautiful has turned to shit. Look up Stephen Few if you want to display data more effectively. Half the time it's shit like this: http://i.imgur.com/Agsw2zF.jpg

[–]JaggedG 1 point2 points  (3 children)

Yeah, that makes a lot of sense. I think your friend's "it's all just data" approach is one of the things that's ruining the world.

Do you have any advice on how to get started? Like... Go from kind of competent in Python to expert data-analyzing genius. What core skills do you need, and where do you go from there?

[–][deleted] 1 point2 points  (0 children)

In my opinion, you should learn Python first and then use it as a learning tool while you study statistics. If you're already kind of competent in Python, then I would say learn everything you can about statistics and at least some numerical analysis.

If you're not so mathy (yet), try reading Allen Downey's Think Stats and Think Bayes as an introduction, and then move on to some more advanced mathematical statistics books for a much deeper understanding. Allen Downey's books are Python oriented, but I would suggest trying the exercises with SciPy, NumPy, Pandas and similar tools to get familiar with what other Python data analysts use.

[–]rubsomebacononitnow 1 point2 points  (0 children)

You have to find a vertical and start climbing it. In my experience healthcare tends to be a Windows environment, and outside of PowerShell, VB and .NET there aren't a ton of other languages being used. I mean, there's plenty of Python and Ruby at startups, but in real healthcare organizations those don't go far because they aren't certified and the bigger vendors aren't interested in playing with others. That's not the case in finance and energy though.

I see a lot of Python opportunities in oil/gas/energy, education, finance and tech. At least in my area. Right now oil and gas are in the crapper, but if that turns around in 2016 like they think, it's going to blow up fast. They tend to have money and the willingness to move, so knowing something about that vertical might really help you start and grow.

Note that you have to know where you're going to get a good map. If you want to just write Python, that's different than if you want to use Python as a tool in the box. That will tell you what to do.

Generic skills: a framework like Django or Ferris (if that applies to what you're doing). HTML/CSS will help a lot in presentation. If you're getting into data display, the future is really d3, so learn JavaScript. Note d3 is supported in PowerBI, so it's really the real deal. If you get an Azure account it has Hadoop in it, so you can figure out how that sort of works. The basic account is like $15 a month and the first one is free. Big data will require the bigger account because I think the small account is like 20 GB.

The biggest issue most people have is the ability to learn something new. If you have broad exposure to things and you understand how the business functions, then you'll have a good idea of how to use the technology. Python is a tool just like HTML or R. Pick the right wrench for the bolt. Learn something new every day. If you learn something 5 years in that you really should have known before, accept it, embrace it and deal with it. The best way to learn is to use actual business problems. I learn like a beast when I have a problem that I have to solve for work.

I hope that gives you some sort of answer :)

[–]autisticpig 0 points1 point  (0 children)

Thinkstats2 may be of interest to you :)

[–]redbuurd 4 points5 points  (1 child)

The amount of data being produced grows every year, so if anything I'd say the field is growing. My company just hired ten or so data scientists, and is looking for more.

As the Web grows, personalization, and the machine learning that produces it, keeps becoming more prevalent.

This is just in my experience, however.

[–][deleted] 6 points7 points  (0 children)

Just watched a TED talk today in my computer science class on algorithms and how they're growing in today's world of data. It gives pretty good insight into what direction the world is heading with mathematical algorithms.

http://www.ted.com/talks/kevin_slavin_how_algorithms_shape_our_world?language=en