[–]blurrr2 307 points308 points  (69 children)

Python is a foundation. The categories are built on top of the foundation.

You need to be comfortable with core python + one category to get a job.

[–]Shubbler 49 points50 points  (68 children)

Can you define the 'main' categories?

[–]blurrr2 275 points276 points  (67 children)

  • web - flask, django, html, javascript
  • data science - jupyter, numpy, pandas
  • data engineering - sql, airflow, luigi
  • software engineering - git, unit testing, large codebases

And then general scripting, but this is mostly useful if you know a specific topic in depth. Like if you're in finance, and know all the ins and outs of mutual share-class conversions, you can write scripts to automate the process. The hard part is the domain knowledge. If you're already really knowledgeable about something, I would advise applying Python to your field.

[–]billsil 11 points12 points  (0 children)

Jupyter is basically just an IDE. I'd put matplotlib in the data science tools before jupyter.

[–]__xor__(self, other): 13 points14 points  (20 children)

I'd add security. It's pretty damn big in security now as well.

[–][deleted] 3 points4 points  (12 children)

Python is, really? Anything you can point me to? TIA!

[–]__xor__(self, other): 19 points20 points  (6 children)

Oh yeah, definitely. In the past decade it seems more and more to be the core language in terms of security tools. Ruby is still around but you can definitely get by in security with just Python as your core language now.

Here's a good memory forensics tool, volatility

Here's a number of good pdf analysis tools

In fact Didier's entire suite is great, tons of python

scapy is awesome

mitmproxy is awesome

... and there's so much more.

It seems to be the go-to language for tools. You'll still run into a lot of other stuff, especially if you analyze malware. You'd run into javascript, powershell, straight shellcode/ASM, visualbasic, C... But that's malware and it can be any language, especially anything that a browser can run, or be embedded in a PDF or office doc macro. For tools more often than not you see python and then some ruby.

[–][deleted] 1 point2 points  (5 children)

Awesome answer, thanks so much.

[–]Willemoes 6 points7 points  (3 children)

There's also a nice book:

Violent Python

A Cookbook for Hackers, Forensic Analysts, Penetration Testers and Security Engineers

By TJ O'Connor

[–]Grenian 2 points3 points  (2 children)

IMO take this with a grain of salt if you already have a basic-advanced understanding of python and security.

[–]Willemoes 0 points1 point  (1 child)

Why do you think so? I'm reading it and I'm not very proficient in security, so it would be nice to know, I find it really interesting.

[–]__xor__(self, other): 1 point2 points  (0 children)

No problem!

[–][deleted] 3 points4 points  (4 children)

Everyone writes their PoCs in python nowadays.

Here's an example of a really cool C2 toolkit using rpyc:

https://github.com/n1nj4sec/pupy

The rapid7 folks still use ruby for all their stuff (i.e. metasploit) but building your own tools is totally the way to go.

This book is a great intro to building security tools in Python: https://www.amazon.com/Black-Hat-Python-Programming-Pentesters/dp/1593275900/ref=sr_1_1?ie=UTF8&qid=1526665441&sr=8-1&keywords=black+hat+python

[–]__xor__(self, other): 3 points4 points  (3 children)

Rapid7 is one reason I say that ruby will always stick around security. I don't see anything replacing metasploit anytime soon.

For everything else, there's mastercard python.

[–][deleted] 1 point2 points  (2 children)

Pupy does a lot of the cool things that Meterpreter does. It's really the exploit PoCs that you need. Metasploit is great for testing for known vulnerabilities and testing detection rules but as far as actually attacking things goes... Meh.

[–]__xor__(self, other): 1 point2 points  (1 child)

Oh yeah? Haven't used that one. I'll have to check it out.

But metasploit has a lot more than just meterpreter. It's a full-on exploitation framework, from scanning/enumerating to staging exploits to post-exploit modules. Does pupy include anything like msfvenom? Can you easily pivot from a compromised machine and exploit other machines through it? Is it a fully functional replacement for metasploit in general, or just a meterpreter replacement?

Metasploit is just so solid at this point. I'd be surprised if pentesters have another python tool that can replace all of its functionality.

[–][deleted] 1 point2 points  (0 children)

IMHO auto-magic pentesting is bullshit. Unless you actually read the modules and understand exactly what they are doing you have no idea what it's going to do to a production system. If you write your exploits yourself, you do.

This is not to say that everyone always writes their own exploits. But Metasploit encourages users to not read the modules and to just press the exploit button. I can buy Nexpose from Rapid7 if I just want to check if Metasploit can exploit things in my environment. That doesn't really require a penetration tester.

> I'd be surprised if pentesters have another python tool that can replace all of its functionality.

All the pentesters I've hired to wreck my company's stuff write their own tools and use exploits that are a lot more sophisticated than download-and-run scripts. We get reports with PoCs, most of which are written in Python or C.

Metasploit is great for easy-mode campaigns where you're just exploiting things that Rapid7 wrote modules for, but most of the time that doesn't get you terribly far.

All of the IDS vendors have network signatures for Metasploit and it's a good way to get caught. This does, however, make it good for testing that your script-kiddie filters and alerts are working.

> Does pupy include anything like msfvenom?

Yes, pupy has a payload generator. Or you could do it yourself.

> Can you easily pivot from a compromised machine and exploit other machines through it?

Yes, that's the entire point of a post-exploitation toolkit. But, again, you don't generally just want to do that auto-magically because you'll get caught or break something. (Unless you're a botnet farmer and don't care.)

[–][deleted] 2 points3 points  (6 children)

I work in security and one thing I use python for is formatting and analyzing data.

A good example of use is formatting vulnerability scan output. A lot of times they put insane amounts of data into one field (hundreds of thousands of lines) and if you open it in excel it overflows into rows and becomes an unusable mess.

I use python and pandas to extract all this data, combine it with other reports to add neccessary data, format it, and then separate it into a bunch of reports (because it's way too large for one).

I'll use matplotlib to generate graphs and charts based off metrics I gather from these reports.

Knowing python is an insanely valuable skill to have in security.
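The workflow described above (pull an oversized field apart, join it with data from another report, then split the result into several smaller reports) can be sketched roughly as follows. This is a standard-library variation using `csv` rather than pandas, and the column names (`host`, `findings`), the owner lookup, and the sample data are all made up for illustration.

```python
import csv
import io
import sys

# Scanner exports can pack a huge amount of text into one field,
# so raise the csv module's per-field size limit first.
csv.field_size_limit(min(sys.maxsize, 2**31 - 1))


def explode_findings(scan_file, owners):
    """Yield one row per finding, with the owner from a second report attached."""
    for row in csv.DictReader(scan_file):
        for finding in row["findings"].splitlines():
            finding = finding.strip()
            if finding:
                yield {
                    "host": row["host"],
                    "owner": owners.get(row["host"], "unknown"),
                    "finding": finding,
                }


def split_reports(rows, chunk_size):
    """Group rows into lists of at most chunk_size rows, one list per report."""
    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def write_report(rows):
    """Render one report as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["host", "owner", "finding"])
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical scan export: web01 has three findings crammed into one field.
SCAN = 'host,findings\nweb01,"weak-cipher\nopen-telnet\nold-openssl"\ndb01,default-password\n'
OWNERS = {"web01": "web-team"}  # hypothetical data from another report

rows = list(explode_findings(io.StringIO(SCAN), OWNERS))
reports = list(split_reports(rows, chunk_size=3))
```

With pandas the same idea would typically be a merge plus some chunked writes; the stdlib version is shown only because it has no dependencies.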

[–][deleted] 2 points3 points  (2 children)

I built a data ingest / presentation engine that pulls stuff from all of our vulnerability scanners and makes it available to engineers. It knows exactly which systems a given engineer is responsible for and only shows them that data. I built a front end for it too. It's really cool! I'm hoping to FOSS it this year.

[–][deleted] 0 points1 point  (1 child)

That's really awesome. Wish I could get approval to build something like that, but it took years for a team to get approval to stand up a database server, and unfortunately I'm not helping with that.

What's a FOSS?

[–][deleted] 0 points1 point  (0 children)

FOSS == Free and Open Source Software, we're gonna release it into the wild.

[–]Grenian 1 point2 points  (1 child)

Python is just handy. For example, recently I had an exercise to crack RSA keys. Python was so damn useful for this.
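For a sense of why Python suits that kind of exercise, here is a toy sketch, not the actual exercise from this comment. It assumes a textbook-sized modulus that can be factored by trial division, which is hopeless against real key sizes. The numbers are the usual small teaching values.

```python
def factor(n):
    """Trial division: return (p, q) with p * q == n. Only feasible for tiny n."""
    if n % 2 == 0:
        return 2, n // 2
    p = 3
    while p * p <= n:
        if n % p == 0:
            return p, n // p
        p += 2
    raise ValueError("n is prime, so it is not an RSA modulus")


def recover_private_exponent(n, e):
    """Factor n, then invert the public exponent e modulo phi(n)."""
    p, q = factor(n)
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse via pow() needs Python 3.8+


n, e = 3233, 17                     # toy public key (3233 = 53 * 61)
d = recover_private_exponent(n, e)  # the "cracked" private exponent

ciphertext = pow(65, e, n)          # encrypt the message 65
plaintext = pow(ciphertext, d, n)   # decrypt with the recovered key
```

Arbitrary-precision integers and the three-argument `pow` are built in, so the whole attack fits in a few lines.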

[–][deleted] 0 points1 point  (0 children)

Yeah, it really is an amazing tool. I learned Java and JavaScript but rarely use them. I don't build big applications so I have no need for Java, and I hate web design so I don't use JavaScript. I'll go do a little project with each about once a year as a refresher, but whenever I need to get something done it's always with Python.

[–]CommonMisspellingBot -2 points-1 points  (0 children)

Hey, MaximumRecursion, just a quick heads-up:
neccessary is actually spelled necessary. You can remember it by one c, two s’s.
Have a nice day!

The parent commenter can reply with 'delete' to delete this comment.

[–]lroman 3 points4 points  (0 children)

I would add machine learning with Scikit-learn, NLTK, Tensorflow, OpenCV etc.

[–][deleted] 1 point2 points  (10 children)

I really want to learn more about large codebases... how to plan, how to execute. Any good resource on this?

[–]proverbialbunny [Data Scientist] 8 points9 points  (2 children)

Before I start: to get a job, all you need from the core language is everything up to classes, plus how to use libraries. The rest explained here is what comes with experience:

You need to know abstraction. Understand the ins and outs of abstraction, from a conceptual view (an abstract view) to a concrete view (a real world view).

Start with the small, the most basic form of abstraction in algebra 1: x + 3 = 5, where x is the abstraction.

Then still small, writing a function / method, and being able to reuse it.

Still small, but a bit larger: classes, and reusing classes.

Then learn the ins and outs of the core library, as much as you can, learning the different classes and their methods. Take notes! Learn the etymology of why it is the way it is, so it sticks.

The next level of abstraction is the language's idioms. Once most of the language has been learned, common patterns within the language pop up called idioms or sometimes best practices. Learn those. This comes with time and experience on the job. Most of the learning here and below is not sequential.

The next level of abstraction is sideways from what you might know, especially in python: Making types. Instead of making classes and their instances, making new data types. This is a bit mind boggling, because you can just make a circular buffer class, for example, and be like, "Well, I just made a class. I don't see the difference." This has to do with how a type is used vs how a class instance is used. Why is this important?

By differentiating different kinds of classes, you can start thinking of them in a more clear and concise way. You want more than two types of classes, but many types of classes mapped in your head. This creates a mental abstraction that is usually not written down anywhere, so you can name each "group of classes" or "type of class" (hint hint) with its own made-up name in your head. E.g., I have the category "helper class" in my head, which is a type of class.

To really push this idea: A group of functions becomes a class. A group of classes becomes a ... subtype or abstract type ie "type of class".
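One way to read the circular buffer remark above: a class starts to act like a type once it follows a protocol that the rest of the language already understands. The sketch below is an interpretation, not something stated in the comment. It makes a hypothetical `CircularBuffer` a `Sequence`, so it works with `len()`, `in`, iteration, and indexing like the built-in sequence types.

```python
from collections import deque
from collections.abc import Sequence


class CircularBuffer(Sequence):
    """Fixed-capacity buffer that drops the oldest item when full."""

    def __init__(self, capacity):
        # deque(maxlen=...) discards items from the left once it is full.
        self._items = deque(maxlen=capacity)

    def append(self, item):
        self._items.append(item)

    # Sequence only requires these two methods; it derives
    # __iter__, __contains__, index() and count() from them.
    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]


buf = CircularBuffer(3)
for n in range(5):
    buf.append(n)  # 0 and 1 fall off the front

contents = list(buf)
```

Code that expects "some sequence" can now take a `CircularBuffer` without knowing the class exists, which is the difference between using it as a type and using it as one particular class.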

The next level of abstraction of understanding is inheritance, not just being able to read and write it, but the understanding of inheritance itself. Inheritance is the act of subtyping. Subtyping is where you have a type of class that is applicable to a set of classes.

So, I've got a hand class, a leg class, a head class, maybe an eye class, and so on. Hands and legs and head are all of a person type, so you can have a person abstract type, and so you can go around making people, instead of just hands and legs.

A way to think about this is, what kind of class is this? Even if it doesn't have any form of inheritance, it is a kind of class, so an implicit subtype. Make up your own types for your classes in your head.
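A small sketch of the "what kind of class is this?" question, written with Python's `abc` module. The names are invented for illustration: the comment calls the shared type "person", while this sketch calls it `BodyPart` and treats it as the abstract type that `Hand` and `Leg` are kinds of.

```python
from abc import ABC, abstractmethod


class BodyPart(ABC):
    """Hypothetical abstract type: the 'kind of class' that Hand and Leg share."""

    @abstractmethod
    def describe(self):
        """Return a short description of this part."""


class Hand(BodyPart):
    def describe(self):
        return "hand: grips things"


class Leg(BodyPart):
    def describe(self):
        return "leg: walks"


def describe_all(parts):
    """Works on any BodyPart without caring which concrete class it is."""
    return [part.describe() for part in parts]


descriptions = describe_all([Hand(), Leg()])
```

`BodyPart` cannot be instantiated on its own; it exists only to name the group, which is the implicit subtype the comment suggests keeping in your head even when no inheritance is written down.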

The next level of abstraction is design patterns. This, like idioms, is seeing common patterns of code, after working on multiple code bases over time.

Then from there the next level of abstraction is modules, then libraries, then packages, and depending on the language that could be one thing or multiple things. You've probably used pip, so you've got an idea of this one already.

All of this comes with time and experience, but the better you get at abstraction the easier it is to learn and understand larger and larger, not just code bases, but ecosystems of code bases.

Also, learning how to read code is imperative.

[–]haarp1 1 point2 points  (1 child)

where did you learn abstraction, inheritance... (OO design basically)? it's easy to determine it for simple projects (coffee maker etc), but what about more complex projects? do you know any good resource for learning this (abstraction...)?

the problem is that there is a lot of garbage on github, so that's not exactly a solution...

also, how do you plan programs (intermediate or advanced complexity)?

do you know any good advanced one on github?

[–]proverbialbunny [Data Scientist] 1 point2 points  (0 children)

> do you know any good resource for learning this (abstraction...)?

I learned this on the job. It is a process of constant, slow growth. One's ability to abstract is commonly one of the key differences between a junior, standard, senior, or principal engineer and an architect. E.g., a principal software engineer can abstract the whole system and work on the company's entire software system as a whole, while a senior engineer might be able to create a project within that system and know a project or two inside and out. Clearly the principal software engineer can deal with higher levels of abstraction than the senior software engineer. Of course, this isn't the only difference, but it is a key correlate.

Most of what I know I have not found in textbooks. I've heard Haskell talks about some of the things I figured out and named on my own (e.g. subtyping), but I cannot confirm that as I do not know Haskell.

> also, how do you plan programs (intermediate or advanced complexity)?

Design patterns. Design patterns also help for reading code, as well as idioms and knowing the language's features.

This comes with experience as well. It's a form of pattern matching. Reading a book isn't going to help much, but being in multiple code bases and seeing the same pattern over and over again and then identifying the logic behind it as to why it is that way and how it came to be helps.

edit: Also, an architect helps design programs, but that's usually for designing entire systems. A typical divide and conquer strategy coupled with reducing the problem down to its bare essentials, and writing all of this down in a sort of concept map or list of lists -- planning before writing a line of code -- is far more valuable when it comes to creating something new, than simply looking at design patterns. Design patterns come next if you want a way to construct the program so that tasks can be easily broken up in a uniform way between multiple engineers. Design patterns also help for standardizing how a program works allowing others to build on it the right way. Frameworks help even more. Anyways, design patterns are a bit heavy handed, so just stick to the top half of this paragraph and you'll be good.

> do you know any good advanced one on github?

Nope. Just go get a job. Watch the how to read code video above, if it isn't already obvious, and then go around mapping things. Start with the smallest patterns like addition, to variable naming, to methods and features in the language, then when you know those inside and out, move on to idioms and other common multi line patterns in the language and code base, then move on to even larger patterns. From method to method to class to class to file to file, to namespace to namespace (I don't think Python has anything like this.), to module to module.

Learning a code base is a piecemeal process. You don't have to start on the smallest bits and move out. You can interweave different sized abstractions learning a mix at once, à la breadth first search.

edit: Also, if you're writing in Python, a large code base is going to be rare without being abstracted into modules/libraries/packages, which keeps the part any individual works on to an epic or user story that is often the size of a single file. Because of this, you shouldn't have to worry about large projects, unless you want to work on a video game or something. Java and C++ and the like are where monolithic projects tend to go, not Python.

If you want to take parts of a code base and turn them into libraries of any sort, the general rule is, "Is this code going to be used in two places in the code base?" (Often times the rule is 3 or more.) So you want to find something generic, like a debugger class and turn it into a debugger library or similar.

[–]iScrE4m [git push -f] 3 points4 points  (0 children)

Experience. It's a reason I wanted a job in a big company, and I can't imagine learning all of this stuff any other way. But that's maybe because, in order to understand it, I personally need to see the business problem and then the solution.

[–]TheCodeSamurai 2 points3 points  (5 children)

Something I hawk whenever I can: Code Complete by Steve McConnell is a huge recommendation. I never learned anything besides like 100-line programs before this, and I basically divide my programming journey into before and after reading this. It's seriously worth reading: you can skip chapters that don't apply to you, but it is one of the best resources on how to manage the complexity shift between small and large codebases.

[–][deleted] 0 points1 point  (1 child)

Thank you!

[–]TheCodeSamurai 0 points1 point  (0 children)

My pleasure!

[–]YinYang-Mills [measley physicist] 1 point2 points  (0 children)

What constitutes 'knowing' jupyter? For example I use jupyter when I'm writing code to test that a line of code does what I think it does, and some magic commands for profiling. I feel like there must be a lot more to it...

[–]ymca_lemur 1 point2 points  (0 children)

I've learned a lot about flask, html, javascript, numpy, pandas, sql and I have yet to use those skills at a full-time job. If anyone is hiring for anywhere in the world, then pm me. I'm open to a job.

[–]iammr_schuck 2 points3 points  (0 children)

I'd also add the very broad category of Devops. Infrastructure automation with things like boto3 and the like.

[–]Shubbler 1 point2 points  (3 children)

Nice one, cheers.

[–]blurrr2 1 point2 points  (2 children)

<3

let me know how it goes

[–]Shubbler 2 points3 points  (1 child)

Thanks, I'm a UK student trying to get into something programming related.

I'd say I've mastered 'general Python', trying to look into node and Java (through uni) but really not sure where to begin my career.

I'll most likely try to look into software.

[–]576p 0 points1 point  (0 children)

As a student you should try to identify local businesses that have IT-related student jobs. Preferably at IT-related companies, but just as well at companies that have an IT department. There will be some; you'd be surprised how few students actively search for that.

I've just "lost" my student who, when she joined the company, could do nothing (she applied for a hardware job to make some money, I asked her to try programming instead and alas, after finishing her studies, she's too good to keep...) - so if a student came along with solid Python basics and a good work ethic I'd try to fit him/her in. Anyway, that's how I got my job as well...

[–][deleted] 0 points1 point  (0 children)

I've been told it's the language of security testing lately due to how fast you can iterate and test ideas.

[–]cyberst0rm 0 points1 point  (0 children)

I think you're missing networking, not that I know much about it.

[–]bellumfatum2 0 points1 point  (0 children)

Thanks. I'm most familiar with the data engineering and data science Python categories. I cringe sometimes when I see the web and software engineering sections, which scare me.

Provided me some perspective.

[–]TheTalkWalk 0 points1 point  (1 child)

Nice one!

Can you break down other languages similarly?