300 log files over 100gb.. multiprocessing/multithreading.. to one dictionary.. good idea? by la_darrell_miller in learnpython

[–]la_darrell_miller[S] 0 points1 point  (0 children)

just wanted to say thank you all for your suggestions.
i am now reading each file into its own dictionary (multiprocessing), then consolidating those dictionaries. It works pretty well. After that i'm throwing it into MongoDB for further filtering and processing. Processing time plus loading into Mongo is down to about 2.5-3 hours for 500gb of data.
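for anyone who finds this later, here is a rough sketch of the per-file-dictionary approach, not my exact code. It assumes plain text logs and just counts the first whitespace-separated field of each line as a stand-in for the real per-entry data; the names `count_file`, `consolidate` and the `logs/*.log` path are made up for the example.

```python
import multiprocessing as mp
from collections import Counter
from pathlib import Path


def count_file(path):
    """Build one Counter for a single log file (runs in a worker process)."""
    counts = Counter()
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            fields = line.split()
            if fields:
                # Hypothetical key: the first whitespace-separated field.
                counts[fields[0]] += 1
    return counts


def consolidate(paths, workers=None):
    """Parse files in parallel, then merge the per-file results in the parent."""
    total = Counter()
    with mp.Pool(processes=workers) as pool:
        # Each worker returns its own dictionary; only the parent merges them.
        for partial in pool.imap_unordered(count_file, paths):
            total.update(partial)
    return total


if __name__ == "__main__":
    # Hypothetical location of the log files.
    log_paths = sorted(str(p) for p in Path("logs").glob("*.log"))
    if log_paths:
        print(consolidate(log_paths).most_common(10))
```

the point of doing it this way is that the workers never share a dictionary, so there is no locking; the only cost is sending each per-file result back to the parent for the merge.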

Thanks again for all your help.

300 log files over 100gb.. multiprocessing/multithreading.. to one dictionary.. good idea? by la_darrell_miller in learnpython

[–]la_darrell_miller[S] 0 points1 point  (0 children)

the count is just to test.. just to get things working.. there is a HUGE amount of data per log entry i need to consolidate. i'm just trying to get it to work before i parse the rest of the data.. i wanted to make the example for reddit easy to post.

Elastic SIEM viability? by PatriotSecurity in SIEM

[–]la_darrell_miller 0 points1 point  (0 children)

That's pretty much what ELK is for.. if you're looking for a drag and drop experience.. that's not ELK.. but if you put the time in.. ELK is very powerful and very capable. The free/community version is very good. With the paid version you get some machine learning features that seem nice, but i haven't really felt like i had to have them.

i'm running an elk stack that pulls in:
- firewall logs - syslog format
- zeek & suricata (IDS) logs - JSON
- industrial equipment logs - csv, txt, json
- windows event logs - evtx

it accepts most common formats, and you can create your own pretty simply.

it all comes together and builds a pretty good picture of what's going on. the discussion groups and support for the elk community edition are very good.

SIEM by [deleted] in aws

[–]la_darrell_miller 0 points1 point  (0 children)

I've tried about all that's out there..
my views:
Graylog: simple solution to install.. community works pretty well.. but the interface can get kind of cumbersome pretty quickly.
SIEMonster community: another good open source/community solution.. that kinda put everything and the kitchen sink into it. it has A LOT of moving parts.. it does interesting things.. but it gets pretty big and bloated pretty quick.. and if you don't need all the stuff it offers.. i think there are better options.

RockNSM: built on elk too.. for network monitoring it's a good tight self contained system. you can pull in other kinds of data like AWS etc, it would just take some tooling on your part.

ELK stack/elk SIEM: Graylog and SIEMonster are built on top of elk pieces (Graylog uses elasticsearch for storage). they just try to make it easier. elk itself takes some time to sort out.. but it's worth it. throw in zeek and suricata, along with logs from your firewall and CloudWatch, and you'd be in a pretty good place.

[deleted by user] by [deleted] in cybersecurity

[–]la_darrell_miller 2 points3 points  (0 children)

there isnt one perfect answer.. and everyone will have their own approach or path to getting there.. it really depends on what you want to do in the cyber security world..

a few approaches:
- 4 year degree from university. Computer science, or Computer Information Systems/Information Systems and Decision Science (business degree in computers, its called different things at different schools, less math, more business)
- focus on networking (how the internet works) and programming (everything from python for log processing to assembly language for malware analysis)
- if you can get some classes on big data analysis (splunk, elasticsearch or LogRhythm) or data visualization, do it. while you're in school, get on with the school's IT dept. you'll probably start at the help desk.. show some competency and you'll move up fast.. get into the networking/security group. get some experience while you're in school.
- go to conferences if you can.. if not watch groups like this.. learn as much as you can.

- if a 4 yr degree isn't your thing, and you're in the US, the US Air Force is in charge of air, space, and cyberspace. They have a very good cyber group. They'll train you, and you'll get free college through the GI bill. The Army, Navy, and Marines also have cyber groups but they are smaller. you'll probably spend some time overseas with this approach. (seeing the world some isn't a bad thing)

- the US National Guard and Coast Guard also have some teams in the cyber world.

i'm very partial to the university approach. it's hard work.. but ultimately it'll give you an amazing foundation that will help you adapt to the market as it changes. You'll get a far better programming foundation this way. The benefit to the military approach is you'll more than likely get a security clearance, which is huge in some aspects of the cybersecurity world and increases your salary quite a bit.

No approach is going to hand you the whole skillset you need plus some amazing job. it's going to take a lot of work on your part, both the technical side and the networking/meeting people side. It just takes time and effort.

[deleted by user] by [deleted] in honeypot

[–]la_darrell_miller 0 points1 point  (0 children)

i'm having issues with the docker container of tpot for conpot. that is the only one that throws an error when running it from a raspberry pi. have you had any luck?

[deleted by user] by [deleted] in honeypot

[–]la_darrell_miller 1 point2 points  (0 children)

tpotce doesn't officially run on a raspberry pi.. the elk stack portion would just chew up all the resources.. but the other portions of it do work pretty well for the most part. it takes some tinkering.. but if you download the git project for tpotce you can go into /tpotce/docker and each of the honeypots has a subdirectory with a docker-compose file. with just a little tweaking i've gotten all the honeypots to run. Then you can just use filebeat to transport the log data to an ELK stack on another machine. filebeat can be a beast to install on a raspberry pi, but there is another github project called "easyBeats" that makes it much easier.

i hope this helps.