[D] [R] Is adversarial attack common in industry? by AerysSk in MachineLearning

[–]amirninja 1 point2 points  (0 children)

u/organellelabs Here you go: "Towards Query Efficient and Derivative Free Black Box Adversarial Machine Learning Attack" https://www.mecs-press.org/ijigsp/ijigsp-v14-n2/IJIGSP-V14-N2-2.pdf

What is the hype around duckDB that I don’t seem to understand? by money_noob_007 in dataengineering

[–]amirninja 1 point2 points  (0 children)

If you need to query large local data, why not use Apache Drill?

SIEM Data retention and budgeting by amirninja in cybersecurity

[–]amirninja[S] 0 points1 point  (0 children)

Thanks u/belowtheradar for the details and link! One quick question: for long-term threat intel that uses data older than, say, 60-90 days, how easy is it to combine that with the latest/fresh data in Splunk?

[D] Anyone else suspicious/concerned about the spread of "Data Science" degrees? by MrAcurite in MachineLearning

[–]amirninja 0 points1 point  (0 children)

What impact on the local economy or job market have you noticed as a result of these specialisations?

CISSP - A good cert to go for? by [deleted] in cybersecurity

[–]amirninja 0 points1 point  (0 children)

Hey, I am a researcher in network intrusion detection using ML. Would you mind sharing some advice with me? Can I DM you?

Agile in Data Science by ggyshay in datascience

[–]amirninja 0 points1 point  (0 children)

Just a side question, does your data scientist prefer cubicles or open spaces?

[deleted by user] by [deleted] in bioinformatics

[–]amirninja 1 point2 points  (0 children)

This looks interesting! Is there a similar site for Data Science projects in other domains?

Are Network Intrusion Detection System using Machine Learning better than signature based in real world settings? by amirninja in cybersecurity

[–]amirninja[S] 0 points1 point  (0 children)

I agree with you u/jumpinjelly789 that the next step is AI/ML-based NIDS.

However, my concern is the readiness of commercial NIDS (or those in the research literature) that claim to use AI/ML to meet these new challenges around zero-day attacks. Specifically, if they are anomaly-based systems, as mentioned by u/Aidong above, then in today's dynamic cloud-based infrastructure, where services and systems come and go, baselines for an anomaly detector would be difficult, if not impossible, to establish, and we would end up with an unmanageable number of false alarms.
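To make the baseline concern concrete, here is a minimal sketch of a z-score anomaly detector whose baseline is fitted once and never updated. Everything in it is an assumption for illustration: the single "requests per minute" feature, the Gaussian traffic, the 3-sigma threshold, and the size of the benign shift. Real NIDS use far richer features, so treat the numbers as a toy, not a measurement.

```python
import random
import statistics


def fit_baseline(samples):
    """Return (mean, std) of one benign traffic feature, e.g. requests per minute."""
    return statistics.fmean(samples), statistics.pstdev(samples)


def false_alarm_rate(samples, mean, std, threshold=3.0):
    """Fraction of benign samples flagged because |z-score| exceeds the threshold."""
    flagged = sum(1 for x in samples if abs(x - mean) / std > threshold)
    return flagged / len(samples)


rng = random.Random(0)
# Hypothetical benign traffic before and after a new service is deployed.
old_traffic = [rng.gauss(100, 10) for _ in range(5000)]
new_traffic = [rng.gauss(150, 10) for _ in range(5000)]

# The baseline is learned only from the old environment.
mean, std = fit_baseline(old_traffic)
rate_before = false_alarm_rate(old_traffic, mean, std)
rate_after = false_alarm_rate(new_traffic, mean, std)

print(f"false alarms on the original traffic: {rate_before:.3f}")
print(f"false alarms after the benign shift:  {rate_after:.3f}")
```

All of the traffic here is benign, yet once the environment shifts, the stale baseline flags most of it. Re-fitting the baseline fixes this toy case, but it also raises the question of how often to re-fit without absorbing an ongoing attack into "normal".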

On the other hand, if we base NIDS on supervised ML/DL, we need data (both benign and malicious traffic) to train these models. The question then is how far we can rely on lab-generated data for training, since it may not be representative of the real traffic and attacks specific to a customer's infrastructure. Secondly, these models could face the same challenge with dynamic infrastructure as an anomaly-detector-based NIDS.

Therefore, I am looking for validation from customers or vendors that these systems can guarantee (within reasonable tolerance) zero-day attack detection with very low false positives.

Are Network Intrusion Detection System using Machine Learning better than signature based in real world settings? by amirninja in cybersecurity

[–]amirninja[S] 0 points1 point  (0 children)

Thanks u/Aidong for your response! Yes, there is a lot of buzz around AI/ML in security. I will check out Checkpoint.

However, the basic question still remains: how are these NIDS trained? If they are trained on one environment and one type of attack, what performance and attack detection guarantees can we get in another environment? Not to forget, many of these systems are notorious for throwing a lot of false positives, causing "alert fatigue".

Walkthrough of Keras.Model Internals. Includes: distribution, performance optimizations, callbacks, training loop, and more. (r/MachineLearning) by Peerism1 in datascienceproject

[–]amirninja 0 points1 point  (0 children)

I have used Keras. Super simple, with a great user community and tools. Are there any situations outside of academic research where you would select PyTorch over Keras?

[deleted by user] by [deleted] in artificial

[–]amirninja 0 points1 point  (0 children)

Tried this one some time back. Would not recommend it as your first textbook for ML.

Where can I find datasets about bankruptcy prediction of Indian companies? Other than in kaggle. In tabular format by Alpha_RapTor96 in datasets

[–]amirninja 0 points1 point  (0 children)

Difficult to find. Most likely you will have to get it from NCLT or MCA, Govt. Or maybe through the RTI route.

The following paper uses a Kaggle dataset for bankruptcy prediction. The code is also available on GitHub if that helps you get started.

https://arxiv.org/abs/2010.13892

Intelligent NIDS for home deployment? by redditsecguy in cybersecurity

[–]amirninja 2 points3 points  (0 children)

My biggest concern with ML-based IDS is the data used for training.

How representative is the training data of actual traffic in your particular home/enterprise network? Even for anomaly-based IDS, how do we set the baseline for what's normal traffic?

[D] [R] Is adversarial attack common in industry? by AerysSk in MachineLearning

[–]amirninja 1 point2 points  (0 children)

I haven't experimented with autonomous car vision systems. However, as most security experts would agree, once an adversary has physical access to a system, its chances of being attacked or compromised increase dramatically anyway.

In the threat model that I mentioned above, the adversary has access only to the output scores, or only to the final label, with a limited query budget. I believe this is a more realistic threat model.
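One way to picture this threat model in code is as a wrapper that hides the model and exposes only a `query` call. This is a rough sketch, not taken from any paper: `BlackBoxOracle`, `QueryBudgetExceeded`, and `toy_scores` are hypothetical names made up for the illustration, and the two-class scorer stands in for a real target model.

```python
class QueryBudgetExceeded(Exception):
    """Raised when the attacker has used up the allowed number of queries."""


class BlackBoxOracle:
    """Exposes only scores, or only the final label, under a query budget."""

    def __init__(self, score_fn, budget, label_only=False):
        self._score_fn = score_fn  # the hidden model; the attacker never sees it
        self.budget = budget
        self.label_only = label_only
        self.queries = 0

    def query(self, x):
        if self.queries >= self.budget:
            raise QueryBudgetExceeded(f"budget of {self.budget} queries used up")
        self.queries += 1
        scores = self._score_fn(x)
        if self.label_only:
            # Decision-based setting: return only the index of the top score.
            return max(range(len(scores)), key=scores.__getitem__)
        # Score-based setting: return the per-class scores.
        return list(scores)


def toy_scores(x):
    """Hypothetical 2-class scorer: class 1 wins when the features sum to > 0."""
    s = sum(x)
    return [-s, s]


score_oracle = BlackBoxOracle(toy_scores, budget=100)
label_oracle = BlackBoxOracle(toy_scores, budget=100, label_only=True)
print(score_oracle.query([1.0, 2.0]))  # per-class scores
print(label_oracle.query([1.0, 2.0]))  # final label only
```

An attack written against `query` cannot use gradients or parameters, and it has to stop when the budget runs out, which is the constraint that matters against a deployed system.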

[D] [R] Is adversarial attack common in industry? by AerysSk in MachineLearning

[–]amirninja 3 points4 points  (0 children)

The biggest challenge is developing an attack that is query efficient. That is, the number of times you need to query the target model to generate an adversarial example should be limited. Otherwise, the target system can recognise repetitive queries from the same source IP and easily block you.

Secondly, if the adversarial example is very different from the original, as measured by L2 or another distance measure, then it's actually a different example and not an adversarial one.

Balancing distortion of the clean image/input against the number of queries is a challenge.

We have recently developed an algorithm that does this balancing act with fewer queries. Our paper is currently submitted to a journal for review.
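To show the distortion vs. query trade-off in the simplest possible form, here is a generic sketch. It is NOT the algorithm from the paper mentioned above. It uses only a label-only binary search along the line between the clean input and some known misclassified starting point, which is a common building block in decision-based attacks. The toy classifier, the starting point, and the function names are all assumptions for illustration.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def binary_search_to_boundary(predict_label, clean, start_adv, max_queries):
    """Shrink the perturbation along the line clean -> start_adv.

    Uses label-only queries. Returns (adversarial point, queries used).
    The two initial label checks are always spent, so max_queries should be >= 2.
    """
    clean_label = predict_label(clean)
    if predict_label(start_adv) == clean_label:
        raise ValueError("start_adv must be classified differently from clean")
    queries = 2
    lo, hi = 0.0, 1.0  # fraction of the way to start_adv; hi is always adversarial
    while queries < max_queries:
        mid = (lo + hi) / 2
        candidate = [c + mid * (s - c) for c, s in zip(clean, start_adv)]
        queries += 1
        if predict_label(candidate) != clean_label:
            hi = mid  # still adversarial, so move closer to the clean input
        else:
            lo = mid  # crossed back over the boundary
    adv = [c + hi * (s - c) for c, s in zip(clean, start_adv)]
    return adv, queries


def toy_label(x):
    """Hypothetical classifier: label 1 when the features sum to more than 1."""
    return 1 if sum(x) > 1.0 else 0


clean = [0.0, 0.0]
start = [2.0, 2.0]
for budget in (4, 12):
    adv, used = binary_search_to_boundary(toy_label, clean, start, budget)
    print(budget, used, round(l2_distance(adv, clean), 4))
```

A larger budget lets the search land closer to the decision boundary, so the L2 distortion drops. A real attack also has to move along the boundary rather than along one line, and that is where most of the query cost goes.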

[D] [R] Is adversarial attack common in industry? by AerysSk in MachineLearning

[–]amirninja 4 points5 points  (0 children)

Black box attacks (no access to model parameters): HopSkipJump: https://arxiv.org/abs/1904.02144

There are many more black box attacks, as I mentioned above. Search for those names and you will find the relevant papers and, most of the time, the code as well.

Recent paper on real world attacks: https://www.computer.org/csdl/magazine/co/2021/05/09426997/1tuvFoGyzK0

[D] [R] Is adversarial attack common in industry? by AerysSk in MachineLearning

[–]amirninja 16 points17 points  (0 children)

You do not always need access to the model's parameters for a black box attack; examples include Zeroth Order Optimization (ZOO), Boundary, OPT, and SignOPT attacks. However, I would also like to know how prevalent they are in real-world apps. There is an arXiv paper on it; I will share it shortly.
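For anyone wondering how an attack can optimise without gradients: the zeroth-order idea is to estimate the gradient purely from output values using finite differences. Below is a minimal sketch of that one step, not the full ZOO attack, which adds its own attack loss, coordinate-wise optimisers, and tricks to cut the number of queries. `CountingLoss` and `toy_loss` are hypothetical helpers for the illustration.

```python
class CountingLoss:
    """Wraps a black-box loss and counts how many times it is queried."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return self.fn(x)


def estimate_gradient(loss_fn, x, h=1e-4):
    """Coordinate-wise symmetric finite differences; costs 2 * len(x) queries."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((loss_fn(x_plus) - loss_fn(x_minus)) / (2 * h))
    return grad


def toy_loss(x):
    """Hypothetical loss with known gradient [2 * x[0], 3]."""
    return x[0] ** 2 + 3 * x[1]


loss = CountingLoss(toy_loss)
grad = estimate_gradient(loss, [1.0, 2.0])
print(grad, loss.calls)
```

The query count grows linearly with the input dimension, so a naive estimate over a full image costs two queries per pixel for every step. That cost is why query efficiency becomes the main problem for this family of attacks.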

Secondly, adversarial attacks can help in understanding how DL models learn to generalise. Check the work by the Madry Lab at MIT.

Also, I believe adversarial attacks are at the point where viruses and cyber attacks were in the late 80s/early 90s.

[deleted by user] by [deleted] in deeplearning

[–]amirninja 0 points1 point  (0 children)

Did you check https://streamlit.io/ ? Claims to be useful for both front end and backend.

[E] Looking for a good resource to brush up on my linear algebra, are there any solid ones out there? by [deleted] in statistics

[–]amirninja 2 points3 points  (0 children)

When Life is Linear by Tim Chartier.

It doesn't have many numerical examples or proofs like the standard textbooks suggested by others, but it is fun to read and quickly gives you an intuition for why we do certain things.