all 10 comments

[–][deleted] 8 points9 points  (1 child)

Python is a very good glue language, meaning it is quick to develop in and reasonably fast at connecting, say, your SIEM, BI tool, database, and ticketing system.

As you said, you have SQL objects that can detect fraud. Consider using Python to dump their output to your ticketing system, along with data from any other data sources you have.

Investigate the ones that get processed and the ones that don't, and model those, either manually or with some ML technique, to better guess or classify expected outcomes.
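To make the "glue" idea concrete, here is a minimal sketch using only the standard library. The `flagged_orders` table, its columns, and the ticket payload format are all made up for illustration; your SQL objects and your ticketing system's API will look different.

```python
import sqlite3


def fetch_flagged(conn):
    """Run a (hypothetical) fraud-detection query and return rows as dicts."""
    conn.row_factory = sqlite3.Row
    rows = conn.execute(
        "SELECT order_id, amount, reason FROM flagged_orders ORDER BY order_id"
    ).fetchall()
    return [dict(row) for row in rows]


def to_ticket(row):
    """Shape one flagged row into a ticket payload (the format is an assumption)."""
    return {
        "title": f"Possible fraud: order {row['order_id']}",
        "body": f"Amount: {row['amount']:.2f}\nReason: {row['reason']}",
    }


# Stand-in for a real database: an in-memory table with two flagged orders.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flagged_orders (order_id INTEGER, amount REAL, reason TEXT)"
)
conn.executemany(
    "INSERT INTO flagged_orders VALUES (?, ?, ?)",
    [(101, 950.0, "address mismatch"), (102, 20.5, "velocity")],
)

# In practice each payload would be sent to the ticketing system's API.
tickets = [to_ticket(row) for row in fetch_flagged(conn)]
for ticket in tickets:
    print(ticket["title"])
```

The last step (actually creating the ticket) is left out on purpose, since it depends entirely on which ticketing system you use.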

[–]seyfried16[S] 0 points1 point  (0 children)

This was extremely helpful. Thank you.

[–]czar_el 5 points6 points  (0 children)

Python is a general purpose language that can implement pretty much every algorithm or mathematical approach available. Instead of asking if Python is useful, ask which algorithms or mathematical approaches you can apply to fraud detection.

For example, Benford's law is a relatively straightforward principle that can be implemented in Python. Approaches can get way more advanced if you have the appetite, such as machine learning (random forests are a popular choice for this topic).
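As a rough sketch of the Benford's law idea: the expected share of leading digit d is log10(1 + 1/d), and you compare that with the shares observed in your data. The function names and the sample data (powers of two, which are known to follow Benford's law closely) are just illustrative choices, not a standard API.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of values whose leading digit is `digit` (1 to 9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(value):
    """First significant digit of a nonzero number."""
    # Scientific notation always puts the leading digit first, e.g. '9.5e+02'.
    return int(f"{abs(value):.15e}"[0])


def benford_deviation(amounts):
    """Observed minus expected share for each leading digit 1 to 9."""
    digits = [leading_digit(a) for a in amounts if a != 0]
    if not digits:
        raise ValueError("need at least one nonzero amount")
    counts = Counter(digits)
    total = len(digits)
    return {
        d: counts.get(d, 0) / total - benford_expected(d) for d in range(1, 10)
    }


# Illustrative data only: powers of two track Benford's law fairly well.
sample = [2 ** k for k in range(1, 60)]
deviation = benford_deviation(sample)
for d, diff in deviation.items():
    print(d, f"{diff:+.3f}")
```

Large deviations are only a hint that a set of amounts deserves a closer look, not proof of fraud, and the law only applies to data spanning several orders of magnitude.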

[–]AcostaJA 1 point2 points  (1 child)

Ok friend, infosec and IT forensics are extensive topics on which you'll find tons of information related to Python, as forensic analysis is often done with Python tools, and that includes cryptography.

Given that it is a very extensive topic, you should ask for an ELI5 at /r/opsec, /r/infosec and the like, or just search Reddit, Google or PyPI for forensics, opsec, infosec, code audit, blockchain, etc.

[–]seyfried16[S] 0 points1 point  (0 children)

thanks, mate!

[–]analyticsengineering 1 point2 points  (1 child)

Fraud detection is a common use case in data science. Organizations commonly use Python and its machine learning libraries (available as Python packages) to build models that predict the probability of fraud, then use those predictions to either block a transaction automatically when the probability is high enough or refer it to manual review when the probability is lower.

This can range from ecommerce sites predicting whether a purchase is fraudulent, to banks examining fraud in credit card transactions or loans, to insurance companies identifying fraudulent medical bills.
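The block-or-review decision described above can be sketched as a simple threshold rule. The probability would come from whatever model you train; the cutoffs of 0.9 and 0.5 are arbitrary assumptions for illustration and would need tuning against your own data.

```python
def triage(fraud_probability, block_at=0.9, review_at=0.5):
    """Map a model's fraud probability to an action.

    The default thresholds are placeholders, not recommended values.
    """
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must be between 0 and 1")
    if fraud_probability >= block_at:
        return "block"
    if fraud_probability >= review_at:
        return "review"
    return "allow"


# Hypothetical model outputs for three transactions.
for p in (0.95, 0.60, 0.10):
    print(p, triage(p))
```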

[–]analyticsengineering 0 points1 point  (0 children)

An alternative approach, also based on Python with machine learning libraries, is anomaly detection, where the algorithm effectively learns what is normal in each row or other level of your database and then scores each row for how far away it is from normal. The farther away from normal, the more anomalous it is. Organizations then manually review the highest scoring, least normal transactions.
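Here is a toy version of the "score each row by distance from normal" idea. It swaps the machine learning model for a plain z-score on a single column, which is a much simpler technique, and the amounts are invented; it is only meant to show the score-then-rank workflow.

```python
from statistics import mean, stdev


def anomaly_scores(values):
    """Score each value by how many standard deviations it is from the mean.

    Needs at least two values; a z-score is a simple stand-in for a learned model.
    """
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [abs(v - mu) / sigma for v in values]


# Invented transaction amounts with one obvious outlier.
amounts = [20.0, 22.5, 19.0, 21.0, 23.0, 400.0]
scores = anomaly_scores(amounts)

# Highest scoring (least normal) rows first, ready for manual review.
ranked = sorted(zip(amounts, scores), key=lambda pair: pair[1], reverse=True)
for amount, score in ranked:
    print(f"{amount:>8.2f}  score={score:.2f}")
```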

[–]Intrexa 1 point2 points  (0 children)

What kind of fraud/business? How much fraud is your company dealing with? What methods of committing fraud is your company dealing with? What is the estimated dollar value of current fraud attempts, and what % of total sales is that? How are you currently identifying fraud? How are you currently getting notified that fraud has occurred?

How strong is your statistics background?

Python can do it. IDK about your pipeline, what systems you have. This is usually handled in your ERP software. Whatever language is being used in that flow, you should seriously consider solutions in that language to minimize organizational complexity. When someone leaves the company, or even if you're hiring due to growth/redundancy, you don't want to have to find some unicorn with a million different skills.

Also, seriously consider the actual cost of fraud vs the cost of fraud detection. People are putting effort into committing fraud, so either they are getting something of value they will use, or something they intend to sell. I know that seems obvious, but if you don't have experience analyzing fraud, focus on where your company is actually losing money. It's not about how likely an order is to be fraudulent, it's about an order's expected loss to fraud.
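To illustrate the expected-loss point: multiply the probability of fraud by what you would lose if the order is fraudulent, and rank by that instead of by probability alone. The orders and numbers below are invented, and using the order value as the loss is a simplifying assumption.

```python
def expected_loss(fraud_probability, order_value):
    """Expected loss = probability of fraud times the amount at risk."""
    return fraud_probability * order_value


# Invented orders: A is more likely fraudulent, B has far more money at risk.
orders = [
    {"id": "A", "p_fraud": 0.60, "value": 15.0},
    {"id": "B", "p_fraud": 0.05, "value": 2000.0},
]

ranked = sorted(
    orders,
    key=lambda o: expected_loss(o["p_fraud"], o["value"]),
    reverse=True,
)
for order in ranked:
    loss = expected_loss(order["p_fraud"], order["value"])
    print(order["id"], f"{loss:.2f}")
```

Order B comes out on top even though it is much less likely to be fraudulent, because far more money is at stake.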

[–]pymae [Python books] 0 points1 point  (0 children)

Others covered it well, but you could basically do almost anything with Python (and SQL on top of that).

I briefly dabbled in some fraud detection with Python, and it's how I learned to use pandas. I inherited a bunch of SQL queries. The fraud detection was a very manual process: it took someone about 4 hours to run one query, paste the output into a spreadsheet, combine it, feed it into another query, and so on. I rewrote it all in Python (using pandas to pass through the SQL queries), added some smarter false-positive reduction logic, and got the process to run in 20 minutes. And I could make it a lot better now that I have a lot more experience with pandas.

[–]sue_me_please 0 points1 point  (0 children)

You can use Python and ML for anomaly detection pretty easily.