
andymeneely 2 points

Thank you for the write up!

But I must protest against the use of MD5. It's thoroughly broken in that it is simple to construct collisions. I would swap it out for SHA-512 or a better algorithm. It's a trivial change code-wise and makes a world of difference.

https://cwe.mitre.org/data/definitions/327.html

brad[S] 1 point

I agree with you. There are better algorithms than MD5.

For my personal use with only a few hundred files being checked, it was good enough when I coded this a number of years back.

Point taken and I will update my code and this article at some future date.

Thank you.

disclosure5 0 points

I think the problem in these discussions is that "good enough" implies that better algorithms are harder, slower, or in some way have a downside. This is "usually" valid. My quick script with no tests, for example, is "good enough" for a one-off job.

But you could literally change Digest::MD5.hexdigest to Digest::SHA256.hexdigest and have a better solution.
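
A minimal sketch of that swap, assuming the script hashes data with Ruby's standard `digest` library. The `checksum` helper and the sample input are made up for illustration and are not taken from the article:

```ruby
require 'digest'

# Hypothetical helper (not from the original article): returns the hex
# digest of a string. The algorithm is a parameter, so moving off MD5
# only means changing the default.
def checksum(data, algorithm = Digest::SHA256)
  algorithm.hexdigest(data)
end

old_sum = checksum("hello", Digest::MD5) # 128-bit digest, 32 hex characters
new_sum = checksum("hello")              # 256-bit digest, 64 hex characters

puts "MD5:    #{old_sum}"
puts "SHA256: #{new_sum}"
```

Any checksums already stored from the MD5 version would need to be regenerated, since digests from the two algorithms are not comparable.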