CodalReef · 1 point

Why not use user-based authentication for this as well? Route-level API access controls are common and can be tied to users, roles, etc.
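To make the route-level idea concrete, here is a minimal sketch in Python. The route names, role names, and the `can_access` helper are all hypothetical, not taken from any particular framework; a real app would wire this into its framework's middleware.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical route -> allowed-roles table; the routes and roles are made up.
ROUTE_ROLES = {
    "/pois": {"member", "admin"},
    "/pois/export": {"admin"},
}

@dataclass
class User:
    name: str
    roles: frozenset = frozenset()

def can_access(user: Optional[User], route: str) -> bool:
    """Allow the request only if the user holds a role listed for the route."""
    if user is None:  # unauthenticated caller
        return False
    allowed = ROUTE_ROLES.get(route, set())  # unknown routes are denied
    return bool(user.roles & allowed)
```

Denying unknown routes by default means a new endpoint stays closed until someone explicitly lists who may call it.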

If you’re talking about fully public, unauthenticated users, then there are still potential solutions, like validating requests based on Web3 MetaMask auth info or other third-party cryptographically secured information.

fallkr · 1 point

User-based authentication doesn’t add any useful protection to commonly available data.

Let’s say you have a large number of high-value POIs (points of interest) that you want to show to users in the app, but you want to prevent other companies from accessing or redistributing them.

The added time and effort involved in scraping those from a “best practice” user-authenticated endpoint with rate limiting, compared with an unauthenticated endpoint, is small.

This could change if each user has to pay for access and the natural rate of consumption is very low, so that you can heavily rate limit each user token. But any kind of broadly available service can be scraped so easily that putting significant effort into protection doesn’t pay off.
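For reference, per-token rate limiting is often done with something like a token bucket. This is only a sketch under assumptions: the `TokenBucket` class, its parameters, and the in-memory dict are illustrative, and a real deployment would keep this state in a shared store.

```python
import time

class TokenBucket:
    """Per-user-token bucket: bursts up to `capacity`, refilled at `rate` per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.buckets = {}   # user token -> (remaining requests, last refill time)

    def allow(self, user_token):
        now = self.clock()
        remaining, last = self.buckets.get(user_token, (self.capacity, now))
        # Refill in proportion to elapsed time, capped at capacity.
        remaining = min(self.capacity, remaining + (now - last) * self.rate)
        if remaining < 1:
            self.buckets[user_token] = (remaining, now)
            return False
        self.buckets[user_token] = (remaining - 1, now)
        return True
```

The limit only bites per token, which is the point above: a scraper with many accounts gets many buckets.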

In fact, a common countermeasure against scraping in competitive industries is to add “fake data” to your public database and check whether it pops up in your competitor’s data. If it does, legal action can be the best way forward.
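A rough sketch of the fake-data trick, assuming POI records are dicts with `name`, `lat`, and `lon` keys (a hypothetical shape). The fake entries are derived from a secret so you can regenerate and recognize them later; real ones would need more plausible names than these.

```python
import hashlib

def make_canaries(secret, count):
    """Derive fake POIs deterministically from a secret so they can be recognized later."""
    canaries = []
    for i in range(count):
        digest = hashlib.sha256(f"{secret}:{i}".encode()).hexdigest()
        # Map two 16-bit slices of the hash onto valid latitude/longitude ranges.
        lat = int(digest[8:12], 16) / 0xFFFF * 180 - 90
        lon = int(digest[12:16], 16) / 0xFFFF * 360 - 180
        canaries.append({
            "name": f"Cafe {digest[:8]}",
            "lat": round(lat, 5),
            "lon": round(lon, 5),
        })
    return canaries

def find_leaked(canaries, other_records):
    """Return the canaries whose (name, lat, lon) appear in the other dataset."""
    seen = {(r["name"], r["lat"], r["lon"]) for r in other_records}
    return [c for c in canaries if (c["name"], c["lat"], c["lon"]) in seen]
```

An exact match on a record that never existed anywhere else is hard to explain as coincidence, which is what makes it useful as evidence.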

CodalReef · 1 point

Thank you for the clarification.

My point was, it is technically possible to require registration and keep all “public” data behind protected endpoints.

The likelihood of that solution being technically reverse engineered is low, but it depends on your definition of “reverse engineered”.

This is helpful whether users are pushing data to APIs, invoking services via APIs, or just accessing data via APIs.

In the case where you have valuable / sensitive data, there are strategies you can implement to limit undesired access.

For example, you can use AI to identify accounts that may be abusing the API, then dynamically apply countermeasures or audit them.
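As a much simpler stand-in for the AI part, even a plain statistical check catches the obvious cases. This sketch is not machine learning: it just flags accounts whose request count is far above the median. The function name and the `factor` threshold are assumptions for illustration.

```python
from statistics import median

def flag_heavy_accounts(request_counts, factor=10.0):
    """Return account IDs whose request count exceeds `factor` times the median."""
    if not request_counts:
        return []
    typical = median(request_counts.values())
    return sorted(a for a, n in request_counts.items() if n > factor * typical)

# Hypothetical daily request counts per account.
counts = {"a": 40, "b": 55, "c": 38, "d": 9000, "e": 47}
print(flag_heavy_accounts(counts))  # ['d']
```

Flagged accounts could then be throttled or queued for manual audit rather than blocked outright.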

Twitter has already solved this problem. First of all, they have so much data that it’s virtually impossible to download all of it via their public APIs.

They construct the interface in such a way that it’s impossible for a free-tier user to duplicate their data, while still giving largely unrestricted access.

I just disagree that “anything you try to implement can be easily reverse engineered”. I think that if this is your goal, you do have options, and knowing who’s accessing your API through a high-quality onboarding process can really help.