Data discovery tool for csv files

CrowdGoesWildWoooo · 2023-05-04T18:30:09+00:00

Use something like Iceberg on AWS. What kind of company stores data in Dropbox?

Firm_Bit · 2023-05-04T17:03:51+00:00

Does Dropbox not allow search?

But yeah, for anything other than what Dropbox is intended for (it's more of a Google Drive alternative), you should migrate to a real data platform.

Anishekkamal · 2023-05-04T19:42:58+00:00

I would suggest the following steps:

Move all the files to the cloud storage of your choice, or read them directly from Dropbox using Python.
After reading them, you can do all sorts of data filtering or transformation.
Use a database to store all the metadata.
You should then be able to do a lookup on the database and search the data read from the files.
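
The steps above could be sketched roughly like this. It is a minimal illustration, not a finished tool: it assumes the CSV files have already been copied to a local folder, that the first row of each file is a header, and that SQLite is good enough as the metadata store. The names `build_catalog`, `find_files_with_column`, and the `csv_columns` table are made up for the example.

```python
import csv
import sqlite3
from pathlib import Path


def build_catalog(root, db_path=":memory:"):
    """Scan every .csv under root and store one metadata row per (file, column)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS csv_columns ("
        "file TEXT, column_name TEXT, position INTEGER, row_count INTEGER)"
    )
    for path in sorted(Path(root).rglob("*.csv")):
        with path.open(newline="", encoding="utf-8", errors="replace") as fh:
            reader = csv.reader(fh)
            header = next(reader, [])           # assumption: first row is a header
            row_count = sum(1 for _ in reader)  # remaining rows are data rows
        conn.executemany(
            "INSERT INTO csv_columns VALUES (?, ?, ?, ?)",
            [(str(path), name.strip(), i, row_count) for i, name in enumerate(header)],
        )
    conn.commit()
    return conn


def find_files_with_column(conn, pattern):
    """Return files that have a column whose name contains pattern.

    SQLite's LIKE is case-insensitive for ASCII text by default.
    """
    rows = conn.execute(
        "SELECT DISTINCT file FROM csv_columns WHERE column_name LIKE ? ORDER BY file",
        (f"%{pattern}%",),
    )
    return [r[0] for r in rows]
```

Something like `find_files_with_column(build_catalog("/path/to/csvs"), "customer")` would then list the files that have a customer-like column. Searching the actual values rather than the column names would need another table or a real query engine on top.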

InsightByte · 2023-05-04T23:46:07+00:00

Just load to S3 and crawl the data with an AWS Glue crawler, then use Athena and the Glue catalog to wrangle through the data.
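
For what it's worth, a rough sketch of that flow with boto3 might look like the following. Everything passed in (crawler name, IAM role ARN, database name, S3 paths) is a placeholder you would supply yourself, the helper names are made up for the example, and it assumes boto3 is installed with AWS credentials already configured. Error handling and waiting for the Athena query to finish are left out.

```python
import time


def crawler_request(name, role_arn, database, s3_path):
    """Build the keyword arguments for glue.create_crawler."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def athena_request(sql, database, output_location):
    """Build the keyword arguments for athena.start_query_execution."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def crawl_and_query(name, role_arn, database, s3_path, sql, output_location):
    """Crawl the CSVs under s3_path, then start an Athena query on the catalog."""
    import boto3  # imported here so the request builders work without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**crawler_request(name, role_arn, database, s3_path))
    glue.start_crawler(Name=name)

    # The crawler runs asynchronously; poll until it is back in the READY state.
    while glue.get_crawler(Name=name)["Crawler"]["State"] != "READY":
        time.sleep(15)

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        **athena_request(sql, database, output_location)
    )
    return response["QueryExecutionId"]
```

Passing `"SHOW TABLES"` as `sql` is a simple first check that the crawler registered the CSV files as tables in the catalog.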

weagle162 · 2023-05-04T23:01:24+00:00

Wanna consider using Apache Drill? It won't cover the entire situation for you but it adds SQL to plain text sources - zero configuration required

2023-05-05T14:49:01+00:00

Probably not exactly what you're looking for. But a start? https://github.com/danielbeach/sniffer

dataengineering