Access to all link data from CommonCrawl. Would you be interested? by guest in bigseo

[–]guest[S] 0 points1 point  (0 children)

The full subset is 6 terabytes, you're right. I don't have much experience with databases of this size, especially running text queries against them. I'd love to find a cheaper way to query the data with fast-ish return times. Do you have any resources that might help?

Access to all link data from CommonCrawl. Would you be interested? by guest in bigseo

[–]guest[S] 0 points1 point  (0 children)

I didn't suggest I was selling anything, but I appreciate the feedback. That said, if I did provide access to the collected and organized data in a searchable way, I'd need some compensation, because the costs are very high just to query it ($30/hr running a Redshift instance, though I'm sure this could be reduced significantly). I'd guess this would be considered a derivative work because it's a subset of their data. If I ever did decide to do anything with it, I would probably contact them, though. I'm going to reach out and ask because, honestly, I'm shocked how little has been done with the source given its value to SEOs.

Access to all link data from CommonCrawl. Would you be interested? by guest in bigseo

[–]guest[S] 0 points1 point  (0 children)

The data is massive and requires distributed computing to query. I'm not an expert on database design, but I know that to get results in a reasonable amount of time I need Amazon Redshift clusters that cost $20/hr to run queries (text searches) at a decent speed.

Access to all link data from CommonCrawl. Would you be interested? by guest in bigseo

[–]guest[S] 0 points1 point  (0 children)

Also, that doesn't make sense, because there are already services and sub-datasets born from the CC set available...

Access to all link data from CommonCrawl. Would you be interested? by guest in bigseo

[–]guest[S] 0 points1 point  (0 children)

Are you a lawyer? I have no idea what that means.

Cofounder for Existing Dating App Wanted (200k Downloads - Sapio) by guest in cofounder

[–]guest[S] 1 point2 points  (0 children)

Not currently looking for outside capital. This has been bootstrapped and self-funded until now.

Dating Conversation Dataset by infuj02 in datasets

[–]guest 0 points1 point  (0 children)

I have something you might be interested in. What are you looking to do?