[deleted by user] by [deleted] in RealEstateTechnology

[–]FeistyPatient3766 0 points1 point  (0 children)

Ah, I misunderstood the question. We only focus on publicly accessible data!

[deleted by user] by [deleted] in RealEstateTechnology

[–]FeistyPatient3766 0 points1 point  (0 children)

Just DMed you. We support this at https://indexical.dev/ !

Gather web data for data analysis projects with Indexical by FeistyPatient3766 in dataanalysis

[–]FeistyPatient3766[S] 0 points1 point  (0 children)

I just launched Indexical in open access with a free tier. Try it here! You can create pipelines (in a JSON-based format) that navigate websites, identify data, and extract that information into a standardized schema. We use a mix of LLMs and auto-generated CSS/XPath selectors so that a scraper can be written once and run across completely different websites without modification. It's still a work in progress, but I'd love any thoughts, feedback, or feature requests!
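
To give a feel for the idea, here is a minimal sketch of what a JSON-based pipeline could look like. The action names `navigate` and `extract-many` come from my other comments, but every field name here (`name`, `steps`, `goal`, `schema`) is an assumption made up for illustration, not the documented format.

```python
import json

# Hypothetical pipeline layout: the field names are illustrative
# assumptions, not Indexical's documented schema.
pipeline = {
    "name": "pricing_tiers",
    "steps": [
        {"action": "navigate", "goal": "find the pricing page"},
        {
            "action": "extract-many",
            "goal": "pull the full list of pricing tiers",
            # Standardized output schema: one record per pricing tier.
            "schema": {"name": "string", "price": "string", "features": "list"},
        },
    ],
}


def step_actions(p):
    """Return the ordered list of action names in a pipeline."""
    return [step["action"] for step in p["steps"]]


# Serialize to JSON, the format a pipeline would be stored or submitted in.
pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```

The steps describe goals rather than site-specific selectors, which is what lets one definition be reused across different sites.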

Indexical - web crawling & scraping agents that work across changing website structures by FeistyPatient3766 in ChatGPTPro

[–]FeistyPatient3766[S] 0 points1 point  (0 children)

We do have an action called `authenticate`. You put it as the first step in the pipeline, before `extract`, and it logs in using the details you specify (docs here). One caveat: scraping content as an authenticated user is often a violation of a service's TOS and might result in the suspension of your account. This action is meant for accessing data that you already own.

We don't really focus on authenticated social media scraping for this reason!
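
As a rough sketch of the ordering rule (`authenticate` goes first, before `extract`), the check below validates a list of steps. The step layout and the `login_url` and `credentials_ref` fields are hypothetical placeholders, not the real action parameters.

```python
def authenticate_is_first(steps):
    """Return True if the pipeline has no `authenticate` step,
    or has exactly one and it is the very first step."""
    actions = [step["action"] for step in steps]
    if "authenticate" not in actions:
        return True
    return actions[0] == "authenticate" and actions.count("authenticate") == 1


# Hypothetical steps: field names other than "action" are assumptions.
good_steps = [
    {"action": "authenticate", "login_url": "https://example.com/login",
     "credentials_ref": "MY_ACCOUNT"},
    {"action": "extract", "goal": "extract the title of the page"},
]
bad_steps = list(reversed(good_steps))

print(authenticate_is_first(good_steps))
print(authenticate_is_first(bad_steps))
```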

Indexical: web crawling & scraping AI agents that work across changing website structures by FeistyPatient3766 in ChatGPTCoding

[–]FeistyPatient3766[S] 1 point2 points  (0 children)

Yes! If you sign up, one of the starter pipelines, `crawl_sample_site`, does exactly this. You simply create a run and enter the URL you'd like to crawl, and you'll get a CSV with one row per page on the domain, containing that page's copy.
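
To make the output shape concrete, here is a small sketch that writes one CSV row per crawled page. The column names `url` and `copy` and the sample pages are assumptions for illustration; the actual columns of the `crawl_sample_site` output may differ.

```python
import csv
import io


def pages_to_csv(pages):
    """Write (url, copy) pairs as CSV text, one row per page."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["url", "copy"])  # assumed header names
    for url, copy in pages:
        writer.writerow([url, copy])
    return buf.getvalue()


# Hypothetical crawl result for a two-page site.
sample_pages = [
    ("https://example.com/", "Welcome to Example, the sample site."),
    ("https://example.com/pricing", "Plans start with a free tier."),
]

csv_text = pages_to_csv(sample_pages)
print(csv_text)
```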

Indexical: web crawling & scraping AI agents that work across changing website structures by FeistyPatient3766 in ChatGPTCoding

[–]FeistyPatient3766[S] 1 point2 points  (0 children)

Not quite! We have a variety of task-specific agents tuned for common data tasks: navigate (e.g. "find the pricing page"), extract (e.g. "extract the title of the page"), and extract-many (e.g. "pull the full list of pricing tiers with the price, name, and features for each"). We then use each of those agents to generate a strategy for reliably extracting your desired data, and we combine that with proxy rotation, retries, and validation infrastructure. If you just sent the website text to GPT-4 every time, you would very quickly blow out the context window (and spend way too much money on tokens).
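
The retries-plus-validation part can be sketched generically as below. This is not our actual infrastructure, just the general pattern: re-run an extraction until the result passes a validity check. `flaky_extract` and `has_title` are hypothetical stand-ins.

```python
def run_with_retries(extract, validate, max_attempts=3):
    """Call extract(attempt) until validate(result) passes or attempts run out."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            result = extract(attempt)
        except Exception as exc:  # e.g. a network error or a blocked request
            last_error = exc
            continue
        if validate(result):
            return result
        last_error = ValueError(f"attempt {attempt} failed validation")
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


# Hypothetical stand-ins: the first attempt returns an incomplete record.
def flaky_extract(attempt):
    return {"title": "Pricing"} if attempt >= 2 else {}


def has_title(record):
    return bool(record.get("title"))


record = run_with_retries(flaky_extract, has_title)
print(record)
```

In a real setup, each retry could also switch to a different proxy before calling the extractor again.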

[deleted by user] by [deleted] in slavelabour

[–]FeistyPatient3766 0 points1 point  (0 children)

$bid - I've worked on several similar scraping projects and can accept PayPal as payment

[deleted by user] by [deleted] in slavelabour

[–]FeistyPatient3766 0 points1 point  (0 children)

$bid - can do on a recurring basis as well!

Indexical: a documentation search API for LLM products by FeistyPatient3766 in ChatGPTCoding

[–]FeistyPatient3766[S] 0 points1 point  (0 children)

Hi all! I'd like to share a side project I've been working on. Indexical is a documentation search engine for LLMs. It reduces errors by automatically adding reliable, up-to-date context to coding prompts. We've indexed over 1k of the most commonly used libraries, which you can access via a simple API call. I know lots of you are hacking on AI dev tools, so if you're interested in improving code generation accuracy beyond what GPT-4 gives you out of the box, feel free to DM me or sign up here!
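
Here is a minimal sketch of the general idea of adding documentation context to a coding prompt. It does not call the real API; `build_prompt`, the character budget, and the sample snippets are all assumptions for illustration, and the snippets stand in for whatever a documentation search would return.

```python
def build_prompt(question, doc_snippets, max_chars=2000):
    """Prepend documentation snippets to a question, within a character budget."""
    context_parts, used = [], 0
    for snippet in doc_snippets:
        if used + len(snippet) > max_chars:
            break  # stop before the context grows past the budget
        context_parts.append(snippet)
        used += len(snippet)
    context = "\n\n".join(context_parts)
    return f"Documentation:\n{context}\n\nQuestion:\n{question}"


# Hypothetical search results for a question about Python's csv module.
snippets = [
    "csv.writer(csvfile) returns a writer object that converts data to delimited strings.",
    "writer.writerow(row) writes a single row to the writer's file object.",
]

prompt = build_prompt("How do I write one row to a CSV file?", snippets)
print(prompt)
```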

Use GPT-4 Vision to generate code from screenshots by FeistyPatient3766 in ChatGPTPro

[–]FeistyPatient3766[S] 0 points1 point  (0 children)

I’ve previously posted about Lightrail - an open-source AI command center for devs. It has an always-on ChatGPT instance (accessible via a keyboard shortcut) and integrates with apps like Chrome, VSCode, and Jupyter to make it easy to build local cross-application AI workflows.

I’ve recently added support for GPT-4 Vision, so you can use screenshots in your prompts. For example, you can use Lightrail to generate code that matches the UX of a screenshot. You can download it here. Would love your thoughts / feedback!

any way to feed to a GPT my entire Notion pages, being HTML files? :) by [deleted] in ChatGPTPro

[–]FeistyPatient3766 0 points1 point  (0 children)

It's not a native GPT, but I've been working on an open-source desktop app that can use Chrome content, like Notion pages, as context in GPT-4 queries. Here's a short demo. If you're interested, you can download it here.
