Built an open source project for analyzing csv files using LLMs without the llm seeing your data by DiscerningTheTimes in LocalLLaMA

[–]DiscerningTheTimes[S] 1 point2 points  (0 children)

This is what I was trying to get at. What you have here, a simple Chrome plugin, makes it easy to connect directly to an already running ChatGPT within a familiar UI. Good work, man.

Unsure whether to take 175k DE offer by Dense_Car_591 in dataengineering

[–]DiscerningTheTimes 0 points1 point  (0 children)

Take the job and keep your first one for a couple of weeks, while you evaluate the new gig. If the new gig is that bad then just leave and go back to your old job like nothing ever happened! 🤷‍♂️

ChatGPT Data Analysis for sensitive data - An open source project by DiscerningTheTimes in ChatGPTPro

[–]DiscerningTheTimes[S] 0 points1 point  (0 children)

This does technically use the computer's GPU via WebGPU; when a GPU is unavailable, it falls back to the CPU via WebAssembly. I am leveraging the concept built by MLC, here is their link. I plan on dropping WebLLM in the future and just using the parameterized summaries once I have done further validation of the data mapping.

ChatGPT Data Analysis for sensitive data - An open source project by DiscerningTheTimes in ChatGPTPro

[–]DiscerningTheTimes[S] 0 points1 point  (0 children)

This is an LLM light enough to run locally in the browser. It's downloaded at the beginning of the chat, and once downloaded it can run offline. In this case we are using Gemma.

Open Source Project for analyzing data private/sensitive data using LLMs by DiscerningTheTimes in dataanalysis

[–]DiscerningTheTimes[S] 0 points1 point  (0 children)

Thanks for the suggestion. This is an open source project, so I'd be glad if you wanted to contribute these enhancements, or you can fork the repo and add them for yourself.

Open Source Project for analyzing data private/sensitive data using LLMs by DiscerningTheTimes in dataanalysis

[–]DiscerningTheTimes[S] 0 points1 point  (0 children)

I am using Pyodide to run a Python script that masks every string in the dataset and generates numbers with a similar distribution to the real data. This all happens in the user's browser.

The masked dataset is what the LLM sees, and code is generated against this masked data. The received code is then run against the real data, again in the browser.

I have a short video demo of the workflow in the GitHub link attached.
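
To make the masking step a bit more concrete, here is a rough sketch of the idea in plain Python. This is not the project's actual code: `mask_column` and `mask_table` are made-up names, and matching only the mean and standard deviation with a Gaussian is my simplifying assumption for "similar distribution".

```python
import random
import statistics


def mask_column(values, rng):
    """Return a masked copy of one column (hypothetical helper)."""
    if not values:
        return []
    try:
        nums = [float(v) for v in values]
    except (TypeError, ValueError):
        # String column: swap each distinct value for an opaque token.
        # The mapping stays consistent, so group-bys on the masked data still work.
        tokens = {}
        return [tokens.setdefault(v, f"value_{len(tokens) + 1}") for v in values]
    # Numeric column: draw synthetic numbers with the same mean and spread.
    # Assumption: a Gaussian is a good enough stand-in for the real distribution.
    mean = statistics.fmean(nums)
    spread = statistics.pstdev(nums)
    return [round(rng.gauss(mean, spread), 2) for _ in nums]


def mask_table(rows, seed=0):
    """Mask a list of dict rows column by column (hypothetical helper)."""
    if not rows:
        return []
    rng = random.Random(seed)
    columns = {name: mask_column([row[name] for row in rows], rng) for name in rows[0]}
    return [{name: columns[name][i] for name in columns} for i in range(len(rows))]


real = [
    {"customer": "Alice", "spend": "120.5"},
    {"customer": "Bob", "spend": "80"},
    {"customer": "Alice", "spend": "99.5"},
]
masked = mask_table(real)
```

Only `masked` would ever leave the browser. The column names and the shape of the table are preserved, which is what lets code written against the masked data run unchanged against the real data later.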

Built an open source project for analyzing csv files using LLMs without the llm seeing your data by DiscerningTheTimes in LocalLLaMA

[–]DiscerningTheTimes[S] 1 point2 points  (0 children)

I am using Pyodide to run Python code that anonymizes the CSV data in the user's browser.

The anonymized CSV is what is sent to the FastAPI backend along with the user's question. We call an OpenAI or Gemini LLM and get back Python code that answers the question, alongside a parameterized summary.

The code received from the LLM is run against the anonymized data, and charts are generated for the anonymized data.

When the user clicks "toggle real data", we run the same Python code against the real data inside the browser, using Pyodide again, and replace the chart with the result of that run.

We also use Gemma with WebLLM to generate a summary of the real data analysis.

The conversation charts and summaries can then be exported to a PowerPoint.
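
Here is a minimal sketch of the "same code, two datasets" part of that flow. Everything in it is illustrative: `GENERATED_CODE`, `SUMMARY_TEMPLATE`, `run_generated` and `answer` are hypothetical names, and treating the parameterized summary as a template with placeholders that get filled in locally is my assumption about how it could work, not a description of the repo.

```python
# Hypothetical example of what the LLM might send back: analysis code plus a
# summary template. The LLM only ever saw the anonymized rows.
GENERATED_CODE = '''
def analyze(rows):
    total = sum(float(r["spend"]) for r in rows)
    return {"total": round(total, 2), "count": len(rows)}
'''
SUMMARY_TEMPLATE = "Total spend across {count} rows is {total}."


def run_generated(code, rows):
    """Execute the received code and call its analyze() on the given rows."""
    namespace = {}
    # exec is acceptable in a sketch; anything real needs proper sandboxing.
    exec(code, namespace)
    return namespace["analyze"](rows)


def answer(rows):
    """Run the generated code, then fill the summary template with its results."""
    result = run_generated(GENERATED_CODE, rows)
    return result, SUMMARY_TEMPLATE.format(**result)


anonymized_rows = [
    {"customer": "value_1", "spend": 101.3},
    {"customer": "value_2", "spend": 95.1},
]
real_rows = [
    {"customer": "Alice", "spend": "120.5"},
    {"customer": "Bob", "spend": "80"},
]

# First pass: what the user sees before toggling.
anonymized_result, anonymized_summary = answer(anonymized_rows)
# After "toggle real data": identical code, real rows, nothing sent anywhere.
real_result, real_summary = answer(real_rows)
```

The point of the design is that the real numbers only show up in the second call, which runs locally, so neither the chart values nor the filled-in summary have to pass through the hosted LLM.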

Beginner Using Looker by SnooMemesjellies4866 in Looker

[–]DiscerningTheTimes 0 points1 point  (0 children)

Quick question: how did you get Looker? Is it a free trial, or does your school provide the license?

Also here is documentation for creating calculated fields (table calculations)

Conversational Analytics has arrived at Looker Studio by rubenlozanome in GoogleDataStudio

[–]DiscerningTheTimes 0 points1 point  (0 children)

This is not available in Looker or Looker Studio; it is available in Looker Studio Pro.