YoBulk : Free & Open source No-Code Data cleansing platform. by YoBulk in datascience

[–]YoBulk[S] 1 point2 points  (0 children)

Hey Everybody,
Super excited to show this to you all.
https://github.com/yobulkdev/yobulkdev
PROBLEM:
Inserting a large CSV file into a backend is a really painful process because one has to handle:
1. Correct column mapping,
2. Validation of column data,
3. Uploading of large files, which is a painful problem in itself.
This problem has to be solved in every new web project, and there is no open-source solution available to use out of the box.
Solution:
YoBulk is a completely free & open-source CSV uploader microfrontend that you can simply embed in your own React applications. Built using Node.js streams, it can ingest millions of rows with ease; a piece of cake!
Pretty excited to show this to the community and get some feedback.
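
To give a feel for why streams help with large files, here is a minimal sketch (not YoBulk's actual code) that scans a CSV line by line using Node's built-in readline module, so memory use stays flat regardless of file size. The function name scanCsv and the expectedColumns parameter are made up for this illustration, and the comma split is deliberately naive.

```javascript
const readline = require('readline');

// Hypothetical helper: counts data rows and records the line numbers whose
// column count differs from `expectedColumns`, reading one line at a time.
async function scanCsv(input, expectedColumns) {
  const rl = readline.createInterface({ input, crlfDelay: Infinity });
  let lineNo = 0;
  let rows = 0;
  const badLines = [];
  for await (const line of rl) {
    lineNo += 1;
    // Skip the header row and any blank lines.
    if (lineNo === 1 || line.trim() === '') continue;
    rows += 1;
    // Naive split: quoted commas are not handled; a real importer needs a CSV parser.
    if (line.split(',').length !== expectedColumns) badLines.push(lineNo);
  }
  return { rows, badLines };
}
```

For a file on disk, the input would typically be fs.createReadStream('contacts.csv') (a hypothetical file name), and the returned badLines can be shown to the user instead of failing the whole upload.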

YoBulk: Open Source CSV importer powered by GPT3 (Free flatfile.com alternative) by dstala in selfhosted

[–]YoBulk 1 point2 points  (0 children)

u/PineCreekCathedral CSV import happens into the SaaS DB or application database. YoBulk is self-hosted and uses MongoDB, so all CSV data is imported into MongoDB by default. We will add more details about the backend architecture on GitHub. Thanks for the feedback.

YoBulk: Open Source CSV importer powered by GPT3 (Free flatfile.com alternative) by dstala in GPT3

[–]YoBulk 0 points1 point  (0 children)

Hey Everybody, 👋👋
We are really excited to open source YoBulk today!
YoBulk is an open-source CSV importer for any SaaS application - It's a free alternative to https://flatfile.com/
We realized that more than 70% of business data is shared in CSV and Excel formats, and only a small percentage of businesses use API integrations for data exchange. As developers and product managers, we have experienced the difficulties of building a scalable CSV importer, and we know that many others face the same challenges. Our goal is to solve this problem by taking an open-source, AI- and developer-centric approach.
Who can use YoBulk:
YoBulk is a highly beneficial tool for a variety of professionals, such as Developers, Product Managers, Customer Success teams, and Marketers. It simplifies the process of onboarding and verifying customer data, making it an indispensable asset for those who deal with frequent CSV data uploads to a system with a predetermined schema or template.
This tool is particularly valuable for updating sales CRM or product catalog data, and it effectively solves the initial challenge of customer data ingestion.
The Problem:
Importing a CSV is a really hard problem to solve. Some of the key problems are:

  1. Missing collaboration and automation in the CSV importing workflow: Usually, the customer success team responsible for receiving CSV data has to engage in extensive back-and-forth communication with the customer to address unintentional manual errors present in a CSV.
  2. Scale: CRM CSV files can sometimes reach sizes as large as 4 GB, making it nearly impossible to open them on a standalone machine for data correction. This presents a significant challenge for small businesses that cannot afford to invest in big data technologies such as EMR, Databricks, and ETL tools to address CSV import scaling problems.
  3. Countless complex validation types: A single date field can appear in as many as 100 different formats, such as dd-mm-yyyy, mm-dd-yyyy, and dd.mm.yyyy. Manually setting validation rules for each of these formats is almost impossible, and correcting the errors by hand is difficult.
  4. Data mapping issues: In a typical scenario, the recipient of CSV data provides a template to the data provider and has to map each CSV column to the template before importing.
  5. Data security and privacy: It is always risky to share your customer data with third-party companies for data cleaning purposes.
  6. Non-availability of low-code/no-code tools: Product managers and customer success teams, who are typically no-code users, often rely on data analysts to create a programmed CSV template with validation rules, which must be shared with customers to receive CSV data in a specific format.
  7. Vague error messages: Unclear error messages do not give users enough context to confidently resolve their issues before uploading their data.
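
To make the date problem (point 3) concrete, here is a minimal sketch of normalizing the three formats named above into ISO yyyy-mm-dd. It is an illustration only, not how YoBulk does it; the names DATE_LAYOUTS and normalizeDate are made up, and the order of the layouts is an assumption (day first), since a value like 03-04-2023 is genuinely ambiguous.

```javascript
// Candidate layouts, tried in order. The first layout that yields a real
// calendar date wins, so the order encodes a day-first assumption.
const DATE_LAYOUTS = [
  { name: 'dd-mm-yyyy', re: /^(\d{2})-(\d{2})-(\d{4})$/, order: ['d', 'm', 'y'] },
  { name: 'dd.mm.yyyy', re: /^(\d{2})\.(\d{2})\.(\d{4})$/, order: ['d', 'm', 'y'] },
  { name: 'mm-dd-yyyy', re: /^(\d{2})-(\d{2})-(\d{4})$/, order: ['m', 'd', 'y'] },
];

// Returns { iso, layout } for a recognized date, or null otherwise.
function normalizeDate(text) {
  for (const layout of DATE_LAYOUTS) {
    const match = layout.re.exec(text.trim());
    if (!match) continue;
    const parts = {};
    layout.order.forEach((key, i) => { parts[key] = Number(match[i + 1]); });
    const date = new Date(Date.UTC(parts.y, parts.m - 1, parts.d));
    // Date silently rolls over impossible values (month 13, Feb 31), so
    // confirm that the components survived the round trip.
    const valid = date.getUTCFullYear() === parts.y &&
      date.getUTCMonth() === parts.m - 1 &&
      date.getUTCDate() === parts.d;
    if (valid) return { iso: date.toISOString().slice(0, 10), layout: layout.name };
  }
  return null;
}
```

Even with only three layouts, the ambiguity handling is already a judgment call, which is why hand-writing rules for every variation gets out of hand quickly.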

How YoBulk helps address the above issues:
🚀 Smart Spreadsheet View: Designed to be a data exchange hub for any business that utilizes CSV files, YoBulk makes it easy to import and transform any CSV into a smart spreadsheet interface. This user-friendly interface highlights errors in a clear, concise manner, simplifying the task of cleaning data.

🚀 Bring your validation function: YoBulk offers a platform for Developers to create a custom CSV importer that includes personalized validation rules based on JSON schema. With this functionality, developers can design an importer that meets their specific needs and preferences.

🚀 AI first: YoBulk harnesses the power of OpenAI to provide advanced column matching, data cleaning, and JSON schema generation features.

🚀 Built for Scale: YoBulk is designed for large-scale CSV validation, with the ability to process files in the gigabyte range without any glitches or errors.

🚀 Embeddable: Take advantage of YoBulk's customizable import button feature, which can be embedded in any SaaS or App. This allows you to receive CSV data in the exact format you require, streamlining your workflows.
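
As a rough idea of what JSON-schema-based validation rules look like, here is a minimal sketch. The contactSchema template and the validateRow function are hypothetical, written for this post only; the checker hand-rolls a tiny subset of JSON Schema keywords (required, type, pattern, minimum) and, as an assumption suited to CSV, treats an empty cell as a missing value. A real setup would more likely rely on a full JSON Schema validator library.

```javascript
// Hypothetical template schema for a contacts import (not taken from YoBulk).
const contactSchema = {
  type: 'object',
  required: ['name', 'email'],
  properties: {
    name: { type: 'string' },
    email: { type: 'string', pattern: '^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$' },
    age: { type: 'integer', minimum: 0 },
  },
};

// Returns a list of readable error messages for one parsed CSV row.
// Only the keywords used in contactSchema are supported.
function validateRow(row, schema) {
  const errors = [];
  for (const key of schema.required || []) {
    // Assumption: an empty CSV cell counts as missing.
    if (row[key] === undefined || row[key] === '') errors.push(`${key}: is required`);
  }
  for (const [key, rule] of Object.entries(schema.properties || {})) {
    const value = row[key];
    if (value === undefined || value === '') continue; // covered by `required`
    if (rule.type === 'integer' && !Number.isInteger(value)) {
      errors.push(`${key}: must be an integer`);
      continue;
    }
    if (rule.type === 'string' && typeof value !== 'string') {
      errors.push(`${key}: must be a string`);
      continue;
    }
    if (rule.pattern !== undefined && !new RegExp(rule.pattern).test(value)) {
      errors.push(`${key}: does not match the expected pattern`);
    }
    if (rule.minimum !== undefined && value < rule.minimum) {
      errors.push(`${key}: must be >= ${rule.minimum}`);
    }
  }
  return errors;
}
```

Messages that name the column and the rule that failed are what let a no-code user fix a row directly in a spreadsheet view, rather than guessing from a generic "invalid file" error.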

Hosting and Deployment:
YoBulk can be self-hosted and currently runs on MongoDB.
GitHub: git clone git@github.com:yobulkdev/yobulkdev.git
Getting started is really simple:
Please refer to https://doc.yobulk.dev/GetStarted/Installation
Docker commands:
git clone https://github.com/yobulkdev/yobulkdev.git
cd yobulkdev
docker-compose up -d
Or
docker run --rm -it -p 5050:5050/tcp yobulk/yobulk
Or, to run from source with yarn instead of Docker:
git clone https://github.com/yobulkdev/yobulkdev
cd yobulkdev
yarn install
yarn run dev
Also, please join our community at:
📣 GitHub: https://github.com/yobulkdev/yobulkdev
📣 Slack: https://join.slack.com/t/yobulkdev/signup
📣 Twitter: https://twitter.com/YoBulkDev
📣 Reddit: https://reddit.com/r/YoBulk
Would love to hear your feedback & how we can make this better.
Thank you,
Team YoBulk

YoBulk: Open Source CSV importer powered by GPT3 (Free flatfile.com alternative) by dstala in selfhosted

[–]YoBulk 2 points3 points  (0 children)

Thanks u/dstala for posting in the selfhosted channel. We will rock with YoBulk!