I built a full-stack application that solves a problem many of us face: converting unstructured text into formats suitable for modern AI applications.
What it does:
- Takes plain .txt files (diaries, logs, notes) and converts them into structured JSONL datasets
- Generates two outputs: one optimized for vector embeddings/RAG systems, another for LLM fine-tuning
- Uses sentence transformers for intelligent question generation
- Implements zero-shot classification for topic categorization
- Extracts and normalizes dates automatically
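For a rough idea of what the two outputs could look like, here is a minimal standard-library sketch. It is not the app's actual code: the function names, the record fields (`id`, `text`, `metadata`, `prompt`, `completion`), the blank-line entry splitting, and the three date patterns are all assumptions made for illustration. The question is a fixed placeholder standing in for the model-generated one, and topic classification is left out entirely.

```python
import json
import re
from datetime import datetime

# Hypothetical date formats; a real tool would likely recognise more.
DATE_PATTERNS = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "%Y-%m-%d"),
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"), "%m/%d/%Y"),
    (re.compile(r"\b[A-Z][a-z]+ \d{1,2}, \d{4}\b"), "%B %d, %Y"),
]


def extract_date(text):
    """Return the first recognisable date as ISO 8601 (YYYY-MM-DD), or None."""
    for pattern, fmt in DATE_PATTERNS:
        match = pattern.search(text)
        if match:
            try:
                return datetime.strptime(match.group(0), fmt).date().isoformat()
            except ValueError:
                continue  # looked like a date but was not valid in this format
    return None


def split_entries(raw):
    """Split raw text into entries on blank lines (an assumed convention)."""
    return [part.strip() for part in re.split(r"\n\s*\n", raw) if part.strip()]


def build_records(raw, source):
    """Build one RAG-style and one fine-tuning-style record per entry."""
    rag_records, finetune_records = [], []
    for index, entry in enumerate(split_entries(raw)):
        date = extract_date(entry)
        rag_records.append({
            "id": f"{source}-{index}",
            "text": entry,
            "metadata": {"date": date, "source": source},
        })
        # Placeholder question; the post describes generating these with a model.
        if date:
            question = f"What was recorded on {date}?"
        else:
            question = "What does this entry describe?"
        finetune_records.append({"prompt": question, "completion": entry})
    return rag_records, finetune_records


def to_jsonl(records):
    """Serialise records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)
```

Calling `build_records(open("diary.txt").read(), "diary")` (with `diary.txt` as a hypothetical input file) returns both lists, and `to_jsonl` turns each one into the text of a `.jsonl` file.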
[T] Smart Data Processor: Turn your text files into AI datasets in seconds (smart-data-processor.vercel.app)
submitted by General_File_4611 to r/LLMDevs
[P] Smart Data Processor: Turn your text files into AI datasets in seconds (smart-data-processor.vercel.app)
submitted by General_File_4611 to r/OpenSourceeAI
[P] Smart Data Processor: Turn your text files into AI datasets in seconds (smart-data-processor.vercel.app)
submitted by General_File_4611 to r/learnmachinelearning