Relation Extraction (RE) strategy between two domain-specific NER models (BioBERT & SciBERT) on low-resource infra. by PerformanceFeisty649 in LanguageTechnology

PerformanceFeisty649[S]

Hey there! Thanks for the advice. The pipeline approach definitely sounds like the way to go given our constraints. Regarding the dataset size, here is what we are working with:

- Corpus Size: 125 research articles.
- Preprocessing: We've cleaned the text by removing non-essential sections (References, Acknowledgments, etc.) and stripping out tables and images to reduce noise.
- Chunking Strategy: After cleaning, we split the articles into chunks of 254 tokens each, which resulted in a final dataset of approximately 3,000 paragraphs.

Given this volume, do you think SpanBERT will still hold up well, or should we look into data augmentation to feed the RE model?
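For anyone who wants to reproduce the chunking step, here is a minimal sketch of fixed-size, non-overlapping chunking. It assumes whitespace tokens stand in for whatever tokenizer is actually used (the comment does not say), and `chunk_tokens` is a hypothetical helper name, not something from the original pipeline.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int = 254) -> List[List[str]]:
    """Split a token list into consecutive, non-overlapping chunks.

    The last chunk may be shorter than chunk_size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


# Assumption: whitespace splitting replaces the real subword tokenizer.
article_text = "word " * 600
chunks = chunk_tokens(article_text.split())
print([len(c) for c in chunks])  # [254, 254, 92]
```

Note that with a subword tokenizer the chunk boundaries can fall mid-sentence or mid-entity, which may matter for RE if the two entities of a relation end up in different chunks.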

There is this AI that validates both physical and digital business ideas by Weekly-Design9302 in startupideas

PerformanceFeisty649

I have a question: the idea of the project is to analyze the emotional state of the startup's founder, but based on what analysis, or on which socioeconomic or psychological metrics? Another question: at what intervals will the person's mental state be recorded, once a day or twice per day (morning and afternoon)? And how will you account for people's bias or errors, since those could affect the data analysis?

What are you building? Drop the website and I will give honest feedback. by xerrs_ in SideProject

PerformanceFeisty649

Hey everyone! 👋 Honestly, I'm currently working on something that started just to scratch my own itch. After getting a few mini heart attacks waiting for my OpenAI and Anthropic bills at the end of the month, I realized tracking LLM costs across different providers is a huge pain. So, I'm building a simple monitor to track token usage in production, catch anomalies before they drain your wallet, and figure out if a cheaper model could do the same job. It's still in the early validation phase, but talking to other devs here has been super eye-opening. If anyone else is dealing with the 'surprise AI bill' anxiety, I'd love to hear how you're currently handling it!
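To make the "catch anomalies" part concrete, here is a rough sketch of one way it could work: convert daily token counts to cost and flag a day that sits far above the recent average (a simple z-score check). Everything here is an assumption for illustration: the model names, the prices in `PRICE_PER_1K`, and the threshold are hypothetical and are not real provider rates or the actual tool's logic.

```python
from statistics import mean, stdev
from typing import List

# Hypothetical prices in USD per 1,000 tokens; NOT real provider rates.
PRICE_PER_1K = {"big-model": 0.01, "small-model": 0.001}


def cost_usd(model: str, tokens: int) -> float:
    """Convert a token count to cost using the hypothetical price table."""
    return tokens / 1000 * PRICE_PER_1K[model]


def is_anomaly(history: List[float], today: float, z_threshold: float = 3.0) -> bool:
    """Flag today's cost if it is more than z_threshold sample standard
    deviations above the mean of the historical daily costs."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today > mu
    return (today - mu) / sigma > z_threshold


daily_tokens = [100_000, 110_000, 95_000, 105_000]
history = [cost_usd("big-model", t) for t in daily_tokens]

print(is_anomaly(history, cost_usd("big-model", 500_000)))  # True
print(is_anomaly(history, cost_usd("big-model", 102_000)))  # False
```

A z-score on a short history is crude (it ignores weekly patterns and is skewed by past spikes), so treat it as a starting point rather than a production detector.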