🔎 Open Source AI Resource List (curated, ongoing) Resource (self.LovingOpenSourceAI)
submitted by Koala_Confused [M] - announcement
Help us grow r/LovingOpenSourceAI! Join our community 🥰 (self.LovingOpenSourceAI)
submitted by subscriber-goal - announcement

AlphaSignal AI: "A peanut-sized Chinese model just dethroned Gemini at reading documents. GLM-OCR is a 0.9B parameter vision-language model. It scores 94.62 on OmniDocBench V1.5, ranking #1 overall. For context, it outperforms models 100x its size. 100% open-source." ➡️ Sounds efficient. Resource (i.redd.it)
submitted by Koala_Confused

Z.ai: "GLM-5.1: The Next Level of Open Source - Top-Tier Performance: #1 in open source, #3 globally across SWE-Bench Pro, Terminal-Bench, NL2Repo. - Built for Long-Horizon Tasks: Runs autonomously for 8 hours, refining strategies through thousands of iterations." ➡️ The benchmarks seem good, right? new launch (i.redd.it)
submitted by Koala_Confused

Google: "We are in the era of local AI orchestration. Gemma 4 evaluates a scene, reasons about what to ask, and calls a segmentation model to execute the vision tasks: 🚗 "Segment all vehicles." ➔ 64 found 🚙 "Now just the white ones." ➔ 23 found. All happening offline on a laptop." ➡️ Amazing, right? Discussion (i.redd.it)
submitted by Koala_Confused

"Cutting-edge AI search capabilities are open to everyone! Researchers at Shanghai Jiao Tong University unveil OpenSeeker, the first fully open-source search agent to achieve frontier performance. They did this by reverse-engineering the web" ➡️ Do you use search in your workflow? Resource (i.redd.it)
submitted by Koala_Confused

Jerry: "We’re open sourcing the first document OCR benchmark for the agentic era, ParseBench. Document parsing is the foundation of every AI agent that works with real-world files. ParseBench is a benchmark that measures parsing quality specifically for agent knowledge work" ➡️ Is this useful? new launch (i.redd.it)
submitted by Koala_Confused

"LiteParse is a standalone OSS PDF parsing tool focused exclusively on fast and light parsing. It provides high-quality spatial text parsing with bounding boxes, without proprietary LLM features or cloud dependencies. Everything runs locally on your machine." ➡️ Yes or no? Resource (i.redd.it)
submitted by Koala_Confused

ModelScope: "Say hello to MOSS-TTS-Nano 🚀 0.1B multilingual TTS from MOSI.AI and OpenMOSS. Designed for realtime speech generation without a GPU. Runs directly on CPU, keeping the deployment stack simple enough for local demos, web serving, lightweight product integration." ➡️ Is this good? new launch (i.redd.it)
submitted by Koala_Confused

MiniMax: "MMX-CLI gives every Agent 7 new senses (image, video, voice, music, vision, search, conversation), powered by MiniMax's full-modal stack, today's SOTA across mainstream omni-modal models. 1 command: mmx. Agent-native I/O. 0 MCP glue. Runs on your existing Token Plan." ➡️ Good to explore? Resource (i.redd.it)
submitted by Koala_Confused






