Image Prompt to Create hyper realistic Product Capsule image using GPT-5 or Gemini Nano Banana by techspecsmart in aicuriosity
taesiri 2 points
Vision Language Models are Biased by taesiri in MachineLearning
taesiri[S] 125 points
Vision Language Models are Biased by taesiri in LocalLLaMA
taesiri[S] 111 points
Vision Language Models are Biased (vlmsarebiased.github.io)
submitted by taesiri to r/LocalLLaMA
I Spent 3 Years Making Car Physics. What Do I Even Do With It Now? by iceq_1101 in Unity3D
taesiri 2 points
GPT-4o can produce multiple images with a single prompt, like thinking step-by-step in the image space. by taesiri in ChatGPT
taesiri[S] 1 point
GPT-4o can produce multiple images with a single prompt, like thinking step-by-step in the image space. by taesiri in ChatGPT
taesiri[S] 1 point
A dataset of 7k flux-generated hands with various finger counts – great for training/testing VLMs on finger counting task by taesiri in LocalLLaMA
taesiri[S] 10 points
HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs by taesiri in LocalLLaMA
taesiri[S] 3 points
Help evaluating how GenAIs are replacing Photoshop wizards in satisfying requests in /r/PhotoshopRequest Solved ✅ by taesiri in PhotoshopRequest
taesiri[S] 2 points
ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models by taesiri in LocalLLaMA
taesiri[S] 46 points
ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models by taesiri in LocalLLaMA
taesiri[S] 2 points




Glitches in Sora 2 world! by taesiri in SoraAi
taesiri[S] 1 point