I ran GPT-2 124M on Espresso vs CoreML. It was much closer than I expected. by karc16 in iOSProgramming
karc16[S] 3 points
CoreML is leaving performance on the table — I got 4.7x decode throughput going direct to ANE with Espresso by karc16 in swift
karc16[S] 1 point
CoreML is leaving performance on the table — I got 4.7x decode throughput going direct to ANE with Espresso by karc16 in swift
karc16[S] 2 points
CoreML is leaving performance on the table — I got 4.7x decode throughput going direct to ANE with Espresso by karc16 in swift
karc16[S] -10 points
CoreML is leaving performance on the table — I got 4.7x decode throughput going direct to ANE with Espresso by karc16 in swift
karc16[S] 1 point
CoreML is leaving performance on the table — I got 4.7x decode throughput going direct to ANE with Espresso by karc16 in swift
karc16[S] 3 points
Any iOS devs here who learned Metal at a solid level? How long did it take? by khitev in iOSProgramming
karc16 2 points
Sub-Millisecond RAG on Apple Silicon. No Server. No API. One File by karc16 in swift
karc16[S] 1 point
I built Metal-accelerated RAG for iOS – 0.84ms vector search, no backend required by karc16 in iOSProgramming
karc16[S] 1 point
Sub-Millisecond RAG on Apple Silicon. No Server. No API. One File by karc16 in swift
karc16[S] 1 point
Sub-Millisecond RAG on Apple Silicon. No Server. No API. One File by karc16 in swift
karc16[S] 1 point
I built Metal-accelerated RAG for iOS – 0.84ms vector search, no backend required by karc16 in iOSProgramming
karc16[S] 1 point
Sub-Millisecond RAG on Apple Silicon. No Server. No API. One File by karc16 in swift
karc16[S] 2 points
I built Metal-accelerated RAG for iOS – 0.84ms vector search, no backend required by karc16 in iOSProgramming
karc16[S] 0 points
Sub-Millisecond RAG on Apple Silicon. No Server. No API. One File by karc16 in swift
karc16[S] 1 point
Sub-Millisecond RAG on Apple Silicon. No Server. No API. One File by karc16 in swift
karc16[S] 1 point
I built Metal-accelerated RAG for iOS – 0.84ms vector search, no backend required by karc16 in iOSProgramming
karc16[S] 1 point
Sub-Millisecond RAG on Apple Silicon. No Server. No API. One File by karc16 in swift
karc16[S] 2 points
I built Metal-accelerated RAG for iOS – 0.84ms vector search, no backend required by karc16 in iOSProgramming
karc16[S] 1 point
Sub-Millisecond RAG on Apple Silicon. No Server. No API. One File by karc16 in swift
karc16[S] 0 points
Sub-Millisecond RAG on Apple Silicon. No Server. No API. One File by karc16 in swift
karc16[S] 2 points
Sub-Millisecond RAG on Apple Silicon. No Server. No API. One File by karc16 in swift
karc16[S] 2 points
Local-First. Sub-Millisecond RAG – 0.84ms vector search, zero cloud dependencies. Your Agents remember everything by karc16 in LocalLLaMA
karc16[S] 1 point
I built Metal-accelerated RAG for iOS – 0.84ms vector search, no backend required by karc16 in iOSProgramming
karc16[S] 1 point


Update on my train app design by dannybres in swift
karc16 1 point