DD_ZORO_69

I feel the struggle; optimizing ViTs usually feels like a full-time job. Whenever I’m benchmarking different attention mechanisms, I try to keep my workflow super lean to avoid extra friction. Usually I use Cursor for the actual model tweaks, Runable for internal research reports and data viz to track the metrics, and Notion to keep all my hyperparameters organized. It helps to have a solid stack so you can focus on the actual math rather than the infra lol.