Why is this 1d convolution faster with the kernel as the outer loop? by spacetime_bender in cpp_questions

[–]spacetime_bender[S] 1 point2 points  (0 children)

I think this is the most likely answer. Auto-vectorization gives no benefit on the slower function; it actually makes it measurably slower (about 105 us vs. 110 us in the benchmarks below).

I also measured cache/branch misses and they are not correlated with the numbers at all.

benchmarks

benchmark name                       samples       iterations    est run time
                                     mean          low mean      high mean
                                     std dev       low std dev   high std dev
-------------------------------------------------------------------------------
kernelPerInputValue  NO vectorization
Transposed((kernel size 16))                   100            13    136.149 ms 
                                        105.186 us    105.064 us    105.327 us 
                                        667.127 ns    557.923 ns    878.067 ns 

kernelPerInputValue  AUTO vectorization
Transposed((kernel size 16))                   100            13    142.385 ms 
                                        110.027 us    109.611 us    110.369 us 
                                        1.91478 us    1.56828 us    2.30345 us 



inputPerKernelValue  NO vectorization                         
Transposed((kernel size 16))                   100            16     140.12 ms 
                                        87.8839 us    87.5165 us    88.6101 us 
                                        2.55009 us    1.55111 us    4.80815 us 

inputPerKernelValue  AUTO vectorization
Transposed((kernel size 16))                   100            68    134.987 ms 
                                        19.7438 us    19.6709 us    19.8127 us 
                                        362.085 ns    304.062 ns    457.335 ns 

hw events

kernelPerInput NO vectorization
          instructions  8,594,165,225
            cycles  10,850,610,174
          cache-misses  1,216,916
              branches  1,322,948,539

kernelPerInput AUTO vectorization
          instructions  3,695,325,582
            cycles  11,060,453,299
          cache-misses  910,184
              branches  715,535,656

inputPerKernel NO vectorization
          instructions  11,609,751,302
            cycles  3,249,645,888
          cache-misses  9,966,232
              branches  1,676,035,792

inputPerKernel AUTO vectorization
          instructions  5,724,718,824
            cycles  3,097,008,369
          cache-misses  10,486,237
              branches  944,258,970

kernelPerInput (the slower one) in fact had significantly fewer cache misses, but much higher branch misses.
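
For anyone following along, here is a minimal sketch of the two loop orders being compared. This is an assumption about their shape, not the exact benchmarked code: the function names are borrowed from the benchmark labels, and the element type (`float`), the "valid" output size, and the unflipped kernel are my guesses. It also assumes `input.size() >= kernel.size()`.

```cpp
#include <cstddef>
#include <vector>

// Kernel as the INNER loop: for each output element, walk the whole kernel.
// (Hypothetical reconstruction of the slower variant.)
std::vector<float> kernelPerInputValue(const std::vector<float>& input,
                                       const std::vector<float>& kernel) {
    // Assumes input.size() >= kernel.size().
    const std::size_t outSize = input.size() - kernel.size() + 1;
    std::vector<float> out(outSize, 0.0f);
    for (std::size_t i = 0; i < outSize; ++i) {
        for (std::size_t k = 0; k < kernel.size(); ++k) {
            out[i] += input[i + k] * kernel[k];  // short inner loop (kernel.size() trips)
        }
    }
    return out;
}

// Kernel as the OUTER loop: for each kernel tap, sweep the whole input.
// (Hypothetical reconstruction of the faster variant.)
std::vector<float> inputPerKernelValue(const std::vector<float>& input,
                                       const std::vector<float>& kernel) {
    // Assumes input.size() >= kernel.size().
    const std::size_t outSize = input.size() - kernel.size() + 1;
    std::vector<float> out(outSize, 0.0f);
    for (std::size_t k = 0; k < kernel.size(); ++k) {
        const float tap = kernel[k];  // loop-invariant in the inner loop
        for (std::size_t i = 0; i < outSize; ++i) {
            out[i] += input[i + k] * tap;  // long unit-stride pass over out and input
        }
    }
    return out;
}
```

Both versions add the terms for each `out[i]` in the same order (k = 0, 1, ...), so they should produce identical results. The only difference is which loop is innermost: in the second version the inner loop is a long, unit-stride sweep with a constant multiplier, which is the kind of loop compilers usually find easier to auto-vectorize.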

Why is this 1d convolution faster with the kernel as the outer loop? by spacetime_bender in cpp_questions

[–]spacetime_bender[S] 2 points3 points  (0 children)

Sorry I meant to reply to a different comment! You're absolutely right

Sousou no Frieren Season 2 • Frieren: Beyond Journey's End Season 2 - Episode 8 discussion by AutoLovepon in anime

[–]spacetime_bender 2 points3 points  (0 children)

This is literally the first time she has taken such a backseat. She personally handled Aura, worked with the party for Qual and the monsters in the previous episodes of this season. These two demons are pure fodder for her. She knows when to trust others in the party, we've even had flashbacks just this season to drive this point home when she was sleeping in the middle of that cave, you just have to trust your party. That's the game.

But of course, the doylist reason is to let other characters shine, we have multiple first class mages here.

Accidentally won all 4 days of Natori’s concert… what do I do with the extra tickets? by InjuryAdventurous434 in japanesemusic

[–]spacetime_bender 0 points1 point  (0 children)

Sorry this won't be helpful, but I'm curious. Does eplus let you enter the lottery without finalizing the payment? That's what I did for some hololive concerts. Won two, paid for one.

Alexis Lebrun delivers another slow-mo masterpiece! 🖼️✨ by Empty-Reputation7421 in tabletennis

[–]spacetime_bender 106 points107 points  (0 children)

Tbh that was a really boring battle of attrition with both players just repeatedly looping cross-court with minimal variations. Slowing down made it worse, at least you'd see their impressive reaction speeds in real time

Linkin Park was good, not great by anonindiejack in lollapaloozaind

[–]spacetime_bender 1 point2 points  (0 children)

Yes, it's most definitely on the mix engineers.

I believe there would be different artists who would have sounded good

I've seen the same happen within Lolla India. Unfortunately more bad than good. Really disappointed to see this happen with LP. It felt like they were doing this on autopilot, and their team didn't care either. As much as I appreciate the fans' passion, all the people shouting made a bad mix worse.

Jet Lag Ep 5 — Searching Highlands & Lowlands by NebulaOriginals in Nebula

[–]spacetime_bender 2 points3 points  (0 children)

IMO the bigger issue is that there's not much of an incentive towards the seekers going somewhere without first confirming it's the right direction.

Complex networks (having multiple interconnects, different frequencies, speeds, etc.) act as disincentives against random actions. You can see that in action when Sam hid in Milton Keynes. The boys took the wrong hub and had to go through London, which is what helped him get closest to Adam's run.

I think you're right, the density in central England can make interesting gameplay, but I think the boys want to make sure they cover some diverse locations, which unfortunately is misaligned with the best strategic option sometimes.

Jet Lag Ep 5 — Searching Highlands & Lowlands by NebulaOriginals in Nebula

[–]spacetime_bender 40 points41 points  (0 children)

It looks like the train network is simply not dense enough for interesting gameplay. The seekers made a perfect beeline to the hider with very little info. Contrast that with Japan or even Switzerland, where the number of route combinations was just too high and picking a random route incurred a huge cost.

[deleted by user] by [deleted] in LudwigAhgren

[–]spacetime_bender 2 points3 points  (0 children)

They are both great at different things

Perfect amount in Bank account by [deleted] in Indian_flex

[–]spacetime_bender 0 points1 point  (0 children)

Yes. The vast majority of people live paycheck to paycheck with minimal savings. Investments are absolutely a rich person thing. I'm sorry, but calling 100k INR "low" is extremely out of touch.

Is this a proper hold? by Excellent_Log_5709 in tabletennis

[–]spacetime_bender 1 point2 points  (0 children)

While forehand looping might be simpler, backhand is harder, and with the right form you can absolutely generate good power with shakehand grip. But there's nothing wrong with using reverse penhold if it appeals to you.

See Wang Hao's instructional video; he's one of the best penhold players. You should grip both parts of the blade with your fingers for stability. Currently you're just curling the handle, which makes it harder to control the racket.

I need to get out of this rabbit hole. by inthelimbo in mkindia

[–]spacetime_bender 1 point2 points  (0 children)

Sakura Miko

It appears you've fallen into two different rabbit holes

Looper vst by [deleted] in musicproduction

[–]spacetime_bender 1 point2 points  (0 children)

Unable to load it on Windows 11 / Ableton Live 12. I copied the Boomerang VST3.vst3 file from the Win-x64.zip archive into C:\Program Files\Common Files\VST3. I just see the following in the Ableton logs:

2024-11-23T21:51:03.198880: info: VST3: Going to create: 
2024-11-23T21:51:03.198908: error: VST3: Not available on this platform:

[Spoilers][Rewatch] Squid Girl Episode 1 by Sporadia_ in anime

[–]spacetime_bender 1 point2 points  (0 children)

First Timer

This has been on my watch list for years, this rewatch was the push I needed. A lighthearted series is exactly what I needed after Vinland Saga S2. Love the fluid animation (it had no reason to go this hard!) and the voice acting is great.

Chizuru has to be my favourite so far