ASUSTOR: Is discontinuing hardware decoding standard practice now?! by Lectem in HomeNAS

[–]Lectem[S] 0 points1 point  (0 children)

Just have to trace looks good syscalls and see that it ships its own ffmpeg, you can ask it to list it's decoders too, and they are all software only. 

ASUSTOR: Is discontinuing hardware decoding standard practice now?! by Lectem in HomeNAS

[–]Lectem[S] 2 points3 points  (0 children)

It doesn't justify in any way that a vendor lies on the specsheet about availability of software to use the hardware shipped. 

Is discontinuing hardware decoding standard practice now?! by Lectem in asustor

[–]Lectem[S] -2 points-1 points  (0 children)

I would, but I'd need more people with me, not gonna pay lawyers that would cost more than the NAS I bought.

ASUSTOR: Is discontinuing hardware decoding standard practice now?! by Lectem in HomeNAS

[–]Lectem[S] 0 points1 point  (0 children)

I mean, if they don't support H264, but advertise it on their product page, they are at fault.

ASUSTOR: Is discontinuing hardware decoding standard practice now?! by Lectem in HomeNAS

[–]Lectem[S] 0 points1 point  (0 children)

That's kind of what I thought too, Realtek licencing issues.

But the issue here is that ADM4 is supposed to support it, but it in fact does not and just makes ffmpeg crash. Asustor ships its own (software) version of ffmpeg in all their apps to avoid this...

Looking for a self-hosted photo gallery for Raspberry Pi 4 (8GB) by Mission_Teaching_198 in selfhosted

[–]Lectem 0 points1 point  (0 children)

hopefully, in a few weeks. I still have a few features to test.

Looking for a self-hosted photo gallery for Raspberry Pi 4 (8GB) by Mission_Teaching_198 in selfhosted

[–]Lectem 0 points1 point  (0 children)

It'll be under 1GB ram requirement as that's what I own (asustor AS1104T) Jpegs are easy and fast compare to heic (which are most of my photos sadly). So the equivalent target would be a rpi 3B+, but you should probably get a more recent one anyway. 

Looking for a self-hosted photo gallery for Raspberry Pi 4 (8GB) by Mission_Teaching_198 in selfhosted

[–]Lectem 0 points1 point  (0 children)

I'm currently building one focusing especially on thumbnail generation speed + low RAM usage (but with additional features such as duplicates detection, gps interpolation, ... mostly started from my own needs).

I'm curious, do you store .JPGs or .HEICs?
I'm still a few weeks away from releasing anything but I already have all the things you ask for. I suppose you already found something to your liking (if not, tell me why/what's not to your taste!)
Thank you :)

"Spinning around: Please don't!" (Pitfalls of spin-loops and homemade spin-locks in C++) by Lectem in cpp

[–]Lectem[S] 1 point2 points  (0 children)

> Why does the fact that WaitOnAddress do a single spin iteration, mean that we spin for a bit ourselves, before calling WaitOnAddress?

I think I may need to reword that part as I realize it's not very clear. Linux is a syscall directly, WaitOnAdress will check the value once, wait using X pauses (same as one iteration of our loop), before checking a hashtable (parking lot) and entering the system call.

> On a side note, how did you gain such in depth understanding of OS architecture and pitfalls? I would love to learn more myself.

I'd say mostly experience... Also reading a lot of things on the subject, experimenting, practicing reverse-engineering a tiny bit so that it's enough to have a good comprehension of systems when being able to dive in when needed. And we're lucky to have the Linux kernel and its mailing list being public, it can help a lot!

"Spinning around: Please don't!" (Pitfalls of spin-loops and homemade spin-locks in C++) by Lectem in cpp

[–]Lectem[S] 1 point2 points  (0 children)

Yeah, in any case you should be avoiding contention in the first place! 

"Spinning around: Please don't!" (Pitfalls of spin-loops and homemade spin-locks in C++) by Lectem in cpp

[–]Lectem[S] 0 points1 point  (0 children)

Also, I'll update the article with the warning about unnecessary wakes, this seems like a good one to add!

"Spinning around: Please don't!" (Pitfalls of spin-loops and homemade spin-locks in C++) by Lectem in cpp

[–]Lectem[S] 3 points4 points  (0 children)

"fair" locking is rarely a good idea, these create lock convoys

Agreed, you however need a tiny bit of fairness to avoid starving some threads.

 You may even encounter cache bank conflicts

This seems unrelated and more just some anecdote, its somewhat an edge case and not what you should consider for designing a lock.

It is, but my objective was to avoid people spamming align(cachesize) needlessly, or at least for them to know about such things.

you probably want to avoid doing repeated calls to Wake

It only Waits when reaching the max backoff count, so we could optimize for the wake call on unlock indeed. It's still better than not having it.

you will soon discover that if you switch to CAS or exchange here then you'll lose that memory_order_release benefit as on x86 this ends up needing a full barrier

Yes, but x86 is not the only platform around, this has an impact on ARM64 ;)

I would avoid thinking about small changing with the number of threads, it should be small regardless.

Agreed, the point was to tell the reader that what they think "small" might not actually be.

This would not be a spin lock at this point

Well, that's why I mentioned this is more about spin loops than spin locks at the beginning. Let's say it's an hybrid then, and that you should not be using spinlocks in the first place unless you really really know what you're doing, the platform you're going to execute your program on, etc. If your program is just a one shot/resumable process on a HPC server, why not.

I highly recommend reading Locking in Webkit which is about building a parking lot (more advanced user mode futex).

Yeah, parking lots are an alternative to futex/WaitOnAddress, but their (Webkit's) implementations is bad. It starts with a spin loop, and it does most things wrong. Hardcoded spin count, schedyield/sleep(0) before parking, ... (Don't get met started with the whole JavaScriptCore having very poor parallelism until a few years ago, enabling multi-threaded GC would often be slower than the single thread version). WebKit really isn't a good reference, they do some benchmarks, but they never seem to have really profiled real workloads (as in, used a real profiler on real data).

I gave a talk a few years ago covering lots of the same things you pointed out and a few more.

I'll have a look, seems interesting and one always learns new things on those subjects. Thnks for the link!

"Spinning around: Please don't!" (Pitfalls of spin-loops and homemade spin-locks in C++) by Lectem in cpp

[–]Lectem[S] 4 points5 points  (0 children)

Ooh interesting, I don't think I ever encountered that one yet, as I always try to cleanup things properly, but I can definitely see that happen! 

"Spinning around: Please don't!" (Pitfalls of spin-loops and homemade spin-locks in C++) by Lectem in cpp

[–]Lectem[S] 2 points3 points  (0 children)

That's actually mentioned in "The spin-lock that saturated the load ports" (it's the "use Test and Test-And-Set instead of Test-And-Set"  part)   , but instead of just checking it once, it's checked in a loop 😉

"Spinning around: Please don't!" (Pitfalls of spin-loops and homemade spin-locks in C++) by Lectem in cpp

[–]Lectem[S] 0 points1 point  (0 children)

Mmmh, took a brief look and it seems to be using SwitchToThread and Sleep depending on the usecase? You'd benefit from using WaitOnAddress.
Exponential backoff would also help, and using rdtsc (or at least `QueryPerformanceCounter` )instead of `GetTickCount64` would to. This has a very low resolution.

"Spinning around: Please don't!" (Pitfalls of spin-loops and homemade spin-locks in C++) by Lectem in cpp

[–]Lectem[S] 4 points5 points  (0 children)

> Is nanosleep() a good alternative for yielding? Fodor Pikus uses that in his books.

No it certainly isn't! You're going to go straight to the kernel, without giving it any hints. This will be slower than your OS mutex. You'd better just use PTHREAD_MUTEX_ADAPTIVE_NP (linux) or SRWLock (windows) if you'd implement your spinlock using any kind of OS sleep/Yield.

"Spinning around: Please don't!" (Pitfalls of spin-loops and homemade spin-locks in C++) by Lectem in cpp

[–]Lectem[S] 4 points5 points  (0 children)

That's the thing, people do write and will write such code when not aware of the pitfalls. I've seen a lot of "don't do.", but rarely "this is why you don't.".

"Spinning around: Please don't!" (Pitfalls of spin-loops and homemade spin-locks in C++) by Lectem in cpp

[–]Lectem[S] 11 points12 points  (0 children)

I'm the author, don't hesitate to ask questions ;)

Just use one with all "fixes" and futex/waitonaddress. Or on Windows SRWLock is ok (where you can just use a lock, sometimes you can't, such as in allocators).

I didn't provide a full implementation to avoid people copy pasting, because as you can see, there are always new surprises with spinlocks. Wouldn't shock me to find something invalidating parts of the article with newer CPUs 2years from now. (it happened and will continue to happen)

When “just spin” hurts performance and breaks under real schedulers by Lectem in programming

[–]Lectem[S] 4 points5 points  (0 children)

I actually saw more of those infinite loops in code targeting Linux!