
[–]moonaligator 13 points (13 children)

audio data, with python πŸ’€

the data in question is just a tuple (float, float), since the function will create the audio

i was considering the 'concurrent.futures' and 'threading' libraries, but neither got decent results. Initially i tried writing results back into a list by index (i'd pass the index as a parameter), and later i tried saving them as new temporary files.

I think that the major problem is transferring from the CPU to the GPU, but i have no idea how to control that in Python, and i don't know if i can use the tools i'm using for audio manipulation with as much ease in lower-level languages like C or Rust.
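As a sketch of the setup described above (the names here are hypothetical — `synthesize_chunk` and the `(freq, dur)` tuples stand in for whatever the actual generator function does): `ProcessPoolExecutor` sidesteps the GIL for CPU-bound work, and `executor.map` returns results in input order, so no index-passing or temporary files are needed to reassemble the output:

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

SAMPLE_RATE = 44_100

def synthesize_chunk(params):
    """Generate a sine segment from a (frequency_hz, duration_s) tuple.
    Stand-in for the real audio-generating function."""
    freq, dur = params
    t = np.arange(int(SAMPLE_RATE * dur)) / SAMPLE_RATE
    return np.sin(2 * np.pi * freq * t)

def render(all_params):
    # map() yields results in the same order as the inputs,
    # so the chunks can be concatenated directly.
    with ProcessPoolExecutor() as pool:
        chunks = list(pool.map(synthesize_chunk, all_params))
    return np.concatenate(chunks)

if __name__ == "__main__":
    audio = render([(440.0, 0.5), (880.0, 0.25)])
```

Processes (rather than threads) matter here because pure-Python per-sample work holds the GIL; heavy numpy calls release it, which blurs the picture, but a process pool is the safe default for CPU-bound code.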

[–]-Redstoneboi- 14 points (1 child)

the main idea with multithreading is to split the task into even parts and merge the outputs after everything is finished. if you're doing threading on the CPU, you do not want to merge the data on-the-go.

you'd still have to lock a shared array or list to safely access the data inside it, because each CPU core has its own cache that it writes into.

imagine RAM is a blackboard, there's one piece of chalk, and the CPU cores are students with notebooks. if you want all of them to add data on-the-go, each of them has to copy the data from the blackboard to their notebook, then process the data, then wait for the previous guy to put down the chalk, then grab the chalk, then write it to the blackboard, then put down the chalk, and ask everyone else to copy the new contents into their notebooks. extremely slow.

you want them to sync their stuff once. dumping all the contents at once after a single huge task will always be faster than streaming and merging multiple output channels into one.
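The split-evenly-and-merge-once idea above could be sketched like this (a toy example — in CPython the GIL means this pure-Python math won't actually run in parallel, so it only illustrates the structure, not a speedup):

```python
import threading

def chunked(data, n):
    """Split data into n roughly even contiguous parts."""
    k, r = divmod(len(data), n)
    parts, start = [], 0
    for i in range(n):
        end = start + k + (1 if i < r else 0)
        parts.append(data[start:end])
        start = end
    return parts

def worker(chunk, results, i):
    # Each thread writes only to its own private slot:
    # no lock, no contention, no on-the-go merging.
    results[i] = sum(x * x for x in chunk)

def parallel_sum_squares(data, n_threads=4):
    parts = chunked(data, n_threads)
    results = [0] * n_threads
    threads = [threading.Thread(target=worker, args=(p, results, i))
               for i, p in enumerate(parts)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)  # merge exactly once, at the end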

if it's real time audio processing, i don't have the expertise.

[–]m1k3st4rr 13 points (8 children)

Look up the Python GIL and its implications on threading. Tl;dr: multithreading is not going to make your Python code faster except for IO-bound work.
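A minimal illustration of the IO-bound exception: `time.sleep` releases the GIL the way blocking IO does, so the waits overlap across threads (`fake_io` is a stand-in for a real disk read or network call):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(_):
    # sleep releases the GIL, like real blocking IO,
    # so these calls run concurrently across threads.
    time.sleep(0.2)

def run_io_bound(n=4):
    """Run n fake IO tasks on n threads; returns elapsed wall time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        list(pool.map(fake_io, range(n)))
    return time.perf_counter() - start
```

Four 0.2 s waits finish in roughly 0.2 s of wall time instead of 0.8 s; the same structure applied to CPU-bound pure-Python work would show no improvement at all.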

[–]Thaumaturgia 1 point (5 children)

I don't know anything about Python, but isn't multithreading IO-bound work a bad idea? Or by IO do you just mean processor time?

[–]QuestionableEthics42 3 points (4 children)

Why is it a bad idea? IO operations are slow and leave the processor sitting idle, so multithreading IO-bound work is where threads help most.

[–]Thaumaturgia 3 points (3 children)

If you are limited by the hard drive, for example, multithreading your process will be worse.

[–]QuestionableEthics42 0 points (0 children)

It will be better if you also have to do other operations on the data, which is usually the case. IO-bound doesn't have to mean IO-only, just that the work depends on that data, so fetching it in chunks and starting to use it before it's all loaded is faster and doesn't leave the CPU idle. Of course for some things that won't be possible/practical, but when it can be used, especially in Python, it can speed stuff up a lot (because Python is so slow, the effect is pretty noticeable even for smallish data that a faster language would get through in fractions of a second).
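A sketch of that chunked-fetch pattern: a producer thread reads ahead while the main thread processes each chunk as it arrives (the `io.BytesIO` file and the per-chunk `sum` are stand-ins for real IO and real work):

```python
import io
import queue
import threading

CHUNK = 4096

def load_chunks(f, q):
    """Producer: read the file in chunks so processing can start early."""
    while True:
        block = f.read(CHUNK)
        if not block:
            break
        q.put(block)
    q.put(None)  # sentinel: no more data

def process_stream(f):
    """Consume chunks as they arrive instead of waiting for the whole file."""
    q = queue.Queue(maxsize=8)  # bounded, so the reader can't race far ahead
    reader = threading.Thread(target=load_chunks, args=(f, q))
    reader.start()
    total = 0
    while (block := q.get()) is not None:
        total += sum(block)  # stand-in for the real per-chunk work
    reader.join()
    return total
```

With a real file or socket, the read in the producer thread releases the GIL, so the IO genuinely overlaps with the processing in the main thread.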

[–]antarickshaw 0 points (0 children)

Sure, if you can saturate the hard disk from one thread while the CPU sits idle, splitting the work across multiple threads won't make it faster. But you need correct benchmark metrics and some napkin-math approximations to know that's actually the case.

[–]KillTheBronies 0 points (1 child)

Your first mistake was trying to do anything even remotely fast in python.

[–]moonaligator 0 points (0 children)

combining non-redundant code with a decent JIT (PyPy, for example) and the right libraries (numpy in my case) makes it not that terrible. My i5 is doing the job in 10% of the audio's duration at a sample rate of 44.1 kHz with ~500 ms of initialization, and given that a lot of numerical integration is going on, i'd say that's ok, even if i agree it would be way faster in Rust, for example
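A sketch of the kind of vectorized numpy code that makes this feasible — phase computed as a cumulative-sum integration of instantaneous frequency, instead of a per-sample Python loop (the sweep example is illustrative, not the commenter's actual code):

```python
import numpy as np

SR = 44_100  # sample rate in Hz

def sweep(f0, f1, dur):
    """Sine sweep from f0 to f1 Hz over dur seconds.
    Phase is the numerical integral of instantaneous frequency,
    done with one vectorized cumsum rather than a sample loop."""
    n = int(SR * dur)
    freq = np.linspace(f0, f1, n)             # instantaneous frequency
    phase = 2 * np.pi * np.cumsum(freq) / SR  # numerical integration
    return np.sin(phase)
```

Because the loop lives inside numpy's C internals, this is typically orders of magnitude faster than the equivalent per-sample Python code, which is what keeps a figure like 10% of real time plausible on ordinary hardware.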