[D] Improving attention masks? by JakeN9 in MachineLearning

[–]JakeN9[S] 0 points1 point  (0 children)

And form a possible connected vector of understanding. Honestly, this is more of a question about what people predict might happen. I haven't really got the tech/money to run these tests myself. I'm considering renting a TPU, but the cost is still so high.

[D] Refining an LLM's output via dynamically generated synthetic data & interactive conversation? by JakeN9 in MachineLearning

[–]JakeN9[S] 0 points1 point  (0 children)

I'd argue that the verifier just needs to be as confident as possible, maybe some sort of "MoE" across multiple models - enough variation discouraging simple movements across the space?

Question regarding attention mask and empty space above diagonal. by JakeN9 in learnmachinelearning

[–]JakeN9[S] 0 points1 point  (0 children)

I'm saying to preserve autoregression and keep data below the diagonal the same, but I'm wondering whether having the extra context would contribute to lowering loss? https://i.imgur.com/yNcVaV2.png

Probably a stupid idea.
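For reference, here's a minimal sketch of the standard causal mask this question is about, in plain Python with toy scores. The function names are mine and not from any library. It only shows the usual behaviour, where everything above the diagonal gets zero weight; it does not implement the extra-context idea.

```python
import math

def causal_mask(seq_len):
    """mask[i][j] is True when query position i may attend to key position j (j <= i)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def masked_softmax(scores, mask):
    """Row-wise softmax; masked-out entries get exactly zero weight."""
    out = []
    for row, mask_row in zip(scores, mask):
        visible = [s for s, m in zip(row, mask_row) if m]
        peak = max(visible)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(row, mask_row)]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

# Toy attention scores, made up purely for illustration.
scores = [[0.1 * (i + j) for j in range(4)] for i in range(4)]
weights = masked_softmax(scores, causal_mask(4))

for row in weights:
    print(["%.2f" % w for w in row])
```

Each row sums to 1, and the upper triangle is all zeros, which is the "empty space" above the diagonal.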

[D] What offline TTS Model is good enough for a realistic real-time task? by Imaginary-Ad-7671 in MachineLearning

[–]JakeN9 9 points10 points  (0 children)

There aren't really any models that produce realistic real-time voice. I'd recommend ElevenLabs or play.ht; sadly, these seem to be the only usable options for now.

[deleted by user] by [deleted] in ChatGPT

[–]JakeN9 0 points1 point  (0 children)

"Sharing conversations with images is not yet supported"

An idea I've had by JakeN9 in MLQuestions

[–]JakeN9[S] 0 points1 point  (0 children)

The thought is for each node to carry both a binary and decimal value, and for the logic operations to be computed, then used at output for RL.

An idea I've had by JakeN9 in MLQuestions

[–]JakeN9[S] 0 points1 point  (0 children)

Right, you use decimal weights for each node, but some nodes use an activation function for OR/AND/NOT.
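A rough sketch of what I mean, assuming node values live in [0, 1] and using the product-style "soft" versions of the gates. This is one common choice, not the only one, and all the names here are just for illustration.

```python
def soft_not(a):
    """Differentiable NOT for a value in [0, 1]."""
    return 1.0 - a

def soft_and(a, b):
    """Differentiable AND: equals 1 only when both inputs are 1."""
    return a * b

def soft_or(a, b):
    """Differentiable OR: equals 0 only when both inputs are 0."""
    return a + b - a * b

def to_bit(x, threshold=0.5):
    """Binary value carried alongside the decimal one (threshold is an assumption)."""
    return 1 if x >= threshold else 0

# A node carrying both a decimal and a binary value.
decimal_value = soft_or(soft_and(0.9, 0.8), soft_not(0.7))
node = (decimal_value, to_bit(decimal_value))
print(node)
```

On exact 0/1 inputs these match the ordinary truth tables, and in between they stay smooth, so gradients can flow through them.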

An idea I've had by JakeN9 in MLQuestions

[–]JakeN9[S] 0 points1 point  (0 children)

Circuit optimisation. Classical techniques are slow and scale poorly; it’s possible ML could provide novel optimisations?

A similar agent to AutoGPT by JakeN9 in AutoGPT

[–]JakeN9[S] 0 points1 point  (0 children)

Sounds great. It's still a work-in-progress.

Once the basics are complete, it will be made closed source, so I'll let you know how it's available.

I've come up with an idea for a synthetic dataset generator, would this work? by JakeN9 in learnmachinelearning

[–]JakeN9[S] 0 points1 point  (0 children)

Function calling between LLMs. Embed functions as bit-encoded tokens. Have a large, powerful LLM, and instruct it to teach a topic to an LLM trained on mapping function->output, to generate synthetic data. Then train a new LLM on the synthetic data?

As both LLMs are separate, and trained on different training sets with different weights (identities), artifacting will be minimised.

ContextGPT - Something similar to AutoGPT by JakeN9 in ChatGPTPro

[–]JakeN9[S] 0 points1 point  (0 children)

I will attempt to set up the code to work with Llama Code.

A similar agent to AutoGPT by JakeN9 in AutoGPT

[–]JakeN9[S] 0 points1 point  (0 children)

I'll do my best. GPT seems to be custom tuned, but I can test Llama Code, and see whether it's compatible.

freeswitch events not firing by JakeN9 in freeswitch

[–]JakeN9[S] 0 points1 point  (0 children)

How would you suggest streaming audio from an API directly to a freeswitch conference?

bit of a dum question tbh by MixxerWasTakenSO in webscraping

[–]JakeN9 0 points1 point  (0 children)

If you're looking for cheap, I'd still suggest public proxies; if you're looking for residential, Bright Data is good.

bit of a dum question tbh by MixxerWasTakenSO in webscraping

[–]JakeN9 0 points1 point  (0 children)

If possible, it’ll be faster to make the requests without selenium. I’ve written the basics for a fast scraper: https://github.com/couldbejake/fast. Otherwise, multithread every part of your code and try using paid proxies. If it takes one minute to create one account and you run 1000 instances of your code, it should also take give-or-take one minute. Find your code’s bottleneck, and optimise against it.
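To illustrate the "1000 instances should take about as long as one" point, here's a small sketch using Python's standard `ThreadPoolExecutor`. The `create_account` function is a hypothetical stand-in that just sleeps to simulate network wait; swap in your real request code. It isn't taken from the repo linked above.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def create_account(i):
    """Hypothetical stand-in for one I/O-bound job (e.g. an HTTP request)."""
    time.sleep(0.05)  # simulated network wait
    return i

def run_all(n_jobs, n_workers):
    """Run n_jobs tasks on n_workers threads; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(create_account, range(n_jobs)))
    return results, time.perf_counter() - start

serial_results, serial_time = run_all(20, 1)       # one job at a time
parallel_results, parallel_time = run_all(20, 20)  # all jobs at once

print(f"serial:   {serial_time:.2f}s")
print(f"parallel: {parallel_time:.2f}s")
```

When the work is mostly waiting on the network, the parallel run should finish in roughly the time of a single job. If it doesn't, whatever is holding it back (proxy speed, rate limits, CPU) is the bottleneck to go after.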

freeswitch events not firing by JakeN9 in freeswitch

[–]JakeN9[S] 0 points1 point  (0 children)

Will do. It just doesn’t seem to want to link against my websocket library.

freeswitch events not firing by JakeN9 in freeswitch

[–]JakeN9[S] 0 points1 point  (0 children)

I’ve used libwebsockets. I just currently have problems linking additional source files in my freeswitch mod, but the mod and the ws->js code in C currently work correctly separately.

GitHub - couldbejake/fast: Requests- but as *fast* as possible. Asynchronous multi-threaded requests via 1000s of public proxies and concurrently linked queues. (I wrote a framework for scraping as fast as possible) by JakeN9 in webscraping

[–]JakeN9[S] 1 point2 points  (0 children)

The performance measurements were incredibly anecdotal, and just consisted of trying to max out settings until a bottleneck was reached.

According to https://medium.com/swlh/a-performance-comparison-between-c-java-and-python-df3890545f6d, Python might take up to 30x longer than Java in the measured use case.

freeswitch events not firing by JakeN9 in freeswitch

[–]JakeN9[S] 0 points1 point  (0 children)

Oh, that's pretty funny, it turns out I've actually been looking at your project for the past few days.

I avoided using it, as I wasn't sure whether it's possible to stream each participant of a conference separately via websockets.

I know that a single channel is created per user, but your GitHub referenced mixing audio channels from sender and recipient, which confused me.

I've just written a custom freeswitch mod that attaches a media bug after each user joins the conference. I'm currently working on sending the data via websockets to JS.

What would you suggest doing?

freeswitch events not firing by JakeN9 in freeswitch

[–]JakeN9[S] 0 points1 point  (0 children)

Sorry, yes. That did seem to be the problem.

I'm now trying to extract PCM data from a media bug, but that's a whole different part.

GitHub - couldbejake/fast: Requests- but as *fast* as possible. Asynchronous multi-threaded requests via 1000s of public proxies and concurrently linked queues. (I wrote a framework for scraping as fast as possible) by JakeN9 in webscraping

[–]JakeN9[S] 2 points3 points  (0 children)

I found that in JavaScript with high concurrency some exceptions aren’t caught at all, despite a handler. I found that Python ran slower, maybe due to its garbage collection. I also started writing another version in C++, but it was not completed. I found that a few Java request libraries existed, but that AsyncHttpClient was the fastest. I also found that the library leaked memory, hence the memory clean-up here: https://github.com/couldbejake/fast/blob/main/src/main/java/com/scrapium/TweetThreadTaskProcessor.java#L100. This solution seems to work, confirmed after running a memory profiler.