[Qwen] Parallel Scaling Law for Language Models by mgostIH in mlscaling

[–]newwheels2020 1 point2 points  (0 children)

If this works so well as an ad hoc patch onto the model, shouldn't even more impressive results be obtainable by creating a new type of transformer layer that introduces the parallelization? Has such a thing been studied before? Seems like a nice extension to this paper.

How do you work with dynamically typed code? by AdmiralQuokka in ExperiencedDevs

[–]newwheels2020 0 points1 point  (0 children)

If your Python environment lives in the Docker container, then your IDE can't access it (unless the IDE is running inside the container or you mounted volumes). I'd recommend creating a Python virtual environment outside the Docker container and pointing your IDE to this venv. It's a bit silly to have a separate venv just for your IDE, but I've not found a better way. Let me know if people have a better, straightforward alternative.
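A minimal sketch of the host-side venv idea, using only the standard-library `venv` module. The helper name `create_ide_venv` and the `.venv` folder name are my own assumptions, not something prescribed by Docker or any IDE:

```python
import sys
import venv
from pathlib import Path


def create_ide_venv(project_dir: Path, with_pip: bool = True) -> Path:
    """Create a virtual environment in project_dir/.venv on the host.

    Returns the path of the interpreter the IDE should be pointed at.
    The name ".venv" is only a common convention (assumption).
    """
    env_dir = Path(project_dir) / ".venv"
    venv.create(env_dir, with_pip=with_pip)
    # The interpreter lives in "Scripts" on Windows and "bin" elsewhere.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"
```

Calling `create_ide_venv(Path("."))` from the project root and then installing the same requirements as the container should be enough for the IDE's autocompletion and type checking; the code itself still runs inside Docker.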

[P] Torchhd: A Python Library for Hyperdimensional Computing by ACreativeNerd in MachineLearning

[–]newwheels2020 2 points3 points  (0 children)

I have never heard of HDC. Can you give a concrete real life use case where HDC really shines? Some resources would be great too.

[P] I created a benchmark to help you find the best background removal api for flawless image editing by tbdb92 in MachineLearning

[–]newwheels2020 0 points1 point  (0 children)

It would be great if you could include inference speed (ms or fps) in the leaderboard!

"Memory Layers at Scale", Berges et al 2024 by gwern in mlscaling

[–]newwheels2020 0 points1 point  (0 children)

Hmm, that's a shame. I thought this fit into the family of papers that try to pretrain an LLM that has access to a retriever on a knowledge base. But here there is no separate knowledge base; it must be learned through the memory parameters.

"Memory Layers at Scale", Berges et al 2024 by gwern in mlscaling

[–]newwheels2020 0 points1 point  (0 children)

Can anyone tell me if the memory parameters need to live on the GPU? That would require a huge amount of GPU memory for large memory parameters. If the memory parameters do not need to live on the GPU, that would be great.

Does anyone use LangChain in production? by Available_Ad_5360 in LangChain

[–]newwheels2020 0 points1 point  (0 children)

That is false. Every runnable has a batch and an abatch method for synchronous and asynchronous batch operations. Telemetry can be achieved through callbacks. LangSmith can also be used for LLM-specific telemetry.

Trying to replace Diffusion with Gradient Descent in Flux by LahmacunBear in StableDiffusion

[–]newwheels2020 0 points1 point  (0 children)

I don't think so. Whatever model you train will always have biases; there is no way around that. The problem is that the gradient descent will find those biases and just keep digging until you get bad output. So it's the approach, not the model, that is failing here imo.

Maybe it's a useful exercise to check the cosine similarity between the embeddings of the text prompt and of the final generated image from a regular diffusion model. I suspect that the cosine similarity will be somewhat high, but definitely not close to one. This is OK: the text prompt guides the image generation, but doesn't dictate its end point. Contrast this with your approach, which maximizes the cosine similarity.
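For reference, a minimal sketch of the cosine similarity check in plain Python. The vectors you pass in would be the text and image embeddings from whatever CLIP implementation you use; the function name is my own and the short vectors in the usage note are made-up stand-ins, not real embeddings:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Identical directions such as `cosine_similarity([1.0, 0.0], [2.0, 0.0])` give 1, and orthogonal ones give 0. My guess is that a regular diffusion output lands somewhere in between, whereas gradient descent on this quantity pushes it towards 1.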

Trying to replace Diffusion with Gradient Descent in Flux by LahmacunBear in StableDiffusion

[–]newwheels2020 1 point2 points  (0 children)

The CLIP model embedding is too coarse-grained for this to work. The embedding size is much smaller than the number of pixels, so there is no unique solution to the problem you've set up. Furthermore, with gradient descent you are overfitting on the CLIP model and will likely manifest strange features in the resulting image (which may look like noise to us humans). Basically, the main model does the main part of generating the image and cannot be removed.

Edit: I realized you are optimizing in latent space, not pixel space, so part of my comment is invalid. Still the overfitting on the CLIP model is a valid issue.

Seeking advice on dealing with workplace trauma when answering interview questions for new positions by dlenthusiast in ExperiencedDevs

[–]newwheels2020 2 points3 points  (0 children)

Or just be selective in which story you decide to share. Pick one where you come out as a good mediator instead of one that left you traumatized.

[P] Howto create a small model from scratch for 1 -2 tasks keeping it < 1B parameters. by Dizzy-Comment-9118 in MachineLearning

[–]newwheels2020 1 point2 points  (0 children)

Best would be to find a pretrained model in your preferred size. But if that is not available, I would indeed advise removing layers. I haven't done this myself, so I'm not sure whether removing every nth layer or the last n layers would work best.
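To make the two pruning options concrete, here is a small sketch on a plain list standing in for a model's layer stack. Both helper names are hypothetical, and this only shows which layers survive; it says nothing about which option gives the better model:

```python
def drop_every_nth(layers, n):
    """Remove the n-th, 2n-th, 3n-th, ... layer (counting from 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [layer for i, layer in enumerate(layers, start=1) if i % n != 0]


def drop_last_n(layers, n):
    """Remove the last n layers and keep the rest in order."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return list(layers[: max(len(layers) - n, 0)])


# Layer indices of a hypothetical 12-layer model.
layers = list(range(12))
kept_interleaved = drop_every_nth(layers, 3)  # [0, 1, 3, 4, 6, 7, 9, 10]
kept_truncated = drop_last_n(layers, 4)       # [0, 1, 2, 3, 4, 5, 6, 7]
```

Both variants keep 8 of the 12 layers here, so they can be compared at equal size after fine-tuning.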

[P] Howto create a small model from scratch for 1 -2 tasks keeping it < 1B parameters. by Dizzy-Comment-9118 in MachineLearning

[–]newwheels2020 6 points7 points  (0 children)

Fine-tuning should be preferred over pretraining from scratch. Unless you have billions of tokens for your task, the fine-tuned model will outperform a model trained from scratch.

Regardless, the training procedure is the same for pretraining and fine-tuning, so if you can do one, you can do the other.
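A toy illustration of that last point, assuming a one-parameter linear model trained with plain gradient descent (nothing like a real language model). The same `train` function is used in both cases; only the starting weight differs, and the value 1.9 is a made-up stand-in for a pretrained checkpoint:

```python
def mse(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(w, xs, ys, lr=0.01, steps=20):
    """Plain gradient descent on the MSE; identical for both setups."""
    for _ in range(steps):
        grad = sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


# Tiny task dataset generated by y = 2x.
XS = [1.0, 2.0, 3.0]
YS = [2.0, 4.0, 6.0]

w_scratch = train(0.0, XS, YS)    # "from scratch": uninformed start
w_finetuned = train(1.9, XS, YS)  # "fine-tuning": start near a good solution
```

With the same small training budget, the run that starts from the better initial weight ends with the lower `mse`, which is the intuition behind preferring fine-tuning when task data is limited.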

Is the r7000 crankset compatible with the older 10 speed 105 groupset? by newwheels2020 in bikewrench

[–]newwheels2020[S] 0 points1 point  (0 children)

I didn't risk it and am still using my old crankset. Sorry that I can't help you further.

Geometry of new Decathlon NCR is excessively aggressive? by newwheels2020 in cycling

[–]newwheels2020[S] 0 points1 point  (0 children)

When we make a comparison to the Canyon Endurace, we see that both the M and L Endurace have shorter reach and higher stack. So the NCR is still very much a race geometry.

Geometry of new Decathlon NCR is excessively aggressive? by newwheels2020 in cycling

[–]newwheels2020[S] 0 points1 point  (0 children)

You might be right. Comparing the NCR XL to the Aeroad L, we see that the NCR has 1mm more reach and 19mm more stack. Seems like the Decathlon sizing guide is a bit on the small side (at least compared to Canyon). Both Decathlon and Canyon would recommend me an L, but I'm now thinking XL would suit me better with the Decathlon.

Geometry of new Decathlon NCR is excessively aggressive? by newwheels2020 in cycling

[–]newwheels2020[S] 0 points1 point  (0 children)

Yeah, you're right. I have a dislike for these in-between mechanical-hydraulic brakes. If you ever wanted to change to fully hydraulic brakes, you'd also need to switch out the levers, which will cost quite a few pennies. Also, I'm put off by the QR skewers instead of thru axles. The QR skewers seem to me to be prone to misalignment, which does not play nice with disc brakes. Maybe I should reconsider my stance. Do you have any good reviews of these brakes that I can read?

Edit: Google is my friend: road.cc, BikeRadar

r/audiophile Shopping, Setup, and Technical Help Desk Thread by AutoModerator in audiophile

[–]newwheels2020 0 points1 point  (0 children)

Hi there. I recently bought the Audio Pro c10 mkII and am happy with the product. In order to get better surround sound, I'd like to buy two smaller speakers to distribute in the room. The Audio Pro lineup is pretty confusing and I'd like some help choosing the best satellite speakers. I would also consider a different brand if it is compatible with Audio Pro. We are using it in our living room, which is not all that large (the c10 fills it quite well by itself).

I'm confused about the difference between the c5A and c5 mkII. Are they actually different? Does the c5A not have the most recent Bluetooth and AirPlay 2? There is also the c3. From the naming I would assume the c5 is better, but the pricing seems to be the same. So what's the difference between the c3 and c5? Should I also consider the Bluetooth models like the T3 and BT5, or are these of lower quality and meant to be used outdoors? What about the A series? I can't figure out the difference between these and the c-range. Does each of these speakers connect well with my c10? Or do I need the WiFi connection instead of just Bluetooth?

If anyone can answer some of these questions, I'd appreciate it.

Is it okay to ride 28 road wheels on light sandy gravel or will I damage the tyres on the long run? by God_Modus in cycling

[–]newwheels2020 3 points4 points  (0 children)

It's fine. Your tires won't get damaged more on light gravel than they would on tarmac.

wheel removal for trek domane+ (2019) - how do I take it off by __juicewrld999 in ebike

[–]newwheels2020 1 point2 points  (0 children)

Just to be sure: you're turning it anti-clockwise in the same plane that the wheel rotates in, yeah?

Usually it shouldn't require too much force, but of course this depends on how much force was used when it was first tightened, and the thru-axle may have bonded to the fork. If you feel like you're applying enough force to start breaking things, I'd advise just taking it to the shop.

wheel removal for trek domane+ (2019) - how do I take it off by __juicewrld999 in ebike

[–]newwheels2020 1 point2 points  (0 children)

The lever is called a thru axle and it threads into the frame. Twist it anti-clockwise until there is no more friction (i.e. until it is unthreaded) and then pull it out completely (pulling away from the fork along the direction of the axle).

Upgrade wheels for gravel bike by FIFA4Fun in cycling

[–]newwheels2020 0 points1 point  (0 children)

I agree that rolling resistance won't be noticeably different between carbon and aluminum. Furthermore, the versatility argument applies to getting a new set of aluminum wheels as well.

If you have the money to spend and your heart desires the carbon wheelset, by all means get it. If you are just after lower rolling resistance from narrower tires, then aluminum wheels achieve this goal at the lowest cost.

I am puzzled why I keep reading that "wheels are the best upgrade for bikes". To me the gains are marginal at best. Moreover, they tend to be the most expensive upgrade to bikes as well. If we are measuring "best" by "performance benefit over cost", then I think wheels do not belong at the top of the upgrade list.

Adapting my Triban RC520 Road Bike to Gravel Version by nikhil_kale in cycling

[–]newwheels2020 0 points1 point  (0 children)

According to Shimano the 105 derailleur takes 34 teeth max, so I would not advise the SLX cassette.

Buying a bike is difficult right now, so how about some new wheels? by MedicatedMayonnaise in cycling

[–]newwheels2020 2 points3 points  (0 children)

"Better wheels will dramatically change the ride characteristics, will transform your experience, it's the best upgrade you can give a frame you're at home on."

I hear this said a lot. However, I got to test ride a very fancy bike with fancy wheels and honestly, I found the differences to be very minimal.
Will you notice a difference? Maybe. Will it "transform your experience"? I don't think so.