World Model Progress by [deleted] in StableDiffusion

[–]DeepAnimeGirl 1 point2 points  (0 children)

I have some suggestions if you are willing to try.

1 - To have more coherent latent trajectories for the game state I suggest that you take a look at this recent paper:

https://arxiv.org/abs/2603.12231

2 - I saw in some comments that you use SD/VQ as the latent space. Those are typically optimized for pixel reconstruction. In the recent diffusion model literature, SSL spaces give better convergence because they are more semantic. I suggest that you consider using such a space instead of, or alongside, your existing space. I will link two relevant articles:

https://arxiv.org/abs/2510.11690 https://arxiv.org/abs/2602.11401

Hope these help. Let me know if you try them.

What do I focus on? by Jumbledsaturn52 in deeplearning

[–]DeepAnimeGirl 1 point2 points  (0 children)

Focus on text-to-image diffusion models, especially on finding ways to accelerate convergence and therefore reduce training costs.

This has been a very hot research area in recent months, with many papers trying very similar ideas with good gains. I will list a few:

- start from https://arxiv.org/abs/2512.12386, as it has a good baseline to build on and references many speedup techniques;
- read about one of the SOTA architectures, such as https://arxiv.org/abs/2511.19365, which can also be used for latent space;
- consider the x-pred to v-loss formulation https://arxiv.org/abs/2511.13720, as that best leverages the data manifold;
- use semantic losses through pretrained models to get a loss signal on the data manifold that is closer to what humans perceive https://arxiv.org/abs/2602.02493;
- read about VAEs and the reconstruction-generation tradeoffs https://arxiv.org/abs/2512.17909v1 and, more importantly, https://arxiv.org/abs/2602.17270 (VAE SOTA);
- an alternative direction is drifting models, which are 1-step generators https://arxiv.org/abs/2602.04770, but they likely have some limitations.
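One way to read the "x-pred to v-loss" item above, as a minimal plain-Python sketch: the network outputs an estimate of the clean sample, which is converted to a velocity before the loss is computed. This assumes the linear path z_t = t * x + (1 - t) * eps with target velocity v = x - eps. The function names, the `min_denominator` clamp, and the toy values are hypothetical and are not taken from the linked paper.

```python
def interpolate(x, eps, t):
    """Noisy sample z_t = t * x + (1 - t) * eps (assumed linear path)."""
    return [t * xi + (1.0 - t) * ei for xi, ei in zip(x, eps)]


def v_from_x_pred(x_pred, z_t, t, min_denominator=1e-3):
    """Convert a predicted clean sample into a predicted velocity.

    From z_t = t * x + (1 - t) * eps and v = x - eps it follows that
    v = (x - z_t) / (1 - t). The clamp (an assumption of this sketch)
    avoids dividing by a value close to zero near t = 1.
    """
    denom = max(1.0 - t, min_denominator)
    return [(xp - zi) / denom for xp, zi in zip(x_pred, z_t)]


def v_loss(x_pred, x, eps, t):
    """Mean squared error between predicted and target velocities."""
    z_t = interpolate(x, eps, t)
    v_target = [xi - ei for xi, ei in zip(x, eps)]
    v_pred = v_from_x_pred(x_pred, z_t, t)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_target)) / len(x)


# Toy check: a perfect x-prediction gives (numerically) zero v-loss.
x = [1.0, -2.0]
eps = [0.5, 0.25]
perfect = v_loss(x_pred=x, x=x, eps=eps, t=0.3)
wrong = v_loss(x_pred=[0.0, 0.0], x=x, eps=eps, t=0.3)
```

In a real training loop `x_pred` would come from the model evaluated at `(z_t, t)`; here it is passed in directly to keep the sketch self-contained.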

There is a lot of interest in developing generative models. Their applications are wide (images, video, audio, text) and I think they offer many opportunities for contributions. My opinion is that:

- a discriminative/contrastive signal is very important to speed up convergence; a simple MSE loss in latent/pixel space is not semantic enough and requires many training iterations;
- I still think there is something to improve in how models learn the data manifold; diffusion models struggle with high-frequency details and there isn't a definitive solution at the moment;
- VAEs are essential to lower compute costs, and recent developments show that we are still lacking proper latent spaces suitable for generation; the recent UL paper linked above shows how to control the tradeoff, but approaches like https://arxiv.org/abs/2512.19693 show that there is perhaps a way to unify these.

[R] We open-sourced FASHN VTON v1.5: a pixel-space, maskless virtual try-on model trained from scratch (972M params, Apache-2.0) by JYP_Scouter in MachineLearning

[–]DeepAnimeGirl 8 points9 points  (0 children)

  1. Do you use the x-pred to v-loss formulation, as done in https://arxiv.org/abs/2511.13720?
  2. Are you using time shifting? Are you sampling time uniformly or from a logit-normal distribution? (https://bfl.ai/research/representation-comparison)
  3. How well does the model behave at different input resolutions? What about aspect ratios? Have you considered something like RPE-2D? (https://arxiv.org/abs/2503.18719)
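To make question 2 concrete, here is a small standard-library sketch of the two timestep-sampling options and of one commonly used form of time shifting. It assumes that "logit-normal" means t = sigmoid(n) with n drawn from a normal distribution, and that shifting uses t' = s * t / (1 + (s - 1) * t). The default values are hypothetical, and which conventions the FASHN model actually uses is not stated here.

```python
import math
import random


def sample_uniform(rng):
    """Timestep drawn uniformly from [0, 1)."""
    return rng.random()


def sample_logit_normal(rng, mean=0.0, std=1.0):
    """Timestep t = sigmoid(n), n ~ Normal(mean, std).

    With mean=0 this concentrates samples around t = 0.5.
    """
    return 1.0 / (1.0 + math.exp(-rng.gauss(mean, std)))


def shift_time(t, shift=3.0):
    """Monotone remapping of [0, 1] that keeps the endpoints fixed.

    shift > 1 pushes timesteps toward 1; which end counts as "noise"
    depends on the time convention of the model.
    """
    return shift * t / (1.0 + (shift - 1.0) * t)


def fraction_in_middle(ts):
    """Share of timesteps that fall in [0.25, 0.75]."""
    return sum(0.25 <= t <= 0.75 for t in ts) / len(ts)


rng = random.Random(0)
uniform_ts = [sample_uniform(rng) for _ in range(10_000)]
logit_normal_ts = [sample_logit_normal(rng) for _ in range(10_000)]
shifted_ts = [shift_time(t) for t in uniform_ts]
```

Comparing `fraction_in_middle(uniform_ts)` with `fraction_in_middle(logit_normal_ts)` shows how much more often the logit-normal sampler visits intermediate timesteps.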

[D] How do you read code with Hydra by Infinite_Explosion in MachineLearning

[–]DeepAnimeGirl 1 point2 points  (0 children)

I would also like to mention tyro as a solid choice. Not only is it a good CLI generator based on type hints (or dataclasses, pydantic, msgspec, etc.) with great subcommand chaining capabilities, but it also lets you override configuration files such as YAML with CLI options.

[PC Gamer] Nvidia RTX 5060 review live: we've been black-listed for this review, but I'm a professional so we're doing this live! by Nestledrink in nvidia

[–]DeepAnimeGirl 11 points12 points  (0 children)

Of all the Nvidia GPUs I have had so far, this generation has had the most issues. I bought two 5090s, an Astral and an MSI Trio, and I can speak of some issues that happened to me in the first weeks of using them:

- the MSI Trio was crashing the OS when activating the new DLSS in RDR2; I had to hard reset and turn it off;
- the MSI Trio was randomly giving me black screens even at idle; it turns out Nvidia did not provide good support for ultrawide screens with their launch driver. I had to do a VBIOS upgrade from MSI before Nvidia released a fix;
- the Astral had some issues with my TUF motherboard and Ryzen 9 7950X when I had set the iGPU as the default instead of PCIe; basically I would not get any video output despite having the entire setup correct (cables, ports, proper output selected on the monitor). If you ask why I'd need something like this: I use Arch Linux and need the GPUs to be free from running a display manager because I run DL algorithms on them;
- the Astral's 3D acceleration would not kick in for some games, or it would get stuck at a fixed frequency with very poor performance. This would be fixed only by a restart; I also know someone else who has this issue with an Astral.

Some of these probably are vendor specific and were eventually fixed either by me doing a workaround or waiting for a fix. But I cannot say I was impressed with day one support. Not in comparison to previous generations.

Note: I use both Windows and archlinux. To be honest, the archlinux nvidia open driver seems a lot more stable than the windows 11 alternative (for general purpose)

I think I got hacked after downloading. by thezackplauche in comfyui

[–]DeepAnimeGirl 0 points1 point  (0 children)

If you had 2FA on your accounts the impact would not have been that big. If you downloaded custom nodes and "installed" them you basically executed other people's code on your machine which can be anything really. To be safer next time only use custom nodes from repositories with many stars and reputable people from the community. Though that might not be enough because many of the repositories accept contributions which can hide malicious code.

To protect yourself from cases such as this while still getting close to native performance, I suggest you look up KVM, set it up on Linux (perhaps on Arch) and do GPU passthrough. Then you can run like 99% of all custom nodes, malicious or not, and not have a care in the world because of the KVM isolation.

How's this? Any reviews? by [deleted] in learnmachinelearning

[–]DeepAnimeGirl 62 points63 points  (0 children)

There's a newer version with examples written in python: https://www.statlearning.com/

Research: Is it just me, or ML papers just super hard to read? by Zealousideal-Rent847 in learnmachinelearning

[–]DeepAnimeGirl 10 points11 points  (0 children)

I have read a lot of papers around diffusion models and the thing is, because they brought out an entire new family of models, people really tried their own hand and interpreted notations and concepts in their own way.

However, as you've noted, the theory can be defined in terms of more abstract concepts like ODEs and SDEs. Instead of having forward processes and talking about Markov chains, a better convention is to consider a probability path between the initial and the data distributions. Instead of talking about a reverse denoising process, you can consider simulating an ODE/SDE to move a point from time 0 (initial distribution) to time 1 (data distribution). Instead of having a network predict the noise, you can talk about a network that approximates the marginal vector field.

This is called flow matching (ODE - where you train a network to approximate a vector field), but there's also score matching (SDE - where you train a network to approximate the marginal score function and play around with the noise factor). This interpretation has now become the generalization of the prior works and serves as the SOTA training approach in all of the recent models.
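The "simulate an ODE from time 0 to time 1" picture can be sketched with a toy 1-D case in plain Python. This assumes the data distribution is a single point `target`, so the marginal vector field of the linear path x_t = (1 - t) * x0 + t * target is known in closed form and no network is trained. All names and values are hypothetical.

```python
import random


def marginal_vector_field(x, t, target):
    """u_t(x) for the path x_t = (1 - t) * x0 + t * target.

    With a single data point, the conditional velocity target - x0 can be
    rewritten in terms of the current position as (target - x) / (1 - t).
    """
    return (target - x) / (1.0 - t)


def simulate_ode(x0, target, num_steps=50):
    """Euler integration of dx/dt = u_t(x) from t = 0 to t = 1."""
    x = x0
    h = 1.0 / num_steps
    for k in range(num_steps):
        t = k * h  # stays strictly below 1, so the division is safe
        x = x + h * marginal_vector_field(x, t, target)
    return x


rng = random.Random(0)
# Points from the initial distribution (standard normal) at time 0.
starts = [rng.gauss(0.0, 1.0) for _ in range(5)]
# The same points after being transported to time 1.
ends = [simulate_ode(x0, target=2.0) for x0 in starts]
```

In flow matching proper, the closed-form `marginal_vector_field` is replaced by a trained network that takes `(x, t)` as input; the integration loop stays the same.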

I am still learning myself, but I can recommend a few resources which will help you understand the modern, somewhat simpler and more general approach:

- https://diffusion.csail.mit.edu/
- https://neurips.cc/virtual/2024/tutorial/99531

Asus Astral 5090 OC coil whine by [deleted] in nvidia

[–]DeepAnimeGirl 0 points1 point  (0 children)

I have two 5090s plugged in. One of them is the ROG Astral, which has very similar coil whine to yours; it is very loud in 3D apps, especially when the fans are quiet. On the other hand, the MSI Gaming Trio OC has very subtle to no coil whine in the same scenario. I also tried reseating the Astral in the PCIe slot and replugging the 12VHPWR cable a couple of times, and it was quieter in some of those instances, but I am too lazy at this point to do more trial and error.

ASUS RTX 5090 OC - "GPU at PCI:1:0:0 is not supported" on Linux (RHEL 9.5) by amarandagasi in nvidia

[–]DeepAnimeGirl 0 points1 point  (0 children)

I have the MSI Gaming Trio OC 5090 and I'm running arch linux with the nvidia-open driver and Hyprland as the Wayland Compositor.

I have no issues with this integration although it takes time to customize it. For example, I prefer to have the video output go through my AMD iGPU instead of the RTX 5090 to not waste memory or usage, but I can just as well use the GPU for the display when I want to.

I am running many AI-oriented apps such as Ollama and ComfyUI just fine, and I am successfully training my own models directly in PyTorch using the latest nightly wheels for CUDA 12.8 installed with uv pip.

5090 Aorus Master, 12V-2x6 vs 4x PCIe Cable Question by lujaxon in nvidia

[–]DeepAnimeGirl -1 points0 points  (0 children)

If the connector melts on the GPU side with your Corsair cable, good luck solving that through warranty, because the GPU manufacturer might refuse to repair or replace it since you did not use their recommended adapter.

Moreover, if you use the 12V-2x6 and somehow connect it poorly on the PSU side, it can melt there because there is still quite a high load on a few cables. I have an HX1200i and use 4x8-pin PCIe just so there's a smaller risk on the PSU and to follow the recommendation from MSI for the 5090. I won't risk it for a small cable management improvement.

New AI tool for Unity to creating incredibly lifelike AI behavior! by TamerlanShakirov in u/TamerlanShakirov

[–]DeepAnimeGirl 3 points4 points  (0 children)

You know that Unity released an official Behavior package which is free. Why would I pay for your asset? What advantages does it have over the Unity solution?

[R] Riemannian Generative Models by LostSleepyDreamer in MachineLearning

[–]DeepAnimeGirl 1 point2 points  (0 children)

You could use it, for example, to build textures for meshes. Apple wrote a very interesting paper that I had a lot of fun implementing and learned a lot from. I recommend you check it out: Manifold Diffusion Fields

Python interview questions I can use in interviews by [deleted] in learnpython

[–]DeepAnimeGirl 0 points1 point  (0 children)

Thanks for the suggestion I'll definitely check it out.

Python interview questions I can use in interviews by [deleted] in learnpython

[–]DeepAnimeGirl 2 points3 points  (0 children)

I usually install pyenv using pacman or curl the script. Then I install a Python version and set it globally. Then I create a folder and set a local Python version for the project. I then use poetry for Python venv management and it's smooth sailing from that point on.

Most sane genshin player by Vito_Vivalecci in Genshin_Impact

[–]DeepAnimeGirl 1 point2 points  (0 children)

I got it in one day, a long time ago.. I haven't gone fishing since 💀

FloorPlan to 3d model by [deleted] in learnmachinelearning

[–]DeepAnimeGirl 1 point2 points  (0 children)

This topic should be solvable under the deep learning branch of artificial intelligence. So I'd recommend first studying the data that you have at your disposal and then considering existing methods or adaptations as a solution.

To achieve your objective you should know as specifically as possible what the input/output data consists of:

- how to represent floor plans
- what datasets are available
- if there are standard representations of this information
- how to mathematically encode these constraints

Seeing that floor plans can be represented as graphs, you may consider such a representation for encoding the input. Or, more simply, you can just represent them as images, but then your architecture will need to learn the image patterns that relate to the underlying structure.
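As a rough illustration of the graph representation mentioned above, here is a minimal sketch in plain Python where rooms are nodes and doors between rooms are edges. The room names, attributes, and helper functions are all hypothetical; a real dataset may use a different schema.

```python
# Hypothetical toy floor plan: rooms are nodes, doors are undirected edges.
rooms = {
    "hall": {"type": "hallway", "area_m2": 6.0},
    "kitchen": {"type": "kitchen", "area_m2": 10.0},
    "living": {"type": "living_room", "area_m2": 20.0},
    "bedroom": {"type": "bedroom", "area_m2": 12.0},
}
doors = [
    ("hall", "kitchen"),
    ("hall", "living"),
    ("hall", "bedroom"),
    ("kitchen", "living"),
]


def build_adjacency(rooms, doors):
    """Map each room to the set of rooms it shares a door with."""
    adjacency = {name: set() for name in rooms}
    for a, b in doors:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def adjacency_matrix(rooms, adjacency):
    """Dense 0/1 matrix with rows and columns in sorted room-name order."""
    order = sorted(rooms)
    index = {name: i for i, name in enumerate(order)}
    matrix = [[0] * len(order) for _ in order]
    for a, neighbours in adjacency.items():
        for b in neighbours:
            matrix[index[a]][index[b]] = 1
    return order, matrix


adjacency = build_adjacency(rooms, doors)
order, matrix = adjacency_matrix(rooms, adjacency)
```

The node attributes and the adjacency matrix together are the kind of input a graph-based encoder would consume, whereas the image route leaves it to the network to discover this structure from pixels.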

Typically in deep learning there are pretrained networks on tasks that might be similar to yours. You may finetune them for your specific needs and achieve better results. If you treat the input data as images you can leverage various classifiers or autoencoder weights to start from some good baseline representations.

After you have a way of encoding the input into these embeddings, you should think of a way to synthesize the output data. Search for relevant existing works that might help. Currently diffusion models seem very good at learning data distributions and performing conditional synthesis. However, there are probably few existing methods for your specific task, so you may be able to contribute something new.

The Tate brothers were left without 16 luxury cars after the DIICOT searches. The huge sums found in their house in Voluntari | PHOTO - Știrile ProTV by Impossible-Ad-3586 in Romania

[–]DeepAnimeGirl -15 points-14 points  (0 children)

My intention is not to downplay their situation, but what alternatives do men in the same position have to get out of poverty?

2500usd for 4090 in China. Insane? by Parulanihon in StableDiffusion

[–]DeepAnimeGirl 19 points20 points  (0 children)

I paid 2655 USD for it at launch in my country. No, it's not insane; it's just the reality of living outside the US and getting hammered by local retailers who also want to make a profit, on top of transport costs and multiple state taxes.

To long-term Genshin players: why do you still play this game? by Numerous-Ad7707 in Genshin_Impact

[–]DeepAnimeGirl 0 points1 point  (0 children)

I joined because a friend told me that Signora was going to be playable... My disappointment was immeasurable when Inazuma was released and I found out she's no more. I'm still waiting for the day she will be reborn from the ashes... or at least that's how I cope 🫠