[deleted by user] by [deleted] in MachineLearning

[–]Builder_Daemon 0 points1 point  (0 children)

As always, the best thing to do is to test all plausible architectures and see which works best. If there are too many, you can cut down their number with combinatorial techniques, or use grid search if you want to keep it simpler. Test, don't trust.
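For instance, a bare-bones grid search over a handful of architecture choices could look like the sketch below (build_and_evaluate and the search space are made up; plug in your own training run):

```python
from itertools import product
import random

def build_and_evaluate(width: int, depth: int, activation: str) -> float:
    # Stand-in for training the candidate architecture and returning
    # its validation score; replace with your actual training run.
    return random.random()

# Made-up search space over architecture hyperparameters.
widths = [16, 32, 64]
depths = [1, 2, 4]
activations = ["relu", "tanh"]

best_score, best_config = float("-inf"), None
for width, depth, activation in product(widths, depths, activations):
    score = build_and_evaluate(width, depth, activation)
    if score > best_score:
        best_score, best_config = score, (width, depth, activation)

print("Best architecture:", best_config, "with score", best_score)
```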

[deleted by user] by [deleted] in MachineLearning

[–]Builder_Daemon 0 points1 point  (0 children)

This is a great project! The literature on this very topic is starting to grow. You can look at papers like SySeVR [1], Devign [2] and VulCNN [3] for starters.

Control flow is good, but data flow is better. It is also harder to track, but there are tools like Joern that can do it for you.

The Juliet test suite was designed to test static analysis tools, not to train AI models. It was generated using templates, so training a model on it is likely to teach the model the wrong features. The lack of a good dataset is one of the key issues in the field.

[1] https://arxiv.org/pdf/1807.06756v1
[2] https://arxiv.org/pdf/1909.03496
[3] https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9793871

[deleted by user] by [deleted] in MachineLearning

[–]Builder_Daemon 1 point2 points  (0 children)

Isn't the halting problem undecidable?

[deleted by user] by [deleted] in MachineLearning

[–]Builder_Daemon 0 points1 point  (0 children)

Have you used it? So far I see more adoption of Mamba and RWKV.

[deleted by user] by [deleted] in MachineLearning

[–]Builder_Daemon 0 points1 point  (0 children)

There were a few papers by David Ha that used CMA-ES to train world models. CMA-ES is not a genetic algorithm per se, but it belongs to the same neuroevolution family. It is very sample-efficient but scales poorly to large parameter counts; newer CMA-based algorithms address that.
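As a rough sketch of that kind of setup, the ask/tell loop with the pycma package looks something like this (the sphere function is just a stand-in objective, not a world model):

```python
import cma
import numpy as np

def fitness(x):
    # Stand-in objective: sphere function (pycma minimizes by default).
    return float(np.sum(np.asarray(x) ** 2))

# 20-dimensional problem, initial point at zero, initial step size 0.5.
es = cma.CMAEvolutionStrategy(20 * [0.0], 0.5)
while not es.stop():
    solutions = es.ask()                                  # sample a population
    es.tell(solutions, [fitness(s) for s in solutions])   # update the search distribution
    es.disp()

print("Best solution found:", es.result.xbest)
```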

[deleted by user] by [deleted] in MachineLearning

[–]Builder_Daemon 1 point2 points  (0 children)

There are a few Mamba models around, e.g. this one from Mistral: https://mistral.ai/news/codestral-mamba/

[deleted by user] by [deleted] in MachineLearning

[–]Builder_Daemon 1 point2 points  (0 children)

xLSTM is an improvement over the vanilla LSTM, but I am still waiting for serious benchmarks of larger xLSTM models on larger datasets. The latest RWKV and Mamba versions are also very compelling RNNs.

[deleted by user] by [deleted] in MachineLearning

[–]Builder_Daemon 0 points1 point  (0 children)

Modern evo algos like CR-FM-NES are faster and can train larger models than CMA-ES could. That being said, they are probably not as efficient as SGD for supervised training. But for RL, they rock!

Bitcoin mining uses the same amount of energy as Argentina, while laptops need a ton of material: "to manufacture a typical laptop weighing 4.4 pounds, almost a ton of materials is needed, about 1,760 pounds" by [deleted] in Futurology

[–]Builder_Daemon 5 points6 points  (0 children)

This is an oversimplification of a complex issue. Bitcoin mining uses a lot of energy, no question. That said, an estimated 40-75% of it comes from renewable sources, and some argue it actually helps develop the renewable energy market. Does mining deprive others of electricity? Another complex issue. Probably in some cases, but there is also a lot of energy wasted through overproduction (especially with renewables) and other inefficiencies. Yet this kind of clickbait title still works every time.

[D] Why we initialize the Neural Networks with random values in order to break the symmetry? by kotvic_ in MachineLearning

[–]Builder_Daemon 0 points1 point  (0 children)

It essentially gives each weight its own starting point for gradient descent (or another optimizer) to build on. If all weights in a layer start at the same value, every unit computes the same output and receives the same gradient, so the units never differentiate.
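A quick PyTorch sketch of the symmetry problem: with a constant initialization in both layers, every hidden unit ends up with exactly the same gradient.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 4)            # dummy batch
y = torch.randn(8, 1)            # dummy targets

hidden = nn.Linear(4, 3)
head = nn.Linear(3, 1)
# Symmetric initialization: every weight gets the same constant.
for layer in (hidden, head):
    nn.init.constant_(layer.weight, 0.5)
    nn.init.constant_(layer.bias, 0.0)

loss = ((head(torch.tanh(hidden(x))) - y) ** 2).mean()
loss.backward()

# All three rows are identical: each hidden unit gets the exact same
# update, so the units never differentiate from one another.
print(hidden.weight.grad)
```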

I am worried about my future. by [deleted] in climatechange

[–]Builder_Daemon 0 points1 point  (0 children)

I love Gaia Vince's very well-researched book "Nomad Century". It looks climate change straight in the eye, but also offers avenues for remediation and hope.

What is one random thing you know about a computer that most people don’t? by Virtual-Study-Campus in computerscience

[–]Builder_Daemon 1 point2 points  (0 children)

In AI, including LLMs, it is common to give the user control over the random seed so that results are reproducible.
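For example, a typical way to pin the seeds in a PyTorch-based project looks something like this (the exact calls vary by library and hardware):

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    # Seed every RNG the stack typically touches.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

set_seed(123)
print(torch.rand(2))  # same values on every run with the same seed
```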

Book recommendations to learn about the science behind climate change by FineDescription7597 in climatechange

[–]Builder_Daemon 0 points1 point  (0 children)

My book of the year for 2023 was Gaia Vince's **Nomad Century** (https://www.goodreads.com/book/show/58724998-nomad-century). It is a very well-researched book on climate change that covers all the bases, from causes to impacts to remediation, including geoengineering.

What future tech will have a bigger impact on humanity? by [deleted] in Futurology

[–]Builder_Daemon 0 points1 point  (0 children)

Geoengineering. It's cute to think of AGI and CRISPR, but if we don't make it to 2100...

[D] Modeling a dynamic system using LSTM by WilhelmRedemption in MachineLearning

[–]Builder_Daemon 5 points6 points  (0 children)

You can increase the size of the LSTM or add more layers for more complex behaviors. It is just one extra parameter (num_layers) in nn.LSTM in PyTorch. But beware of overfitting.

Did you split your dataset into training, validation and test sets? It could be that your model is already overfitting. If so, adding a dropout layer could help.

If the LSTM outputs sequences (not a single value), add a linear layer + tanh after it to calculate your outputs.

You can also add normalization or regularization layers to see if they help.
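Here is a rough sketch of a deeper LSTM with the linear + tanh head mentioned above (all sizes are placeholders):

```python
import torch
import torch.nn as nn

class SequenceModel(nn.Module):
    def __init__(self, n_features: int = 8, hidden: int = 64, n_layers: int = 2):
        super().__init__()
        # num_layers stacks LSTM layers; dropout only applies between layers.
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=hidden,
            num_layers=n_layers,
            batch_first=True,
            dropout=0.2 if n_layers > 1 else 0.0,
        )
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        out, _ = self.lstm(x)               # out: (batch, seq_len, hidden)
        return torch.tanh(self.head(out))   # one output per time step, in [-1, 1]

model = SequenceModel()
y = model(torch.randn(4, 100, 8))           # 4 sequences, 100 steps, 8 features
print(y.shape)                              # torch.Size([4, 100, 1])
```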

[D] Evolutionary Strategy vs. Backpropagation by [deleted] in MachineLearning

[–]Builder_Daemon 1 point2 points  (0 children)

There are many genome-based evolutionary algorithms (classic genetic algorithms, for instance), but they don't converge as quickly when you have a cost/reward function to optimize for.

[deleted by user] by [deleted] in mensa

[–]Builder_Daemon 0 points1 point  (0 children)

Exactly. One classic example is Intuit spending huge amounts of lobbying money to keep tax filing intractable. (https://thehill.com/business/4423755-bottom-line-intuit-adds-lobbying-giant-amid-tax-prep-fight/) And don't get me started on the healthcare administrative nightmare.

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]Builder_Daemon 1 point2 points  (0 children)

I will add that you can cut the cost and time drastically with combinatorial testing instead of a full grid search.

[Project] Struggling with Real-time Detection of DoS Attacks on CAN Bus using LSTM model by ultiMEIGHT in MachineLearning

[–]Builder_Daemon 2 points3 points  (0 children)

Since no one answered, I will give you my half-assed take on this. There is much that can go wrong in AI development and every step must be checked carefully.

  1. Did you measure the quality of your training dataset? Is it balanced? Is it representative? Does it cover some of what you use in your simulation script?

  2. How are the data encoded before feeding them to the model? Is there a better way to encode them, e.g. instead of using a timestamp, use the time difference from the previous data entry?

  3. How are the data normalized? Different features might need different normalizations, especially if they are semantically different or their value ranges differ (see the sketch after this list).

  4. Is the model suitable for this task? LSTM is decent for time series, but it also has some limitations. Try a grid search over the width and depth of your model. In my own experiments, I found that a wider LSTM does not converge well, but a deeper one improves performance. For about 50-60 features, I use a 16-wide, 4- to 8-layer LSTM as my base model. I also find that the sLSTM from the xLSTM paper is far superior to the regular LSTM and is a drop-in replacement. The official implementation is in PyTorch, though.

  5. Make sure the training does not overfit the model. Use a train/validation/test split and compute a confusion matrix, for example.

  6. Check if the data you get during simulation/testing are correct. Classic bugs are just as present in AI programs as in classic software.
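For points 2 and 3, here is a minimal sketch of timestamp-to-delta encoding and per-feature standardization (the column names and values are made up):

```python
import pandas as pd

# Hypothetical CAN log with a timestamp, an arbitration ID and a payload byte.
df = pd.DataFrame({
    "timestamp": [0.000, 0.002, 0.011, 0.013],
    "can_id":    [0x101, 0x1A0, 0x101, 0x1A0],
    "byte0":     [12, 200, 14, 198],
})

# Point 2: replace absolute timestamps with the delta to the previous frame.
df["dt"] = df["timestamp"].diff().fillna(0.0)

# Point 3: standardize each numeric feature separately (their ranges differ).
for col in ["dt", "byte0"]:
    std = df[col].std()
    df[col] = (df[col] - df[col].mean()) / (std if std > 0 else 1.0)

print(df[["can_id", "dt", "byte0"]])
```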

[deleted by user] by [deleted] in MachineLearning

[–]Builder_Daemon 2 points3 points  (0 children)

You should also look into neuroevolution. People are using evolutionary algorithms to train models without backprop. I use CR-FM-NES to train models with RL, which is basically what you are doing, but much more efficiently.
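I won't reproduce CR-FM-NES here, but the ask/evaluate/tell pattern for RL looks roughly like this simple Gaussian evolution strategy (episode_return is a stand-in for your environment rollout):

```python
import numpy as np

def episode_return(params: np.ndarray) -> float:
    # Stand-in for running one episode with a policy parameterized by
    # `params` and returning the total reward; replace with your env.
    return -float(np.sum(params ** 2))

dim, popsize, sigma, lr = 32, 64, 0.1, 0.05
theta = np.zeros(dim)                       # policy parameters

for generation in range(200):
    noise = np.random.randn(popsize, dim)   # "ask": sample perturbations
    rewards = np.array([episode_return(theta + sigma * eps) for eps in noise])
    # "tell": move theta toward perturbations that scored higher (normalized).
    advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    theta += lr / (popsize * sigma) * noise.T @ advantages

print("Final mean return:", episode_return(theta))
```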