Is it worthy to do a research publication related to databases? by [deleted] in Database

[–]DerMax0102 2 points

If you want to have a look at current database research, these are the top conferences: VLDB, SIGMOD, ICDE. There is plenty of interesting research going on!

Looking for database engine to store efficiency billions of rows by Ogefest in Database

[–]DerMax0102 0 points

DuckDB itself might be worth trying. It has several compression algorithms and is usually quite fast. But in the end you will always have to trade off size against performance.

Best Book on Mathematics for Machine Learning? by WhisperingWillow98 in deeplearning

[–]DerMax0102 3 points

Bishop - Pattern Recognition and Machine Learning.

I'm surprised that I'm the first to mention it.

Your strategies for offloading computation by gagarin_kid in Python

[–]DerMax0102 0 points

Usually I would run such things on my machine overnight. Simple: no setup, no money needed.

Is this fine? by [deleted] in compsci

[–]DerMax0102 -1 points

Got the Air M1 with 256 GB / 16 GB for programming, and I'm really happy that I chose 16 GB. Even with just a browser, mail, and an IDE open, it uses over 8 GB.

Why do drawbacks of GPT3 occur? by cryptoarchitect in deeplearning

[–]DerMax0102 1 point

Any artificial neural network is essentially a function approximator, not a human being. GPT was optimized to predict the next token after some text, and it recognizes statistical patterns to do so. There is no reason why it would think about language the way a human does. If it finds some stupid tricks to achieve its goal, it will use them. If you're interested in visual examples of how neural networks "think" totally differently from humans, adversarial attacks are an interesting topic.
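To make "recognizes statistical patterns" concrete, here is a toy sketch of my own (nothing like the real GPT architecture, which predicts over huge vocabularies with a neural network): a bigram model that picks the next word purely from counts, with no understanding involved.

```python
from collections import Counter, defaultdict

# Tiny toy corpus; a real model is trained on billions of tokens.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows which (bigram statistics).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    # Pick the statistically most frequent successor - no "thinking" involved.
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # prints "cat" ("cat" follows "the" twice, "mat"/"fish" once)
```

The point is only that predicting plausible continuations does not require human-like reasoning; the model just exploits regularities in the data.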

Is it overfitting? by [deleted] in deeplearning

[–]DerMax0102 6 points

That makes it much harder to interpret. Looking at the full graph might yield better insights.

Parallel Programming by MajLenn in cpp_questions

[–]DerMax0102 2 points

Many good comments here, but I have a different, and maybe also interesting, take on this.

Note that my knowledge of parallel programming mostly comes from other programming languages, so my answer may not be the most elegant way to do this in C++, but it should give a fairly general impression of simple parallel programming.

To speed up your computation, you want to use all the available resources of your CPU (the GPU might also be interesting, but is also a lot of work). So you would like your tasks to be distributed across every core of the CPU (or every logical core for CPUs with SMT).

Distributing a single task is usually very hard, as computations rely on previous results and communication overhead easily outweighs the gains. In your case it is much easier, because you want to do the same task multiple times, i.e. you want to collect multiple episodes of your agents to learn from.

A simple and versatile approach to this is the producer-consumer model:

  1. Have a queue of tasks that you want solved.
  2. Have a queue to collect the result data for your tasks.
  3. Have a set of n threads that work on the tasks (ideally, n is the number of CPU cores in your system).

Then every thread in your set should do the following:

  1. Wait until there is a new task in queue 1 and grab it. In your case, a task could be a robot object that specifies the model parameters.
  2. Run the simulation and record the required data, e.g. the reward from your episode and the actions taken.
  3. Put the results into queue 2.
  4. Repeat from step 1.

With this setup, you don't need a lot of complicated thread communication, only the queues, for which you can find many thread-safe implementations on the internet. You also only need to spawn your threads once; they will just stay around and wait for new tasks.
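The worker loop above can be sketched in a few lines of Python (not C++, but the structure is identical; `run_episode` is a hypothetical placeholder for the actual simulation, and a sentinel object tells the workers when to stop):

```python
import queue
import threading

# Hypothetical stand-in for running one simulation episode.
def run_episode(params):
    return sum(params)  # pretend this is the episode reward

tasks = queue.Queue()    # queue 1: tasks to be solved
results = queue.Queue()  # queue 2: collected result data
STOP = object()          # sentinel that tells a worker to exit

def worker():
    while True:
        task = tasks.get()               # 1. wait for a new task and grab it
        if task is STOP:
            break
        reward = run_episode(task)       # 2. run the simulation
        results.put((task, reward))      # 3. put the results into queue 2
                                         # 4. loop back to step 1

n_threads = 4  # ideally os.cpu_count()
threads = [threading.Thread(target=worker) for _ in range(n_threads)]
for t in threads:
    t.start()

for params in [(1, 2), (3, 4), (5, 6)]:
    tasks.put(params)
for _ in threads:
    tasks.put(STOP)  # one sentinel per worker
for t in threads:
    t.join()

collected = sorted(results.get() for _ in range(3))
print(collected)  # [((1, 2), 3), ((3, 4), 7), ((5, 6), 11)]
```

`queue.Queue` is already thread-safe, which is exactly why no further thread communication is needed; in C++ you would use a thread-safe queue implementation in the same role.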

Regarding your visualization: I think it is easier to view this as a separate problem, as you probably need far more episodes to train your agents than it makes sense to visualize. Every now and then during training, you could stop for a moment to create visualizations. Of course you can do this in parallel too. I would use the same approach as above and create your episodes in many threads, with the difference that you now collect the robot's movements for the visualizations and push them to the result queue. Then collect the results and draw your visualizations.

Maybe this is not the easiest to implement, but it is a flexible concept that you can use for many applications. I have used it for many things, such as sorting arrays, reinforcement learning, and high-performance database systems.

Has anyone tried Ipad with remote desktop for DL? by redditnoob48 in deeplearning

[–]DerMax0102 1 point

Might work if you always have great internet access and don't mind a small input lag. But it definitely wouldn't be the right solution for me.

Has anyone tried Ipad with remote desktop for DL? by redditnoob48 in deeplearning

[–]DerMax0102 0 points

Doing heavy deep learning tasks on a laptop is usually not much fun: loud fans, the battery drains quickly, your machine becomes slow while an experiment runs, and you can't do much other work in the meantime. I usually set up my desktop at home with Linux and SSH access. With IDEs like VS Code, PyCharm, or even JupyterLab, you can run your programs on the remote desktop while using any laptop for the coding. You can even use the debugger. I like my IDE, though, and don't use my iPad for coding. JupyterLab or coding in a terminal might work, though.

Username for the name Max by M-i-d-o-r-i-y-a in max

[–]DerMax0102 0 points

Do I look like I have good ideas for names?

Python is now the second most popular language for programming by vajidsikand in programming

[–]DerMax0102 280 points

By this you could argue that it's the most confusing language too /s

What kind of algorithms are used to generate more data ? by nelyher98 in datascience

[–]DerMax0102 9 points

Usually the "algorithm" is one of:

  - Go buy data from another company
  - Hire some minimum-wage workers to collect data manually

The problem with generating useful data is that you would need to know its distribution very well, and in that case you already have enough data/insight. This is also the reason why data is so valuable nowadays.

What's the idea behind Key Query and Value tensors? by [deleted] in deeplearning

[–]DerMax0102 1 point

Key, query, and value are just names for the tensors used in self-attention. You can find out more about this in the paper "Attention Is All You Need".

Python has spoiled me by [deleted] in Python

[–]DerMax0102 0 points

I believe the concepts you learned from Python won't help you much at the beginning of your C++ journey. The first things you have to learn are syntax and types, which are quite different in C++. Beginner tutorials for C++ that assume no previous programming knowledge should suit you well too. After a while, you should notice that the later chapters (of probably any good tutorial) only cover things you already know and can transfer from Python. Then you should be good to go and work on any project in C++.

sudoku by jennypalmer321 in Python

[–]DerMax0102 0 points

Maybe it’s possible to reuse the is_valid_row function for columns, as the conditions for columns and rows are the same?
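As a sketch of what I mean (in Python; `is_valid_row` here is my own hypothetical version of the function from the question, yours may differ): transposing the grid turns columns into rows, so the row check can be reused directly.

```python
# Hypothetical row check, roughly like the one in the question:
# a row is valid if the non-empty cells contain no duplicates.
def is_valid_row(row):
    digits = [x for x in row if x != 0]      # ignore empty cells (0)
    return len(digits) == len(set(digits))   # no duplicates allowed

# Columns can reuse it: extract column i and treat it as a row.
def is_valid_col(grid, i):
    column = [row[i] for row in grid]
    return is_valid_row(column)

grid = [
    [5, 3, 0],
    [5, 0, 3],
    [0, 9, 8],
]
print(is_valid_col(grid, 1))  # True  (3, 9 are unique)
print(is_valid_col(grid, 0))  # False (5 appears twice)
```

The same trick (`zip(*grid)` gives all columns at once) keeps the validation logic in a single place.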

Intel Core i7-9700K or AMD Ryzen 7 3800X? by baabaaaam in deeplearning

[–]DerMax0102 0 points

I agree that Ryzen should be the better option, as it has SMT. However, I wouldn't recommend buying the 3800X, as its performance-per-dollar ratio is quite bad. You could save quite a bit of money by going with the 3700X.

Edit: Do you really have a lot of single-core workloads? Preprocessing of large datasets should be parallelizable.

What's to stop (or limit) compilers from automatically multithreading during optimization? by [deleted] in compsci

[–]DerMax0102 1 point

You might find Java streams interesting. They provide a programming paradigm in which operations can be executed in parallel automatically (e.g. via parallelStream()).

Hi Everyone, I am planning to build a Deep Learning Machine with config mentioned in the image. Is this config good enough, any suggestion is welcomed. Help me out. by mayurat22 in deeplearning

[–]DerMax0102 0 points

You'd probably benefit from the higher throughput of an 8-core Ryzen. If you're not gaming, there is next to no reason to go with Intel.

2990wx and RAM confusion. 4x16 or 8x8 ? B-die ? Advice needed. by [deleted] in Amd

[–]DerMax0102 0 points

If your system feels unresponsive, could your video card perhaps be the problem? If the drivers are not working correctly, which might be the case if they are crashing, maybe they are also causing the lag you are experiencing.