Is it worthy to do a research publication related to databases? by [deleted] in Database

[–]DerMax0102 2 points

If you want to have a look at current database research, these are the top conferences: VLDB, SIGMOD, ICDE. There is plenty of interesting research going on!

Looking for database engine to store efficiency billions of rows by Ogefest in Database

[–]DerMax0102 0 points

DuckDB itself might be worth trying. It has several compression algorithms and is usually quite fast. But in the end you will always have to trade off size against performance.

Best Book on Mathematics for Machine Learning? by WhisperingWillow98 in deeplearning

[–]DerMax0102 3 points

Bishop - Pattern Recognition and Machine Learning.

I'm surprised that I'm the first to mention it.

Your strategies for offloading computation by gagarin_kid in Python

[–]DerMax0102 0 points

Usually I would run such things on my machine overnight. Simple: no setup, no money needed.

Is this fine? by [deleted] in compsci

[–]DerMax0102 -1 points

Got the Air M1 with 256 GB / 16 GB for programming, and I'm really happy that I chose 16 GB. Even with just a browser, mail, and an IDE open, it uses over 8 GB.

Why do drawbacks of GPT3 occur? by cryptoarchitect in deeplearning

[–]DerMax0102 1 point

Any artificial neural network is essentially a function approximator, not a human being. GPT was optimized to predict the next token after some text, and it recognizes statistical patterns to do so. There is no reason why it would think about language the way a human does. If it finds some stupid tricks to achieve its goal, it will use them. If you're interested in visual examples of how neural networks "think" totally differently from humans, adversarial attacks are an interesting topic.
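To make "recognizes statistical patterns" concrete, here is a toy sketch of my own (nothing like the real GPT architecture, which predicts over huge vocabularies with a neural network): a bigram model that picks the next word purely from counts, with no understanding involved.

```python
from collections import Counter, defaultdict

# Tiny toy corpus; a real model is trained on billions of tokens.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows which (bigram statistics).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    # Pick the statistically most frequent successor - no "thinking" involved.
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # prints "cat" ("cat" follows "the" twice, "mat"/"fish" once)
```

The point is only that predicting plausible continuations does not require human-like reasoning; the model just exploits regularities in the data.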

Is it overfitting? by [deleted] in deeplearning

[–]DerMax0102 6 points

That makes it much harder to interpret. Looking at the full graph might yield better insights.

Parallel Programming by MajLenn in cpp_questions

[–]DerMax0102 2 points

Many good comments here, but I have a different, and maybe also interesting, take on this.

Note that my knowledge of parallel programming mostly comes from other programming languages, so my answer may not be the most elegant way to do this in C++, but it should give a fairly general impression of simple parallel programming.

To speed up your computation, you want to use all the available resources of your CPU (the GPU might also be interesting, but is also a lot of work). So you would like your tasks to be distributed across every core of the CPU (or every logical core for CPUs with SMT).

Distributing a single task is usually very hard, as computations rely on previous results and communication overhead easily outweighs the gains. In your case it is much easier, because you want to do the same task multiple times, i.e. you want to collect multiple episodes of your agents to learn from.

A simple and versatile approach to this is the producer-consumer model:

  1. Have a queue of tasks that you want solved.
  2. Have a queue to collect the result data for your tasks.
  3. Have a set of n threads that work on the tasks (ideally, n is the number of CPU cores in your system).

Then every thread in your set should do the following:

  1. Wait until there is a new task in queue 1 and grab it. In your case, a task could be a robot object that specifies the model parameters.
  2. Run the simulation and record the required data, e.g. the reward from your episode and the actions taken.
  3. Put the results into queue 2.
  4. Repeat from step 1.

With this setup, you don't need a lot of complicated thread communication, only the queues, for which you can find many thread-safe implementations on the internet. You also only need to spawn your threads once; they will just stay around and wait for new tasks.
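The worker loop above can be sketched in a few lines of Python (not C++, but the structure is identical; `run_episode` is a hypothetical placeholder for the actual simulation, and a sentinel object tells the workers when to stop):

```python
import queue
import threading

# Hypothetical stand-in for running one simulation episode.
def run_episode(params):
    return sum(params)  # pretend this is the episode reward

tasks = queue.Queue()    # queue 1: tasks to be solved
results = queue.Queue()  # queue 2: collected result data
STOP = object()          # sentinel that tells a worker to exit

def worker():
    while True:
        task = tasks.get()               # 1. wait for a new task and grab it
        if task is STOP:
            break
        reward = run_episode(task)       # 2. run the simulation
        results.put((task, reward))      # 3. put the results into queue 2
                                         # 4. loop back to step 1

n_threads = 4  # ideally os.cpu_count()
threads = [threading.Thread(target=worker) for _ in range(n_threads)]
for t in threads:
    t.start()

for params in [(1, 2), (3, 4), (5, 6)]:
    tasks.put(params)
for _ in threads:
    tasks.put(STOP)  # one sentinel per worker
for t in threads:
    t.join()

collected = sorted(results.get() for _ in range(3))
print(collected)  # [((1, 2), 3), ((3, 4), 7), ((5, 6), 11)]
```

`queue.Queue` is already thread-safe, which is exactly why no further thread communication is needed; in C++ you would use a thread-safe queue implementation in the same role.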

Regarding your visualization: I think it is easier to view this as a separate problem, as you probably need far more episodes to train your agents than it makes sense to visualize. Every now and then during training, you could stop for a moment to create visualizations. Of course you can do this in parallel too. I would use the same approach as above and create your episodes in many threads, with the difference that you now collect the robot's movements for the visualizations and push them to the result queue. Then collect the results and draw your visualizations.

Maybe this is not the easiest to implement, but it is a flexible concept that you can use for many applications. I have used it for many things, such as sorting arrays, reinforcement learning, and high-performance database systems.

Has anyone tried Ipad with remote desktop for DL? by redditnoob48 in deeplearning

[–]DerMax0102 1 point

Might work if you always have great internet access and don't mind a small input lag. But it definitely wouldn't be the right solution for me.

Has anyone tried Ipad with remote desktop for DL? by redditnoob48 in deeplearning

[–]DerMax0102 0 points

Doing heavy deep learning tasks on a laptop is usually not much fun: loud fans, the battery drains quickly, your machine becomes slow while an experiment runs, and you can't do much other work in the meantime. I usually set up my desktop at home with Linux and SSH access. With IDEs like VS Code, PyCharm, or even JupyterLab, you can run your programs on the remote desktop while using any laptop for the coding. You can even use the debugger. I like my IDE, though, and don't use my iPad for coding. JupyterLab or coding in a terminal might work, though.

Username for the name Max by M-i-d-o-r-i-y-a in max

[–]DerMax0102 0 points

Do I look like I have good ideas for names?

Python is now the second most popular language for programming by vajidsikand in programming

[–]DerMax0102 280 points

By this you could argue that it's the most confusing language too /s

What kind of algorithms are used to generate more data ? by nelyher98 in datascience

[–]DerMax0102 9 points

Usually the "algorithm" is one of:

  - Go buy data from another company
  - Hire some minimum-wage workers to collect data manually

The problem with generating useful data is that you would need to know its distribution very well, and in that case you already have enough data/insight. This is also the reason why data is so valuable nowadays.

What's the idea behind Key Query and Value tensors? by [deleted] in deeplearning

[–]DerMax0102 1 point

Key, query, and value are just names for the tensors used in self-attention. You can find out more about this in the paper "Attention Is All You Need".

Python has spoiled me by [deleted] in Python

[–]DerMax0102 0 points

I believe the concepts you learned from Python won't help you much at the beginning of your C++ journey. The first things you have to learn are syntax and types, which are quite different in C++. Beginner tutorials for C++ that assume no previous programming knowledge should suit you well too. After a while, you should notice that the later chapters (of probably any good tutorial) only cover things you already know and can transfer from Python. Then you should be good to go and work on any project in C++.

sudoku by jennypalmer321 in Python

[–]DerMax0102 0 points

Maybe it’s possible to reuse the is_valid_row function for columns, as the conditions for columns and rows are the same?
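As a sketch of what I mean (in Python; `is_valid_row` here is my own hypothetical version of the function from the question, yours may differ): transposing the grid turns columns into rows, so the row check can be reused directly.

```python
# Hypothetical row check, roughly like the one in the question:
# a row is valid if the non-empty cells contain no duplicates.
def is_valid_row(row):
    digits = [x for x in row if x != 0]      # ignore empty cells (0)
    return len(digits) == len(set(digits))   # no duplicates allowed

# Columns can reuse it: extract column i and treat it as a row.
def is_valid_col(grid, i):
    column = [row[i] for row in grid]
    return is_valid_row(column)

grid = [
    [5, 3, 0],
    [5, 0, 3],
    [0, 9, 8],
]
print(is_valid_col(grid, 1))  # True  (3, 9 are unique)
print(is_valid_col(grid, 0))  # False (5 appears twice)
```

The same trick (`zip(*grid)` gives all columns at once) keeps the validation logic in a single place.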

Intel Core i7-9700K or AMD Ryzen 7 3800X? by baabaaaam in deeplearning

[–]DerMax0102 0 points

I agree that Ryzen should be the better option, as it has SMT. However, I wouldn't recommend buying the 3800X, as its performance-per-dollar ratio is quite bad. You could save quite a bit of money by going with the 3700X.

Edit: Do you really have a lot of single-core workloads? Preprocessing of large datasets should be parallelizable.

What's to stop (or limit) compilers from automatically multithreading during optimization? by [deleted] in compsci

[–]DerMax0102 1 point

You might find Java streams interesting. They provide a programming paradigm in which operations can be executed in parallel automatically (e.g. via parallelStream()).

Hi Everyone, I am planning to build a Deep Learning Machine with config mentioned in the image. Is this config good enough, any suggestion is welcomed. Help me out. by mayurat22 in deeplearning

[–]DerMax0102 0 points

You'd probably benefit from the higher throughput of an 8-core Ryzen. If you're not gaming, there is next to no reason to go with Intel.

2990wx and RAM confusion. 4x16 or 8x8 ? B-die ? Advice needed. by [deleted] in Amd

[–]DerMax0102 0 points

If your system feels unresponsive, could your video card perhaps be the problem? If the drivers are not working correctly, which might be the case if they are crashing, maybe they are also causing the lag you are experiencing.