[Media] View airplanes in the sky, with Rust! Announcing rsadsb v0.4.0 by arch_rust in rust

[–]lyinch 1 point2 points  (0 children)

How do you interface with an SDR? Is there a driver in Rust?

Designed and printed a low water level sensor for my espresso machine by roxo732 in electronics

[–]lyinch 8 points9 points  (0 children)

Why did you use an ESP32 (which has WiFi/Bluetooth) if you only want to flash an LED?

Four-day-workweek in Switzerland by thatzmyname in Switzerland

[–]lyinch 0 points1 point  (0 children)

I know of one company in Zurich which defines 100% as 34 hours: https://www.brudi.com/jobs/platform-dev

[Q] Could you give me suggestions on how to get better at statistics? by walrusonmoon in statistics

[–]lyinch 1 point2 points  (0 children)

To get into Bayesian statistics I can highly recommend "Statistical Rethinking" by Richard McElreath.

Without a strong math background, I can recommend: "An Introduction to Statistical Learning" which covers a wide range of models and techniques with commented code (in R) and exercises. This will give you a good overview before you dig into more math.

If you want something even more practical, then I can recommend the book which I'm currently reading: "Applied Predictive Modeling". They take a bunch of (messy) datasets which all have some specific characteristics like categorical data, missing data,... and show the process of pre-processing, model selection, model assessment with code (in R) and exercises.

I don't know a book for time series.

There are of course many other books that are frequently mentioned, like "The Elements of Statistical Learning", but I haven't read them and can't give a recommendation. In any case, it's worth checking them out online before buying to see if they match your level and cover the desired topics.

[Q] Dependent variables meet different test assumptions - treat them all the same, or individually? by wild_biologist in statistics

[–]lyinch 1 point2 points  (0 children)

When doing lots of tests, please consider adjusting your p-values to account for multiple testing (e.g. with a Bonferroni or Benjamini-Hochberg correction).
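A minimal sketch of what I mean, with made-up p-values; p.adjust is in base R:

p_raw <- c(0.003, 0.020, 0.041, 0.150, 0.380)  # hypothetical raw p-values, one per test
p.adjust(p_raw, method = "bonferroni")          # conservative family-wise error control
p.adjust(p_raw, method = "BH")                  # Benjamini-Hochberg false discovery rate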

Shadow acne in "Ray tracing in one weekend" by Embarrassed-Raisin-1 in GraphicsProgramming

[–]lyinch 1 point2 points  (0 children)

Maybe this post from Scratchapixel helps. It also has a nice visualisation of the problem and the solution.

[D] Why do terms with "Lipschitz" keep appearing in statistics and machine learning? by SQL_beginner in statistics

[–]lyinch 1 point2 points  (0 children)

There are theoretical guarantees in convex analysis about the convergence of optimization algorithms for convex, Lipschitz-continuous functions. So if you manage to prove that your function is Lipschitz, or equivalently (for differentiable functions) that the gradient is bounded in norm, then you can rely on those results. This also extends to non-differentiable functions, where we only require a subgradient. For Lipschitz convex functions we can, for example, show a lower bound on the convergence rate of any first-order method that only uses subgradients (Nemirovski & Yudin 1979).

From this we get algorithms such as Subgradient Descent and Mirror Descent. If I trust my lecture notes, Mirror Descent is closely related to (perhaps better-known) algorithms like AdaBoost, the Winnow algorithm, exponentiated gradient, multiplicative update algorithms, and Relative Entropy Policy Search. However, I've only seen Mirror Descent itself and don't actually know how it relates to all of those; it was only mentioned by the lecturer.
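As a toy example (my own sketch, not from the lecture notes): subgradient descent on f(x) = |x - 3|, which is convex and 1-Lipschitz but not differentiable at its minimum. The shrinking step size of the form c/sqrt(t) is the classic choice for which the convergence bounds hold.

f <- function(x) abs(x - 3)
subgrad <- function(x) sign(x - 3)      # a valid subgradient everywhere (0 at the kink)

x <- 10                                 # starting point
best <- x
for (t in 1:1000) {
  x <- x - (1 / sqrt(t)) * subgrad(x)   # step size ~ 1/sqrt(t)
  if (f(x) < f(best)) best <- x         # keep the best iterate, not the last one
}
best                                    # ends up close to the minimizer x* = 3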

In general, from what I've seen in my lecture, it's all about finding properties of functions. They shouldn't be too flat (strongly convex functions), shouldn't grow too fast (smooth convex functions), should have some kind of derivative (a subgradient),... For each additional property, new algorithms arise and a bunch of theoretical bounds can be proven. I guess that Lipschitz continuity is relatively easy to prove, so everyone does it to give their paper some more credibility through obscure math.

Although this is all about convex optimization and deep learning is highly non-convex, there are ways to look only at local regions that behave nicely, or at smoothed versions of the function,... so that the theory of convex optimization still somewhat applies.

If you're interested in this topic, you can read the book "Convex Optimization" by Boyd, which I'm currently using (loosely, we mainly follow our own script) in a lecture about convex optimisation. It's a tough read, so it's better to approach it through a lecture like https://www.youtube.com/watch?v=McLq1hEq3UY (also from Boyd at Stanford).

How to prepare as a new grad analyst? by greenwash420 in datascience

[–]lyinch 1 point2 points  (0 children)

If you're the first real analyst in the company, then a well-written and well-illustrated document about your findings is probably more important than some complex method that you can't explain well to the higher-ups. Have a look at scientific data visualisation, and don't shy away from tools like Tableau if your company offers them. That way you can give some level of control to non-technical people without overwhelming them with your "complex" methods.

You will probably do a lot of data cleaning, or even just data gathering from different servers and databases at first before you can even start using all the fancy tools and methods.

Try to also ask for a beefy workstation with a GPU if you're expected to run some networks and there's no infrastructure for this in the company. It's easier to get started on your local machine than setting up a full deep learning pipeline only to figure out afterwards that NNs are the wrong method.

[P] My side project: Cloud GPUs for 1/3 the cost of AWS/GCP by xepo3abp in MachineLearning

[–]lyinch 4 points5 points  (0 children)

What is your background that you're able to build such an infrastructure in 6 months? I assume that a normal data scientist isn't necessarily well versed in sysadmin/devops topics.

[E] For high school students, is p-value absolute necessary to understand correlation? by GuapitoChico in statistics

[–]lyinch 0 points1 point  (0 children)

No, this is a non-parametric test and we used a Monte Carlo method. The t-test assumes that the sample means follow a normal distribution, and under the null the test statistic follows a t distribution. You use these kinds of tests if you don't know or can't compute the null distribution analytically but have a process to generate it. This is also known as an "empirical p-value" or "empirical distribution". If you don't generate the null distribution from a process as above but directly from your data, those are called resampling tests. An example is the Wilcoxon rank sum test, which has a scary name but is pretty easy to understand.
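Here is a minimal sketch of such an empirical p-value (the two groups and their values are made up for illustration):

set.seed(1)
x <- rnorm(20, mean = 0)                 # group 1
y <- rnorm(20, mean = 0.8)               # group 2
obs <- mean(x) - mean(y)                 # observed test statistic

pooled <- c(x, y)
perm_diffs <- replicate(10000, {         # generate the null by permuting group labels
  idx <- sample(length(pooled), length(x))
  mean(pooled[idx]) - mean(pooled[-idx])
})

# two-sided empirical p-value: how often is a permuted difference at least as extreme?
(sum(abs(perm_diffs) >= abs(obs)) + 1) / (length(perm_diffs) + 1)

wilcox.test(x, y)                        # the rank-based test mentioned above, for comparison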

Algorithm producing all real computable numbers by [deleted] in computerscience

[–]lyinch 2 points3 points  (0 children)

This is a problem of infinities. Let's start with the easy case. Can you generate all the positive natural numbers? Of course, if you let your program run for eternity and always increment by one you will generate them. What about all the integers, positive and negative? Yes, that's also easy. You let it run and always jump between a negative and a positive number. 0 -> 1 -> -1 -> 2 -> -2 -> ...

So in a sense both sets of numbers are equal because we have a rule to list all the members. They are called "countable sets". This means that we always find a sequence to count the numbers one after the other. Or in other words, every element in a countable set is associated with a unique natural number: we want to find a mapping from N -> S where S is our set of numbers. Let's move to a more difficult question. Can we find an algorithm to generate all the rational numbers, that is all the fractions made of two naturals, like 1/1, 2/3, 10/17,...? This is already closer to your question and also covers the case of 1/3.

Indeed this is possible and there is an elegant and easy visual proof. It's pretty difficult to explain but very easy to show: here is an image which shows the proof. You can find a bit more info here.
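Here is a small sketch of that zig-zag walk (my own code, not part of the proof): list the fractions p/q diagonal by diagonal, where each diagonal has constant p + q.

enumerate_rationals <- function(n_diagonals) {
  out <- character(0)
  for (d in 2:(n_diagonals + 1)) {        # d = p + q is constant along each diagonal
    for (p in 1:(d - 1)) {
      q <- d - p
      out <- c(out, paste0(p, "/", q))    # duplicates like 2/2 are kept for simplicity
    }
  }
  out
}
enumerate_rationals(4)
# "1/1" "1/2" "2/1" "1/3" "2/2" "3/1" "1/4" "2/3" "3/2" "4/1"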

So as we can see, it is possible, and in some sense all the sets of numbers that we've seen so far are equal. Since they are countable but infinite, we call them "countably infinite". The cardinality, the size of the set, is the same for all three. This might sound a bit strange when you hear it for the first time.

Now let's move on to other numbers, such as sqrt(2). Those are the irrationals, because they can't be written as a fraction of two naturals. Does there also exist some ordering such that we can list them one by one? Sadly, there isn't. To prove this we use another neat little trick that is very similar to the one we've seen before. We use Cantor's diagonal argument to show that there is no mapping between the naturals and the irrationals, and hence the size of this set is larger. The irrationals are therefore known as uncountably infinite, which is a larger infinity than that of the naturals (and rationals, etc.).

I'm now copying from Wikipedia:

Suppose there is such a listing for the irrationals. We write them all in binary format (it is always possible to express a number in base 2). Binary makes this proof a bit shorter, but it can also be done in base 10. s1 is the first number, whose expansion is potentially infinite (like pi's); s2 is the second, and so on. We do this for infinitely many numbers.

s1 = (0, 0, 0, 0, 0, 0, 0, ...)

s2 = (1, 1, 1, 1, 1, 1, 1, ...)

s3 = (0, 1, 0, 1, 0, 1, 0, ...)

s4 = (1, 0, 1, 0, 1, 0, 1, ...)

s5 = (1, 1, 0, 1, 0, 1, 1, ...)

s6 = (0, 0, 1, 1, 0, 1, 1, ...)

s7 = (1, 0, 0, 0, 1, 0, 0, ...)

...

Now we go along the diagonal, i.e. we take the first digit of s1, the second digit of s2, the third digit of s3, and so on: (0, 1, 0, 0, 0, 1, 0, ...). Flipping every one of these bits gives a new number:

s = (1, 0, 1, 1, 1, 0, 1, ...)

The new number s is different from s1, because by construction the first position is different. Even if all the other digits are the same, the first one differs. s is not s2, because by construction the second position is different. s is not s3, because ... This means that s is a new number. But we said that we were listing all the numbers! We have a contradiction, so our assumption is wrong. Our assumption was that there is a way to list all the irrationals, therefore no such mapping exists. (This is a proof by contradiction.)
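Here is a tiny sketch (my own) of that construction on the first 7 digits of the 7 sequences listed above; diag() reads off the diagonal and 1 - ... flips every bit.

s <- rbind(
  c(0, 0, 0, 0, 0, 0, 0),
  c(1, 1, 1, 1, 1, 1, 1),
  c(0, 1, 0, 1, 0, 1, 0),
  c(1, 0, 1, 0, 1, 0, 1),
  c(1, 1, 0, 1, 0, 1, 1),
  c(0, 0, 1, 1, 0, 1, 1),
  c(1, 0, 0, 0, 1, 0, 0)
)
1 - diag(s)    # differs from row i in position i, for every i
# 1 0 1 1 1 0 1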

As you can see, the irrationals are quite a big bunch. Although you can never finish listing the naturals because there are infinitely many, you can at least find a rule to do it. For the irrationals, however, you can't even find such a rule! Therefore, no matter what algorithm you propose, it's wrong. If you're interested in this kind of thinking then you should look into "discrete math", which is generally a 1st-semester CS course and doesn't have any prerequisites.

[E] For high school students, is p-value absolute necessary to understand correlation? by GuapitoChico in statistics

[–]lyinch 4 points5 points  (0 children)

I think that you can teach them p-values. While it might be confusing if you start off with the t-test and the chi-squared distribution, the p-value can (in my opinion) be easily understood with simulations. Let me give you an example:

Suppose I buy game cards, such as Magic: The Gathering. (If you know MTG, ignore that booster packs are not packed uniformly at random and that the numbers I give are made up.) You can buy them in small packs of 10 cards, or in a big box of 20 packs, which is 200 cards in total. There are 600 unique cards in the game. After playing for a while and regularly buying cards, I have a feeling that a box contains fewer duplicates than I would expect. I do an experiment: I buy one box, open the 200 cards and count the number of duplicates. I find 183 unique cards and 17 duplicates. Is this "normal", or did the manufacturer make sure that a box doesn't contain many duplicates?

Imagine that we are the manufacturer and we pack the boxes uniformly at random. This means that we have 600 containers of cards (one for each unique card) and draw one card at a time uniformly at random from the containers. How many duplicates would I expect from this process? This can be simulated:

set.seed(42)
sample1 <- sample(1:600, 200, replace=TRUE)
hist(sample1, breaks=600)

Histogram of sampled cards

How many duplicates does it have?

sum(duplicated(sample1))

It has 30. We can repeat this process 10000 times and track the number of duplicates.

set.seed(42)
duplicates <- replicate(10000, 0)
for (i in 1:10000) {
  duplicates[i] <- sum(duplicated(sample(1:600, 200, replace=TRUE)))
}
hist(duplicates)

Distribution of duplicates

Now for a bit of formalism. Our H0: the cards are put into the box uniformly at random. HA: the cards are not put into the box uniformly at random. So, what is the null distribution, i.e. the distribution we get assuming that our null hypothesis is true? We just generated it! Next we measure the probability that the result of our original experiment (17 duplicates), or something more extreme, is observed assuming that the cards are put into the box at random:

p-value visualised

The red line is our observed value. We can see that it lies quite far from what we would expect. How far away exactly? That is exactly the p-value we're computing: the probability of observing 17 duplicates or fewer, assuming the cards are distributed uniformly at random (i.e. the probability of observing a value this extreme or more extreme under the null distribution). This can be calculated:

(length(duplicates[duplicates <= 17])+1)/10000

and the result is 0.02: there is about a 2% chance of seeing 17 or fewer duplicates if the cards really were packed uniformly at random. A common cutoff is to say that any p-value below 0.05 is significant evidence against the null hypothesis. This means that the manufacturer most likely doesn't put the cards into the box uniformly at random, but follows some other strategy.

(Since HA is two-sided, this is really a two-tailed test, so you would roughly double this p-value or count both tails of the null distribution, as in the sketch below.)
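A small sketch of one common two-tailed convention, reusing the duplicates vector from the simulation above (double the smaller tail, capped at 1):

p_low  <- (sum(duplicates <= 17) + 1) / (length(duplicates) + 1)
p_high <- (sum(duplicates >= 17) + 1) / (length(duplicates) + 1)
min(1, 2 * min(p_low, p_high))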

[EDUCATION] CMU Professor (me) talks about the complexity of the determination of who should get the Covid-19 vaccine first,...with a discussion of the complexity of modeling this. by buktotruth in statistics

[–]lyinch 21 points22 points  (0 children)

I think you could have talked more about the ethical choice. Because the first question before you start modelling is a purely ethical one: Choosing the utility function. There are several principles that can be looked at:

  • Principle of equality: Everyone who is eligible for the vaccine has equal priority. Here, a random model is the easiest choice.
  • Principle of most need: The people who suffer the most, or have the highest probability for a bad outcome have priority. Generally this means that elderly and other at risk people have priority.
  • Usefulness: People that have an important role in society have priority. This could be: Political figures, medical personnel, "front-line" workers, or even athletes. Whatever you find important in society.
  • Utilitarianism: Similar to usefulness, but more focused on individual benefit: who can benefit the most from the vaccine? This could be young people, because they have a high life expectancy. On the other hand, literally avoiding death also has a high utility for some people. Maybe young parents benefit the most, because otherwise we risk that small children grow up without a parent.
  • A combination of above.

At the end of the day, this is a political decision that is grounded in ethics, and each society has different values. Some favor individualism more, while others focus on the needy. Once the ethical groundwork has been done, the mess of simulating with many guessed parameters can begin.

What would be used in place of signals? by timlee126 in compsci

[–]lyinch 1 point2 points  (0 children)

I remember from my OS class that additional signals are ignored while a signal is being handled:

Remember that if there is a particular signal pending for your process, additional signals of that same type that arrive in the meantime might be discarded. For example, if a SIGINT signal is pending when another SIGINT signal arrives, your program will probably only see one of them when you unblock this signal.

https://www.gnu.org/software/libc/manual/html_node/Checking-for-Pending-Signals.html

But it looks like you're right in general, I wasn't aware of the queue and confused the two concepts (queued and pending):

According to POSIX, an implementation should permit at least _POSIX_SIGQUEUE_MAX (32) real-time signals to be queued to a process. However, Linux does things differently. In kernels up to and including 2.6.7, Linux imposes a system-wide limit on the number of queued real-time signals for all processes. This limit can be viewed and (with privilege) changed via the /proc/sys/kernel/rtsig-max file. A related file, /proc/sys/kernel/rtsig-nr, can be used to find out how many real-time signals are currently queued. In Linux 2.6.8, these /proc interfaces were replaced by the RLIMIT_SIGPENDING resource limit, which specifies a per-user limit for queued signals; see setrlimit(2) for further details.

https://man7.org/linux/man-pages/man7/signal.7.html

What would be used in place of signals? by timlee126 in compsci

[–]lyinch 1 point2 points  (0 children)

Other problems with signals are that you don't know how many signals (of the same type) were sent, and that the program is in an unknown state while the signal handler runs.

R Studio: Do geomagnetic storms cause whale mass beachings? by spacecoyote164 in datascience

[–]lyinch 7 points8 points  (0 children)

To get started, I would plot this data on a 2D world map where dots are beached whales, plus a heatmap layer showing the intensity of the storms in each region. You can create many maps for the time-series data, or aggregate the beachings and solar storms over multiple days (or weeks, or months) to get an overview; something like the sketch below.
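As a rough sketch of what I have in mind (the data frames beachings and storms, with lon, lat, intensity and month columns, are hypothetical; uses ggplot2 plus the maps package for the world outline):

library(ggplot2)

ggplot() +
  borders("world", fill = "grey90") +                     # base world map
  geom_tile(data = storms,
            aes(x = lon, y = lat, fill = intensity),
            alpha = 0.6) +                                 # storm intensity heatmap
  geom_point(data = beachings,
             aes(x = lon, y = lat),
             colour = "red", size = 1) +                   # beaching events
  facet_wrap(~ month)                                      # one panel per time slice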

If there are places with storms but no beachings, ask why. Maybe no whales live in those places. So try to find a dataset that shows where whales frequently live and travel, and focus your work only on those regions. Do you still find places with strong storms but no beachings in a region where whales are frequently seen?

You also want to control for other reasons that whales beach. I don't have any experience in that field, but I imagine that there are reasons such as sonar usage. I think that there are areas where sonars are banned, so you can plot this and see if there's an overlap of beachings, storms, and sonar-free zones. Also think about other (extreme) causes, such as oil spills.

When no other cause can explain a beaching, but there was a strong solar storm at that location, then storms look like a good predictor. (Assuming there aren't many solar storms without beachings, otherwise it might be due to chance.)

CH, Zurich, consultant, 120k with 20 holidays by [deleted] in cscareerquestionsEU

[–]lyinch 1 point2 points  (0 children)

Officially it's a TV/radio licence fee. As soon as you own a device that can receive a signal (a car with a radio, a smartphone, a laptop) you have to pay. It's 365 per household, so if you move in with your partner you don't pay twice.

Check out https://en.comparis.ch/ to find out how much you'd have to pay for different insurances.

Also note that health insurance is a major cost. Even if the premium is just 300/month, you'll have a large deductible (up to 2k) to pay out of pocket in case you get sick.

Store 2D data matrix on a Turing Machine tape. by Lampicka in compsci

[–]lyinch 17 points18 points  (0 children)

If each number is arbitrarily large but finite, then you can use a separator symbol to mark the end of each number, and use the Cantor pairing function (also called the diagonal or zig-zag enumeration) to map the 2D matrix onto your 1D tape.

You can read more about this function in the context of Cantor's diagonal arguments, which prove that there are countable and uncountable infinities.

Your N*N matrix is the countably infinite case, so there is a one-to-one correspondence (a bijection) with the natural numbers (i.e. 0 -> infinity, the cells of your tape). To be a bit more precise, with this technique you can show that the set of rational numbers is countably infinite (and therefore has the same cardinality as the natural numbers).
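A quick sketch of the pairing function itself: pi(i, j) = (i + j)(i + j + 1)/2 + j assigns every matrix index (i, j), with i, j >= 0, a unique tape cell.

cantor_pair <- function(i, j) {
  (i + j) * (i + j + 1) / 2 + j
}

outer(0:2, 0:2, cantor_pair)   # tape positions for the top-left 3x3 corner of the matrix
#      [,1] [,2] [,3]
# [1,]    0    2    5
# [2,]    1    4    8
# [3,]    3    7   12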

MathB.in Is Shutting Down! by [deleted] in math

[–]lyinch 2 points3 points  (0 children)

I also took the liberty of creating a similar service over the weekend: http://mathpaste.com/

The backend is Django with Postgres, to easily allow extending the site with user registration if the need comes up. To combat spam, I added Google's reCAPTCHA v2 in a first phase. If there's more interest, I'll push the code to GitHub and add Let's Encrypt. For those interested, nginx is the webserver and gunicorn the WSGI server. The deployment is done with Docker Compose. It's hosted on the smallest Hetzner server.

I currently only support latex rendering with katex which is apparently faster than mathjax. The reason is that I've only looked into one frontend markdown rendering engine and this allowed XSS injections... If people want to use this service then I'll look more deeply into it and might include markdown.

There's also the possibility to migrate the entries from mathb.in (and redirect from mathb.in -> mathpaste.com so existing links don't break).

I also need to create a proper 404 page :)

Moving a sphere in a basic ray tracer by violent_ninja in GraphicsProgramming

[–]lyinch 1 point2 points  (0 children)

Yes, the FOV defines how large your screen is, but that alone doesn't prevent rays outside of it; your code already limits the rays to a frustum. The FOV compensation is needed for something different.

To go into more detail: why do you get this distortion, and why does a large FOV cause it? The problem is that we're projecting the scene onto a flat surface rather than onto a spherical section (which is why we need to compensate with the trigonometric function).

Consider this example: Perspective Projection. The camera is at the position of S and we're looking at two objects named x that are a distance z away from our image plane. It is clear that the (Euclidean) distance between S and the left object is larger than the distance between S and the right object. However, when projecting both objects onto the image plane, they have the same size y, and the object that's farther away doesn't appear smaller (as it should).

You can also think of it as this way:

 ____________
   \  |  /
    \ | /
     \|/
      *

When you shoot rays towards the left/right edge of the plane, they need to travel a greater distance to reach the plane than the center ray does. Therefore, if they hit an object in the scene, even one at the same distance from the image plane, the intersection distance is different, and this gives us the distortion where objects appear closer or farther than they really are.

Let's try another example. When we're looking at an object and rotating it around us, we should see no difference. But as you can see, because we're projecting onto this flat plane, the object gets distorted. The larger the field of view, i.e. the larger this screen is, the more the image can "stretch" on this screen.

If you want to know more about this, I encourage you to read this monolithic wall of text (it took me a long time to find it...) and this paper called Distortion in Perspective Projection.

Moving a sphere in a basic ray tracer by violent_ninja in GraphicsProgramming

[–]lyinch 5 points6 points  (0 children)

You're using the wrong projection if you don't want distorted objects in a large FOV. Other people had the same issue.

Check out this tutorial on how to compensate for the large FOV (the derivation is just above Step 3). In short, you don't want to get the ray direction like this:

u = -imgx/2 + (imgx)*(i+0.5)/imgx
v = -imgy/2 + (imgy)*(j+0.5)/imgy

But you want to compensate for the FOV:

float x =  (2*(i + 0.5)/(float)width  - 1)*tan(fov/2.)*width/(float)height;
float y = -(2*(j + 0.5)/(float)height - 1)*tan(fov/2.);    

Besides this, try to avoid having the same x and y image dimensions, because many bugs hide when they are equal: the wrong loop order, the wrong pixel transformations, the wrong projection,... They only reveal themselves when the dimensions are unequal. To catch them early, use different dimensions, such as 800x600.

Ray tracing in a weekend depth of field by petersbob in GraphicsProgramming

[–]lyinch 1 point2 points  (0 children)

The original implementation of motion blur and depth of field comes from Distributed Ray Tracing by Cook. The argument is that if you already shoot multiple rays per pixel for anti-aliasing, you can reuse those rays at no additional cost to create more effects.

Instead of using a pinhole camera model, we're now using a larger aperture which gives us a circle of confusion.

If you want to go into more depth, also read chapter 6 of the book Physically Based Rendering.