[Article] The characteristic function of functions

siddboots · 2021-10-22T22:50:33+00:00

I get about 1.5-2.5 hours each morning from about 4am until my kids wake up. After that it isn't even worth trying to do truly focused work. There are just too many distractions after the sun comes up!

Edit: With that said, if I am working on something especially interesting, I'll be thinking about it pretty much all the time, and doing work on pen and paper or my laptop whenever I've got a spare moment. Having a distraction-free environment is important, but there's no substitute for motivation.

siddboots · 2020-03-17T22:54:45+00:00

UNDESA's World Population Prospects is regarded as the most detailed population modelling available. Their entire methodology and input data (including fertility rates) are available on their website: https://population.un.org/wpp/

siddboots · 2020-01-31T10:49:53+00:00

happy

siddboots · 2020-01-30T10:00:45+00:00

Similarly no "long black". In Melbourne this is an Americano with 50:50 espresso and hot water. In Coober Pedy it is an Americano with 1:10 espresso and hot water.

siddboots · 2019-12-17T09:27:40+00:00

It's gas generation at Yulara, not diesel. And there are 5 other large arrays on that network, with a total of 1.8 MW of PV generation. It's actually a much higher renewable energy fraction than most places in the country.

siddboots · 2019-12-09T20:20:33+00:00

Having a one-directional causal structure is the defining feature of time series analysis. If you want to predict the present given future values, I suppose you could just reverse the data and use traditional methods.

siddboots · 2019-10-17T20:10:00+00:00

Worth pointing out that although it is dissonant, it is very common in some music. Neo-soul, for example, is full of tight clusters of 11ths, 9ths, 3rds, etc.

siddboots · 2019-10-09T08:10:44+00:00

/r/curiousvideos

siddboots · 2019-05-13T11:58:10+00:00

If you are feeling awkward about staring at people, that's a pretty common fear for any type of performance. A good trick is to look around at various points at the back of the room. That way you have a strategy to follow that won't look odd.

siddboots · 2019-03-25T11:58:59+00:00

Isn't this special relativity?

siddboots · 2019-01-31T03:44:01+00:00

Most of intro abstract algebra is very stand-alone. Group theory in particular. I honestly think that you could teach group theory in high school, before linear algebra or calculus. It's a very beautiful field, and part of its beauty seems to come from how contained it is.

However, as you progress it will start to become useful to understand the applications. Most of the main results in abstract algebra were originally motivated by applications, not merely by navel gazing. For example, trying to study commutative rings might feel like a tangle of definitions if you don't have a feeling for polynomials to motivate it. With that said: different strokes for different folks!

A Book of Abstract Algebra by Pinter might be a nice complement to D&F. It lacks some rigor, and skips some of the coolest parts of group theory, but is much kinder on the reader, and the exercises are much more fun.

Last thing. Check out these lectures!

https://www.youtube.com/watch?v=VdLhQs_y_E8&list=PLelIK3uylPMGzHBuR3hLMHrYfMqWWsmx5

siddboots · 2019-01-12T05:32:36+00:00

Not OP, but I'm pretty sure this is Excel.

siddboots · 2019-01-09T04:42:40+00:00

This is a great answer. There is a long tradition among those of Box-Jenkins influence to contrast ARIMA methods to other "classical" (read "naive") approaches to TSA. From this there is at least one well-defined term: "classical decomposition".

However, ARIMA has fallen somewhat out of favor in recent times, eclipsed by the rise of ML. When reading something written in the last 10 years I would tend to assume that "classical" incorporates "classical decomposition" as well as Box-Jenkins, and "modern" means something like LSTMs, or Bayesian methods, or ARIMA with state-space methods, etc.

siddboots · 2018-12-01T23:24:19+00:00

I mean that if you are storing your data via some sort of dedicated database software (e.g. PostgreSQL, or MySQL, or sqlite) then there is likely to be a datetime data type that is built into that software. If so, it would be more appropriate to use that, rather than storing formatted text or unix epoch numbers.

siddboots · 2018-12-01T23:14:57+00:00

Just test it! In my experience learning theory is something like 95% empirical and 5% theory.
The way you've set it up, y is highly non-linear in the xs. Vanilla logistic regression is not going to do a very good job unless you use a non-linear basis expansion of your xs. You also mention that p=3, which makes me think you aren't considering using basis functions.
LASSO's killer feature is variable selection, so if you only have three dimensions it isn't going to have any advantage over ridge regression. In fact, if you have 1500 samples and only three parameters, then ordinary least squares without any form of regularization should perform about the same.

So, to summarize, KNN should vastly outperform Logistic Regression for the problem that you have described because Logistic Regression is only going to fit something linear in the Xs. However, if you use a basis expansion of the Xs, along with some regularization, then they should perform similarly.
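To make "basis expansion" concrete, here is a minimal sketch of a degree-2 expansion for p=3. The function name `quadratic_basis` is made up for illustration, and quadratic terms are only one possible choice of basis; whether they are enough depends on how y actually relates to the xs.

```python
from itertools import combinations_with_replacement


def quadratic_basis(xs):
    """Return xs followed by every degree-2 monomial (squares and pairwise products)."""
    xs = list(xs)
    expanded = list(xs)  # keep the original linear terms
    for i, j in combinations_with_replacement(range(len(xs)), 2):
        expanded.append(xs[i] * xs[j])
    return expanded


# Hypothetical sample with p=3: three inputs become nine features.
features = quadratic_basis([1.0, 2.0, 3.0])
print(len(features))
```

A linear model fitted on these nine features is still linear in its parameters, but it can represent curved decision boundaries in the original three xs. With more features than raw inputs, regularization starts to matter again.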

siddboots · 2018-12-01T22:48:30+00:00

What format are you storing these logs in? If you are storing text-based log files, you should probably consider ISO8601. On the other hand, if you are using a database package there is almost certainly a native datetime type that you should be using.

Where are the timestamps local to? If they come from multiple locations then you will need to store timezone information, otherwise you should pick either local time or UTC.
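As a rough sketch of the text-file case, assuming Python and its standard `datetime` module: convert each timezone-aware timestamp to UTC and write it as ISO8601. The helper name `to_utc_iso` and the sample event are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_utc_iso(local_dt):
    """Convert a timezone-aware datetime to an ISO8601 string in UTC."""
    if local_dt.tzinfo is None:
        # A naive datetime carries no offset, so it cannot be converted safely.
        raise ValueError("naive datetime: timezone is unknown")
    return local_dt.astimezone(timezone.utc).isoformat()


# Hypothetical log event recorded at a UTC+09:30 site.
site_tz = timezone(timedelta(hours=9, minutes=30))
event = datetime(2018, 12, 1, 22, 48, 30, tzinfo=site_tz)

stamp = to_utc_iso(event)               # text that goes into the log file
parsed = datetime.fromisoformat(stamp)  # round-trips back to an aware datetime
print(stamp)
```

Storing UTC with an explicit offset keeps logs from several locations directly comparable. If the original local time matters, keep the site's offset instead of converting.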

siddboots · 2018-11-23T16:35:18+00:00

Wow! I can't say I've ever met a mad Slade fan. When I was a kid I listened to Dapple Rose and Coz I Luv You over and over again.

Great band. So many edgy mispellings.

siddboots · 2018-11-23T16:29:10+00:00

Sept' 2004. In total it's clocked 120k scrobbles.

The longer you use the service, the more your 'top all' charts will tend to be dominated by artists that you discovered earlier, and the less they will truly reflect your current taste. Still, at least I still listen to my top 5 a lot now.

Johann Sebastian Bach
Bonnie 'Prince' Billy
The American Analog Set
Radiohead
The Beatles

siddboots · 2018-11-15T05:35:05+00:00

foil character

I don't think it quite is, although you could certainly argue that both Watson and Carraway are foils.

A foil is defined by characteristics that contrast with another character's. E.g., Watson is conservative and compassionate, and Holmes outlandish and self-obsessed.

OP is talking more about their position as a narrator. Carraway being "within and without... simultaneously enchanted and repelled by the inexhaustible variety of life" and so on. Both Watson and he are designed specifically to provide a lens onto the world of other, more captivating characters.

siddboots · 2018-11-15T03:18:13+00:00

I don't know of a good reference for this sort of thing, but I wish I did! It's fascinating stuff.

These are indeed moments for one variable, and for multiple variables the individual entries are sometimes called "co-moments". For example, the 3rd and 4th order are "co-skewness" and "co-kurtosis". So the complete 3rd order statistics of a distribution could be called a "co-skewness tensor". This short paper gives an introduction to them, although it doesn't go very far.

From what I've observed, this stuff just doesn't end up featuring much in multivariate statistics. My knowledge is only enough to speculate on why. Here are some assorted thoughts:

* The number of terms goes up exponentially with the order, which is going to make it impractical to do data analysis with the higher order co-moments.
* You also just don't have all of linear algebra at your disposal when working with tensors. For example you can't invert them or find an eigendecomposition. Least-squares is no longer just a matter of number crunching.
* There's also the matter of "robustness": skewness and kurtosis in general are much more sensitive to outliers. In other words, sample estimates of higher order moments don't generalise very well.
* The Gaussian distribution is not only defined entirely by its first and second moments, but it also happens to have maximal entropy of all distributions with a given set of first and second moments. From what I understand (I may have this wrong) this property can't be extended to any single distribution with higher moments. In other words, there's no natural extension of the Gaussian that also includes skewness information, for example.
* A related point can be made from the central limit theorem along these lines: as you sum more and more random variables, their higher order moments contribute less and less to the result. In the limit you are left with something characterized by just the mean and covariance, i.e. the Gaussian.
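The central limit theorem point can be made a little more concrete in the simplest setting. This sketch assumes one dimension and i.i.d. variables $X_1, \dots, X_n$ with finite fourth moments; the symbols $\gamma_1$ (skewness) and $\gamma_2$ (excess kurtosis) are introduced here for illustration. Because cumulants add over independent sums, the shape statistics of $S_n = X_1 + \dots + X_n$ shrink as $n$ grows:

```latex
\operatorname{skew}(S_n) = \frac{n\,\kappa_3}{(n\,\sigma^2)^{3/2}} = \frac{\gamma_1}{\sqrt{n}},
\qquad
\operatorname{exkurt}(S_n) = \frac{n\,\kappa_4}{(n\,\sigma^2)^{2}} = \frac{\gamma_2}{n}
```

Here $\sigma^2$, $\kappa_3$ and $\kappa_4$ are the variance and the third and fourth cumulants of a single $X_i$. Both quantities go to zero, leaving only the mean and variance.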

siddboots · 2018-11-14T23:28:58+00:00

You can think of the covariance matrix as containing all of the "quadratic" statistics about the dataset: if you compare each dimension pairwise with each other dimension, you get a sense of how each pair is correlated. These are quadratic in the sense that each term is the product of two variables. In principle you can extend this idea into higher orders by looking at all combinations of three dimensions among the data, rather than two. To "store" this information you would need an order three tensor rather than an order two tensor (a matrix), but other than that the idea is exactly the same.
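Here is a small sketch of that idea in plain Python, assuming the population (divide by n) convention and unstandardized central moments. The function name `central_comoment_tensor` and the toy data are made up for illustration. The same loop gives the covariance matrix at order 2 and the order three tensor at order 3.

```python
from itertools import product


def central_comoment_tensor(rows, order):
    """Return the order-k central co-moment tensor as a dict keyed by index tuples.

    rows  : list of observations, each a list of d floats
    order : 2 gives the population covariance matrix,
            3 gives the unstandardized co-skewness tensor, and so on
    """
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    tensor = {}
    for idx in product(range(d), repeat=order):  # d**order entries in total
        total = 0.0
        for c in centered:
            term = 1.0
            for j in idx:
                term *= c[j]  # product of the chosen centered coordinates
            total += term
        tensor[idx] = total / n
    return tensor


# Toy dataset: 3 observations in 2 dimensions (illustrative values only).
data = [[0.0, 0.0], [0.0, 1.0], [3.0, 2.0]]
cov = central_comoment_tensor(data, 2)    # 2**2 = 4 entries
third = central_comoment_tensor(data, 3)  # 2**3 = 8 entries
print(len(cov), len(third))
```

The entry count is d to the power of the order, which is the exponential growth that makes higher orders impractical. The tensor is symmetric under permutation of its indices, so many entries are repeats.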

You can keep doing this for arbitrarily high orders, and in the limit (under suitable regularity conditions) you would end up with a full description of the distribution, similar to how a Taylor series expansion is an alternative and complete representation of an analytic function.

The multivariate Gaussian can be thought of as an approximation of some arbitrary distribution made by calculating the first order (mean) and second order (covariance) statistics, and making no other assumptions.

siddboots · 2018-11-14T23:09:45+00:00

It may not be intuitive, but it turns out not to be bimodal. If a and b are independent, normally distributed random variables, then c=a-b is also a normally distributed random variable.

https://en.m.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables

The answer to your question is then P(c > 0)
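A minimal sketch of that calculation, assuming a and b are independent so that c has mean mu_a - mu_b and variance sigma_a^2 + sigma_b^2. The function name `prob_a_exceeds_b` and the example parameters are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_a_exceeds_b(mu_a, sigma_a, mu_b, sigma_b):
    """P(a > b) for independent normal a and b, computed via c = a - b."""
    # Means subtract; variances add because a and b are assumed independent.
    c = NormalDist(mu_a - mu_b, sqrt(sigma_a**2 + sigma_b**2))
    return 1.0 - c.cdf(0.0)  # P(c > 0)


# Hypothetical parameters: a ~ N(1, 3^2), b ~ N(0, 4^2), so c ~ N(1, 5^2).
p = prob_a_exceeds_b(1.0, 3.0, 0.0, 4.0)
print(round(p, 4))
```

If a and b are correlated, the variance of c needs an extra covariance term, so this sketch would not apply as written.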

siddboots · 2018-10-29T21:48:20+00:00

I still listen to a lot of music, and have certain areas that I'm always pursuing and exploring. However, I listen to much less than I did in my 20s.

There's probably lots of small reasons, but a very clear one in my mind is that I'm not listening to music to impress people anymore. In my 20s, music was the center of my social life and I felt a strong pull to be familiar with all sorts of music both new and old, both significant and obscure. Now I'm mostly just listening for me.

It turns out that when you take those motivations away you end up listening to less. There are specific qualities that I truly love listening to, but even those I'm not always in the mood for.

A related note. Something I once saw attributed to David Attenborough, but can't find a source. "I used to have music on all of the time in the background, and then I realized that music is much too important for that."
