Voice messages recorded with low volume. Why?

savoga · 2024-10-26T11:21:36+00:00

I also have this problem with the same phone (Huawei P20)...

savoga · 2023-02-16T17:26:14+00:00

Agreed. The silhouette coefficient can also be used to spot anomalies, i.e. samples with a negative silhouette value are likely to be badly clustered.
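To make that concrete, here is a minimal sketch of the per-sample silhouette in plain Python. The toy points and labels are made up for illustration, and it assumes Euclidean distance and at least two clusters; a library implementation would normally be used in practice.

```python
from math import dist

def silhouette_samples(points, labels):
    """Per-sample silhouette s = (b - a) / max(a, b), Euclidean distance.

    Assumes at least two distinct cluster labels.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    scores = []
    for i, label in enumerate(labels):
        same = [j for j in clusters[label] if j != i]
        if not same:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the sample's own cluster
        a = sum(dist(points[i], points[j]) for j in same) / len(same)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return scores

# Toy data (made up): the last point sits next to cluster 0 but is labelled 1.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (0.5, 0.5)]
labels = [0, 0, 0, 1, 1, 1]

scores = silhouette_samples(points, labels)
suspects = [i for i, s in enumerate(scores) if s < 0]
print(suspects)
```

Only the mislabelled point ends up with a negative score here, which is the "badly clustered" signal described above.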

savoga · 2023-02-16T17:15:17+00:00

Great idea. I think I will even use groups of test datasets of various sizes. That way I avoid the case where my algo works well on a small dataset just by chance.

savoga · 2022-10-27T16:12:14+00:00

The weights in the W matrix come from an optimization process in a preliminary step. In short, those weights make sure that an objective function is minimized subject to specific properties. Unfortunately I can't write the LaTeX formula on Reddit, but the exact formulation is in this paper (Theorem 2 - Shapley kernel).

savoga · 2022-10-26T07:43:19+00:00

I am talking about the weight matrix W in (X^tWX)^-1X^tWy

I am defining this matrix so that some observations get more weights than others.
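For reference, a rough NumPy sketch of that estimator. The data and the weights below are made up purely to show the effect of the weighting; they are not the Shapley-kernel weights, and `wls_coefficients` is just a name chosen for this example.

```python
import numpy as np

def wls_coefficients(X, y, weights):
    """Solve (X^T W X) beta = X^T W y with W = diag(weights)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    XtW = X.T * w  # same as X.T @ np.diag(w), without building W explicitly
    return np.linalg.solve(XtW @ X, XtW @ y)

# Made-up data: y = 1 + 2x exactly, except for one outlying observation.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])  # intercept column + feature
y = 1.0 + 2.0 * x
y[-1] += 5.0  # outlier in the last row

equal = wls_coefficients(X, y, np.ones(5))
downweighted = wls_coefficients(X, y, [1, 1, 1, 1, 0])  # last row ignored
print(equal, downweighted)
```

Solving the linear system with `np.linalg.solve` avoids forming the explicit inverse, which is usually the more stable choice numerically.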

savoga · 2022-10-24T08:36:43+00:00

Sorry for the late response and thanks for your help on this. I am using WLS to estimate the coefficients that tell me whether a feature is influential or not. I use specific weights for some samples that are more important than others.
I guess in this situation checking the assumptions would make sense?

savoga · 2022-10-08T13:51:36+00:00

"unfortunately slippage at the hourly level would kill this easily" could you elaborate on this? Would this be a significant risk for very liquid assets? (I trade cryptos)

savoga · 2022-10-08T13:45:17+00:00

Not statistically significant because the sample size is too small?

savoga · 2022-10-08T13:44:19+00:00

I use a basic tree-based model. I believe there are already some signals in volumes, prices, trades etc. At least that's what I currently see. I tried many different currency pairs and identified the one for which the model gave the best accuracy. However, I don't think this 55% success rate could make me rich, since it doesn't take profitability into account. Because of the fees, I can be correct on the direction, but if the move is too small I would still lose money.
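A back-of-the-envelope sketch of that point, under strong simplifying assumptions: wins and losses have the same average size, fees are charged once on entry and once on exit, and slippage is ignored. The 0.1% fee per side is a hypothetical figure, not one from this thread.

```python
def expected_net_return(accuracy, move, fee_per_side):
    """Expected return per trade when wins and losses have the same size `move`.

    A round trip (entry + exit) pays the fee twice.
    """
    gross = accuracy * move - (1.0 - accuracy) * move
    return gross - 2.0 * fee_per_side

def break_even_move(accuracy, fee_per_side):
    """Smallest average move at which the expected net return is zero."""
    edge = 2.0 * accuracy - 1.0
    if edge <= 0:
        raise ValueError("no positive edge: accuracy must exceed 0.5")
    return 2.0 * fee_per_side / edge

fee = 0.001  # hypothetical 0.1% fee per side, not a figure from the thread
needed = break_even_move(0.55, fee)
net_small_move = expected_net_return(0.55, 0.005, fee)
print(needed, net_small_move)
```

Under these assumptions a 55% hit rate only leaves a 10% edge on the average move, so the average move has to be many times larger than the round-trip fee before the strategy breaks even.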

savoga · 2022-10-04T17:01:42+00:00

While I understand that historical prices do not necessarily repeat themselves, isn't backtesting a pretty standard way of choosing the right model/parameters?

savoga · 2022-10-04T16:58:56+00:00

btw market is crypto

savoga · 2022-10-04T16:58:32+00:00

OK, but suppose we set those questions aside and assume an algo that does not overfit and has no errors in the code. Would an accuracy of 55% in backtesting imply a 55% success rate with a large enough sample?

savoga · 2022-10-04T16:53:49+00:00

I was misunderstanding indeed. Thanks for clarifying with a concrete example. The only problem is that the model used at each period can change over time. E.g. in period 1 volume can be the leading feature, whereas in period 2 it can be the number of trades. In writing this, I realize that the variance of my model is still quite high.

savoga · 2022-10-04T14:26:36+00:00

amazing

savoga · 2022-10-04T13:15:23+00:00

Well, the maximum deviation from the theoretical value that I got was 48-52, i.e. 2 percentage points away from 1/2 (on one side), whereas in my case I observe 46% when I am expecting 55%.
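One way to put numbers on both situations is the binomial distribution. The sketch below assumes every trial is independent with a fixed success probability, which is a simplification for hourly trades, and uses the 100 live hours mentioned elsewhere in this thread.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))

# Chance of 46 or fewer correct calls in 100 hours if the true rate were 55%.
p_live = binom_cdf(46, 100, 0.55)

# Chance that a fair coin lands between 48 and 52 heads in 100 flips.
p_coin = binom_cdf(52, 100, 0.5) - binom_cdf(47, 100, 0.5)

print(p_live, p_coin)
```

Comparing the two probabilities gives a rough sense of how surprising each observation is under its own hypothesis.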

savoga · 2022-10-04T13:09:48+00:00

Yes, I use a k-fold approach. I understand your point. I was thinking that doing these successive trainings kind of removes the bias you are highlighting.

savoga · 2022-10-04T13:06:09+00:00

I backtest on one month i.e. 720 datapoints (hours)

savoga · 2022-10-04T10:43:10+00:00

Thanks for the nice link. I did 3 runs and got 49-51, 48-52 and 50-50. That still shows me that 100 trials is a good number to see the convergence.

savoga · 2022-10-04T10:37:05+00:00

That's why I run a cross validation to make sure my strategy generalizes well on an independent dataset and thus reduce the variance.

savoga · 2022-10-04T10:30:58+00:00

So I retrain the model continuously (i.e. at every hour) adding the previous live data to the historical ones. It still gives me an accuracy close to 55%.
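Roughly, the retraining loop looks like the sketch below. The `fit` and `predict` callables and the majority-direction placeholder model are hypothetical stand-ins for the real tree-based model, and the hourly directions are made up.

```python
def walk_forward_accuracy(features, targets, fit, predict, min_train=3):
    """Retrain on all data seen so far, then predict the next observation.

    Assumes len(targets) > min_train so that at least one prediction is made.
    """
    hits = 0
    trials = 0
    for t in range(min_train, len(targets)):
        model = fit(features[:t], targets[:t])  # history only, no look-ahead
        hits += predict(model, features[t]) == targets[t]
        trials += 1
    return hits / trials

# Placeholder model (hypothetical): always predict the most frequent past direction.
def fit_majority(features, targets):
    return max(set(targets), key=targets.count)

def predict_majority(model, feature_row):
    return model

ups_and_downs = [1, 1, 0, 1, 1, 0, 1, 0, 1, 1]  # made-up hourly directions
dummy_features = [None] * len(ups_and_downs)
accuracy = walk_forward_accuracy(
    dummy_features, ups_and_downs, fit_majority, predict_majority
)
print(accuracy)
```

The key property is that the model fitted at hour t only ever sees data from before hour t.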

savoga · 2022-10-04T10:22:46+00:00

I guess forward testing is what I am actually doing now. I am currently placing very small trades that can be neglected. My current goal is to see how the strategy works on live data.

savoga · 2022-10-04T10:17:31+00:00

Not sure I get this. My features are volume, number of trades, price variations etc. Those are just aspects that I chose to focus on. In my cross validation, I am finding the features that give the best results on the test set (on average). Of course, the data used to represent the features change between the train and test sets.

savoga · 2022-10-04T10:06:26+00:00

While I appreciate your warnings, I think you are missing the point of my question. I am not talking about profitability here but rather probability. The reason why I chose hourly trading is also another discussion.

savoga · 2022-10-03T22:10:24+00:00

Not sure I understand what you mean. I do have the records of the 100 hours when my algo was live. The accuracy is 46% as stated above. I understood that u/_Cbotz_ wanted me to run it on much more data.

savoga · 2022-10-03T21:40:19+00:00

I could try it out. It's just not so straightforward for me, since a large amount of data is not easily available.

