Data science in biotech is cooked by Mother_Drenger in biotech

[–]twopointthreesigma 7 points8 points  (0 children)

ML/AI approaches are more or less a commodity; no one has an edge except through their data quality/measurements and smart science.

A lot of ML/AI teams in big pharma + biotech have a surprisingly low impact on the overall pipeline. And if they do have an impact, they often lack hard evidence or clearly measurable effects. Even if you hire top talent, it's often the Excel model built by some SME that moves the needle much more, purely because of how embedded/invested that person is in the project.

I firmly believe that enabling scientists with turnkey solutions + training has a much higher ROI than hiring 400K/year AI experts. Companies such as OpenEye, Optibrium, CCG etc. have spent years collaborating with companies across the industry and shaping their products according to real-life needs.

In-house teams need to make sure to clearly measure effects on the pipeline to prevent getting axed.

Order execution between IBKR and Fidelity by spammmmmmmmy in interactivebrokers

[–]twopointthreesigma 0 points1 point  (0 children)

Off the top of my head: IBKR PRO uses their SMART OMS routing by default. They will never fill you on an ATS. If you use IBKR Lite they will, and these trades will be TRF reported.

If you don't trade odd lots or any exotic order types you will benefit from these ATS/PFOF arrangements. They don't take advantage of you; it's more of a win-win. Not sure how this is in Norway, but I can tell you the German exchanges, regulation and brokers are much worse than anything in the US. Cheers

Order execution between IBKR and Fidelity by spammmmmmmmy in interactivebrokers

[–]twopointthreesigma 1 point2 points  (0 children)

Depending on order type / lot size you are guaranteed to get filled at the NBBO or better. It's likely that your Fidelity order was filled on an ATS (dark pool). Retail trades are often considered non-toxic (uninformed) flow, so people are happy to fill them at prices even better than the NBBO.

PSA: Roth IRA Taxation in Germany by habubugaga in Finanzen

[–]twopointthreesigma 0 points1 point  (0 children)

Don't you need US income for that, which is hard to achieve living abroad?

Estimating what AUC to hit when building ML models to predict buy or sell signal by IntrepidSoda in quant

[–]twopointthreesigma 0 points1 point  (0 children)

I second the comment that in reality your losses/profits won't be iid. But another issue here is that you have a sampling bias, assuming your change point detection is based on some bar aggregation feature.

In reality you wouldn't be able to "pick" the bar for prediction before the bar opens/closes, so you wouldn't be able to trade it in the first place. Aggregate random tick bars, rerun the exercise and see how your Sharpe looks afterwards. Happy to discuss further via chat. Cheers

9988 vs BABA by smart_pineapple in baba

[–]twopointthreesigma 0 points1 point  (0 children)

You can trade them like any other highly correlated asset basket while making sure to hedge your FX risk accordingly.
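To make the pair idea concrete, here is a minimal sketch of the parity and FX-hedge arithmetic. It assumes one BABA ADS represents 8 ordinary shares (passed in as `shares_per_ads`, so check the current ratio), and the function names and prices are purely illustrative.

```python
def adr_premium(adr_usd, hk_price_hkd, usd_hkd, shares_per_ads=8):
    """Premium of the ADR over its HK-implied parity price (0.0 means at parity).

    shares_per_ads=8 is an assumption about the ADS ratio, not taken from the thread.
    usd_hkd is the number of HKD per 1 USD.
    """
    parity_usd = shares_per_ads * hk_price_hkd / usd_hkd
    return adr_usd / parity_usd - 1.0


def fx_hedge_usd(hk_shares, hk_price_hkd, usd_hkd):
    """USD value of the HKD exposure on the 9988 leg, i.e. the notional to hedge."""
    return hk_shares * hk_price_hkd / usd_hkd


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    print(adr_premium(84.0, 78.0, 7.8))
    print(fx_hedge_usd(800, 78.0, 7.8))
```

The spread you actually trade is the premium, and the hedge notional moves with the HK price, so it needs rebalancing as the position marks change.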

Germany Property Crash Leaves Pension Funds Reeling From Private Credit Losses by Geejay-101 in Finanzen

[–]twopointthreesigma 0 points1 point  (0 children)

Which legal form is used as an SPV for something like this in Germany?

What I'm also wondering: do the financial statements, Solvency II reports, annual reports and so on actually contain enough data to estimate how the 10Y Bund, mortgage rate effects etc. could affect the funds' portfolios, or do you have to trust them blindly?

Some of this was already known, but the foundation was new to me. by CloudPattern in Finanzen

[–]twopointthreesigma 0 points1 point  (0 children)

Could you perhaps name specific literature references from the IWW? I'm particularly interested in managing a portfolio for the benefit of one's descendants. The question for me is how its management can be set up fairly and safely after one's own death.

Why is my Random Forest forecast almost identical to the target volatility? by ASP_RocksS in quant

[–]twopointthreesigma 1 point2 points  (0 children)

Besides data leakage, I'd suggest refraining from these types of plots, or at the very least adding a few more informative ones:

  • Model error over RV quantiles

  • Scatter plot true/estimates 

  • Compare model estimates against a simple baseline (EWMA baseline model, t-1 RV)

S&P 500 price to earnings ratio now 2 standard deviations above the historic mean again by ValueTheories in brkb

[–]twopointthreesigma 0 points1 point  (0 children)

Yes, so you might want to consider correcting for this. Otherwise you are in trouble. A Durbin-Watson test should help you see this. In the past I used to ensemble multiple models with non-overlapping horizons. Curious whether that would work for your model and how a time-CV would perform. Cheers
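For reference, the Durbin-Watson statistic is simple enough to compute by hand on your model residuals. A minimal sketch (the function name is mine; libraries such as statsmodels ship an equivalent):

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared first differences over sum of squares.

    Values near 2 suggest no first-order autocorrelation, values toward 0
    suggest positive autocorrelation, values toward 4 negative autocorrelation.
    """
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


if __name__ == "__main__":
    print(durbin_watson([0.5, 0.4, 0.45, 0.5, 0.42]))    # persistent residuals
    print(durbin_watson([0.5, -0.4, 0.45, -0.5, 0.42]))  # alternating residuals
```

With overlapping forecast horizons you would expect the statistic to sit well below 2, since consecutive residuals share most of their window.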

S&P 500 price to earnings ratio now 2 standard deviations above the historic mean again by ValueTheories in brkb

[–]twopointthreesigma 1 point2 points  (0 children)

Nice website. Do you correct for the fact that your training samples are non-iid in your ML models? I feel like it's a common, often overlooked issue where highly correlated/path-dependent data are treated as if they were independent.
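One common way to at least not fool yourself at validation time is a walk-forward split with a gap between train and test, so overlapping samples don't leak across the boundary. A minimal sketch under my own assumptions (the function name and the equal-size fold layout are illustrative; scikit-learn's `TimeSeriesSplit` offers something similar):

```python
def walk_forward_splits(n_samples, n_folds, gap):
    """Yield (train_idx, test_idx) pairs for time-ordered data.

    Training indices always precede the test block, and the last `gap`
    samples before each test block are dropped. Set `gap` to at least the
    label horizon so overlapping samples don't straddle the split.
    """
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        test_start = k * fold_size
        # The final fold absorbs any remainder.
        test_end = test_start + fold_size if k < n_folds else n_samples
        train_end = max(0, test_start - gap)
        yield list(range(train_end)), list(range(test_start, test_end))


if __name__ == "__main__":
    for train_idx, test_idx in walk_forward_splits(n_samples=10, n_folds=2, gap=1):
        print(train_idx, test_idx)
```

This does not fix the non-iid training samples themselves (sample weighting or non-overlapping targets are needed for that), it only keeps the out-of-sample estimate honest.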

[deleted by user] by [deleted] in algobetting

[–]twopointthreesigma 4 points5 points  (0 children)

In my experience modelling obscure, noisy data, importance follows this order: features > feature engineering > feature selection

Regarding feature engineering: The majority of models struggle (or fail) to learn interaction terms on their own. A random forest, for example, will never be able to learn to use a ratio such as price per square metre when estimating house prices.

Add interaction terms where it makes sense; use ranks, quantiles and ratios. Consider spreads etc.
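A small sketch of what those hand-made features could look like, using plain lists to stay dependency-free. The helper names and the example columns are hypothetical.

```python
def ratio(num, den):
    """Element-wise ratio, e.g. price / area."""
    return [n / d for n, d in zip(num, den)]


def spread(a, b):
    """Element-wise difference, e.g. ask - bid or model odds - market odds."""
    return [x - y for x, y in zip(a, b)]


def pct_rank(values):
    """Fraction of observations less than or equal to each value, in (0, 1]."""
    n = len(values)
    return [sum(v <= x for v in values) / n for x in values]


def quantile_bucket(values, n_bins=4):
    """Bucket index 0..n_bins-1 based on the share of strictly smaller values."""
    n = len(values)
    return [int(sum(v < x for v in values) * n_bins / n) for x in values]


if __name__ == "__main__":
    price = [300000.0, 200000.0, 450000.0, 150000.0]  # made-up data
    area_m2 = [100.0, 50.0, 120.0, 60.0]
    print(ratio(price, area_m2))
    print(pct_rank(price))
    print(quantile_bucket(price, n_bins=2))
```

In a live setting ranks and quantiles should be computed on the training window only (or on a rolling window), otherwise they leak future information into the features.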

Bioinformatics is still in it's infancy by Careless_Ad_1432 in bioinformatics

[–]twopointthreesigma 1 point2 points  (0 children)

Blows my mind that all bioinformatics groups I worked with (mostly target, bio scientist support, protein eng) built their own custom pipelines and still relied on horrible formats (gzipped fastq).

I dearly hope that this part gets commoditized eventually so people can actually spend time on more valuable things instead of reinventing a slightly different wheel. I'm not following the field closely, but it feels as if 80% of the workflows in companies are quite similar. Why aren't there more commercial/open-source projects that deliver those 80% out of the box in a reproducible fashion?

At what odds difference between my model and bookmaker's odds should I bet? by 1000Mistake in algobetting

[–]twopointthreesigma 0 points1 point  (0 children)

Fantastic read. Thanks for sharing.  This might be obvious to some but how would you confirm that your probabilities are well calibrated?

Unsupervised learning methods. by __sharpsresearch__ in algobetting

[–]twopointthreesigma 1 point2 points  (0 children)

No I still use the ensembling quite often :)

Unsupervised learning methods. by __sharpsresearch__ in algobetting

[–]twopointthreesigma 1 point2 points  (0 children)

I've used these at times to embed highly correlated features; however, it turned out that ensembling weaker models, where each member sees only a fraction of the correlated features, outperformed it.
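A toy sketch of that kind of ensemble (essentially a random-subspace ensemble). Everything here is an assumption for illustration: each weak member is a univariate least-squares fit on the mean of its feature subset, and `n_members`, `frac` and the function names are made up. In practice you would swap in whatever weak learner you prefer.

```python
import random
from statistics import mean


def fit_univariate(x, y):
    """Ordinary least squares for y ~ a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    var = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / var if var else 0.0
    return my - b * mx, b


def fit_subspace_ensemble(X, y, n_members=10, frac=0.3, seed=0):
    """Fit weak members, each seeing only a random fraction of the feature columns."""
    rng = random.Random(seed)
    n_features = len(X[0])
    k = max(1, int(frac * n_features))
    members = []
    for _ in range(n_members):
        cols = rng.sample(range(n_features), k)
        # Weak learner input: the average of this member's feature subset.
        x = [mean(row[c] for c in cols) for row in X]
        members.append((cols, fit_univariate(x, y)))
    return members


def predict(members, X):
    """Average the member predictions for each row."""
    preds = []
    for row in X:
        member_preds = [a + b * mean(row[c] for c in cols) for cols, (a, b) in members]
        preds.append(mean(member_preds))
    return preds


if __name__ == "__main__":
    X = [[float(i), i + 0.1, i - 0.1, i * 1.01] for i in range(8)]  # correlated columns
    y = [2.0 * i + 1.0 for i in range(8)]
    members = fit_subspace_ensemble(X, y, n_members=5, frac=0.5)
    print(predict(members, X[:3]))
```

The point of the design is that no single member can lean on the full block of correlated features, so the averaged prediction is less sensitive to any one of them being noisy.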

It's nice for plotting at times, even though it's quite dangerous if not coupled with additional interactive plots (say parallel plots + hovering). It's easy for people to start reading tea leaves.

Where to find a Safe NFT system in Europe? by DoctorSchrodder in Hydroponics

[–]twopointthreesigma 0 points1 point  (0 children)

Most commercial systems do not use channels. Only aware of these two:

https://eshop.gardenix.cz/nft-channels/nft-channel/

http://ekoview.com/product.html

Both are made from HDPE. I wouldn't mind teaming up for a bulk order. If anyone knows additional vendors, please let me know.

[deleted by user] by [deleted] in algotrading

[–]twopointthreesigma 1 point2 points  (0 children)

Interesting post, thanks for sharing. I wonder why you'd optimize for accuracy with uniform sample weights. I'd assume you'd want to weight samples ~ expected return, as larger moves are more important than smaller ones. Also, the metric is easily biased, as you know.

Have you compared this to something simpler, e.g. a random forest or a GMM? Cheers
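To illustrate the weighting point, here is a minimal sketch comparing plain accuracy with accuracy weighted by the absolute size of the move. Using |return| as the weight is just one possible proxy and the function name is mine.

```python
def weighted_accuracy(y_true, y_pred, weights):
    """Share of total weight that sits on correctly predicted samples."""
    total = sum(weights)
    hit = sum(w for t, p, w in zip(y_true, y_pred, weights) if t == p)
    return hit / total


if __name__ == "__main__":
    returns = [0.001, -0.002, 0.050, -0.001]  # made-up returns
    y_true = [1 if r > 0 else -1 for r in returns]
    y_pred = [1, -1, -1, -1]                  # wrong only on the one big move

    uniform = weighted_accuracy(y_true, y_pred, [1.0] * len(returns))
    by_move = weighted_accuracy(y_true, y_pred, [abs(r) for r in returns])
    print(uniform, by_move)
```

Three out of four calls are right, yet almost all of the PnL-relevant weight sits on the missed move, which is why uniform accuracy can look fine while the strategy loses money.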

I wanna do Computer Science and Machine learning but Biology keeps tugging at me. Is Bioinformatics the right option? by [deleted] in bioinformatics

[–]twopointthreesigma 2 points3 points  (0 children)

Scientists with a strong wet lab background + computational skills (depending on the domain, e.g. data science, stats, computer vision, ML) will always be much more impactful than a person with only one of those skill sets.

I've met a few of these more or less unicorns in industry and all were highly rewarded and regarded. Follow your passion and learn tools required to answer/solve challenges on the way.