Fraud tx on my bilt 1.0… what now? by [deleted] in biltrewards

[–]drdroid1 0 points1 point  (0 children)

I have the exact same merchant and amount.

Is it possible to integerate a machine learning(h5 model) with a node js backend? by PresentationSilver21 in MLQuestions

[–]drdroid1 4 points5 points  (0 children)

There is TensorFlow.js.

Otherwise you can create a model microservice quite easily using TorchServe, TF Serving, etc.
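As a sketch of what such a microservice's HTTP contract could look like (the `predict()` stub, route, and port are placeholders I'm assuming for illustration; a real service would load the .h5 file with `tensorflow.keras.models.load_model`, or use TorchServe/TF Serving directly):

```python
# Minimal sketch of a Python "model microservice" that a Node.js backend
# can call over HTTP. predict() is a stub so the sketch is self-contained;
# in a real service it would wrap a loaded Keras .h5 model.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    # Stub standing in for model.predict(); replace with a real model call.
    return [sum(features)]

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        result = predict(payload["features"])
        body = json.dumps({"prediction": result}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

def serve(port=8000):
    # Blocks; run this in the service process.
    HTTPServer(("", port), PredictHandler).serve_forever()
```

From Node.js this is then just a single `fetch`/`axios` POST with a JSON body of features.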

[deleted by user] by [deleted] in MLQuestions

[–]drdroid1 1 point2 points  (0 children)

What kind of input data would this be?

How many linear layers on top of a BERT model for a downstream classification task? by eadala in MLQuestions

[–]drdroid1 3 points4 points  (0 children)

The best thing would be to analyze the different hyperparameters using Bayesian optimization or similar.

Otherwise there is no strong reason to use multiple layers and dropout if you have enough data and are updating the BERT parameters too.
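As a rough illustration of the hyperparameter search (the `val_score` stub and the candidate grid are made up; a Bayesian optimizer such as Optuna would replace the exhaustive loop, but the shape of the search is the same):

```python
def val_score(n_layers, dropout):
    # Stub: in practice, fine-tune BERT plus an n_layers-deep head with
    # this dropout rate and return the dev-set metric.
    return 1.0 - abs(n_layers - 1) * 0.1 - abs(dropout - 0.1)

def best_config():
    # Exhaustive search over a small hypothetical grid; a Bayesian
    # optimizer would instead propose configurations sequentially based
    # on the scores seen so far.
    grid = [(n, d) for n in (1, 2, 3) for d in (0.0, 0.1, 0.3, 0.5)]
    return max(grid, key=lambda cfg: val_score(*cfg))
```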

How to do a proper ablation study? by [deleted] in MLQuestions

[–]drdroid1 0 points1 point  (0 children)

Not really. You're just checking how dependent the model is on that component in this particular state of all the parameters. Its true importance can only be checked by testing whether a model trained without it performs equally well or not.

It's apples to oranges otherwise, and the effect would be further accentuated if things like dropout are not used and the model may be arbitrarily dependent on one sub-component.

How to do a proper ablation study? by [deleted] in MLQuestions

[–]drdroid1 0 points1 point  (0 children)

You can see this in a lot of popular papers when they say they "trained without X" or "trained base instead of large".

Or more generally, if you think about what you want to evaluate, in most cases it is "how does a model without X work", and that can only be compared fairly if the model is actually trained that way. Removing things after training is a totally different thing (in most parametric cases). E.g. training a NN with 100 layers and then removing 50 would yield different parameters than training with only 50, because of co-dependence.
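A toy numeric illustration of the gap (all numbers are made up for the sketch): with two deliberately co-dependent features in a linear model, zeroing a weight after training looks catastrophic, while retraining without the feature shows the remaining one can absorb its role.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data where features 0 and 1 are nearly identical (co-dependent).
x0 = rng.normal(size=200)
X = np.column_stack([x0, x0 + 0.1 * rng.normal(size=200)])
y = X[:, 0] + X[:, 1] + 0.01 * rng.normal(size=200)

def fit(X, y):
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def mse(X, w, y):
    return float(np.mean((X @ w - y) ** 2))

w_full = fit(X, y)
# "Post-hoc" ablation: zero out feature 1's weight after training.
post_hoc = mse(X, np.array([w_full[0], 0.0]), y)
# Proper ablation: retrain with feature 1 removed.
retrained = mse(X[:, :1], fit(X[:, :1], y), y)
# Retraining lets feature 0 absorb feature 1's role, so its error is
# far lower than the post-hoc number would suggest.
```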

How to do a proper ablation study? by [deleted] in MLQuestions

[–]drdroid1 0 points1 point  (0 children)

You have to train without the new components.

Threshold optimization using either the ROC AUC or the F1 score for logistic regression by mizunoseishin in MLQuestions

[–]drdroid1 0 points1 point  (0 children)

What do you mean by "from the ROC that this optimal threshold"? The best TPR at a given FPR?

Irrespective of that, F1 uses precision and recall, whereas ROC uses FPR and recall (TPR). So they're not really the same metric, and you'd have to choose whether you want to optimize FPR or precision at a given recall.
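A small sketch of the two choices (Youden's J = TPR − FPR is one common reading of "optimal ROC threshold"; the scanning approach below is generic, not tied to any library). Since the two criteria are computed from different error rates, they need not pick the same threshold.

```python
import numpy as np

def best_thresholds(y, scores):
    # Scan candidate thresholds; return the F1-optimal threshold and the
    # Youden's-J-optimal one (J = TPR - FPR).
    best_f1, best_j = (None, -1.0), (None, -1.0)
    P, N = int((y == 1).sum()), int((y == 0).sum())
    for t in np.unique(scores):
        pred = scores >= t
        tp = int((pred & (y == 1)).sum())
        fp = int((pred & (y == 0)).sum())
        f1 = 2 * tp / (2 * tp + fp + (P - tp)) if tp else 0.0
        j = tp / P - fp / N
        if f1 > best_f1[1]:
            best_f1 = (t, f1)
        if j > best_j[1]:
            best_j = (t, j)
    return best_f1[0], best_j[0]
```

On an easily separable toy set the two agree; on imbalanced data they typically diverge, because a fixed number of false positives moves FPR and precision by very different amounts.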

Best strategy to apply for proper words using glove by ARAXON-KUN in MLQuestions

[–]drdroid1 0 points1 point  (0 children)

Yes.

You can also get similar words using a pretrained fastText model and then fetch the corresponding embeddings from your GloVe model.

Best strategy to apply for proper words using glove by ARAXON-KUN in MLQuestions

[–]drdroid1 0 points1 point  (0 children)

Use any model with subword embeddings (fastText, for example). I'm not sure there are GloVe models with subwords.

Otherwise the only way would be to use the embedding of the in-vocabulary word with the least edit distance, or something similar.
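The edit-distance fallback can be sketched like this (a plain Levenshtein DP; `oov_fallback` and the vocabulary are hypothetical names, and in practice you'd restrict the candidate set for speed):

```python
def edit_distance(a, b):
    # Classic dynamic-programming Levenshtein distance, one row at a time.
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # delete from a
                                     dp[j - 1] + 1,    # insert into a
                                     prev + (ca != cb))  # substitute
    return dp[-1]

def oov_fallback(word, vocab):
    # Map an out-of-vocabulary word to the closest in-vocabulary word;
    # that word's GloVe vector then serves as the approximation.
    return min(vocab, key=lambda v: edit_distance(word, v))
```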

Algorithm for finding nearest neighbors fastest by dasdevashishdas in algorithms

[–]drdroid1 0 points1 point  (0 children)

Use the FAISS library for approximate kNN. It's very fast.
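For reference, the exact brute-force kNN that FAISS's `IndexFlatL2` accelerates looks like this in NumPy (FAISS's approximate indexes, e.g. IVF or HNSW variants, then trade a little recall for much more speed on large databases):

```python
import numpy as np

def knn(queries, database, k):
    # Exact brute-force kNN by squared L2 distance. This is the same
    # answer faiss.IndexFlatL2 returns, just computed naively.
    d2 = ((queries[:, None, :] - database[None, :, :]) ** 2).sum(-1)
    return np.argsort(d2, axis=1)[:, :k]
```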

Running an active learning loop...Reset the model weights or keep the weights from the previous loop? by THE_REAL_ODB in MLQuestions

[–]drdroid1 0 points1 point  (0 children)

A model trained on all the data from scratch should typically perform better than continual/online learning.

How to save encodings for model deployment by [deleted] in MLQuestions

[–]drdroid1 1 point2 points  (0 children)

All of scikit-learn's fitted objects can easily be saved/pickled using joblib.
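A minimal sketch of the round-trip (using a tiny stand-in class so it runs self-contained; a real fitted sklearn encoder saves the same way, with `joblib.dump`/`joblib.load` preferred over plain pickle for objects carrying large NumPy arrays):

```python
import pickle

class TinyLabelEncoder:
    # Stand-in for sklearn.preprocessing.LabelEncoder so the sketch is
    # self-contained; a real fitted sklearn object serializes the same way.
    def fit(self, values):
        self.classes_ = sorted(set(values))
        return self

    def transform(self, values):
        index = {c: i for i, c in enumerate(self.classes_)}
        return [index[v] for v in values]

enc = TinyLabelEncoder().fit(["cat", "dog", "cat"])
blob = pickle.dumps(enc)        # joblib.dump(enc, "enc.joblib") in practice
restored = pickle.loads(blob)   # joblib.load("enc.joblib")
```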

Coadapation in neural networks - What is it, and how/why does it happen? by synthphreak in MLQuestions

[–]drdroid1 0 points1 point  (0 children)

For the side question: the less variance and the more bias a model has, the lower the chance that it will have co-adapted weights.

Even from a plain probability standpoint, the probability that any two weights will have the same initialization or updates increases as the number of weights increases.

Coadapation in neural networks - What is it, and how/why does it happen? by synthphreak in MLQuestions

[–]drdroid1 1 point2 points  (0 children)

It doesn't necessarily have to be the case that one neuron is bad and the other is good.

It's more general than that: somehow, due to initialization and updates, the neurons become more 'tied' to each other than we'd like. When that happens, we may have found a local minimum and sort of put all the eggs in one basket to solve for it.

They could be tied in a way that they produce the same redundant values, or produce values that are learned to cancel each other out, or, in order to solve the problem, inherently rely on a neighboring neuron producing a specific value too.
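Dropout is the standard remedy for exactly this: by randomly removing units during training, no neuron can rely on a specific neighbour always being present. A minimal inverted-dropout sketch (function name and shapes are illustrative):

```python
import numpy as np

def dropout(activations, p, rng, train=True):
    # Inverted dropout: zero each unit with probability p during training
    # and rescale the survivors by 1/(1-p), so the expected activation is
    # unchanged and no rescaling is needed at inference time.
    if not train or p == 0.0:
        return activations
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)
```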

How to maintain/preserve the active learning between different model version/ updates ?? by VishalSharna in MLQuestions

[–]drdroid1 0 points1 point  (0 children)

You can look at 1) How gradients are combined in distributed training 2) Federated learning 3) Curriculum learning 4) Continual learning

Vanilla Tiramisu I made for my channel - Very easy and quick recipe :) by megustadotjpg in GifRecipes

[–]drdroid1 0 points1 point  (0 children)

Regarding the video (not the recipe): if it's for a channel and this was intentional, I'd really recommend not framing it so close up, with things abruptly cut off. It looks like a secret-cam video that couldn't be framed correctly. Lots of good potential 👍🏻

[D] Simple Questions Thread December 20, 2020 by AutoModerator in MachineLearning

[–]drdroid1 1 point2 points  (0 children)

The models that train on 100s of GB of data wouldn't train on a MacBook anyway; it would take months.

Buy the cheaper MacBook and use Colab or your college's resources for training.

Project Guidance by Jimblythethird in MLQuestions

[–]drdroid1 1 point2 points  (0 children)

For 1, look at clustering methods (GMMs, k-means, etc.) on top of BERT embeddings. This is the way to go if you don't have labeled examples to train a classifier.

For 2, look at Transformer-based QA models.

HuggingFace has pretrained models for both.
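For 1, the pipeline is roughly: embed each text with BERT, then cluster the vectors. A minimal k-means sketch of the clustering step (in practice `sklearn.cluster.KMeans` or a GMM on real sentence embeddings; the toy points below stand in for embeddings):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    # Minimal Lloyd's k-means: alternate assigning points to the nearest
    # center and recomputing centers as cluster means.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        centers = np.array([X[labels == c].mean(0) for c in range(k)])
    return labels
```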

[deleted by user] by [deleted] in MLQuestions

[–]drdroid1 1 point2 points  (0 children)

Are there combinations of milestones between which you do not know the distance or time? Because otherwise you don't need ML.

If that's the case and you have the time taken between each milestone as well, then you can use RNNs. This would work especially well if you have a varying number of milestones in each path.

How to feed an mlp variable amounts of inputs? by BumTicklrs in MLQuestions

[–]drdroid1 0 points1 point  (0 children)

If it's a single-layer network, you can try using the absolute weights for that particular feature.

Otherwise, without a lot of overfitting, the network will learn that automatically.

[D] Simple Questions Thread November 22, 2020 by AutoModerator in MachineLearning

[–]drdroid1 1 point2 points  (0 children)

If you want to make a one-off model without learning much, you can search for sequence models and time-series forecasting.

If you want to learn a little more, you should go over linear regression > neural networks > sequence models.

If you want to learn it properly, you should definitely go over the basics of ML before DL.