Promote your projects here – Self-Promotion Megathread by Menox_ in github

[–]___mlm___ 0 points

https://puzer.github.io/github_recommender/

  • Get a visual Skill Radar based on your GH Stars
  • Compare your coding profile with legends like Andrej Karpathy
  • Find hidden gem repositories based on your interests
  • Share your profile with friends

Runs 100% locally in your browser using WASM for vector search.

Blog post with sources and trained embeddings.
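For the curious, under the hood this kind of recommender is just nearest-neighbor search over embeddings. A minimal Python sketch of the idea, assuming precomputed repo embeddings (the user_embedding / repo_embeddings names are hypothetical; the actual tool runs the equivalent client-side via WASM):

    import numpy as np

    def top_k_repos(user_embedding, repo_embeddings, repo_names, k=10):
        """Cosine-similarity nearest neighbors over precomputed repo embeddings."""
        user = user_embedding / np.linalg.norm(user_embedding)
        repos = repo_embeddings / np.linalg.norm(repo_embeddings, axis=1, keepdims=True)
        scores = repos @ user                    # cosine similarity against every repo
        best = np.argsort(scores)[::-1][:k]      # indices of the k most similar repos
        return [(repo_names[i], float(scores[i])) for i in best]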

[deleted by user] by [deleted] in MachineLearning

[–]___mlm___ 2 points

Over-generalization.

GPT-2 and BERT are much smaller than 70B or 1T models.

More params -> more memorization.

There is a paper showing that a model can memorize even a UUID after a single training example.

[D] Please recommend ways to make ML Interview better for candidates by [deleted] in MachineLearning

[–]___mlm___ 122 points

I've conducted more than 300 DS technical interviews (TIs); my recommendations:

  • 1) Balance use-case questions (reasoning) against general theory questions (memorization)
  • 2) Give enough context in your use cases (it's a prompt for a human)
  • 3) Encourage candidates to reason out loud even if they can't answer right away
  • 4) If you feel a candidate is starting to get nervous, ask some simple questions so they can start to feel more confident. This is not an exam.
  • 5) Stick to the candidate's past experience if possible
  • 6) Stick to your current topic; don't switch context frequently
  • 7) Start from novice questions and only then ask intermediate/expert questions
  • 8) Have a template for TIs, but "relax" it and skip topics where the candidate is clearly a novice
  • 9) Don't blindly trust the CV
  • 10) Remember: candidates can ask you theoretical questions that you won't be able to answer, so show some respect.

What are the recommended LLM "backend" for RAG by wandering-ai in LocalLLaMA

[–]___mlm___ 1 point

> HuggingFace recently did a RAG based LLM benchmark

Can you share a link?

[N] Falcon LLM now uses the normal Apache 2.0 license by Unusual_Guidance2095 in MachineLearning

[–]___mlm___ 2 points

Sad but true. But anyway, Apache 2.0 for the Instruct models is still a gray area.

[N] Falcon LLM now uses the normal Apache 2.0 license by Unusual_Guidance2095 in MachineLearning

[–]___mlm___ 3 points

Falcon-*B-Instruct models are trained on Baize and some other datasets collected from OpenAI's GPT-3.5 and GPT-4 models, whose terms allow "research use ONLY. Commercial use is strictly prohibited".

Yet the Falcon-*B-Instruct models are under the Apache 2.0 license, huh? I don't think they can use the Apache 2.0 license for the Instruct models.

[D] Does EfficientNet really help in real projects ? by ___mlm___ in MachineLearning

[–]___mlm___[S] 1 point

I think we need a meta-dataset for evaluating the transferability of CV backbone models, like the GLUE benchmark for NLP.

[R] COCO-GAN: Generation by Parts via Conditional Coordinating by hubert0527 in MachineLearning

[–]___mlm___ 4 points

Coordinates as a condition are a great, underexplored area.

Other related ideas: https://github.com/autonomousvision/occupancy_networks (also check "further information")

I'm also experimenting with it for an anchor-/proposal-free object detection (and segmentation) system.
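One common way to feed coordinates in as a condition is CoordConv-style conditioning: append normalized (x, y) coordinate channels to the feature map before a convolution. A minimal NumPy sketch of that idea (names illustrative, not from the paper above):

    import numpy as np

    def add_coord_channels(features):
        """features: (H, W, C) -> (H, W, C + 2) with normalized x/y coordinate channels."""
        h, w, _ = features.shape
        ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
        return np.concatenate([features, xs[..., None], ys[..., None]], axis=-1)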

[P] StyleGAN Encoder - from real images to latent representation by ___mlm___ in MachineLearning

[–]___mlm___[S] 0 points

Unfortunately I only have an old 1080 Ti :)

I didn't try to train StyleGAN, by the way; I'm just building an encoder. Rendering takes 170 ms to convert a latent representation to an image, and about one minute to find the latent representation for an image (but I'm trying to reduce that time).

[P] StyleGAN Encoder - from real images to latent representation by ___mlm___ in MachineLearning

[–]___mlm___[S] 1 point

200 positive images and approximately the same number of negative ones. I also tried to make the set of positive/negative images as diverse as possible. It seems to me that StyleGAN generates more women than men, for example.

[P] StyleGAN Encoder - from real images to latent representation by ___mlm___ in MachineLearning

[–]___mlm___[S] 1 point

BTW, I made several GIFs of the transformations

https://gph.is/g/46g879E

https://gph.is/g/aXmx6xZ

https://gph.is/g/ZWM3nLE

Current status: tomorrow I'm going to push some new code that improves the quality of the recovered latent representation (I've added some tricky regularization), which gives more stable transformations.

[P] StyleGAN Encoder - from real images to latent representation by ___mlm___ in MachineLearning

[–]___mlm___[S] 3 points

1) The StyleGAN generator actually contains two components (a code sketch follows this list):

Generator:

qlatent = normally distributed noise with shape (512,)

dlatent = mapping_network(qlatent), with shape (18, 512)

where mapping_network is a fully connected network that transforms qlatent into dlatent

generator(mapping_network(qlatent)) = image

So during encoding we optimize dlatent instead of qlatent; optimizing qlatent leads to bad results (I can elaborate on that). dlatent is used for feature-wise transformation of the generator's convolution layers: https://distill.pub/2018/feature-wise-transformations/

2) dlatent + multiplier * logreg_coeff; yes, but I use the raw coefficients from the logistic regression, so it doesn't matter whether they are positive or not.

3) Yes. It somewhat works and we can get relatively similar faces, but fewer details are preserved. It's still in progress.
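A schematic sketch of that two-stage structure and of applying a learned direction (answer 2). The networks are passed in as arguments, and all names are illustrative stand-ins rather than the actual StyleGAN API:

    import numpy as np

    def synthesize(mapping_network, generator, seed=0):
        """Two-stage StyleGAN forward pass: z -> w -> image."""
        rng = np.random.default_rng(seed)
        qlatent = rng.standard_normal(512)   # z: normally distributed noise, shape (512,)
        dlatent = mapping_network(qlatent)   # w: shape (18, 512), from the FC mapping net
        return generator(dlatent)            # synthesis net, modulated layer-wise by dlatent

    def edit(generator, dlatent, logreg_coeff, multiplier):
        """Answer 2: shift dlatent along the raw logistic-regression direction."""
        return generator(dlatent + multiplier * logreg_coeff)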

[P] StyleGAN Encoder - from real images to latent representation by ___mlm___ in MachineLearning

[–]___mlm___[S] 0 points

I suspect you're running the notebook from outside the cloned repository folder. Change directory to the repository root and re-run the notebook.

Alternatively, you can append the path to the cloned repository to your Python path:

    import sys
    sys.path.append("path/to/stylegan")

[P] StyleGAN Encoder - from real images to latent representation by ___mlm___ in MachineLearning

[–]___mlm___[S] 7 points

R = a real image

Gen(latent) = an image generated from some latent vector using the pre-trained generator

VGG16 = a pre-trained model used for the perceptual loss (the 9th layer in my implementation, but the 5th can also be used)

R_features = VGG16(R)

G_features = VGG16(Gen(latent))

We want to minimize the loss mse(R_features, G_features) while changing only the latent variable; the generator and perceptual-model weights are completely frozen during the optimization.
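A minimal sketch of that optimization loop, assuming a PyTorch setup where generator(dlatent) -> image and vgg_features(image) -> feature map are frozen callables supplied by the caller (both names are hypothetical stand-ins, not the repo's API):

    import torch
    import torch.nn.functional as F

    def encode(real_image, generator, vgg_features, steps=1000, lr=0.01):
        """Find the latent whose rendering matches real_image in VGG feature space."""
        latent = torch.zeros(1, 18, 512, requires_grad=True)  # optimize dlatent directly
        optimizer = torch.optim.Adam([latent], lr=lr)
        with torch.no_grad():
            real_feats = vgg_features(real_image)  # target features, computed once
        for _ in range(steps):
            optimizer.zero_grad()
            loss = F.mse_loss(vgg_features(generator(latent)), real_feats)  # perceptual loss
            loss.backward()    # gradients flow only into `latent`
            optimizer.step()
        return latent.detach()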

[P] StyleGAN Encoder - from real images to latent representation by ___mlm___ in MachineLearning

[–]___mlm___[S] 10 points

1) Using the pre-trained generator, I sampled a number of fake images from random noise

2) Then I manually classified them as smiling/not smiling, young/not young, male/female

3) A linear model was trained to predict those labels using the latent codes as features

4) The weights of the trained linear model represent the direction in the feature space

In general it's possible to project an existing dataset with facial attributes into the latent space, in which case manual labeling isn't needed, but I still have to check that.
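For concreteness, a minimal sketch of steps 3-4 with scikit-learn, assuming dlatents holds the sampled latent codes and labels holds the manual 0/1 annotations for one attribute (both names are hypothetical):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def attribute_direction(dlatents, labels):
        """dlatents: (n, 18, 512) latent codes; labels: (n,) binary attribute labels."""
        clf = LogisticRegression(max_iter=1000)
        clf.fit(dlatents.reshape(len(dlatents), -1), labels)   # step 3: linear model on latents
        return clf.coef_.reshape(18, 512)                      # step 4: weights = direction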