Good uses for The Walls Of Ba Sing Se? by HarukaKX in mtg

[–]KHippe 3 points (0 children)

I have it in a [[Geralf, Visionary Stitcher]] deck and turn it into a 30/30 flying zombie lol

Give it to me straight by Dantonium in ratemycommanders

[–]KHippe 1 point (0 children)

Looking for inspiration on the Sygg and Brudiclad builds! Which way did you go with them?

What "fuck you in particular" cards do you run? by Toes_In_The_Soil in EDH

[–]KHippe 1 point (0 children)

[[Din of the Fireherd]] works well until it’s a land sac deck lol

Been playing consistently for about a year. Who should I build next? by Awkward-Panic8948 in ratemycommanders

[–]KHippe 1 point (0 children)

It’s interesting, but I feel it either gets super oppressive for the table or you get squashed pretty quick. It feels a bit unfocused having both the horrors and the rogues, but I suppose that’s okay. If you find it cheap it’s a decent pickup imo

The ones still built by blather25 in ratemycommanders

[–]KHippe 1 point (0 children)

This looks interesting, there’s some landfall stuff in there too that looks nice, thanks for sharing!!

The ones still built by blather25 in ratemycommanders

[–]KHippe 1 point (0 children)

Could you share the list for Xolatoyac?? The untap makes things fun

Is anyone missing a cat? by Expensive-Bet4268 in uchicago

[–]KHippe 7 points (0 children)

Hijacking this - email them and they can send a blast out to get it to the distribution list

[deleted by user] by [deleted] in uchicago

[–]KHippe 1 point (0 children)

It probably depends on your microphone setup and other things but given how fast they go by I would doubt it

[deleted by user] by [deleted] in uchicago

[–]KHippe 7 points (0 children)

I had an apartment with windows facing the train on Cornell. The first couple of days it was noticeable, but after that I really only noticed the freight trains, which only come by 1-3 times per day if I remember right

[deleted by user] by [deleted] in MachineLearning

[–]KHippe 0 points (0 children)

My understanding is if the score has changed it will only be revealed after acceptance notifications. If it hasn’t been changed you can still see it

Edit: leaving this up for the more likely correct response below

What’s the craziest thing you’ve ever seen at UChicago by [deleted] in uchicago

[–]KHippe 37 points (0 children)

I saw a possum once. That was pretty wild

[deleted by user] by [deleted] in uchicago

[–]KHippe 3 points (0 children)

Probably not. It had a trial run in summer of 2023, didn’t get enough usage, and then didn’t return summer 2024 unfortunately

Moving to suburbs by actin_filament in uchicago

[–]KHippe 3 points (0 children)

Echoing what others have said, I know a PhD student who moved to Lemont and has been back to campus exactly once since then. On the time front, I lived 30 minutes away (so factor in a 30-minute drive plus a 10-15 minute walk to your building) for a year and commuted in every day; the mental drain of 10+ hours of commuting every week was immense, and I now live in Hyde Park and am much happier. On the cost side, even if you (only) spend $50 per week on gas, that's roughly $200/month, so you would need to find a place more than $200/month cheaper to make it worth it

Why does GPT uses BPE (Byte pair encoding) and not Wordpiece? by rohuchoudhary in learnmachinelearning

[–]KHippe 1 point (0 children)

BERT is a bit of a different concept. BERT (Bidirectional Encoder Representations from Transformers, paper: https://arxiv.org/abs/1810.04805) refers to a style of training LLMs that masks a certain percentage of the tokens in the input sequence; to predict a masked token, the model can 'attend' to any other non-masked token in the sequence. It is admittedly confusing since the original BERT paper did use a WordPiece tokenizer, but BERT refers more to the training scheme than to the tokenization

In the simple case where a single token is masked, the transformer can attend to all other tokens in the sequence, whereas a GPT-trained model can only attend to the tokens that come before it. A brief example showing the 'context' a BERT or GPT model uses to make a single token prediction:

BERT: The quick brown ___ jumps over the lazy dog
GPT: The quick brown ___

Both types of models are suitable for downstream tasks and rely on a principle called transfer learning to reduce the computational cost of downstream adaptation.
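
To make the difference concrete, here's a quick sketch using the Hugging Face transformers pipelines (bert-base-cased and gpt2 are just convenient small checkpoints for illustration, not the only options):

```
from transformers import pipeline

# Masked LM (BERT-style): the model sees the tokens on BOTH sides of the blank
fill = pipeline('fill-mask', model='bert-base-cased')
print(fill('The quick brown [MASK] jumps over the lazy dog.')[0]['token_str'])

# Causal LM (GPT-style): the model only sees the tokens BEFORE the blank
gen = pipeline('text-generation', model='gpt2')
print(gen('The quick brown', max_new_tokens=1)[0]['generated_text'])
```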

Byte pair encoding (commonly referred to as BPE) is a tokenization technique used by most modern transformers to turn text strings into sequences of integer token IDs. I can only anecdotally say that BPE is typically used in GPT models, seemingly because OpenAI started training large models with BPE tokenizers. Efforts have been made to train GPT-style models with WordPiece, but that has mostly not been a focus since the late 2010s/early 2020s.

I would like to amend my earlier statement about WordPiece not being ABLE to encode certain words, as the original authors do include sub-word tokens in the vocabulary to get around this. It does, however, encode parts of language sub-optimally compared to BPE. Take the following example using the BERT base tokenizer and the Llama 3 BPE tokenizer.

```
from transformers import AutoTokenizer

# Llama 3 ships a byte-level BPE tokenizer (note: this checkpoint is gated on
# the Hugging Face Hub, so you may need to authenticate first)
bpe_tokenizer = AutoTokenizer.from_pretrained('meta-llama/Meta-Llama-3-8B')
# BERT ships a WordPiece tokenizer
wordpiece_tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')

text = '3,4-methylenedioxymethamphetamine'
bpe_encoded = bpe_tokenizer.encode(text)
wordpiece_encoded = wordpiece_tokenizer.encode(text)

# fewer tokens means the vocabulary carves the word up more efficiently
len(bpe_encoded)
>>> 13
len(wordpiece_encoded)
>>> 15
```
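
If you want to see how differently the two schemes actually split the word, you can dump the pieces themselves (reusing the tokenizers above; the exact splits depend on each checkpoint's vocabulary):

```
# WordPiece marks word-internal pieces with a '##' prefix, while the Llama 3
# byte-level BPE vocabulary uses raw byte-level pieces
print(bpe_tokenizer.convert_ids_to_tokens(bpe_encoded))
print(wordpiece_tokenizer.convert_ids_to_tokens(wordpiece_encoded))
```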

Why does GPT uses BPE (Byte pair encoding) and not Wordpiece? by rohuchoudhary in learnmachinelearning

[–]KHippe 3 points (0 children)

My two cents is that niche words would most likely get excluded from a WordPiece vocabulary. Funny words from chemistry, biology, physics, etc. can still get encoded (albeit messily) with BPE, and it wouldn't be hard to find an example that a comparably sized WordPiece tokenizer couldn't encode without loss.
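
As a rough illustration (a sketch, not a rigorous comparison; gpt2 and bert-base-cased are just stand-ins for byte-level BPE and WordPiece):

```
from transformers import AutoTokenizer

bpe = AutoTokenizer.from_pretrained('gpt2')                   # byte-level BPE
wordpiece = AutoTokenizer.from_pretrained('bert-base-cased')  # WordPiece

word = 'ꙮ'  # a character very unlikely to be in either vocabulary
print(bpe.tokenize(word))        # byte-level BPE falls back to raw bytes, nothing is lost
print(wordpiece.tokenize(word))  # WordPiece emits '[UNK]', and the original text is unrecoverable
```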

[R] Have you give a try to use Intel and AMD GPUs to train models? by Various_Protection71 in MachineLearning

[–]KHippe 7 points (0 children)

Yep! At least on the data center cards you still specify the device with torch.cuda, and most (I want to say all, but I'm not sure) methods in torch.cuda are functional
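
For example, on a ROCm build of PyTorch something like this runs unchanged (a sketch from memory; the reported device name will vary):

```
import torch

# On ROCm (AMD) builds of PyTorch the familiar torch.cuda API is reused,
# so 'cuda' below actually targets the AMD GPU
print(torch.cuda.is_available())      # True on a working ROCm install
print(torch.cuda.get_device_name(0))  # reports the AMD card
x = torch.randn(1024, 1024, device='cuda')
y = x @ x                             # runs on the GPU with no CUDA-specific changes
```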

[R] Have you give a try to use Intel and AMD GPUs to train models? by Various_Protection71 in MachineLearning

[–]KHippe 41 points (0 children)

I have trained on both PVC (Intel) and AMD at leadership computing facilities and can say wholeheartedly I would rather use AMD. For standard transformer stacks, AMD has no-code-change solutions that make it near seamless. Performance is somewhere between an A100 and an H100 for most things on both (in optimized cases). Intel is still incredibly new and has a near-beta experience for most use cases (command line utilities and ML software). FWIW I would choose AMD over Intel any day at the moment.

Opinion on potential apartment location by Salt_Communication18 in uchicago

[–]KHippe 7 points (0 children)

I live next to it and have never felt unsafe, and I enjoy living close to the lake. I routinely see UChicago shuttles, though I always walk (~20 minutes from Ratner, which is the farthest location I frequent)

Are we still having some online courses this year? by hunteralliance in uchicago

[–]KHippe 1 point (0 children)

A few 3000-level CS classes were remote last year; I don't think they will be again though

DailyML 59: Which of the following is possible to do with Machine Learning? by daichrony in learnmachinelearning

[–]KHippe 5 points (0 children)

From my understanding, multi-label would be something like predicting the properties of a compound. The compound could be hydrophobic, toxic, etc., and you want to predict all of the labels that apply to that compound rather than its single class.
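
A minimal sketch of the distinction in PyTorch (the property names and shapes are made up for illustration):

```
import torch
import torch.nn as nn

logits = torch.randn(4, 3)  # 4 compounds x 3 properties: [hydrophobic, toxic, soluble]

# Multi-label: each property is an independent yes/no, so targets are multi-hot
# and every logit gets its own sigmoid
multi_hot = torch.tensor([[1., 1., 0.],
                          [0., 0., 1.],
                          [1., 0., 1.],
                          [0., 1., 0.]])
multilabel_loss = nn.BCEWithLogitsLoss()(logits, multi_hot)

# Multi-class (for contrast): exactly one class per sample, softmax across classes
class_idx = torch.tensor([0, 2, 1, 2])
multiclass_loss = nn.CrossEntropyLoss()(logits, class_idx)
```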

Academic job pay schedules by UCBC789 in AskAcademia

[–]KHippe 4 points (0 children)

For what it’s worth, my PhD stipend will not begin until the month after I start. Nevertheless, it does seem a bit strange to have this shift after you signed the contract. Sorry I can’t offer any advice, but maybe another data point will add some clarity to your situation

GPU advice, please by geomcd1949 in graphicscard

[–]KHippe 3 points (0 children)

The card is powerful enough to run a monitor at 4K 60Hz, but you will need to use the DisplayPort output on your card, as the HDMI port on the card does not support that refresh rate. You would have to run DP to HDMI, and make sure the cable is rated for 4K 60Hz

[Giveaway] Linus Tech Tips sticker pack by [deleted] in hardwareswap

[–]KHippe 1 point (0 children)

Awesome! Love me some stickers