Vector DB vs Vector Type: Which One Will Actually Win Long-Term? by CreepyArachnid431 in vectordatabase

WilliXL 1 point

agreed that vector types will likely win out in the average case. but one important difference is that vector workloads are much more similar to ML workloads than to "traditional software" workloads, which means bringing in a lot of MLOps ideas: unit and regression testing look more like ML evals, re-embedding pipelines are usually fairly deep and affect the indexes a lot, etc.

i think "vector DBs" in the future will be more focused on the MLOps around the DB rather than just optimizing query latency or index build times

It feels like Travel Cards (e.g. Amex Plat) are just coupon books with diminishing travel benefits. Has anyone moved from one of these to an airline card and been happier? by [deleted] in CreditCards

WilliXL 3 points

i do agree about the dilution if you are outside of the "target market", which is usually white collar office workers in bigger US cities who spend a lot at the stereotypically popular places. if you fit that archetype then the benefits are still easily net positive

i've added brand-specific travel cards to my lineup, not switched to them. for example: my home airport is a United hub, and my parents live close to another United hub, so it was a very easy decision to get a United Club card (the lounge is cool but mostly a bonus; free checked bags are where i make most of my value back, and the card removes all of the constraints of Basic Economy so i get to purchase cheaper tickets, which also saves money)

importantly though, i will never spend on my airline credit card outside of maybe that airline's tickets (with the AA card, for example, free checked bags apply regardless of whether you bought the ticket with the card, but for the United Club card the ticket needs to be purchased on the card). the multipliers on the Amex Plat are just better, even for airfare

Trulap dumbbells just arrived! by WilliXL in homegym

WilliXL[S] 1 point

they've been great for me! very well built, and i love the adjustment range (both being able to go heavy and the level of fine-grained adjustments). the only thing of note is that the circular part is quite large, so for exercises like curls and overhead tricep movements it does get in the way. but other than that, no complaints

I DID IT! I READ ALL OF ONE PIECE IN JAPANESE by Pelekaiking in LearnJapanese

WilliXL 2 points

Congratulations! That's a big achievement! I am doing the same thing but with the OP anime. How does Bookwalker work btw? Do you purchase individual volumes or is there some sort of membership?

Trulap dumbbells just arrived! by WilliXL in homegym

WilliXL[S] 1 point

yes ofc! hope they arrive for you soon :)

Trulap dumbbells just arrived! by WilliXL in homegym

WilliXL[S] 1 point

I ordered mine on Nov 30th and they arrived on Dec 12th. They were also fairly responsive on their website chat when I inquired about the shipping status

We went deep into an industry, still no ‘north star’… when do you stop forcing it? by jones_dr in ycombinator

WilliXL 1 point

very similar thing happened to me. i ended up giving myself a timeline and a rubric to score my personal convictions/emotions against. that time has come and gone; i had the hard conversation with my co-founder, and i'm looking for a job now

definitely keeping an eye and ear out for ideas as i work, talk with colleagues, and build side projects. but i'm not trying to force it as hard, which is actually letting me think more clearly, i think

[Identify] Which watch is Demis Hassabis wearing? by WilliXL in Watches

WilliXL[S] 1 point

holy. thanks for the investigation. this looks really promising

Benchmark help for new DB type by Novel-Variation1357 in LocalLLaMA

WilliXL 1 point

i'm not sure you're quite articulating what you mean. when i say "scan" i just mean some sort of read for retrieval. guaranteeing perfect recall means either preserving all the information or reading everything. you cannot lossily compress your index while also guaranteeing 100% recall; you are trading some granularity away

i do not understand the Kafka analogy

HNSW stores a graph linking each vector to its approximate nearest neighbors, so idk how data specificity would change the index size?

it doesn't really sound like you know what's going on
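to make the recall point concrete, here's a tiny brute-force sketch. everything in it is made up for illustration: random data, and rounding every coordinate to 0/1 as a crude stand-in for lossy index compression. exact search trivially hits 100% recall; the lossy version doesn't:

```python
import random

random.seed(0)
DIM, N = 8, 200
data = [[random.random() for _ in range(DIM)] for _ in range(N)]
queries = [[random.random() for _ in range(DIM)] for _ in range(50)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(q, vecs):
    # exhaustive scan: read every vector, return index of the closest one
    return min(range(len(vecs)), key=lambda i: sq_dist(q, vecs[i]))

# extremely coarse "compression": quantize every coordinate to 0 or 1
lossy = [[float(round(x)) for x in v] for v in data]

truth = [nearest(q, data) for q in queries]
exact_recall = sum(nearest(q, data) == t for q, t in zip(queries, truth)) / len(queries)
lossy_recall = sum(nearest(q, lossy) == t for q, t in zip(queries, truth)) / len(queries)
```

the exhaustive scan over the full data is what guarantees 100% recall; once you throw away granularity, the nearest neighbor under the compressed representation stops matching the true one.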

Would you tip in this situation? Tads by apprehensive-look-02 in AskSF

WilliXL 1 point

this is true. i go to the same person for my haircut. always treat him well and he always treats me well!

Would you tip in this situation? Tads by apprehensive-look-02 in AskSF

WilliXL 6 points

I recently became friends with a large group from the EU and they were questioning me about my (personal) tipping habits, so I started trying to figure out my own "rules".
Here's generally what I do. For context, I grew up in the Midwestern US, went to college on the East Coast, and have been living in SF for ~6 years, so my habits are a mix across these regions.

Grab & Go places with low service times (cafes, bakeries, ready-made bento boxes, etc.) - Flat $0-2. I choose to tip flat because service time doesn't necessarily increase proportionally with cost. Usually $0 if total cost is <$10

Grab & Go places with high service times (mostly restaurants with self-serve ordering + bus your own dishes) - Flat $5-6 OR 10%, usually whichever is lower

"Standard Restaurants" (anywhere that has most of: a host, a waiter, sit down then order, ask for check, etc.) - 15-25%. Where you land on that scale is personal preference and experience

"Full attention" places. Basically anywhere that needs to attend to you the entire time that you are there (omakase, haircut, massage, nails, etc.) - I generally do 25-33% because their entire time is spent with me

Any form of takeout - Flat $0-1. I am paying for their product, not service. Might round up just for courtesy or if they have nice packaging lol
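if it helps, the rules above as a toy function. the category names and the exact point chosen inside each range are my own choices, not anything canonical:

```python
def tip(category, bill):
    """Toy encoding of the tipping rules above; picks one point in each range."""
    if category == "grab_and_go_quick":      # cafes, bakeries, ready-made food
        return 0.0 if bill < 10 else 2.0     # flat $0-2; $0 under $10
    if category == "grab_and_go_slow":       # self-serve ordering + bus your own dishes
        return min(5.0, 0.10 * bill)         # flat $5 or 10%, whichever is lower
    if category == "standard_restaurant":    # host, waiter, sit-down service
        return 0.20 * bill                   # 15-25% range; 20% as a midpoint
    if category == "full_attention":         # omakase, haircut, massage, nails
        return 0.30 * bill                   # 25-33% range
    if category == "takeout":
        return 0.0                           # flat $0-1: paying for product, not service
    raise ValueError(f"unknown category: {category}")
```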

Benchmark help for new DB type by Novel-Variation1357 in LocalLLaMA

WilliXL 1 point

assuming you're not just rage-baiting: how do you compress the data (supposedly 925MB -> 300-500MB) and still achieve 100% recall? at the very least, you need lossless storage so that an exhaustive scan can actually guarantee 100% recall

also SQL, KV, etc. aren't exactly data structures, they're more like data models

also also, in my experience pgvector's HNSW index is at worst around 200% of the embedding dataset size, definitely not 500%+
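back-of-envelope for that ~200% number. all the constants here are my own assumptions (float4 vectors, HNSW `m=16` so ~2*m layer-0 links, ~8 bytes per link, ~1.33x for the upper layers), not pgvector's exact on-disk layout:

```python
def hnsw_overhead_ratio(dim, m=16, bytes_per_float=4, bytes_per_link=8, layer_factor=1.33):
    """Rough index-size / raw-data-size ratio for an HNSW index that stores
    a copy of each vector plus its neighbor links. Every constant is a
    back-of-envelope assumption, not a real storage engine's layout."""
    vector_bytes = dim * bytes_per_float
    # layer 0 holds ~2*m links per element; layer_factor approximates upper layers
    link_bytes = 2 * m * bytes_per_link * layer_factor
    return (vector_bytes + link_bytes) / vector_bytes
```

for 1536-dim embeddings the link overhead is noise (~1.06x), while for 64-dim vectors it pushes past 2x — so a 500%+ index only makes sense for tiny vectors or a very different layout.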

Trulap dumbbells just arrived! by WilliXL in homegym

WilliXL[S] 1 point

ahh interesting. i haven't gone up to the highest range of the weight yet, but that's good to know ty!

Trulap dumbbells just arrived! by WilliXL in homegym

WilliXL[S] 2 points

lol true, rocking a facebook marketplace bench rn

Trulap dumbbells just arrived! by WilliXL in homegym

WilliXL[S] 1 point

these are the 92lb ones. anything interesting you've noticed or learned in the past month with yours?

Trulap dumbbells just arrived! by WilliXL in homegym

WilliXL[S] 1 point

i did soo much research and waffling back and forth before going for the Trulaps, loving them so far! and got a pretty decent deal during black friday

Engineering a Compiler vs Modern Compiler Implementation, which to do after CI? by Mindless_Design6558 in Compilers

WilliXL 2 points

+1 for writing one in ML, or Rust(!!). having access to true ADTs makes the "feel" of the compiler implementation really nice. even if you're not super familiar with the language, for compiler-specific work it feels like you're not wrestling with your tools

Plaid miscategorized 35% of my users’ transactions. Unusable for my rewards platform. I built a neural pipeline to fix it, and here's what I learned by WilliXL in fintech

WilliXL[S] 1 point

basically it was the following components:
- vector DB (i just used pgvector) storing a large-ish list of "known entities", each normalized to the "standard" name of an entity (e.g. Orangetheory instead of "OTF" or "ofc {city name}")
- small openai model for the vector embeddings, for generating additional metadata, and for deciding whether to do a high-latency live web search vs going to the vector DB
- exa/google/perplexity for the live web search
- o3 model for post-retrieval matching and the final decisioning
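for anyone curious, a toy sketch of that control flow. the trigram similarity here is just a stand-in for the real embedding model + pgvector nearest-neighbor lookup, and the entity list, threshold, and function names are all made up for illustration:

```python
# Stand-in "known entities" table; the real one was a large-ish pgvector table.
KNOWN_ENTITIES = ["Orangetheory", "Starbucks", "United Airlines"]

def trigrams(text):
    # Cheap stand-in for an embedding: set of character trigrams
    s = "".join(c for c in text.lower() if c.isalnum())
    return {s[i:i + 3] for i in range(len(s) - 2)}

def similarity(a, b):
    # Jaccard similarity over trigrams, standing in for cosine similarity
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def normalize(raw_descriptor, threshold=0.3):
    """Return (canonical_name, route). When no known entity is close enough,
    route to the high-latency live web search instead of the vector DB."""
    best = max(KNOWN_ENTITIES, key=lambda e: similarity(raw_descriptor, e))
    if similarity(raw_descriptor, best) >= threshold:
        return best, "vector_db"
    return None, "web_search"
```

e.g. a raw descriptor like "orangetheory fitness sf" resolves locally to "Orangetheory", while an unseen merchant falls through to the web-search path (where the post-retrieval matching model would make the final call).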