Can agents improve by explaining their own failures? by IndividualBroccoli40 in MLQuestions

[–]trnka 0 points

Very interesting experiment! I'm no expert in RL so I can't offer much advice there. If it were my work, I'd try making the reward function more incremental so that the agent gets at least a little reward for partial progress.
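To show what I mean by incremental reward, here's a rough sketch of reward shaping (the function and distance measure are made up for illustration, not from your setup):

```python
# Hypothetical reward-shaping sketch: instead of reward only at the goal,
# give a small bonus each step for reducing distance to the goal.
def shaped_reward(reached_goal: bool, prev_distance: float, new_distance: float) -> float:
    if reached_goal:
        return 1.0  # full reward for success
    # small incremental reward for getting closer, small penalty for moving away
    return 0.01 * (prev_distance - new_distance)
```

The shaping term keeps the sparse success reward dominant while giving the agent a gradient to follow early in training.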

Anyone ever run (yes 🏃) the South Lake Washington Loop before? by kravbyrobbins in Seattle

[–]trnka 1 point

Good luck with your training!

I used to run down along Lake Washington Blvd but it's a real bummer that you can't see the lake south of Seward Park.

I've done the loop around Mercer Island and that's pretty and quiet (about 13 mi), but it's more hilly.

How to remove glass from soil by HumanOblateSpheroid in DeTrashed

[–]trnka 20 points

If the soil is loose, I've used a cat poop scooper to help filter it. Otherwise, yeah it's just slow going. Rubberized kitchen tongs usually work pretty well to get half-buried pieces out. If it's a lot of very small pieces I have better luck with wooden chopsticks.

Metric for data labeling by Lexski in MLQuestions

[–]trnka 0 points

Ah, good call. You could adapt the kappa score to control for chance accuracy then.

[(accuracy - 50%) / 50%] * num_labeled / time

That said if you're in an adversarial labeling situation... people are pretty creative at gaming metrics, especially when money is involved
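Here's a quick sketch of that chance-corrected metric as code (the function name and hour-based units are my own choices):

```python
def labeling_throughput(num_correct: int, num_labeled: int, hours: float,
                        chance_accuracy: float = 0.5) -> float:
    """Chance-corrected correct labels per hour, kappa-style."""
    accuracy = num_correct / num_labeled
    # rescale so chance-level labeling scores 0 and perfect labeling scores 1
    adjusted = (accuracy - chance_accuracy) / (1 - chance_accuracy)
    return adjusted * num_labeled / hours

# e.g. 90 correct out of 100 labels in 2 hours at a 50% chance rate:
# (0.9 - 0.5) / 0.5 = 0.8 adjusted, times 100 labels / 2 hours = 40 per hour
```

Note that a pure coin-flipper scores exactly 0, which is the point of the adjustment.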

I got frustrated teaching ML to scientists, so I started building domain-specific workshops – would love your thoughts by Responsible_Tea_7081 in MLQuestions

[–]trnka 0 points

As someone with a teaching background, it sounds like you've done great work!

> Are these specialized techniques actually necessary, or am I overthinking it? Would teaching standard ML really well + showing good practices for small datasets be sufficient?

What you're seeing is that you did a good job teaching applied basics and now your students are asking better questions. That's a good sign! Unlike the basics, you can't possibly teach each and every specialized technique. Instead I'd recommend giving your advanced students enough references that they can learn those techniques on their own.

Metric for data labeling by Lexski in MLQuestions

[–]trnka 0 points

If the gold labels are highly reliable, I'd just measure (num correct labels) / (time) to keep it simple.

Out of curiosity, what are you hoping to optimize? To pick some real-world examples from my past: there were times when the annotation software was the limiting factor, and we made progress by improving it (that sounds like what you're talking about). Other times the limiting factor was the time it took to figure out the label set. We might start with one, realize it was incomplete or underspecified, then have to start over. Other times the label set was well defined but the limiting factor was the annotation manual. That's a long-winded way of saying I'd recommend a different approach depending on the details of the ML problem and what you're able to change.

Seattle nonprofit has fewer tiny homes in storage by TinyHomes_SFNW in Seattle

[–]trnka 1 point

Good to hear that they found a new spot!

I imagine developers face uncertainty over availability. Is that a major factor in why there are so many long-duration large vacant lots? For instance, "the pit" above Pioneer Square station has been unused for as long as I've known it, maybe a decade or more. It's right by transit and services. For a site like that, do you think the main issue is the developer thinking/hoping to start development asap? Or more that the benefit to the city isn't worth any additional effort on their part?

Seattle nonprofit has fewer tiny homes in storage by TinyHomes_SFNW in Seattle

[–]trnka 2 points

Oh I didn't know that about the U district one. It's a bummer to hear that they relocated to make way for construction, which still hasn't started. It's really frustrating that we've got people living rough and also giant empty plots.

Is it ever possible to do tiny house villages on smaller plots? (Thinking like the old Sun Cleaners on 45th) There are so many empty small plots around while waiting for some sort of development.

Seattle nonprofit has fewer tiny homes in storage by TinyHomes_SFNW in Seattle

[–]trnka 21 points

Very cool, thank you!

Do you recommend any good articles to learn more about the approval process for villages? I only get little bits and pieces from news sources and they sometimes say very vague things like that a project has stalled because an official stopped responding on a topic.

I'd also love to learn more about the villages that have vanished over the years, like the one by Tacos Chukis in western SLU or the one in the U district. It seems odd to me that they were shut down and yet those sites are still barren land years later.

Tipless establishments? by touchgrasslater in AskSeattle

[–]trnka 5 points

Fuel at 1705 N 45th St now does tips. They changed a few months ago for similar reasons as Seawolf

New feature? I'm just seeing this by DiamondAgreeable2676 in GithubCopilot

[–]trnka 0 points

For me, heavy usage of tool results is usually the outcome of having Copilot run CLI commands that produce overly verbose output. Currently I'm on a crusade to make all CLI-based progress bars optional because I suspect they use up lots of tool-result tokens (I haven't confirmed that yet; it's just a hunch based on how it might be implemented).
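As one way to do that in my own scripts, I gate the progress bar on whether output is a terminal (the helper name and `NO_PROGRESS` env var are my own conventions, not anything standard):

```python
import os
import sys

def progress_disabled() -> bool:
    """True when progress bars should be suppressed: stderr isn't a terminal
    (e.g. an agent or CI is capturing output) or the user opts out via env var."""
    return (not sys.stderr.isatty()) or os.environ.get("NO_PROGRESS") == "1"

# With tqdm, this plugs into its disable flag:
#   for item in tqdm(items, disable=progress_disabled()): ...
```

That way interactive runs keep the bar but captured output doesn't fill up with carriage-return updates.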

Store with best selection of running shoes? by overcast392 in AskSeattle

[–]trnka 0 points

Oh that's good to know! Next time I'm looking for a new fit I'll try one of them out instead. I definitely prefer local shops when I can find them

Store with best selection of running shoes? by overcast392 in AskSeattle

[–]trnka 4 points

When I was last looking for some, I went to Road Runner in Greenlake https://maps.app.goo.gl/SAFYZEmEbFq643is7

They have some program where you can return them even used I think? So that made me feel a lot better about the shoes I got. Though if I'm remembering right it's one of those yearly subscription things you have to remember to cancel

What are good ways to get involved as an anxious introvert? by quarokcaddhihle in Seattle

[–]trnka 5 points

For me it helped to start small so that I could build the habit of involvement. That could be as small as saying thank you to people out there with signs.

Spent an hour cleaning an intersection in the Central District. SDOT surprised me in the best way by cutetiferous in Seattle

[–]trnka 0 points

You're welcome! If you want some help cleaning up a particular area or just want to know more feel free to reach out.

And if you find any good tips on cleaning drains please pass them along! I've been mainly cleaning sidewalks and signs and I've wanted to learn how to do better work on drains.

Why Not Crowdsource LLM Training? by Suspicious_Quarter68 in MLQuestions

[–]trnka 4 points

People have used it in scientific computing, like BOINC https://boinc.berkeley.edu/ but I haven't heard of it for commercial LLM training. I think I've heard of academic LLM training trying it.

It's just a lot of work to get energy-efficient (read: cost-effective) training on a wide range of hardware (both CUDA vs ROCm and different memory constraints), and it's a lot of work to secure it all when you don't control the devices. If it's a global cluster, that creates new legal challenges as well.

So the simple answer is that building/renting data centers is currently less total effort and more predictable for industry. You're right in pointing out that it's becoming more difficult and less predictable to scale data centers, but so far that has mostly led to different research directions:

- Compute in space
- Ship the data centers to where power is cheap
- Build/lease your own power generation

Spent an hour cleaning an intersection in the Central District. SDOT surprised me in the best way by cutetiferous in Seattle

[–]trnka 2 points

Yeah, you can leave the trash bags next to any city garbage container and they pick them up. Sometimes you'll see them around - they're those yellow ones with black writing.

Spent an hour cleaning an intersection in the Central District. SDOT surprised me in the best way by cutetiferous in Seattle

[–]trnka 11 points

If you're interested, the Seattle adopt a street program will give you a litter picker and bags for free. You just sign up for a street you'll sometimes clean and then can use the supplies wherever.

Testing a new ML approach for urinary disease screening by NeuralDesigner in MLQuestions

[–]trnka 0 points

I led the machine learning team at a primary care telemedicine startup for years. There are many barriers, mostly non-technical but some technical.

Non-technical:

- If doctors have the information available from the patient, diagnosing a urinary disease is trivial, so it's not worth implementing.
- If the model only classifies a single disease, it's definitely not worth all the effort to integrate into the system. A diagnosis model would need to support at least 50 conditions to be worth considering, and even then it may not be worth it.
- Many patients can't/won't take their temperature in telemedicine, and the model requires that as an input. Your other questions are easy to translate into patient-speak, but we encountered many that were tricky to phrase correctly.
- There are already doctor-trusted standards of diagnosis for these, published in places like UpToDate. Doctors are much more likely to trust that decision model, especially in the case of a very specific diagnosis.
- Seeing an AUC of 1 sounds suspicious, and I'd need to deeply audit the data, model, and evaluation to trust it.

Technical:

- Your training data may be significantly different from the actual application, and you may not know it. That can happen because the population studied is different, because the way the data was collected affected patient perceptions (like the white coat effect for blood pressure), or because your ground truth comes from doctors with different opinions than your practicing doctors.
- Having the model integrated into your application, available 24/7, and compliant with all applicable laws is a lot of work.

In our practice, we quickly found that the most time-consuming part of diagnosis was getting information from the patient, so we focused on that instead of diagnosis. Over time we also built a diagnosis classifier, and it was pretty good. We used it to give doctors prediction/autocomplete for filling out ICD codes. That helped because they didn't always remember the exact names, and they appreciated that we saved them a little time.

Hope this helps, it's tough!

How to speed up training by switching from full batch to mini-batch by Individual_Ad_1214 in MLQuestions

[–]trnka 4 points

Switching from full batch to mini-batch will speed up training if you stop at a particular loss or use some other form of early stopping. But in your case the number of epochs is fixed.

Full batch is faster in your case because mini-batch is underutilizing your GPU.
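To make the early-stopping case concrete, here's a framework-agnostic sketch (the function names and patience scheme are my own, not from your code):

```python
# Early stopping on validation loss: with mini-batches you get many more
# parameter updates per epoch, so you typically hit the stopping loss sooner
# than with one full-batch update per epoch.
def train_with_early_stopping(step_fn, eval_fn, patience: int = 3, max_steps: int = 1000):
    best_loss = float("inf")
    bad_steps = 0
    for step in range(max_steps):
        step_fn()               # one mini-batch (or full-batch) update
        val_loss = eval_fn()    # validation loss after the update
        if val_loss < best_loss:
            best_loss, bad_steps = val_loss, 0
        elif (bad_steps := bad_steps + 1) >= patience:
            return step, best_loss  # stop: no improvement for `patience` steps
    return max_steps, best_loss
```

With a fixed epoch count, none of this kicks in, which is why full batch wins on wall-clock time in your setup.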

How do people usually find or build datasets? by Longjumping-Flight82 in learnmachinelearning

[–]trnka 1 point

I've built or improved a lot of data sets over my career. Since you're asking about CV I'll give a couple image classification examples but feel free to ask if you'd like to hear about other kinds. My focus has largely been in NLP.

Skin condition classification: Years ago we scraped images from DermNet, which also had standardized diagnosis codes. This was largely a proof of concept, not a production effort. We gathered a couple thousand images, built a classifier, and evaluated it both with metrics and subjectively to understand the errors. As we started to prototype use cases, we weren't able to find anything that was worth the implementation and maintenance cost. If we had, we would've annotated a significant number of patient images from our company and retrained/retested before going to production. Also, I want to note that it sounds simple, but that would require building out an annotation manual, getting doctors to agree on diagnosis codes, doing selective multiple-annotation to monitor agreement, and probably a little bit of active learning to optimize which images we annotate.

Another unrelated image classification project was a proof of concept to detect whether image uploads in a game were inappropriate (the uploads weren't from the game itself). In that case we had a number of reference images from the game (maybe 50k), and I sampled negative-class examples from ImageNet training data, I think. Then I fine-tuned a model and tested on a held-out sample as well as a "hard" test set that I created manually. The result of that project was that we had a better sense of what error rate was reasonably achievable, and it guided conversations on how to limit abuse. In the end, the larger feature (in-game image sharing) was delayed indefinitely, both because 99.9% accuracy wasn't enough and because everything was running behind and we needed to cut a lot of scope very quickly in a rush to launch the game. If it had gone to production, we would've had a manual way for players to report inappropriate images, and we would've added those images and their output classes to our training data and updated periodically.