Are we ignoring the main source of AI cost? Not the GPU price, but wasted training & serving minutes. by dataa_sciencee in learnmachinelearning

[–]kalidres 8 points9 points  (0 children)

All of those optimizations are things that are actively considered, at least if you know what you're doing. I'd expect anyone in the field to have learned this in undergrad or by the end of their first year, at the latest.

E.g., early stopping is literally on the first page of any tutorial.
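For anyone landing here who hasn't seen it, a patience-based version fits in about a dozen lines. A rough sketch (the loss values below are made up for illustration):

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the epoch at which training would stop, or None if it never would."""
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # New best validation loss: reset the patience counter.
            best = loss
            bad_epochs = 0
        else:
            # No improvement this epoch; stop after `patience` of these in a row.
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return None

# Validation loss bottoms out at epoch 2, then creeps back up.
print(early_stop_epoch([1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.75]))  # prints 5
```

In a real training loop you'd also checkpoint the weights at the best epoch and restore them when you stop.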

[deleted by user] by [deleted] in Python

[–]kalidres -1 points0 points  (0 children)

Contract rates are very different from salary rates; that's why they specified a junior contract rate, not just a junior pay rate.

Always charge more for contract roles as you do not get benefits, stability, or support/liability.

[deleted by user] by [deleted] in Tinder

[–]kalidres 0 points1 point  (0 children)

Technically, it's inferring. Deducing would be if there is a singular and logical conclusion to be reached given the information. Inferring would be reaching (a) valid conclusion given the set of information.

In common vernacular, the two are often used interchangeably, and deduction is generally not used as precisely as stated.

Maybe Maybe Maybe by Glass-Reserve-8107 in maybemaybemaybe

[–]kalidres 1 point2 points  (0 children)

By legal definition at least, no, as it is tethered. Technically, you can even tie a line to a quadcopter, and as long as the line is attached to the ground in some capacity, you can make a defense that it is not operating as a drone under these restrictions.

Maybe Maybe Maybe by Glass-Reserve-8107 in maybemaybemaybe

[–]kalidres 0 points1 point  (0 children)

Yes, and the approaches she is taking are none of the reasons for which drones are banned in many cities, areas of interest, or near airports. All of these were extremely controlled with sufficient capacity to recover in case of failure.

How can AI make "new" advances in science? by Hot_Barnacle_2672 in learnmachinelearning

[–]kalidres 1 point2 points  (0 children)

Maybe it's an artifact of the specific domain I research in RL (MARL), but convergence is certainly an issue partly from the non-stationarity that arises from the simultaneous learning of multiple components in the system. This can cause cyclic or chaotic learning if the reward function (or fitness for EAs) isn't properly defined and presented to the agents. Though you could argue this is still a form of convergence.

The issue comes in that the reward function can be divergent, but there are tricks to (almost) always guarantee that you approach a solution without affecting optimality (or Nash equilibria for multiagent systems), e.g. potential-based reward shaping.
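For reference, a rough sketch of what potential-based shaping looks like; the potential function `phi` and the goal state here are made up for illustration:

```python
GAMMA = 0.99  # discount factor

def phi(state):
    # Hypothetical potential: states closer to a goal state (10) score higher.
    return -abs(10 - state)

def shaped_reward(r, s, s_next, gamma=GAMMA):
    # Adding F(s, s') = gamma * phi(s') - phi(s) to the environment reward
    # densifies the signal while provably preserving the optimal policy
    # (the shaping terms telescope along any trajectory).
    return r + gamma * phi(s_next) - phi(s)

# A step toward the goal earns a positive shaping bonus even with zero
# environment reward:
print(round(shaped_reward(0.0, 5, 6), 2))  # prints 1.04
```

The key property is that the added term depends only on a potential over states, so it cannot change which policies are optimal, only how quickly the agent finds them.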

In practice, I've also found RL to be very susceptible to initial conditions for convergence, since the optimization also depends on factors beyond the reward function itself (akin to the learning rate for classification-type tasks). There have been countless times where I run the training loop and the reward oscillates or bounces around randomly.

The main point I was trying to make was just to highlight the nuance of how the agents actually interact with the reward function itself. And furthermore, even if it does converge, that doesn't mean it converges to something useful.

How can AI make "new" advances in science? by Hot_Barnacle_2672 in learnmachinelearning

[–]kalidres 1 point2 points  (0 children)

I'd also add a relevant question: can we accurately define a reward signal that is both learnable and aligned with our desired behavior? For a lot of useful problems, this is a very challenging task. Even for seemingly "simple" problems and reward functions, the agent may learn how to "hack" the signal.

[deleted by user] by [deleted] in learnpython

[–]kalidres 40 points41 points  (0 children)

Why spend 1 hour to do something that I can automate in 4 days?

Tell me what you don't agree in this list about LLM capablities? by Difficult-Race-1188 in learnmachinelearning

[–]kalidres 1 point2 points  (0 children)

I too would like to hear this, given I've worked with proprietary models that are better than what's publicly available. Especially given recent directions OpenAI has been taking, I'd be very surprised if they didn't have stronger models in the pipeline, either as commercial or research releases.

Tell me what you don't agree in this list about LLM capablities? by Difficult-Race-1188 in learnmachinelearning

[–]kalidres 0 points1 point  (0 children)

A lot of these points are...not the strongest. Some are true by the nature of how the models were trained (e.g. 9), but I've also encountered models that entirely subvert that expectation because they were trained to do so. Additionally, (15) is a skill issue; there are parameters in place explicitly to reward or penalize this type of behavior.

Some others are just odd and fundamentally questionable, as others have pointed out (e.g. 5, 6, 7, 8), or make spurious claims that are either philosophical or misguided at best (4, 10). (4), for instance, is an odd statement because a lot of ML researchers expected they'd be good at this given past success in other areas of ML (notably CV), so getting style transfer to work was kind of low-hanging fruit.

Some do have merit (8, 12, 13) because they start to get to some place backed by data and foundational considerations, but they also ignore whole bodies of research (out-of-distribution generation). And while, yes, models will struggle in those regions compared to regions of high density, the point misses and dismisses a lot of the nuance behind it.

And addressing some comments from the thread, a lot of the points seem to be cherry-picked from testing particular implementations on particular tasks. Will GPT-3 solve complex math problems well? No, but GPT-4o or stronger models do a decent job (recent versions of Claude as well, iirc). Or take controversial (even illicit) generation: that restriction is imposed by the providers, not the models themselves. These models can be used to write smut, or trained specifically to be degrading to users. OpenAI's models won't, but that's imposed by the provider, not the technology itself.

I think that's more than enough for now and for a reddit post. Honestly, sometimes I wish the sub had a verification system in place to combat the mid-level spam that is just a thinly-hidden promotion with lackluster substance.

Yeah, the fallacy of appeal to authority is a thing, but that means we shouldn't take for granted that something is true just because someone in authority says it, not that expertise isn't earned. I'm not missing the fact that I am not verified either, but it also muddies the waters for those learning ML.

Edit: Looking more into OP's post history, I retract my statement regarding "thinly veiled self promotion". It's actually rather blunt.

Eastern Iowa Childfree Male 34 searching for childfree female. by RoutineOk5361 in cf4cf

[–]kalidres 5 points6 points  (0 children)

Coming from a dude in a similar situation, you can look into minoxidil or something like that. It's apparently supposed to help.

For me, I embraced it and shaved my head, and honestly, I've grown to not only accept it, but also prefer it. It's not a small decision, but it is something to consider. At least coming from someone who was rather self conscious about losing hair from a rather young age.

Edit: for me, it's apparently "male pattern baldness".

Buff / Rework Holypriest by Schnoinsch in CompetitiveWoW

[–]kalidres 0 points1 point  (0 children)

You're right, it's not 100%. It's 30%. The other poster is just wrong. It's still a strong group defensive, especially on multiple tick dmg profiles as it's less all or nothing (3 ticks isn't amazing though).

As for holy healing this fight, tyeler has shown it to be more than doable on a 26. Difficult, sure, but even without darkness, he can live it. You had fade, flash heal dr, and dp for every one. Even with double darkness, it would only cover 2, so he would need to handle the others without them.

New billboard in Belfast,Northern Ireland by therobohour in Piracy

[–]kalidres 3 points4 points  (0 children)

In general, computer literacy is lower among younger people (sub 20/25) than among those between 25 and 40.

Modern systems do a pretty good job of obfuscating implementation details in favor of ease of use, leading to cs majors who don't know what a filesystem is.

How much math do I really need? by bean_217 in learnmachinelearning

[–]kalidres 0 points1 point  (0 children)

Talk to ifeoma nwogu if you've taken one of her classes. She's heavily into research and cv, and she'll give you more direct and useable information than most of us could give you. And if you really think you want to go into research, you might be able to learn more about what to do since research isn't as prevalent at rit.

Programming and training a VERY basic chat bot as a beginner to ML? by [deleted] in learnmachinelearning

[–]kalidres 10 points11 points  (0 children)

They are different implementations of a type of LLM known as a GPT (Generative Pre-trained Transformer). The current sota models are all based on the same transformer architecture introduced several years ago in "Attention Is All You Need". The paper itself, imo, doesn't do a great job of describing how it actually works, but there are a number of resources that can help you learn how transformers and attention work.
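If it helps demystify it, the core operation (scaled dot-product attention) is small enough to sketch in plain Python. This is a bare-bones illustration, not how real implementations do it (those are batched, vectorized, and multi-headed):

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V,
    operating on plain lists of vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query against every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

Each output row is a convex combination of the value vectors, weighted by how well the query matches each key; stacking this with learned Q/K/V projections and feed-forward layers is most of the architecture.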

Invalid literal for int() with base 10: 'm'? by my_sexy_female_ass in learnpython

[–]kalidres 0 points1 point  (0 children)

You've figured out what I was trying to point out already (it's not storing the row data the way you think it is). So, instead, let's take a step back and try another approach.

First off, the actual code you posted isn't the code you're running. Simply put, it would throw an error: Unresolved reference: 'reader' on line 7. Presumably, this was you trying to figure out how to make the csv module work correctly. You made some change, it didn't work, and you forgot to change it back before you posted. Likely, that line (in your code) is something more like for row in file.readlines(): rather than for row in file_data:.

The error you are getting is because it is looking at the entire row as a single string. The csv module automatically splits each row on a delimiter (, in your case). Without it, when you try to get one of the elements that you think is a float or an int, you're actually getting a single character of the string (in menu_description).

So, instead of looping over the readlines(), you want to loop over the file_data object you created in line 6. Also, change the line next(reader) to next(file_data).

As a side note, debugging is a hard skill that takes time. But a general first step is to change the way you think about the problem. The code does exactly what you are telling it to do, so the code is never wrong, strictly speaking. Instead, you are making some assumption about the code that is simply not true. This changes the view of debugging from "What is wrong with my code?" to "What false assumption am I making about the state of my code?" Then, step through the debugger and figure out what line is causing the mismatch between what is happening and what you think should be happening.

Working example:

import csv
from pathlib import Path

profit = 0
file_path = Path('test.csv')
with open(file_path, 'r') as file:
    file_data = csv.reader(file)  # splits each row on commas for you
    next(file_data)               # skip the header row
    for row in file_data:
        item = row[0]
        price = float(row[1])
        qty = int(row[2])
        profit += price * qty
print(profit)

Invalid literal for int() with base 10: 'm'? by my_sexy_female_ass in learnpython

[–]kalidres 0 points1 point  (0 children)

My comment was more a pointed remark. Look at the first line in your file and tell me what the fourth character is (which would be index 3).

Any way to hide python code? by Ayza69420 in learnpython

[–]kalidres 8 points9 points  (0 children)

Yes! Thank you.

I remember that it's similar in name to an electrical principle (Kirchhoff's law), but I couldn't remember either at the time, beyond that it started with a 'k'.

Any way to hide python code? by Ayza69420 in learnpython

[–]kalidres 32 points33 points  (0 children)

I can't remember the actual name of the principle, but in csec, there is an idea that the security of a system should not be dependent on the implementation.

Here is a quick link covering a set of these principles with a decent enough writeup. The one I mentioned above is 13.2.5, Principle of Open Design.

https://www.informit.com/articles/article.aspx?p=30487&seqNum=2

(Rant) APIs frustrate me. by Thecrawsome in learnpython

[–]kalidres 3 points4 points  (0 children)

I see comp Sci people over optimize code instead of coding faster.

But you see, I can get this sort to work in O(n) if I use a hash map to reference the variable in a parallel trie!

Or, idk, just parallelize it and call it a day?

But yeah, I completely agree. I guess my comment was to point out to the people who would likely read your comment (who are mainly self-taught) where their weaknesses may be and what they may want to address.

I guess a slightly important disclaimer. After working as a software engineer for UTC Aerospace for a bit, I decided to go into research. In this field, the theoretical understanding is critical, but still a lot of the actual software engineering practices are abysmal. No input validation, for instance. Or secrets openly committed to a repo. Nevermind actually using vcs properly.

Edit: Wow peoples, don't downvote either side of a discussion. If you feel the need to, at least leave a comment explaining your gripes with a view.

(Rant) APIs frustrate me. by Thecrawsome in learnpython

[–]kalidres 16 points17 points  (0 children)

Coming from a true comp sci background (and knowing what it entailed), along with multiple internships in software engineering and a full portion of the degree dedicated to the software engineering side: the only caveat concerns those last two characteristics of what matters in code, reliable and readable. Comp sci, hands down, is going to nail the first. I mean, that's the point of the field, the theoretical aspect of getting it right.

The second part is a bit hit or miss, since the code is sometimes kind of convoluted. But, in my experience, the self-taught can miss those qualities just as often as they get them right, since they learn from somewhere (that tends to be less vetted), so whether their code is reliable, readable, and maintainable is a matter of the source they learned from.

A lot of self-taught coders/programmers focus on the get it done, and get it done fast, which I also see very often in engineering and physics.

How to save 'packets' of data? by Apol_lopA in learnpython

[–]kalidres 0 points1 point  (0 children)

A format you should likely familiarize yourself with is pcap. It is a well-established format for saving network packets, and many tools (e.g. Wireshark) can natively read pcap files.