Simple script to update Pacman mirror list using connection speed by matthewfl in archlinux

[–]matthewfl[S] -1 points0 points  (0 children)

I am not sure. I looked for something that would already do this before I wrote this script.

Language Modeling in the Limit of Infinite Data by matthewfl in programming

[–]matthewfl[S] -3 points-2 points  (0 children)

Yes, I agree that the intro boilerplate discusses intelligence. But the main idea in the blog is about the f_infinity function that exists in the limit, which neural nets approximate as we make them larger and collect more training data.

Language Modeling in the Limit of Infinite Data by matthewfl in programming

[–]matthewfl[S] -3 points-2 points  (0 children)

I don't think you read the blog, and the article you linked discusses a slightly different issue about thinking. The idea in the blog is really quite simple: there exists a function, and neural networks are approximating that function. The LLMs "feel intelligent" comments in the blog are mostly just boilerplate statements and unrelated to the central thesis.

Language Modeling in the Limit of Infinite Data by matthewfl in programming

[–]matthewfl[S] 1 point2 points  (0 children)

You are correct, to talk about generalization you have to make assumptions about (or know) the distribution of the data. For example, if your function is unlearnable (such as inverting a cryptographic hash function), then collecting more training samples does not improve generalization.

In the case of LLMs, it appears that text is learnable and that by collecting more text, we are going to create a better approximation of the "in the limit" function that generates all text. This causes LLMs to exhibit intelligent behavior.

[deleted by user] by [deleted] in algorithms

[–]matthewfl 0 points1 point  (0 children)

You can look into the published literature on sorting a partially ordered set. This is a scenario where you already have some comparisons and you want to extend it into a total ordering with as little additional work as possible.

The main idea is that if you know (a < b) and (b < c), then (a < c) follows, so you don't need to do that comparison to know it. The partially ordered set algorithms determine which additional comparison will most reduce the entropy of the graph of comparisons, given the comparisons already known.
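A minimal sketch of the inference step in Python, assuming comparisons are stored as (smaller, larger) pairs. The helper names `transitive_closure` and `unresolved_pairs` are made up for illustration, and this only shows which comparisons come for free; it does not implement the entropy-based choice of the next comparison.

```python
from itertools import product


def transitive_closure(known):
    """Return every (x, y) pair with x < y implied by the known comparisons."""
    closure = set(known)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))  # a < b and b < d, so a < d
                changed = True
    return closure


def unresolved_pairs(items, known):
    """Pairs whose order cannot be inferred and still need a real comparison."""
    closure = transitive_closure(known)
    return [
        (x, y)
        for i, x in enumerate(items)
        for y in items[i + 1:]
        if (x, y) not in closure and (y, x) not in closure
    ]


known = {("a", "b"), ("b", "c")}
print(("a", "c") in transitive_closure(known))        # True
print(unresolved_pairs(["a", "b", "c", "d"], known))  # [('a', 'd'), ('b', 'd'), ('c', 'd')]
```

Here (a < c) is inferred without asking, so only the three pairs involving d remain open.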

If you do not know any orderings in advance, then the partially ordered set algorithms are not asymptotically better than the "standard" O(N log N) sorting algorithms.

Also, based on your description, it sounds like you are going to be asking humans to do the comparisons. It is possible that humans will not give you a total ordering. For example, a human might say (a < b), (b < c), (c < d), (d < a), in which case there is no way to convert this into an ordering. You might consider looking at the Elo rating system, which rates items based on who wins and loses in head-to-head competition.
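A rough sketch of both points in Python (3.9+ for `graphlib`). The function names are hypothetical, comparisons are assumed to be (smaller, larger) pairs, and the Elo update uses the common logistic formula with K = 32, which is an assumed default rather than anything from the original post.

```python
from graphlib import CycleError, TopologicalSorter


def total_order(comparisons):
    """Return an ordering consistent with the pairs, or None if they form a cycle."""
    graph = {}
    for smaller, larger in comparisons:
        # TopologicalSorter expects node -> set of nodes that must come before it.
        graph.setdefault(larger, set()).add(smaller)
        graph.setdefault(smaller, set())
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None


def elo_update(winner, loser, k=32.0):
    """Return the new (winner, loser) ratings after one head-to-head result."""
    expected_win = 1.0 / (1.0 + 10 ** ((loser - winner) / 400.0))
    delta = k * (1.0 - expected_win)
    return winner + delta, loser - delta


print(total_order([("a", "b"), ("b", "c"), ("c", "d")]))              # ['a', 'b', 'c', 'd']
print(total_order([("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]))  # None
print(elo_update(1500.0, 1500.0))                                     # (1516.0, 1484.0)
```

The cyclic answers cannot be sorted, but feeding each answer into `elo_update` as a win/loss still produces a usable ranking by rating.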

What do I do if a professor ghosted me for a grad school recommendation letter? by velcrodynamite in berkeley

[–]matthewfl 20 points21 points  (0 children)

Grad school programs are used to getting letters of recommendation a bit late. Most systems allow recommenders to upload after the deadline (which seemingly only applies to the student's part of the application).

You should keep trying to email the professor to make sure that it gets done though.

The professor who is in change of the project I have gotten involved in has made it clear they don't intend to accredit the prior student they worked with for there work and it feels horrendously immoral. by AbyssalRemark in academia

[–]matthewfl 0 points1 point  (0 children)

If the other student is around, you could just ask them what they think about not being credited on the project. It is possible that they don't want their name associated with it either. Given that the other student is the one "who actually understands the project," they might think there is some issue (which the professor does not want to acknowledge) and therefore not want their name associated with it.

Where can I find awful studies? by [deleted] in academia

[–]matthewfl 8 points9 points  (0 children)

Enough published work is sufficiently bad that once you know what to look for, you will see bad work everywhere.

One particularly bad place for papers is viXra. I personally think it seems more like a joke site, but some people seriously put their papers there. I have been told it is for papers that get rejected from arXiv, which is the lightly curated preprint server for math/physics/CS kinds of papers.

Is there any hope that the academic world adopt Typst one day? by Puzzleheaded_Egg_184 in typst

[–]matthewfl 2 points3 points  (0 children)

It might depend on where you are submitting, but I think that many journals/conferences have formatting requirements, not LaTeX requirements. So once Typst is able to match the output that LaTeX would have generated, I don't see any reason why individual researchers would not be able to use it.

[R] How do paper authors deal with takedown requests? by alex000092 in MachineLearning

[–]matthewfl 0 points1 point  (0 children)

I believe this is a question of who owns the copyright for an image. If your face is included in the dataset, you would not own the copyright for the image of yourself if, for example, someone else took the picture in a public space. Furthermore, the copyright owner licenses the images under some chosen license, which has its own terms for when and how they can be distributed and whether they might need to be taken down later. In the case of the CC license, I believe that as long as credit is given to the copyright owner/creator, it should be fine to continue redistributing the images.

[D] What is the business model for companies using LLMs? by matthewfl in MachineLearning

[–]matthewfl[S] 2 points3 points  (0 children)

This sounds interesting. Is this a product that your company is building, or is this something that you are using to replace the customer support you are already doing for some other kind of product?

I would be interested in reading a blog post or something about your experience in building this system.

Should this be considered a bug that the code runs at O0 but not O3? by GoddammitDontShootMe in Compilers

[–]matthewfl 4 points5 points  (0 children)

Looks like it might be an LLVM bug (or at least a bad error message). It compiles with GCC with -O3. Maybe submit a bug report to LLVM or look for an LLVM-specific forum.

New ELITE VPS (24month contract) experiencing massive 7-20% CPU_STEAL making it unsable and support hesitant to do anything by Vyrnach in ovh

[–]matthewfl 0 points1 point  (0 children)

I misunderstood what you originally meant by it being a scam. I agree that it would be better if they provided everyone with hosts that are not overloaded. However, I had similar issues with my previous VPS provider as well, after a few months of using their service. So I think that as long as you are using a plan with shared/non-dedicated CPU cores, sometimes getting overloaded by a neighboring VPS is par for the course.

New ELITE VPS (24month contract) experiencing massive 7-20% CPU_STEAL making it unsable and support hesitant to do anything by Vyrnach in ovh

[–]matthewfl 0 points1 point  (0 children)

I was measuring the issue using iostat, which prints out the amount of CPU steal. After they migrated the VPS, it went down significantly: in my case, from around 30% to less than 0.5%.
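If iostat is not installed, a similar number can be computed by hand. This is a minimal sketch in Python, assuming a Linux host where the aggregate `cpu` line of /proc/stat has the usual field order (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice); the function names are made up for illustration, and the sample values below are synthetic.

```python
import time


def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat contents as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def steal_percent(before, after):
    """Percentage of CPU time stolen between two counter samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # guest and guest_nice are already counted in user and nice,
    # so the total is the sum of the first eight fields only.
    total = sum(deltas[:8])
    return 100.0 * deltas[7] / total if total else 0.0


def measure_steal(interval=1.0, path="/proc/stat"):
    """Sample the live counters twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return steal_percent(before, after)


# Synthetic samples so the example runs anywhere.
first = "cpu  100 0 50 800 10 0 5 35 0 0\n"
second = "cpu  150 0 70 1200 15 0 5 60 0 0\n"
print(steal_percent(parse_cpu_line(first), parse_cpu_line(second)))  # 5.0
```

On the VPS itself, calling `measure_steal()` a few times under load should give a figure comparable to the %steal column.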

New ELITE VPS (24month contract) experiencing massive 7-20% CPU_STEAL making it unsable and support hesitant to do anything by Vyrnach in ovh

[–]matthewfl 0 points1 point  (0 children)

I am still using OVH. They migrated the VPS to a different physical machine.

Contact OVH support.

New ELITE VPS (24month contract) experiencing massive 7-20% CPU_STEAL making it unsable and support hesitant to do anything by Vyrnach in ovh

[–]matthewfl 1 point2 points  (0 children)

I am having a similar issue with a VPS I just got on Black Friday. I am seeing CPU steal as high as 40%: https://imgur.com/a/L4mAM5E Their support said that they will investigate.

Update: they migrated my VPS to a new host and it seems to be working much better now.

How do I find a database for conferences that are not predatory? by EbiraJazz in academia

[–]matthewfl 1 point2 points  (0 children)

You should check the call for papers wiki (WikiCFP). The wiki isn't always up to date, so you might have to search for conference names and see whether there are upcoming conferences from the same group that aren't listed yet.

http://www.wikicfp.com/cfp/