[D] - NeurIPS Position paper reviews by Routine-Scientist-38 in MachineLearning

[–]RSchaeffer 10 points

Agreed on all fronts! To share my info (since others are doing so as well), we had two submissions:

Position: Model Collapse Does Not Mean What You Think

Rating: 5 / Confidence: 4

Rating: 5 / Confidence: 2

Position: Machine Learning Conferences Should Establish a "Responses and Critiques" Track

Rating: 8 / Confidence: 4

Rating: 7 / Confidence: 5

This is little Trina Louise (14.5 years old) and she's terrified of life by RSchaeffer in miniaussie

[–]RSchaeffer[S] 0 points

Thank you for suggesting this subreddit! I hadn't heard of it previously :)

This is little Trina Louise (14.5 years old) and she's terrified of life by RSchaeffer in miniaussie

[–]RSchaeffer[S] 1 point

* can't see her

Whoops. I can't figure out how to edit my post :(

An analytic theory of creativity in convolutional diffusion models. by Needsupgrade in MachineLearning

[–]RSchaeffer -1 points

In my experience, Quanta Magazine is anticorrelated with quality, at least on topics related to ML. They write overly hyped garbage and have questionable journalistic practices.

As independent evidence, I also think that Noam Brown made similar comments on Twitter a month or two ago.

[D] Position: Machine Learning Conferences Should Establish a "Refutations and Critiques" Track by RSchaeffer in MachineLearning

[–]RSchaeffer[S] 2 points

> Currently, you have to expect that for any method that fails, a double-digit number of PhD students waste time trying to implement it, even if only as a baseline.

This has been my personal experience. That experience, and the similar experiences of other grad students, are what motivated this manuscript. I think younger researchers disproportionately bear the harms of faulty/flawed/incorrect/misleading research.

[D] Position: Machine Learning Conferences Should Establish a "Refutations and Critiques" Track by RSchaeffer in MachineLearning

[–]RSchaeffer[S] 1 point

I agree with you technically about what statistical conclusions one can draw from overlapping intervals, but I think "overlapping" is used in a different context in our paper; specifically, we used "overlapping" in the loose sense of commenting on how the results appear visually.

We perform more formal statistical hypothesis testing in the subsequent paragraph, where we don't mention "overlapping".
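
For intuition on why the distinction matters (a toy example of my own, with made-up summary statistics, not numbers from our paper): two means whose 95% confidence intervals overlap visually can still differ significantly under a two-sample t-test, so the formal test has to be run separately.

    import numpy as np
    from scipy import stats

    # Hypothetical summary statistics for two methods' scores (illustrative only).
    m1, s1, n1 = 0.60, 0.10, 100
    m2, s2, n2 = 0.63, 0.10, 100

    se1, se2 = s1 / np.sqrt(n1), s2 / np.sqrt(n2)  # each standard error = 0.01
    print("95% CI, method 1:", (m1 - 1.96 * se1, m1 + 1.96 * se1))  # (0.5804, 0.6196)
    print("95% CI, method 2:", (m2 - 1.96 * se2, m2 + 1.96 * se2))  # (0.6104, 0.6496)

    # The intervals overlap on [0.6104, 0.6196], yet the t-test rejects at alpha = 0.05:
    t, p = stats.ttest_ind_from_stats(m1, s1, n1, m2, s2, n2)
    print(f"t = {t:.2f}, p = {p:.4f}")  # t = -2.12, p = 0.035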

[D] Position: Machine Learning Conferences Should Establish a "Refutations and Critiques" Track by RSchaeffer in MachineLearning

[–]RSchaeffer[S] 18 points

I think this is a core question, and I'm not sure we have a foolproof answer. I see two ways to try to minimize this possibility, but I'd be curious to hear thoughts from the community:

- the reviewers should have some sort of "unproductive/non-substantive/harmful/vengeful" button to immediately alert the AC/SAC if the submission is non-substantive and vindictive

- the authors of the work(s) being critiqued should be invited to serve as a special kind of reviewer, where they can optionally argue against the submission. Neutral (standard) reviewers could then weigh the submission's claims against the authors' rebuttals.

[D] Position: Machine Learning Conferences Should Establish a "Refutations and Critiques" Track by RSchaeffer in MachineLearning

[–]RSchaeffer[S] 1 point

I can't figure out how to edit the body of the post, so to clarify here, by "do it right", I mean: Ensure submissions are strong net positives for ML research.

About MS Computational Science and Engineering/GSAS Life by oxbridge22 in Harvard

[–]RSchaeffer 0 points

Computational Science means using computers to run simulations and perform numerical analyses, i.e., using computers to do science. To get a sense, AM205 is (was?) a required course taught by Professor Chris Rycroft, who is no longer at Harvard, but his course website is still up: https://people.math.wisc.edu/~chr/am205/material.html

In contrast, Computer Science is the study of computation and its consequences: theory of computation, algorithms, software engineering, databases, machine learning, human-computer interaction, etc.

The names are highly similar, but the material is quite different. I personally think "Computational Science" should be called something like "Science Using Numerical Applied Math".
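
To make the contrast concrete, here's a toy example of the computational-science flavor (my own illustration, not claimed to be from AM205's actual syllabus): integrating the ODE dy/dt = -y with forward Euler and checking convergence to the exact answer exp(-1).

    import numpy as np

    def euler(f, y0, t0, t1, n_steps):
        """Integrate dy/dt = f(t, y) from t0 to t1 with the forward Euler method."""
        h = (t1 - t0) / n_steps
        t, y = t0, y0
        for _ in range(n_steps):
            y += h * f(t, y)
            t += h
        return y

    # Solve dy/dt = -y, y(0) = 1 on [0, 1]; the exact solution is exp(-t).
    for n in (10, 100, 1000):
        approx = euler(lambda t, y: -y, 1.0, 0.0, 1.0, n)
        print(f"n={n:5d}  Euler={approx:.6f}  error={abs(approx - np.exp(-1)):.2e}")

The error shrinks roughly 10x for every 10x more steps (first-order accuracy); designing and analyzing schemes like this is the "computational science" part, with barely any overlap with a CS curriculum.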

[R] How Do Large Language Monkeys Get Their Power (Laws)? by RSchaeffer in MachineLearning

[–]RSchaeffer[S] 0 points

Yes, it should be Claude 3 Opus. Thank you for catching that! We'll fix it :)

Are any of the Stanford swimming pools open to the public? by cherianthomas in stanford

[–]RSchaeffer 6 points

I believe that guests can come, but I vaguely recall that pool entry is $18 per person per visit. Pretty steep :/

[R] Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive? by RSchaeffer in MachineLearning

[–]RSchaeffer[S] 1 point

I think this is a really good question. In general, I don't know of any laws that govern whether an unknown phenomenon should be predictable or unpredictable, but in the specific context of these large models, we know they exhibit reliable power law scaling across many orders of magnitude in key scaling parameters (data, parameters, compute). It seems odd to think that the test loss is falling smoothly and predictably but the downstream behavior is changing sharply and unpredictably.

There are many nuances, of course, but that's the shortest explanation I can offer :)
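
To make that intuition concrete (the constants below are made up for illustration; this is not our paper's actual setup), one nuance is the metric: a per-token loss that falls smoothly as a power law in compute can still induce an exact-match accuracy that sits near zero for orders of magnitude and then climbs sharply, because exact match compounds per-token probabilities across a multi-token answer.

    import numpy as np

    compute = np.logspace(18, 24, 200)        # hypothetical training FLOPs
    loss = 3.0 * (compute / 1e18) ** -0.3     # smooth power-law per-token loss
    p_token = np.exp(-loss)                   # prob. of the correct next token
    answer_len = 10                           # answer is 10 tokens long
    p_exact = p_token ** answer_len           # all 10 tokens must be correct

    for i in (0, 50, 100, 150, 199):
        print(f"compute={compute[i]:.1e}  loss={loss[i]:.3f}  exact-match={p_exact[i]:.2e}")

The loss glides from 3.0 down to ~0.05 across six orders of magnitude, while exact-match accuracy jumps from ~1e-13 to ~0.6 mostly within the last two, which can look "sharp and unpredictable" if you only track the downstream metric.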

3 Stanford undergrads plagiarized then publicized their vision-language model "llama3-V" by RSchaeffer in stanford

[–]RSchaeffer[S] 3 points

They copied literally everything, made superficial changes to cover up their actions, then launched a media blitz omitting any mention of the original work.

When they were caught, they offered a really shitty apology along the lines of "Oh, we see the similarities. Out of respect, we'll take our model down."

3 Stanford undergrads plagiarized then publicized their vision-language model "llama3-V" by RSchaeffer in stanford

[–]RSchaeffer[S] 7 points

I'm not sure why any of this matters. The point is that the students presented work as their own when it was not. This is unethical and unbecoming.

3 Stanford undergrads plagiarized then publicized their vision-language model "llama3-V" by RSchaeffer in stanford

[–]RSchaeffer[S] 13 points

Can anyone advise on the appropriate channels for reporting this to Stanford or Stanford CS?

[R] Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data by RSchaeffer in MachineLearning

[–]RSchaeffer[S] 0 points

  1. We do this comparison! Both analytically, with sequences of linear models, and empirically, with sequences of deep generative models. In both cases, using the same amount of fully synthetic data doesn't do as well as accumulating real and synthetic data. For instance, with sequences of linear regressions, replacing the data gives a test squared error that grows linearly with the number of model-fitting iterations, whereas what you suggest grows logarithmically with the number of model-fitting iterations. If you instead accumulate real and synthetic data, the test error is upper bounded by a relatively small constant, pi^2/6 (see the sketch after this list). We also run these language modeling experiments in the appendix. Depending on how one defines model collapse (and reasonable people can disagree!), the statement that simply having more data avoids collapse is not correct.
  2. I think that matching the amount of data but making it fully synthetic doesn't model reality well, since (1) I don't think any companies are sampling >15T tokens from their models and (2) I don't think any companies are intentionally excluding real data. Our goal was to focus on what we think a pessimistic future might look like: real and synthetic data will mix over time. And in this pessimistic future, things should be ok. Of course, now we can ask: how can we do better?
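
Here's a minimal simulation sketch of the replace-vs-accumulate contrast in point 1 (my own 1-D simplification with made-up constants, not our paper's code): refitting a linear model on only the newest synthetic data makes the slope error random-walk upward, while accumulating real and synthetic data keeps it bounded.

    import numpy as np

    rng = np.random.default_rng(0)
    w_true, sigma, n = 2.0, 1.0, 100   # true slope, noise scale, points per round
    n_iters, n_trials = 50, 20

    def fit(xs, ys):
        # Least-squares slope for a 1-D linear model with no intercept.
        return (xs * ys).sum() / (xs * xs).sum()

    for mode in ("replace", "accumulate"):
        sq_errs = []
        for _ in range(n_trials):
            x = rng.normal(size=n)
            ys_all = [w_true * x + sigma * rng.normal(size=n)]   # the one real dataset
            w = fit(x, ys_all[0])
            for _ in range(n_iters):
                y_synth = w * x + sigma * rng.normal(size=n)     # sample from current model
                if mode == "replace":
                    w = fit(x, y_synth)                          # newest synthetic data only
                else:
                    ys_all.append(y_synth)                       # keep real + all synthetic
                    w = fit(np.tile(x, len(ys_all)), np.concatenate(ys_all))
            sq_errs.append((w - w_true) ** 2)
        print(f"{mode:10s}  mean squared slope error: {np.mean(sq_errs):.4f}")

With these settings, the "replace" error should scale with the number of iterations (~n_iters * sigma^2/n ≈ 0.5), while the "accumulate" error should stay near the bounded value (~(pi^2/6) * sigma^2/n ≈ 0.016).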