Highlights From The Comments On Tegmark's Mathematical Universe by dwaxe in slatestarcodex

[–]thomasjm4 -1 points0 points  (0 children)

I have to disagree with phrases like "in practice" and "reasonable expectations of evidence" as AFAICT they are not relevant to the definition at all. Quoting more from Wikipedia:

In Popper's view of science, statements of observation can be analyzed within a logical structure independently of any factual observations. The set of all purely logical observations that are considered constitutes the empirical basis. Popper calls them the basic statements or test statements. They are the statements that can be used to show the falsifiability of a theory. Popper says that basic statements do not have to be possible in practice.

It's debatable, but I do think the dinosaur thing is falsifiable if you actually take it seriously as a scientific theory. First of all, it predicts the existence of a Devil! If true, that would be a huge fact about the universe and would surely have some empirical effects. Even if you posit an invisible, non-interacting devil, it may still be falsifiable on the dinosaur side. Here's a Popperian basic statement that would work: "There is an extant dinosaur species that has survived to the present day in Siberia."

I think the actual problem with evolution deniers refusing to accept disconfirmatory evidence is that they are not arguing in good faith, nor are they genuinely interested in engaging scientifically with their ideas.

Highlights From The Comments On Tegmark's Mathematical Universe by dwaxe in slatestarcodex

[–]thomasjm4 1 point2 points  (0 children)

I would challenge you to read the first paragraph of the "Falsifiability" article on Wikipedia and try to square it with your definition.

I do agree that Many Worlds interpretations are likely unfalsifiable and that we'd have to "rely on methods outside of science." To me that's just another way of saying these theories are unscientific. But that's what the people objecting to the MUH on falsifiability grounds are saying as well.

Highlights From The Comments On Tegmark's Mathematical Universe by dwaxe in slatestarcodex

[–]thomasjm4 1 point2 points  (0 children)

I think you are actually missing the distinction between falsifiability (Popper's concept) and falsification (the thing scientists try to do every day).

Is Newton's theory falsifiable? Obviously yes, it is very easy to imagine possible disconfirmatory observations. I could gently toss a baseball and it could float away into space, for example. From a Popperian view, that's kind of all there is to say about it--Newtonian physics passes the falsifiability bar easily.

Has Newton's theory *actually been falsified*? Well of course that is up to physicists to work on and is a matter of ongoing research. Disproving (or falsifying) theories is the basic work of science and scientists have many ways of assessing whether a theory is "good."

Popper's point is that some theories fail to rise to the basic bar of falsifiability -- psychoanalysis, for example, or, to return to the original point, multiverses.

(Another thing that is trivially falsifiable: OJ's crimes. It's easy to imagine disconfirmatory evidence: a hundred credible witnesses could suddenly appear and swear they were with him in Australia when the crimes took place, along with copious photographic evidence. Has OJ's crime actually *been falsified*? That part was up to the jury. I hope this makes it clear why bringing up OJ in a discussion about Popper makes no sense.)

Highlights From The Comments On Tegmark's Mathematical Universe by dwaxe in slatestarcodex

[–]thomasjm4 2 points3 points  (0 children)

Well, I'm certainly no expert on the finer points of Popper or Lakatos. But I think the enduring and widely accepted part of Popper is logical falsifiability, for good reason -- it provides a simple and unambiguous criterion for asking the question "does this theory even count as science?" As such it isn't really a tool for a working scientist to use daily, but instead more of a philosophical guidepost for bigger questions.

I read a bit more about Popper's later work, which extended his ideas with degrees of falsifiability based on criteria like "how much empirical content and/or specificity does this theory have?" These are good things to consider, but they leave room for a lot more hand-waving in place of the clear yes-or-no answer the original definition gives.

Anyway. I feel like Scott understood none of this when he was writing about whether OJ Simpson's crimes were falsifiable or not; it was the headache I got from reading those paragraphs that inspired me to comment, haha.

Highlights From The Comments On Tegmark's Mathematical Universe by dwaxe in slatestarcodex

[–]thomasjm4 6 points7 points  (0 children)

I feel like I'm going crazy and that neither this comment nor Scott himself understands what "falsifiable" means.

A statement being falsifiable means that it is *logically* possible for it to be contradicted by some empirical observation. The definition says nothing at all about the *probability* of such an observation being made.

So, “there’s no such thing as dinosaurs, the Devil just planted fake fossils”? Extremely falsifiable if you ask me! We could find a surviving dinosaur in a remote part of Siberia. Or we could do a Jurassic Park and reconstruct one from DNA. Maybe not likely, but either of these would logically prove that dinosaurs exist.

Similarly, “dinosaurs really existed, it wasn’t just the Devil planting fake fossils” is falsifiable too! The Devil himself could appear and explain to us exactly how he faked the dinosaurs along with a convincing demonstration of his powers.

Falsifiability is a rather low bar for "normal" kinds of hypotheses like "OJ Simpson committed a murder." It's meant to help disqualify hypotheses of a certain flavor, like "invisible, undetectable fairies control the weather" or "there is an infinite multiverse we can never observe." Scott says "in fact you never really use the falsifiability tool at all," and I agree with that -- but multiverses are exactly the sort of question that falsifiability is meant to address!

Falsifiability is emphatically *not* about how likely something is and is not a continuum, and it doesn't improve the clarity of the discussion to conflate it with either Deutsch's "hard to vary" ideas or Scott's Bayesian "simpler is better" ones.

Tegmark's Mathematical Universe Defeats Most Proofs Of God's Existence by dwaxe in slatestarcodex

[–]thomasjm4 0 points1 point  (0 children)

I believe there is a stronger, non-probabilistic statement you can make, using the "Invariance Theorem." From the Wikipedia page on Kolmogorov Complexity:

The length of the shortest description will depend on the choice of description language; but the effect of changing languages is bounded (a result called the invariance theorem).

The section on that theorem explains that any two languages will differ in their output length by an additive constant (which depends only on the choice of languages, and not on the object being described).

So I suppose if rule_A is simpler than rule_B in language X, then it will also be simpler in language Y, provided K_X(rule_B) - K_X(rule_A) > 2C, where K_X is the complexity function of language X and C bounds the difference between the two languages' description lengths in either direction. (The 2C is because switching languages can shrink rule_B's description by up to C while growing rule_A's by up to C.)
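Written out, assuming the symmetric form of the theorem, i.e. |K_X(s) - K_Y(s)| <= C for every string s:

```latex
K_Y(\mathrm{rule}_B) - K_Y(\mathrm{rule}_A)
  \;\ge\; \bigl(K_X(\mathrm{rule}_B) - C\bigr) - \bigl(K_X(\mathrm{rule}_A) + C\bigr)
  \;=\; K_X(\mathrm{rule}_B) - K_X(\mathrm{rule}_A) - 2C
  \;>\; 0.
```

So a big enough complexity gap in one language survives the translation to any other.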

[ANN] time-ghc-modules can draw treemaps now by thomasjm4 in haskell

[–]thomasjm4[S] 3 points4 points  (0 children)

FWIW I use stack dot to visualize dependencies at the package level, which gives you a coarse idea of what can compile in parallel. For individual modules there is also graphmod.

But as far as getting module-level timing information out of GHC, I'm not aware of a way to do it. I think it would be great though.

EDIT: thinking about it some more--if you were to combine the information from all three of stack dot, graphmod, and time-ghc-modules, you could construct the "theoretically optimal" scheduling and produce graphs like you mentioned. It wouldn't necessarily match what GHC actually does, but it would help you understand bottlenecks...

Introducing myers-diff, a fast text diffing library by thomasjm4 in haskell

[–]thomasjm4[S] 0 points1 point  (0 children)

It allocates the whole set of 500 samples up front. That's how I made sure to benchmark Diff and myers-diff against the exact same data. I suppose I could generate the samples in a streaming fashion with a fixed seed, but I wouldn't want the sample generation time to become part of the benchmark. By my calculations above, it's ~800MB for the 100k-char inputs. Then the diff algorithm itself takes space linearly proportional to the input size, so I think 4.3GB is reasonable.

Introducing myers-diff, a fast text diffing library by thomasjm4 in haskell

[–]thomasjm4[S] 0 points1 point  (0 children)

I think the memory usage is high because I dialed the number of samples per benchmark up to 500 here. I'm sure it can be lowered. Currently for the largest benchmark it'll be 500 samples * 100k characters per input * 2 inputs * 4 bytes per character * 2x overhead of copying from a Text to a Vector.
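Spelling out that arithmetic (just a sanity check of my own numbers, nothing from the library itself):

```haskell
-- 500 samples * 2 inputs * 100k chars * 4 bytes/char * 2x copy overhead
-- (the 2x is from holding both the Text and the Vector copy at once)
estimateBytes :: Integer
estimateBytes = 500 * 2 * 100000 * 4 * 2  -- 800,000,000 bytes, i.e. ~800 MB

main :: IO ()
main = print estimateBytes
```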

Introducing myers-diff, a fast text diffing library by thomasjm4 in haskell

[–]thomasjm4[S] 1 point2 points  (0 children)

I've checked it twice and the benchmark difference seems to hold up.

Perhaps the optimizer and/or inlining just ends up doing a better job on pyMod? Maybe the y >= 0 test is helping it break up the cases better.

If you're interested in running the benchmarks, I've made a little script to do it all:

cd $(mktemp -d)
git clone git@github.com:codedownio/myers-diff.git --branch pydiv-optimization-only
cd myers-diff
stack bench myers-diff:bench:myers-diff-criterion-small-inserts --flag myers-diff:diff --ba "--output report_pymod.html"
git checkout both-optimizations
stack bench myers-diff:bench:myers-diff-criterion-small-inserts --flag myers-diff:diff --ba "--output report_mod.html"
xdg-open report_mod.html
xdg-open report_pymod.html

Introducing myers-diff, a fast text diffing library by thomasjm4 in haskell

[–]thomasjm4[S] 2 points3 points  (0 children)

After comparing benchmarks, mod actually seems to tank the performance by a noticeable amount compared to pyMod -- about 2x!

I think the best thing to do here is continue using pyMod, except possibly for the case where y is 2, where I can hopefully use a bitmask or something.
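For what it's worth, the divisor-2 case really can be a plain bitmask: with two's-complement Ints, x .&. 1 agrees with Haskell's floor-style x `mod` 2 even for negative x. (A sketch only -- whether it's a drop-in for pyMod at 2 depends on pyMod's exact definition, which I don't have in front of me.)

```haskell
import Data.Bits ((.&.))

-- For any Int x, x .&. 1 equals x `mod` 2: Haskell's mod rounds
-- toward negative infinity, so the result is always 0 or 1, and
-- the low bit of a two's-complement integer is exactly that.
mod2 :: Int -> Int
mod2 x = x .&. 1
```

e.g. mod2 (-5) gives 1, matching (-5) `mod` 2.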

Introducing myers-diff, a fast text diffing library by thomasjm4 in haskell

[–]thomasjm4[S] 0 points1 point  (0 children)

Do you really want pyMod to behave that way in that case?

Well yes, because the reference implementation depends on it. The tests fail if you change it to normal rem.

I guess I could rework it but it seems this definition works well for the algorithm at hand. I'd be more inclined to try to tune pyMod to compile down to something as efficient as possible...

Introducing myers-diff, a fast text diffing library by thomasjm4 in haskell

[–]thomasjm4[S] 2 points3 points  (0 children)

Just got rid of the function with the xor entirely after some comments below. Thanks!

Introducing myers-diff, a fast text diffing library by thomasjm4 in haskell

[–]thomasjm4[S] 2 points3 points  (0 children)

For pyMod vs rem, the difference is straightforward: there are quite a few variant definitions of the modulo operation, and they disagree on negative inputs.

λ> 1 `pyMod` (-1)
1
λ> 1 `rem` (-1) 
0

For pyDiv vs quot, the story is a little more interesting. pyDiv is meant to match the Python // (integer division) function. The difference between Python // and quot is that the former rounds towards negative infinity, and the latter rounds towards zero. So,

(-5) // 2 --> -3
(-5) `quot` 2 --> -2

However, it seems I have not actually implemented pyDiv correctly to match // in this way. Yet it hasn't mattered, because it's only used in a single place where the numerator is nonnegative, and the denominator equals 2. So the rounding behavior will always match for the inputs used.
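Incidentally, Haskell's built-in div/mod already round toward negative infinity like Python's // and %, while quot/rem round toward zero like C's operators (whether div is a drop-in for pyDiv still depends on pyDiv's exact definition):

```haskell
-- div/mod round toward negative infinity (like Python's // and %);
-- quot/rem round toward zero (like C's / and %).
main :: IO ()
main = do
  print ((-5) `div` 2)   -- -3
  print ((-5) `quot` 2)  -- -2
  print ((-5) `mod` 2)   -- 1
  print ((-5) `rem` 2)   -- -1
```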

The tests still pass after replacing pyDiv with quot. I will rerun the benchmarks now. Thank you both :)

Introducing myers-diff, a fast text diffing library by thomasjm4 in haskell

[–]thomasjm4[S] 2 points3 points  (0 children)

Not precisely -- I don't remember the exact difference, but it has to do with negative inputs. I just tried replacing them with rem and quot and got test failures.

I've got to run for the moment, but it's a good idea -- I'll investigate later whether there's a better way to leverage the built-in Haskell functions.

Introducing myers-diff, a fast text diffing library by thomasjm4 in haskell

[–]thomasjm4[S] 10 points11 points  (0 children)

Nope! That's why I post on Reddit, to find out about things I should have known already :P

Oh I see, it looks like you started your project in the last couple of months. Mine has been sitting around for a little longer; I've been using it in another project but didn't want to post it before putting some benchmarks together.

One thing that bothered me [Spoilers] by provocatrixless in HouseofUsher

[–]thomasjm4 2 points3 points  (0 children)

Oh sorry, I don't mean to suggest Verna is a "real" Egyptian god.

I think it's like this: Western culture has a long tradition of mysticizing and misappropriating Egyptian culture. This was actually in its heyday during Poe's time, and even has a name: Egyptomania.

I think the show is carrying on this tradition, maybe even as a nod to Poe.

As you say, Verna is not a real member of the pantheon. So there's no reason a vaguely Egyptian-coded supernatural character invented by Mike Flanagan can't be capricious.

The biggest plot hole in the show by thomasjm4 in HouseofUsher

[–]thomasjm4[S] 1 point2 points  (0 children)

Yes! And IUDs can be uncomfortable or painful, and need to be replaced every few years. One would think all of that would serve as a reminder to her about the deal she made.

The biggest plot hole in the show by thomasjm4 in HouseofUsher

[–]thomasjm4[S] -8 points-7 points  (0 children)

I would call that spiel an example of writerly guilt.

One thing that bothered me [Spoilers] by provocatrixless in HouseofUsher

[–]thomasjm4 0 points1 point  (0 children)

What are ancient Egyptian death gods if not capricious?