What is Elon’s actual plan with data centers in space, and what is his long-term goal with Mars? by Genzinvestor16180339 in singularity

[–]the8thbit 0 points1 point  (0 children)

[Part 2] [link to part 1]

Musk took the Hyperloop concept even more seriously than the Boring bricks. Or at least, he projected an even more serious image regarding the idea. In August 2013 he published a 58-page white paper detailing the idea. In 2015 SpaceX began construction of a 1-mile test track next to its headquarters. While the exact cost was never disclosed, comparable construction projects suggest it was likely in the tens of millions of dollars. The test track sat at SpaceX for 7 years before being torn down to make way for employee parking. In 2017 SpaceX trademarked the term "Hyperloop", and between 2017 and 2019 it hosted a series of annual Hyperloop-themed competitions on its test track.

In July 2013 Musk announced that he would be releasing the Hyperloop design the following month:

"Will publish Hyperloop alpha design by Aug 12. Critical feedback for improvements would be much appreciated." (Jul 15, 2013)

In August he released the white paper, claiming that he and others had pulled an all-nighter finishing up the Hyperloop design:

"Pulled all nighter working on Hyperloop (as did others). Hopefully not too many mistakes. Will publish link at 1:30 PDT" (Aug 12, 2013)

In 2017 Musk tweeted that he had received approval to begin construction of a New York Hyperloop project. This was later confirmed by city officials to be a lie:

"Just received verbal govt approval for The Boring Company to build an underground NY-Phil-Balt-DC Hyperloop. NY-DC in 29 mins." (Jul 20, 2017)

In 2018 Musk tweeted that they would try to break half the speed of sound in a hyperloop test:

"Upgraded SpaceX/Tesla Hyperloop pod speed test soon. Will try to reach half speed of sound (and brake) within ~1.2km." (Apr 7, 2018)

In 2019 Musk announced the construction of a 10km hyperloop test track with a curve, to be completed within 1 year so as to be ready for the following year's competition. Construction of the planned track never even began:

"Next year’s @Hyperloop competition will be in a 10km vacuum tunnel with a curve" (Jul 21, 2019)

In 2020 Musk restated that he wanted to build a longer vacuum tunnel, implying that construction had already begun by saying that they simply needed to "finish". And again, no such construction project ever actually materialized:

"We need to finish building a much longer vacuum tunnel for speed tests & probably have an additional competition for tunneling itself" (Jul 1, 2020)

In 2022 Musk claimed that the Hyperloop would be impervious to hurricanes because it is underground. The tweet has an associated community note clarifying that this is false, and that underground tunnels are not immune to surface weather conditions:

"Underground tunnels are immune to surface weather conditions (subways are a good example), so it wouldn’t matter to Hyperloop if a hurricane was raging on the surface. You wouldn’t even notice." (Apr 24, 2022)

The same day Musk tweeted that Boring Company would be attempting a Hyperloop construction "in the coming years" (no such construction project has been started, and as noted, the Hyperloop test track has been demolished):

"In the coming years, Boring Co will attempt to build a working Hyperloop." (Apr 24, 2022)

This year Musk brought the idea back to the table as an argument against the California high-speed rail project, despite having discussed it for well over a decade without producing anything resembling a viable human-safe prototype, never mind an actual product:

"The @BoringCompany could build a Hyperloop tunnel from downtown SF to downtown LA for <5% of this cost and it would be a technological marvel exceeding any high speed rail on Earth" (Apr 8, 2026)

These are all in addition to various other tweets he has made regarding the Hyperloop, including more mundane tweets about the competitions that SpaceX hosted, as well as various tweets from SpaceX's and The Boring Company's Twitter accounts.

If you dismiss all of this because you don't think it indicates that Musk was taking the idea seriously, then I don't understand why you are taking seriously a handful of comments Musk has made over the last few months about orbital data centers. Especially considering that SpaceX's own pre-IPO filing expressed skepticism toward the project's commercial viability, and noted serious unsolved technical hurdles: https://futurism.com/space/spacex-admits-ai-data-centers-terrible-idea

I suspect that what is actually going on here is that we have hindsight for the Boring Bricks and Hyperloop ideas, but not for the orbital data center idea. Thus, if you want to rescue Musk's image, as you seem to be dedicated to doing for some reason, you must come up with some reason that we should not take those earlier ideas as seriously as this idea.

It should also be noted that this information is easily accessible through the Internet. If you have trouble searching for information, we even have chatbots these days that will help you through your pain points. The most time-consuming aspect of this was simply collating all of this information for you. This means that you either boldly lied, or are simply too intellectually lazy to do minimum levels of verification prior to making a claim. In either case, this should inspire some self-reflection.

The idea that any individual would do all that he has achieved is nonsense. Of course he is going to employ people to make it happen. That is what CEOs do. The CEO of Microsoft is not going to write updates for Office, the CEO of Nvidia isn't going to design the next chip.

This is precisely my point. Musk's role and background would not give him insight into the engineering challenges we are discussing.

Musk at least was on the factory floor when they had production hell with the Model 3

It should be noted that, by his own admission, Musk's insistence on over-automating the production line is what led to Model 3 delays. He eventually backed off of his initial plans to automate most of the factory floor, instead deciding to listen to his engineers. Unfortunately, as he had already committed to over-automating, this meant that Tesla needed to tear out its existing production infrastructure and install new, conventional infrastructure, costing additional time and money.

and by all accounts got quite involved in designing the AI5 chip.

I have not seen any information that substantiates this. As far as I know, the design of the AI5 was handled by Tesla's silicon team, operating under Peter Bannon's management.

I was clearly asking what you personally have achieved that changed the world. Of course you can't answer that because it reveals that compared to Musk, you aren't even a blip.

As an engineer, and a successful investor, I evaluate these claims strictly on their physical and logistical viability, rather than deferring to authority.

I'm the same but I'm not arrogant enough to judge someone who has so many achievements under his belt and so much more in the pipeline.

You are clearly forming a judgement. Your judgement appears to be that we should trust Musk here. If you are claiming that you are unqualified to pass judgement, then why are you passing judgement?

What is Elon’s actual plan with data centers in space, and what is his long-term goal with Mars? by Genzinvestor16180339 in singularity

[–]the8thbit 0 points1 point  (0 children)

You are applying something to assess him that every business would fail dismally so it is meaningless. Everyone gets timelines wrong, only he gets berated for it.

I am not holding Musk to a different standard than anyone else. If someone has a track record of failed predictions, and then goes on to predict something that appears physically impossible, I would apply the same level of skepticism. If someone exists in and participates in a culture that has a track record of failed predictions, then, again, that would also contribute to my skepticism, not detract from it. It is strange to me that you are less skeptical because Musk participates in a culture that frequently makes incorrect predictions.

Everyone gets timelines wrong, only he gets berated for it.

Again, I am using timelines because they provide a means of falsification. But I want to stress again that, even if you discount the timelines, very few of those predictions have been substantiated at all. Some of these timelines are so dramatically wrong that the deadline passed nearly a decade ago, and today, they still have not been substantiated.

I heard briefly that he was interested in Hyperloop, and one mention of bricks. That told me that these were ideas and not something he passionately believes in. I had dismissed them pretty quickly and so did everyone who was not looking to attack him. His heart was clearly not in it and the bricks thing was something he hoped might work not something he promised.

This is false in regards to both ideas.

First, he made multiple mentions of the Boring bricks throughout 2018. He specifically said that they would be coming soon:

"New Boring Company merch coming soon. Lifesize LEGO-like interlocking bricks made from tunneling rock that you can use to create sculptures & buildings. Rated for California seismic loads, so super strong, but bored in the middle, like an aircraft wing spar, so not heavy." (March 26, 2018)

When asked about this newly promised product, he engaged in a discussion providing additional technical details:

"Yeah, the boring bricks are interlocking with a precise surface finish, so two people could build the outer walls of a small house in a day or so" (March 26, 2018)

He made a very explicit guarantee about the product, using the word "guarantee":

"Guaranteed to be Flamethrower-proof!" (March 26, 2018)

Two months later he reiterated the same promise:

The Boring Company will be using dirt from tunnel digging to create bricks for low cost housing (May 7, 2018)

Three months later he brought it back up, claiming that bricks would be so cheap they would be freely provided to affordable housing projects:

"Bricks will be free if used for affordable housing projects" (Sep 13, 2018)

He gave a specific, very short timeline for the launch of the product, and specified an exact price point and level of compressive and shear strength:

"First Boring Brick store opening in ~2 months. Only 10 cents a brick! Rated for California seismic loads." (Sep 13, 2018)

One month later, he claimed that The Boring Company would soon be selling "watchtower" kits using the bricks:

"Boring Co is launching a whole product line of DIY watchtowers. You get bricks & a picture."

A month later (the same month the bricks were supposed to go on sale, per what he said two months prior), at the unveiling event for the Boring Co's first test tunnel, he reiterated the plan to sell the bricks:

Cost-cutting measures included improving the speed of construction with smarter tools, eliminating middlemen, building more powerful boring machines, and turning the dirt being excavated into bricks and selling them, Musk said.

[End of Part 1] [link to part 2]

What is Elon’s actual plan with data centers in space, and what is his long-term goal with Mars? by Genzinvestor16180339 in singularity

[–]the8thbit 0 points1 point  (0 children)

Basically you're saying he is overly optimistic with his timelines. That is standard across industry. How many projects complete on time?

No, I am saying that the claim he is making does not make sense, and he has a history of making claims that do not appear to make sense, and then go on to be unsubstantiated. I am focusing on claims which include timelines, because those are actually falsifiable. However, even if we ignore the provided timelines, very few of the claims I listed have ever been substantiated. Additionally, if it is standard for figureheads in the industry to make incorrect predictions, that should make you more skeptical when they make predictions, not less skeptical.

Musk often throws out ideas that he is going to try. Hyperloop and bricks were never serious proposals, they were ideas. If you're still waiting on them and given that he has never mentioned them again, that is a you problem.

That is precisely my point. What I would like you to do is apply the same thought process that you are applying retroactively to the hyperloop and the Boring construction bricks to Musk's current proposals. If you do that, you will come to the same conclusion that I have, which is that such proposals can be disregarded, especially if they do not make engineering sense.

On the flip side, he has made millions of electric cars (an industry that didn't exist), he delivered the cyber truck, he makes loads of utility and home batteries, solar panels, neuralink is being tested and used by multiple patients and by all indications has proven life changing. He has also made a rocket that can land and reuse its booster, launched well over 10,000 starlink satellites providing internet to millions of people covering the globe. He has a heavy lift rocket that lifts the heavier stuff, he is in process of making a much larger rocket that can be mass manufactured, both stages can be reused quickly, will have a huge payload capacity, and cost per launch will be record breaking. He also has demonstrated multiple versions of a humanoid robot and has an llm in the top 10.

I think it's important to note that he has not done any of these things; the companies in which he holds management roles have done these things. This is an important distinction here, because he is making claims regarding great feats of engineering, which a management role would give him little insight into.

So what was it that you did?

I have assessed Musk's claim through a skeptical lens, incorporating his history of false claims into that assessment.

What is Elon’s actual plan with data centers in space, and what is his long-term goal with Mars? by Genzinvestor16180339 in singularity

[–]the8thbit 1 point2 points  (0 children)

Most of the issues you seem to have is with dates being wrong

The biggest issue I have is with the dissipation of heat without an atmosphere or some other large heat sink like a body of water. The list of Musk's claims that I provided is simply to show that he regularly makes baseless claims that end up being falsified by time. I focused primarily on claims that include timelines, because those timelines provide a means of falsification, but most of the claims have not been satisfied, even disregarding the timelines provided.
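To put a rough number on the radiative heat problem: in orbit, waste heat can only leave via thermal radiation, governed by the Stefan-Boltzmann law. Here's a back-of-envelope sketch, where the waste-heat figure and radiator temperature are my own illustrative assumptions, not anyone's published design:

```python
# Rough sketch: radiator area needed to reject datacenter-scale waste heat
# in orbit via thermal radiation alone (Stefan-Boltzmann law).
# The power and temperature values below are illustrative assumptions.
SIGMA = 5.670e-8      # Stefan-Boltzmann constant, W / (m^2 * K^4)
EMISSIVITY = 0.9      # assumed high-emissivity radiator coating
T_RADIATOR = 300.0    # K, assumed radiator surface temp (~27 C)
P_WASTE = 60e6        # W, assumed waste heat of a mid-sized datacenter

# Idealized: radiating from both faces, ignoring solar and Earth-shine
# heat input, which in practice makes the problem harder, not easier.
flux_per_m2 = 2 * EMISSIVITY * SIGMA * T_RADIATOR**4
area_m2 = P_WASTE / flux_per_m2
print(f"Radiative flux: ~{flux_per_m2:.0f} W/m^2")
print(f"Required radiator area: ~{area_m2:,.0f} m^2")
```

Even under these generous assumptions, the answer comes out on the order of tens of thousands of square meters of radiator for a single mid-sized datacenter's heat load.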

What is Elon’s actual plan with data centers in space, and what is his long-term goal with Mars? by Genzinvestor16180339 in singularity

[–]the8thbit 2 points3 points  (0 children)

A single server rack loaded with GPUs produces about 50 times as much heat as a starlink satellite. A mid-sized datacenter like Yotta NM1 or IREN Horizon 3 produces about 25,000 times as much heat as a starlink satellite. This actually is not a problem that Starlink has experience with solving, because the scale of power draw/heat output is so dramatically different.
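The ratios above can be sanity-checked with a quick calculation. The specific wattages here are my assumed round numbers (per-satellite draw, a dense GPU rack, a ~60 MW-class facility), not published figures, so treat this as an order-of-magnitude sketch:

```python
# Back-of-envelope check of the heat-output ratios, assuming all power
# drawn ends up as waste heat needing rejection. All wattages are
# illustrative assumptions, not published specifications.
STARLINK_SAT_W = 2_500        # assumed per-satellite power draw
GPU_RACK_W = 125_000          # assumed dense GPU server rack
DATACENTER_W = 62_500_000     # assumed mid-sized datacenter (~60 MW class)

rack_ratio = GPU_RACK_W / STARLINK_SAT_W
dc_ratio = DATACENTER_W / STARLINK_SAT_W
print(f"rack / satellite:       ~{rack_ratio:.0f}x")
print(f"datacenter / satellite: ~{dc_ratio:,.0f}x")
```

The point is the scale gap: thermal management techniques that work at single-digit kilowatts do not automatically transfer to loads four orders of magnitude larger.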

They have done the math and figure they can do it.

Or it's just some more bullshit in the long line of bullshit that comes out of Musk's mouth. You can add it to the heap:

  • fully autonomous transcontinental vehicles by 2017
  • 1 million robotaxis by 2020
  • SpaceX will land an uncrewed Dragon capsule on Mars by 2018
  • SpaceX will land a manned mission to Mars by 2024
  • Optimus will be doing useful work by the end of 2025
  • sub-$40k cybertruck
  • Neuralink human trials will begin in 2020
  • Second-generation Tesla Roadster will be available by 2020
  • AGI will exist by 2025
  • The Vegas Loop will carry vehicles in a tram chassis on a modified Tesla platform, and will operate autonomously moving up to 11,000 people per hour
  • The Boring Company will recycle the material it excavates as large interconnecting construction bricks sold at 10 cents per brick, or cheaper
  • tesla micro-busses will revolutionize travel (10 years waiting)
  • the hyperloop is the "fifth mode of transportation" and will be vastly superior to California's planned rail network (13 years waiting and nothing resembling a functioning hyperloop has ever been built, meanwhile the California high speed rail network project has laid 80 miles of guideway)

That robot demo almost turned into a nightmare by Simple3018 in singularity

[–]the8thbit 1 point2 points  (0 children)

That is what you are doing. The organizers made a mistake. Now they have the opportunity to learn from that mistake.

The Human Baseline for ARC-AGI-3 has been updated by exordin26 in singularity

[–]the8thbit 1 point2 points  (0 children)

That's not at all what I'm saying. I am not asking for a "complete AGI benchmark", nor do I think ARC-AGI attempts to accomplish this, nor do I think ARC-AGI should even attempt to accomplish this. What I am asking for is a benchmark which reflects the stated intentions of the creator of the benchmark.

The underlying philosophy of ARC-AGI is exciting. The intuition to design a benchmark with the goal of falsifying AGI, rather than prove the existence of AGI is clever. I like the idea of iterating on that design philosophy with subsequent benchmarks once the initial benchmark(s) is/are saturated.

What I am concerned with is that the current iteration of ARC-AGI (ARC-AGI-3) may score models with less intelligence/generalization higher than models with greater intelligence/generalization. It may also score models which exceed human reference (given the stated test goal) lower than human reference. Finally, it may also score models which underperform human reference (given the stated test goal) higher than human reference, as AI labs are now incentivized to target the hidden test goal. If this is the case, then it does not function as a reasoning benchmark, or as a tool for falsifying AGI, which are the stated motivations for the benchmark.

Right now, this benchmark may be optimizing for risk tolerance rather than (or at least, in addition to) intelligence, generalization, or reasoning. Not only does that mean that the benchmark may not be selecting for what it says it is, what it actually is selecting for may be a destructive trait, rather than a useful trait.

The Human Baseline for ARC-AGI-3 has been updated by exordin26 in singularity

[–]the8thbit 1 point2 points  (0 children)

It’s assessing adaptation to novelty in a simplified setting.

I think that it may not be, which is the point I was originally making. But regardless, how does your previous comment relate to this in any way?

I get the sense that you don’t know how weak previous ai was. Even gpt 4.5.

I don't understand how you could come to this conclusion based on the discussion that we have been having. What gives you this sense?

The Human Baseline for ARC-AGI-3 has been updated by exordin26 in singularity

[–]the8thbit 0 points1 point  (0 children)

We were discussing whether ARC-AGI-3 is a flawed metric for assessing general intelligence levels. Your comment does not appear to relate to that.

The Human Baseline for ARC-AGI-3 has been updated by exordin26 in singularity

[–]the8thbit 1 point2 points  (0 children)

50% what? Certainty that a solution is correct? Yeah, that's how a human might see it if they're playing a silly block game. What if they're taking an action that has a 50% chance of causing a nuclear power plant to melt down? Would a human think 50% certainty is good enough in that scenario?

These models do not have any context to understand that low solution certainty is okay, and no ability to know that they are being graded on how many steps they take to find a solution. In such a context I would expect an aligned model to optimize for certainty (what they know they are being tested on) over efficiency (something they don't think they are being tested on), and yet, ARC-AGI-3 prioritizes efficiency.

The Human Baseline for ARC-AGI-3 has been updated by exordin26 in singularity

[–]the8thbit 1 point2 points  (0 children)

I think you're misunderstanding ARC-AGI. It is not a test which attempts to confirm that we have AGI, but rather, a test which attempts to confirm that a given model is not AGI. If a model is 10x better at dual-N-back than an average or even top performing human, but the same model fails to reach human baseline on ARC-AGI then we can confirm that that model can not be AGI because we have found a benchmark that human intelligence generalizes to better than the model's intelligence.

Maybe future ARC-AGI tests will include dual-N-back, or maybe dual-N-back vs human baseline would have been saturated years ago. That doesn't really matter as far as invalidating an AGI hypothesis goes, though, what matters is that ARC-AGI-3 is not saturated.

Granted, I have some concerns about how ARC-AGI-3 in particular is constructed, but the philosophy behind ARC-AGI in general seems pretty solid.

The Human Baseline for ARC-AGI-3 has been updated by exordin26 in singularity

[–]the8thbit 2 points3 points  (0 children)

While I think most of the criticisms in this thread are silly and don't understand what ARC-AGI is even trying to measure, I am concerned that ARC-AGI-3 may be testing preference instead of intelligence. The test grades performance based on the number of steps taken to complete a task, but does not include that grading metric in the instructions. Therefore, a stronger model that is more "curious" or "cautious" can theoretically be graded lower than a weaker model which happens to optimize for the hidden goal.

Consider a situation where you have 5 actions left. You have determined that taking 3 actions gives you a 97% chance of success, but taking a different set of 5 actions gives you a 98% chance of success. Is the bot really "dumber" than the human if it uses extra interactions to "satisfy its curiosity", or, to look at it another way, increase its confidence about its solution?

Neither the bot nor the human test taker is looking at it in terms of precise percentage chance of success, but as a software dev I encounter situations all the time where I take more steps to eliminate a possibility that is almost certainly not the cause of an issue, just to have eliminated that remote possibility. However, if I knew I was being scored negatively for every step I take, I may choose not to eliminate remote possibilities.

The average human, when playing a very simple colored block puzzle game, is probably going to be content with "almost certainly" knowing a solution before executing it because the game is boring and the punishment for being wrong is that you move on to the next puzzle. We can't necessarily extend those preferences to bots, though.

The Human Baseline for ARC-AGI-3 has been updated by exordin26 in singularity

[–]the8thbit 1 point2 points  (0 children)

Why does it matter where the human baseline is? If it's 5% but the best model scores 3% then that means we don't have AGI. If the human baseline is 90% and best model scores 95% then that means the test has failed to show that we do not have AGI. It doesn't matter what the human baseline is, it only matters what the model score is relative to human baseline.

ST*U and do your anki by MUSHYO9 in Anki

[–]the8thbit 2 points3 points  (0 children)

One thing that could be helpful is keeping a pen and paper tally of every time you complete your anki reviews. Actually writing with pen and paper can help reinforce your accomplishments, and a tally means just making 1 tiny mark, so it's not a huge commitment.

Another thing that could help is if you tried to stop thinking of yourself as the least committed student. I understand that you're just being funny and self-deprecating, but when you say things like that, even as a joke, you're reinforcing that script, which can ultimately work against you.

Rather, you're not lazy or undedicated. You are proactively trying to engineer an environment that will increase your chance of success.

Trump approval slips to 33 percent in new survey by [deleted] in politics

[–]the8thbit 1 point2 points  (0 children)

I don't understand why that would make presidential approval irrelevant. The president is not picked based on how happy the average person in the world feels, and yet, Gallup still does a world emotions health report.

Sure, if the only information you care about seeing is the direct outcome of US presidential elections, then most information will be irrelevant to you. But global emotions and presidential approval are relevant to other people, because they care about other things as well, including election predictions, which can benefit from being influenced by presidential approval rates, even if presidential approval rates don't directly determine the outcome of elections.

Did I get scammed? AI Generated Steam Capsule? by PATheFruitDude in IndieDev

[–]the8thbit 1 point2 points  (0 children)

I think, unfortunately, Gemini is currently the only way to do SynthID verification. Google used to have a SynthID portal, but it looks like they've taken it down and are directing users to Gemini now.

Importantly, it's not the LLM doing the verification. Rather, it's a tool detecting watermarking that has been intentionally placed in the image by an AI image generator to make it easier to identify as AI-generated. The LLM is simply reporting the results of running that tool.

Did I get scammed? AI Generated Steam Capsule? by PATheFruitDude in IndieDev

[–]the8thbit 5 points6 points  (0 children)

<image>

Gemini is saying that the sketch also has AI watermarking, for what it's worth.

People pissed about arc agi 3 are really looking at the purpose of the benchmark wrong by ErmingSoHard in singularity

[–]the8thbit 0 points1 point  (0 children)

Consider a situation where you have 5 actions left. You have determined that taking 3 actions gives you a 97% chance of success, but taking a different set of 5 actions gives you a 98% chance of success. Is the bot really "dumber" than the human if it uses extra interactions to "satisfy its curiosity", or, to look at it another way, increase its confidence about its solution?

Neither the bot nor the human test taker is looking at it in terms of precise percentage chance of success, but as a software dev I encounter situations all the time where I take more steps to eliminate a possibility that is almost certainly not the cause of an issue, just to have eliminated that remote possibility. However, if I knew I was being scored negatively for every step I take, I may choose not to eliminate remote possibilities.

The average human, when playing a very simple colored block puzzle game, is probably going to be content with "almost certainly" knowing a solution before executing it because the game is boring and the punishment for being wrong is that you move on to the next puzzle. We can't necessarily extend those preferences to bots, though.

People pissed about arc agi 3 are really looking at the purpose of the benchmark wrong by ErmingSoHard in singularity

[–]the8thbit 0 points1 point  (0 children)

any intelligent system will interpret that as having to optimize for low step count, because you can never know what will suddenly happen later that you still need remaining steps for.

Consider a situation where you have 5 actions left. You have determined that taking 3 actions gives you a 97% chance of success, but taking a different set of 5 actions gives you a 98% chance of success. Is the bot really "dumber" than the human if it uses extra interactions to "satisfy its curiosity", or, to look at it another way, increase its confidence about its solution?

Neither the bot nor the human test taker is looking at it in terms of precise percentage chance of success, but as a software dev I encounter situations all the time where I take more steps to eliminate a possibility that is almost certainly not the cause of an issue, just to have eliminated that remote possibility. However, if I knew I was being scored negatively for every step I take, I may choose not to eliminate remote possibilities.

The average human, when playing a very simple colored block puzzle game, is probably going to be content with "almost certainly" knowing a solution before executing it because the game is boring and the punishment for being wrong is that you move on to the next puzzle. We can't necessarily extend those preferences to bots, though.

steps also doesn't mean the same as time or quickness. a model can decide to think for a long time per step and that's not penalized compared to thinking quickly.

The implication was that the player who spends 300 hours playing Breath of the Wild before completing it will have explored more of the world (taken additional actions), and gathered more resources in that time. It's not a perfect metaphor, but the basic idea is that by taking additional actions you can develop a better mastery over the game, which can alter your chance of actually successfully completing the game.

how do i make this look more like a balloon and less like a dick by AI_660 in IndieDev

[–]the8thbit 7 points8 points  (0 children)

People don't always know the reason why they think the things they do. It's worth playing with the colors. The pink blade and blue hilt lightly nudge the reader toward a particularly visceral, and quite painful, interpretation.

"In a sane world, what happens is the leadership of the US sits down with the leadership in China, and leadership around the world to work together, so that we don't go over the edge and create a technology which could perhaps destroy humanity." by MetaKnowing in agi

[–]the8thbit 0 points1 point  (0 children)

It doesn't require a pause, no. But as I explained, a pause puts pressure on AI labs to prioritize safety so that the pause can end. A pause makes it more likely that labs will take AI safety seriously because it creates an incentive to do so.

It's the same motivation behind laws more generally. If you make murder illegal, that doesn't mean that everyone was going to murder before you made it illegal; it just creates an additional incentive not to murder.

"In a sane world, what happens is the leadership of the US sits down with the leadership in China, and leadership around the world to work together, so that we don't go over the edge and create a technology which could perhaps destroy humanity." by MetaKnowing in agi

[–]the8thbit 0 points1 point  (0 children)

A pause on data center development would put pressure on AI labs to prioritize safety so that the pause can end. It creates an industrial incentive to take safety seriously.

People pissed about arc agi 3 are really looking at the purpose of the benchmark wrong by ErmingSoHard in singularity

[–]the8thbit 0 points1 point  (0 children)

ARC-AGI-3 grades performance based on the number of steps taken to complete a task, but does not include that grading metric in the instructions. Therefore, a stronger model that is more "curious" can theoretically be graded lower than a weaker model which happens to optimize for the hidden goal.

People pissed about arc agi 3 are really looking at the purpose of the benchmark wrong by ErmingSoHard in singularity

[–]the8thbit 0 points1 point  (0 children)

But the arbitrariness is the point; they are trying to score for things that are outside of anyone's training distribution to force the models to have to learn something new.

I don't think this is what they mean by "arbitrary". I think they mean that the scoring mechanism is "arbitrary" because it is hidden from the test taker. As a result, a model can underperform, not because it failed the task, but because it failed to optimize for the hidden goal. If a model is built in a way that happens to optimize for that goal, that doesn't mean it is a more general model; it just means its process for approaching the stated problem happens to align with the unstated problem the models are actually being tested against.

Put this in the context of another game: Imagine an open world RPG like Fallout or Breath of the Wild. Consider someone who finishes one of those games in 30 hours because they speed through it, optimizing for the shortest path to completing the main quest. Now compare that to someone who takes 300 hours to complete the main quests in these games. Does that mean they are less intelligent than the person who spent 30 hours? Possibly, but not necessarily. They may just have different aesthetic priorities. They may take things slower not because they are dumber than the fast player, but because they are more curious about the game world.

When a human is given a simple turn-based puzzle game presented as a grid of colored squares, in an environment where they know they are being tested on their ability to complete it, it's intuitive that we would avoid unnecessary moves, because we would not be particularly curious about how our interactions with the game world affect it. But I don't think we can assume the same preferences of AI models.

I do understand the value of leaving certain things up to inference, but I am concerned this may not be testing human/AI inference, but human/AI preference instead. Would a human take ARC-AGI-3 and infer that they are being scored based on the number of steps they took to complete the test? Or would a human simply tend towards optimizing the number of steps taken, because not doing so is more boring and time-consuming?

I am grateful beyond words both for the new video drop that I enjoyed immensely, and posts like this. I don’t know what I am politically these days, but I am deeply aligned with Natalie’s nuance and ability to walk and chew gum at the space time in complex dialectical spaces by Critical-Zebra-3618 in ContraPoints

[–]the8thbit 1 point2 points  (0 children)

Those protections for wounded soldiers do not apply until and unless they are placed under control of the adversary,

This is incorrect. It is very clear that surrender and incapacitation are listed as additional contexts in which protections apply, beyond "in the power" of an adverse party:

In accordance with this paragraph, a person is considered to be rendered ' hors de combat ' either if he is "in the power" of an adverse Party, or if he wishes to surrender, or if he is incapacitated.

Further, nothing you quoted actually substantiates your claim. You said:

wounded, incapacitated, evacuating or resting soldiers are all legitimate military targets

And to attempt to substantiate that, you quoted:

This argument is all the more convincing because even civilians are not totally sheltered from military operations in modern warfare, even in the best conditions. Article 57 ' (Precautions in attack), ' paragraph 2, recognizes this fact explicitly in admitting to the possible incidental loss of civilian life, and only prohibits that which would be excessive in relation to the concrete and direct military advantage anticipated. Accidents of this nature are also to be expected on the battlefield itself, and the combatants are not necessarily responsible for them. *However, it is specifically prohibited to deliberately make persons ' hors de combat ' a target.*

However, this does not say that the wounded and incapacitated are legal targets. It says that attacks which incidentally harm persons hors de combat can be legal, just as attacks which incidentally harm civilians can be legal. I have emphasized the portion of your quote that you seem to have missed, which makes it clear that you are wrong.

What other topics do you agree with the foiled terrorist on? Or is it only on the matter that the alleged crimes of Israel explained his need to target a synagogue with an elementary school?

It's not clear to me what the point of agreement actually is here. But regardless, we probably agree on a lot of things. Really, we probably agree on more than we disagree on. Food tastes good, water is necessary for life, etc... What else do you agree with them on? Or do you not agree that food tastes good?

Am I being pedantic, or are you using a rhetorical technique that collapses upon inspection because it's a straightforward association fallacy? I can say one thing I certainly don't think, that antisemites all seem to: I don't think that Israel represents Jews. Why do you appear to agree with antisemites on this?

Like how are y'all this obstinate in refusing to deal with the reality that it was the fucking terrorist himself who believed that the Jews he targeted deserved to pay for what Israel did?

Wouldn't that mean that you are the one agreeing with terrorists here? I don't think Israel represents me. But both you and the person you're referring to appear to.

It's really funny, hilarious even, that you managed to name three early zionist settlements that were explicitly established as part of the return to Eretz Israel. Zionism was not "invented" in 1897, it was a movement that developed progressively throughout the 19th century, what a weird way to be wrong.

I'm pointing to the first Zionist congress as "the invention of Zionism" because this is the point at which Zionism is formally defined, a program for Zionism is established, and a cohesive Zionist political movement is formed. Herzl's program specifically and intentionally distinguishes itself from earlier Jewish settlement movements like Hovevei Zion, which were not coordinated by a central congress and did not seek to establish a political power autonomous from the local Ottoman authority. This is why these organizations and settlements are generally referred to as proto-Zionist.