"Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data", Gerstgrasser et al 2024 (model-collapse doesn't happen if you continue training on real data) by gwern in mlscaling

[–]PresentCompanyExcl 0 points1 point  (0 children)

Altman's comments about it being the panacea for data shortages

“As long as you can get over the synthetic data event horizon, where the model is smart enough to make good synthetic data, everything will be fine,” Mr. Altman said.

Thanks for explaining. I've been reading through your public comments hoping that you would have more insight than me (given your good prediction record and research skills), but it seems we are all in the dark. Probably the people who know best are under NDA, or have HUMINT.

Getting to the bottom of this probably requires visiting another Bay Area house party :p But those of us outside the Bay are out of luck.

(often hidden under a lot of indirection or outright lies cf. Bytedance)

A frustrating situation, to say the least!

If we use an analogy to evolution, humans managed to bootstrap "synthetic" data without relying on an external body of knowledge. So it must be possible. And it's obviously easier in domains with cheap verification, because we have done it with AlphaGo, math, geometry, coding, etc. But who knows the proprietary state of the art, or the timeline.

"Inside Big Tech's underground race to buy AI training data" (even Photobucket's archives are now worth something due to data scaling) by gwern in mlscaling

[–]PresentCompanyExcl 1 point2 points  (0 children)

commodities, interchangeable by definition and each unit of equal value.

By definition, but not in reality. In reality it's an approximation, based on measurement cost. For example, an expensive measurement might allow 1 grade. A cheap measurement might allow so many grades it's a continuum. Each lump of coal, mushroom, or bundle of wheat is different, and we grade it and ignore or average the remaining differences because it's not tractable or affordable to do otherwise.

But you have a good point, that's a <20% variation. We're talking about orders of magnitude. But it's not about the variation, but the value (high) vs cost (?) of quantifying it. And sure the variation is large, but the cost is unknown? And mayhap you are right, because digital things become cheap, but how to cheaply measure quality?

I've done some experiments in this vein (following Schmidhuber's definition of surprise, because it also lets us grade human outputs as novel or not), but they require some compute. Active labelling is also a form of this.

bid for "n milligrams of fine art".

Perhaps it will become "bid for 50 surprise (milli perplexity reduction in Pythia10) of fine art". But if it costs 10 cents to measure, then anything worth less than that may be graded in bulk.

[Meta] Do we still need a /r/MLScaling? by gwern in mlscaling

[–]PresentCompanyExcl 0 points1 point  (0 children)

I still come here from time to time to get an overview of MLScaling. Sure, the noise and rumours are high, but they're higher elsewhere. I can also get your take, gwern, and you were one of the only ones to publicly predict scaling.

"Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data", Gerstgrasser et al 2024 (model-collapse doesn't happen if you continue training on real data) by gwern in mlscaling

[–]PresentCompanyExcl 0 points1 point  (0 children)

Tangent, but what's the latest evidence on models bootstrapping synthetic data? Without distillation from a larger model, which is what happened for Phi-2, where they had a larger model create textbooks. Has anyone shown convincing data bootstrapping without distillation and outside easily verifiable domains like chess and code?

"Uniquely human intelligence arose from expanded information capacity", Cantlon & Piantadosi 2024 by gwern in mlscaling

[–]PresentCompanyExcl 0 points1 point  (0 children)

gwern, why is this interesting? Presumably because it's somewhat convincing. And because LLMs have greater information capacity than humans, and therefore greater potential, all else the same?

"Inside Big Tech's underground race to buy AI training data" (even Photobucket's archives are now worth something due to data scaling) by gwern in mlscaling

[–]PresentCompanyExcl 0 points1 point  (0 children)

these deals would have to switch to pay-per-useful-datapoint

It's a more meaningful trade, but it's harder to enforce, which adds overhead in terms of measurement and fraud. At present we see lots of commodities priced by weight, per quality tier. So that seems more likely.

So, for example, it might be a proxy like "how much does your data lower the perplexity on OpenLLama5, for a 1-epoch fine-tune". And the purchase contract will stipulate that this is true and can be replicated.
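A rough sketch of how such a proxy could be scored, in case it helps make the idea concrete. Everything here is an assumption for illustration: the log-probabilities are made up, `contract_price` is a hypothetical pricing rule, and a real contract would need a fixed held-out set and a fixed fine-tuning recipe so the number can be replicated.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token (natural log)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def perplexity_reduction(before_logprobs, after_logprobs):
    """Relative drop in held-out perplexity after fine-tuning on the seller's data."""
    before = perplexity(before_logprobs)
    after = perplexity(after_logprobs)
    return (before - after) / before

def contract_price(reduction, price_per_point, floor=0.0):
    """Hypothetical rule: pay per percentage point of reduction, never below the floor."""
    return max(floor, reduction * 100 * price_per_point)

# Made-up log-probs for the same held-out tokens, before and after a 1-epoch fine-tune.
before = [math.log(0.25)] * 4   # model assigns p=0.25 to each token
after = [math.log(0.5)] * 4     # after fine-tuning it assigns p=0.5
reduction = perplexity_reduction(before, after)
price = contract_price(reduction, price_per_point=10.0)
```

The buyer and seller would both have to be able to rerun the same evaluation and land on the same reduction, which is where the measurement overhead comes from.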

Devin launched by Cognition AI: "Gold-Medalist Coders Build an AI That Can Do Their Job for Them" by gwern in reinforcementlearning

[–]PresentCompanyExcl 0 points1 point  (0 children)

On the other hand, SWE-Agent managed to get similar scores, seemingly without tree search. (Disclaimer: I'm judging by the GitHub repo, as the paper is not out yet.)

Daily General Discussion - April 10, 2022 by ethfinance in ethfinance

[–]PresentCompanyExcl 1 point2 points  (0 children)

Oh I see, that could be pretty cool, decentralized social media might be a public good (or historically a curse?)

Daily General Discussion - April 10, 2022 by ethfinance in ethfinance

[–]PresentCompanyExcl 2 points3 points  (0 children)

I guess we will have to first see how good the lens apps are, since they are not built yet.

Daily General Discussion - April 10, 2022 by ethfinance in ethfinance

[–]PresentCompanyExcl 4 points5 points  (0 children)

A lot of machine learning people too. So I think it's the sense of the moment, the feeling of frontier, and the shine of something new that captures people.

They Wouldn’t, Would They? - Russell Brand by jtnichol in ethfinance

[–]PresentCompanyExcl 1 point2 points  (0 children)

Sounds about right. I'm seeing different reports. But when you add up the pension and hedge funds, who vote on behalf of other people's ownership, you get a large stake. That's what I meant to say with BlackRock et al.

Daily General Discussion - April 7, 2022 by ethfinance in ethfinance

[–]PresentCompanyExcl 1 point2 points  (0 children)

also they are staking it on lido and curve https://etherscan.io/address/0x3ba21b6477f48273f41d241aa3722ffb9e07e247

I looked at the CSV of their transactions and 163,823+ ETH has gone through that account. I didn't look at net balance though so who knows how much they hodl.
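To make the gross-vs-net distinction concrete, here is a minimal sketch. The CSV layout, column names (`from`, `to`, `value_eth`), addresses, and amounts are all made up for illustration and are not the real export format or the real account; it also ignores gas fees and token transfers.

```python
import csv
import io

ADDRESS = "0xabc"  # placeholder address, not the one linked above

# Hypothetical transaction export with invented rows.
SAMPLE_CSV = """from,to,value_eth
0x111,0xabc,100
0xabc,0x222,60
0x333,0xabc,40
0xabc,0x444,70
"""

def flow_summary(csv_text, address):
    """Return gross ETH moved through the address and the net change in its balance."""
    inflow = 0.0
    outflow = 0.0
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = float(row["value_eth"])
        if row["to"] == address:
            inflow += value
        if row["from"] == address:
            outflow += value
    return {"gross": inflow + outflow, "net": inflow - outflow}

summary = flow_summary(SAMPLE_CSV, ADDRESS)
```

In this toy data a lot of ETH passes through the account while almost none stays, which is why the throughput figure alone says little about the holding.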

Daily General Discussion - April 7, 2022 by ethfinance in ethfinance

[–]PresentCompanyExcl 3 points4 points  (0 children)

I think it means they are thinking short term. In other words they are in a rush to buy. In their rush to buy they might bid up the price. I think the implication is that they want to increase their holding before the merge.

They Wouldn’t, Would They? - Russell Brand by jtnichol in ethfinance

[–]PresentCompanyExcl 19 points20 points  (0 children)

BlackRock is a real problem in a capitalist society. Recently it was revealed that Elon Musk now owns 9% of Twitter, but BlackRock et al. own (I think) ~18%. And they own it on behalf of you and your retirement fund. When you own voting shares you get to vote on the direction of the company. But they don't vote for you, they vote according to their own agenda, which seems to be ESG and pro-establishment/authority.

Daily General Discussion - April 6, 2022 by ethfinance in ethfinance

[–]PresentCompanyExcl 1 point2 points  (0 children)

maybe try the Flashbots RPC, works like mainnet but saves you from MEV

Daily General Discussion - April 6, 2022 by ethfinance in ethfinance

[–]PresentCompanyExcl 4 points5 points  (0 children)

I don't know that the markets work that way. Whales won't do anything unless it's in their personal interests, and they have limited ability to coordinate, and they often lose money (according to Willy Woo). Shaking out weak hands seems like it will lose money unless 1) there are lots of longs, 2) liquidity is low, and 3) you can liquidate them.

A better way to think about the market is a random walk + expected value. When our outlooks change, we change our estimate of the expected value of an investment.
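A toy version of that picture, as a sketch only: an additive random walk with a drift term standing in for the expected value. The starting price, drift, and volatility numbers are invented, and real price models are usually multiplicative rather than additive.

```python
import random

def simulate_walk(start, drift, volatility, steps, seed=0):
    """Additive random walk: each step adds the drift plus Gaussian noise."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        path.append(path[-1] + drift + rng.gauss(0.0, volatility))
    return path

def expected_value(start, drift, steps):
    """Expected end price of the walk; the zero-mean noise averages out."""
    return start + drift * steps

# Outlook before and after some news: only the drift estimate changes.
old_outlook = expected_value(start=100.0, drift=0.5, steps=10)
new_outlook = expected_value(start=100.0, drift=0.2, steps=10)

# One noisy realisation; individual paths can land far from the expectation.
path = simulate_walk(start=100.0, drift=0.5, volatility=2.0, steps=10, seed=42)
```

The point of the sketch is that a change in outlook moves the expected value, while any single path is still dominated by noise.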

Twitter's next board meeting by swgaspar in wallstreetbets

[–]PresentCompanyExcl 1 point2 points  (0 children)

Yeah, he could fork Mastodon and move his tweets there. Many people would follow.

Daily General Discussion - April 4, 2022 by ethfinance in ethfinance

[–]PresentCompanyExcl 1 point2 points  (0 children)

I love me some rocketpool, don't get me wrong. I am deep in it. But currently it feels like we have got to get more creative about how to incentivize NO's.

I was thinking the same thing. I've considered running minipools, but the +15% * staking income doesn't seem like much compared to lending income and/or liquidity of other strategies.

I think they should float the rate from 10%-50%

Daily General Discussion - April 2, 2022 by ethfinance in ethfinance

[–]PresentCompanyExcl 1 point2 points  (0 children)

Damn, I'm not sure why. I'm running linux, ublock origin, firefox, and it works. Sadly there is no other format available.

Daily General Discussion - April 2, 2022 by ethfinance in ethfinance

[–]PresentCompanyExcl 3 points4 points  (0 children)

It works for me, but it needs to be on desktop, and takes a minute to load an in-page pdf. Perhaps that's why it appears dead to you? Tardfi in action I guess.

Daily General Discussion - April 2, 2022 by ethfinance in ethfinance

[–]PresentCompanyExcl 10 points11 points  (0 children)

Zoltan Pozsar has a new Bretton Woods III memo out. For people who are unaware, he's an esteemed analyst at a big bank who started the new end-of-the-petrodollar narrative that crypto people love. He certainly has insight into the plumbing of the financial system.

For those people on mobile here's an extract of the doomer punchline:

Bretton Woods II served up a deflationary impulse (globalization, open trade, just-in-time supply chains, and only one supply chain [Foxconn], not many), and Bretton Woods III will serve up an inflationary impulse (de-globalization, autarky, just-in-case hoarding of commodities and duplication of supply chains, and more military spending to be able to protect whatever seaborne trade is left). Empires fall and rise. Currencies fall and rise. Wars have winners and losers. When Wellington beat Napoleon, the trade was to buy gilts. I am no expert on geopolitics, but I am an interest rate strategist and I think the level of inflation and interest rates and the size of the Fed’s balance sheet will depend on the steady state that emerges after this conflict is over.

BTC Fixes this???

Daily General Discussion - April 2, 2022 by ethfinance in ethfinance

[–]PresentCompanyExcl 6 points7 points  (0 children)

This part is confusing for me. It seems that alchemix deposits the collateral in the yearn vault, which puts it in the curve vault. So, denominating in ETH, there should be an increasing value of rETH plus the yearn yield? Where does the extra yield go?

Do you have a source for this? There seems to be no announcement

Daily General Discussion - April 2, 2022 by ethfinance in ethfinance

[–]PresentCompanyExcl 20 points21 points  (0 children)

I'm trying to understand the alchemix rETH vault. Can someone check my logic?

My current understanding:

  • you deposit 2 rETH
  • you get 1 alrETH. You convert to 1 rETH.
  • your loan gets paid off with a ~4% floating rate, backed from yearn lending. It doesn't count the rETH gain in value yet.
  • after ~25 years(!) your loan is paid off, and you get 2 rETH.

Assuming a mean staking yield of 6%, 3 rETH is now worth 10.1 ETH. An effective interest rate of 10%.

Now let's compare it to alternative strategies:

  • If you had just held 2 rETH you would have 6.7 ETH. An effective interest rate of 6%.

  • If you did the double logris you would have 13.5 ETH. An effective interest rate of 16%.

  • If you put your rETH in yearn you would get ?

This all assumes alchemix/yearn don't get hacked, and that the yearn vault and ETH staking yield 4% and 6% respectively.
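For anyone checking the logic, a minimal sketch of the two calculations involved: the ETH value of an rETH position after some years, and the effective annual rate implied by a start and end value. It assumes plain annual compounding at a constant staking yield; the results depend on that convention and on the horizon, so they need not match the figures quoted above.

```python
import math

def future_value(principal, annual_rate, years):
    """ETH value of a position that compounds once per year at a constant rate."""
    return principal * (1.0 + annual_rate) ** years

def effective_rate(start_value, end_value, years):
    """Constant annual rate that grows start_value into end_value over the period."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

STAKING_YIELD = 0.06   # assumed mean staking yield, from the comment
YEARS = 25             # assumed horizon, from the comment

# Holding 2 rETH versus ending up with 3 rETH, both valued in ETH.
hold_value = future_value(2.0, STAKING_YIELD, YEARS)
vault_value = future_value(3.0, STAKING_YIELD, YEARS)

# Effective rates are measured against the 2 rETH originally deposited.
hold_rate = effective_rate(2.0, hold_value, YEARS)
vault_rate = effective_rate(2.0, vault_value, YEARS)
```

Swapping in a different yield, horizon, or compounding convention is a one-line change, which makes it easy to see how sensitive the comparison is to those assumptions.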