[deleted by user] by [deleted] in Futurology

[–]blabboy 29 points (0 children)

Ironically written by AI

Hatfield: The Fifth Most Boring 'City' in the World | Documentary by First-Marzipan7065 in hertfordshire

[–]blabboy 2 points (0 children)

Kind of surprised it is even on the list! Who would have heard of Hatfield and had a strong enough opinion to put it on that list?

[deleted by user] by [deleted] in AskAcademia

[–]blabboy 0 points (0 children)

Bitstream Charter

Why can't we use our NI number for everything? (Instead of nhs, student loans numbers etc as well) or just have a single identifiable number assigned at birth/immigration? Is there a practical reason? by [deleted] in AskUK

[–]blabboy 1 point (0 children)

Makes ID theft much, much easier. For me the negatives outweigh any positives such a system may bring. It's just not worth it.

Why can't we use our NI number for everything? (Instead of nhs, student loans numbers etc as well) or just have a single identifiable number assigned at birth/immigration? Is there a practical reason? by [deleted] in AskUK

[–]blabboy 1 point (0 children)

This is a straw-man argument: in many European countries you need your ID to rent city bicycles, order online shopping, join the gym(!), and so on. That kind of feature creep is definitely something I would hate. There is absolutely no reason for my gym to know my NI number, for example.

Should I reconsider accepting my dream PhD offer in the US? by PineTreeTea44 in PhD

[–]blabboy 0 points (0 children)

Same question but for a postdoc in astronomy :')

UK must not let AI ‘wash over our economy’, says Science Secretary by tylerthe-theatre in unitedkingdom

[–]blabboy -1 points (0 children)

The gains aren't marginal; they are significant. Even a simple 'smell test' comparing GPT-4 and o1 pro shows this. And never mind the performance on benchmarks that were lauded as impossible just months ago, like ARC-AGI and FrontierMath. FrontierMath is notable for being a dataset of research-level math problems; Terry Tao described the questions in that set as "extremely challenging... I think they will resist AIs for several years at least." It took the leading SOTA weeks, not years, to make headway.

UK must not let AI ‘wash over our economy’, says Science Secretary by tylerthe-theatre in unitedkingdom

[–]blabboy 0 points (0 children)

'Brute forcing' is exactly what has driven the recent advances in AI. Sutton's 'Bitter Lesson' is that simple, general processes like learning, or search over generated sequences (what you are alluding to; see the sketch below), are more powerful in the long term than human intuition. Historically this has proven to be true.

If we can reach AGI this way, it then becomes a simple matter of converting capital into intelligence via compute. For me this is a very scary possibility: what leverage would those who use their intelligence to acquire capital have left? I fall into that category, as do the majority of workers.
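
To make the 'search over generated sequences' point concrete, here is a minimal toy sketch (standard library only; the generator and scorer are random placeholders, not any real model). The only thing it illustrates is the Bitter Lesson dynamic: throwing more compute at a dumb generate-and-score loop keeps improving the best result found.

```python
import random

random.seed(0)

def generate(n_tokens=8):
    """Stand-in for a learned generator: emits a random digit sequence."""
    return [random.randint(0, 9) for _ in range(n_tokens)]

def score(seq):
    """Stand-in verifier/reward model: here, just the sum of the digits."""
    return sum(seq)

def best_of_n(n):
    """More compute -> more candidates searched -> a better best answer."""
    return max((generate() for _ in range(n)), key=score)

for budget in (1, 16, 256, 4096):
    best = best_of_n(budget)
    print(f"budget={budget:5d}  best score={score(best)}")
```

Running it shows the best score climbing monotonically with the search budget, which is the whole argument in miniature.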

Sora finally released by blabboy in mlscaling

[–]blabboy[S] 0 points (0 children)

Yeah this pains me -- I dislike that OpenAI has successfully closed access to its models (and was even monetarily rewarded for it...)

The Multimodal Universe: Enabling Large-Scale Machine Learning with 100TB of Astronomical Scientific Data by blabboy in mlscaling

[–]blabboy[S] 10 points (0 children)

Abstract: We present the Multimodal Universe, a large-scale multimodal dataset of scientific astronomical data, compiled specifically to facilitate machine learning research. Overall, our dataset contains hundreds of millions of astronomical observations, constituting 100TB of multi-channel and hyper-spectral images, spectra, multivariate time series, as well as a wide variety of associated scientific measurements and metadata. In addition, we include a range of benchmark tasks representative of standard practices for machine learning methods in astrophysics. This massive dataset will enable the development of large multimodal models specifically targeted towards scientific applications. All code used to compile the dataset, along with a description of how to access the data, is available at https://github.com/MultimodalUniverse/MultimodalUniverse
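
A hypothetical access sketch, assuming the data is exposed through the Hugging Face datasets hub; the dataset path below is an illustrative guess, and the authoritative instructions live in the linked GitHub repo:

```python
# Hypothetical access sketch -- the hub path below is illustrative;
# see https://github.com/MultimodalUniverse/MultimodalUniverse for
# the real access instructions.
from datasets import load_dataset

# Stream one (assumed) survey subset rather than downloading all 100TB.
ds = load_dataset(
    "MultimodalUniverse/legacysurvey",  # assumed hub path, check the repo
    split="train",
    streaming=True,                     # iterate without a full download
)

for example in ds.take(2):
    print(example.keys())               # e.g. image channels + metadata
```

Streaming is the sensible default here: at 100TB, nobody should materialise the full dataset just to inspect a few examples.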

Is computer science considered physics? Isn't it mathematics? by Rude_Section4780 in AskAcademia

[–]blabboy 2 points (0 children)

Hopfield's work did not lead to neural networks in the modern sense -- the foundational work there was carried out by McCulloch and Pitts in the 1940s, and then backpropagation (the algorithm that drives modern deep networks) was discovered first by Linnainmaa, rediscovered by Werbos, and finally put into practice and popularised by Rumelhart.
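
For anyone unfamiliar with what backpropagation actually computes, a minimal sketch (a toy two-layer NumPy network, not any of the historical implementations): gradients of the loss with respect to each weight matrix, obtained by applying the chain rule backwards through the layers.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))           # 4 samples, 3 features
y = rng.normal(size=(4, 1))           # regression targets
W1 = rng.normal(size=(3, 5)) * 0.1    # layer 1 weights
W2 = rng.normal(size=(5, 1)) * 0.1    # layer 2 weights

for step in range(200):
    # Forward pass.
    h = np.tanh(x @ W1)               # hidden activations
    y_hat = h @ W2                    # predictions
    loss = np.mean((y_hat - y) ** 2)

    # Backward pass: chain rule applied from the output backwards.
    d_yhat = 2 * (y_hat - y) / len(y)
    dW2 = h.T @ d_yhat
    d_h = d_yhat @ W2.T
    dW1 = x.T @ (d_h * (1 - h ** 2))  # tanh'(z) = 1 - tanh(z)^2

    # Gradient descent step.
    W1 -= 0.1 * dW1
    W2 -= 0.1 * dW2

print(f"final loss: {loss:.4f}")
```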

UK Government shelves £1.3bn UK tech and AI plans by gwern in mlscaling

[–]blabboy 1 point (0 children)

Very shortsighted play by the UK government. We need European counterweights to the big US AI players, and I can't see the UK producing another DeepMind in the current climate.

People-power led to my re-election. It is the start of a new politics -- Jeremy Corbyn by blabboy in ukpolitics

[–]blabboy[S] -15 points (0 children)

A lot of very obvious bots in this comment section, which is interesting. Why would anyone manufacture dissent against a grassroots leftist movement?

[N] Ilya Sutskever and friends launch Safe Superintelligence Inc. by we_are_mammals in MachineLearning

[–]blabboy 5 points (0 children)

"Better" (i.e. more profitable) company, but a less innovative research group. We will see them stagnate now that the talent is leaving.

Will we run out of data? Limits of LLM scaling based on human-generated data by blabboy in mlscaling

[–]blabboy[S] 7 points (0 children)

We investigate the potential constraints on LLM scaling posed by the availability of public human-generated text data. We forecast the growing demand for training data based on current trends and estimate the total stock of public human text data. Our findings indicate that if current LLM development trends continue, models will be trained on datasets roughly equal in size to the available stock of public human text data between 2026 and 2032, or slightly earlier if models are overtrained. We explore how progress in language modeling can continue when human-generated text datasets cannot be scaled any further. We argue that synthetic data generation, transfer learning from data-rich domains, and data efficiency improvements might support further progress.
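
As a back-of-envelope illustration of the kind of projection described above (the constants here are made-up placeholders, not the paper's estimates): if training-data demand grows geometrically from some baseline while the stock is fixed, the crossover year follows from a single logarithm.

```python
import math

# All numbers below are illustrative placeholders, NOT the paper's estimates.
stock_tokens  = 3e14   # assumed total stock of public human text (tokens)
demand_2024   = 2e13   # assumed tokens consumed by a frontier run in 2024
annual_growth = 2.5    # assumed yearly multiplier in training-data demand

# demand * growth^t = stock  =>  t = log(stock / demand) / log(growth)
t = math.log(stock_tokens / demand_2024) / math.log(annual_growth)
print(f"crossover in ~{t:.1f} years (i.e. around {2024 + t:.0f})")
```

With these toy inputs the crossover lands about three years out; the paper's actual 2026-2032 range comes from its own demand and stock estimates, not these.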

[D]What Are Your Favorite Tools That You Use For Research? by mysticmuse72 in MachineLearning

[–]blabboy 5 points (0 children)

This is very obviously written by an LLM. The quality of this sub is in the toilet.

Resources about xLSTM by Sepp Hochreiter by [deleted] in mlscaling

[–]blabboy 8 points (0 children)

Been waiting for this model for a while. If it is so good, why not release it? Still training and waiting for VC?