What will slow research isn’t AI. It’s the flood of preprints being treated like peer-reviewed work across AI and computer science. Right now, an undergrad with a Canva poster and a faculty sponsor can push out ten preprints in a semester and get them cited like they’ve reshaped the field. OSF allows researchers to delete preregistrations, which sounds harmless until it’s used to quietly erase bad or fraudulent work. If something gets flagged, it’s gone. No history, no accountability. That’s a perfect setup for bad actors.
And we still haven’t dealt with the reproducibility crisis. We didn’t fix it. We just buried it under buzzwords, hype, and career incentives. Simultaneously, we are using completely broken scientific metaphors to justify AI architectures. We’re still pretending spiking neurons are equivalent to RNNs. That synaptic noise is optimization. That the behavior of starving mice tells us how humans think. These comparisons aren’t science. They’re branding.
Research architectures are more expensive, more power-hungry, and more opaque than ever. Despite the lack of a clear path to profitability, AI continues to consume billions of dollars in funding. The hype keeps growing. Amplified work often prioritizes speed, clout, and marketability over real understanding.
AI isn't a threat to science. The hype is. The culture around it is. The people enabling it are.
These are all points with which I agree, but they rest on an idealization very common among (and flattering to) scientists. AI is treated as if it had no social/economic/historical context, as is often done with weapons, and of course with discredited ideas subsequently relegated to the status of "bad science." But the science we have is the science we have, though we might hope for better. There is no way to fund AI's development (training data, compute, etc.) without a degree of the "hype," the "culture," and the "people enabling" it. That said, it does seem we could do better on all fronts, and have indeed done better with at least some prior transformative technologies. It's a really interesting question of contemporary intellectual history.
See this paper claiming that the results from Park et al are due to a bug in plotting. https://arxiv.org/abs/2402.14583
Yes, this is one of many critiques of the paper. We acknowledge them in the essay and link to a discussion of the issues. The authors' response is broadly convincing to us, but nonetheless the finding should definitely be treated as preliminary, which is why we present five different lines of evidence, all seeming to point to a slowdown.
This is a fantastic essay. I do think it is worth emphasizing (as you do point out) that there is a big difference between AI that outputs a solution and AI that outputs an explanation.
But as you point out, for explanation-producing AI to become dominant, the incentives need to change. We should consider how that might happen.
I also wrote something arguing explanation-producing and solution-producing AI are different in important ways: https://thomasddewitt.substack.com/p/the-new-ai-for-science-is-different
One point that was not mentioned, but clearly affects the production process, is how NSF funding priorities may have changed. I don't know whether they have, but for years there were complaints that the NSF funded far too few "blue-sky" ideas and preferred to fund experiments whose results were expected to confirm a hypothesis and therefore be deemed successful. Whether this applied to other funding agencies in the US and other countries, I don't know, but if it were common it would help support your explanation.
In my experience, when a new technique proves useful, the literature is subsequently filled with experiments using that technique, and during this period other techniques that could serve a similar role get ignored. This happens in many endeavors, creating "fads" that last for a while, until progress slows and new techniques and instruments are developed that work better and yield better explanations.
Funding for national defense and subsidies for data centers might actually inhibit funding for science. The hope that AI agents might accelerate scientific breakthroughs seems fairly unlikely to materialize. Overconfidence in AI, especially American AI, also leads to huge wastes of capital chasing the wrong approaches to real innovation and R&D.
In many fields I am familiar with (translation, document drafting...), wide adoption of AI seems to go hand in hand with rising thresholds of fault tolerance. Film distributors are no longer afraid to release films with inaccurate/unidiomatic subtitles, if the cost of producing those subtitles is a fraction of employing a human to write them (or even just to check them). This culture of use seems to undercut - to some extent - your claim that commercial AI has incentives to troubleshoot and preempt/prevent errors that AI in science does not (yet) have.
I wonder also how far a general culture of higher tolerance for lower quality/less reliability in all fields might affect social and political attitudes towards science - and science funding - in the longer term? Perhaps science's authority will be undermined, not just by the specific ways scientists use AI, but by an evolution in our culture's attitude to certainty and intellectual authority in general - an evolution already underway, and which makes the widescale deployment of AI possible, as much as AI may in turn accelerate it.
The second section reminded me of a paper in which a simple chi-square statistic was computed incorrectly and yet was published in a Nature journal:
https://medium.com/@saikrishna_17904/do-we-tilt-right-or-left-to-kiss-chi-square-test-847552007ac9
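For what it's worth, here is a minimal sketch of how such a goodness-of-fit check can be run correctly. The counts are made up for illustration, not taken from the study:

```python
# A toy chi-square goodness-of-fit check (hypothetical head-tilt counts,
# not the study's actual data): does the observed split differ from 50/50?
from scipy.stats import chisquare

observed = [80, 44]                 # hypothetical right-tilt vs left-tilt counts
expected = [sum(observed) / 2] * 2  # null hypothesis: no preference

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {stat:.2f}, p = {p_value:.4f}")  # here: chi2 ≈ 10.45, p < 0.01
```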
It's interesting for me to draw parallels between what's happening in STM publishing and developments I'm a part of in trade/consumer publishing.
In trade publishing we've never had to adopt the pretense that what we're publishing is going to advance science or knowledge -- only, we hope each time, to advance our earnings, both as authors and as publishers. BUT, of course, every nonfiction author dreams or hopes that they are making some sort of contribution -- there are few who would crow, "I just rewrote seven previous bestsellers into an unoriginal new book." They want to note, at a minimum, "Well, at least I added some commentary, based on my unique experience, increasing the relevance of the earlier books. Plus, I quoted from my recent consulting learnings."
Does this seem so very different from what you're describing?
Economics and technology have changed the incentives in traditional publishing of all kinds, whether a daily paper, a monthly magazine, textbooks, trade books or The Lancet.
We keep looking to legacy formats to somehow gently adapt and address the new world. Unhappy families all, they are failing to do so, each in their own way.
There needs to be a more dramatic reinvention of "publishing." I'm still hoping that AI might aid in that quest.
This reminds me of the recent paper that argues the apparent slowdown in drug research is down to the technology for generating candidate molecules advancing faster than the technology for evaluating them, as well as a mismatch of incentives (invent a drug and you could get rich and win a Nobel prize; invent a test or a statistical heuristic to decide when to call an experiment, and you get a citation classic at best). For 10^4 candidates, going from a predictive validity of 0.4 to 0.5 is equivalent to doing 40x as many tests (a toy sketch after this comment illustrates the arithmetic):
https://www.nature.com/articles/s41573-022-00552-x.epdf?sharing_token=UAd7xkgoc3sGOe1KIkhqh9RgN0jAjWel9jnR3ZoTv0NCj65ouIhd_KrJ7CxCFmbJ2TFq0lOfa404SWvMspmI5HUyItjPqmmnyWXClFZb-miSYwYal_WrrGSIEXhlXlOsdbeagcaR77R65JnT5n-db_cugkiD4npkm_W7d_Bvdqk%3D
You can either think of this as everyone bogging down in a vast pool of slop, or alternatively think about how AI might contribute to evaluation....
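To make that arithmetic concrete, here is a minimal Monte Carlo sketch; this is my own toy model, not the paper's. Each candidate has a latent true quality, the screen returns a score correlated with it at the stated predictive validity, and the top-scoring candidate is advanced. Under these assumptions, validity 0.5 with 10^4 candidates picks roughly as good a winner as validity 0.4 with ~40x as many candidates:

```python
# Toy model (my own illustration, not Scannell et al.'s): candidates have a
# latent true quality; the assay score is correlated with it at "predictive
# validity" rho; we advance the highest-scoring candidate.
import numpy as np

rng = np.random.default_rng(0)

def avg_quality_of_top_pick(n_candidates, rho, n_trials=300):
    """Average true quality of the top-scoring candidate across many screens."""
    picks = []
    for _ in range(n_trials):
        true_quality = rng.standard_normal(n_candidates)
        noise = rng.standard_normal(n_candidates)
        score = rho * true_quality + np.sqrt(1 - rho**2) * noise  # corr = rho
        picks.append(true_quality[np.argmax(score)])  # quality of the pick
    return float(np.mean(picks))

# Raising validity from 0.4 to 0.5 at N = 10_000 buys about as much as
# screening ~40x more candidates at validity 0.4.
for rho, n in [(0.4, 10_000), (0.5, 10_000), (0.4, 400_000)]:
    print(f"rho={rho}, N={n:>7}: avg true quality of top pick ≈ "
          f"{avg_quality_of_top_pick(n, rho):.2f}")
```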
Arvind and Sayash - thought provoking!! I particularly like the section on Human Understanding. In addition to synthesis, though, abstraction is a key element of human understanding. In fact, over time episodic details semanticize, leaving us mostly with abstractions, and yet we are able to apply them well in disconnected contexts.
One reason that science is not producing much in the way of fundamental insights never gets discussed: many of the gaps in human knowledge are *conceptual* and progress depends on doing original observational work that leads to concepts that can generate new bodies of descriptive corpora. The representations and the data for this new knowledge simply don't exist at the moment, and the normal experimental process of science won't be possible until someone dreams it up. The idea that humans have observed and conceptualized all the foundational knowledge in the world is absurd.
This type of work is exactly what LLMs are not able to do.
> Sloppy use of AI may help in the short run, but will hinder meaningful scientific achievement.
You mention this in the context of scientific research, but I think it is true of AI in general.
Using AI to vibe code or create some slop might get you some clicks or publish an app quickly, but without understanding what the AI created bad things can happen.
AI is a rubber stamp on the conventional wisdom because that’s all it knows.
But is the use of AI really worse than the whole peer review system?
Nothing much has changed since Thomas Kuhn's landmark book.
In fact, whole fields are exactly following the course of the geocentric theory so well discussed in your essay.
And the more funded a field of study is, the worse the conventional wisdom.
Because funding is given to scientists who toe the conventional line.
For example, dark matter and other silliness patching up discredited theories that scientists are funded to “believe” while discoveries of numerous cosmological anomalies by Halton Arp are simply ignored.
Climate “science” is similarly quack theories in the guise of science because those get funding.
Cancer research too. We know cancer is more of a metabolic disease, and yet the money and the conventional wisdom is on the long discredited genetic mutation theory.
And always, a few lone outsiders inevitably are more correct than the conventional wisdom.
Galileo was an outsider.
Why do we have to wait decades and decades for the referees who advocate discredited conventional wisdom to die off — so that decent science can eventually be published?
What feeds it is the referee system and the publishing monopoly, e.g. Elsevier et al.
The science publishing monopoly should be replaced by Substack, where smart people can comment and argue for and against a paper in public, in real time, building online reputations for rigor.
most excellent analysis. thanks for creating and sharing!