Hitting the Books: Why AI won’t be taking our cosmology jobs

The problem with studying the universe around us is that it is simply too big. The stars overhead remain too far away to interact with directly, so we are relegated to testing our theories on the formation of the galaxies based on observable data.
Simulating these celestial bodies on computers has proven an immensely useful aid in wrapping our heads around the nature of reality and, as Andrew Pontzen explains in his new book, The Universe in a Box: Simulations and the Quest to Code the Cosmos, recent advances in supercomputing technology are further revolutionizing our capability to model the complexities of the cosmos (not to mention myriad Earth-based challenges) on a smaller scale. In the excerpt below, Pontzen looks at the recent emergence of astronomy-focused AI systems, what they’re capable of accomplishing in the field and why he’s not too worried about losing his job to one.
Adapted from THE UNIVERSE IN A BOX: Simulations and the Quest to Code the Cosmos by Andrew Pontzen published on June 13, 2023 by Riverhead, an imprint of Penguin Publishing Group, a division of Penguin Random House LLC. Copyright © 2023 Andrew Pontzen.
As a cosmologist, I spend a large fraction of my time working with supercomputers, generating simulations of the universe to compare with data from real telescopes. The goal is to understand the effect of mysterious substances like dark matter, but no human can digest all the data held on the universe, nor all the results from simulations. For that reason, artificial intelligence and machine learning are key parts of cosmologists’ work.
Consider the Vera Rubin Observatory, a giant telescope built atop a Chilean mountain and designed to repeatedly photograph the sky over the coming decade. It will not just build a static picture: it will particularly be searching for objects that move (asteroids and comets), or change brightness (flickering stars, quasars and supernovae), as part of our ongoing campaign to understand the ever-changing cosmos. Machine learning can be trained to spot these objects, allowing them to be studied with other, more specialized telescopes. Similar techniques can even help sift through the changing brightness of vast numbers of stars to find telltale signs of which host planets, contributing to the search for life in the universe. Beyond astronomy there is no shortage of scientific applications: Google’s artificial intelligence subsidiary DeepMind, for instance, has built a network that can outperform all known techniques for predicting the shapes of proteins starting from their molecular structure, a crucial and difficult step in understanding many biological processes.
These examples illustrate why scientific excitement around machine learning has built during this century, and there have been strong claims that we are witnessing a scientific revolution. As far back as 2008, Chris Anderson wrote an article for Wired magazine that declared the scientific method, in which humans propose and test specific hypotheses, obsolete: ‘We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.’
I think this is taking things too far. Machine learning can simplify and improve certain aspects of traditional scientific approaches, especially where processing of complex information is required. Or it can digest text and answer factual questions, as illustrated by systems like ChatGPT. But it cannot entirely supplant scientific reasoning, because that is about the search for an improved understanding of the universe around us. Finding new patterns in data or restating existing facts are only narrow aspects of that search. There is a long way to go before machines can do meaningful science without any human oversight.
To understand the importance of context and understanding in science, consider the case of the OPERA experiment which in 2011 seemingly determined that neutrinos travel faster than the speed of light. The claim is close to a physics blasphemy, because relativity would have to be rewritten; the speed limit is integral to its formulation. Given the enormous weight of experimental evidence that supports relativity, casting doubt on its foundations is not a step to be taken lightly.
Knowing this, theoretical physicists queued up to dismiss the result, suspecting the neutrinos must actually be traveling slower than the measurements indicated. Yet no problem with the measurement could be found – until, six months later, OPERA announced that a cable had been loose during their experiment, accounting for the discrepancy. Neutrinos traveled no faster than light; the data suggesting otherwise had been wrong.
Surprising data can lead to revelations under the right circumstances. The planet Neptune was discovered when astronomers noticed something awry with the orbits of the other planets. But where a claim is discrepant with existing theories, it is much more likely that there is a fault with the data; this was the gut feeling that physicists trusted when seeing the OPERA results. It is hard to formalize such a reaction into a simple rule for programming into a computer intelligence, because it is midway between the knowledge-recall and pattern-searching worlds.
The human elements of science will not be replicated by machines unless they can integrate their flexible data processing with a broader corpus of knowledge. There is an explosion of different approaches toward this goal, driven in part by the commercial need for computer intelligences to explain their decisions. In Europe, if a machine makes a decision that impacts you personally – declining your application for a mortgage, maybe, or increasing your insurance premiums, or pulling you aside at an airport – you have a legal right to ask for an explanation. That explanation must necessarily reach outside the narrow world of data in order to connect to a human sense of what is reasonable or unreasonable.
Problematically, it is often not possible to generate a full account of how machine-learning systems reach a particular decision. They use many different pieces of information, combining them in complex ways; the only truly accurate description is to write down the computer code and show the way the machine was trained. That is accurate but not very explanatory. At the other extreme, one might point to an obvious factor that dominated a machine’s decision: you are a lifelong smoker, perhaps, and other lifelong smokers died young, so you have been declined for life insurance. That is a more useful explanation, but might not be very accurate: other smokers with a different employment history and medical record have been accepted, so what precisely is the difference? Explaining decisions in a fruitful way requires a balance between accuracy and comprehensibility.
In the case of physics, using machines to create digestible, accurate explanations which are anchored in existing laws and frameworks is an approach in its infancy. It starts with the same demands as commercial artificial intelligence: the machine must not just point to its decision (that it has found a new supernova, say) but also give a small, digestible amount of information about why it has reached that decision. That way, you can start to understand what it is in the data that has prompted a particular conclusion, and see whether it agrees with your existing ideas and theories of cause and effect. This approach has started to bear fruit, producing simple but useful insights into quantum mechanics, string theory, and (from my own collaborations) cosmology.
These applications are still all framed and interpreted by humans. Could we imagine instead having the computer frame its own scientific hypotheses, balance new data with the weight of existing theories, and go on to explain its discoveries by writing a scholarly paper without any human assistance? This is not Anderson’s vision of the theory-free future of science, but a more exciting, more disruptive and much harder goal: for machines to build and test new theories atop hundreds of years of human insight.
This article originally appeared on Engadget at https://www.engadget.com/hitting-the-books-universe-in-a-box-andrew-pontzen-riverhead-books-153005483.html?src=rss