Increased efficiency can sometimes, counterintuitively, lead to worse outcomes. This is true almost everywhere. We will name this phenomenon the strong version of Goodhart's law. As one example, more efficient centralized tracking of student progress by standardized testing seems like such a good idea that well-intentioned laws mandate it. However, testing also incentivizes schools to focus more on teaching students to test well, and less on teaching broadly useful skills. As a result, it can cause overall educational outcomes to become worse. Similar examples abound, in politics, economics, health, science, and many other fields.
[...] This same counterintuitive relationship between efficiency and outcome occurs in machine learning, where it is called overfitting. [...] If we keep on optimizing the proxy objective, even after our goal stops improving, something more worrying happens. The goal often starts getting worse, even as our proxy objective continues to improve. Not just a little bit worse either — often the goal will diverge towards infinity.
This is an extremely general phenomenon in machine learning. It mostly doesn’t matter what our goal and proxy are, or what model architecture we use. If we are very efficient at optimizing a proxy, then we make the thing it is a proxy for grow worse.
Showing posts with label data science. Show all posts
Showing posts with label data science. Show all posts
Sunday, March 30, 2025
Too much efficiency makes everything worse
From "Overfitting and the strong version of Goodhart's law":
Thursday, July 13, 2023
Compression Schemes as Prediction Schemes
Awhile ago I mentioned inference in a Github post on my other blog in reference to security. But when we talk about inference in a wider informational sense, we often talk about complexity.
Labels:
compression,
data science,
gzip,
linguistics,
programming
Subscribe to:
Comments (Atom)
Using Python To Access archive.today, July 2025
It seems like a lot of the previous software wrappers to interact with archive.today (and archive.is, archive.ph, etc) via the command-line ...
-
Latin1 was the early default character set for encoding documents delivered via HTTP for MIME types beginning with /text . Today, only ...
-
From "Overfitting and the strong version of Goodhart's law" : Increased efficiency can sometimes, counterintuitively, lead to ...
-
Playing around with writing malware proof-of-concepts, running red and blue team simulations in my computer lab against Windows Home edition...