TIL The Goodhart's Law: Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.
TIL The Goodhart's Law: Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.
Here is the Wikipedia link.
Alternative (and generally easier to understand) formulation: Once a measure becomes a target, it ceases to be a good measure.
See: grades, GDP, workplace metrics...
Yeah. OP's overly complicated explanation doesn't convey the reason why this happens, which is really kinda just human psychology. People are probably thinking it's like observability in quantum mechanics or some shit.
Goodhart's Law Example:
It sounds like this is an example you chose from first hand experience. If so, I'm very sorry. That sounds incredibly frustrating.
I never heard of this before, but I now know the word to describe exercise problems!
Body builders who are judged by how bulging their muscles are feel like garbage despite supposedly being "peak".
People with high muscle mass or tall being screwed over by BMI targets.
People who are told weight indicates health ignore everything else in exchange for lowering calories.
Even in high school I remember how they would judge you based on like how many push ups you could do... no one who did a ton did proper push ups. Which led to them not helping at all as actual exercise, and even possibly leading to injury.
Heck, we can even use this for stupid Dog Shows, where because they measure specific things for the "goodness" of the dog, they screw over the dog in every way imaginable that isn't being judged.
This is a good law to know. I like knowing this law. It's sad how often it's used, but it's good to know.
A big part of it seems to be manipulation of the results? So, like, devs writing tests for more parts of the code base, but ones that are written to always pass.
Lol I really don't think anyone was thinking that
ya, code testing doesn't actually increase code complexity, nor worsen the code base, and tends to actually reduce and avoid bugs
This is the first line of the linked article btw.
So how does a company manage anything if they can't use measurement targets?
Like software engineering. How do you improve productivity or code quality if setting a target value for a measurement doesn't work?
You don’t make your measures your targets.
Example:
“Our customers hate us. We will make our employees get a 10 on their surveys for each customer or we’ll punish them” makes the measure a target.
“Our customers hate us, so we’re going to change our shitty policies to be more consumer friendly and see how our customers respond” keeps the measure as a measure.
Ok I'm going to answer my own question because I'm too curious to wait lol
[Source]
Thanks, yes, I saw that one, too, but I liked the emphasis on relationships. The shorter version is easier to get but it does not explain why this happens. E.g., you can observe some relationship (e.g., test results and a student's intelligence) and then you target grades. But then you have an incentive to teach to the test, which breaks down the relationship between test results and intelligence. Other people here gave great examples of relationships that can fail.
Imagine an antivirus program that looks at a piece of code and outputs either "Yes, this is malware" or "No, this is not malware." It is not perfect, but it is pretty good.
If the malware authors have access to this program, they can test their malware with it. They can keep modifying their malware until it passes the antivirus program.
Once the antivirus people publish a function AV(code)→boolean, the malware people can use that function to make malware that the function mistakes for non-malware.
If you publish the exact metric that you promise to use to make a decision, then people who want to control your decision can use that metric to test their methods of manipulating you.