Analysis · AI

The benchmark wars are quietly breaking science

When everyone optimises for the same test, the test stops measuring anything at all.

Photograph: tek54

A leaderboard is a powerful thing. It can also be a trap.

The numbers keep going up. Whether anything is actually improving is a harder question than the charts suggest.
