So, why measure toy benchmark programs?
You don't have time to inspect the source code of real applications to check that different implementations are even roughly comparable. Is it really the same program when it is written in different programming languages?
You do have time to inspect 100-line programs. You do have time to write 100-line programs. You still might have something to learn from how other people write 100-line programs.
Non-motivation: We are profoundly uninterested in claims that these measurements, of a few tiny programs, somehow define the relative performance of programming languages.