I am excited to finally share an idea that has been cooking for awhile: Measuring Data
https://arxiv.org/abs/2212.05129
When you "measure data", you quantify its characteristics to support dataset comparison & curation.
You also begin to know what systems will learn.