@baldur You are absolutely correct about this, but to be fair, those models are super brittle and can be "broken" by over-tuning during post-training, where some cases start performing worse while others may or may not improve.
They're definitely not degrading models on purpose, since that would mean shipping a worse product at the same inference cost.
In reply to Patrys (@patrys@mastodon.online): Patryk Zawadzki, software engineer, systems admin, FLOSS contributor, Pythonista, ex-GNOME Foundation, CTO & Founder of Saleor Commerce. Not a tech bro. (he/him)