I found mistakes in OpenAI's HealthBench using AI

david-gilbertson.medium.com

1 point by Kuinox 13 hours ago