
  • so openai is claimed to be doing great on the FrontierMath dataset. I’ve already seen the usual sort of dipshits using this to pump ai on reddit, and here’s a post that went to the frontpage on HN:

    https://xenaproject.wordpress.com/2024/12/22/can-ai-do-maths-yet-thoughts-from-a-mathematician/

    (tl;dr only a few problems from the dataset are public, but if they're representative, about 25% of the problems are survivable by an undergrad; coincidentally this is the % openai says their models are completing.)

    this post is by kevin buzzard. he has a let’s say not easily beloved personality, but I don’t think of him as credulous or grifty, and people in his area regard him as an excellent mathematician.

    he points out, but I think does not focus enough on, how discrediting the secretive nature of the dataset is. keeping the dataset private is necessary to run such experiments in a scientifically reasonable way, but it also makes it totally impossible to run the experiment in a scientifically reasonable way. an experiment which cannot be examined or reproduced is actually the opposite of science. it’s pure grift fuel