Once M3 is officially released, could you test it on DeepSWE? It feels like SWE-Pro no longer reflects a model's coding ability.

#34
by PhelixZhen - opened

As the title says.
Once it's officially released, could you also benchmark it on DeepSWE? I feel that SWE-Pro is no longer a good reflection of the coding ability of today's models.

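PhelixZhen changed discussion title from "Once it is officially released, could you test it on DeepSWE? It feels like SWE-Pro no longer reflects a model's coding ability." to "Once M3 is officially released, could you test it on DeepSWE? It feels like SWE-Pro no longer reflects a model's coding ability."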

You would think that the website hosting the largest open-source, bleeding-edge AI model and dataset community and repository would at least have a button to translate text from one language to another by now... 🤯
