• 23/02/2025 15:41

xAI accused of manipulating Grok test results

Recent claims by xAI about its artificial intelligence have come under scrutiny. The Grok 3 test results that the company published for the AIME 2025 benchmark proved controversial and led to accusations that the real results may have been distorted. According to IZ, citing TechCrunch, OpenAI representatives said that the graphs published by xAI did not account for the cons@64 methodology, which significantly affects models' final scores. An xAI co-founder, however, insists that the company assessed its product's capabilities correctly. xAI presented Grok 3 as the smartest AI in the world, but its comparison left out a special technique that gives competitors an additional advantage. Under standard testing, Grok 3 Reasoning Beta scores lower than competing OpenAI models, including o3-mini-high. Researchers say that without a clear comparison of all models on equal terms it is difficult to judge the real performance of any of them, which only adds to the confusion among users and investors.
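To make the dispute concrete, here is a minimal sketch of how a first-attempt score differs from a cons@k score, assuming cons@k means "sample k answers per problem and count the most frequent one as the model's final answer". The function names and the toy data are hypothetical illustrations, not xAI's or OpenAI's evaluation code or results.

```python
from collections import Counter


def score_at_1(samples, correct):
    """Share of problems whose first sampled answer is correct."""
    hits = sum(answers[0] == truth for answers, truth in zip(samples, correct))
    return hits / len(correct)


def score_cons_at_k(samples, correct, k=64):
    """Share of problems whose most frequent answer among the first k samples is correct."""
    hits = 0
    for answers, truth in zip(samples, correct):
        # Majority vote over up to k sampled answers (ties go to the answer seen first).
        majority, _ = Counter(answers[:k]).most_common(1)[0]
        hits += majority == truth
    return hits / len(correct)


# Hypothetical toy data: 2 problems, 5 sampled answers each (not real benchmark output).
samples = [
    ["17", "42", "42", "42", "9"],
    ["8", "8", "3", "8", "8"],
]
correct = ["42", "8"]

print(score_at_1(samples, correct))            # first-attempt score
print(score_cons_at_k(samples, correct, k=5))  # consensus score over 5 samples
```

In this toy case the same model looks much stronger under majority voting than on its first attempt, which is why a chart that mixes the two kinds of score across models can mislead.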

The debate over AI testing methodology goes beyond this conflict. AI researchers have repeatedly stressed that benchmarks do not always fully reflect a technology's real capabilities. In addition, the question of how many resources companies spend to achieve their top scores remains open. For this reason, many experts propose a unified approach to testing AI models that would help avoid such disputes in the future.

