Grok AI Performance Test

News

OpenAI’s o3: AI Benchmark Discrepancy Reveals Gaps in Performance Claims

The latest results from FrontierMath, a benchmark test for generative AI on advanced ... Other rankings include: OpenAI o1, Grok-3 mini, Claude 3.7 Sonnet (16K), Grok-3, Claude 3.7 ...

TechCrunch, 28d

A dev built a test to see how AI chatbots respond to controversial topics

A pseudonymous developer has created what they’re calling a “free speech eval,” SpeechMap, for the AI models ... the chatbot Grok. Grok 3 responds to 96.2% of SpeechMap’s test prompts ...

1d on MSN

OpenAI’s HealthBench reveals how well AI answers medical questions

OpenAI has launched HealthBench, a new dataset designed to test how accurately AI models respond to real-world health care ...

13d on MSN

DOGE’s AI surveillance risks silencing whistleblowers and weakening democracy

Surveillance of speech by algorithm raises urgent questions about data privacy and the future of a neutral, expert public ...

14d on MSN

Meta bets you want a sprinkle of social in your chatbot

Meta is scrambling to grab some of that ChatGPT and Grok buzz with the launch of its own standalone AI app. Built on its ...

TechCrunch, 23d

OpenAI’s o3 AI model scores lower on a benchmark than the company initially implied

“We’re seeing [internally], with o3 in aggressive test-time compute settings ... of publishing misleading benchmark charts for its latest AI model, Grok 3. Just this month, Meta admitted ...

14d on MSN

Meta AI finally gets an app, but users in India must wait for voice chats

Their latest model, Llama 4, underlines features such as text conversations, voice conversations and image editing, which ...

2d on MSN

Here’s Why NVIDIA Corporation (NVDA) Fell in Q1

Baron Funds, an investment management company, released its “Baron Technology Fund” first quarter 2025 investor letter. A ...

14d on MSN

Claude Sonnet 3.7 is the leading LLM for AI SEO: Report

Benchmark reveals which LLMs you can use for some SEO tasks. It also reminds us that humans are more reliable than AI (for ...

Business of Apps, 13d

AI App Market Map 2025

The AI app market is forecast to grow by a compound annual growth rate of 80.7% over the next five years, according to the AI App Report. Chatbots, image ...
