The Download: AI benchmarks, and Spain’s grid blackout


May 8, 2025 - 13:47

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

How to build a better AI benchmark

It’s not easy being one of Silicon Valley’s favorite benchmarks. 

SWE-Bench (pronounced “swee bench”) launched in November 2024 as a way to evaluate an AI model’s coding skill, and it has quickly become one of the most popular tests in AI. A SWE-Bench score is now a mainstay of major model releases from OpenAI, Anthropic, and Google. Outside of foundation models, the fine-tuners at AI firms are in constant competition to see who can rise above the pack.

Despite all the fervor, a SWE-Bench score isn’t exactly a truthful assessment of which model is “better.” Entrants have begun to game the system, which is pushing many others to wonder whether there’s a better way to actually measure AI achievement. Read the full story.

—Russell Brandom

Did solar power cause Spain’s blackout?

At roughly midday on Monday, April 28, the lights went out in Spain. The grid blackout, which extended into parts of Portugal and France, affected tens of millions of people—flights were grounded, cell networks went down, and businesses closed for the day.

Over a week later, officials still aren’t entirely sure what happened. Some have suggested that renewables may have played a role, because wind and solar accounted for about 70% of electricity generation just before the outage. Others, including Spanish government officials, insist that it’s too early to assign blame.

It’ll take weeks to get the full report, but we do know a few things about what happened. Here are a few takeaways that could help our future grid. 

—Casey Crownhart

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 The Trump administration will repeal some global chip curbs 
It’s drawing up new rules that prioritize direct negotiations with various nations. (Bloomberg $)
+ The curbs have always been leaky anyway. (Economist $)

2 India and Pakistan have accused each other of overnight drone attacks
The conflict between the two countries is rapidly escalating. (The Guardian)
+ Pakistan claims to have shot down 25 drones in its airspace. (Reuters)
+ Mass-market military drones have changed the way wars are fought. (MIT Technology Review)

3 The FDA is interested in using AI for drug evaluation
And has met with OpenAI to hear more about how to do it. (Wired $)
+ An AI-driven “factory of drugs” claims to have hit a big milestone. (MIT Technology Review)

4 The US is pushing nations facing its tariffs to adopt Starlink
Government officials in India and other countries have fast-tracked approvals. (WP $)
+ India recently announced new rules for satellite internet providers. (Rest of World)

5 Apple is overhauling its Safari browser to focus on AI search
Its search volume is down for the first time in 22 years. (The Verge)
+ Apple exec Eddy Cue thinks AI search will replace traditional search engines. (Bloomberg $)
+ AI means the end of internet search as we’ve known it. (MIT Technology Review)

6 Mark Zuckerberg is betting big on AI chatbots
He’s on a media charm offensive to convince us that AI friends are the future. (WSJ $)
+ The AI relationship revolution is already here. (MIT Technology Review)

7 Students can’t wean themselves off ChatGPT
And experts fear that they’ll emerge into the workforce essentially illiterate. (NY Mag $)
+ Some educators believe that AI highlights how the ways we teach need to change. (MIT Technology Review)

8 We don’t really know how memory works