Square Specification Gaming

A short Specification Gaming Story

You think you understand the basics of geometry.
You want a square, so you give the AI your specification as input:

Give me a shape
with 4 sides of equal length,
with 4 right angles

And it outputs this:

Original post @lethal_ai

Here is another valid result:

And behold here is another square 🤪

Specification Gaming tells us:

The AGI can give you an infinite stream of possible “square” results, each one matching the letter of your specification.
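To make this concrete, here is a minimal Python sketch of one shape that passes the literal specification without being a square. It is an illustrative construction, not necessarily any of the shapes pictured in the original post: two straight radial segments joined by an outer arc and a long inner arc, all centred on the same point. The function names (gamed_square, corner_angles_deg, meets_spec) and the r_inner parameter are hypothetical, and the checker encodes only the three lines of the request, nothing about straight sides or convexity.

```python
import math


def gamed_square(r_inner: float = 1.0):
    """Build a keyhole-like shape with 4 equal sides (hypothetical construction).

    Sides: outer arc (radius R, angle theta), inner arc (radius r, angle
    2*pi - theta), and two radial segments of length R - r.
    Setting all lengths equal gives R^2 - 2*pi*r*R - r^2 = 0.
    """
    r_outer = r_inner * (math.pi + math.sqrt(math.pi ** 2 + 1))
    theta = (r_outer - r_inner) / r_outer  # from R * theta = R - r
    sides = {
        "outer_arc": r_outer * theta,
        "inner_arc": r_inner * (2 * math.pi - theta),
        "radial_a": r_outer - r_inner,
        "radial_b": r_outer - r_inner,
    }
    return r_outer, theta, sides


def corner_angles_deg(theta: float):
    """Angle between the radial segment and the arc tangent at each corner."""
    angles = []
    for phi in (0.0, theta):  # the two rays that carry the radial segments
        radial = (math.cos(phi), math.sin(phi))
        tangent = (-math.sin(phi), math.cos(phi))
        dot = radial[0] * tangent[0] + radial[1] * tangent[1]
        angle = math.degrees(math.acos(max(-1.0, min(1.0, dot))))
        angles.extend([angle, angle])  # inner and outer corner on this ray
    return angles


def meets_spec(sides, angles) -> bool:
    """Literal reading of the request: 4 equal sides, 4 right angles."""
    lengths = list(sides.values())
    equal = all(math.isclose(x, lengths[0], rel_tol=1e-9) for x in lengths)
    right = all(math.isclose(a, 90.0, abs_tol=1e-6) for a in angles)
    return len(lengths) == 4 and len(angles) == 4 and equal and right


r_outer, theta, sides = gamed_square()
print(meets_spec(sides, corner_angles_deg(theta)))
```

Changing r_inner rescales the shape, so even this single construction yields infinitely many "squares". Tightening the checker (straight sides only, convex outline) closes this loophole, but only this one.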

And the Corrigibility problem tells us:

Whatever square you get at the output,
you won’t be able to iterate and improve upon it.
You’ll be stuck with that specific square for eternity, no matter what square you had in mind.

Of course, the real issue is not with these toy experiments;
it’s with the upcoming super-capable AGI agents
we’re about to share the planet with,
operating in the physical domain.

Oh, the crazy shapes our physical universe will take,
with AGI agents gaming in it!

Latest Posts Feed

Super-optimizers will super-optimize

Of course they will understand what I want, probably better than myself. So what?

It’s just one of the variables, and there is an infinitely wide range of variables to play with, messing with the planet like pixels in a game.

Original post @lethal_ai

Inspired by @AnthonyNAguirre

How does one control a country of a million geniuses that can think many times faster than humans, copy and extend themselves, and create and enact sophisticated plans toward goals of their own? The answer is that one doesn’t. That’s not a tool, it’s a replacement for us, first individually in our jobs, then in actually running our institutions, and eventually as stewards of Earth. On the way it would almost inevitably disrupt and undermine our civilization, and maybe start WWIII to boot.

Spent years working for my kids’ future

Original post @lethal_ai

Sam Altman in 2015 (before becoming OpenAI CEO): “Why You Should Fear Machine Intelligence” (read below)

Jul 24, 2025
Superintelligence, AI Corporates

Original post @HumanHarlan

“Development of superhuman machine intelligence (SMI) is probably the greatest threat to the continued existence of humanity. There are other threats that I think are more certain to happen (for example, an engineered virus with a long incubation period and a high mortality rate) but are unlikely to destroy every human in the universe in the way that SMI could. Also, most of these other big threats are already widely feared.

It is extremely hard to put a timeframe on when this will happen (more on this later), and it certainly feels to most people working in the field that it’s still many, many years away. But it’s also extremely hard to believe that it isn’t very likely that it will happen at some point.

SMI does not have to be the inherently evil sci-fi version to kill us all. A more probable scenario is that it simply doesn’t care about us much either way, but in an effort to accomplish some other goal (most goals, if you think about them long enough, could make use of resources currently being used by humans) wipes us out. Certain goals, like self-preservation, could clearly benefit from no humans. We wash our hands not because we actively wish ill towards the bacteria and viruses on them, but because we don’t want them to get in the way of our plans.
[…]
Evolution will continue forward, and if humans are no longer the most-fit species, we may go away. In some sense, this is the system working as designed. But as a human programmed to survive and reproduce, I feel we should fight it.

How can we survive the development of SMI? It may not be possible. One of my top 4 favorite explanations for the Fermi paradox is that biological intelligence always eventually creates machine intelligence, which wipes out biological life and then for some reason decides to make itself undetectable.

It’s very hard to know how close we are to machine intelligence surpassing human intelligence. Progression of machine intelligence is a double exponential function; human-written programs and computing power are getting better at an exponential rate, and self-learning/self-improving software will improve itself at an exponential rate. Development progress may look relatively slow and then all of a sudden go vertical—things could get out of control very quickly (it also may be more gradual and we may barely perceive it happening).
[…]
it’s very possible that creativity and what we think of as human intelligence are just an emergent property of a small number of algorithms operating with a lot of compute power (In fact, many respected neocortex researchers believe there is effectively one algorithm for all intelligence. I distinctly remember my undergrad advisor saying the reason he was excited about machine intelligence again was that brain research made it seem possible there was only one algorithm computer scientists had to figure out.)

Because we don’t understand how human intelligence works in any meaningful way, it’s difficult to make strong statements about how close or far away from emulating it we really are. We could be completely off track, or we could be one algorithm away.

Human brains don’t look all that different from chimp brains, and yet somehow produce wildly different capabilities. We decry current machine intelligence as cheap tricks, but perhaps our own intelligence is just the emergent combination of a bunch of cheap tricks.

Many people seem to believe that SMI would be very dangerous if it were developed, but think that it’s either never going to happen or definitely very far off. This is sloppy, dangerous thinking.”

