"AI Risk Debate" For Humanity: An AI Safety Podcast Episode #12 Theo Jaffee Interview
January 24, 2024 5:55 pm
In Episode #12, we have our first For Humanity debate!! John talks with Theo Jaffee, a fast-rising AI podcaster and self-described “techno-optimist.” The debate covers a wide range of topics in AI risk.
This podcast is not journalism. But it’s not opinion either. This show simply strings together the existing facts and underscores the unthinkable, probable outcome: the end of all life on Earth.
For Humanity: An AI Safety Podcast is the accessible AI safety podcast for all humans; no tech background required. Our show focuses solely on the threat of human extinction from AI.
Peabody Award-winning former journalist John Sherman explores the shocking worst-case scenario of artificial intelligence: human extinction. The makers of AI openly admit that their work could kill all humans, possibly within as little as two years. This podcast is solely about the threat of human extinction from AGI. We’ll meet the heroes and villains, explore the issues and ideas, and learn what you can do to help save humanity.
Resources
Theo’s YouTube Channel: https://youtube.com/@theojaffee8530?si=aBnWNdViCiL4ZaEg
Glossary: First definitions by ChatGPT-4; I asked it to give answers simple enough for an elementary school student to understand (lol, I often find this helpful!). Commentaries by John Sherman.
Reinforcement Learning from Human Feedback (RLHF):
Definition: RLHF, or Reinforcement Learning from Human Feedback, is like teaching a computer to make decisions by giving it rewards when it does something good and telling it what's right when it makes a mistake. It's a way for computers to learn and get better at tasks with the help of guidance from humans, just like how a teacher helps students learn. So, it's like teamwork between people and computers to make the computer really smart!
Commentary: RLHF is widely seen as bullshit by AI safety researchers like Connor Leahy. When you give an AI model a thumbs up or thumbs down for its answer, you are giving it RLHF. But Leahy says that without knowing what’s happening inside the black-box system, RLHF is not alignment work at all; it’s just blindly poking the model in the dark to get a different result, which you also do not know how it arrived at.
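To make the thumbs up/thumbs down loop concrete, here's a toy Python sketch. This is my own illustration, not anyone's real pipeline: the canned answers, scores, and learning rate are all made up, and a real model is a giant neural network, not a score table.

```python
import random

# Toy sketch of the RLHF feedback loop, not a real training pipeline:
# the "model" is just a table of scores over canned answers, and a
# thumbs up / thumbs down nudges the score of whichever answer it picked.
answers = ["helpful answer", "evasive answer", "rude answer"]
scores = {a: 0.0 for a in answers}  # stand-in for the model's weights

def pick_answer():
    # Prefer higher-scored answers, with some randomness left in.
    weights = [2.718 ** scores[a] for a in answers]
    return random.choices(answers, weights=weights)[0]

def human_feedback(answer):
    # Pretend the human always upvotes the helpful answer.
    return 1.0 if answer == "helpful answer" else -1.0

for _ in range(200):
    answer = pick_answer()
    reward = human_feedback(answer)   # thumbs up (+1) or thumbs down (-1)
    scores[answer] += 0.1 * reward    # nudge a number; that's all RLHF sees

print(scores)  # "helpful answer" ends up with the highest score
```

Notice that the loop only nudges numbers up or down; nothing in it explains why the model picked an answer in the first place, which is exactly Leahy's black-box complaint.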
Model Weights:
Definition: Model weights are like the special numbers that help a computer understand and remember things. Imagine it's like a recipe book, and these weights are the amounts of ingredients needed to make a cake. When the computer learns new things, these weights get adjusted so that it gets better at its job, just like changing the recipe to make the cake taste even better! So, model weights are like the secret ingredients that make the computer really good at what it does.
Commentary: Releasing a model’s weights publicly, open-sourcing them, is very controversial. Meta and Yann LeCun are big fans of this, which makes me automatically opposed.
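If you want to see a "weight" with your own eyes, here's a minimal sketch of a made-up one-weight model learning y = 3x. Real models like GPT-4 have billions of these numbers, but the idea is the same.

```python
# Toy illustration: a "model" with a single weight w, learning y = 3x.
# Training nudges w until predictions match the examples; the learned
# number w is the model weight, the "secret ingredient" in the recipe.
w = 0.0

data = [(1, 3), (2, 6), (3, 9)]  # examples of y = 3 * x

for epoch in range(50):
    for x, y in data:
        prediction = w * x
        error = prediction - y
        w -= 0.05 * error * x  # adjust the weight to shrink the error

print(round(w, 2))  # ~3.0: this one number is what the model "knows"
```

Open-sourcing a model just means publishing its learned numbers like this w, all billions of them, for anyone to run or modify.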
Foom/Fast Take-off:
Definition: "AI fast take-off" or "foom" refers to the idea that artificial intelligence (AI) could become super smart and powerful really quickly. It's like imagining a computer getting super smart all of a sudden, like magic! Some people use the word "foom" to talk about the possibility of AI becoming super intelligent in a short amount of time. It's a bit like picturing a computer going from learning simple things to becoming incredibly smart in the blink of an eye! Foom comes from cartoons, it’s the sound a super hero makes in comic books when they burst off the ground into flight.
Commentary: Many AI safety researchers think foom is very possible. It simply means that once an AI system begins recursively improving itself, all on its own, with no need for sleep, at speeds far faster than a human, and with ever-increasing compute speed and power, then within a matter of hours or days an Artificial General Intelligence could become an Artificial Superintelligence, and we would lose control, and potentially meet extinction, very quickly.
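To see why "hours or days" is even conceivable, here's a toy back-of-the-envelope calculation in Python. The 10% gain per cycle and the one-hour cycle time are pure assumptions I made up for illustration; the point is only how fast compounding runs away.

```python
# Toy arithmetic, not a prediction: assume each self-improvement cycle
# makes the system 10% better and takes one hour (both numbers invented).
capability = 1.0  # arbitrary starting level

for hour in range(0, 169, 24):  # one simulated week, printed daily
    print(f"hour {hour:3d}: capability {capability:,.1f}")
    for _ in range(24):
        capability *= 1.10  # each cycle builds on the previous one

# 1.10 ** 168 is roughly 9 million times the starting level:
# compounding, not magic, is the whole point of foom.
```

Change the assumed numbers and the timeline moves, but the shape of the curve, slow and then suddenly vertical, stays the same.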
Gradient Descent:
Definition: Gradient descent is like a treasure hunt for the best way to do something. Imagine you're on a big hill with a metal detector, trying to find the lowest point. The detector beeps louder when you're closer to the lowest spot. In gradient descent, you adjust your steps based on these beeps to reach the lowest point on the hill, and in the computer world, it helps find the best values for a task, like making a robot walk smoothly or a computer learn better.
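Here is the hill-and-metal-detector picture in a few lines of Python. It's a minimal sketch; the hill shape, starting point, and step size are arbitrary choices for illustration.

```python
# Minimal gradient descent on a one-dimensional "hill": f(x) = (x - 4)**2.
# The slope of the hill plays the role of the metal detector's beep,
# telling us which way is downhill and how steep it is.

def gradient(x):
    return 2 * (x - 4)  # derivative of f(x) = (x - 4)**2

x = 0.0              # start somewhere on the hill
learning_rate = 0.1  # how big a step to take on each beep

for _ in range(100):
    x -= learning_rate * gradient(x)  # step against the slope

print(round(x, 3))  # ~4.0, the lowest point of the hill
```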
Orthogonality:
Definition: Orthogonality is like making sure things are independent and don't mess each other up. Think of a chef organizing ingredients on a table – if each ingredient has its own space and doesn't mix with others, it's easier to work. In computers, orthogonality means keeping different parts separate, so changing one thing doesn't accidentally affect something else. It's like having a well-organized kitchen where each tool has its own place, making it easy to cook without chaos! ...