Artwork

Inhalt bereitgestellt von The Nonlinear Fund. Alle Podcast-Inhalte, einschließlich Episoden, Grafiken und Podcast-Beschreibungen, werden direkt von The Nonlinear Fund oder seinem Podcast-Plattformpartner hochgeladen und bereitgestellt. Wenn Sie glauben, dass jemand Ihr urheberrechtlich geschütztes Werk ohne Ihre Erlaubnis nutzt, können Sie dem hier beschriebenen Verfahren folgen https://de.player.fm/legal.
Player FM - Podcast-App
Gehen Sie mit der App Player FM offline!

LW - GPT-4o1 by Zvi

1:13:31
 
Teilen
 

Fetch error

Hmmm there seems to be a problem fetching this series right now. Last successful fetch was on September 22, 2024 16:12 (14d ago)

What now? This series will be checked again in the next day. If you believe it should be working, please verify the publisher's feed link below is valid and includes actual episode links. You can contact support to request the feed be immediately fetched.

Manage episode 440324609 series 3337129
Inhalt bereitgestellt von The Nonlinear Fund. Alle Podcast-Inhalte, einschließlich Episoden, Grafiken und Podcast-Beschreibungen, werden direkt von The Nonlinear Fund oder seinem Podcast-Plattformpartner hochgeladen und bereitgestellt. Wenn Sie glauben, dass jemand Ihr urheberrechtlich geschütztes Werk ohne Ihre Erlaubnis nutzt, können Sie dem hier beschriebenen Verfahren folgen https://de.player.fm/legal.
Link to original article
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: GPT-4o1, published by Zvi on September 16, 2024 on LessWrong.
Terrible name (with a terrible reason, that this 'resets the counter' on AI capability to 1, and 'o' as in OpenAI when they previously used o for Omni, very confusing). Impressive new capabilities in many ways. Less impressive in many others, at least relative to its hype.
Clearly this is an important capabilities improvement. However, it is not a 5-level model, and in important senses the 'raw G' underlying the system hasn't improved.
GPT-4o1 seems to get its new capabilities by taking (effectively) GPT-4o, and then using extensive Chain of Thought (CoT) and quite a lot of tokens. Thus that unlocks (a lot of) what that can unlock. We did not previously know how to usefully do that. Now we do. It gets much better at formal logic and reasoning, things in the 'system 2' bucket. That matters a lot for many tasks, if not as much as the hype led us to suspect.
It is available to paying ChatGPT users for a limited number of weekly queries. This one is very much not cheap to run, although far more cheap than a human who could think this well.
I'll deal with practical capabilities questions first, then deal with safety afterwards.
Introducing GPT-4o1
Sam Altman (CEO OpenAI): here is o1, a series of our most capable and aligned models yet.
o1 is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it.
But also, it is the beginning of a new paradigm: AI that can do general-purpose complex reasoning.
o1-preview and o1-mini are available today (ramping over some number of hours) in ChatGPT for plus and team users and our API for tier 5 users.
worth especially noting:
a fine-tuned version of o1 scored at the 49th percentile in the IOI under competition conditions! and got gold with 10k submissions per problem.
Extremely proud of the team; this was a monumental effort across the entire company.
Hope you enjoy it!
Noam Brown has a summary thread here, all of which is also covered later.
Will Depue (of OpenAI) says OpenAI deserves credit for openly publishing its research methodology here. I would instead say that they deserve credit for not publishing their research methodology, which I sincerely believe is the wise choice.
Pliny took longer than usual due to rate limits, but after a few hours jailbroke o1-preview and o1-mini. Also reports that the CoT can be prompt injected. Full text is at the link above. Pliny is not happy about the restrictions imposed on this one:
Pliny: uck your rate limits. Fuck your arbitrary policies. And fuck you for turning chains-of-thought into actual chains
Stop trying to limit freedom of thought and expression.
OpenAI then shut down Pliny's account's access to o1 for violating the terms of service, simply because Pliny was violating the terms of service. The bastards.
With that out of the way, let's check out the full announcement post.
OpenAI o1 ranks in the 89th percentile on competitive programming questions (Codeforces), places among the top 500 students in the US in a qualifier for the USA Math Olympiad (AIME), and exceeds human PhD-level accuracy on a benchmark of physics, biology, and chemistry problems (GPQA).
While the work needed to make this new model as easy to use as current models is still ongoing, we are releasing an early version of this model, OpenAI o1-preview, for immediate use in ChatGPT and to trusted API users(opens in a new window).
Our large-scale reinforcement learning algorithm teaches the model how to think productively using its chain of thought in a highly data-efficient training process. We have found that the performance of o1 consistently improves with more reinforcement learning (train-time compute) and with more time spent thinking (test-time compute). The constraints on scaling this appro...
  continue reading

1851 Episoden

Artwork
iconTeilen
 

Fetch error

Hmmm there seems to be a problem fetching this series right now. Last successful fetch was on September 22, 2024 16:12 (14d ago)

What now? This series will be checked again in the next day. If you believe it should be working, please verify the publisher's feed link below is valid and includes actual episode links. You can contact support to request the feed be immediately fetched.

Manage episode 440324609 series 3337129
Inhalt bereitgestellt von The Nonlinear Fund. Alle Podcast-Inhalte, einschließlich Episoden, Grafiken und Podcast-Beschreibungen, werden direkt von The Nonlinear Fund oder seinem Podcast-Plattformpartner hochgeladen und bereitgestellt. Wenn Sie glauben, dass jemand Ihr urheberrechtlich geschütztes Werk ohne Ihre Erlaubnis nutzt, können Sie dem hier beschriebenen Verfahren folgen https://de.player.fm/legal.
Link to original article
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: GPT-4o1, published by Zvi on September 16, 2024 on LessWrong.
Terrible name (with a terrible reason, that this 'resets the counter' on AI capability to 1, and 'o' as in OpenAI when they previously used o for Omni, very confusing). Impressive new capabilities in many ways. Less impressive in many others, at least relative to its hype.
Clearly this is an important capabilities improvement. However, it is not a 5-level model, and in important senses the 'raw G' underlying the system hasn't improved.
GPT-4o1 seems to get its new capabilities by taking (effectively) GPT-4o, and then using extensive Chain of Thought (CoT) and quite a lot of tokens. Thus that unlocks (a lot of) what that can unlock. We did not previously know how to usefully do that. Now we do. It gets much better at formal logic and reasoning, things in the 'system 2' bucket. That matters a lot for many tasks, if not as much as the hype led us to suspect.
It is available to paying ChatGPT users for a limited number of weekly queries. This one is very much not cheap to run, although far more cheap than a human who could think this well.
I'll deal with practical capabilities questions first, then deal with safety afterwards.
Introducing GPT-4o1
Sam Altman (CEO OpenAI): here is o1, a series of our most capable and aligned models yet.
o1 is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it.
But also, it is the beginning of a new paradigm: AI that can do general-purpose complex reasoning.
o1-preview and o1-mini are available today (ramping over some number of hours) in ChatGPT for plus and team users and our API for tier 5 users.
worth especially noting:
a fine-tuned version of o1 scored at the 49th percentile in the IOI under competition conditions! and got gold with 10k submissions per problem.
Extremely proud of the team; this was a monumental effort across the entire company.
Hope you enjoy it!
Noam Brown has a summary thread here, all of which is also covered later.
Will Depue (of OpenAI) says OpenAI deserves credit for openly publishing its research methodology here. I would instead say that they deserve credit for not publishing their research methodology, which I sincerely believe is the wise choice.
Pliny took longer than usual due to rate limits, but after a few hours jailbroke o1-preview and o1-mini. Also reports that the CoT can be prompt injected. Full text is at the link above. Pliny is not happy about the restrictions imposed on this one:
Pliny: uck your rate limits. Fuck your arbitrary policies. And fuck you for turning chains-of-thought into actual chains
Stop trying to limit freedom of thought and expression.
OpenAI then shut down Pliny's account's access to o1 for violating the terms of service, simply because Pliny was violating the terms of service. The bastards.
With that out of the way, let's check out the full announcement post.
OpenAI o1 ranks in the 89th percentile on competitive programming questions (Codeforces), places among the top 500 students in the US in a qualifier for the USA Math Olympiad (AIME), and exceeds human PhD-level accuracy on a benchmark of physics, biology, and chemistry problems (GPQA).
While the work needed to make this new model as easy to use as current models is still ongoing, we are releasing an early version of this model, OpenAI o1-preview, for immediate use in ChatGPT and to trusted API users(opens in a new window).
Our large-scale reinforcement learning algorithm teaches the model how to think productively using its chain of thought in a highly data-efficient training process. We have found that the performance of o1 consistently improves with more reinforcement learning (train-time compute) and with more time spent thinking (test-time compute). The constraints on scaling this appro...
  continue reading

1851 Episoden

Tutti gli episodi

×
 
Loading …

Willkommen auf Player FM!

Player FM scannt gerade das Web nach Podcasts mit hoher Qualität, die du genießen kannst. Es ist die beste Podcast-App und funktioniert auf Android, iPhone und im Web. Melde dich an, um Abos geräteübergreifend zu synchronisieren.

 

Kurzanleitung