[07] John Schulman - Optimizing Expectations: From Deep RL to Stochastic Computation Graphs (The Thesis Review podcast)

Tags: Science, Thesis Review, Sean Welleck

Content provided by The Thesis Review and Sean Welleck. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by The Thesis Review and Sean Welleck or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://de.player.fm/legal.

Published 4+ years ago. Duration: 1:04:28

John Schulman is a Research Scientist and co-founder of OpenAI. John co-leads the reinforcement learning team, researching algorithms that safely and efficiently learn by trial and error and by imitating humans. His PhD thesis, titled "Optimizing Expectations: From Deep Reinforcement Learning to Stochastic Computation Graphs", was completed in 2016 at Berkeley. We talk about his work on stochastic computation graphs and TRPO, how it evolved into PPO and how it is used in large-scale applications like OpenAI Five, as well as his recent work on generalization in RL.

Episode notes: https://cs.nyu.edu/~welleck/episode7.html

Follow The Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter, and find out more about the show at https://cs.nyu.edu/~welleck/podcast.html

Support The Thesis Review at www.buymeacoffee.com/thesisreview

49 episodes