Interviewing Ross Taylor on LLM reasoning, Llama fine-tuning, Galactica, agents

Interconnects Audio

Content provided by Nathan Lambert. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Nathan Lambert or his podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://de.player.fm/legal.

3M ago 1:02:22

I had the pleasure of talking with Ross Taylor (https://x.com/rosstaylor90), who has a broad range of unique experience in the language modeling space: evaluation work, lead authorship of Galactica, Llama post-training, and more. This is a really great conversation on the frontier of language model (LM) reasoning, LM deployments and demos, LMs for science, RLHF, and other topics. I’ve been trying to get Ross to come on for a while. He’s one of those people in the LM space who doesn’t speak too much, but when he does, you listen.

Ross Taylor was previously an LLM lead at Meta AI, heading up the reasoning team. Earlier at Meta, he led the early work on LLM agents and was the research lead on the Galactica project. Before that, he was a co-founder of Papers with Code, which was acquired by Meta in 2019. Earlier still, he worked as a quant in sports betting and finance, and before that as a policy advisor for the UK Government. He is currently working on a new startup.

More details: https://www.interconnects.ai/p/interviewing-ross-taylor-on-llm-reasoning

00:00:00 Introduction of Ross Taylor and his background
00:02:12 Papers with Code
00:09:58 Galactica, goals, controversy, legacy
00:18:12 Technical details of the Galactica model
00:23:18 Potential for language models to make scientific discoveries
00:25:21 Defining and improving reasoning in language models
00:32:38 Process-based reward models and their potential applications
00:35:00 Generating synthetic data for SFT
00:40:23 Evaluating the effectiveness of language models as judges for human preference data
00:42:43 Considerations for creating base models that are easy to fine-tune
00:46:45 Balancing SFT and RLHF
00:54:13 Characteristics of successful post-training teams
00:58:26 Future directions for language model development

58 episodes

#Science #Tech #Nathan Lambert #Artificial Intelligence #Machine Learning