Artwork

Inhalt bereitgestellt von Machine Learning Street Talk (MLST). Alle Podcast-Inhalte, einschließlich Episoden, Grafiken und Podcast-Beschreibungen, werden direkt von Machine Learning Street Talk (MLST) oder seinem Podcast-Plattformpartner hochgeladen und bereitgestellt. Wenn Sie glauben, dass jemand Ihr urheberrechtlich geschütztes Werk ohne Ihre Erlaubnis nutzt, können Sie dem hier beschriebenen Verfahren folgen https://de.player.fm/legal.
Player FM - Podcast-App
Gehen Sie mit der App Player FM offline!

John Palazza - Vice President of Global Sales @ CentML ( sponsored)

54:50
 
Teilen
 

Manage episode 470701550 series 2803422
Inhalt bereitgestellt von Machine Learning Street Talk (MLST). Alle Podcast-Inhalte, einschließlich Episoden, Grafiken und Podcast-Beschreibungen, werden direkt von Machine Learning Street Talk (MLST) oder seinem Podcast-Plattformpartner hochgeladen und bereitgestellt. Wenn Sie glauben, dass jemand Ihr urheberrechtlich geschütztes Werk ohne Ihre Erlaubnis nutzt, können Sie dem hier beschriebenen Verfahren folgen https://de.player.fm/legal.

John Palazza from CentML joins us in this sponsored interview to discuss the critical importance of infrastructure optimization in the age of Large Language Models and Generative AI. We explore how enterprises can transition from the innovation phase to production and scale, highlighting the significance of efficient GPU utilization and cost management. The conversation covers the open-source versus proprietary model debate, the rise of AI agents, and the need for platform independence to avoid vendor lock-in, as well as emerging trends in AI infrastructure and the pivotal role of strategic partnerships.

SPONSOR MESSAGES:

***

CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting!

https://centml.ai/pricing/

Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich.

Goto https://tufalabs.ai/

***

TRANSCRIPT:

https://www.dropbox.com/scl/fi/dnjsygrgdgq5ng5fdlfjg/JOHNPALAZZA.pdf?rlkey=hl9wyydi9mj077rbg5acdmo3a&dl=0

John Palazza:

Vice President of Global Sales @ CentML

https://www.linkedin.com/in/john-p-b34655/

TOC:

1. Enterprise AI Organization and Strategy

[00:00:00] 1.1 Organizational Structure and ML Ownership

[00:02:59] 1.2 Infrastructure Efficiency and GPU Utilization

[00:07:59] 1.3 Platform Centralization vs Team Autonomy

[00:11:32] 1.4 Enterprise AI Adoption Strategy and Leadership

2. MLOps Infrastructure and Resource Management

[00:15:08] 2.1 Technology Evolution and Enterprise Integration

[00:19:10] 2.2 Enterprise MLOps Platform Development

[00:22:15] 2.3 AI Interface Evolution and Agent-Based Solutions

[00:25:47] 2.4 CentML's Infrastructure Solutions

[00:30:00] 2.5 Workload Abstraction and Resource Allocation

3. LLM Infrastructure Optimization and Independence

[00:33:10] 3.1 GPU Optimization and Cost Efficiency

[00:36:47] 3.2 AI Efficiency and Innovation Challenges

[00:41:40] 3.3 Cloud Provider Strategy and Infrastructure Control

[00:46:52] 3.4 Platform Independence and Vendor Lock-in

[00:50:53] 3.5 Technical Innovation and Growth Strategy

REFS:

[00:01:25] Apple Acquires GraphLab, Apple Inc.

https://techcrunch.com/2016/08/05/apple-acquires-turi-a-machine-learning-company/

[00:03:50] Bain Tech Report 2024, Gartner

https://www.bain.com/insights/topics/technology-report/

[00:04:50] PaaS vs IaaS Efficiency, Gartner

https://www.gartner.com/en/newsroom/press-releases/2024-11-19-gartner-forecasts-worldwide-public-cloud-end-user-spending-to-total-723-billion-dollars-in-2025

[00:14:55] Fashion Quote, Oscar Wilde

https://www.amazon.com/Complete-Works-Oscar-Wilde-Collins/dp/0007144369

[00:15:30] PointCast Network, PointCast Inc.

https://en.wikipedia.org/wiki/Push_technology

[00:18:05] AI Bain Report, Bain & Company

https://www.bain.com/insights/how-generative-ai-changes-the-game-in-tech-services-tech-report-2024/

[00:20:40] Uber Michelangelo, Uber Engineering Team

https://www.uber.com/en-SE/blog/michelangelo-machine-learning-platform/

[00:20:50] Algorithmia Acquisition, DataRobot

https://www.datarobot.com/newsroom/press/datarobot-is-acquiring-algorithmia-enhancing-leading-mlops-architecture-for-the-enterprise/

[00:22:55] Fine Tuning vs RAG, Heydar Soudani, Evangelos Kanoulas & Faegheh Hasibi.

https://arxiv.org/html/2403.01432v2

[00:24:40] LLM Agent Survey, Lei Wang et al.

https://arxiv.org/abs/2308.11432

[00:26:30] CentML CServe, CentML

https://docs.centml.ai/apps/llm

[00:29:15] CentML Snowflake, Snowflake

https://www.snowflake.com/en/engineering-blog/optimize-llms-with-llama-snowflake-ai-stack/

[00:30:15] NVIDIA H100 GPU, NVIDIA

https://www.nvidia.com/en-us/data-center/h100/

[00:33:25] CentML\'s 60% savings, CentML

https://centml.ai/platform/

  continue reading

238 Episoden

Artwork
iconTeilen
 
Manage episode 470701550 series 2803422
Inhalt bereitgestellt von Machine Learning Street Talk (MLST). Alle Podcast-Inhalte, einschließlich Episoden, Grafiken und Podcast-Beschreibungen, werden direkt von Machine Learning Street Talk (MLST) oder seinem Podcast-Plattformpartner hochgeladen und bereitgestellt. Wenn Sie glauben, dass jemand Ihr urheberrechtlich geschütztes Werk ohne Ihre Erlaubnis nutzt, können Sie dem hier beschriebenen Verfahren folgen https://de.player.fm/legal.

John Palazza from CentML joins us in this sponsored interview to discuss the critical importance of infrastructure optimization in the age of Large Language Models and Generative AI. We explore how enterprises can transition from the innovation phase to production and scale, highlighting the significance of efficient GPU utilization and cost management. The conversation covers the open-source versus proprietary model debate, the rise of AI agents, and the need for platform independence to avoid vendor lock-in, as well as emerging trends in AI infrastructure and the pivotal role of strategic partnerships.

SPONSOR MESSAGES:

***

CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting!

https://centml.ai/pricing/

Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich.

Goto https://tufalabs.ai/

***

TRANSCRIPT:

https://www.dropbox.com/scl/fi/dnjsygrgdgq5ng5fdlfjg/JOHNPALAZZA.pdf?rlkey=hl9wyydi9mj077rbg5acdmo3a&dl=0

John Palazza:

Vice President of Global Sales @ CentML

https://www.linkedin.com/in/john-p-b34655/

TOC:

1. Enterprise AI Organization and Strategy

[00:00:00] 1.1 Organizational Structure and ML Ownership

[00:02:59] 1.2 Infrastructure Efficiency and GPU Utilization

[00:07:59] 1.3 Platform Centralization vs Team Autonomy

[00:11:32] 1.4 Enterprise AI Adoption Strategy and Leadership

2. MLOps Infrastructure and Resource Management

[00:15:08] 2.1 Technology Evolution and Enterprise Integration

[00:19:10] 2.2 Enterprise MLOps Platform Development

[00:22:15] 2.3 AI Interface Evolution and Agent-Based Solutions

[00:25:47] 2.4 CentML's Infrastructure Solutions

[00:30:00] 2.5 Workload Abstraction and Resource Allocation

3. LLM Infrastructure Optimization and Independence

[00:33:10] 3.1 GPU Optimization and Cost Efficiency

[00:36:47] 3.2 AI Efficiency and Innovation Challenges

[00:41:40] 3.3 Cloud Provider Strategy and Infrastructure Control

[00:46:52] 3.4 Platform Independence and Vendor Lock-in

[00:50:53] 3.5 Technical Innovation and Growth Strategy

REFS:

[00:01:25] Apple Acquires GraphLab, Apple Inc.

https://techcrunch.com/2016/08/05/apple-acquires-turi-a-machine-learning-company/

[00:03:50] Bain Tech Report 2024, Gartner

https://www.bain.com/insights/topics/technology-report/

[00:04:50] PaaS vs IaaS Efficiency, Gartner

https://www.gartner.com/en/newsroom/press-releases/2024-11-19-gartner-forecasts-worldwide-public-cloud-end-user-spending-to-total-723-billion-dollars-in-2025

[00:14:55] Fashion Quote, Oscar Wilde

https://www.amazon.com/Complete-Works-Oscar-Wilde-Collins/dp/0007144369

[00:15:30] PointCast Network, PointCast Inc.

https://en.wikipedia.org/wiki/Push_technology

[00:18:05] AI Bain Report, Bain & Company

https://www.bain.com/insights/how-generative-ai-changes-the-game-in-tech-services-tech-report-2024/

[00:20:40] Uber Michelangelo, Uber Engineering Team

https://www.uber.com/en-SE/blog/michelangelo-machine-learning-platform/

[00:20:50] Algorithmia Acquisition, DataRobot

https://www.datarobot.com/newsroom/press/datarobot-is-acquiring-algorithmia-enhancing-leading-mlops-architecture-for-the-enterprise/

[00:22:55] Fine Tuning vs RAG, Heydar Soudani, Evangelos Kanoulas & Faegheh Hasibi.

https://arxiv.org/html/2403.01432v2

[00:24:40] LLM Agent Survey, Lei Wang et al.

https://arxiv.org/abs/2308.11432

[00:26:30] CentML CServe, CentML

https://docs.centml.ai/apps/llm

[00:29:15] CentML Snowflake, Snowflake

https://www.snowflake.com/en/engineering-blog/optimize-llms-with-llama-snowflake-ai-stack/

[00:30:15] NVIDIA H100 GPU, NVIDIA

https://www.nvidia.com/en-us/data-center/h100/

[00:33:25] CentML\'s 60% savings, CentML

https://centml.ai/platform/

  continue reading

238 Episoden

Semua episode

×
 
Loading …

Willkommen auf Player FM!

Player FM scannt gerade das Web nach Podcasts mit hoher Qualität, die du genießen kannst. Es ist die beste Podcast-App und funktioniert auf Android, iPhone und im Web. Melde dich an, um Abos geräteübergreifend zu synchronisieren.

 

Kurzanleitung

Hören Sie sich diese Show an, während Sie die Gegend erkunden
Abspielen