Anonymization Of Sensitive Information In Financial Documents (sps25) Chaos Computer Club - Recent Audio-only Feed podcast

Artwork

CCC media team Ccc Congress Hacking Security Netzpolitik Sicherheit Technologie Software-Entwicklung

Inhalt bereitgestellt von CCC media team. Alle Podcast-Inhalte, einschließlich Episoden, Grafiken und Podcast-Beschreibungen, werden direkt von CCC media team oder seinem Podcast-Plattformpartner hochgeladen und bereitgestellt. Wenn Sie glauben, dass jemand Ihr urheberrechtlich geschütztes Werk ohne Ihre Erlaubnis nutzt, können Sie dem hier beschriebenen Verfahren folgen https://de.player.fm/legal.

Chaos Computer Club - recent audio-only feed « »
Anonymization of sensitive information in financial documents (sps25)

4d ago 31:17

Teilen

MP3•Episode-Home

Inhalt bereitgestellt von CCC media team. Alle Podcast-Inhalte, einschließlich Episoden, Grafiken und Podcast-Beschreibungen, werden direkt von CCC media team oder seinem Podcast-Plattformpartner hochgeladen und bereitgestellt. Wenn Sie glauben, dass jemand Ihr urheberrechtlich geschütztes Werk ohne Ihre Erlaubnis nutzt, können Sie dem hier beschriebenen Verfahren folgen https://de.player.fm/legal.

Data is the fossil fuel of the machine learning world, essential for developing high quality models but in limited supply. Yet institutions handling sensitive documents — such as financial, medical, or legal records often cannot fully leverage their own data due to stringent privacy, compliance, and security requirements, making training high quality models difficult. A promising solution is to replace the personally identifiable information (PII) with realistic synthetic stand-ins, whilst leaving the rest of the document in tact. In this talk, we will discuss the use of open source tools and models that can be self hosted to anonymize documents. We will go over the various approaches for Named Entity Recognition (NER) to identify sensitive entities and the use of diffusion models to inpaint anonymized content. about this event: https://talks.python-summit.ch/sps25/talk/EWMBRM/

… continue reading

2450 Episoden

#CCC media team #Ccc Congress Hacking Security Netzpolitik #Sicherheit #Technologie #Software-Entwicklung

Artwork

Anonymization of sensitive information in financial documents (sps25)

Chaos Computer Club - recent audio-only feed

72 subscribers

published 4d ago

Teilen

MP3•Episode-Home

Inhalt bereitgestellt von CCC media team. Alle Podcast-Inhalte, einschließlich Episoden, Grafiken und Podcast-Beschreibungen, werden direkt von CCC media team oder seinem Podcast-Plattformpartner hochgeladen und bereitgestellt. Wenn Sie glauben, dass jemand Ihr urheberrechtlich geschütztes Werk ohne Ihre Erlaubnis nutzt, können Sie dem hier beschriebenen Verfahren folgen https://de.player.fm/legal.

Data is the fossil fuel of the machine learning world, essential for developing high quality models but in limited supply. Yet institutions handling sensitive documents — such as financial, medical, or legal records often cannot fully leverage their own data due to stringent privacy, compliance, and security requirements, making training high quality models difficult. A promising solution is to replace the personally identifiable information (PII) with realistic synthetic stand-ins, whilst leaving the rest of the document in tact. In this talk, we will discuss the use of open source tools and models that can be self hosted to anonymize documents. We will go over the various approaches for Named Entity Recognition (NER) to identify sensitive entities and the use of diffusion models to inpaint anonymized content. about this event: https://talks.python-summit.ch/sps25/talk/EWMBRM/

… continue reading

2450 Episoden

#CCC media team #Ccc Congress Hacking Security Netzpolitik #Sicherheit #Technologie #Software-Entwicklung

Alle Folgen

×

Willkommen auf Player FM!

Player FM scannt gerade das Web nach Podcasts mit hoher Qualität, die du genießen kannst. Es ist die beste Podcast-App und funktioniert auf Android, iPhone und im Web. Melde dich an, um Abos geräteübergreifend zu synchronisieren.

Höre 500+ Themen zu

Hören Sie sich diese Show an, während Sie die Gegend erkunden