package

experimental

module

experimental.anonymize_data

Module for anonymizing personally identifiable information (PII) in text.

This module detects sensitive information in German text using named entity recognition and regular expressions, and replaces it with substitute data generated by the Faker library.
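The regular-expression part of this approach can be illustrated with a minimal sketch. It uses only the standard library and covers two patterns (email addresses and German IBANs); it does not perform named entity recognition and substitutes fixed placeholder tokens where the module would use Faker-generated data. The name `anonymize_text` and both patterns are assumptions made for illustration, not the module's actual API.

```python
import re

# Hypothetical patterns for illustration; the module's own patterns are not shown here.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# German IBAN: "DE", 2 check digits, then 18 digits, optionally grouped in fours.
IBAN_RE = re.compile(r"\bDE\d{2}(?: ?\d{4}){4} ?\d{2}\b")


def anonymize_text(text: str) -> str:
    """Replace email addresses and German IBANs with numbered placeholders.

    The same original value always maps to the same placeholder, so
    repeated mentions stay consistent within one text.
    """
    mapping: dict[tuple[str, str], str] = {}

    def make_replacer(kind: str):
        def _replace(match: re.Match) -> str:
            key = (kind, match.group(0))
            if key not in mapping:
                # Number placeholders per kind: <EMAIL_1>, <EMAIL_2>, ...
                count = sum(1 for k in mapping if k[0] == kind) + 1
                mapping[key] = f"<{kind}_{count}>"
            return mapping[key]

        return _replace

    text = EMAIL_RE.sub(make_replacer("EMAIL"), text)
    text = IBAN_RE.sub(make_replacer("IBAN"), text)
    return text


if __name__ == "__main__":
    sample = (
        "Bitte antworten Sie an max@example.de. "
        "Die IBAN lautet DE89 3704 0044 0532 0130 00."
    )
    print(anonymize_text(sample))
```

Names, addresses, and other free-form entities cannot be caught reliably by patterns like these, which is why the module combines regular expressions with named entity recognition.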

Functions

module

experimental.email_extraction

This module provides text cleaning and email signature removal utilities for German text.

It includes functions to:

  • Clean German text by removing stop words and converting to lowercase
  • Remove signature blocks from email bodies using pattern matching and NLP techniques

Dependencies:

  • spacy: For German NLP processing
  • nltk: For stopword removal and tokenization
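The two tasks described for this module can be sketched as follows. This is a standard-library-only illustration: a small hand-written stop-word set stands in for the nltk German stop-word list, and a simple closing-phrase pattern stands in for the combination of pattern matching and spacy-based NLP. The function names `clean_german_text` and `remove_signature`, the stop-word set, and the closing phrases are all assumptions, not the module's actual API.

```python
import re

# Tiny stand-in for a full German stop-word list (assumption for illustration).
GERMAN_STOP_WORDS = frozenset({
    "der", "die", "das", "und", "ist", "ein", "eine", "ich", "nicht",
    "mit", "für", "zu", "im", "in", "den", "dem", "wir", "sie", "es",
})

TOKEN_RE = re.compile(r"\w+")

# Lines that typically open a German email signature (hypothetical selection).
SIGNATURE_START_RE = re.compile(
    r"^\s*(--\s*$|mit freundlichen grüßen|freundliche grüße|"
    r"viele grüße|beste grüße|mfg\b)",
    re.IGNORECASE,
)


def clean_german_text(text: str) -> str:
    """Lowercase the text and drop stop words; punctuation is discarded."""
    tokens = TOKEN_RE.findall(text.lower())
    return " ".join(t for t in tokens if t not in GERMAN_STOP_WORDS)


def remove_signature(body: str) -> str:
    """Cut the email body at the first line that looks like a signature start."""
    kept = []
    for line in body.splitlines():
        if SIGNATURE_START_RE.match(line):
            break
        kept.append(line)
    return "\n".join(kept).rstrip()


if __name__ == "__main__":
    email = (
        "Hallo Frau Schmidt,\n\n"
        "anbei die Rechnung.\n\n"
        "Mit freundlichen Grüßen\n"
        "Max Mustermann\n"
        "Tel: 0123 456789"
    )
    body = remove_signature(email)
    print(body)
    print(clean_german_text(body))
```

A pattern-only cutoff like this misses signatures that have no closing phrase, which is one reason to pair it with NLP-based detection.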

Functions