CCE Faculty Articles

Document Type

Article

Publication Title

The Journal of Open Source Software

ISSN

2475-9066

Publication Date

6-20-2023

Abstract

[Summary] There have been a number of advances in personally identifiable information (PII) detection and scrubbing libraries that aid developers and researchers in their detection and anonymization efforts. With the recent shift in data handling procedures and global policy implementations regarding identifying information, it is becoming more important for data consumers to know what data needs to be scrubbed and why, and to have the means to scrub it. PII-Codex is a collection of extended theoretical, conceptual, and policy works in PII categorization and severity assessment (Milne et al., 2016; Schwartz & Solove, 2011), integrated with PII detection software and API client adapters. It allows researchers to analyze a single text or a collection of texts and determine whether any PII detected within them is considered identifiable. It also allows end-users to determine the severity and associated categorizations of detected PII tokens.

DOI

10.21105/joss.05402

Volume

8

Issue

86

First Page

5402

Comments

Authors of papers retain copyright and release the work under a Creative Commons Attribution 4.0 International License (CC BY 4.0).

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
