CCE Faculty Articles
Document Type
Article
Publication Title
The Journal of Open Source Software
ISSN
2475-9066
Publication Date
6-20-2023
Abstract
There have been a number of advancements in the detection of personally identifiable information (PII) and in scrubbing libraries that aid developers and researchers in their detection and anonymization efforts. With the recent shift in data handling procedures and global policy implementations regarding identifying information, it is becoming more important for data consumers to know what data needs to be scrubbed and why, and to have the means to perform that scrubbing. PII-Codex is a collection of extended theoretical, conceptual, and policy works in PII categorization and severity assessment (Milne et al., 2016; Schwartz & Solove, 2011), integrated with PII detection software and API client adapters. It allows researchers to analyze a single text or a collection of texts and determine whether any PII detected within them is considered identifiable. Furthermore, it allows end-users to determine the severity and associated categorizations of detected PII tokens.
DOI
10.21105/joss.05402
Volume
8
Issue
86
First Page
5402
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
NSUWorks Citation
Rosado, Eidan J., "PII-Codex: a Python library for PII detection, categorization, and severity assessment" (2023). CCE Faculty Articles. 533.
https://nsuworks.nova.edu/gscis_facarticles/533
Comments
Authors of papers retain copyright and release the work under a Creative Commons Attribution 4.0 International License (CC BY 4.0).