Protection Profiles¶
sensitivity.io Protection Profiles are sets of rules that you use to start a module job.
You can create a Custom Protection Profile in your Control Panel account.
You can also select an already defined one: a Predefined Profile, or a Compliance Profile that delivers instant compliance for your application with standards such as HIPAA, PCI-DSS and GDPR.
- 1 Compliance Profiles - out-of-the-box
- 2 Supported File Types
- 3 Protection Profiles structure
- 3.1 General settings
- 3.2 SDS module settings
- 3.2.1 Email
- 3.2.2 Credit Card
- 3.2.3 Date
- 3.2.4 IBAN
- 3.2.5 SWIFT
- 3.2.6 Social Security Number (SSN)
- 3.2.7 Passport
- 3.2.8 Driver’s License
- 3.2.9 Health Insurance Number
- 3.2.10 ID Card
- 3.2.11 Phone Number
- 3.2.12 Tax ID
- 3.2.13 VAT Number
- 3.2.14 Foreign Registration Number
- 3.2.15 Address
- 3.2.16 Dictionaries
- 3.2.17 Regular Expression
- 3.2.18 Mime Type Alerts
- 3.3 SDC module settings
1 Compliance Profiles - out-of-the-box¶
1.1 PCI-DSS - Payment Card Industry Data Security Standard¶
Name | Description |
PCI-DSS - Credit Cards | Protect Credit Card Numbers |
PCI-DSS - Credit Cards & Social Security Numbers | Protect Credit Card Numbers and Social Security Numbers |
PCI-DSS - Credit Cards & Personally Identifiable Information | Protect Credit Cards & Personally Identifiable Information |
PCI-DSS - Credit Cards & E-mail addresses | Protect Credit Cards & E-mail addresses |
PCI-DSS - Credit Cards & IBAN | Protect Credit Cards & IBAN |
PCI-DSS - Credit Cards & phone numbers | Protect Credit Cards & phone numbers |
PCI-DSS - Credit Cards & postal addresses | Protect Credit Cards & postal addresses |
PCI-DSS - Credit Cards, social security numbers and postal addresses | Protect Credit Cards, social security numbers and postal addresses |
PCI-DSS - Credit Cards, social security numbers and e-mail addresses | Protect Credit Cards, social security numbers and e-mail addresses |
PCI-DSS - Credit Cards, social security numbers and phone numbers | Protect Credit Cards, social security numbers and phone numbers |
PCI-DSS - Credit Cards with CVV | Protect Credit Cards with CVV |
PCI-DSS - Credit Cards with expiry date | Protect Credit Cards with expiry date |
PCI-DSS - Credit Cards with CVV and E-mail addresses | Protect Credit Cards with CVV and E-mail addresses |
PCI-DSS - Credit Cards with CVV and Social Security Numbers | Protect Credit Cards with CVV and Social Security Numbers |
PCI-DSS - Credit Cards with CVV and expiry date | Protect Credit Cards with CVV and expiry date |
PCI-DSS - Credit Cards with CVV, expiry date and IBAN | Protect Credit Cards with CVV, expiry date and IBAN |
PCI-DSS - Credit Cards with CVV, expiry date and PII | Protect Credit Cards with CVV, expiry date and PII |
1.2 GLBA - Gramm-Leach-Bliley Act¶
Name | Description |
GLBA - Personally Identifiable Information | Protect Personally Identifiable Information |
1.3 HIPAA - Health Insurance Portability and Accountability Act¶
Name | Description |
HIPAA - Prescription Drugs and Personally Identifiable Information | Protect FDA recognised prescription drugs and Personally Identifiable Information |
HIPAA - ICD-9 & Diagnosis Lexicon | Protect ICD-9 codes and diagnosis lexicon |
HIPAA - ICD-10 & Diagnosis Lexicon | Protect ICD-10 codes and diagnosis lexicon |
HIPAA - Diagnosis Lexicon and Personally Identifiable Information | Protect ICD-9 codes, diagnosis lexicon and Personally Identifiable Information |
HIPAA - Personally Identifiable Information | Protect Personally Identifiable Information |
HIPAA - Pharmaceutical firms | Protect FDA recognised pharmaceutical firms |
HIPAA - Pharmaceutical firms, drugs and diagnosis | Protect Pharmaceutical firms, drugs and diagnosis |
HIPAA - Pharmaceutical firms and Personally Identifiable Information | Protect Pharmaceutical firms and Personally Identifiable Information |
HIPAA - Prescription Drugs | Protect FDA recognised prescription drugs |
1.4 GDPR - General Data Protection Regulation Europe¶
Name | Description |
GDPR - Credit Card Numbers | Protect Credit Card Numbers |
GDPR - Credit Card Numbers and e-mail address | Protect Credit Card Numbers & E-mail addresses |
GDPR - Postal address, e-mail address, phone number, national ID | Protect postal address, e-mail address, phone number, national ID |
GDPR - Credit Card Numbers and name | Protect Credit Card Numbers and name |
GDPR - Computer IP address | Protect Computer IP Address |
GDPR - Date of birth and national ID | Protect date of birth and national ID |
Austria - GDPR | Protect GDPR Data for this country |
Belgium - GDPR | Protect GDPR Data for this country |
Czech Republic - GDPR | Protect GDPR Data for this country |
Denmark - GDPR | Protect GDPR Data for this country |
Finland - GDPR | Protect GDPR Data for this country |
France - GDPR | Protect GDPR Data for this country |
Germany - GDPR | Protect GDPR Data for this country |
Greece - GDPR | Protect GDPR Data for this country |
Ireland - GDPR | Protect GDPR Data for this country |
Italy - GDPR | Protect GDPR Data for this country |
Netherlands - GDPR | Protect GDPR Data for this country |
Norway - GDPR | Protect GDPR Data for this country |
Poland - GDPR | Protect GDPR Data for this country |
Portugal - GDPR | Protect GDPR Data for this country |
Romania - GDPR | Protect GDPR Data for this country |
Spain - GDPR | Protect GDPR Data for this country |
Sweden - GDPR | Protect GDPR Data for this country |
Switzerland - GDPR | Protect GDPR Data for this country |
Turkey - GDPR | Protect GDPR Data for this country |
United Kingdom - GDPR | Protect GDPR Data for this country |
1.5 Predefined Protection Profiles¶
Name | Description |
File Type - Archive Files | Protect archive files |
File Type - Office Files | Protect Office files |
File Type - Other Files | Protect other files |
File Type - Programming Files | Protect programming files |
File Type - Media Files | Protect media files |
Custom Content - Dictionary | Protect confidential terms |
Default Regular Expressions | Protect Predefined Regular Expressions to match email addresses |
Intellectual Property Glossary | Protect specific terms for Intellectual Property |
Personally Identifiable Information | Protect Personally Identifiable Information |
Indicators of insiders risk | Detect expressions that indicate a certain degree of risk from employees to disclose sensitive data outside the organization |
2 Supported File Types¶
2.1 Office Files¶
Group | File Type | Mime Type |
Office Files | | application/pdf |
Office Files | | application/x-hpt |
Office Files | Excel | application/vnd.ms-excel |
Office Files | Excel | application/vnd.oasis.opendocument.spreadsheet |
Office Files | Excel | application/x-hwp-cell |
Office Files | Word | application/msword |
Office Files | Word | application/vnd.oasis.opendocument.text |
Office Files | Word | text/rtf |
Office Files | Word | application/x-hwp |
Office Files | Word | application/encrypted-msword |
Office Files | PowerPoint | application/vnd.ms-powerpoint |
Office Files | PowerPoint | application/vnd.oasis.opendocument.presentation |
Office Files | PowerPoint | application/encrypted-powerpoint |
Office Files | Infopath | application/vnd.ms-cab-compressed |
Office Files | Publisher | application/x-mspublisher |
Office Files | Outlook | application/winframe |
Office Files | Office2007+/password | application/encrypted-office2007+ |
Office Files | iWork files | application/iwork-file |
2.2 Graphic Files¶
Group | File Type | Mime Type |
Graphic Files | JPEG | image/jpeg |
Graphic Files | PNG | image/png |
Graphic Files | GIF | image/gif |
Graphic Files | BMP | image/x-ms-bmp |
Graphic Files | TIFF | image/tiff |
Graphic Files | EPS | application/postscript |
Graphic Files | DJV | image/vnd.djvu |
Graphic Files | CGM | image/cgm |
Graphic Files | ICO | image/x-icon |
Graphic Files | ICO | application/ico |
Graphic Files | CorelDraw | application/x-cdr |
Graphic Files | Corel Photo-Paint | application/x-cpt |
Graphic Files | PSD | image/vnd.adobe.photoshop |
Graphic Files | Adobe InDesign | application/x-indesign |
Graphic Files | Adobe Illustrator | application/ai-indesign |
Graphic Files | BPF | application/bpf-file |
2.3 Media Files¶
Group | File Type | Mime Type |
Media Files | mp3 | audio/mpeg |
Media Files | aif | audio/x-aiff |
Media Files | m3u | audio/scpls |
Media Files | m4a, mp4 | audio/mp4 |
Media Files | m4a, mp4 | video/mp4 |
Media Files | wav | audio/x-wav |
Media Files | wma | video/x-ms-asf |
Media Files | avi | video/x-msvideo |
Media Files | mov | video/quicktime |
2.4 Archive Files¶
Group | File Type | Mime Type |
Archive Files | ZIP | application/zip |
Archive Files | ZIP | application/x-compress |
Archive Files | 7z | application/x-7z-compressed |
Archive Files | ACE | application/x-ace |
Archive Files | RAR | application/x-rar |
Archive Files | TAR | application/x-tar |
Archive Files | TAR | application/x-compress |
Archive Files | XZ | application/x-xz |
Archive Files | ZIP/password | application/zip-encrypted |
Archive Files | RAR/password | application/encrypted-x-rar |
Archive Files | ACE/password | application/encrypted-x-ace |
Archive Files | bz2 | application/x-bzip2 |
Archive Files | GZ | application/x-gzip |
Archive Files | .xar | application/x-com.apple.xar-archive |
2.5 Programming Files¶
Group | File Type | Mime Type |
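Programming Files | bat, cmd | text/x-msdos-batch |
Programming Files | java | text/x-java |
Programming Files | c, cpp | text/x-c++ |
Programming Files | c, cpp | text/x-c |
Programming Files | py | text/x-python |
Programming Files | pas | text/x-pascal |
Programming Files | sh, csh | text/x-shellscript |
Programming Files | f | text/x-fortran |
Programming Files | xml, dtd | application/xml |
Programming Files | tex | text/x-tex |
Programming Files | asm | text/x-asm |
Programming Files | makefile | text/x-makefile |
Programming Files | fdl | application/fdl-file |
Programming Files | VBS | text/x-vbs |
Programming Files | matlab | text/x-matlab |
Programming Files | php | text/x-php |
Programming Files | ruby | text/x-ruby |
Programming Files | perl | text/x-perl |
Programming Files | P12 | application/x-pkcs12 |
Programming Files | IPA | application/vnd.apple.ipa |
Programming Files | APK | application/vnd.android.package-archive |
Programming Files | SQL | text/x-sql |
Programming Files | DMP | application/sql-dump |
Programming Files | BACKUP | text/x-sql-backup |
Programming Files | GO | application/go |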
2.6 Other Files¶
Group | File Type | Mime Type |
Other Files | exe, sys, dll | application/x-dosexec |
Other Files | so | application/x-sharedlib |
Other Files | Text files | text/plain |
Other Files | Text files | text/html |
Other Files | AutoCAD files | application/x-autocad |
Other Files | DRM Files | application/x-scdsa |
Other Files | SgWgc | application/x-ciel.sgwgc |
Other Files | SolidWorks Files | application/solidworks-file |
Other Files | Journal files | application/jnt-file |
Other Files | Fasoo Files | application/fasoo-drm |
Other Files | NASCA DRM | application/nasca-drm |
Other Files | I-DEAS 3D CAD | application/ideas-3d-cad |
Other Files | CATIA | application/catia-file |
Other Files | SID | application/sid-file |
Other Files | DTA | application/x-dta |
Other Files | xia | application/xia-file |
Other Files | Pro-E CAD | application/Pro-ECAD-file |
Other Files | .accdb | application/vnd.ms-access |
Other Files | .accdb | application/x-msaccess |
Other Files | EPP_encrypted files | application/epp-encrypted-file |
Other Files | VMDK | application/vmdk-file |
Other Files | Other format | application/octet-stream |
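Other Files | Other format | application/dicom |
Other Files | Other format | application/encrypted-excel |
Other Files | Other format | application/encrypted-ms-excel |
Other Files | Other format | application/encrypted-pdf |
Other Files | Other format | application/epub+zip |
Other Files | Other format | application/mac-binhex40 |
Other Files | Other format | application/marc |
Other Files | Other format | application/pgp |
Other Files | Other format | application/pgp-encrypted |
Other Files | Other format | application/pgp-keys |
Other Files | Other format | application/pgp-signature |
Other Files | Other format | application/vnd.cups-raster |
Other Files | Other format | application/vnd.font-fontforge-sfd |
Other Files | Other format | application/vnd.google-earth.kml+xml |
Other Files | Other format | application/vnd.google-earth.kmz |
Other Files | Other format | application/vnd.iccprofile |
Other Files | Other format | application/vnd.lotus-wordpro |
Other Files | Other format | application/vnd.ms-fontobject |
Other Files | Other format | application/vnd.ms-opentype |
Other Files | Other format | application/vnd.ms-tnef |
Other Files | Other format | application/vnd.oasis.opendocument.chart |
Other Files | Other format | application/vnd.oasis.opendocument.chart-template |
Other Files | Other format | application/vnd.oasis.opendocument.database |
Other Files | Other format | application/vnd.oasis.opendocument.formula |
Other Files | Other format | application/vnd.oasis.opendocument.formula-template |
Other Files | Other format | application/vnd.oasis.opendocument.graphics |
Other Files | Other format | application/vnd.oasis.opendocument.graphics-template |
Other Files | Other format | application/vnd.oasis.opendocument.image |
Other Files | Other format | application/vnd.oasis.opendocument.image-template |
Other Files | Other format | application/vnd.oasis.opendocument.presentation-template |
Other Files | Other format | application/vnd.oasis.opendocument.spreadsheet-template |
Other Files | Other format | application/vnd.oasis.opendocument.text-master |
Other Files | Other format | application/vnd.oasis.opendocument.text-template |
Other Files | Other format | application/vnd.oasis.opendocument.text-web |
Other Files | Other format | application/vnd.rn-realmedia |
Other Files | Other format | application/vnd.symbian.install |
Other Files | Other format | application/vnd.tcpdump.pcap |
Other Files | Other format | application/warc |
Other Files | Other format | application/x-123 |
Other Files | Other format | application/x-adrift |
Other Files | Other format | application/x-arc |
Other Files | Other format | application/x-archive |
Other Files | Other format | application/x-arj |
Other Files | Other format | application/x-bittorrent |
Other Files | Other format | application/x-coredump |
Other Files | Other format | application/x-cpio |
Other Files | Other format | application/x-dbf |
Other Files | Other format | application/x-dbm |
Other Files | Other format | application/x-debian-package |
Other Files | Other format | application/x-dvi |
Other Files | Other format | application/x-eet |
Other Files | Other format | application/x-elc |
Other Files | Other format | application/x-epoc-app |
Other Files | Other format | application/x-epoc-opl |
Other Files | Other format | application/x-epoc-opo |
Other Files | Other format | application/x-epoc-sheet |
Other Files | Other format | application/x-epoc-word |
Other Files | Other format | application/x-executable |
Other Files | Other format | application/x-font-ttf |
Other Files | Other format | application/x-freemind |
Other Files | Other format | application/x-gdbm |
Other Files | Other format | application/x-gnucash |
Other Files | Other format | application/x-gnumeric |
Other Files | Other format | application/x-gnupg-keyring |
Other Files | Other format | application/x-hdf |
Other Files | Other format | application/x-ia-arc |
Other Files | Other format | application/x-ichitaro4 |
Other Files | Other format | application/x-ichitaro5 |
Other Files | Other format | application/x-ichitaro6 |
Other Files | Other format | application/x-iso9660-image |
Other Files | Other format | application/x-java-applet |
Other Files | Other format | application/x-java-jce-keystore |
Other Files | Other format | application/x-java-keystore |
Other Files | Other format | application/x-java-pack200 |
Other Files | Other format | application/x-kdelnk |
Other Files | Other format | application/x-lha |
Other Files | Other format | application/x-lharc |
Other Files | Other format | application/x-lrzip |
Other Files | Other format | application/x-lzip |
Other Files | Other format | application/x-lzma |
Other Files | Other format | application/x-mif |
Other Files | Other format | application/xml-sitemap |
Other Files | Other format | application/x-object |
Other Files | Other format | application/x-pgp-keyring |
Other Files | Other format | application/x-quark-xpress-3 |
Other Files | Other format | application/x-quicktime-player |
Other Files | Other format | application/x-rpm |
Other Files | Other format | application/x-sc |
Other Files | Other format | application/x-scribus |
Other Files | Other format | application/x-setupscript. |
Other Files | Other format | application/x-shockwave-flash |
Other Files | Other format | application/x-stuffit |
Other Files | Other format | application/x-svr4-package |
Other Files | Other format | application/x-tex-tfm |
Other Files | Other format | application/x-tokyocabinet-btree |
Other Files | Other format | application/x-tokyocabinet-fixed |
Other Files | Other format | application/x-tokyocabinet-hash |
Other Files | Other format | application/x-tokyocabinet-table |
Other Files | Other format | application/x-zoo |
Other Files | Other format | audio/basic |
Other Files | Other format | audio/midi |
Other Files | Other format | audio/x-adpcm |
Other Files | Other format | audio/x-ape |
Other Files | Other format | audio/x-dec-basic |
Other Files | Other format | audio/x-flac |
Other Files | Other format | audio/x-hx-aac-adif |
Other Files | Other format | audio/x-hx-aac-adts |
Other Files | Other format | audio/x-mod |
Other Files | Other format | audio/x-mp4a-latm |
Other Files | Other format | audio/x-pn-realaudio |
Other Files | Other format | audio/x-unknown |
Other Files | Other format | audio/x-w64 |
Other Files | Other format | chemical/x-pdb |
Other Files | Other format | image/jp2 |
Other Files | Other format | image/svg+xml |
Other Files | Other format | image/x-canon-cr2 |
Other Files | Other format | image/x-canon-crw |
Other Files | Other format | image/x-coreldraw |
Other Files | Other format | image/x-cpi |
Other Files | Other format | image/x-epoc-mbm |
Other Files | Other format | image/x-epoc-sketch |
Other Files | Other format | image/x-niff |
Other Files | Other format | image/x-olympus-orf |
Other Files | Other format | image/x-paintnet |
Other Files | Other format | image/x-portable-bitmap |
Other Files | Other format | image/x-portable-greymap |
Other Files | Other format | image/x-portable-pixmap |
Other Files | Other format | image/x-quicktime |
Other Files | Other format | image/x-unknown |
Other Files | Other format | image/x-x3f |
Other Files | Other format | image/x-xcf |
Other Files | Other format | image/x-xcursor |
Other Files | Other format | image/x-xpmi |
Other Files | Other format | image/x-xwindowdump |
Other Files | Other format | message/news |
Other Files | Other format | message/rfc822 |
Other Files | Other format | model/vrml |
Other Files | Other format | model/x3d |
Other Files | Other format | rinex/broadcast |
Other Files | Other format | rinex/clock |
Other Files | Other format | rinex/meteorological |
Other Files | Other format | rinex/navigation |
Other Files | Other format | rinex/observation |
Other Files | Other format | text/calendar |
Other Files | Other format | text/PGP |
Other Files | Other format | text/texmacs |
Other Files | Other format | text/troff |
Other Files | Other format | text/vnd.graphviz |
Other Files | Other format | text/x-awk |
Other Files | Other format | text/x-bcpl |
Other Files | Other format | text/x-diff |
Other Files | Other format | text/x-gawk |
Other Files | Other format | text/x-info |
Other Files | Other format | text/x-lisp |
Other Files | Other format | text/x-lua |
Other Files | Other format | text/x-m4 |
Other Files | Other format | text/x-nawk |
Other Files | Other format | text/x-po |
Other Files | Other format | text/x-tcl |
Other Files | Other format | text/x-texinfo |
Other Files | Other format | text/x-vcard |
Other Files | Other format | text/x-xmcd |
Other Files | Other format | video/3gpp |
Other Files | Other format | video/3gpp2 |
Other Files | Other format | video/h264 |
Other Files | Other format | video/mp2p |
Other Files | Other format | video/mp2t |
Other Files | Other format | video/mp4v-es |
Other Files | Other format | video/mpeg |
Other Files | Other format | video/mpeg4-generic |
Other Files | Other format | video/mpv |
Other Files | Other format | video/webm |
Other Files | Other format | video/x-flc |
Other Files | Other format | video/x-fli |
Other Files | Other format | video/x-flv |
Other Files | Other format | video/x-jng |
Other Files | Other format | video/x-matroska |
Other Files | Other format | video/x-mng |
Other Files | Other format | video/x-sgi-movie |
Other Files | Other format | video/x-unknown |
Other Files | Other format | x-epoc/x-sisx-app |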
3 Protection Profiles structure¶
A protection profile represents the link between a profile definition and:
- either a profile origin (account, project or app)
- or a profile destination (i.e. cloud storage path).
A profile definition consists of:
- predefined content definitions;
- file type definitions;
- dictionary definitions;
- regex definitions.
The resulting collection, also known as configuration settings in the SDK, is used as an instruction set by the sensitivity.io SDK engine through the Settings resource.
Important
Every setting in the Settings resource is validated against your License: the corresponding feature must be enabled and valid there.
Compose the Settings yourself by using the following options:
3.1 General settings¶
These settings apply to all sensitivity.io modules.
3.1.1 Maximum File Size¶
The maximum file size, in bytes, that the engine will process. Files larger than this value are ignored by the engine.
Supported values: >= 0, where 0 means no limit.
Recommended: 10485760 (10 MB, more than enough for typical documents)
{ "settings" : { "text_extractor" : { "max_file_size" : 2048 } } }
Settings option is max_file_size with integer value
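The size rule above can be sketched in a few lines of Python. This is only an illustration of the documented semantics (files above the limit are ignored, 0 disables the limit); the function name `should_process` and the constant `RECOMMENDED_LIMIT` are hypothetical and not part of the sensitivity.io SDK.

```python
def should_process(file_size: int, max_file_size: int) -> bool:
    """Return True if a file of file_size bytes is within the configured limit."""
    if max_file_size < 0:
        raise ValueError("max_file_size must be >= 0")
    if max_file_size == 0:
        # 0 means "no limit" according to the option description.
        return True
    # Only files strictly above the limit are ignored.
    return file_size <= max_file_size


# The recommended value from the text: 10 MB expressed in bytes.
RECOMMENDED_LIMIT = 10 * 1024 * 1024

print(should_process(4096, RECOMMENDED_LIMIT))
print(should_process(RECOMMENDED_LIMIT + 1, RECOMMENDED_LIMIT))
```

A check like this can be useful on the client side to avoid uploading files that the engine would skip anyway.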
3.1.2 Maximum Recursion Level¶
The maximum recursion level to search inside archives (useful to avoid zip bombs).
Supported file types are: compound document format based files (.doc, .xls, .ppt), PDF, Office Open XML files (.docx, .xlsx, .pptx) and ZIP archives.
Supported values must be in the interval [2, 8].
{ "settings" : { "text_extractor" : { "max_recursion_level" : 3 } } }
Settings option is max_recursion_level with integer value
3.1.3 Extract Metadata¶
For supported file types, you can also extract content from metadata (file properties).
Supported values:
- true
- false
Defaults to false.
{ "settings" : { "text_extractor" : { "extract_metadata" : true } } }
Settings option is extract_metadata with boolean value
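As a rough sketch, the three text extractor options can be validated and serialized together before they are sent to the engine. The helper `build_text_extractor_settings` is hypothetical, and combining all three options in a single `text_extractor` object is an assumption based on the separate examples shown for each option.

```python
import json


def build_text_extractor_settings(max_file_size: int = 10485760,
                                  max_recursion_level: int = 3,
                                  extract_metadata: bool = False) -> dict:
    """Validate the documented ranges and return a settings dictionary."""
    if max_file_size < 0:
        raise ValueError("max_file_size must be >= 0 (0 means no limit)")
    if not 2 <= max_recursion_level <= 8:
        raise ValueError("max_recursion_level must be in the interval [2, 8]")
    if not isinstance(extract_metadata, bool):
        raise TypeError("extract_metadata must be true or false")
    # Assumption: the three options live side by side under text_extractor.
    return {
        "settings": {
            "text_extractor": {
                "max_file_size": max_file_size,
                "max_recursion_level": max_recursion_level,
                "extract_metadata": extract_metadata,
            }
        }
    }


settings = build_text_extractor_settings(extract_metadata=True)
print(json.dumps(settings, indent=2))
```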
3.1.4 Ignored Mime Types¶
The sensitivity.io engine will ignore these mime types. If the option is present, its entries cannot be empty.
Format:
"ignored_mime_types": [ "application/x-executable", "application/x-ms-dos-executable", "application/x-sharedlib", "application/x-object" ]
Settings option is ignored_mime_types with the above format
3.1.5 Ignored File Extensions¶
The sensitivity.io engine will ignore files with these extensions. An empty extension may represent a Unix executable, but relying on it is risky; use a mime type for that case instead.
Format:
"ignored_extensions": [ "exe", "so", "dll", "dylib", "a", "lib", "o", "obj" ]
Settings option is ignored_extensions with the above format
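To make the two ignore lists concrete, here is a small client-side predicate that mirrors them. It is a sketch, not the engine's implementation: the function `is_ignored` is hypothetical, the mime type is assumed to be supplied by the caller, and case-insensitive extension matching is an assumption.

```python
from pathlib import PurePath

# Values copied from the two format examples above.
IGNORED_MIME_TYPES = {
    "application/x-executable",
    "application/x-ms-dos-executable",
    "application/x-sharedlib",
    "application/x-object",
}
IGNORED_EXTENSIONS = {"exe", "so", "dll", "dylib", "a", "lib", "o", "obj"}


def is_ignored(path: str, mime_type: str) -> bool:
    """Return True if either the mime type or the extension is on an ignore list."""
    # Assumption: extensions are compared without the dot and ignoring case.
    extension = PurePath(path).suffix.lstrip(".").lower()
    return mime_type in IGNORED_MIME_TYPES or extension in IGNORED_EXTENSIONS


print(is_ignored("lib/libfoo.so", "application/x-sharedlib"))
print(is_ignored("report.pdf", "application/pdf"))
```

A file without an extension, such as a Unix executable named `tool`, is only caught here through its mime type, which matches the advice above.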
3.2 SDS module settings¶
These settings apply to the Sensitive Data Scanning module.
3.2.1 Email¶
Scanning for email addresses.
Supported values:
- true
- false
"email" : true
Settings option is email with boolean value
3.2.2 Credit Card¶
Scanning for credit card numbers.
Supported credit card formats:
- Mastercard
- Visa
- American Express - The American Express Company, also known as Amex
- JCB - Credit card company based in Tokyo, Japan
- Diners - Diners Club International (DCI), founded as Diners Club
- Discover - The Discover Card is a credit card issued primarily in the United States
- MIR - Payment system established by the Central Bank of Russia
- Maestro - Multinational debit card service owned by Mastercard
- China UnionPay - China UnionPay, also known under its abbreviation, CUP, is a Chinese financial services corporation
- Carte Blanche - Carte Blanche is a Diners Club Card
"credit_card" : { "amex" : true, "china_unionpay" : true, "carte_blanche" : true, "diners" : true, "discover" : true, "jcb" : true, "maestro" : true, "mastercard" : true, "mir" : true, "visa" : true }
Settings option is credit_card with boolean credit cards list
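When only some card brands are relevant, the `credit_card` object can be generated instead of written by hand. The sketch below uses the keys from the example above; the helper `credit_card_option` is hypothetical, and the idea that setting a key to false disables scanning for that brand is an assumption, since the documentation only shows true values.

```python
import json

# Keys taken from the credit_card example above.
CARD_KEYS = (
    "amex", "china_unionpay", "carte_blanche", "diners", "discover",
    "jcb", "maestro", "mastercard", "mir", "visa",
)


def credit_card_option(enabled) -> dict:
    """Build the credit_card fragment with the given brands switched on."""
    enabled = set(enabled)
    unknown = enabled - set(CARD_KEYS)
    if unknown:
        raise ValueError(f"unsupported card keys: {sorted(unknown)}")
    # Assumption: brands that are not wanted are written explicitly as false.
    return {"credit_card": {key: key in enabled for key in CARD_KEYS}}


option = credit_card_option(["visa", "mastercard"])
print(json.dumps(option))
```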
3.2.3 Date¶
Scanning for dates like birthdays, calendar events, meetings etc.
Supported values:
- true
- false
"date" : true
Settings option is date with boolean value
3.2.4 IBAN¶
Scanning for IBANs.
Supported values:
- true
- false
"iban" : true
Settings option is iban with boolean value
3.2.5 SWIFT¶
Scanning for SWIFT codes.
Supported values:
- true
- false
"swift" : true
Settings option is swift with boolean value
3.2.7 Passport¶
Scanning for country specific passport numbers.
Supported countries for Passport:
- ca - Canada
- cn - China Mainland
- de - Germany
- fi - Finland
- fr - France
- gb - United Kingdom
- hk - Hong Kong
- ie - Ireland
- jp - Japan
- kr - Korea
- mo - Macao
"passport" : { "ca" : true, "cn" : true, "de" : true, "fi" : true, "fr" : true, "gb" : true, "hk" : true, "ie" : true, "jp" : true, "kr" : true, "mo" : true }
Settings option is passport with boolean country list
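Options that take a boolean country list share the same shape, so a single helper can build them and reject unsupported country codes early. This is a minimal sketch; `country_list_option` and `PASSPORT_COUNTRIES` are hypothetical names, and only the passport codes listed above are used.

```python
import json

# Country codes from the passport list above.
PASSPORT_COUNTRIES = {"ca", "cn", "de", "fi", "fr", "gb", "hk", "ie", "jp", "kr", "mo"}


def country_list_option(option_name: str, supported: set, countries) -> dict:
    """Build a boolean country list for option_name, validating each code."""
    requested = {code.lower() for code in countries}
    unknown = requested - supported
    if unknown:
        raise ValueError(f"{option_name}: unsupported countries {sorted(unknown)}")
    return {option_name: {code: True for code in sorted(requested)}}


passport = country_list_option("passport", PASSPORT_COUNTRIES, ["DE", "fr"])
print(json.dumps(passport))
```

The same helper could be reused for the other country-based options by passing the matching set of supported codes.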
3.2.8 Driver’s License¶
Scanning for country specific driver’s license numbers.
Supported countries for Driver’s License:
- de - Germany
- gb - Great Britain
- it - Italy
- kr - Korea
"driving_license" : { "de" : true, "gb" : true, "it" : true, "kr" : true }
Settings option is driving_license with boolean country list
3.2.9 Health Insurance Number¶
Scanning for country specific health insurance numbers.
Supported countries for health insurance numbers:
- ca - Canada
- au - Australia
- gb - United Kingdom
- kr - Korea
"health_insurance_number" : { "ca" : true, "au" : true, "gb" : true, "kr" : true }
Settings option is health_insurance_number with boolean country list
3.2.10 ID Card¶
Scanning for country specific identification cards.
Supported countries for ID Card:
- ae - United Arab Emirates
- al - Albania
- be - Belgium
- bg - Bulgaria
- br - Brazil
- cl - Chile
- cn - China Mainland
- cz - Czech
- de - Germany
- dk - Denmark
- ec - Ecuador
- ee - Estonia
- fi - Finland
- fr - France
- gr - Greece
- hk - Hong Kong
- hr - Croatia
- id - Indonesia
- il - Israel
- in - India
- is - Iceland
- it - Italy
- kz - Kazakhstan
- lt - Lithuania
- lv - Latvia
- mo - Macao
- mx - Mexico
- my - Malaysia
- no - Norway
- pe - Peru
- pl - Poland
- pt - Portugal
- se - Sweden
- sg - Singapore
- th - Thailand
- tr - Turkey
- yu - Yugoslavia
- za - South Africa
"id_card" : { "ae" : true, "al" : true, "be" : true, "bg" : true, "br" : true, "cl" : true, "cn" : true, "cz" : true, "de" : true, "dk" : true, "ec" : true, "ee" : true, "hk" : true, "hr" : true, "fi" : true, "fr" : true, "gr" : true, "id" : true, "il" : true, "in" : true, "is" : true, "it" : true, "kz" : true, "lt" : true, "lv" : true, "mo" : true, "mx" : true, "my" : true, "no" : true, "pe" : true, "pl" : true, "pt" : true, "se" : true, "sg" : true, "th" : true, "tr" : true, "yu" : true, "za" : true }
Settings option is id_card with boolean country list
3.2.11 Phone Number¶
Scanning for country specific phone numbers.
Supported countries for Phone Number:
- cn - China Mainland
- co - Colombia
- de - Germany
- hk - Hong Kong
- in - India
- intl - International
- jp - Japan
- kr - Korea
- local - Local phone number; matches a lot of local numbers, use with caution
- mo - Macao
- tr - Turkey
- ua - Ukraine
- us - United States
"phone_number" : { "cn" : true, "co" : true, "de" : true, "hk" : true, "in" : true, "intl" : true, "jp" : true, "kr" : true, "local": false, "mo" : true, "tr" : true, "ua" : true, "us" : true }
Settings option is phone_number with boolean country list
3.2.12 Tax ID¶
Scanning for country specific tax identification numbers.
Supported countries for Tax ID:
- au - Australia
- bg - Bulgaria
- br - Brazil
- ec - Ecuador
- es - Spain
- intl - International
- it - Italy
- pe - Peru
- pl - Poland
- us - United States
"tax_id" : { "au" : true, "bg" : true, "br" : true, "ec" : true, "es" : true, "intl" : true, "it" : true, "pe" : true, "pl" : true, "us" : true }
Settings option is tax_id with boolean country list
3.2.13 VAT Number¶
Scanning for country specific VAT numbers.
Supported countries for VAT Number:
- de - Germany
- fi - Finland
- gr - Greece
- ie - Ireland
- it - Italy
- lu - Luxembourg
- si - Slovenia
"vat_number" : { "de": true, "fi": true, "gr": true, "ie": true, "it": true, "lu": true, "si": true }
Settings option is vat_number with boolean country list
3.2.14 Foreign Registration Number¶
Scanning for country specific foreign registration numbers.
Supported countries for Foreign Registration Numbers:
- kr - Korea
"foreign_registration_number" : { "kr" : true }
Settings option is foreign_registration_number with boolean country list
3.2.15 Address¶
Scanning for country specific addresses.
Supported countries for addresses:
- co - Colombia
- de - Germany
- in - India
- us - United States
"address" : { "co" : true, "de" : true, "in" : true, "us" : true }
Settings option is address with boolean country list
3.2.16 Dictionaries¶
Scanning for specific terms (words or whole phrases).
Options:
- case_sensitive : boolean - Scan case-sensitively
- disable : boolean - The dictionary is present in the settings but inactive; the engine won’t scan for these terms
- name : string - Name for this group of terms
- whole_words_only : boolean - If true, scan for the whole specified term only; if false, scan for each of the words in the term
- words : list - Words or phrases to scan for
Format:
"dictionaries" : [ { "case_sensitive" : false, "disable" : false, "name" : "Lite - Confidential Dictionary", "whole_words_only" : false, "words" : [ "Confidential", "Restricted", "Secret", "Sensitive", "Financial" ] }, { "case_sensitive" : true, "disable" : false, "name" : "Case and Whole Dictionary", "whole_words_only" : false, "words" : [ "Monthly internal report", "Another strict secret" ] } ]
Settings option is dictionaries with the above format
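A dictionary entry can be modelled as a small data class so that every entry carries all five options. The sketch below is illustrative: the `Dictionary` class and `dictionaries_option` helper are hypothetical, and the checks for a non-empty name and non-empty terms are assumptions that the documentation does not state for this option.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Dictionary:
    """One group of terms, using the option names listed above."""
    name: str
    words: list
    case_sensitive: bool = False
    whole_words_only: bool = False
    disable: bool = False

    def __post_init__(self):
        # Assumption: a dictionary without a name or with blank terms is rejected.
        if not self.name:
            raise ValueError("name must not be empty")
        if not self.words or any(not word.strip() for word in self.words):
            raise ValueError("words must be a non-empty list of non-empty terms")


def dictionaries_option(dictionaries) -> dict:
    """Serialize Dictionary objects into the dictionaries settings fragment."""
    return {"dictionaries": [asdict(entry) for entry in dictionaries]}


option = dictionaries_option([
    Dictionary(name="Lite - Confidential Dictionary",
               words=["Confidential", "Restricted", "Secret"]),
    Dictionary(name="Case and Whole Dictionary",
               words=["Monthly internal report"], case_sensitive=True),
])
print(json.dumps(option, indent=2))
```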
3.2.17 Regular Expression¶
Scanning for regular expressions.
Options:
- name : string - Name for this regular expression
- value : string - Regular expression (regexp)
Inside a JSON string, a backslash in the pattern must be doubled, so a literal dot is written as \\.
Format:
"regexps": [ { "name": "IPv4", "value": "(10\\.[0-9]{1,3}|172\\.(3[01]|2[0-9]|1[6-9])|192\\.168)\\.[0-9]{1,3}\\.[0-9]{1,3}" }, { "name": "IPv6", "value": "(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))" } ]
Settings option is regexps with the above format
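Before adding a pattern to the settings, it can help to try it against a few sample strings. The sketch below does this with Python's `re` module for a dot-escaped variant of the private-range IPv4 pattern; whether the engine's regular expression dialect behaves exactly like Python's is an assumption, and the sample addresses are made up for illustration.

```python
import json
import re

# Private-range IPv4 pattern with literal dots escaped.
PRIVATE_IPV4 = r"(10\.[0-9]{1,3}|172\.(3[01]|2[0-9]|1[6-9])|192\.168)\.[0-9]{1,3}\.[0-9]{1,3}"
pattern = re.compile(PRIVATE_IPV4)

samples = ["10.0.0.1", "172.16.5.4", "192.168.1.20", "8.8.8.8", "172.32.0.1"]
# Keep only the samples that the pattern matches in full.
matches = [sample for sample in samples if pattern.fullmatch(sample)]
print(matches)

# json.dumps doubles each backslash, which is the form a JSON settings file needs.
regexps_option = {"regexps": [{"name": "IPv4", "value": PRIVATE_IPV4}]}
serialized = json.dumps(regexps_option)
print(serialized)
```

Only the three private-range addresses are kept; the public address and the out-of-range 172.32.0.1 are rejected.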
3.2.18 Mime Type Alerts¶
The sensitivity.io engine will display information when specific file types (mime-types) are found.
Options:
- name : string - Name for this group of mime-types
- mime_types : list of mime-types - Mime-types that will display an alert if found
Format:
"mime_type_alerts": [ { "mime_types": ["application/x-cpt"], "name": "Corel Photo-Paint" }, { "mime_types": ["application/jnt-file"], "name": "Journal files" }, { "mime_types": ["application/x-bzip2"], "name": "bz2" }, { "mime_types": ["application/x-gzip"], "name": "GZ" }, { "name": "ZIP", "mime_types": ["application/x-compress", "application/zip"] } ]
Settings option is mime_type_alerts with the above format
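When alert groups are assembled from several sources, the same group can easily be listed twice. The following sketch merges groups by name and drops repeated mime-types before building the fragment; the helper `mime_type_alerts_option` is hypothetical, and treating groups with the same name as one group is a design assumption rather than documented engine behavior.

```python
import json


def mime_type_alerts_option(groups) -> dict:
    """Build mime_type_alerts from (name, mime_types) pairs, merging duplicates."""
    merged = {}
    for name, mime_types in groups:
        bucket = merged.setdefault(name, [])
        for mime_type in mime_types:
            if mime_type not in bucket:
                bucket.append(mime_type)
    return {
        "mime_type_alerts": [
            {"name": name, "mime_types": mime_types}
            for name, mime_types in merged.items()
        ]
    }


alerts = mime_type_alerts_option([
    ("GZ", ["application/x-gzip"]),
    ("GZ", ["application/x-gzip"]),
    ("ZIP", ["application/x-compress", "application/zip"]),
])
print(json.dumps(alerts))
```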
3.3 SDC module settings¶
These settings apply to the Sensitive Data Classification module.
3.3.1 Minimum Word Length¶
The minimum word length the classification engine will include in the model, expressed in number of characters.
Supported values: >= 0, where 0 means no restriction.
Recommended: 2
"word_min_length": 2
Settings option is word_min_length with integer value
3.3.2 Maximum Word Length¶
The maximum word length the classification engine will include in the model, expressed in number of characters.
Supported values: >= 0, where 0 means no restriction.
Recommended: 15
"word_max_length": 15
Settings option is word_max_length with integer value
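The two length options can be read as a simple filter on candidate words. The sketch below illustrates that reading with the recommended values; `within_length_limits` is a hypothetical name, and treating both bounds as inclusive is an assumption.

```python
def within_length_limits(word: str,
                         word_min_length: int = 2,
                         word_max_length: int = 15) -> bool:
    """Return True if the word length satisfies both limits (0 disables a limit)."""
    length = len(word)
    if word_min_length and length < word_min_length:
        return False
    if word_max_length and length > word_max_length:
        return False
    return True


words = ["a", "dog", "pneumonoultramicroscopic"]
print([word for word in words if within_length_limits(word)])
```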
3.3.3 Vocabulary¶
List of words the model uses for creating a pattern. This option is mandatory: the list cannot be empty, and it cannot contain empty words.
Format:
"vocabulary": [ "a", "adding", "and", "are", "buy", "by", "cheap", "cialis", "dog", "drugs", "ed", "email" ]
Settings option is vocabulary with a list of string values
3.3.4 Ignored Words¶
List of words that the sensitivity.io engine will ignore when classifying. If the option is present, its words cannot be empty.
Format:
"ignored_words": [ "foo", "bar" ]
Settings option is ignored_words with a list of string values
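One way to picture how a vocabulary and an ignore list could interact is a sparse presence vector over the vocabulary. The sketch below is purely illustrative and is not taken from the engine: the function `to_sparse_features`, the 1-based indexing, the whitespace tokenization and the `index:1` output form are all assumptions.

```python
VOCABULARY = ["a", "adding", "and", "are", "buy", "by",
              "cheap", "cialis", "dog", "drugs", "ed", "email"]
IGNORED_WORDS = {"foo", "bar"}


def to_sparse_features(text: str, vocabulary, ignored_words) -> str:
    """Map a text to 'index:1' pairs for each vocabulary word it contains."""
    # Assumption: vocabulary positions are numbered starting at 1.
    index = {word: position for position, word in enumerate(vocabulary, start=1)}
    # Assumption: lowercase whitespace tokenization, with ignored words dropped first.
    tokens = {token for token in text.lower().split() if token not in ignored_words}
    hits = sorted(index[token] for token in tokens if token in index)
    return " ".join(f"{position}:1" for position in hits)


print(to_sparse_features("Buy cheap drugs by email foo", VOCABULARY, IGNORED_WORDS))
```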
3.3.5 Model¶
The sensitivity.io engine will use this SVM model to classify documents. It is always required.
Format:
"model": "svm_type c_svc\nkernel_type rbf\ngamma 0.0263158\nnr_class 2\ntotal_sv 10\nrho 0.23536\nlabel 1 0\nnr_sv 5 5\nSV\n1 30982880:1.5832638e-316 0:0 25:1 29:1 36:1 \n1 8:1 11:1 27:1 34:1 \n1 3:1 5:1 8:1 10:1 20:1 27:1 29:1 34:1 \n1 5:1 6:1 10:1 22:1 \n1 5:1 7:1 34:1 \n-1 1:1 9:1 16:1 18:1 19:1 24:1 26:1 28:1 \n-1 1:1 9:1 23:1 26:1 28:1 31:1 38:1 \n-1 1:1 4:1 13:1 14:1 19:1 30:1 33:1 35:1 37:1 \n-1 2:1 12:1 19:1 21:1 32:1 33:1 \n-1 4:1 15:1 17:1 19:1 37:1 \n"
Settings option is model with string value
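The model string in the example appears to follow the LIBSVM model file layout: header lines of the form `key value`, then an `SV` marker followed by the support vectors. Assuming that layout, a quick sanity check of the header before submitting the settings could look like the sketch below; `parse_model_header` is a hypothetical helper and the model text is shortened to its header.

```python
def parse_model_header(model: str) -> dict:
    """Collect the 'key value' header lines that precede the SV marker."""
    header = {}
    for line in model.splitlines():
        if line.strip() == "SV":
            break
        key, _, value = line.partition(" ")
        header[key] = value.strip()
    return header


# Header portion of the example model above (support vectors omitted).
MODEL = ("svm_type c_svc\nkernel_type rbf\ngamma 0.0263158\nnr_class 2\n"
         "total_sv 10\nrho 0.23536\nlabel 1 0\nnr_sv 5 5\nSV\n")

header = parse_model_header(MODEL)
per_class = [int(count) for count in header["nr_sv"].split()]
# The per-class support vector counts should add up to total_sv.
print(header["kernel_type"], sum(per_class) == int(header["total_sv"]))
```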
3.2.6 Social Security Number (SSN)¶
Scanning for country specific social security or insurance numbers.
Supported countries for SSN:
Settings option is ssn with boolean country list