Protection Profiles

sensitivity.io Protection Profiles are sets of rules that you pass to a module when you start a job.

You can create your own Custom Protection Profile in your Control Panel account.

You can also choose an already defined profile, either a Predefined Profile or a Compliance Profile that delivers out-of-the-box compliance for your application with standards such as HIPAA, PCI-DSS and GDPR.

1 Compliance Profiles - out-of-the-box

1.1 PCI-DSS - Payment Card Industry Data Security Standard

Name | Description
PCI-DSS - Credit Cards | Protect Credit Card Numbers
PCI-DSS - Credit Cards & Social Security Numbers | Protect Credit Card Numbers and Social Security Numbers
PCI-DSS - Credit Cards & Personally Identifiable Information | Protect Credit Cards & Personally Identifiable Information
PCI-DSS - Credit Cards & E-mail addresses | Protect Credit Cards & E-mail addresses
PCI-DSS - Credit Cards & IBAN | Protect Credit Cards & IBAN
PCI-DSS - Credit Cards & phone numbers | Protect Credit Cards & phone numbers
PCI-DSS - Credit Cards & postal addresses | Protect Credit Cards & postal addresses
PCI-DSS - Credit Cards, social security numbers and postal addresses | Protect Credit Cards, social security numbers and postal addresses
PCI-DSS - Credit Cards, social security numbers and e-mail addresses | Protect Credit Cards, social security numbers and e-mail addresses
PCI-DSS - Credit Cards, social security numbers and phone numbers | Protect Credit Cards, social security numbers and phone numbers
PCI-DSS - Credit Cards with CVV | Protect Credit Cards with CVV
PCI-DSS - Credit Cards with expiry date | Protect Credit Cards with expiry date
PCI-DSS - Credit Cards with CVV and E-mail addresses | Protect Credit Cards with CVV and E-mail addresses
PCI-DSS - Credit Cards with CVV and Social Security Numbers | Protect Credit Cards with CVV and Social Security Numbers
PCI-DSS - Credit Cards with CVV and expiry date | Protect Credit Cards with CVV and expiry date
PCI-DSS - Credit Cards with CVV, expiry date and IBAN | Protect Credit Cards with CVV, expiry date and IBAN
PCI-DSS - Credit Cards with CVV, expiry date and PII | Protect Credit Cards with CVV, expiry date and PII

1.2 GLBA - Gramm-Leach-Bliley Act

Name | Description
GLBA - Personally Identifiable Information | Protect Personally Identifiable Information

1.3 HIPAA - Health Insurance Portability and Accountability Act

Name | Description
HIPAA - Prescription Drugs and Personally Identifiable Information | Protect FDA recognised prescription drugs and Personally Identifiable Information
HIPAA - ICD-9 & Diagnosis Lexicon | Protect ICD-9 codes and diagnosis lexicon
HIPAA - ICD-10 & Diagnosis Lexicon | Protect ICD-10 codes and diagnosis lexicon
HIPAA - Diagnosis Lexicon and Personally Identifiable Information | Protect ICD-9 codes, diagnosis lexicon and Personally Identifiable Information
HIPAA - Personally Identifiable Information | Protect Personally Identifiable Information
HIPAA - Pharmaceutical firms | Protect FDA recognised pharmaceutical firms
HIPAA - Pharmaceutical firms, drugs and diagnosis | Protect Pharmaceutical firms, drugs and diagnosis
HIPAA - Pharmaceutical firms and Personally Identifiable Information | Protect Pharmaceutical firms and Personally Identifiable Information
HIPAA - Prescription Drugs | Protect FDA recognised prescription drugs

1.4 GDPR - General Data Protection Regulation Europe

Name | Description
GDPR - Credit Card Numbers | Protect Credit Card Numbers
GDPR - Credit Card Numbers and e-mail address | Protect Credit Card Numbers & E-mail addresses
GDPR - Postal address, e-mail address, phone number, national ID | Protect postal address, e-mail address, phone number, national ID
GDPR - Credit Card Numbers and name | Protect Credit Card Numbers and name
GDPR - Computer IP address | Protect Computer IP Address
GDPR - Date of birth and national ID | Protect date of birth and national ID
Austria - GDPR | Protect GDPR Data for this country
Belgium - GDPR | Protect GDPR Data for this country
Czech Republic - GDPR | Protect GDPR Data for this country
Denmark - GDPR | Protect GDPR Data for this country
Finland - GDPR | Protect GDPR Data for this country
France - GDPR | Protect GDPR Data for this country
Germany - GDPR | Protect GDPR Data for this country
Greece - GDPR | Protect GDPR Data for this country
Ireland - GDPR | Protect GDPR Data for this country
Italy - GDPR | Protect GDPR Data for this country
Netherlands - GDPR | Protect GDPR Data for this country
Norway - GDPR | Protect GDPR Data for this country
Poland - GDPR | Protect GDPR Data for this country
Portugal - GDPR | Protect GDPR Data for this country
Romania - GDPR | Protect GDPR Data for this country
Spain - GDPR | Protect GDPR Data for this country
Sweden - GDPR | Protect GDPR Data for this country
Switzerland - GDPR | Protect GDPR Data for this country
Turkey - GDPR | Protect GDPR Data for this country
United Kingdom - GDPR | Protect GDPR Data for this country

1.5 Predefined Protection Profiles

Name | Description
File Type - Archive Files | Protect archive files
File Type - Office Files | Protect Office files
File Type - Other Files | Protect other files
File Type - Programming Files | Protect programming files
File Type - Media Files | Protect media files
Custom Content - Dictionary | Protect confidential terms
Default Regular Expressions | Protect Predefined Regular Expressions to match email addresses
Intellectual Property Glossary | Protect specific terms for Intellectual Property
Personally Identifiable Information | Protect Personally Identifiable Information
Indicators of insiders risk | Detect expressions that indicate a certain degree of risk from employees to disclose sensitive data outside the organization

2 Supported File Types

2.1 Office Files

Group: Office Files

File Type | Mime Type
PDF | application/pdf, application/x-hpt
Excel | application/vnd.ms-excel, application/vnd.oasis.opendocument.spreadsheet, application/x-hwp-cell
Word | application/msword, application/vnd.oasis.opendocument.text, text/rtf, application/x-hwp, application/encrypted-msword
PowerPoint | application/vnd.ms-powerpoint, application/vnd.oasis.opendocument.presentation, application/encrypted-powerpoint
Infopath | application/vnd.ms-cab-compressed
Publisher | application/x-mspublisher
Outlook | application/winframe
Office2007+/password | application/encrypted-office2007+
iWork files | application/iwork-file

2.2 Graphic Files

Group: Graphic Files

File Type | Mime Type
JPEG | image/jpeg
PNG | image/png
GIF | image/gif
BMP | image/x-ms-bmp
TIFF | image/tiff
EPS | application/postscript
DJV | image/vnd.djvu
CGM | image/cgm
ICO | image/x-icon, application/ico
CorelDraw | application/x-cdr
Corel Photo-Paint | application/x-cpt
PSD | image/vnd.adobe.photoshop
Adobe InDesign | application/x-indesign
Adobe Illustrator | application/ai-indesign
BPF | application/bpf-file

2.3 Media Files

Group: Media Files

File Type | Mime Type
mp3 | audio/mpeg
aif | audio/x-aiff
m3u | audio/scpls
m4a, mp4 | audio/mp4, video/mp4
wav | audio/x-wav
wma | video/x-ms-asf
avi | video/x-msvideo
mov | video/quicktime

2.4 Archive Files

Group: Archive Files

File Type | Mime Type
ZIP | application/zip, application/x-compress
7z | application/x-7z-compressed
ACE | application/x-ace
RAR | application/x-rar
TAR | application/x-tar, application/x-compress
XZ | application/x-xz
ZIP/password | application/zip-encrypted
RAR/password | application/encrypted-x-rar
ACE/password | application/encrypted-x-ace
bz2 | application/x-bzip2
GZ | application/x-gzip
.xar | application/x-com.apple.xar-archive

2.5 Programming Files

Group: Programming Files

File Type | Mime Type
bat, cmd | text/x-msdos-batch
java | text/x-java
c, cpp | text/x-c++, text/x-c
py | text/x-python
pas | text/x-pascal
sh, csh | text/x-shellscript
f | text/x-fortran
xml, dtd | application/xml
tex | text/x-tex
asm | text/x-asm
makefile | text/x-makefile
fdl | application/fdl-file
VBS | text/x-vbs
matlab | text/x-matlab
php | text/x-php
ruby | text/x-ruby
perl | text/x-perl
P12 | application/x-pkcs12
IPA | application/vnd.apple.ipa
APK | application/vnd.android.package-archive
SQL | text/x-sql
DMP | application/sql-dump
BACKUP | text/x-sql-backup
GO | application/go

2.6 Other Files

Group: Other Files

File Type | Mime Type
exe, sys, dll | application/x-dosexec
so | application/x-sharedlib
Text files | text/plain, text/html
AutoCAD files | application/x-autocad
DRM Files | application/x-scdsa
SgWgc | application/x-ciel.sgwgc
SolidWorks Files | application/solidworks-file
Journal files | application/jnt-file
Fasoo Files | application/fasoo-drm
NASCA DRM | application/nasca-drm
I-DEAS 3D CAD | application/ideas-3d-cad
CATIA | application/catia-file
SID | application/sid-file
DTA | application/x-dta
xia | application/xia-file
Pro-E CAD | application/Pro-ECAD-file
.accdb | application/vnd.ms-access, application/x-msaccess
EPP_encrypted files | application/epp-encrypted-file
VMDK | application/vmdk-file
Other format | application/octet-stream and the additional types listed below
application/dicom
application/encrypted-excel
application/encrypted-ms-excel
application/encrypted-pdf
application/epub+zip
application/mac-binhex40
application/marc
application/pgp
application/pgp-encrypted
application/pgp-keys
application/pgp-signature
application/vnd.cups-raster
application/vnd.font-fontforge-sfd
application/vnd.google-earth.kml+xml
application/vnd.google-earth.kmz
application/vnd.iccprofile
application/vnd.lotus-wordpro
application/vnd.ms-fontobject
application/vnd.ms-opentype
application/vnd.ms-tnef
application/vnd.oasis.opendocument.chart
application/vnd.oasis.opendocument.chart-template
application/vnd.oasis.opendocument.database
application/vnd.oasis.opendocument.formula
application/vnd.oasis.opendocument.formula-template
application/vnd.oasis.opendocument.graphics
application/vnd.oasis.opendocument.graphics-template
application/vnd.oasis.opendocument.image
application/vnd.oasis.opendocument.image-template
application/vnd.oasis.opendocument.presentation-template
application/vnd.oasis.opendocument.spreadsheet-template
application/vnd.oasis.opendocument.text-master
application/vnd.oasis.opendocument.text-template
application/vnd.oasis.opendocument.text-web
application/vnd.rn-realmedia
application/vnd.symbian.install
application/vnd.tcpdump.pcap
application/warc
application/x-123
application/x-adrift
application/x-arc
application/x-archive
application/x-arj
application/x-bittorrent
application/x-coredump
application/x-cpio
application/x-dbf
application/x-dbm
application/x-debian-package
application/x-dvi
application/x-eet
application/x-elc
application/x-epoc-app
application/x-epoc-opl
application/x-epoc-opo
application/x-epoc-sheet
application/x-epoc-word
application/x-executable
application/x-font-ttf
application/x-freemind
application/x-gdbm
application/x-gnucash
application/x-gnumeric
application/x-gnupg-keyring
application/x-hdf
application/x-ia-arc
application/x-ichitaro4
application/x-ichitaro5
application/x-ichitaro6
application/x-iso9660-image
application/x-java-applet
application/x-java-jce-keystore
application/x-java-keystore
application/x-java-pack200
application/x-kdelnk
application/x-lha
application/x-lharc
application/x-lrzip
application/x-lzip
application/x-lzma
application/x-mif
application/xml-sitemap
application/x-object
application/x-pgp-keyring
application/x-quark-xpress-3
application/x-quicktime-player
application/x-rpm
application/x-sc
application/x-scribus
application/x-setupscript
application/x-shockwave-flash
application/x-stuffit
application/x-svr4-package
application/x-tex-tfm
application/x-tokyocabinet-btree
application/x-tokyocabinet-fixed
application/x-tokyocabinet-hash
application/x-tokyocabinet-table
application/x-zoo
audio/basic
audio/midi
audio/x-adpcm
audio/x-ape
audio/x-dec-basic
audio/x-flac
audio/x-hx-aac-adif
audio/x-hx-aac-adts
audio/x-mod
audio/x-mp4a-latm
audio/x-pn-realaudio
audio/x-unknown
audio/x-w64
chemical/x-pdb
image/jp2
image/svg+xml
image/x-canon-cr2
image/x-canon-crw
image/x-coreldraw
image/x-cpi
image/x-epoc-mbm
image/x-epoc-sketch
image/x-niff
image/x-olympus-orf
image/x-paintnet
image/x-portable-bitmap
image/x-portable-greymap
image/x-portable-pixmap
image/x-quicktime
image/x-unknown
image/x-x3f
image/x-xcf
image/x-xcursor
image/x-xpmi
image/x-xwindowdump
message/news
message/rfc822
model/vrml
model/x3d
rinex/broadcast
rinex/clock
rinex/meteorological
rinex/navigation
rinex/observation
text/calendar
text/PGP
text/texmacs
text/troff
text/vnd.graphviz
text/x-awk
text/x-bcpl
text/x-diff
text/x-gawk
text/x-info
text/x-lisp
text/x-lua
text/x-m4
text/x-nawk
text/x-po
text/x-tcl
text/x-texinfo
text/x-vcard
text/x-xmcd
video/3gpp
video/3gpp2
video/h264
video/mp2p
video/mp2t
video/mp4v-es
video/mpeg
video/mpeg4-generic
video/mpv
video/webm
video/x-flc
video/x-fli
video/x-flv
video/x-jng
video/x-matroska
video/x-mng
video/x-sgi-movie
video/x-unknown
x-epoc/x-sisx-app

3 Protection Profiles structure

[Figure: Protection Profiles scheme]

A protection profile represents the link between a profile definition and:

  • either a profile origin (account, project or app)
  • or a profile destination (i.e. cloud storage path).

A profile definition consists of:

  • predefined content definitions;
  • file type definitions;
  • dictionary definitions;
  • regex definitions.

The resulting collection, also known as configuration settings in the SDK, is used as an instruction set by the sensitivity.io SDK engine through the Settings resource.

Important

Every setting in the Settings resource is validated against your License: a setting is accepted only if the corresponding feature is enabled and valid in that License.

Compose the Settings yourself by using the following options:

3.1 General settings

These settings apply to all sensitivity.io modules.

3.1.1 Maximum File Size

The maximum file size, in bytes, that the engine will process. Files larger than this value are ignored by the engine.

Supported values >= 0 where 0 means no limit.

Recommended: 10485760 (10 MB, more than enough for typical documents)

{
  "settings" : {
    "text_extractor" : {
      "max_file_size" : 2048
    }
  }
}
Settings option is max_file_size with integer value
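
As a small sketch of where the recommended value comes from: 10 MB expressed in bytes is 10 * 1024 * 1024 = 10485760. The helper below, `max_file_size_settings`, is a hypothetical name and not part of the SDK; only the `settings`, `text_extractor` and `max_file_size` keys are taken from the example above.

```python
import json

BYTES_PER_MB = 1024 * 1024


def max_file_size_settings(limit_mb: int) -> dict:
    """Build a Settings document whose max_file_size is given in bytes.

    Hypothetical helper, not an SDK function. A limit of 0 means "no limit".
    """
    if limit_mb < 0:
        raise ValueError("max_file_size must be >= 0")
    return {
        "settings": {
            "text_extractor": {"max_file_size": limit_mb * BYTES_PER_MB}
        }
    }


if __name__ == "__main__":
    # Prints a document with max_file_size set to 10485760.
    print(json.dumps(max_file_size_settings(10), indent=2))
```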

3.1.2 Maximum Recursion Level

The maximum recursion level to search inside archives (useful to avoid zip bombs).

Supported file types are: compound document format based files (doc, xls, ppt), PDF, Office Open XML files (.docx, .xlsx, .pptx) and ZIP archives.

Supported values must be in the interval [2, 8].

{
  "settings" : {
    "text_extractor" : {
      "max_recursion_level" : 3
    }
  }
}
Settings option is max_recursion_level with integer value
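
To illustrate why a recursion limit helps against deeply nested archives, here is a minimal sketch that walks nested ZIP files with Python's standard `zipfile` module and stops descending once the limit is reached. It is not the engine's implementation: the function names are hypothetical, and counting the outermost archive as level 1 is an assumption.

```python
import io
import zipfile


def make_zip(members: dict) -> bytes:
    """Build an in-memory ZIP archive from a {name: content} mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in members.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def list_members(data: bytes, max_recursion_level: int, level: int = 1) -> list:
    """List member names, opening nested ZIPs only while level < limit."""
    if not 2 <= max_recursion_level <= 8:
        raise ValueError("max_recursion_level must be in the interval [2, 8]")
    names = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            names.append(info.filename)
            # Assumption: the outermost archive counts as level 1.
            if level < max_recursion_level and info.filename.endswith(".zip"):
                nested = archive.read(info.filename)
                names.extend(list_members(nested, max_recursion_level, level + 1))
    return names


if __name__ == "__main__":
    inner = make_zip({"secret.txt": b"confidential"})
    middle = make_zip({"inner.zip": inner})
    outer = make_zip({"middle.zip": middle})
    print(list_members(outer, max_recursion_level=2))  # stops before secret.txt
    print(list_members(outer, max_recursion_level=3))  # reaches secret.txt
```

With a limit of 2 the innermost archive is listed but never opened, so the cost of a scan stays bounded no matter how many layers an archive contains.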

3.1.3 Extract Metadata

For supported file types, content can also be extracted from metadata (file properties).

Supported values:

  • true
  • false

Defaults to false.

{
  "settings" : {
    "text_extractor" : {
      "extract_metadata" : true
    }
  }
}
Settings option is extract_metadata with boolean value

3.1.4 Ignored Mime Types

The sensitivity.io engine ignores files of these MIME types. If present, the MIME types cannot be empty.

Format:

"ignored_mime_types": [
  "application/x-executable",
  "application/x-ms-dos-executable",
  "application/x-sharedlib",
  "application/x-object"
]
Settings option is ignored_mime_types with the above format

3.1.5 Ignored File Extensions

The sensitivity.io engine ignores files with these extensions. An empty extension may denote a Unix executable, but relying on that is risky; use a MIME type for this case instead.

Format:

"ignored_extensions": [
  "exe",
  "so",
  "dll",
  "dylib",
  "a",
  "lib",
  "o",
  "obj"
]
Settings option is ignored_extensions with the above format
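
The two ignore lists can be pictured as a simple pre-filter. The sketch below is only an illustration under stated assumptions: `is_ignored` is a hypothetical name, the MIME type is assumed to be detected elsewhere and passed in, and extensions are assumed to be compared without the leading dot and case-insensitively.

```python
from pathlib import PurePath

IGNORED_MIME_TYPES = ["application/x-executable", "application/x-sharedlib"]
IGNORED_EXTENSIONS = ["exe", "so", "dll", "dylib", "a", "lib", "o", "obj"]


def is_ignored(path: str, mime_type: str,
               ignored_mime_types=IGNORED_MIME_TYPES,
               ignored_extensions=IGNORED_EXTENSIONS) -> bool:
    """Return True when either ignore list matches the file.

    Assumptions: the MIME type is supplied by the caller, and extensions
    are compared without the leading dot, ignoring case.
    """
    if mime_type in ignored_mime_types:
        return True
    extension = PurePath(path).suffix[1:].lower()
    # Files without an extension are never matched by extension here;
    # they can only be skipped through their MIME type.
    return bool(extension) and extension in ignored_extensions


if __name__ == "__main__":
    print(is_ignored("bin/tool", "application/x-executable"))
    print(is_ignored("report.pdf", "application/pdf"))
```

An extensionless Unix executable such as `bin/tool` is caught by its MIME type rather than by an empty extension, which matches the advice given above.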

3.2 SDS module settings

These settings apply to the Sensitive Data Scanning module.

3.2.1 Email

Scanning for email addresses.

Supported values:

  • true
  • false
"email" : true
Settings option is email with boolean value

3.2.2 Credit Card

Scanning for credit card numbers.

Supported credit card formats:

  • Mastercard

  • Visa

  • American Express - The American Express Company, also known as Amex

  • JCB - Credit card company based in Tokyo, Japan

  • Diners - Diners Club International (DCI), founded as Diners Club

  • Discover - The Discover Card is a credit card, issued primarily in the United States

  • MIR - Payment system established by the Central Bank of Russia

  • Maestro - Multinational debit card service owned by Mastercard

  • China UnionPay - China UnionPay, also known under its abbreviation, CUP, is a Chinese financial services corporation

  • Carte Blanche - Carte Blanche is a Diners Club Card

    "credit_card" : {
      "amex" : true,
      "china_unionpay" : true,
      "carte_blanche" : true,
      "diners" : true,
      "discover" : true,
      "jcb" : true,
      "maestro" : true,
      "mastercard" : true,
      "mir" : true,
      "visa" : true
    }
    
Settings option is credit_card with boolean credit cards list
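
If only some card brands are of interest, the `credit_card` object can be generated rather than typed by hand. The following is a minimal sketch; `credit_card_settings` is a hypothetical helper, and writing every brand explicitly as `true` or `false` is a choice made here because this page does not state what a missing key defaults to.

```python
import json

SUPPORTED_BRANDS = (
    "amex", "china_unionpay", "carte_blanche", "diners", "discover",
    "jcb", "maestro", "mastercard", "mir", "visa",
)


def credit_card_settings(enabled) -> dict:
    """Return a credit_card object with one boolean per supported brand."""
    unknown = set(enabled) - set(SUPPORTED_BRANDS)
    if unknown:
        raise ValueError(f"unsupported credit card keys: {sorted(unknown)}")
    return {"credit_card": {brand: brand in enabled for brand in SUPPORTED_BRANDS}}


if __name__ == "__main__":
    print(json.dumps(credit_card_settings({"visa", "mastercard"}), indent=2))
```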

3.2.3 Date

Scanning for dates like birthdays, calendar events, meetings etc.

Supported values:

  • true
  • false
"date" : true
Settings option is date with boolean value

3.2.4 IBAN

Scanning for IBANs.

Supported values:

  • true
  • false
"iban" : true
Settings option is iban with boolean value

3.2.5 SWIFT

Scanning for SWIFT codes.

Supported values:

  • true
  • false
"swift" : true
Settings option is swift with boolean value

3.2.6 Social Security Number (SSN)

Scanning for country specific social security or insurance numbers.

Supported countries for SSN:

  • at - Austria
  • ca - Canada
  • ch - Switzerland
  • cy - Cyprus
  • de - Germany
  • es - Spain
  • fr - France
  • gb - United Kingdom
  • gr - Greece
  • hu - Hungary
  • ie - Ireland
  • jp - Japan
  • kr - Korea
  • lu - Luxembourg
  • nl - Netherlands
  • pl - Poland
  • ro - Romania
  • ru - Russia
  • tw - Taiwan
  • us - United States
"ssn" : {
  "at" : true,
  "ca" : true,
  "ch" : true,
  "cy" : true,
  "de" : true,
  "es" : true,
  "fr" : true,
  "gb" : true,
  "gr" : true,
  "hu" : true,
  "ie" : true,
  "jp" : true,
  "kr" : true,
  "lu" : true,
  "nl" : true,
  "pl" : true,
  "ro" : true,
  "ru" : true,
  "tw" : true,
  "us" : true
}
Settings option is ssn with boolean country list
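
Boolean country lists like this one are easy to mistype. As a hedged illustration, the check below reports country codes that are not in the supported list above and values that are not booleans; `check_country_flags` is a hypothetical helper, not an SDK call, and the same pattern would apply to the other country lists on this page with their own sets of codes.

```python
SSN_COUNTRIES = frozenset(
    "at ca ch cy de es fr gb gr hu ie jp kr lu nl pl ro ru tw us".split()
)


def check_country_flags(option: str, flags: dict, supported=SSN_COUNTRIES) -> list:
    """Return a list of problems found in a boolean country list."""
    problems = []
    for code, value in flags.items():
        if code not in supported:
            problems.append(f"{option}.{code}: unsupported country code")
        elif not isinstance(value, bool):
            problems.append(f"{option}.{code}: expected true or false")
    return problems


if __name__ == "__main__":
    print(check_country_flags("ssn", {"de": True, "xx": True, "fr": "yes"}))
```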

3.2.7 Passport

Scanning for country specific passport numbers.

Supported countries for Passport:

  • ca - Canada
  • cn - China Mainland
  • de - Germany
  • fi - Finland
  • fr - France
  • gb - United Kingdom
  • hk - Hong Kong
  • ie - Ireland
  • jp - Japan
  • kr - Korea
  • mo - Macao
"passport" : {
  "ca" : true,
  "cn" : true,
  "de" : true,
  "fi" : true,
  "fr" : true,
  "gb" : true,
  "hk" : true,
  "ie" : true,
  "jp" : true,
  "kr" : true,
  "mo" : true
}
Settings option is passport with boolean country list

3.2.8 Driver’s License

Scanning for country specific driver’s license numbers.

Supported countries for Driver’s License:

  • de - Germany
  • gb - Great Britain
  • it - Italy
  • kr - Korea
"driving_license" : {
  "de" : true,
  "gb" : true,
  "it" : true,
  "kr" : true
}
Settings option is driving_license with boolean country list

3.2.9 Health Insurance Number

Scanning for country specific health insurance numbers.

Supported countries for health insurance numbers:

  • ca - Canada
  • au - Australia
  • gb - United Kingdom
  • kr - Korea
"health_insurance_number" : {
  "ca" : true,
  "au" : true,
  "gb" : true,
  "kr" : true
}
Settings option is health_insurance_number with boolean country list

3.2.10 ID Card

Scanning for country specific identification cards.

Supported countries for ID Card:

  • ae - United Arab Emirates
  • al - Albania
  • be - Belgium
  • bg - Bulgaria
  • br - Brazil
  • cl - Chile
  • cn - China Mainland
  • cz - Czech Republic
  • de - Germany
  • dk - Denmark
  • ec - Ecuador
  • ee - Estonia
  • fi - Finland
  • fr - France
  • gr - Greece
  • hk - Hong Kong
  • hr - Croatia
  • id - Indonesia
  • il - Israel
  • in - India
  • is - Iceland
  • it - Italy
  • kz - Kazakhstan
  • lt - Lithuania
  • lv - Latvia
  • mo - Macao
  • mx - Mexico
  • my - Malaysia
  • no - Norway
  • pe - Peru
  • pl - Poland
  • pt - Portugal
  • se - Sweden
  • sg - Singapore
  • th - Thailand
  • tr - Turkey
  • yu - Yugoslavia
  • za - South Africa
"id_card" : {
  "ae" : true,
  "al" : true,
  "be" : true,
  "bg" : true,
  "br" : true,
  "cl" : true,
  "cn" : true,
  "cz" : true,
  "de" : true,
  "dk" : true,
  "ec" : true,
  "ee" : true,
  "hk" : true,
  "hr" : true,
  "fi" : true,
  "fr" : true,
  "gr" : true,
  "id" : true,
  "il" : true,
  "in" : true,
  "is" : true,
  "it" : true,
  "kz" : true,
  "lt" : true,
  "lv" : true,
  "mo" : true,
  "mx" : true,
  "my" : true,
  "no" : true,
  "pe" : true,
  "pl" : true,
  "pt" : true,
  "se" : true,
  "sg" : true,
  "th" : true,
  "tr" : true,
  "yu" : true,
  "za" : true
}
Settings option is id_card with boolean country list

3.2.11 Phone Number

Scanning for country specific phone numbers.

Supported countries for Phone Number:

  • cn - China Mainland
  • co - Colombia
  • de - Germany
  • hk - Hong Kong
  • in - India
  • intl - International
  • jp - Japan
  • kr - Korea
  • local - Local phone numbers; this option matches a large number of local formats, so use it with caution
  • mo - Macao
  • tr - Turkey
  • ua - Ukraine
  • us - United States
"phone_number" : {
  "cn" : true,
  "co" : true,
  "de" : true,
  "hk" : true,
  "in" : true,
  "intl" : true,
  "jp" : true,
  "kr" : true,
  "local": false,
  "mo" : true,
  "tr" : true,
  "ua" : true,
  "us" : true
}
Settings option is phone_number with boolean country list

3.2.12 Tax ID

Scanning for country specific tax identification numbers.

Supported countries for Tax ID:

  • au - Australia
  • bg - Bulgaria
  • br - Brazil
  • ec - Ecuador
  • es - Spain
  • intl - International
  • it - Italy
  • pe - Peru
  • pl - Poland
  • us - United States
"tax_id" : {
  "au" : true,
  "bg" : true,
  "br" : true,
  "ec" : true,
  "es" : true,
  "intl" : true,
  "it" : true,
  "pe" : true,
  "pl" : true,
  "us" : true
}
Settings option is tax_id with boolean country list

3.2.13 VAT Number

Scanning for country specific VAT numbers.

Supported countries for VAT Number:

  • de - Germany
  • fi - Finland
  • gr - Greece
  • ie - Ireland
  • it - Italy
  • lu - Luxembourg
  • si - Slovenia
"vat_number" : {
  "de": true,
  "fi": true,
  "gr": true,
  "ie": true,
  "it": true,
  "lu": true,
  "si": true
}
Settings option is vat_number with boolean country list

3.2.14 Foreign Registration Number

Scanning for country specific foreign registration numbers.

Supported countries for Foreign Registration Numbers:

  • kr - Korea
"foreign_registration_number" : {
  "kr" : true
}
Settings option is foreign_registration_number with boolean country list

3.2.15 Address

Scanning for country specific addresses.

Supported countries for addresses:

  • co - Colombia
  • de - Germany
  • in - India
  • us - United States
"address" : {
  "co" : true,
  "de" : true,
  "in" : true,
  "us" : true
}
Settings option is address with boolean country list

3.2.16 Dictionaries

Scanning for specific terms (words or whole phrases).

Options:

  • case_sensitive: boolean - Scan case sensitive
  • disable: boolean - Present in settings, but inactive, won’t scan for these terms
  • name: string - Name for this group of terms
  • whole_words_only: boolean - Scan for whole specified term only or for each of the words in the term
  • words: list - Words or phrases to scan for

Format:

"dictionaries" : [
  {
    "case_sensitive" : false,
    "disable" : false,
    "name" : "Lite - Confidential Dictionary",
    "whole_words_only" : false,
    "words" : [ "Confidential", "Restricted", "Secret", "Sensitive", "Financial" ]
  },
  {
    "case_sensitive" : true,
    "disable" : false,
    "name" : "Case and Whole Dictionary",
    "whole_words_only" : true,
    "words" : [ "Monthly internal report", "Another strict secret" ]
  }
]
Settings option is dictionaries with the above format
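
The `disable` flag keeps a dictionary in the settings without scanning for its terms. A minimal sketch of that behaviour, assuming a missing `disable` key is treated as `false` (this page does not say) and using the hypothetical helper name `active_terms`:

```python
def active_terms(dictionaries: list) -> dict:
    """Map each enabled dictionary name to its list of terms."""
    return {
        entry["name"]: list(entry["words"])
        for entry in dictionaries
        # Assumption: an entry without "disable" counts as enabled.
        if not entry.get("disable", False)
    }


if __name__ == "__main__":
    dictionaries = [
        {"name": "Confidential", "disable": False, "words": ["Secret", "Restricted"]},
        {"name": "Archived terms", "disable": True, "words": ["Old project"]},
    ]
    print(active_terms(dictionaries))
```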

3.2.17 Regular Expression

Scanning for content that matches custom regular expressions.

Options:

  • name: string - Name for this regular expression
  • value: string - Regular expression - regexp

Format:

"regexps": [
  {
    "name": "IPv4",
    "value": "(10\\.[0-9]{1,3}|172\\.(3[01]|2[0-9]|1[6-9])|192\\.168)\\.[0-9]{1,3}\\.[0-9]{1,3}"
  },
  {
    "name": "IPv6",
    "value": "(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))"
  }
]
Settings option is regexps with the above format
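
Before adding an expression to `regexps`, it can be worth trying it out locally. The sketch below uses Python's `re` module as a stand-in, which is an assumption: the engine's regular expression dialect is not described on this page and may differ. It also shows two general points: a literal dot has to be escaped in a regular expression (an unescaped `.` matches any character), and each backslash is doubled when the pattern is written as a JSON string. `regexp_entry` is a hypothetical helper name.

```python
import json
import re

# Private IPv4 ranges, with every literal dot escaped.
PRIVATE_IPV4 = (
    r"(10\.[0-9]{1,3}|172\.(3[01]|2[0-9]|1[6-9])|192\.168)"
    r"\.[0-9]{1,3}\.[0-9]{1,3}"
)


def regexp_entry(name: str, pattern: str) -> dict:
    """Return a regexps entry after checking that the pattern compiles."""
    re.compile(pattern)  # raises re.error on a malformed expression
    return {"name": name, "value": pattern}


if __name__ == "__main__":
    entry = regexp_entry("IPv4", PRIVATE_IPV4)
    # json.dumps doubles each backslash, as JSON strings require.
    print(json.dumps({"regexps": [entry]}, indent=2))
    for sample in ("10.0.0.1", "192.168.1.20", "172.32.0.1", "10a0b0c1"):
        print(sample, bool(re.fullmatch(PRIVATE_IPV4, sample)))
```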

3.2.18 Mime Type Alerts

The sensitivity.io engine displays an alert when files of the specified types (MIME types) are found.

Options:

  • name: string - Name for this group of mime-types
  • mime_types: list of mime-types - Mime-types that will display an alert if found

Format:

"mime_type_alerts": [
  {
    "mime_types": ["application/x-cpt"],
    "name": "Corel Photo-Paint"
  },
  {
    "mime_types": ["application/jnt-file"],
    "name": "Journal files"
  },
  {
    "mime_types": ["application/x-bzip2"],
    "name": "bz2"
  },
  {
    "mime_types": ["application/x-gzip"],
    "name": "GZ"
  },
  {
    "name": "ZIP",
    "mime_types": ["application/x-compress", "application/zip"]
  }
]
Settings option is mime_type_alerts with the above format
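
One way to read this structure is as a lookup from a detected MIME type to the alert names that apply to it. The sketch below builds such an index; it is an illustration only, `alerts_by_mime_type` is a hypothetical name, and collapsing repeated names is a choice made here rather than documented engine behaviour.

```python
from collections import defaultdict


def alerts_by_mime_type(mime_type_alerts: list) -> dict:
    """Index alert names by MIME type so a detected type can be looked up."""
    index = defaultdict(list)
    for alert in mime_type_alerts:
        for mime_type in alert["mime_types"]:
            # Skip a name that is already recorded for this MIME type.
            if alert["name"] not in index[mime_type]:
                index[mime_type].append(alert["name"])
    return dict(index)


if __name__ == "__main__":
    alerts = [
        {"mime_types": ["application/x-gzip"], "name": "GZ"},
        {"name": "ZIP", "mime_types": ["application/x-compress", "application/zip"]},
    ]
    print(alerts_by_mime_type(alerts))
```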

3.3 SDC module settings

These settings apply to the Sensitive Data Classification module.

3.3.1 Minimum Word Length

The minimum word length the classification engine will include in the model, expressed in number of characters.

Supported values >= 0 where 0 means no restriction.

Recommended: 2

"word_min_length": 2
Settings option is word_min_length with integer value

3.3.2 Maximum Word Length

The maximum word length the classification engine will include in the model, expressed in number of characters.

Supported values >= 0 where 0 means no restriction.

Recommended: 15

"word_max_length": 15
Settings option is word_max_length with integer value
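
Taken together, the two limits act as a length filter on candidate words. A minimal sketch, assuming both bounds are inclusive (this page does not say so explicitly) and treating 0 as "no restriction" as described above; `within_length_limits` is a hypothetical name:

```python
def within_length_limits(word: str, word_min_length: int = 2,
                         word_max_length: int = 15) -> bool:
    """Return True when the word length lies within the configured limits."""
    # A limit of 0 disables the corresponding check.
    if word_min_length and len(word) < word_min_length:
        return False
    if word_max_length and len(word) > word_max_length:
        return False
    return True


if __name__ == "__main__":
    words = ["a", "by", "confidential", "x" * 16]
    print([word for word in words if within_length_limits(word)])
```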

3.3.3 Vocabulary

List of words the model uses to create a pattern. This setting is mandatory: the list cannot be empty, and the words it contains cannot be empty.

Format:

"vocabulary": [
  "a",
  "adding",
  "and",
  "are",
  "buy",
  "by",
  "cheap",
  "cialis",
  "dog",
  "drugs",
  "ed",
  "email"
]
Settings option is vocabulary with a list of string values

3.3.4 Ignored Words

List of words that the sensitivity.io engine ignores when classifying. If present, the words cannot be empty.

Format:

"ignored_words": [
  "foo",
  "bar"
]
Settings option is ignored_words with a list of string values
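
The constraints on `vocabulary` and `ignored_words` can be checked before the settings are submitted. The following is a sketch under one reading of the rules above: `vocabulary` must be present and non-empty, and neither list may contain an empty string. `check_word_lists` is a hypothetical helper, not an SDK function.

```python
def check_word_lists(settings: dict) -> list:
    """Return a list of problems found in vocabulary and ignored_words."""
    problems = []
    vocabulary = settings.get("vocabulary")
    if not vocabulary:
        problems.append("vocabulary is mandatory and cannot be empty")
    elif any(word == "" for word in vocabulary):
        problems.append("vocabulary cannot contain empty words")
    ignored = settings.get("ignored_words")
    # ignored_words is optional, so it is only checked when present.
    if ignored is not None and any(word == "" for word in ignored):
        problems.append("ignored_words cannot contain empty words")
    return problems


if __name__ == "__main__":
    print(check_word_lists({"vocabulary": ["buy", "cheap"], "ignored_words": ["foo", ""]}))
```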

3.3.5 Model

The sensitivity.io engine uses this SVM model to classify documents. This setting is always required.

Format:

"model": "svm_type c_svc\nkernel_type rbf\ngamma 0.0263158\nnr_class 2\ntotal_sv 10\nrho 0.23536\nlabel 1 0\nnr_sv 5 5\nSV\n1 30982880:1.5832638e-316 0:0 25:1 29:1 36:1 \n1 8:1 11:1 27:1 34:1 \n1 3:1 5:1 8:1 10:1 20:1 27:1 29:1 34:1 \n1 5:1 6:1 10:1 22:1 \n1 5:1 7:1 34:1 \n-1 1:1 9:1 16:1 18:1 19:1 24:1 26:1 28:1 \n-1 1:1 9:1 23:1 26:1 28:1 31:1 38:1 \n-1 1:1 4:1 13:1 14:1 19:1 30:1 33:1 35:1 37:1 \n-1 2:1 12:1 19:1 21:1 32:1 33:1 \n-1 4:1 15:1 17:1 19:1 37:1 \n"
Settings option is model with string value
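
The model string appears to follow the LIBSVM model file layout: a header of `key value` lines, an `SV` marker, then one support vector per line, all joined with `\n`. As a rough illustration, the sketch below reads the header of such a string. `parse_model_header` is a hypothetical helper, and the sample reuses the header from the example above with the support vectors truncated for brevity.

```python
def parse_model_header(model: str) -> dict:
    """Return the key/value header lines that precede the SV section."""
    header = {}
    for line in model.split("\n"):
        if line.strip() == "SV":
            break  # everything after this marker is support vector data
        key, _, value = line.partition(" ")
        header[key] = value
    return header


if __name__ == "__main__":
    # Header copied from the example above; support vectors truncated.
    sample = (
        "svm_type c_svc\nkernel_type rbf\ngamma 0.0263158\nnr_class 2\n"
        "total_sv 10\nrho 0.23536\nlabel 1 0\nnr_sv 5 5\nSV\n"
        "1 8:1 11:1 27:1 34:1 \n"
    )
    for key, value in parse_model_header(sample).items():
        print(key, "=", value)
```

A quick look at `svm_type`, `kernel_type` and `total_sv` can confirm that the string was not truncated or mangled when it was pasted into the settings.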