Changelog

Version 1.1.6

Release date: 28 Nov 2019

New features:

  • added CentOS 8
  • new ID card PIIs: Ecuador, Malaysia, Mexico, Greece
  • new Passport PII: Canada
  • new TaxID PII: Ecuador
  • new Phone number PIIs: India, Ukraine, Colombia, US (North America)
  • new Health Insurance Number PII: Canada
  • new Address PIIs: Colombia, India
  • added text and metadata extraction from older Visio files (Visio 2003-2010)
  • calculating line number and offset while scanning plain text files (counters start from 0)
  • Python SDK (Linux only; Python >= 3.3 required)
  • PHP SDK (Linux only; PHP >= 5.5 required)
  • Ruby SDK (Linux only)
  • Go SDK (Linux only)
  • added openSUSE Leap 15.1
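
A minimal sketch of the zero-based line number and offset counting mentioned above. This is not the SDK implementation: the helper name locate() is hypothetical, and it assumes the offset is counted in characters from the start of the line (the changelog does not say whether the offset is relative to the line or to the file).

```python
from typing import Tuple


def locate(text: str, position: int) -> Tuple[int, int]:
    """Return zero-based (line_number, offset_in_line) for a character position.

    Hypothetical helper for illustration only; not part of the SDK.
    """
    if not 0 <= position <= len(text):
        raise ValueError("position out of range")
    # Number of newlines before the position gives the zero-based line number.
    line_number = text.count("\n", 0, position)
    # rfind() returns -1 on the first line, so +1 yields a line start of 0.
    line_start = text.rfind("\n", 0, position) + 1
    return line_number, position - line_start


sample = "name: Ann\ncard: 4111 1111\n"
match_start = sample.index("4111")
print(locate(sample, match_start))  # second line, so the line number is 1
```

With counters starting from 0, a match on the first character of a file is reported as line 0, offset 0.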

Improvements:

  • added ThreatLocation::GetMaskedSurroundingText()
  • updated CentOS 7 to 7.7
  • updated PDFLib TET to version 5.2p1
  • text_extractor: support max_file_size, ignored_mime_types, ignored_extensions, max_recursion_level and extract_metadata in both modules (sds/sdc)
  • text extractor: extract content and metadata from Gzip, TAR and .tar.gz archives
  • text extractor: extract textual content from .hwp (Hanword Document) files
  • PDF metadata extraction: extract PDF signatures
  • added additional restriction rules for German phone numbers and improved their delimiters
  • Improve credit card detection
  • Detect e-mail inside parentheses
  • sdc module working on all platforms/bindings
  • C++ scan/classify buffer: added overload for DynamicByteArray
  • scanner/classifier executor: added buf to file (writes the memory buffer to a file so that memory is released while the task is in the queue)
  • scan buffer: buffer no longer truncated if bigger than 2GB
  • ThreatLocation lineNumber and offset are int64
  • for performance ThreatInfos was changed from std::vector to QVector
  • extended scanner (executor) convenience API with maxNumThreats (a generalization of stopAtFirstThreat)
  • (cppdevtk) JNI_VERSION: 1.6 on Android, 1.8 on remaining platforms
  • Java: core SDK only
  • Java: available on all platforms except iOS
  • Java compliance: 1.7 on Android, 1.8 on remaining platforms
  • Java: implement AutoCloseable and use try-with-resources (we now use Java >= 7, Android API level 19)
  • Java: Linux packaging
  • Java Android: moved ApplicationContext from com.cososys.sensitivityio.android_utils to com.cososys.sensitivityio.base
  • CLI and C#: renamed ThreatHandler::Execute() to HandleThreat()
  • documentation separated for each main module (sds/sdc)
  • Linux: separated documentation packages
  • php extensions: enabled symbol visibility
  • php extensions: dependencies
  • removed Qt version tests for Qt < 5.6.3
  • removed SENSITIVITYIO_HAVE_CURL (always true)
  • removed file magic packages
  • breaking API (C, low impact, normally not used directly): renamed sensitivityio_sds_destroyer_t to sensitivityio_sds_threat_handler_data_destroyer_t
  • breaking API + performance improvement (C++/C): scanner/classifier(executor): functions with raw buffer have a destroyer so that buffer is copied internally or destroyed
  • log to file
  • (cppdevtk) executors: improved cancellation
  • executor based test apps: enabled cancellation and handled isResultReadyAt(0) in addition to isStarted()/isCanceled()/isFinished()
  • large file support checks
  • qmake files cleanup
  • iOS >= 10
  • Android >= 4.4.2 (API level 19)
  • updated Qt to 5.9.7 on Ubuntu 16.04 and 14.04
  • updated Qt to 5.9.8 on Win, Mac and iOS
  • updated cppdevtk to v1.1.1
  • updated Ubuntu to 18.04.3, 16.04.6 and 14.04.6
  • updated CentOS 7 to 7.6
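
As an illustration of how maxNumThreats generalizes stopAtFirstThreat, here is a hedged sketch. The functions scan() and scan_stop_at_first() and the e-mail pattern are hypothetical stand-ins, not the SDK API; the only point shown is that stopping at the first threat is the special case of a limit of 1.

```python
import re
from typing import Iterable, List, Optional

# Simplified e-mail pattern, used only to have something to detect.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def scan(lines: Iterable[str], max_num_threats: Optional[int] = None) -> List[str]:
    """Collect matches, stopping early once max_num_threats have been found."""
    if max_num_threats is not None and max_num_threats < 1:
        raise ValueError("max_num_threats must be at least 1")
    found: List[str] = []
    for line in lines:
        for match in EMAIL.finditer(line):
            found.append(match.group())
            if max_num_threats is not None and len(found) >= max_num_threats:
                return found  # limit reached: skip the rest of the input
    return found


def scan_stop_at_first(lines: Iterable[str]) -> List[str]:
    """Stopping at the first threat is the same as a limit of one."""
    return scan(lines, max_num_threats=1)


lines = ["a@x.org b@y.org", "c@z.org"]
print(scan(lines, max_num_threats=2))
print(scan_stop_at_first(lines))
```

Passing no limit scans the whole input, which corresponds to the behavior without stopAtFirstThreat.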

Fixed bugs:

  • memory access violations when scanning corrupt .xls and .doc files
  • infinite loop when scanning corrupt .xls and .doc files
  • infinite loop when scanning corrupt .zip archives or .docx files
  • use charset detection and conversion in plain text files to properly handle unicode characters
  • fixed issue with delimiters related to the following PIIs: Passport China, Passport Hong Kong, Passport Macao, Passport Japan
  • (cppdevtk) executors: workaround for Qt bug #6799
  • executors: if Qt < 5.6.2 then may be affected by Qt bug #54831
  • PHP extensions: workaround for symbol visibility on Ubuntu
  • fixed settings file reading on Win (workaround for QTextStream::readAll() that does not detect UTF-8 properly)

No longer supported:

  • openSUSE Leap 42.3: EOL since 1 July 2019
  • Ubuntu 14.04: EOL since April 2019
  • CentOS 6: we enabled C++11 in our code

Version 1.1.4

Release date: 11 Jul 2018

New features:

  • Support for PDF on Linux
  • Support for Ubuntu 18.04
  • Support for openSUSE Leap 15
  • Tax ID: br (Brazil CNPJ), bg (Bulgaria EIK), au (Australia TFN), es (Spain)
  • ID Card: ae (United Arab Emirates), il (Israel), in (India Aadhaar), fr (France CNI)
  • SSN: gr (Greece), ru (Russia), ie (Irish Personal Public Service Number)
  • Driver’s License: gb (The United Kingdom of Great Britain), de (Germany)
  • Passport: de (Germany)
  • Credit Card: Carte Blanche
  • SWIFT (BIC)

Improvements:

  • Preserve original case (upper/lower) when reporting matched dictionary words
  • Improved sensitive information detection in file names
  • Decrease false positive detections by discarding certain ‘threats’ that are not immediately followed by a delimiter
  • Remove credit card detection false positives in XLSX charts and drawings
  • GCC/Clang visibility
  • Updated CppDevTk to 1.1.0
  • Updated CentOS 7 to 7.5
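
The delimiter rule above can be sketched as follows. This is an illustration under stated assumptions, not the SDK's detection logic: the delimiter set, the SSN-like pattern and the function find_delimited() are all hypothetical, and treating the end of the text as a delimiter is an assumption as well.

```python
import re
from typing import List

# Assumed delimiter set; the real one is not documented in this changelog.
DELIMITERS = " \t\r\n.,;:!?()[]\"'"
# SSN-like pattern, used only as an example candidate.
CANDIDATE = re.compile(r"\d{3}-\d{2}-\d{4}")


def find_delimited(text: str) -> List[str]:
    """Keep only candidates that are immediately followed by a delimiter."""
    kept: List[str] = []
    for match in CANDIDATE.finditer(text):
        end = match.end()
        # End of text is treated as a delimiter (assumption).
        if end == len(text) or text[end] in DELIMITERS:
            kept.append(match.group())
    return kept


# The second candidate is followed by another digit, so it is discarded.
print(find_delimited("id 123-45-6789, ref 987-65-43210"))
```

A candidate that runs straight into more digits or letters is likely part of a longer token, which is why dropping it reduces false positives.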

Fixed bugs:

  • Sweden ID detection
  • Extracting PDF text even if the metadata is in an unsupported format
  • Crash due to corrupt properties table in OLE / CDF documents (.doc, .xls)
  • Crash in 32 bit versions when parsing certain .doc metadata

Version 1.1.3

Release date: 06 Feb 2018

New features:

  • Text extractor: extract metadata separately from data for supported file types and provide settings to control what is extracted (data or metadata). Supported file types are: compound document format based files (.doc, .xls, .ppt), PDF, Office Open XML files (.docx, .xlsx, .pptx) and ZIP archives
  • Text extractor: ZIP comments
  • Text extractor & Scanner: provide detection location information (surrounding text - 24 bytes UTF8)
  • Text extractor: improved text extraction from .doc, .xls, .ppt and .rtf documents
  • Extended Credit Cards formats with 2018 patterns: Mastercard, Visa, American Express, JCB, Diners, Discover, MIR, Maestro, China UnionPay
  • Scanner: New PIIs: Portugal ID, Iceland ID, Italy VAT, Greece VAT, Finland VAT, Hungary SSN, Netherlands SSN, Luxembourg SSN, Cyprus SSN, Luxembourg VAT, Ireland VAT, Slovenia VAT, Ireland passport, Norway ID, Albania ID, Bulgaria ID, Italy ID, Latvia ID, Estonia ID, Croatia ID, Yugoslavia ID, Lithuania ID

Version 1.1.2

Release date: 17 Oct 2017

New features:

  • VAT Number feature

Improvements:

  • sds new feature: VAT
  • sds new threats: sdsthtidIdTh, sdsthtidIdDk, sdsthtidIdCl, sdsthtidPassportFr, sdsthtidPassportFi, sdsthtidDrivingLicenseIt, sdsthtidPhoneDe, sdsthtidAreaCodesDe, sdsthtidVatDe
  • use TLS 1.2 on Mac OS X if supported
  • SDS settings & large dictionary load time improvement
  • cli: enabled std thread library usage
  • CppDevTk updated to 1.0.3
  • Qt updates: Windows & Mac to 5.6.3, iOS to 5.9.2, Android fixed to 5.6.1-1 due to QTBUG-63407
  • iOS minimum supported version increased from 6.0 to 8.0
  • Android: crypto++ and curl shared libs
  • per server side request: settings GET
  • android: https workarounds

Fixed bugs:

  • VC++ C# CLI bug
  • ScanBuffer() now works for UTF8, UTF16 and Unicode

Version 1.1.1

Release date: 14 Aug 2017

New features:

  • disabled the Java-based text extractor and restored the C++-based one

Improvements:

  • updated SSN US
  • better text extractor for older Office formats

Version 1.1.0

Release date: 7 Aug 2017

New features:

  • sdc (sensitive data classification) public release
  • new internal text extractor; default for mobile platforms, all modules
  • new third party Java based text extractor; default for desktop platforms, all modules
  • removed proprietary pdf text extraction library
  • internal: added console test app for text extractors

Improvements:

  • packaging cleanup
  • sdc: improved cancellation
  • sdc error reporting: added NoKnownDataException (SENSITIVITYIO_ENOKNOWNDATA in C)
  • sdc: simplified train and verify apps; removed older multipart ones
  • test apps: for convenience use default sample files/dirs
  • JNI review
  • enabled JNI on Win32

Fixed bugs:

  • fixed regression: Scanner::ScanBuffer() cancellation

Version 1.0.1

Release date: 3 Jul 2017

New features:

  • PHP 7 support
  • Debian based packaging

Improvements:

  • error reporting: ported UnsupportedMimeTypeException in bindings (SENSITIVITYIO_EUNSUPPORTEDMIMETYPE in C)
  • error reporting: added NetworkException and HttpException; ported in bindings (ENETUNREACH and EPROTO in C)
  • documentation review
  • php extensions: ini file

Fixed bugs:

  • fixed PHP pass parameters by value for PHP >= 5.6

Version 1.0.0

Release date: 22 May 2017

New features:

Improvements:

Fixed bugs: