
Data Anonymization for Reports

Data anonymization in the Core Reporter is used to mask sensitive information in reports. It is available at two levels: reporting-level and loading-level.

Originally, only reporting-level anonymization was supported. To provide stronger protection and enable high-level anonymization, loading-level anonymization was later introduced. This type specifically addresses the anonymization of Level 3 data.

  • Reporting-level - With this setup, administrators can always view the true values of the specified classifications. However, users assigned to the designated group will only see masked data.

  • Loading-level - With this type of data anonymization, the database stores only anonymized values, ensuring that real data is never exposed. However, some real data may still remain in the archive.

Requirements

  • Administrator access to the Core Reporter interface
  • Complete archive data for anonymization

Configuring anonymize.cfg

The anonymize.cfg file defines how data is anonymized (masked) in reports generated by the Core Reporter. This configuration file controls which classifications are hidden and at what level of anonymization they are applied.

  1. Open the anonymize.cfg file in the Configuration directory, located by default at C:\Program Files\OpeniT\Core\Configuration.

  2. The file includes sample configurations that are commented out. To use one, remove the comment marker (#) from the desired configuration.

    Example
    Anonymize: HashHost
    classvars = Host name(92 97:11)
    strategy = hash-sha1
    placement = loading

    This configuration triggers the anonymization of host names of Level 3 data when the database is loaded. The host names are masked as hash strings (e.g., 56468421bc9c5ce).

  3. To create your own configuration, follow the required configuration format below.

    anonymize.cfg
    Anonymize : <Class>
    classvars = <List of classification names(data types)>
    strategy = Hash-sha1 | Serialize | List (<filename>)
    placement = reporting | loading
    Anonymization Parameters and Descriptions

      • Anonymize - Specifies the name of the anonymization rule. It is recommended to use short names, as these will appear in the output when the serialize strategy is selected. For example, use User for user name and user ID anonymization, or Host for host name anonymization.

      • classvars - Specifies the classifications to be anonymized. You may optionally limit anonymization to specific data types by writing the data type numbers in a space-separated list after the classification. If no data types are specified, anonymization is applied to all data types that include the listed classifications. For reference, you can check the list of available classifications on the specific data type pages.

      • strategy - Specifies the anonymization method. Valid values are hash-sha1, serialize, or list. If no strategy is specified, serialize is used by default.

      • placement - Specifies when anonymization is applied. Valid values are reporting or loading. If no placement is specified, reporting is used by default.

    Parameter Notes:

    • For classvars, if any of the raw data types is specified, namely (93) OLAP Raw Hourly, (96) OLAP Winapp Raw Hourly, (97) OLAP Freeze Raw Hourly, or (103) OLAP Featureset PPU, the field must be identified explicitly by its index value, starting from 0 and written after the data type number and a colon (for example, 97:11). In the configuration shown in step 2, the host name is stored in field 11 of data type (97) OLAP Freeze Raw Hourly.

      Example OLAP Freeze Raw Hourly Data
      1584316800:600:desktop09;dsls:DER;ALC:all: : :Design Reviewer:all:non-prime:user21:desktop71:Active:dsls:300:100:0.00015:1.25:600:amd64:windows_10_enterprise

      For clarity, here is the same data split into fields with index numbers:

      OLAP Freeze Raw Hourly Fields
      Index  Value
      0      1584316800
      1      600
      2      desktop09;dsls
      3      DER;ALC
      4      all
      5      (blank)
      6      (blank)
      7      Design Reviewer
      8      all
      9      non-prime
      10     user21
      11     desktop71
      12     Active
      13     dsls
      14     300
      15     100
      16     0.00015
      17     1.25
      18     600
      19     amd64
      20     windows_10_enterprise

      Here, field 11 corresponds to the host name (desktop71). This value will be anonymized into a hash string as specified in the configuration.

      warning

      When a classification name is limited to specific data types, place it on its own line under classvars instead of combining it with other classification names.

      • Valid: classvars = User name(46 49)
      • Not valid: classvars = User name, User(46 49)
    • For strategy, if the value is list, the entries are loaded from a user-created file. This file should be placed in the same directory as anonymize.cfg, which by default is located at C:\Program Files\OpeniT\Core\Configuration. The file must contain only names, with each line interpreted as a separate item; comments or additional information should not be included.

      note

      You can use common text-based file extensions, such as .txt or .list.

      Example with list strategy
      Anonymize : Host
      classvars = Host, Host name
      strategy = List (hostnames.list)

      This configuration anonymizes the Host and Host name classifications for all data types using list-based anonymization from the hostnames.list file. If all entries in this file are exhausted, the remaining values are anonymized using serialization, for example, Host1, Host2, and so on.

      warning

      To avoid unexpected issues, make sure that the file is placed in the Configuration directory.

  4. Save the changes.
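The classvars notes above describe how an entry such as Host name(92 97:11) points at a field in a raw record. The Python sketch below illustrates that mapping only; it is not how the Core Reporter reads the file. The helper names parse_classvars_entry and field_for are hypothetical, and the colon-separated record layout is an assumption inferred from the sample OLAP Freeze Raw Hourly line.

```python
import re

# Sample record copied from the OLAP Freeze Raw Hourly example above.
RAW_RECORD = (
    "1584316800:600:desktop09;dsls:DER;ALC:all: : :Design Reviewer:all:"
    "non-prime:user21:desktop71:Active:dsls:300:100:0.00015:1.25:600:"
    "amd64:windows_10_enterprise"
)


def parse_classvars_entry(entry):
    """Hypothetical helper: split 'Host name(92 97:11)' into its parts.

    Returns (classification, {data_type: field_index or None}).
    """
    match = re.fullmatch(r"\s*([^()]+?)\s*(?:\(([^()]*)\))?\s*", entry)
    if match is None:
        raise ValueError(f"unrecognized classvars entry: {entry!r}")
    name, type_list = match.group(1), match.group(2)
    data_types = {}
    for token in (type_list or "").split():
        # '97:11' carries a field index; '92' does not.
        data_type, _, index = token.partition(":")
        data_types[int(data_type)] = int(index) if index else None
    return name, data_types


def field_for(entry, data_type, raw_record):
    """Return the raw field that the entry selects for the given data type."""
    _, data_types = parse_classvars_entry(entry)
    index = data_types.get(data_type)
    if index is None:
        raise KeyError(f"no field index given for data type {data_type}")
    # Assumption: raw fields are colon-separated and indexed from 0.
    return raw_record.split(":")[index]
```

With the sample record, field_for("Host name(92 97:11)", 97, RAW_RECORD) selects desktop71, and an entry ending in 97:10 would select user21.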

Examples

Example 1
Anonymize: HashUser
classvars = User name(92 97:10 98)
strategy = hash-sha1
placement = loading

This example anonymizes user names in the License Optimizer data types (92) License Optimizer Actions, (97) OLAP Freeze Raw Hourly, and (98) License Optimizer Individual Usage using hashing during the database loading process. Because data type 97 is a raw data type, the user name is identified by its field index, 10.
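To give a feel for what hash-based masking does, the sketch below hashes a user name with SHA-1 using Python's standard hashlib module. It illustrates the general technique only: whether the Core Reporter salts, truncates, or otherwise post-processes the digest is not documented here, and the sample hash shown earlier (56468421bc9c5ce) is shorter than a full 40-character SHA-1 hex digest, so the product's output format may differ. The function name hash_value is hypothetical.

```python
import hashlib


def hash_value(value):
    """Return the SHA-1 hex digest of a string (UTF-8 encoding is an assumption)."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()


# The same input always produces the same digest, so a masked user can still
# be counted and grouped in reports without revealing the real name.
masked = hash_value("user21")
print(masked)
```

Hashing is deterministic and one-way, which is why it suits loading-level anonymization: the database never needs to hold the original value.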


Example 2
Anonymize : License_User
classvars = User name(46 49)

Anonymize : User
classvars = User, User name, UID

In this example, User name values from data types (46) Individual License Use v2.0 and (49) Host User License Use are serialized as License_User1, License_User2, and so on. User, User name, and UID values from all other data types are serialized as User1, User2, and so on. Because neither rule sets a strategy or placement, both use the defaults: the serialize strategy applied at reporting level.
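The serialization behavior in Example 2 can be pictured with the Python sketch below. It is a simplified model under stated assumptions, not the Core Reporter's implementation: the Serializer class and mask_user_name function are hypothetical, and numbering values in order of first appearance is an assumption. The optional names argument models the list strategy described earlier, where values fall back to serialized names once the list is exhausted; restarting the fallback counter at 1 is also an assumption.

```python
class Serializer:
    """Maps each distinct value to a replacement name, reusing earlier results."""

    def __init__(self, prefix, names=()):
        self.prefix = prefix      # rule name, e.g. "User" or "License_User"
        self.names = list(names)  # optional names for the list strategy
        self.mapping = {}         # real value -> masked value

    def mask(self, value):
        if value not in self.mapping:
            position = len(self.mapping)
            if position < len(self.names):
                # List strategy: take the next unused name from the file.
                masked = self.names[position]
            else:
                # Serialize (or list exhausted): <prefix>1, <prefix>2, ...
                masked = f"{self.prefix}{position - len(self.names) + 1}"
            self.mapping[value] = masked
        return self.mapping[value]


# The two rules from Example 2.
license_user = Serializer("License_User")
user = Serializer("User")


def mask_user_name(data_type, value):
    """Pick the rule for a User name value based on its data type."""
    rule = license_user if data_type in (46, 49) else user
    return rule.mask(value)
```

On fresh state, mask_user_name(46, "alice") yields License_User1, while the same name seen in any other data type yields User1, mirroring the split described in Example 2.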

Post-Configuration Steps

After updating the anonymize.cfg file, follow these steps to apply the changes and verify anonymization.

note

The duration of the processes listed below will vary based on the amount of data being anonymized.

  1. Regenerate the data.

  2. Process the data.

  3. Update the database.

  4. Verify anonymization by creating a report in Complete Selection.