Automatic File Classification

Automatically classify and protect sensitive data across your organization by leveraging Google's built-in Data Loss Prevention (DLP) rules.

How can it help me?

  • Automatically safeguard your organization's sensitive information, reducing the risk of data breaches and compliance violations with minimal effort.

How to use

Set up training

Create the classification label

The classification label is the label the AI model will automatically apply to your sensitive Drive files after the model is trained. The model will be trained on and use only one field per label. The AI-set field must be either a badged or option list field type. For more information about labels, go to Get started as a Drive labels admin.

When used as a classification label, an option list or badged field must meet these requirements:

  • Have at least 2 and no more than 7 options

  • Be published

If you have an existing label that meets these requirements, you can use it as a classification label. Otherwise, create a label.
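
The requirements above can be expressed as a quick pre-check before you pick a label. The following Python sketch is illustrative only: the `Label` and `LabelField` classes and the field type strings are hypothetical stand-ins, not the Drive Labels API schema, and the function checks only the requirements listed in this section.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified model of a Drive label (not the Drive Labels API schema).
ALLOWED_FIELD_TYPES = {"option_list", "badged"}
MIN_OPTIONS, MAX_OPTIONS = 2, 7


@dataclass
class LabelField:
    field_type: str
    options: list[str] = field(default_factory=list)


@dataclass
class Label:
    name: str
    published: bool
    ai_field: LabelField  # the single field the AI model will set


def classification_label_problems(label: Label) -> list[str]:
    """Return the requirements the label fails; an empty list means it qualifies."""
    problems = []
    if label.ai_field.field_type not in ALLOWED_FIELD_TYPES:
        problems.append("field must be an option list or badged field")
    if not MIN_OPTIONS <= len(label.ai_field.options) <= MAX_OPTIONS:
        problems.append("field must have between 2 and 7 options")
    if not label.published:
        problems.append("label must be published")
    return problems
```

An empty result suggests an existing label can be reused as the classification label; otherwise, each returned message points to something to fix in Label manager.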

Create the training label

We recommend that you create the training label during label selection (next step), when you can create it automatically. This guarantees the training label will match the classification label in all the required ways.

If you choose to create the training label before label selection:

  • Make sure the label meets the required label criteria.

  • Identify the training label with the word "training" to make it easier for your trusted labelers to recognize the label and apply it during the training period.

  • Add a description field to the training label to further help trusted labelers understand its purpose.

Select labels and enable training

  1. Sign in to the Google Admin console with an administrator account.

  2. In the Admin console, go to Menu > Security > Access and data control > Label manager.

  3. In AI classification for Google Drive, click Set up training.

  4. For Select classification label, click Select Label.

  5. Select the label you want AI classification to use and the field it will set.

  6. For Select training label, click Create training label.

    This automatically creates a training label with the same attributes as your Classification label.

  7. To make sure the new label is available to your designated labelers, click Update label permissions. This opens the label in Edit mode in label manager in a separate tab.

    Note: You can also set label permissions later, but make sure that only your labelers have access to the training label.

  8. Click Permissions, then Edit, and grant the Can apply labels and set values permission to the configuration group that contains your labelers.

  9. Click Save and close the label manager tab.

    After selecting both the classification label and training label, the Enable training button is enabled.

  10. Click Enable training.

    Important: If you get an error message when you try to enable training, your classification label and training label don’t match. Review the label requirements, make sure both labels meet all of them, then enable training again.

After you enable training, the Data classification page shows your selected Training label and Classification label.

  • The Classification label shows Not ready. After training is done, the label status changes to Ready.

  • The Auto-apply status shows Off for everyone. Once the Classification label status is Ready, you can change the Auto-apply status to On.

Next, your designated labelers need to start applying the Training label to your sensitive files.

Train the model

To successfully train the AI model, your designated labelers should label at least 100 files per option. For example, if your label has 3 options, it should be applied to at least 300 files in total. The AI model checks training every 1–2 weeks and shows Ready once it has 100 or more examples for each label option. Learn more about high-quality examples.

During the training period, you can check how many files have been labeled and how the model's accuracy is improving.

Note: You can use up to 1 million training files in total.
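
To plan the labeling effort, the thresholds above reduce to simple arithmetic. The Python sketch below assumes you track per-option counts yourself, for example by copying them from the model details page. The function names and the `labeled_counts` dictionary are hypothetical and do not correspond to any Admin console API.

```python
MIN_FILES_PER_OPTION = 100       # minimum labeled files per label option
MAX_TRAINING_FILES = 1_000_000   # total training file limit


def minimum_training_files(option_count: int) -> int:
    """Smallest total number of labeled files for a label with this many options."""
    return option_count * MIN_FILES_PER_OPTION


def options_needing_files(labeled_counts: dict[str, int]) -> dict[str, int]:
    """Map each option that is below the minimum to the number of files still needed."""
    return {
        option: MIN_FILES_PER_OPTION - count
        for option, count in labeled_counts.items()
        if count < MIN_FILES_PER_OPTION
    }


def within_training_limit(labeled_counts: dict[str, int]) -> bool:
    """Check the combined count against the total training file limit."""
    return sum(labeled_counts.values()) <= MAX_TRAINING_FILES
```

For a label with 3 options, `minimum_training_files(3)` returns 300, matching the example above. With counts of 120, 100, and 40 files across three options, `options_needing_files` reports that only the third option still needs 60 more files.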

To check progress during the training period:

  1. In your Admin console, go to Security > Data classification.

  2. Click View model details.

    • For Training label, Training files shows the number of files that have been labeled for each option.

    • Each label option has a Score that shows the percentage of training examples the model classified correctly after testing itself.

      • Low: Below 50%. The model needs better data and isn’t ready yet.

      • Medium: 50–80%. The model may be ready on a limited basis.

      • High: Above 80%. The model is ready to classify files for your organization.
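
If you record scores over time, the bands above can be mapped with a small helper. This Python sketch is illustrative only: `score_band` is a hypothetical function, and it assumes that scores of exactly 50% and exactly 80% both fall in the Medium band, as the ranges above imply.

```python
def score_band(score_percent: float) -> str:
    """Map a model score (0-100) to the Low, Medium, or High band."""
    if not 0 <= score_percent <= 100:
        raise ValueError("score_percent must be between 0 and 100")
    if score_percent < 50:
        return "Low"      # the model needs better data and isn't ready yet
    if score_percent <= 80:
        return "Medium"   # the model may be ready on a limited basis
    return "High"         # the model is ready to classify files
```

A per-option check such as `all(score_band(s) == "High" for s in scores)` is one way to decide whether to move on to turning on auto-apply.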

Turn on the auto-apply of labels

After the AI model is trained to achieve a high level of accuracy, you’re ready to choose label options and turn on the auto-applying of labels. Follow these steps:

  1. In your Admin console, go to Security > Data classification.

  2. In AI Classification, verify that the Classification label shows a status of Ready.

  3. Click View model details.

  4. For Classification label, check the boxes for the label options you want to allow the AI model to auto-apply.

  5. Click Turn on auto-apply.

  6. Search for and select the organizational unit or group whose members should have labels applied automatically. For example, if you select the group "Finance", you can then choose which labels are configured for Finance.

  7. Click On - Label is auto-applied.

    Options for how the label is applied are listed under the On option.

  8. Click Save.

  9. On the Data classification main page, confirm that the Auto-apply status for the rule has changed to On.

Tips & Tricks

  • Use AI classification in combination with standard DLP detectors to achieve more accurate protection.