Friday, 30 April 2021

Data Classification Process (MIS) (MIS 30.04.2021)

Data classification is the process of analysing structured or unstructured data and organizing it into categories based on file type, contents, and other metadata.

Data classification helps organizations answer important questions about their data that inform how they mitigate risk and manage data governance policies. It can tell you where you are storing your most important data or what kinds of sensitive data your users create most often. Comprehensive data classification is necessary (but not sufficient) to comply with modern data privacy regulations.

 

Effective Information Classification in Five Steps

1. Establish a data classification policy, including objectives, workflows, data classification scheme, data owners, and handling.

2. Identify the sensitive data you store.

3. Apply labels by tagging data.

4. Use results to improve security and compliance.

5. Repeat regularly: data is dynamic, and classification is an ongoing process.

 

Data Classification Process

Data classification processes differ slightly depending on the objectives for the project. Most data classification projects require automation to process the large volume of data that companies create every day. In general, there are some best practices that lead to successful data classification initiatives:

1. Define the Objectives of the Data Classification Process

·      What are you looking for? Why?

·      Which systems are in-scope for the initial classification phase?

·      What compliance regulations apply to your organization?

·      Are there other business objectives you want to tackle? (e.g., risk mitigation, storage optimization, analytics)

2. Categorize Data Types

·      Identify what kinds of data the organization creates (e.g., customer lists, financial records, source code, product plans)

·      Delineate proprietary data vs. public data

·      Do you expect to find GDPR, CCPA, or other regulated data?

3. Establish Classification Levels

·      How many classification levels do you need?

·      Document each level and provide examples

·      Train users to classify data (if manual classification is planned)

4. Define the Automated Classification Process

·      Define how to prioritize which data to scan first (e.g., prioritize active over stale, open over protected)

·      Establish the frequency and resources you will dedicate to automated data classification

5. Define the Categories and Classification Criteria

·      Define your high-level categories and provide examples (e.g., PII, PHI)

·      Define or enable applicable classification patterns and labels

·      Establish a process to review and validate both user-classified and automated results

6. Define Outcomes and Usage of Classified Data

·      Document risk mitigation steps and automated policies (e.g., move or archive PHI if unused for 180 days, automatically remove global access groups from folders with sensitive data)

·      Define a process to apply analytics to classification results

·      Establish expected outcomes from the analysis

7. Monitor and Maintain

·      Establish an ongoing workflow to classify new or updated data

·      Review the classification process and update if necessary due to changes in business or new regulations
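
As a rough illustration of steps 4 and 5, the sketch below classifies text with pattern matching and orders files for scanning. The category patterns, the FileInfo fields, and the choice to rank open access ahead of recent activity are all assumptions made for this example, not part of any specific product.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical categories and patterns, for illustration only. Real
# detectors need extra validation (checksums, context) to limit false positives.
PATTERNS = {
    "PII": [
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN-like number
        re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),  # email address
    ],
    "PHI": [
        re.compile(r"\b(diagnosis|patient id|medical record)\b", re.IGNORECASE),
    ],
}


@dataclass
class FileInfo:
    path: str
    days_since_access: int  # smaller means more active
    open_access: bool       # True if broadly accessible


def classify(text: str) -> set[str]:
    """Return the category labels whose patterns match the text."""
    return {
        label
        for label, patterns in PATTERNS.items()
        if any(p.search(text) for p in patterns)
    }


def scan_order(files: list[FileInfo]) -> list[FileInfo]:
    """Scan open files before protected ones, then active before stale."""
    return sorted(files, key=lambda f: (not f.open_access, f.days_since_access))
```

Automated results like these would still go through the review and validation process described in step 5, since simple patterns produce both false positives and false negatives.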

 

Purpose of Data Classification

In the most recent Market Guide for File Analysis Software, Gartner lists four high-level use cases:

1. Risk Mitigation

·      Limit access to personally identifiable information (PII)

·      Control location and access to intellectual property (IP)

·      Reduce attack surface area to sensitive data

·      Integrate classification into DLP and other policy-enforcing applications

2. Governance/Compliance

·      Identify data governed by GDPR, HIPAA, CCPA, PCI, SOX, and future regulations

·      Apply metadata tags to protected data to enable additional tracking and controls

·      Enable quarantining, legal hold, archiving and other regulation-required actions

·      Facilitate “Right to be Forgotten” and Data Subject Access Requests (DSARs)

3. Efficiency and Optimization

·      Enable efficient access to content based on type, usage, etc.

·      Discover and eliminate stale or redundant data

·      Move heavily utilized data to faster devices or cloud-based infrastructure

4. Analytics

·      Enable metadata tagging to optimize business activities

·      Inform the organization on location and usage of data

It’s important to note that classifying data, while a foundational first step, is not typically enough on its own to take meaningful action on many of the above use cases. Adding other metadata streams, such as permissions and data usage activity, can dramatically increase your ability to use your classification results to achieve key objectives.
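
To make this concrete, here is a minimal sketch of combining classification labels with permissions and usage data. It reuses the two example policies mentioned earlier (archive PHI unused for 180 days, remove global access groups from folders with sensitive data). The Folder fields and the group names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

STALE_AFTER_DAYS = 180  # threshold from the example policy in the text
GLOBAL_GROUPS = {"Everyone", "Authenticated Users"}  # hypothetical group names


@dataclass
class Folder:
    path: str
    labels: set[str]          # classification results, e.g. {"PHI"}
    access_groups: set[str]   # groups that can access the folder
    days_since_last_use: int  # from usage activity data


def recommend_actions(folder: Folder) -> list[str]:
    """Suggest remediation steps for a folder that holds sensitive data."""
    actions: list[str] = []
    if not folder.labels:
        return actions  # nothing sensitive found, nothing to do
    exposed = folder.access_groups & GLOBAL_GROUPS
    if exposed:
        actions.append("remove global access: " + ", ".join(sorted(exposed)))
    if "PHI" in folder.labels and folder.days_since_last_use >= STALE_AFTER_DAYS:
        actions.append("archive unused PHI")
    return actions
```

Neither rule can be evaluated from the labels alone: the first needs permissions, the second needs usage activity.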

 

Building an Effective Data Classification Policy

A data classification policy is a document that includes a classification framework, a list of responsibilities for identifying sensitive data, and descriptions of the various data classification levels.

The classification framework usually describes each of the classification levels in use. A data classification policy should not attempt to restrict how data is handled; that is a separate task that requires its own detailed policy document.

Successful Data Classification Policy – Step by Step

There are five key steps you need to take to develop and implement a successful data classification policy. These steps are outlined below:

Step 1 – Getting help and establishing why. You will need to ensure that you have the approval and help of key stakeholders within the business, in particular the board. These people need to understand the importance of data classification and the reasons why a policy is necessary. With the help of these stakeholders, you can develop a pitch as to why a data classification policy is required and the goals your policy is hoping to achieve.

Step 2 – Defining the scope of the policy. You need to define the amount of information within your organization that will fall under the policy, what forms that data takes and where that data is stored.

Step 3 – Define responsibilities. You now need to determine which people within your organization will be responsible for maintaining your classification policy and the roles each person and department will play.

Step 4 – Define your classification levels. We have spoken briefly about classification levels in an earlier post, but we will go through them again briefly. Separate your content into four levels based on risk. Restricted data poses the greatest threat, followed by high risk, medium risk and low risk. Ensure your definitions of these levels, whichever language you end up using, are concise and unambiguous.

Step 5 – Schedule regular reviews. The data classification policy needs to be regularly reviewed and refined to ensure it stays current with compliance regulations and your business structure. Define the process and timeline for these reviews in the policy.
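
The four levels from Step 4 can be modeled as an ordered type, so that content matching several categories takes the most restrictive level. This is only a sketch: the mapping from categories to levels is a made-up example, and each organization would define its own.

```python
from enum import IntEnum


class Level(IntEnum):
    """Four risk-based levels; a higher value means more restrictive."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    RESTRICTED = 4


# Hypothetical mapping from data categories to levels.
CATEGORY_LEVEL = {
    "PUBLIC": Level.LOW,
    "INTERNAL": Level.MEDIUM,
    "PII": Level.HIGH,
    "PHI": Level.RESTRICTED,
}


def overall_level(categories, default=Level.LOW):
    """Return the most restrictive level among the known categories."""
    levels = [CATEGORY_LEVEL[c] for c in categories if c in CATEGORY_LEVEL]
    return max(levels, default=default)
```

For example, overall_level(["PII", "PHI"]) returns Level.RESTRICTED. Unknown categories are ignored here; a stricter policy might treat them as an error instead.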

 

A good classification policy:

1. Uses criteria that are straightforward and avoid ambiguity, but that are generic enough to apply to different data sets and circumstances

2. Is clear and written in simple language

3. Fits the organization’s business

4. Is limited to 3 or 4 classification levels

5. Contains a point of contact for clarification

6. Establishes a review schedule

 

Successful Data Classification Policy

Each organization will have its own unique data classification policy; there is no one-size-fits-all approach. However, there are some commonalities that successful policies share. Here are a few of the indicators of a successful data classification policy:

·      You have to get your classification criteria nailed down. They should be broad enough that they encompass all data in some way, but specific enough to avoid ambiguity.

·      A successful data classification policy will be written in the language of the business, will be clear and concise, and will resonate with employees.

·      The best policies are the simplest ones. Try and keep it to just a few pages and no more than four classification levels.

·      It should make employees aware of the person within the organization that is responsible for resolving any potential problems with classification policy that might arise.

·      A good data classification policy should make regular reviews a priority and a necessity. Reviews should take place at least quarterly so as to keep abreast of any new compliance regulations or changes within the company.

 

Why Data Classification Policies Fail

Unfortunately, many data classification policies fail before they even get off the ground. To keep this from happening to you, avoid the following pitfalls:

·      Using overly complicated language, jargon, abbreviations and other complexities that make it difficult for employees to get to grips with the meaning of the document.

·      Developing policies and practices that do not fit in with the organization’s workflows and are not backed up by employee training for implementation.

·      Failing to explain to employees the importance of data classification and the reason why the policy has to be implemented.

·      Writing the policy and then leaving it for a long period of time without regular reviews to update and improve it.

 

How to Select a Data Classification Solution

Look for these features:

1. Compound term search — Improves accuracy by minimizing false positives and false negatives.

2. Index — Enables you to identify sensitive terms without re-crawling the data.

3. Flexible taxonomy manager — Makes it easy to add and modify terms and rules.

4. Workflows — Automatically takes specific actions when a document is classified in a certain way. For example, a workflow might move sensitive data away from a public share.

5. Breadth of coverage — Supports both cloud and on-premises data sources, including both structured and unstructured data.
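
To illustrate the first feature, the sketch below shows one simple form of compound term search: a 16-digit number counts as a hit only when a related keyword appears nearby. The patterns, keyword list, and window size are assumptions chosen for the example, and real products are considerably more sophisticated.

```python
import re

# 16 digits with optional space or hyphen separators (hypothetical pattern).
CARD_NUMBER = re.compile(r"\b(?:\d[ -]?){15}\d\b")
# Keywords that make a card number more plausible (hypothetical list).
CONTEXT_TERMS = re.compile(r"\b(card|visa|mastercard|expiry|cvv)\b", re.IGNORECASE)
WINDOW = 40  # characters of context checked on each side of a match


def compound_matches(text):
    """Return number matches that have a context keyword within WINDOW chars."""
    hits = []
    for m in CARD_NUMBER.finditer(text):
        start = max(0, m.start() - WINDOW)
        # Crude character window: a word cut at the window edge can still match.
        context = text[start:m.end() + WINDOW]
        if CONTEXT_TERMS.search(context):
            hits.append(m.group())
    return hits
```

A bare 16-digit tracking number with no keyword nearby is skipped, which is how combining terms reduces false positives compared with matching the number pattern alone.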

 
