We live in a digital era where businesses use, analyze and rely on their data, processes and databases. An important part of data is its integrity because it ensures data is unchanged, undivided and in its complete, consistent form. More importantly, data integrity means the data is trustworthy. Organizations make data-driven decisions, and if the data has been altered or corrupted, it can have a negative impact on the business. Data integrity also comes into play when meeting the data regulations and compliance standards that are required in certain industries today. Overall, it is important to understand what data integrity is, the different types of integrity and how it relates to data quality and security.
What is data integrity?
First and foremost, data integrity is the accuracy and consistency of data, maintained by a collection of processes, rules or standards. Through these rules and standards, the data keeps its accuracy and completeness. Data with integrity is also readable, correctly formatted and original, with no duplicates.
Data is critical to business operations, decision-making and strategy. Because of this importance, consistent and complete data is necessary not only to preserve integrity but also to keep businesses compliant with industry regulations. Secure, quality data helps maintain overall integrity, but several factors can affect the data and make it inconsistent.
Human error, transfer errors, bugs, viruses and compromised hardware are all risks to the integrity of data. To help reduce these vulnerabilities, here are some considerations:
- Limit access to data: allow access only for business needs and place restrictions on unauthorized access
- Take the time to validate data: make sure data is correct when it’s being collected or used
- Back up data: make sure you have a copy of your data available; if data loss occurs and there is no backup, that data is irreplaceable
- Audit when data is added, changed or deleted: keep track of what was changed and who changed it
- Use error detection software: this can help detect abnormalities in data based on historical analysis
These steps should be considered when trying to maintain the integrity of any data set. Incomplete data has little true value to an organization. Addressing these considerations helps ensure data remains sound on both a logical and a physical level.
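As a rough illustration of the validation and auditing steps above, here is a minimal Python sketch. The phone number format, the `update_phone` helper and the in-memory `audit_log` list are all hypothetical choices made for this example; a real system would use its own validation rules and protected, append-only audit storage.

```python
import re
from datetime import datetime, timezone

# Assumed rule for this sketch: 10 to 15 digits with an optional leading "+".
PHONE_PATTERN = re.compile(r"\+?\d{10,15}")

# Hypothetical in-memory audit trail; real systems would persist this securely.
audit_log = []


def is_valid_phone(phone: str) -> bool:
    """Return True if the whole string matches the assumed phone format."""
    return PHONE_PATTERN.fullmatch(phone) is not None


def update_phone(records: dict, name: str, new_phone: str, changed_by: str) -> None:
    """Validate the new value, apply it, and record who changed what and when."""
    if not is_valid_phone(new_phone):
        raise ValueError(f"Rejected invalid phone number: {new_phone!r}")
    old_phone = records.get(name)
    records[name] = new_phone
    audit_log.append({
        "when": datetime.now(timezone.utc).isoformat(),
        "who": changed_by,
        "record": name,
        "old": old_phone,
        "new": new_phone,
    })


records = {"Ada": "+15551234567"}
update_phone(records, "Ada", "+15557654321", changed_by="admin")
print(records["Ada"])        # +15557654321
print(audit_log[-1]["old"])  # +15551234567
```

Because validation runs before the record is touched, a rejected value leaves both the data and the audit trail unchanged.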
Types of data integrity
Data integrity falls into two types: physical integrity and logical integrity. Physical integrity concerns external factors that have an effect on the data, such as power outages, data breaches, damage caused by human operators or unexpected disasters. The physical integrity of the data can also be impacted by problems storing or retrieving it; maintenance problems, aging storage and design flaws can all come into play. One of the best ways to combat these issues is to use redundant hardware or power supplies.
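Redundancy helps prevent loss, but it also helps to detect when stored or retrieved data has silently changed. One common technique, not specific to this article, is to keep a checksum alongside the data and compare it on every read. The sketch below uses Python's standard `hashlib`; the sample records and the helper names `sha256_of` and `verify` are made up for illustration.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_checksum: str) -> bool:
    """Return True if the data still matches the checksum recorded earlier."""
    return sha256_of(data) == expected_checksum


# Record the checksum when the data is first written.
original = b"Alice,555-0100\nBob,555-0199\n"
stored_checksum = sha256_of(original)

# Simulate a single character flipped by faulty storage or a bad transfer.
corrupted = b"Alice,555-0100\nBob,555-0198\n"

print(verify(original, stored_checksum))   # True
print(verify(corrupted, stored_checksum))  # False
```

A checksum only reveals that something changed; restoring the correct data still depends on a backup or a redundant copy.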
The other type of integrity is logical, and it can be compromised by human error or software bugs. If the logic of the data is flawed, the data will no longer make sense. For business-critical databases, keeping the data logical means maintaining the rationality of the database. Unlike physical integrity, logical integrity has several types that should be considered, especially when maintaining the integrity of a database: entity, referential, domain and user-defined integrity. Entity integrity means each record in a table is uniquely identifiable and singular. Referential integrity maintains consistency between tables. Domain integrity refers to the range of acceptable values that can be stored in a column. Lastly, user-defined integrity covers custom rules, typically implemented through triggers and stored procedures.
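To make the four types of logical integrity concrete, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The `customers` and `orders` tables, the 100-unit limit and the `try_insert` helper are invented for this example. Note that SQLite has no stored procedures, so the user-defined rule is shown as a trigger only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customers (
    id    INTEGER PRIMARY KEY,      -- entity integrity: each row is unique
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    -- referential integrity: every order must point at an existing customer
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    -- domain integrity: only positive quantities are acceptable
    quantity    INTEGER NOT NULL CHECK (quantity > 0)
);
-- user-defined integrity: a hypothetical business rule capping order size
CREATE TRIGGER limit_order_size
BEFORE INSERT ON orders
WHEN NEW.quantity > 100
BEGIN
    SELECT RAISE(ABORT, 'orders are limited to 100 units');
END;
""")


def try_insert(sql: str, params: tuple = ()) -> bool:
    """Run one insert; return True if accepted, False if the database rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.DatabaseError:
        return False


add_customer = "INSERT INTO customers (id, email) VALUES (?, ?)"
add_order = "INSERT INTO orders (customer_id, quantity) VALUES (?, ?)"

print(try_insert(add_customer, (1, "a@example.com")))  # True
print(try_insert(add_customer, (1, "b@example.com")))  # False: duplicate primary key
print(try_insert(add_order, (99, 1)))                  # False: no customer 99
print(try_insert(add_order, (1, 0)))                   # False: quantity out of range
print(try_insert(add_order, (1, 500)))                 # False: blocked by the trigger
print(try_insert(add_order, (1, 5)))                   # True
```

Putting these rules in the database itself, rather than only in application code, means every program that writes to the tables is held to the same constraints.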
Is data integrity different from data security and data quality?
Data integrity, data security and data quality have distinct differences, but they are connected. Data security and data quality play an important role in accomplishing data integrity.
Data quality is whether the data is useful. It has a broader definition: for data to have quality, it needs to be complete, valid, unique, timely, accurate and consistent. If the data does not meet one of these criteria, then its quality is lacking and it is likely incomplete or inaccurate.
Data security, on the other hand, is a set of standards that must be followed to ensure data is protected from unauthorized access or corruption. When it comes to data security, it’s important to consider the CIA principles: confidentiality, integrity and availability. Data integrity is at risk if the data is not secure, as the data can be altered or corrupted by unauthorized parties. Data security therefore plays an important role in maintaining the integrity of data. As you can see, these terms are similar and connected, yet distinct. Let’s look at an example of each to get a better understanding.
Consider a database that stores names and phone numbers for a group of people. If one digit in a phone number is wrong, you would not be able to reach that person by calling the number. This would be an example of poor data integrity. Now, say a person in the database changed their phone number. When calling this person, you would not be able to reach them because the data was not updated in the system. This would be an example of poor data quality. Lastly, say the database was infiltrated and the data was changed or corrupted. This would be an example of poor data security.
What about GDPR compliance as it relates to data integrity?
The General Data Protection Regulation (GDPR) is a legal framework that sets guidelines for the collection and processing of personal information from individuals who reside in the European Union (EU). Since this regulation applies regardless of where the website is based, it must be followed by all websites that attract European visitors. To maintain GDPR compliance, you need to have the appropriate measures in place to protect personal data. One of the six principles of GDPR is integrity and confidentiality, which means maintaining the integrity of data is an important factor in meeting this regulation. If data integrity is poor, the organization could be in violation of the regulation.
Keeping data complete and secure
In the age of digital transformation, data is critical to every business. This is far from new in the current IT landscape and is recognized across the industry. Businesses make data-driven decisions, build strategy on data trends and forecast using collected data, so data should be reliable and trustworthy. Data integrity refers to the accuracy and consistency of the data throughout its lifecycle, which determines whether the data is valid or invalid. It is closely tied to security and quality: the integrity of the data could be compromised if it’s not secure, and if it’s lacking in quality it can’t be complete or accurate. To help preserve the integrity of data, you should validate data input, remove duplicate data, protect the data with backup software and limit control of and access to that data. All these safeguards can help maintain integrity and ensure the company’s data is valid and, most importantly, trustworthy.