Ground truth data is observational data from real-world scenarios. It often involves people, and collecting data from people requires consideration of data privacy and security. Indeed, privacy and security are among the most significant risk factors for firms handling ground truth data.
Q Analysts has been in this industry for more than two decades. Over that time, we’ve developed tried-and-true practices for ensuring data remains safe and private. Q Analysts is an ISO 27001-certified organization (for data security and privacy), and our approach to new collection projects always accommodates the specific needs of that project while also meeting our own strict security policies. In this article, we explain the practices we use to protect, track, and store data while it is being collected, processed, and delivered.
Critical components of a secure ground truth data pipeline
Most people understand that the data they own needs to be kept safe and secure while it’s being stored, but they miss some of the other critical steps. Data should be kept private and secure at every step in which it is handled:
- Collection. Participant data should remain private as it is being collected.
- Storage. Data should be made secure as it is being stored for processing or for delivery.
- Processing. Data should be protected as it is being processed (for example, when being annotated).
- Transfer. Data should be protected as it is being moved between locations or for delivery to the client.
Q Analysts data privacy and security policies and procedures
We follow strict ISO 27001 standards for managing and storing data securely. We have developed the following practices to ensure that we meet our obligations to both our clients and to the participants who have provided us with their data. These have been consistently effective in ensuring that our ground truth data collection projects for our clients are properly protected and secured.
Securing equipment for data privacy
We ensure that all project-related devices and equipment that contain sensitive and confidential data are always secured. When capture and storage devices are in use, they must always be supervised and monitored by authorized personnel during daily operations. We make sure that capture and storage devices are securely stored away when they are not in use.
We also ensure that computers used for data processing are configured to lock or enter sleep mode after 5 minutes of inactivity. This reduces the risk of data misuse or unauthorized use in cases where a person who deals with sensitive data leaves the desk and forgets to lock the computer.
Participant management for data privacy
We ask all participants involved in any data collection project to sign an Informed Consent Form before participating. We make sure they are aware that the data they provide will be used in various AI applications, and we assure them that it will be handled securely. Any Personally Identifiable Information (PII) will be anonymized, either at the start of collection or at a later stage.
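As an illustration of what removing PII from a record can look like in practice, here is a minimal Python sketch. It is not a description of our production tooling: the field names, the `pseudonymize` function, and the key value are all hypothetical. It replaces PII fields with keyed hashes (HMAC-SHA256), which strictly speaking is pseudonymization rather than full anonymization, because anyone holding the key can link the same person across records.

```python
import hashlib
import hmac

# Hypothetical field names; a real project would define its own PII schema.
PII_FIELDS = {"name", "email", "phone"}


def pseudonymize(record: dict, secret_key: bytes) -> dict:
    """Return a copy of `record` with PII fields replaced by keyed hashes."""
    cleaned = {}
    for field, value in record.items():
        if field in PII_FIELDS:
            # A keyed hash keeps values consistent across records
            # without storing the original PII.
            digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
            cleaned[field] = digest.hexdigest()[:16]
        else:
            cleaned[field] = value
    return cleaned


# Example record with made-up values.
record = {
    "participant_id": 17,
    "name": "Jane Doe",
    "email": "jane@example.com",
    "session": "A3",
}
# Assumption: the key would come from a managed secret store, not source code.
safe = pseudonymize(record, secret_key=b"replace-with-a-managed-secret")
```

In this sketch, non-PII fields such as `session` pass through unchanged, so the record stays usable for annotation while the identifying values are no longer readable.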
We also explain that they may not view the backend systems of any device. This ensures that they cannot see data from other participants. Guidelines are in place to restrict access to the data collected. We further prohibit participants from carrying mobile phones or taking photos during the collection process. Mobile phones are securely locked away in personal lockers prior to entry and returned to participants after their sessions are completed for the day.
Data management for optimal data privacy
Controlling how data is shared is the most critical aspect of data collection. Data must not be shared with any party that lacks permission to access it, and it should be accessed only when necessary to perform job functions.
Any moving or manual transfer of data (e.g., physically moving drives to other locations, handing off forms, etc.) must first be logged in the appropriate tracking form and managed by authorized personnel. Further, any transferring of data over the internet must be performed only on a secured network and overseen by authorized personnel.
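The logging step described above can be sketched in a few lines of Python. This is a hypothetical example, not our actual tracking form: the column names, the `log_transfer` function, and the sample values are assumptions made for illustration. Each entry records who handled the item, where it went, and a SHA-256 checksum so the recipient can confirm the contents arrived unchanged.

```python
import csv
import hashlib
import io
from datetime import datetime, timezone

# Hypothetical columns for a transfer tracking form.
LOG_COLUMNS = ["timestamp_utc", "item", "sha256", "origin", "destination", "handled_by"]


def log_transfer(log_file, item, data, origin, destination, handled_by, authorized):
    """Append one transfer record, refusing handlers who are not authorized."""
    if handled_by not in authorized:
        raise PermissionError(f"{handled_by} is not authorized to move data")
    row = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "item": item,
        # The checksum lets the recipient verify the contents on arrival.
        "sha256": hashlib.sha256(data).hexdigest(),
        "origin": origin,
        "destination": destination,
        "handled_by": handled_by,
    }
    csv.DictWriter(log_file, fieldnames=LOG_COLUMNS).writerow(row)
    return row


# An in-memory log stands in for the real tracking form in this sketch.
log = io.StringIO()
csv.DictWriter(log, fieldnames=LOG_COLUMNS).writeheader()
entry = log_transfer(
    log, "drive-007", b"example payload", "Site A", "Site B",
    handled_by="alice", authorized={"alice", "bob"},
)
```

Rejecting unauthorized handlers before anything is written means the log only ever contains transfers that were permitted to happen.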
Data files should also be encrypted before they are moved from one location to another. This is especially important when moving files on physical media such as HDDs, SSDs, or flash memory drives.
Security and data privacy are paramount in ground truth data projects
Any organization that collects ground truth data should be held to a strict standard of data privacy, and appropriate procedures must always be followed. There should be appropriate consequences in the event of data violations, including possible employee termination or even a complete project shutdown.
When proper data collection, storage, and delivery procedures are followed, companies can ensure that the data they collect is neither exposed to nor stolen by bad actors. They will minimize their legal risk and also the possible harm done to their organization, their clients, and their participants.
At Q Analysts, we are experts in collecting ground truth data, and that includes keeping it private and secure. Want to better understand how we minimize risk for our clients? Contact us to learn more.