Testing in a Brave New World: The Importance of Data Masking

As testers today, we face a brave new world. Our conundrum, providing effective testing in less time, is more difficult than it has ever been. Challenges from disruptive technologies such as cloud, mobile devices and big data have taken testing to a whole new level of complexity. At the same time, we are challenged by the “need for speed” as agile methodologies evolve into continuous delivery and continuous deployment. We can engage in only so much risk-based testing, so we are often tempted to use production data to speed up the test process. Ironically, those very same technologies make this practice increasingly dangerous. So what gives?

If production data is also privacy-protected data, our use of it in testing may be illegal. At the very least, it opens up the data for compromise.

Testers must collaborate with security professionals to develop a test data privacy approach, which is usually based on data masking. Data masking involves changing or obfuscating personal and non-public information. Data masking does not prevent access to the data; it only makes private data unrecognizable. Data masking can be accomplished by several methods, depending upon the complexity required. These range from simply blanking out the data, to replacing it with more generic data, to using algorithms to scramble the data. The challenge of data masking is that the data must be not only unrecognizable but also still useful for testing.
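To make the three methods concrete, here is a minimal sketch in Python. The function names, the sample record and the list of stand-in names are all hypothetical and chosen only for illustration; a real masking tool would do considerably more (for example, keeping masked values consistent across tables).

```python
import random

def blank_out(value: str) -> str:
    """Blanking: replace every character with a placeholder, keeping only the length."""
    return "X" * len(value)

def substitute(replacements: list[str], rng: random.Random) -> str:
    """Substitution: swap the real value for a generic stand-in value."""
    return rng.choice(replacements)

def scramble_digits(value: str, rng: random.Random) -> str:
    """Scrambling: replace each digit with a random digit, keeping the format.

    The random module is fine for a sketch but is not a cryptographic source.
    """
    return "".join(str(rng.randrange(10)) if ch.isdigit() else ch for ch in value)

rng = random.Random(0)  # seeded only so the sketch is repeatable
stand_in_names = ["Test User A", "Test User B"]  # hypothetical generic data
record = {"name": "Alice Smith", "ssn": "123-45-6789", "notes": "call after 5pm"}

masked = {
    "name": substitute(stand_in_names, rng),
    "ssn": scramble_digits(record["ssn"], rng),
    "notes": blank_out(record["notes"]),
}
print(masked)
```

Note the trade-off the sketch exposes: the blanked field is safe but nearly useless for testing, while the scrambled number still looks like a real one and can pass format validation.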

There are two main types of data masking: static and dynamic. The usual approach is static data masking, where the data is masked prior to loading into the test environment. In this approach, a new database is created (which is especially important when testing is outsourced). However, the new database may not contain the same data, or data in the same states, as the actual database, and both issues are very important in testing.
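A static masking run can be pictured as a copy job that masks private columns on the way into a separate test database. The sketch below uses two in-memory SQLite databases as stand-ins for production and test; the table layout and the mask_email helper are assumptions made for illustration, not any vendor's method.

```python
import sqlite3

def mask_email(customer_id: int) -> str:
    """Hypothetical masking rule: a synthetic address derived from the row id."""
    return f"user{customer_id}@example.com"

def build_masked_copy(prod: sqlite3.Connection) -> sqlite3.Connection:
    """Create a new test database; private values are masked before loading."""
    test = sqlite3.connect(":memory:")
    test.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, status TEXT)"
    )
    rows = prod.execute("SELECT id, email, status FROM customers")
    for customer_id, _real_email, status in rows:
        # The real email never reaches the test database; status is kept
        # so that test cases still see records in their original states.
        test.execute(
            "INSERT INTO customers VALUES (?, ?, ?)",
            (customer_id, mask_email(customer_id), status),
        )
    test.commit()
    return test

# Stand-in for the production database (sample data is invented).
prod = sqlite3.connect(":memory:")
prod.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, status TEXT)")
prod.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "alice@corp.test", "active"), (2, "bob@corp.test", "suspended")],
)

test_db = build_masked_copy(prod)
rows = test_db.execute("SELECT id, email, status FROM customers ORDER BY id").fetchall()
print(rows)
```

Carrying the status column over unchanged is deliberate: it is one way to reduce the risk, noted above, that the masked copy no longer holds data in the states the tests need.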

In dynamic data masking, production data is masked in real time as users request it. The main advantage of this approach is that even users who are authorized to access the production database never see the private or non-public data. Furthermore, dynamic data masking can be user role specific; what data is masked depends upon the entitlements of the user who is requesting the data.
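Role-specific dynamic masking can be sketched as a filter applied at read time, leaving the stored record untouched. The roles, the entitlement table and the read_customer function below are hypothetical; commercial tools typically apply this kind of rule in a proxy or in the database layer rather than in application code.

```python
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class Customer:
    name: str
    ssn: str

# Hypothetical entitlements: the fields each role may see unmasked.
UNMASKED_FIELDS = {
    "fraud_analyst": {"name", "ssn"},
    "support_agent": {"name"},
    "tester": set(),
}

def mask(value: str) -> str:
    """Hide a value while preserving its length."""
    return "*" * len(value)

def read_customer(record: Customer, role: str) -> dict[str, str]:
    """Return the record as the given role is entitled to see it.

    Unknown roles get no entitlements, so every field is masked.
    """
    allowed = UNMASKED_FIELDS.get(role, set())
    return {
        field: value if field in allowed else mask(value)
        for field, value in asdict(record).items()
    }

stored = Customer(name="Alice Smith", ssn="123-45-6789")
for role in ("fraud_analyst", "support_agent", "tester"):
    print(role, read_customer(stored, role))
```

The stored record is never modified; each request produces a view that depends on who is asking, which is the property that distinguishes this approach from static masking.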

Automated software tools are required to mask data efficiently and effectively. When evaluating data masking tools, it is important to consider the following attributes. Most important, the tool should mask the data so that the masking cannot be reversed and the result is realistic enough for testing. Ideally, the tool should provide both static and dynamic data masking functionality and, possibly, data redaction, a technique used to mask data in PDFs, spreadsheets and documents. The tool should also mask data on distributed platforms, including cloud. Here is a brief look at several of the vendors in this arena. As with any tool evaluation, organizations must consider their own specific needs when choosing a vendor.

According to Gartner’s Magic Quadrant, IBM, Oracle and Informatica are the market leaders in data masking for privacy purposes. All offer both static and dynamic data masking as well as data redaction. IBM offers integration with its Rational Suite. Oracle offers an API tool for data redaction and provides templates for Oracle eBusiness Suite and Oracle Fusion. Both the IBM and Oracle products are priced relatively high compared with those of other vendors.

Informatica offers data redaction for many types of files and is a top player in dynamic data masking for big data. It offers Dynamic Data Masking for Hadoop, Cloudera, Hortonworks and MapR. Informatica’s product is integrated with PowerCenter and its Application Information Lifecycle Management (ILM), which makes it a good choice for organizations that use those products.

Mentis offers a suite of products for static and dynamic data masking and data redaction, as well as data access monitoring and data intrusion prevention, at a reasonable cost. One of the most exciting features of these products is usability; not only are there templates available for several vendor packages, including Oracle eBusiness and PeopleSoft, but the user interface is also designed for use by the business as well as IT. Mentis was rated as a “challenger” by Gartner in 2013.

One of the least expensive products on the market, Net 2000 offers usability as its main feature. Net 2000 provides only static data masking, for Oracle and SQL Server databases. It was rated as a “Niche” player by Gartner in 2013. This tool is a good choice for a small organization with a simple environment.

Data privacy is one of the most important issues facing test managers and testers today. Private and non-public data must not be compromised during testing; therefore, an understanding of data masking methodologies, approaches and tools is critical to effective testing and test management.
