Testing is, perhaps, not the most exciting topic for a security consultant. However, some difficult issues around it have been cropping up regularly of late.
What data do you use for testing? It's a more difficult question than you would think.
You have to control access to production data, as in many instances disclosure could have serious legal and compliance consequences. What if someone's personal data were taken and sold by a member of your development team?
The standard answer is to generate data by SQL scripts. Produce your own data, and there's no problem.
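As a rough illustration of producing your own data, here is a minimal sketch in Python rather than raw SQL. The table name `clients`, its columns and the helper `build_test_db` are all invented for the example, and an in-memory SQLite database stands in for whatever your test environment actually uses.

```python
import random
import sqlite3


def build_test_db(n_rows: int, seed: int = 42) -> sqlite3.Connection:
    """Create an in-memory database filled with invented client rows."""
    rng = random.Random(seed)  # seeded so every test run sees the same data
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT, balance REAL)"
    )
    # Hypothetical schema: nothing here is derived from production.
    rows = [
        (i, f"Test Client {i:04d}", round(rng.uniform(0, 10_000), 2))
        for i in range(1, n_rows + 1)
    ]
    conn.executemany("INSERT INTO clients VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


conn = build_test_db(100)
row_count = conn.execute("SELECT COUNT(*) FROM clients").fetchone()[0]
print(row_count)
```

Data like this is safe to hand to anyone, but the balances are uniformly spread, which is exactly the weakness when an algorithm needs a realistic distribution.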
However, there are times when we need to use actual production extracts.
Some applications that apply complex algorithms to business data (such as trading calculations) need a spread of data similar to real life, otherwise the algorithms cannot be tested properly. In this instance, a process called sanitisation is often applied to production data extracts.
Sanitisation involves 'scrambling' any parts of the data that could be traced back to a person or organisation, so that they are unreadable. In a database table, the columns containing the name and address of the client, for instance, would be scrambled, and all the other data could be left as it is, as long as it really is unidentifiable and nobody could work out the details by aggregating the other fields. The spread of the data stays true to life, and your compliance requirements are satisfied.
If HMRC had done this before sending those discs through the post, there would have been less risk of the names and addresses being compromised.
The algorithm used to scramble the data has to have a high degree of entropy and an element of randomness in it, so that the data can't be pieced back together again.
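To make the scrambling described above concrete, here is a minimal sketch, assuming each row arrives as a Python dictionary. The column names in `SENSITIVE_COLUMNS` and the function `scramble_row` are hypothetical; a real extract would need its own list of identifying fields, and this is one possible approach rather than a standard tool.

```python
import secrets

# Hypothetical list of columns that could identify a person or organisation.
SENSITIVE_COLUMNS = ("name", "address")


def scramble_row(row: dict) -> dict:
    """Return a copy of the row with identifying columns replaced by random tokens."""
    clean = dict(row)  # copy, so the caller's row is left untouched
    for column in SENSITIVE_COLUMNS:
        if column in clean:
            # token_hex draws from the operating system's secure random source,
            # so the token has no relationship to the original value.
            clean[column] = secrets.token_hex(8)
    return clean


row = {"name": "Jane Doe", "address": "1 High Street", "trade_value": 1250.75}
scrambled = scramble_row(row)
print(scrambled)
```

The business value (`trade_value` here) passes through unchanged, so the spread of the data is preserved, while the name and address become 16-character tokens that differ on every run.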
However, this doesn't solve everything. Some applications are reliant on others. In the banking world, you may have a payment system that needs data similar to production for testing. The first option is to sanitise the data. However, the transactions may have to be reconciled against a number of other systems, and this is problematic whether it's done manually or through a batch process. For instance, how do you match organisation 'w4gbsdf435' against 'MetroBank'? The first is the sanitised version in your test app; the second is the name in the other app against which it has to be reconciled. So now you have to apply the same sanitisation consistently across two or more different applications, and the randomness required in the algorithm is one of the things that makes this hard.
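The reconciliation problem can be shown in a few lines. This is only a sketch: the two extracts, their field names and the `sanitise_name` helper are made up for illustration, and each system is assumed to scramble its own copy independently.

```python
import secrets


def sanitise_name(name: str) -> str:
    """Replace a name with a fresh random token; the input is deliberately ignored."""
    return secrets.token_hex(5)


# Two hypothetical systems each sanitise their own extract independently.
payments_extract = {"org": sanitise_name("MetroBank"), "amount": 500}
ledger_extract = {"org": sanitise_name("MetroBank"), "amount": 500}

# Both rows describe the same organisation, but the key used to match them
# is now different in each system, so the reconciliation has nothing to join on.
names_match = payments_extract["org"] == ledger_extract["org"]
print(payments_extract["org"], ledger_extract["org"], names_match)
```

Because each token is drawn at random, the two values will almost certainly differ, even though the amounts still agree.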
Bottom line is that you're going to have to manage the security of unscrambled production data in some applications. You're going to need to accomplish the following:
-check references of the people working on the development team when hiring
-apply access controls to all aspects of the application development. No more admin access for everyone
-make sure your dev environment is well segregated from other networks (either physically or logically)
-ensure someone is designated as the data owner
-work with the information security team of your organisation to ensure all possible controls are applied