Test data management is often an overlooked part of a QA team’s effort. Test data is usually associated with individual test cases, but there are no tools available that make managing this test data more efficient or more systematic. Web 2.0 applications are increasingly richer in user data than in system data, so it is more important to test with production data as often as possible.
Any production issue starts a chain of attempts to replicate the end user’s condition, an essential component being the data that the user entered, or the data existing in that user’s domain, that could have caused the application failure. This implies that the QA team should work with production data more often than with fictional test data. But this change has its own challenges, such as:
- Production data is huge
QA environments are usually not equipped to handle the full volume of production data, mainly because of cost constraints. In such scenarios QA should have a subset of production data replicated in the test environment. QA can offer its expertise in selecting a subset that serves the testing team’s needs without overloading the test environment.
- Cannot expose user data in testing environment
User data is sensitive and cannot be exposed. Apply scramblers (desensitizers) that scramble the data but preserve its integrity and the data types the application expects.
- Database schemas change often, which makes copying production data difficult
We all know that database schema changes are always just around the corner, so a mass copy of production data to the QA environment will probably fail. The replication task needs to be intelligent enough to detect these differences and offer the option of applying the required changes while replication is in progress.
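
To make the three challenges above concrete, here is a rough sketch of how subset selection, scrambling, and schema adaptation could be chained together. It is only an illustration under my own assumptions: the sample rows, the column names (name, email, plan), and the helper names (select_subset, scramble_text, adapt_to_schema, build_test_data) are all hypothetical, and a real tool would read from and write to actual databases rather than lists of dictionaries.

```python
import hashlib
import random

# Hypothetical production rows; the column names are illustrative only.
PRODUCTION_ROWS = [
    {"id": 1, "name": "Alice Smith", "email": "alice@example.com", "age": 34},
    {"id": 2, "name": "Bob Jones", "email": "bob.j@example.org", "age": 41},
    {"id": 3, "name": "Carol White", "email": "carol@example.net", "age": 29},
    {"id": 4, "name": "Dan Brown", "email": "dan@example.com", "age": 52},
]


def select_subset(rows, fraction, seed=0):
    """Pick a reproducible random sample so the test environment is not overloaded."""
    rng = random.Random(seed)
    count = min(len(rows), max(1, round(len(rows) * fraction)))
    return rng.sample(rows, count)


def scramble_text(value, salt="qa"):
    """Replace letters and digits deterministically.

    Length, letter case, and punctuation are kept, so an email still
    looks like an email to the application.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).digest()
    out = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]
        if ch.isdigit():
            out.append(str(b % 10))
        elif ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr(base + b % 26))
        else:
            out.append(ch)  # keep '@', '.', spaces, and other separators
    return "".join(out)


def adapt_to_schema(row, target_columns, renames=None, defaults=None):
    """Map a production row onto the QA schema.

    Renamed columns are mapped, columns unknown to QA are dropped,
    and columns that exist only in QA get a default (or None).
    """
    renames = renames or {}
    defaults = defaults or {}
    renamed = {renames.get(key, key): value for key, value in row.items()}
    return {col: renamed.get(col, defaults.get(col)) for col in target_columns}


def build_test_data(rows, fraction, sensitive, target_columns,
                    renames=None, defaults=None):
    """Subset, scramble, then adapt each row. `sensitive` uses production column names."""
    result = []
    for row in select_subset(rows, fraction):
        masked = {
            key: scramble_text(value)
            if key in sensitive and isinstance(value, str) else value
            for key, value in row.items()
        }
        result.append(adapt_to_schema(masked, target_columns, renames, defaults))
    return result


# Example: QA schema renamed "name" to "full_name" and added a "plan" column.
TARGET_COLUMNS = ["id", "full_name", "email", "age", "plan"]
test_rows = build_test_data(
    PRODUCTION_ROWS,
    fraction=0.5,
    sensitive={"name", "email"},
    target_columns=TARGET_COLUMNS,
    renames={"name": "full_name"},
    defaults={"plan": "free"},
)
```

The scrambler here is deterministic on purpose, so the same production value always maps to the same scrambled value and values that matched before scrambling still match afterwards. It is a simple masking sketch, not a vetted anonymization scheme; anything handling real user data would need a proper review.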
I am sure there are other issues out there, and no single tool today solves the problem of test data management on its own. In the next part I will talk more about my wish list for what such a tool should do.