Test Automation Architecture: Data Driven

Author: Herbert M. Isenberg Ph.D.


The purpose of this paper is to define the organization of an automated testing architecture that is application independent; that is, the architecture does not depend on the type of application or the coding language.

The three organizing principles of an independent automation architecture are as follows:

  1. One Point Maintenance
  2. Data Driven
  3. Test Case Independence

This article will address the second principle: Data Driven.


An ongoing question in the world of automation is “What drives the system?” Is it the test cases, the requirements, the traceability matrix or some other testing artifact? From an architecture perspective these “drivers” may be considered when building the testing framework. However, based on One Point Maintenance (principle one), it is the data that drives the system.

Data Driven (DD)

One can surf the web and find articles on the concept of a data driven automation system; however, they tend to be complicated and lack a point of reference. For the most part, DD is described as part of an automation framework instead of as a guiding architectural principle. The critical difference is that frameworks tend to vary depending on the application, testing tool, test environment, business domain and other factors.

The point of view here is that the automation architecture is application and tool independent. Cutting through all the “red tape”, the data driven aspect of the overall architecture is a spreadsheet, or a common delimited file, that contains one row of data per test case.

We’re not looking to trivialize the importance of data driven, only to point out that it does not need to be complicated or confusing. It is an integral part of a solid automation architecture, and it is most effective when implemented at the architectural level rather than the framework level. The architecture is independent and can be applied to any automation system, whereas the framework depends on a variety of factors that change continuously.


Again one may say “The devil is in the detail”; that is, the correct implementation of the Data Driven principle within the automation architecture is what makes the architecture solid: stable and able to withstand rapid change, rather than unable to maintain its independence.

The data is organized by test case, with one row per test case. Using more than one row complicates the situation, requires complex coding to track where the data for each test case begins and ends, and undermines the principle of One Point Maintenance.

The data in the spreadsheet contains the following:

  1. Input values
  2. Expected results or Verification points
  3. Blanks
  4. Field names
  5. An asterisk to indicate “skip this row”
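
As a rough illustration of how such a file might be read, the Python sketch below loads a comma-delimited file into one dictionary per test case. The names load_test_cases and SKIP_MARKER, the sample columns, and the assumption that the asterisk sits in the first column are illustrative only and are not tied to any particular testing tool.

```python
import csv
import io

SKIP_MARKER = "*"  # assumption: the asterisk occupies the first column of a row


def load_test_cases(handle, delimiter=","):
    """Return (field_names, test_cases) read from a delimited data file."""
    field_names = []
    test_cases = []
    for row in csv.reader(handle, delimiter=delimiter):
        if not row:
            continue  # ignore completely empty lines
        if row[0].strip() == SKIP_MARKER:
            # Asterisk rows are never executed; the first one holds the field names.
            if not field_names:
                field_names = [name.strip() for name in row[1:]]
            continue
        values = [value.strip() for value in row[1:]]
        # One row becomes one test case, keyed by field name.
        test_cases.append(dict(zip(field_names, values)))
    return field_names, test_cases


# Hypothetical sample data: a header row, two live rows and one skipped row.
SAMPLE = """\
*,Screen name,Login,Password
,Patient login,Jim,Abc123
*,Patient login,Disabled,Case
,Patient login,Ann,
"""

field_names, test_cases = load_test_cases(io.StringIO(SAMPLE))
print(field_names)
print(len(test_cases), "test cases loaded")
```

Because every test case occupies exactly one row, the loader needs no logic for tracking where a test case starts or ends.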

Input Values

All of the values for the test cases, such as text characters, special characters or numerics, are based on the navigation sequence. Using the medical application example, where the end-to-end test case navigates 30 screens, the data file would contain all the inputs and expected results for all the screens. Each field on each screen is a column in the data file. The first row of the data file contains the field names; this row has an asterisk at the beginning, as it is reserved for clarity.

One screen may have several input fields and a number of expected results or verification points. It is common for an input field to have no associated expected result or verification point. However, each screen will have one or more verifications associated with it, and the first verification is the screen name itself.

If the screen name verification fails, call an error handler. Depending on the conditions, the error is either soft or hard. For a soft error, write it to the log and continue. For a hard error the test case cannot continue, so take a screen snapshot, write the hard error to the log, move to the next row of data and begin the next test case.
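
The soft and hard error handling described above could be sketched in Python roughly as follows. Everything here is hypothetical: the HardError class, the check_screen_name and run_suite functions, and especially the "Recoverable" flag, which stands in for whatever conditions a real error handler would use to decide between a soft and a hard error.

```python
class HardError(Exception):
    """Raised when the current test case cannot continue (hypothetical name)."""


def check_screen_name(expected, actual, recoverable, log):
    """First verification on every screen: is this the screen we expect?"""
    if expected == actual:
        return True
    message = f"expected screen {expected!r}, found {actual!r}"
    if recoverable:
        log.append(("SOFT", message))  # soft error: record it and carry on
        return False
    raise HardError(message)  # hard error: abandon this test case


def run_suite(test_cases, run_case, log, take_snapshot):
    """Run each row; after a hard error, move on to the next row of data."""
    for number, case in enumerate(test_cases, start=1):
        try:
            run_case(case, log)
        except HardError as error:
            take_snapshot(number)
            log.append(("HARD", f"test case {number}: {error}"))


def demo_run_case(case, log):
    # Stand-in for real tool calls: the "actual" screen is stored in the row itself.
    check_screen_name(case["Screen name"], case["Actual screen"],
                      case["Recoverable"], log)
    log.append(("INFO", f"finished {case['Screen name']}"))


cases = [
    {"Screen name": "Patient login", "Actual screen": "Patient login", "Recoverable": True},
    {"Screen name": "Patient details", "Actual screen": "Timeout", "Recoverable": False},
    {"Screen name": "Patient login", "Actual screen": "Patient Login", "Recoverable": True},
]
log = []
snapshots = []  # records which test cases triggered a snapshot
run_suite(cases, demo_run_case, log, snapshots.append)
for level, message in log:
    print(level, message)
```

In this sketch the second test case raises a hard error and is abandoned after a snapshot, while the third logs a soft error and still runs to completion.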

Note: When reading the next record and starting a new test case, be sure to set the context. The concept of “Home Based” is often used to refer to the location in the application where all the test cases begin. The error handler for each screen will have a function that performs this operation, returning the application to that location.
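
A minimal sketch of setting the context before each test case, written in Python. The FakeApplication class, the HOME_SCREEN name and both functions are assumptions made for illustration; a real system would call its testing tool to navigate instead.

```python
HOME_SCREEN = "Main menu"  # hypothetical name for the screen where all test cases begin


class FakeApplication:
    """Stand-in for the application under test; tracks only the current screen."""

    def __init__(self):
        self.current_screen = HOME_SCREEN

    def go_to(self, screen_name):
        self.current_screen = screen_name


def return_to_home(app):
    """Put the application back where every test case starts."""
    if app.current_screen != HOME_SCREEN:
        app.go_to(HOME_SCREEN)


def begin_test_case(app, case_number, log):
    return_to_home(app)  # set the context before reading any inputs
    log.append(f"test case {case_number} starts on {app.current_screen}")


app = FakeApplication()
log = []
app.go_to("Screen 17")  # pretend the previous test case stopped here
begin_test_case(app, 2, log)
print(log[0])
```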

As the system executes one test case after another, different fields will have input and validation points. If a test case does not require an input for a specific field then leave that field blank in the data file. The scripts will know that when a field is blank, no action is required.
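
The blank-field rule can be illustrated with a short Python sketch. The apply_inputs function and the enter_value callback are hypothetical; in a real system the callback would be whatever call the testing tool provides for typing into a field.

```python
def apply_inputs(case, enter_value):
    """Call enter_value(field, value) for each non-blank field; return the fields used."""
    acted_on = []
    for field_name, value in case.items():
        if value is None or value.strip() == "":
            continue  # blank cell: this test case does nothing with the field
        enter_value(field_name, value)
        acted_on.append(field_name)
    return acted_on


# Hypothetical test case row: two fields filled in, two left blank.
case = {"Login": "Jim", "Password": "Abc123", "Scr2, Name": "", "Scr2, Street": "   "}
entered = {}  # records what would have been typed into the application
acted = apply_inputs(case, entered.__setitem__)
print(acted)
```

Treating whitespace-only cells as blank is a design choice of this sketch, made so that stray spaces in a spreadsheet do not trigger an unintended input.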

Here is a brief example of a data driven spreadsheet:

*  | Login Screen name | Login | Password | Scr2, Name | Scr2, Street | Scr2, City | Scr2, Verify Name | (to Scr30)
   | Patient login     | Jim   | Abc123   | James      | 123 Maple    | Ohio       | James             |

Summary: Principle Two – Data Driven (DD)

Data driven systems reduce complexity and enable scaling when additional screens or objects are added to the application. For some applications the data spreadsheet can get quite large. However, due to its organization and naming conventions it is relatively straightforward, easy to maintain, and easy for others to understand. The size of the data file is not an issue, as the system is automated!

The example presented has been developed over twenty years of building automation systems and has proven itself over time in terms of reliability, scalability and ease of maintenance.