
Acceptance Testing

If you have completed the foundation lab exercises (and you really should have) you will already have learned about Unit Tests being used to validate the code in your modules. These tests called the functions you defined and made sure they were returning valid data. During this process you used Test-Driven Development, where you wrote tests to define the functionality _before_ writing the code to implement it.

Whilst this form of testing is useful to a development team to help them implement the code, it does not describe the system from the end user's point of view. This will be addressed in this lab.

Acceptance testing (also known as Functional Testing, UI Testing and Customer Testing) is testing the system from the end-user's perspective, identifying whether its behaviour meets the customer's needs. Since we are developing software that runs in the web browser it naturally follows that our acceptance tests will need to run in a web browser.

There are many benefits of (automated) acceptance testing:

  1. The tests define the correct behaviour of the system in a clear, unambiguous manner.
  2. The tests run automatically, meaning the development team don't need to manually test the system.
  3. The entire suite of tests is run each time, meaning both new defects and regressions get picked up quickly.
  4. The tests are run in precisely the same way every time they are run.

1 Testing With Jest and Puppeteer

Our first version of the acceptance tests will be defined using the same Jest testing framework that we used when writing our unit tests, but at this point the similarity ends. We will need to run our tests in a web browser environment, and the Puppeteer API provides us with a version of the Chrome browser for this purpose.

Start by locating the exercises/03_acceptance/todo/ directory. You can see that it contains the now familiar todo website. Try installing all the dependencies and running the server. You should be able to access the web page in the usual manner at localhost:8080.

Stop the server and close the browser.

1.1 Running the Acceptance Tests

We can run the acceptance tests by using the command:

$ npm run acceptance
  listening on port 8080
    PASS  acceptance tests/ui.test.js
      todo list
        ✓ adding one item (1616ms)

  Test Suites: 1 passed, 1 total
  Tests:       1 passed, 1 total
  Snapshots:   1 passed, 1 total
  Time:        3.95s, estimated 4s

Looking at the console output it is clear that there was a single test in a test suite that has executed and the test has passed, but how? The test must have run in a browser, but where was the browser?

The tests run in a headless browser. This only exists in memory and does not have a graphical representation. There are several benefits:

  1. The tests run really quickly as there is no GUI to render.
  2. You can run the tests on a computer without a GUI such as a server.

Let's take this process a step at a time:

  1. The npm run xxx command runs the script alias in the package.json file.
    1. If you look in the scripts key in this file you will find an alias called acceptance.
    2. This alias runs a shell script called test.sh.
  2. You should be able to find the test.sh script in the project directory. If we open this we will find:
    1. The top line is a shebang line that tells your computer to run the command using the bash shell.
    2. The script then starts the koa server; the & means run it as a background process so the script can continue to the next command without waiting for the server to exit.
    3. Next it runs the jest command, pointing it to the acceptance tests directory.
    4. Finally, once the tests are finished it kills the background process (the koa server).
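Putting those steps together, a script with the structure described above might look like the following sketch. Note this is an illustration only: the exact paths, flags and directory name in the project's real test.sh may differ.

```shell
#!/usr/bin/env bash

# Start the koa server; the trailing & runs it as a background process.
node index.js &

# Run the Jest acceptance test suite (directory name assumed here).
node_modules/.bin/jest "acceptance tests/"

# Finally, kill the background node process (the koa server).
pkill node
```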

Let's take a look at the test suite in the ui.test.js file. This contains the test that is being executed.

  1. On line 4 we are importing the puppeteer package. This is used to control the Chrome web browser.
  2. On line 5 we import a module that allows us to take snapshots (more on this later).
  3. We then define variables to hold the browser and page objects; we will assign values to these later.
  4. On lines 15-19 we configure the snapshot tool that will be used as part of the test.
  5. The test suite starts with a call to the beforeAll() function that runs once before the tests start:
    1. Line 22 launches a _headless_ browser with the specified dimensions. Headless means there is no GUI displayed (it is hidden). The slowMo key defines how many milliseconds of delay Puppeteer should add between its operations.
    2. Next we create a new tab (page) in the browser and make sure the page dimensions are correct.
  6. Line 27 closes the browser after all the tests have completed.

The rest of the file contains the test (lines 29-82). There is a lot to understand here but the code is split into the three steps we have already covered in unit testing: Arrange, Act and Assert:

  1. Arrange
    1. We point the browser at the localhost server on the correct port then refresh the browser.
    2. We then take a screenshot and save it in the screenshots/ directory. This allows us to see what is in the headless browser.
  2. Act
    1. We enter data into the form (the item name and the quantity).
    2. We click on the submit button.
    3. We wait for the next page to load (can we see the top level heading?).
  3. Assert
    1. We check the page title (in the tab) is correct (lines 52-53).
    2. We capture the top level heading and check it is correct (lines 56-60).
    3. We grab an array that contains the text in the first column of the table (lines 66-70).
    4. We check that there is only a single row in the table (line 75).
    5. We check that the first row of the table contains the text "bread".
    6. We grab a second screenshot.

1.1.1 Test Your Understanding

You have seen how to use Puppeteer to write a test to see that we can add a single item. We will now write some more tests. Unlike unit tests, we will be changing the state of the system in each test. The tests will run sequentially (one after the other).

  1. Write a second test to check that we can add more than one item.
  2. Write another test to check that if we add the same item twice we increment the quantity. Note that you will also need to implement this functionality, it does not currently exist!

1.2 Snapshots

The acceptance tests in the previous section interact with the Document Object Model (DOM); they have no way to understand the appearance of the page, as this is influenced by the Cascading Stylesheets being applied to it. Clearly it is important that our acceptance tests should also check that the page renders as it should, and Snapshot Testing enables us to achieve this.

In essence we identify the important screens in the app (the ones we wish to monitor), get the layout the way we expect it to be and capture a screenshot of the page. This screenshot is stored in a __image_snapshots__/ directory and used as a reference whenever the test suite is subsequently run. Since you have already triggered an initial test run, you should be able to locate this directory and examine the image file it contains.

The snapshot is captured using the following two lines of code:

const image = await page.screenshot()
expect(image).toMatchImageSnapshot()

Sometimes you will need to modify the page layout; however, this will cause the snapshot test to fail since the new screenshot will no longer match the stored reference. The solution is to run the test suite with the -u flag added. This will update the snapshot files.

1.2.1 Test Your Understanding

  1. Modify the stylesheet and make the top level heading larger (perhaps 28pt).
  2. Now re-run the acceptance test suite; this will fail because the layout is now different.
  3. Edit the test.sh script, adding the -u flag, save this change and rerun the test suite.
  4. Remove the -u flag and rerun the tests to check these are still passing.

2 Profiling

When the tests were running we included two profiling tools:

  1. A Performance tick file that uses the Chrome Devtools Protocol.
  2. An HTTP Archive (HAR) that tracks information passing between the web browser and the server. This is used to:
    1. identify performance issues such as slow load times and,
    2. identify page rendering problems.

Both of these profilers save their log files to the trace/ directory. These files contain a lot of information stored in JSON format. To visualise this data we need to load it into visualisation tools.

The tick file can be loaded into the Performance tab of the Chrome Dev Tools using the up arrow button.

Performance Analyser

This shows the timing of the various steps and the CPU and memory load. Try hovering over the different elements and scrubbing along the grey bar.

The HAR files can be visualised using the GSuite Toolbox HAR Analyser. Use this to open the .har file and you will see something like the following:

HAR Analyser

There is a lot of information generated. Hover over the different icons to find out more.

An alternative is to use the Network tab of the Chrome DevTools. Clicking on the up arrow as shown allows you to load the HAR file.

Chrome Dev Tools HAR Analyser

There are many more settings you can experiment with to help understand the performance of your web server. Start by looking at this tutorial.

2.1 Test Your Understanding

  1. Look at the graphs and figures that are produced for the todo code profiling and identify the different stages and the time taken for each. Are there any stages where you feel the timing is significantly long?
  2. Run the todo code three times and compare the time taken for each of the stages involved. Do they differ? Is there any reason why this might be the case?
  3. Using the HAR analyser, how long does it take for the server to respond?

2.2 Flame Graphs

Another question often asked by software developers is how the software is consuming resources: what exactly is consuming how much, and how has this changed since the last software version? These questions can be answered using software profilers, tools that help direct developers to optimize their code and operators to tune their environment. Flame graphs are generated from perf output.

The output of profilers can be verbose, however, making it laborious to study and comprehend. The flame graph provides a new visualization for profiler output and can make for much faster comprehension, reducing the time for root cause analysis.

Flame graphs are a way of visualizing CPU time spent in functions. They can help you pin down where you spend too much time doing synchronous operations. In this exercise we will use a package called 0x which has already been added to the devDependencies in the todo project. You can start the profiler using the script alias in the package manifest or by running the command directly:

$ npm run profiler

> todo@1.0.0 profiler /dynamic-websites/exercises/03_acceptance/todo
> 0x -o index.js

🔥  Profilinglistening on port 8080

You should now launch the app in the web browser and add and remove items from the list. This will capture the perf output in a directory ending in .0x with each run generating a new directory. Once you have made use of the todo list, press ctrl+c to quit.

🔥  Process exited, generating flamegraph
🔥  Flamegraph generated in
file://dynamic-websites/exercises/03_acceptance/todo/24526.0x/flamegraph.html

It will then try to open the specified html document in your default browser which will look something like this:

Flame graph

Hovering over a box will reveal more information.

  1. Each box represents a function in the stack (a "stack frame").
  2. The y-axis shows stack depth (number of frames on the stack).
    1. The top box shows the function that was on-CPU. Everything beneath that is ancestry.
    2. The function beneath a function is its parent, just like in a stack trace.
  3. The x-axis spans the sample population.
    1. It does not show the passing of time from left to right, as most graphs do.
    2. The left to right ordering has no meaning (it's sorted alphabetically to maximize frame merging).
    3. The width of the box shows the total time it was on-CPU or part of an ancestry that was on-CPU (based on sample count).
    4. Functions with wide boxes may consume more CPU per execution than those with narrow boxes, or, they may simply be called more often. The call count is not shown (or known via sampling).
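To connect this back to code, here is a hypothetical example (not part of the todo project) of the kind of synchronous, CPU-bound function that would show up as a wide frame in a flame graph, because the sampler keeps finding it on-CPU:

```javascript
// Hypothetical CPU-bound function: repeated synchronous hashing like this
// blocks the event loop, so a sampling profiler such as 0x would keep
// catching it on-CPU and render it as a wide frame in the flame graph.
function hashItem(item, rounds) {
	let digest = 0
	for (let i = 0; i < rounds; i++) {
		// simple rolling hash, repeated to burn CPU cycles synchronously
		for (const ch of item) {
			digest = (digest * 31 + ch.charCodeAt(0)) % 1000003
		}
	}
	return digest
}

const digest = hashItem('bread', 100000)
```

If a frame like hashItem dominated your graph, the remedy might be to cache the result or move the work off the main thread; the flame graph tells you where to look, not how to fix it.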

2.2.1 Test Your Understanding

  1. Looking at the flame graph, which process is consuming the most CPU cycles? Is it possible to optimise this process to reduce the cycles?
  2. Looking at the flame graph, what is the hot path through the code?

3 Cucumber

Earlier in this lab you wrote a series of acceptance tests to ensure that your software (web application) met the user requirements. These tests were written in JavaScript using the Jest testing framework. Whilst these tests are understandable by the developers, they can't be understood by the client/customer, who should be involved in defining them.

In this section of the lab you will learn about some tools that allow the customer to work with the development team to define the requirements. Key to this process is a domain-specific language called Gherkin.

3.1 Gherkin

Gherkin is the format for Cucumber specifications. It is a domain-specific language which helps you to describe business behaviour without the need to go into the detail of the implementation. This text acts as documentation and as the skeleton of your automated tests. Gherkin is based on the Treetop grammar, which exists in 37+ languages; therefore you can write your Gherkin in 37+ spoken languages.

This script serves two primary purposes:

  1. It documents user scenarios.
  2. It defines the skeleton of automated tests (BDD).

Gherkin files are located in the features/ directory and have the file extension .feature. You can find one of these in the exercises/03_acceptance/cucumber/features/ directory. Let's go through this to understand the syntax:

Feature: Adding Items
	The user should be able to add a single item to the list.

	Scenario: add item via webpage
		Given The browser is open on the home page
		When I enter "bread" in the "item" field
		When I enter "42" in the "qty" field
		When I click on the submit button
		Then the heading should be "ToDo List"
		Then the list should contain "1" row
		Then the item should be "bread"
  1. The feature file starts with the Feature keyword. Each file should have one feature that summarises the feature being tested.
  2. Underneath this you will find one or more Scenario keywords. These express the expected behaviour of the system being tested, with the steps required to demonstrate this listed (indented) underneath, following the AAA pattern you have already seen in your unit tests:
    1. The Given keyword denotes one or more steps used to set up (Arrange) the system. In the example above it is important that the browser is pointing to the correct web page.
    2. Then there are one or more When keywords that describe the actions to take place in order to demonstrate the functionality.
    3. Finally there will be one or more Then keywords that are used to check (or Assert) the state of the system.

Using this language you can describe the desired functionality of the system in a way easily understood by both the developer and the customer.

3.1.1 Test Your Understanding

Create .feature files to describe the following functionality:

  1. Adding more than one type of item (add_multiple.feature).
  2. Adding duplicate items should increase the quantity, not add a duplicate row (add_duplicates.feature).
  3. Adding then deleting an item (delete.feature).

3.2 Cucumber

Cucumber is a software tool that parses Gherkin feature files and converts these into a sequence of acceptance tests that can be automatically run as part of an automated testing suite. It was originally written in the Ruby programming language but now supports a range of languages including:

  1. NodeJS using the cucumber-js package.
  2. .NET using the SpecFlow tools.
  3. Java using the cucumber-jvm package.

You can run the cucumber tests using the cucumber script alias in the package manifest. If you do this you will see the following:

$ npm run cucumber

> todo@1.0.0 cucumber /dynamic-websites/exercises/03_acceptance/cucumber
> ./cucumberTest.sh

listening on port 8080
.......

1 scenario (1 passed)
7 steps (7 passed)
0m00.752s

As in the first exercise you are running a shell script that starts the koa server, runs the tests then kills the server. Take a look at the cucumberTest.sh file.

Key to the operation of Cucumber are Step Definitions. These are functions that are triggered by the steps in the .feature files and are typically located in the steps/ directory. The project has a single step definition file called todo.steps.js; note that the files must end in .steps.js to be recognised as step definition files. Let's open the todo.steps.js file and take a look...

  1. We import:
    1. Three functions from the cucumber package (Given, When and Then).
    2. The assert module so we can validate our data.
    3. The page.js module which contains a custom object prototype to handle our browser and page objects.
  2. The rest of the file contains a series of calls to our three functions. Each takes two parameters:
    1. The string matcher. This is used to match each step in our .feature files.
    2. A callback function. Note that some of the string matchers have variable placeholders {}; these values are assigned to the parameters in the callback function.

When we run our test, each step in the feature file is compared to the string matchers from each of the functions in the step definition file until a match is found. When this happens the callback function is called with the variables identified in the string matcher. An example will help illustrate how this works:

Assume the step:

When I enter "bread" in the "item" field

When the test reaches this line it matches to the function:

When('I enter {string} in the {string} field', async(value, field) => {
	await page.click(`#${field}`) //field represents the id attribute in html
	await page.keyboard.type(value)
})

The callback function is called with two parameters:

  1. The argument bread is passed to the value parameter.
  2. The argument item is passed to the field parameter.
  3. This triggers the following actions:
    1. The input with an id of item is selected in the form.
    2. The text bread is typed in, one keystroke at a time (b r e a d).

Take time to read through all the matchers in the steps file. Whilst these can handle the steps that are currently listed in the .feature file, as you add more steps you will need to add more matchers...
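To illustrate the matching mechanism itself, the sketch below mimics, in heavily simplified form, how an expression containing {string} placeholders can be reduced to a regular expression that captures the quoted arguments. This is not the real cucumber-js implementation, just a toy model of the idea:

```javascript
// Toy model of Cucumber expression matching (not the real cucumber-js code).
// Each {string} placeholder becomes a capture group matching a quoted value.
function compileExpression(expression) {
	const pattern = expression.replace(/\{string\}/g, '"([^"]*)"')
	return new RegExp(`^${pattern}$`)
}

// Returns the captured arguments if the step matches, otherwise null.
function matchStep(expression, step) {
	const match = compileExpression(expression).exec(step)
	return match === null ? null : match.slice(1)
}

const args = matchStep(
	'I enter {string} in the {string} field',
	'I enter "bread" in the "item" field'
)
// args now holds the two arguments that would be passed to the callback:
// 'bread' (value) and 'item' (field)
```

The real library is considerably more sophisticated (it supports {int}, {word}, custom parameter types and so on), but the principle is the same: each step is compared against the compiled matchers until one succeeds, and the captured values become the callback's arguments.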

3.2.1 Output Formats

Currently the console displays a dot for each step executed plus a summary. Whilst this is fine for a quick check, it does not convey very much information. The cucumber-js package comes with a number of formatters that render the test results in different ways. The default formatter is called progress; however there are several more. One of the more useful ones is the usage formatter, which prints details about which step definitions are used by which steps. Open the cucumberTest.sh file and modify it as follows:

#!/usr/bin/env bash

node index.js&
node_modules/.bin/cucumber-js --format usage ./features -r ./steps &
sleep 2
pkill node

When you run the tests you will see the following:

npm run cucumber

> todo@1.0.0 cucumber /dynamic-websites/exercises/03_acceptance/cucumber
> ./cucumberTest.sh

listening on port 8080
┌────────────────────────────────────────┬──────────┬─────────────────────────────┐
│ Pattern / Text                         │ Duration │ Location                    │
├────────────────────────────────────────┼──────────┼─────────────────────────────┤
│ The browser is open on the home page   │ 596ms    │ steps/todo.steps.js:31      │
│   The browser is open on the home page │ 596ms    │ features/add_one.feature:5  │
├────────────────────────────────────────┼──────────┼─────────────────────────────┤
│ I enter {string} in the {string} field │ 151ms    │ steps/todo.steps.js:35      │
│   I enter "bread" in the "item" field  │ 218ms    │ features/add_one.feature:6  │
│   I enter "42" in the "qty" field      │ 84ms     │ features/add_one.feature:7  │
├────────────────────────────────────────┼──────────┼─────────────────────────────┤
│ I click on the submit button           │ 80ms     │ steps/todo.steps.js:40      │
│   I click on the submit button         │ 80ms     │ features/add_one.feature:8  │
├────────────────────────────────────────┼──────────┼─────────────────────────────┤
│ the item should be {string}            │ 9ms      │ steps/todo.steps.js:61      │
│   the item should be "bread"           │ 9ms      │ features/add_one.feature:11 │
├────────────────────────────────────────┼──────────┼─────────────────────────────┤
│ the heading should be {string}         │ 8ms      │ steps/todo.steps.js:44      │
│   the heading should be "ToDo List"    │ 8ms      │ features/add_one.feature:9  │
├────────────────────────────────────────┼──────────┼─────────────────────────────┤
│ the list should contain {string} row   │ 8ms      │ steps/todo.steps.js:52      │
│   the list should contain "1" row      │ 8ms      │ features/add_one.feature:10 │
└────────────────────────────────────────┴──────────┴─────────────────────────────┘

There are nine formatters at the time of writing and they are detailed in the CLI Documentation.

3.2.2 Test Your Understanding

  1. Start by testing each of the different output formats, using the CLI Documentation to help you. You may need to use a number of these when debugging your tests.
  2. Now implement the step definitions so that the features you wrote in step 3.1.1 run as automated tests.