Help with concurrent test automation
prof_braino
Posts: 4,313
We are having some trouble handling automation for concurrent tests.
Four Telnet sessions are connected to four cogs. The output from each serial terminal is directed to a console (the default serial connection on pins 30 & 31).
When multiple terminals are open and commands are executed, the results are sent to the console for logging. Since the cogs run at slightly different times relative to the PC, the results from each terminal session may arrive at the PC in a different order on each run. This is a complication, because our check is whether the current run's log file is identical to the previous run's log file; anything not identical must be checked manually for errors. Having to manually check output that is correct but merely out of order defeats the point of automated testing.
We think we want to get the concurrent tests to respond in the same order every time. Any ideas or suggestions?
Comments
Thanks for the input. That might be an option, but it would be a bunch of extra work. We would have to (artificially) decide on an order, then massage the data until it fits. This would quickly become hard to maintain, so we are looking for some other approach.
What we want is for the system to respond in whatever order it likes, but to keep doing so the same way of its own accord until something changes it, rather than being forced into a particular order.
This is kind of a toughie.
A little clarification and an example would be helpful. It sounds like you are saying that you want the system to always respond in the same order as it did initially. If this is the case you could store the initial order of the cogs, buffer the data, and send it out in that order.
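The buffer-and-replay idea above can be sketched on the host side. This is a minimal illustration, assuming the logger can tell which cog each line came from (the `COG_ORDER` list and the `(cog_id, line)` tuple shape are assumptions, not part of the original setup):

```python
# Sketch: buffer each cog's output, then emit it in a fixed, stored order.
from collections import defaultdict

COG_ORDER = [1, 2, 3, 4]  # the order captured from the initial run

def replay_in_order(messages):
    """messages: list of (cog_id, line) tuples in arrival order.
    Returns the same lines regrouped into the fixed cog order."""
    buffers = defaultdict(list)
    for cog_id, line in messages:
        buffers[cog_id].append(line)
    ordered = []
    for cog_id in COG_ORDER:
        ordered.extend(buffers[cog_id])
    return ordered
```

With this, two runs whose cogs finish in different orders still produce identical logs, at the cost of buffering until all cogs have reported.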
The test automation is built around the idea that identical software will produce identical results each time it is executed. Each time we add 2 + 5 and log the result, it will be 7. When we print hex 61, the result logged will always be "a", etc. Executing an entire test suite will produce an identical log. The check is simple: we just check whether the log file is a character-by-character match with last time's. Anything that is not an exact match is flagged for manual checking. When a change is made to the software, certain changes will be expected in the log files (the first time), and any unexpected changes are analyzed for correctness.
This works really well, we just have to recognize that some items are always going to be different and figure out a way to work with this. For example, time stamps will always be different each run. The state of undefined pins will be different. So we can divide the test into sections where the execution results (always identical) are in one section and the time stamp, etc. is in a different section, so it's easy to tell when non-identical bits are an error and when they are acceptable.
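The section-based comparison described above might look roughly like this on the host. The section marker format (`== SECTION name ==`) and the `volatile` list are hypothetical, a sketch of the idea rather than the actual log layout:

```python
# Sketch: split a log into named sections, then compare only the
# deterministic sections between runs; volatile sections (time stamps,
# undefined pins) are skipped.
def split_sections(log_text):
    sections, current = {}, None
    for line in log_text.splitlines():
        if line.startswith("== SECTION "):
            current = line.strip("= ").split()[-1]  # e.g. "exec"
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections

def compare_runs(old_log, new_log, volatile=("timestamp",)):
    """Return the names of deterministic sections that differ."""
    old, new = split_sections(old_log), split_sections(new_log)
    flagged = []
    for name in old:
        if name in volatile:
            continue
        if old.get(name) != new.get(name):
            flagged.append(name)
    return flagged
```

Only the sections returned by `compare_runs` would need operator attention; a differing time stamp section never raises a flag.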
The problem gets a little more interesting with the concurrency tests. We have four terminals open, one each to cogs 1, 2, 3, and 4. When the stack overflow test (for example) is executed (on all four cogs at the same time), the error message is sent to the main console terminal on cog 6, since cogs 1 through 4 don't have logging and/or are unable to output due to the error condition being tested. So far so good. The issue is that the terminal responses may come back (on cog 6) in a different order each time, depending on when in the round robin the test is launched. Even though the results may be correct for each terminal, being out of order makes the simple "identical check" fail, and the section is flagged for manual user evaluation. Not a huge issue, but the point of the automated test is to let us skip over the stuff that behaves as expected. The concurrent test results are "right" but in a different sequence, which is entirely expected and correct. The goal is to have the automation ONLY flag items that are not as expected; we need to keep operator intervention to zero if there is no error.
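One low-maintenance way to accept any interleaving while still catching wrong or missing output: for the sections known to be concurrent, compare the lines as a sorted multiset rather than as a sequence. A minimal sketch of that comparison, assuming the concurrent section's lines have already been extracted:

```python
# Sketch: order-insensitive comparison for concurrent-test sections.
# Sorting both runs' lines before comparing accepts any arrival order
# but still fails if a line is wrong, missing, or duplicated.
def concurrent_sections_match(old_lines, new_lines):
    return sorted(old_lines) == sorted(new_lines)
```

This keeps the check almost as simple as the character-by-character match, and requires no changes to the cogs themselves.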
As we do more concurrency tests, the issue becomes a bigger deal. We could bite the bullet and do it manually, or write code to do filtering, or make the test smarter, but anything we write is another item that needs to be maintained and is prone to errors. Right now, the tests do not have anything that is not already part of the kernel generation process. It's just scripts and logging, as used in the kernel generation process, to run more scripts and logging.
The goal is to find a way to structure what we're doing so we don't need to support code beyond the tests themselves. The challenge is keeping it simple while getting the job done, through design.
We may have to add an ID to each message to show which cog was the issuer, and log these in separate files. This would solve the order issue, but would increase the complexity of the code and slow down the messages. This might be a last resort.
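If we did go the ID route, the host-side demultiplexing could be quite small. The `cogN: ` prefix below is an assumed tag format, purely illustrative, not the actual protocol:

```python
# Sketch: split a merged console log into one log per cog, using an
# assumed "cogN: " prefix on every tagged message.
import re

TAG = re.compile(r"^cog(\d+): (.*)$")

def demux(console_lines):
    """Return {cog_id: [lines]} from a merged console log;
    untagged lines are ignored."""
    per_cog = {}
    for line in console_lines:
        m = TAG.match(line)
        if m:
            per_cog.setdefault(int(m.group(1)), []).append(m.group(2))
    return per_cog
```

Each per-cog file is then deterministic and can go through the usual identical check, though as noted, tagging every message costs complexity and speed on the cog side.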
Is there a nicer way?