Source: nearForm Blog

Midscene.js: Assessing a natural language AI testing tool

We see how Midscene.js stacks up against traditionally coded Playwright tests

With AI flooding our feeds, whatever we do for work, it is difficult to pick out the tools that stand head and shoulders above the rest. In test engineering and software development, code generation tools are everywhere. Staying with the AI theme, we’ve taken a different direction and looked at a tool that does NOT generate code, but instead uses natural language to carry out the test. Midscene.js is a tool that integrates with Playwright or Puppeteer. In this article, we put it through its paces, seeing how it stacks up against traditionally coded Playwright tests in terms of speed, and observing how it handles various test scenarios.

As AI continues to transform testing tools, this legitimate first take on natural language for writing test automation marks a promising development. Midscene.js and similar tools need to move beyond novelty status and become usable tools in the test engineer’s toolbox. To get there, major improvements to speed, and the ability for the AI to interpret the DOM alongside screenshots, are a must.

How does Midscene.js work?

Based on the diagram below, we can see that there is no direct interaction under the hood with the underlying APIs. This means that these tests can only work from what the tool derives from the frontend.

[Diagram: how Midscene.js works]

How Midscene.js measured up

Let’s get to testing Midscene.js. We used three scenarios, each of which could represent a small part of an entire web app:

- A simple login page
- An add/remove elements page
- Nintendo.com and Amazon.com for real-world, e2e scenarios

What is very exciting about this AI tool is the ability to use natural language: simply describe what the test step is trying to do, and off it goes. With code generation almost standard these days, this took the wow factor to another level.
Login

For this first example, we used Nearform’s UI Testing Playground, a great place to take a tool through basic as well as more advanced tasks and tests on a page. The UI Testing Playground also lets anyone sharpen their UI testing practice.

Here we have the login scenario. We’ve used this as a basic test to measure:

- Ease of use/readability
- Execution time
- Maintainability

For comparison, here’s what it looks like in Playwright, written the “traditional way”:

```typescript
test("Can log in to the page with valid credentials", async ({ page }) => {
  const loginPage = new LoginPage(page)
  await loginPage.goto()
  await page.waitForLoadState()
  await loginPage.logIn(user, password)
  await loginPage.checkLoginResult("success")
  await loginPage.logout()
})
```

Now, in natural language, powered by AI using Midscene.js:

```typescript
test.beforeEach(async ({ page }) => {
  await page.goto("https://nearform.github.io/testing-playground/#/login-form")
  await page.waitForLoadState()
});

test("Can log in with valid credentials", async ({ ai, aiAssert }) => {
  await ai("Fill in username field with the value admin")
  await ai("Fill in password field with the value Passw0rd!")
  await ai("Click on the Login button")
  await aiAssert("Check that the user has successfully logged in")
  await ai("Click on the Logout button")
  await ai("Check that the username and password fields are visible")
})
```

Slightly longer, but much more readable for, say, a non-technical member of a team.

Takeaways from this scenario:

Ease of writing/readability:
➕ Write it as you think it should execute, which is its strongest point. We found that if a prompt didn’t work, we could just try another, simpler way to write it. Midscene.js also supports several languages (English, French, Chinese). Definitely some wow factor here.
➖ Can look awfully long to read for developers.
➖ You can also see that there is no way to hide the username and password in natural language.
You need to send those directly in the prompt, whereas without Midscene.js, you can keep them in a separate data file.

Execution time:
➖ Because this is powered by OpenAI and screen captures, execution was very slow. The AI needed time to plan and “think” about what it needed to do. The report and JSON output clearly show how long each task takes.
➖ Locating elements by role is a recommended practice in Playwright, and inherently tests the accessibility (a11y) of the web app. As Midscene.js uses screenshots, it moves away from this practice.

For comparison, running the code above both ways gave these results:

- With Midscene.js: 45.8 seconds
- Without Midscene.js: 1.9 seconds

That makes the Midscene.js run about 24 times slower. The time we may gain by writing in natural language is minimal once we consider how the run time of this test accumulates when it runs on a recurring basis.

Maintainability:
➕ If this login page were refactored (test IDs, accessibility tags), the test would likely still do what it needs to do correctly. Playwright alone would require some maintenance.
➖ If the test itself needs maintenance, debugging is trial and error: you change the way you prompt the test, with only the test report and JSON dumps to work with.
➖ You may also notice that the Playwright test is written in Page Object Model form, which helps keep it maintainable and scalable. With Midscene.js, writing the steps directly in the test file is the whole point, so while writing in natural language is fast and easy, making modifications across multiple test files would be a nightmare.

We also added a negative case to see how Midscene.js would handle it. We changed the password to trigger an invalid set of credentials:
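The negative test itself is not reproduced here. As an illustration only, and not the authors’ code, the sketch below shows one way to soften two of the drawbacks noted above: prompt wording scattered across test files, and credentials typed into the test source. Every name in it (`LoginCredentials`, `credentialsFrom`, `loginSteps`, `TEST_USER`, `TEST_PASSWORD`, and the wrong password value) is hypothetical. It also does not hide anything from the model: the credentials still end up in the prompt text, they are only kept out of the test file.

```typescript
// Hypothetical helpers for illustration; none of these names come from
// Midscene.js or from the article.

interface LoginCredentials {
  username: string
  password: string
}

// Reads credentials from an environment-like object (for example process.env).
// TEST_USER and TEST_PASSWORD are assumed variable names.
function credentialsFrom(env: Record<string, string | undefined>): LoginCredentials {
  const username = env["TEST_USER"]
  const password = env["TEST_PASSWORD"]
  if (!username || !password) {
    throw new Error("TEST_USER and TEST_PASSWORD must be set")
  }
  return { username, password }
}

// Builds the natural-language login steps in one place, so a wording change
// is made once rather than in every test file.
function loginSteps({ username, password }: LoginCredentials): string[] {
  return [
    `Fill in username field with the value ${username}`,
    `Fill in password field with the value ${password}`,
    "Click on the Login button",
  ]
}

// A literal object stands in for process.env so the sketch runs on its own.
const valid = credentialsFrom({ TEST_USER: "admin", TEST_PASSWORD: "Passw0rd!" })

// Positive and negative cases differ only in the password passed in.
const invalid: LoginCredentials = { ...valid, password: "not-the-password" }

console.log(loginSteps(valid))
console.log(loginSteps(invalid))
```

In a Midscene.js test, each returned string would be passed to the `ai` fixture used earlier, for example `for (const step of loginSteps(invalid)) await ai(step)`, followed by an `aiAssert` describing the expected error state.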
