What is Selenium?
I like to call Selenium the ultimate browser automation API. Often called WebDriver, it is an open-source API designed for developers and testers to emulate user actions accurately against a browser, with extensive element identification options including CSS selectors and XPath. Selenium WebDriver stands out as a flexible, powerful solution for web testing.
Who Created WebDriver?
Now meet Simon Stewart, the creator of Selenium 2, also known as WebDriver.
If you are involved in test automation at any level, you’ve probably heard of it.
It’s kind of a big deal.
How Does the Creator of Selenium WebDriver Explain It?
Simon summarized Selenium as:
- A library for browser automation
- Provided for almost every programming language
- You must manage the browser version yourself
- Does not have a built-in framework for actually running tests
- Relies on language-provided tools like Jasmine and JUnit
First and foremost, it is not a standalone testing tool or testing framework.
Selenium WebDriver is an open-source API that allows you to programmatically interact with a browser on an operating system the way a real user would.
Although it is primarily used for browser testing of web applications, it can also be used for any task where you need browser automation.
WebDriver tests can be created programmatically in several different programming languages. You can also configure which browser to use for your test run.
Some folks also call it Selenium 2.
In a nutshell, it's an API (not a testing framework) that allows you to open a browser window and drive it using keyboard and mouse emulations just like a real user would.
To automate interaction with the fields of your application, you can identify the elements under test in several ways:
- By Class Name
- By CSS Selector
- By Id
- By Link Text
- By Name
- By Partial Link Text
- By Tag Name
- By XPath
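As a rough illustration, here is how those eight strategies look in the Python bindings. The strategy strings are the values behind Selenium's `By` constants (for example, `By.CSS_SELECTOR` is the string `"css selector"`). The sample values (`username`, `btn-primary`, `Forgot your password?`, and so on) and the helper name `find_by_each_strategy` are made up for this sketch and would need to match your own page.

```python
# Locator strategies as (strategy, value) pairs. Each strategy string is the
# value of the matching By constant in Selenium's Python bindings.
# All sample values are hypothetical; replace them with ones from your page.
SAMPLE_LOCATORS = {
    "By Class Name": ("class name", "btn-primary"),
    "By CSS Selector": ("css selector", "form#login input[type='submit']"),
    "By Id": ("id", "username"),
    "By Link Text": ("link text", "Forgot your password?"),
    "By Name": ("name", "password"),
    "By Partial Link Text": ("partial link text", "Forgot"),
    "By Tag Name": ("tag name", "h1"),
    "By XPath": ("xpath", "//button[text()='Sign in']"),
}


def find_by_each_strategy(driver, locators=SAMPLE_LOCATORS):
    """Look up one element per strategy and return them keyed by strategy name.

    `driver` is expected to be a Selenium WebDriver instance (for example
    webdriver.Chrome()) that already has the page under test loaded.
    """
    return {
        label: driver.find_element(strategy, value)
        for label, (strategy, value) in locators.items()
    }
```

In practice, Id and CSS selectors tend to be the most readable choices, with XPath kept for cases the others cannot express.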
What is Selenium 4?
As you know, the open-source project is continually being updated.
Selenium 4 is the latest advancement in automated browser testing, introducing significant updates and new features to enhance testing efficiency and effectiveness. It is composed of three main components: Selenium WebDriver, Selenium IDE, and Selenium Grid, each playing a crucial role in the Selenium test project.
Selenium WebDriver is an open-source API that allows for browser automation in a way that mimics real user interactions, while Selenium IDE offers a record and playback tool for automation, and Selenium Grid enables the distribution of tests across multiple machines to save time.
To see the key updates in Selenium 4 check out my post on What is Selenium 4? The Latest in Automated Browser Testing.
Also check out our free course by Simon Stewart on Selenium 4 Quickstart.
Can Selenium Test APIs?
Remember the keyword here is a web browser.
So is it just for testing browser applications? YES!
It doesn't support automating thick-client applications or APIs.
Architecture of WebDriver
There are four main parts that make up the architecture.
- First, you have the API which is a set of commands you can use to interact with WebDriver
- Next, you have a library that contains the API as well as language-specific implementations (bindings) for common languages like Java, Python, C#, etc.
- To actually control the web browser under test you have a driver. These drivers are ideally created by the browser vendor themselves following the W3C standard.
- Finally, you have a framework that contains the Selenium API as well as other libraries needed to create a fully functional automation framework like test runners, logging, test data creation etc.
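To make the layering a little more concrete, the sketch below shows roughly what a language binding hands to the browser driver for a few common commands. The endpoint paths and the `using`/`value` payload follow the W3C WebDriver specification. The `WireCommand` type, the helper names, and the session and element IDs are invented for illustration, and real bindings do considerably more (capability negotiation, error handling, the HTTP transport itself).

```python
from dataclasses import dataclass, field


@dataclass
class WireCommand:
    """One HTTP request a binding would send to the browser driver (hypothetical type)."""
    method: str
    path: str
    body: dict = field(default_factory=dict)


def new_session(browser_name):
    # Ask the driver to start a browser and hand back a session ID.
    capabilities = {"alwaysMatch": {"browserName": browser_name}}
    return WireCommand("POST", "/session", {"capabilities": capabilities})


def navigate(session_id, url):
    # Equivalent of driver.get(url) in the API layer.
    return WireCommand("POST", f"/session/{session_id}/url", {"url": url})


def find_element(session_id, using, value):
    # `using` is a locator strategy such as "css selector" or "xpath".
    return WireCommand(
        "POST", f"/session/{session_id}/element", {"using": using, "value": value}
    )


def click(session_id, element_id):
    # Acts on an element reference returned by an earlier find_element call.
    return WireCommand("POST", f"/session/{session_id}/element/{element_id}/click")


def quit_session(session_id):
    # Closes the browser and ends the session.
    return WireCommand("DELETE", f"/session/{session_id}")
```

The API and library layers build requests like these, the driver executes them inside the browser, and the framework layer (test runner, logging, test data) sits on top and never needs to know about the wire format.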
What Programming Languages Can You Use to Create Your Selenium WebDriver Tests?
As you've seen, this testing library is really just an API that was developed to drive the browser under test as if a real user were interacting with a site.
One benefit of this is that you can use pretty much whatever programming language you want to code your automated tests. Here are the language bindings currently supported by Selenium:
- Selenium WebDriver Ruby
- Selenium WebDriver C#
- Selenium WebDriver Java
- Selenium WebDriver JavaScript
- Selenium WebDriver Python
Notice I said "supported." You can also find non-supported implementations of the Selenium WebDriver protocol for more exotic languages like Haskell.
Running Selenium against Browsers
In order to run your Selenium tests against different browsers, you will also need the browser-specific executables that WebDriver uses to control each browser.
- Chrome: ChromeDriver
- Firefox: GeckoDriver
- Microsoft Internet Explorer: IEDriver
- Microsoft Edge: EdgeDriver
- PhantomJS (headless): GhostDriver
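Here is a minimal sketch of how that browser-to-driver pairing shows up in the Python bindings. The class names (`Chrome`, `Firefox`, `Edge`, `Ie`) come from `selenium.webdriver`, and the executable names are the usual file names of each driver. The `SUPPORTED_BROWSERS` table and the `start_browser` helper are names invented for this example, and GhostDriver is left out of the sketch.

```python
# Browser name -> (Selenium Python class name, driver executable it talks to).
SUPPORTED_BROWSERS = {
    "chrome": ("Chrome", "chromedriver"),
    "firefox": ("Firefox", "geckodriver"),
    "edge": ("Edge", "msedgedriver"),
    "ie": ("Ie", "IEDriverServer"),
}


def start_browser(name):
    """Start a local browser session for `name` and return the WebDriver instance."""
    key = name.strip().lower()
    if key not in SUPPORTED_BROWSERS:
        raise ValueError(f"Unsupported browser: {name!r}")
    # Imported here so an unsupported name fails before Selenium is needed.
    from selenium import webdriver

    class_name, _executable = SUPPORTED_BROWSERS[key]
    # The matching driver executable must be available for this call to succeed.
    return getattr(webdriver, class_name)()
```

A test suite could read the browser name from an environment variable and call `start_browser` once per run, which keeps the tests themselves browser-agnostic.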
Why was Selenium WebDriver Created?
When I asked Simon about the origins of this solution, he said it actually began as an implementation of the facade design pattern.
There were other tools out there, and they were perfectly fine to help automate browsers, but their APIs were clunky.
Additionally, some of the existing tools grossly violated the rules of object orientation. It was born out of the desire to have an easy-to-use, readable API that also followed solid, object-oriented principles.
Simon Stewart talks about the past, present, and future of Selenium in my interview with him at SauceCon:
What about Selenium Remote Control (RC)?
Before Selenium WebDriver there was the original Selenium, which was based on using JavaScript to control a browser’s actions.
Jason Huggins created the original version while working at ThoughtWorks, and it was a revolutionary invention at the time.
But as some of you may remember, Selenium started to become really slow as time went on. It was unstable and had tons of bugs, and it became pretty clear that the JavaScript sandbox was going to become a limiting factor for the project at some point.
Soon after, Simon decided to develop the new version up to a point where it could replace the original.
What about Selenium IDE?
On August 8, 2017, Firefox released version 55.0, and Selenium IDE officially died.
From Firefox 55 onwards, Selenium IDE no longer worked.
But about a year ago I began hearing another rumor, about Applitools taking up the cause to resurrect Selenium IDE.
So if you haven't heard yet, there is now a new and improved version of Selenium IDE. Read all about it in my post the Stunning Return of Selenium IDE (Sweet Dream or Nightmare).
Selenium IDE can be run from your CI/CD pipelines using the command line runner.
What about Selenium Grid?
Using Selenium Grid allows you to save time by spreading your tests across multiple machines, each running its own browser driver sessions.
This means you can run your tests in parallel, which reduces the total time it takes to run your full automation test suite.
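A minimal sketch of that idea in Python, assuming a Selenium 4 Grid is already running at `http://localhost:4444` (an assumption; adjust the URL to your own hub). `webdriver.Remote` and `ChromeOptions` are part of the Selenium Python bindings; the helper names `open_grid_session`, `page_title`, and `run_in_parallel` are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor

GRID_URL = "http://localhost:4444"  # assumption: a Selenium 4 Grid running locally


def open_grid_session(grid_url=GRID_URL):
    """Request a Chrome session from the Grid, which picks a node to run it on."""
    from selenium import webdriver  # only needed when talking to a real Grid

    return webdriver.Remote(
        command_executor=grid_url, options=webdriver.ChromeOptions()
    )


def page_title(url, open_session=open_grid_session):
    """A tiny 'test': load `url` in its own browser session and return the title."""
    driver = open_session()
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()  # always release the Grid slot, even if the test fails


def run_in_parallel(urls, workers=4, open_session=open_grid_session):
    """Run page_title for every URL concurrently, one Grid session per URL."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda url: page_title(url, open_session), urls))
```

Most test runners offer their own parallel mode, so in a real suite the thread pool would usually be replaced by the runner's option; the part that matters is that each test opens its own remote session.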
If you want to learn more check out my post on Selenium Grid Getting Started Guide (Plus 2 Must Use Helper Tools)
What is the Future of Selenium?
Modern web browsers are getting more and more complicated, and not all of them are open source.
Many of the changes the Selenium open-source developers want to make will require fairly deep, privileged integration with the browser, and the only people who are really in a position to do that are the browser vendors.
So the next milestone is clearly the W3C spec.
I think the most interesting thing about the W3C WebDriver spec is that it's where the Selenium project ceases to be about one obscure body of open-source developers and becomes an industry standard, which in turn makes it incredibly hard for a single individual to exert any form of control over it.
This is a good thing.
Also, if you're wondering about Selenium Server: it is needed in order to run Remote WebDriver. Selenium 3.x is no longer capable of running Selenium RC directly; instead, it does so through emulation and the WebDriverBackedSelenium interface.
Since Selenium is going to be a Standard, Why isn’t Everyone Using It?
There are a host of new tools out there that for some reason don’t leverage Selenium WebDriver.
It seems to me they are trying to reinvent something that already exists.
Rather than trying to replicate the WebDriver functionality that already exists, I’d like to see them utilize the WebDriver that’s baked into the browsers.
In my opinion, they’d be better off focusing on functionality like getting accurate timing information, setting breakpoints, and seeing when errors are thrown in JavaScript. They should let the WebDriver spec be responsible for controlling the browser and doing things like keyboard and mouse simulation.
The Rest is History
What Simon initially thought was going to be a few months of work has turned into a ten-year project with multiple contributors changing the way we automate browsers.
Simon's Actionable Automation Advice for Using Selenium WebDriver
Simon's first piece of actionable Selenium WebDriver advice: if you have an XPath that is literally a path through your document (html, body, and so on), you're doing it wrong! Stop and think. There's a better way of doing this.
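To show the difference, here is a small sketch with two XPath expressions for the same hypothetical "Sign in" button, plus a simple check you could run over a suite's locators. Both expressions and the `is_document_path` helper are invented for this example, and the check is only a heuristic.

```python
# Two hypothetical locators for the same "Sign in" button.
BRITTLE_XPATH = "/html/body/div[2]/div/form/div[3]/button"
STABLE_XPATH = "//form[@id='login']//button[@type='submit']"


def is_document_path(xpath):
    """Return True when the XPath walks down from the document root.

    Heuristic: an absolute path starts with a single "/", while a relative
    search that anchors on an attribute usually starts with "//".
    """
    stripped = xpath.strip()
    return stripped.startswith("/") and not stripped.startswith("//")
```

The brittle version breaks as soon as a wrapper element is added anywhere above the button, while the second one only depends on the form's id and the button's type.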
The second piece of advice: WebDriver.findElement returns an element, and you can keep hold of that reference.
Often I look at people's tests, and they've got driver.findElement, the same element, send keys, then driver.findElement, the same element, submit, and you scratch your head.
Each WebDriver call is an RPC, and if you're using a system like Sauce, for example, you've just gone out over the Internet. The thing is, you can just keep hold of the element reference and you're good.
So really that should be WebDriver.findElement, keep hold of that reference, then element.clear, element.sendKeys, element.submit, and you've reduced the amount of work you're doing. You'll have sped up the tests, and if you find stale element exceptions being thrown, that means the application is changing state in a way that you weren't expecting.
And as a tester that should just set off all sorts of alarm bells, or as a developer it should be setting off alarm bells, like: how come I don't know the state of the application? So that stale element exception is not a problem. It's an opportunity to go and find out more about what is going on in the application.
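Sketched in Python, where the methods are spelled `find_element`, `clear`, `send_keys`, and `submit`, the two styles look like this. The search box named `q` and both function names are assumptions made for the example; `"name"` is the string behind `By.NAME`.

```python
def submit_search_wasteful(driver, text):
    """Look the same element up three times: three lookups, three round trips."""
    driver.find_element("name", "q").clear()
    driver.find_element("name", "q").send_keys(text)
    driver.find_element("name", "q").submit()


def submit_search(driver, text):
    """Look the element up once and keep hold of the reference."""
    box = driver.find_element("name", "q")
    box.clear()
    box.send_keys(text)
    box.submit()
```

Both functions do the same thing for the user, but the second makes one lookup instead of three. If the page re-renders the search box between calls, the second version raises a stale element exception, which is exactly the signal worth investigating.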
How could Selenium testing evolve to include more user experience (UX) testing?
Selenium testing could evolve to incorporate more user experience (UX) testing by adopting additional metrics and techniques focused on evaluating the functionality, usability, and overall user satisfaction of applications. This could involve integrating UX-specific test cases that assess elements such as layout consistency, navigation intuitiveness, visual design coherence, and interactive responsiveness. By expanding its evaluation criteria beyond pure functionality, Selenium testing can help ensure that applications not only work correctly but also offer users a cohesive and user-friendly experience.
But let's be honest.
Selenium shines in validating functional aspects of web applications. However, when it comes to the nuanced terrain of UX testing, its light dims in comparison to tools better suited for the task.
For instance, while Selenium ensures that a button clicks as intended, integrating it with tools like Applitools is what brings the capability to scrutinize the visual appeal and consistency of UI elements across different states and platforms, including mobile.
How might test reporting and visualization tools be enhanced for Selenium testing?
Test reporting and visualization tools for Selenium testing could be improved by incorporating features that offer deeper insights into test results. For example, these tools could be designed to provide detailed data on trends, failures, and areas for improvement in a more visual and intuitive manner.
Additionally, customizable dashboards that make test results easier to interpret can enhance the overall usability of these tools. By integrating advanced data visualization techniques and providing actionable insights, testers can more effectively analyze test results and make informed decisions to optimize their testing processes.
One example of this is integrating Selenium with an open-source solution like ReportPortal for AI-driven reporting.
Why is there an increased focus on security testing in the context of Selenium testing?
Selenium can be leveraged for security testing by simulating user interactions and identifying potential vulnerabilities in web applications. By automating security tests with Selenium, testers can efficiently detect issues like cross-site scripting (XSS), SQL injection, and insecure authentication mechanisms. Integrating Selenium into the security testing process allows for the creation of comprehensive test suites that cover a wide range of security scenarios, ensuring that web applications are thoroughly evaluated for potential weaknesses.
A common integration that gives it more security capabilities is to use it with ZAP.
Zed Attack Proxy (ZAP) is a versatile intercepting proxy tool designed for security testing, acting as a middleman to analyze and manipulate traffic between an application and its server. It boasts an arsenal of features, including passive and active scanning, fuzzing, and spidering capabilities, to uncover vulnerabilities and enhance application security. ZAP integrates seamlessly with popular tools like Selenium WebDriver and CI/CD platforms such as Jenkins, making it a crucial asset for developers and security professionals aiming to embed security into the software development lifecycle.
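As a rough sketch of that integration, the Python below starts Chrome with its traffic routed through a proxy, so ZAP can observe everything the Selenium test does. It assumes ZAP is listening on `localhost:8080`, which is an assumption to check against your own ZAP setup. `--proxy-server` and `--ignore-certificate-errors` are Chrome command-line switches; the helper names are invented for this example.

```python
ZAP_PROXY = "localhost:8080"  # assumption: where your ZAP instance is listening


def chrome_args_for_proxy(proxy=ZAP_PROXY):
    """Chrome command-line switches that route browser traffic through the proxy."""
    return [
        f"--proxy-server=http://{proxy}",
        # An intercepting proxy presents its own certificate for HTTPS sites.
        # Ignoring certificate errors is acceptable only in a test environment.
        "--ignore-certificate-errors",
    ]


def start_chrome_through_zap(proxy=ZAP_PROXY):
    """Start a Chrome session whose requests pass through ZAP."""
    from selenium import webdriver  # only needed when a real browser is started

    options = webdriver.ChromeOptions()
    for arg in chrome_args_for_proxy(proxy):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

With the browser started this way, existing functional tests can run unchanged while ZAP passively scans the traffic they generate.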
What advancements are expected in parallel and distributed testing solutions for Selenium?
In the field of parallel and distributed testing solutions for Selenium, advancements are anticipated to focus on increased sophistication, efficiency, and user-friendliness. Specifically, Selenium Grid and other parallel testing solutions are expected to become more sophisticated, offering enhanced functionalities and more robust capabilities to meet evolving testing requirements. Efficiency is also projected to improve through optimizations in resource utilization, test execution speed, and overall performance.
Additionally, a key area of advancement is the enhancement of user-friendliness, with a focus on simplifying test setup and configuration processes to make parallel and distributed testing more accessible to a wider range of users. Furthermore, the emergence of cloud-based solutions is expected to provide scalable and cost-effective options for executing tests across diverse configurations, thereby addressing the challenges associated with managing varied testing environments.
In what ways will Selenium testing be integrated with DevOps and CI/CD pipelines?
Selenium testing will see enhanced integration with DevOps and CI/CD pipelines through improved synchronization and efficiency. The integration will focus on providing smoother alignment with popular CI/CD tools and environments, resulting in faster and more streamlined test execution.
Automation frameworks are expected to offer enhanced capabilities that facilitate easier integration with various tools and resources utilized in DevOps and CI/CD setups, enhancing the overall synergy between Selenium testing and these development processes.
Besides Simon, there are Other Selenium Contributors
Here are a few other TestGuild Automation Podcast (formerly called TestTalks) interviews I've had with other Selenium contributors like:
Jim Evans, who is one of the key contributors to the Selenium project. He is also the man behind the Selenium .NET bindings and the Internet Explorer driver.
Dave Haeffner: The Selenium Webdriver Java Guidebook: