Selenium is an open-source framework for automating web browsers. Its most prominent use is testing web applications, including validating that they behave the same across different browsers. With Selenium, testers and developers can simulate human interactions with web pages, such as clicking buttons, entering text, and navigating between pages, and then verify the outcome.

History of Selenium

Selenium was originally created by Jason Huggins in 2004 at ThoughtWorks as a JavaScript-based framework for automating web application tests. Since then, Selenium has gone through several iterations.

  • Selenium Core (2004): The first iteration of Selenium, which used JavaScript running inside the page to drive the browser and was therefore restricted by the same-origin policy.
  • Selenium RC (2006): Created by Paul Hammant, this version circumvented the same-origin policy by routing browser traffic through a proxy server.
  • Selenium Grid (2008): A tool for running tests in parallel on multiple browsers and environments.
  • Selenium WebDriver (2009): Created by Simon Stewart, WebDriver controls the browser directly through browser-specific drivers, which made it faster and more reliable than Selenium RC.
  • Selenium 3 (2016): Phased out Selenium RC and used WebDriver as the main automation tool.
  • Selenium 4 (2021): Adopted the W3C WebDriver standard, added Chrome DevTools Protocol access for debugging, and improved support for Chromium-based browsers.

Selenium Suite of Tools

Selenium is not a single tool but a suite of tools that cater to different automation needs:

Selenium WebDriver

Selenium WebDriver is the most widely used component. It interacts with browsers directly through their drivers, eliminating the intermediary server that Selenium RC required.

  • Supports multiple programming languages: Java, Python, C#, Ruby, and JavaScript through official bindings; JVM languages such as Kotlin can use the Java bindings.
  • Multiple browser support: Chrome, Firefox, Edge, Safari, and Chromium-based browsers such as Opera
  • Can handle complex interactions like button clicks, form filling, drag and drop, popups, etc.
  • Compatible with other testing frameworks: JUnit, TestNG, PyTest, NUnit, etc.

Selenium IDE

Selenium IDE (Integrated Development Environment) is a record-and-playback tool that allows testers to create test cases without writing code.

  • Available as a browser extension for Chrome and Firefox.
  • Generates test scripts in Java, Python, C#, or JavaScript.
  • Best suited for beginners or quick test prototyping.

Selenium Grid

Selenium Grid enables parallel execution of test cases across multiple browsers and environments.

  • Reduces test execution time.
  • Uses Hub-Node Architecture:
    • Hub: Receives test requests and controls where they are executed.
    • Nodes: Execute the test cases on different browsers and environments.

Selenium RC (Deprecated)

Selenium Remote Control was a Java-based tool that acted as an HTTP proxy to interact with browsers. Selenium WebDriver has replaced it.

Selenium WebDriver Architecture

Selenium WebDriver works by sending commands to the browser and retrieving results. Its architecture includes:

Components of WebDriver

  • Test Script: The automation script itself, written in a programming language such as Java, Python, or C#. It describes the actions to be performed in the browser, like clicking buttons, typing in text fields, and asserting on elements.
  • Selenium API: The client library (language binding) that converts high-level test commands into HTTP requests, so that the test script communicates correctly with the browser driver.
  • WebDriver Protocol: The communication protocol that carries commands and responses as JSON over HTTP between the test script and the browser driver. Selenium 3 and earlier used the JSON Wire Protocol; Selenium 4 uses the W3C WebDriver protocol that replaced it.
  • Browser Driver: A browser-specific driver (ChromeDriver, GeckoDriver, etc.) that connects Selenium to the real browser. It translates Selenium commands into actions the browser natively understands.
  • Web Browser: The browser (Chrome, Firefox, Edge, etc.) that executes the commands it receives from the driver and sends the results back toward the script.

How Selenium WebDriver Works

  • The test script sends a command (for example, open a webpage) to the browser driver as an HTTP request that follows the WebDriver protocol.
  • The browser driver converts this request into a browser-specific instruction and passes it to the browser.
  • The browser executes the command and produces a response, which the browser driver forwards back to the script. The Selenium script interprets this response and proceeds to the next step of the test case.
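
To make this request and response flow more concrete, here is a minimal Python sketch of how three common commands could be expressed as HTTP requests under the W3C WebDriver protocol (the successor to the JSON Wire Protocol, used by Selenium 4). The endpoint paths follow the W3C specification; the helper name build_request, the command labels, and the session id abc123 are made up for illustration, and nothing is actually sent to a driver.

```python
import json

# Endpoint templates as defined by the W3C WebDriver specification.
ENDPOINTS = {
    "navigate": ("POST", "/session/{session_id}/url"),
    "find_element": ("POST", "/session/{session_id}/element"),
    "get_title": ("GET", "/session/{session_id}/title"),
}


def build_request(command, session_id, payload=None):
    """Return (method, path, body) for a command; nothing is sent over the network."""
    method, template = ENDPOINTS[command]
    path = template.format(session_id=session_id)
    # GET requests carry no body; POST bodies are JSON objects.
    body = json.dumps(payload or {}) if method == "POST" else None
    return method, path, body


# "Open a webpage" becomes a POST with the target URL in the JSON body.
print(build_request("navigate", "abc123", {"url": "https://example.com"}))
# Locating an element sends a strategy ("using") and a selector ("value").
print(build_request("find_element", "abc123", {"using": "css selector", "value": "#login"}))
# Reading the page title is a plain GET.
print(build_request("get_title", "abc123"))
```

In a real run, the language binding builds and sends these requests for you, and the browser driver answers each one with a JSON response.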

Installing and Setting Up Selenium

Selenium can be installed in different programming environments. Below are the steps for setup in Python and Java.

Setting Up Selenium in Python

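  • Install Selenium: pip install selenium
  • Download WebDriver (Chrome)
    • Download ChromeDriver from https://chromedriver.chromium.org/downloads, choosing the version that matches your installed Chrome.
    • Place it in a directory and add it to the system PATH.
    • With Selenium 4.6 or later this step is optional, because the bundled Selenium Manager locates or downloads a matching driver automatically.
  • Write a simple Selenium test. Read more: Selenium with Python Cheat Sheet.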
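
As a rough illustration of the last step, a first script might look like the sketch below. It assumes Chrome is installed; the function names page_title and main and the URL https://example.com are placeholders chosen for this example. Selenium is imported inside main so that the small helper can be read and reused on its own.

```python
def page_title(driver, url):
    """Open url with the given WebDriver instance and return the page title."""
    driver.get(url)
    return driver.title


def main():
    # Requires `pip install selenium`; imported here to keep the helper standalone.
    from selenium import webdriver

    # Assumes Chrome is installed. Selenium 4.6+ can find a matching driver
    # automatically; older versions need chromedriver on the system PATH.
    driver = webdriver.Chrome()
    try:
        print(page_title(driver, "https://example.com"))
    finally:
        driver.quit()  # always release the browser, even if the test fails
```

Calling main() launches the browser, prints the title of the page, and closes the browser again.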

Setting Up Selenium in Java

With Java, a build tool such as Maven is recommended. Maven simplifies dependency management by downloading the required Selenium libraries and keeping their versions consistent. Browser drivers are not Maven dependencies and are set up separately, as described in the steps below. Read more: Selenium with Java Cheat Sheet.

  • Download and install JDK (Java Development Kit) from Oracle JDK or use OpenJDK.
  • Set up the JAVA_HOME environment variable.
  • Add Selenium Dependencies in Maven
    • Open your IDE and create a new Maven project.
    • Modify the pom.xml file to add Selenium dependencies:
      <dependencies>
         <!-- Selenium Java -->
         <dependency>
             <groupId>org.seleniumhq.selenium</groupId>
             <artifactId>selenium-java</artifactId>
             <version>4.5.0</version>
         </dependency>
         <!-- TestNG for test execution -->
         <dependency>
             <groupId>org.testng</groupId>
             <artifactId>testng</artifactId>
             <version>7.7.0</version>
             <scope>test</scope>
         </dependency>
      </dependencies>
  • Download WebDriver (ChromeDriver, GeckoDriver, etc.)
  • Extract the driver and place it in a suitable directory (e.g., C:\WebDriver\ or /usr/local/bin/).
  • Set up the system property in your test script: System.setProperty("webdriver.chrome.driver", "path/to/chromedriver");
  • Write a Simple Selenium Test in Java

Designing a Selenium Test Case

Designing a Selenium test involves several key steps to ensure effective, maintainable, and scalable test automation. Here’s a structured approach:

Define Test Requirements

Before writing a test, you should clearly define:

  • Test objective: What are you verifying? (e.g., login functionality, form submission)
  • Test scenario: Steps the user would take in the application.
  • Expected outcome: What should happen if the test is successful?

Example: A login test should check if entering valid credentials redirects the user to the dashboard.

Set Up the Selenium Test Environment

To write and execute a Selenium test, you need:

  • Programming language: Choose Java, Python, C#, JavaScript, etc.
  • Selenium WebDriver: Download the appropriate browser driver (ChromeDriver, GeckoDriver).
  • Testing framework: Use JUnit/TestNG (Java), PyTest (Python), or NUnit (C#) for structured test execution.
  • IDE: Use an IDE like Eclipse, IntelliJ, VS Code, or PyCharm.

Identify Web Elements Using Locators

Selenium interacts with web elements using locators, which help identify buttons, text fields, links, etc. Here is an overview of the eight locator strategies you can use. In Selenium 4 for Python they are passed to find_element together with the value to search for; the older find_element_by_* helper methods have been removed.

  • By ID attribute: find_element(By.ID, "value")
  • By class name: find_element(By.CLASS_NAME, "value")
  • By name attribute: find_element(By.NAME, "value")
  • By DOM structure or XPath: find_element(By.XPATH, "value")
  • By HTML tag name: find_element(By.TAG_NAME, "value")
  • By link text: find_element(By.LINK_TEXT, "value")
  • By partial link text: find_element(By.PARTIAL_LINK_TEXT, "value")
  • By CSS selector: find_element(By.CSS_SELECTOR, "value")
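
One common way to use locators is to keep them in a single place as (strategy, value) pairs and unpack them into find_element. The sketch below assumes a hypothetical login page; the dictionary LOGIN_PAGE, the function log_in, and the element values are invented for illustration. The strategy strings are written out because Selenium's By constants are plain strings with exactly these values; in real test code you would normally write By.ID, By.NAME, and By.CSS_SELECTOR instead.

```python
# Locators kept in one place as (strategy, value) pairs.
# The strategy strings equal Selenium's By constants:
# By.ID == "id", By.NAME == "name", By.CSS_SELECTOR == "css selector".
LOGIN_PAGE = {
    "username": ("id", "username"),
    "password": ("name", "password"),
    "submit": ("css selector", "button[type='submit']"),
}


def log_in(driver, username, password):
    """Fill in and submit the (hypothetical) login form using the locators above."""
    driver.find_element(*LOGIN_PAGE["username"]).send_keys(username)
    driver.find_element(*LOGIN_PAGE["password"]).send_keys(password)
    driver.find_element(*LOGIN_PAGE["submit"]).click()
```

Keeping locators together like this means a change in the page markup only has to be fixed in one place.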

Using WebDriverManager

WebDriverManager is an open-source Java library that manages the drivers Selenium WebDriver needs to run tests on various browsers, such as ChromeDriver, GeckoDriver, and EdgeDriver. It eliminates the hassle of installing and configuring browser driver binaries manually. It detects the browser version installed on the system, downloads the matching driver automatically, and caches it locally (in ~/.cache/selenium by default) so that later runs can reuse it.

Implement Waits for Synchronization

Web applications can have delays due to AJAX calls or dynamic content. Use waits to ensure Selenium doesn’t interact with elements before they are available.

  • Implicit Wait: Sets a default time during which every element lookup keeps retrying before throwing a NoSuchElementException.
    driver.implicitly_wait(10)  # Each lookup retries for up to 10 seconds before raising
  • Explicit Wait: Waits for a specific condition before proceeding with the next step in the script.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    wait = WebDriverWait(driver, 10)  # Give up after 10 seconds
    wait.until(EC.presence_of_element_located((By.ID, "dashboard")))
  • Fluent Wait: Fluent Wait repeatedly checks for a condition at a specified interval until a timeout occurs.
    Wait<WebDriver> wait = new FluentWait<>(driver)
      .withTimeout(Duration.ofSeconds(20))
      .pollingEvery(Duration.ofSeconds(2))
      .ignoring(NoSuchElementException.class);
    
    WebElement element = wait.until(driver -> driver.findElement(By.id("submit")));
  • Hard Wait (Thread.sleep): Pauses execution for a fixed amount of time, regardless of element availability. Because it always waits the full duration, it slows tests down and is best kept as a last resort.
    Thread.sleep(5000);  // Waits for 5 seconds
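
Explicit and fluent waits share one underlying idea: poll a condition at intervals until it succeeds or a timeout expires. The following Python sketch shows that idea in isolation. It is not Selenium's implementation; the function wait_until, its parameters, and the element_ready condition are hypothetical and exist only for this example.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5, ignored=()):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result  # condition met: stop waiting immediately
        except ignored:
            pass  # e.g. the element is not there yet; retry after a pause
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Simulated condition that fails twice before succeeding.
attempts = {"count": 0}


def element_ready():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise LookupError("not in the DOM yet")
    return "element"


print(wait_until(element_ready, timeout=2, poll_interval=0.01, ignored=(LookupError,)))
# prints "element" after three attempts
```

Unlike a hard wait, the loop returns as soon as the condition holds, so a generous timeout does not slow down tests that pass quickly.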

Use Assertions for Validation

Assertions in Selenium are used to validate test conditions and ensure that an application behaves as expected. They help in verifying whether the actual outcome matches the expected result, allowing testers to determine if a test case has passed or failed.

  • Hard Assertions: If the validation condition fails, a Hard Assertion immediately terminates the test. The subsequent test steps are skipped, and the test case is marked as failed. In TestNG, these are the static methods of the Assert class:
    Assert.assertEquals(actual, expected);
    Assert.assertNotEquals(actual, expected, message);
    Assert.assertTrue(condition);
    Assert.assertFalse(condition);
    Assert.assertNull(object);
    Assert.assertNotNull(object);
  • Soft Assertions: Soft Assertions allow the test to continue even if an assertion fails, and report all failures together at the end of the test case. They are useful when multiple validations are required within a single test case and one failed check should not stop the remaining ones. In TestNG, they are provided by the SoftAssert class; softAssert.assertAll() must be called at the end of the test, otherwise the recorded failures are never reported.
    SoftAssert softAssert = new SoftAssert();
    softAssert.assertTrue(false);
    softAssert.assertEquals(1, 2);
    softAssert.assertAll();
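
The soft-assertion idea is not tied to Java. As a rough sketch of the mechanism, the Python class below records failed checks and raises a single error at the end. The class SoftAssert and its methods check_true, check_equal, and assert_all are written for this example only; they are not part of PyTest or Selenium.

```python
class SoftAssert:
    """Collects failed checks and reports them together at the end."""

    def __init__(self):
        self.failures = []

    def check_true(self, condition, message="expected condition to be true"):
        if not condition:
            self.failures.append(message)

    def check_equal(self, actual, expected):
        if actual != expected:
            self.failures.append(f"expected {expected!r}, got {actual!r}")

    def assert_all(self):
        # Raise once, listing every failure recorded so far.
        if self.failures:
            raise AssertionError("; ".join(self.failures))


soft = SoftAssert()
soft.check_true(False, "logo should be visible")
soft.check_equal(1, 2)
# Both checks ran. Calling soft.assert_all() now would raise one
# AssertionError that names both failures.
```

Forgetting the final assert_all() call has the same effect as in TestNG: the test passes even though checks failed.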

Integrate with a Testing Framework

Using a framework like TestNG (Java), PyTest (Python), or JUnit helps organize and execute tests efficiently. Example using PyTest:

import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By

@pytest.fixture
def driver():
  # Start a fresh browser for each test and close it afterwards
  browser = webdriver.Chrome()
  yield browser
  browser.quit()

def test_login(driver):
  driver.get("https://example.com/login")
  driver.find_element(By.ID, "username").send_keys("testuser")
  driver.find_element(By.ID, "password").send_keys("securepassword")
  driver.find_element(By.ID, "login-button").click()
  # The login succeeded if the browser landed on the dashboard page
  assert "Dashboard" in driver.title

Execute Tests in Parallel Using Selenium Grid

For faster execution, Selenium Grid allows running tests on multiple machines and browsers simultaneously.

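  • Command to start a Grid hub (Selenium 3 syntax): java -jar selenium-server.jar -role hub
  • Command to start a Grid node (Selenium 3 syntax): java -jar selenium-server.jar -role node -hub http://localhost:4444/grid/register
  • In Selenium 4, the role is given as a subcommand instead: java -jar selenium-server-<version>.jar hub and java -jar selenium-server-<version>.jar node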

Then, configure WebDriver to run tests on remote nodes.
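
A minimal Python sketch of that configuration is shown below. It assumes a Grid hub reachable at http://localhost:4444; the function names remote_driver and run_on_browsers are invented for this example. webdriver.Remote connects to the hub instead of a local browser, and the thread pool shows one possible way to run the same check on several browsers at once.

```python
from concurrent.futures import ThreadPoolExecutor


def remote_driver(hub_url, browser="chrome"):
    """Start a browser session on a Grid node through the hub at hub_url."""
    from selenium import webdriver  # requires `pip install selenium`

    options_by_browser = {
        "chrome": webdriver.ChromeOptions,
        "firefox": webdriver.FirefoxOptions,
        "edge": webdriver.EdgeOptions,
    }
    options = options_by_browser[browser]()
    return webdriver.Remote(command_executor=hub_url, options=options)


def run_on_browsers(test, browsers, make_driver):
    """Run test(driver) once per browser in parallel and collect the results."""

    def run_one(browser):
        driver = make_driver(browser)
        try:
            return browser, test(driver)
        finally:
            driver.quit()  # free the Grid slot even if the test raises

    with ThreadPoolExecutor(max_workers=max(1, len(browsers))) as pool:
        return dict(pool.map(run_one, browsers))


# Example call (needs a running Grid with Chrome and Firefox nodes):
# run_on_browsers(
#     lambda d: (d.get("https://example.com"), d.title)[1],
#     ["chrome", "firefox"],
#     lambda b: remote_driver("http://localhost:4444", b),
# )
```

In practice, test frameworks usually provide the parallelism (for example through TestNG's parallel settings or a PyTest plugin), and the test code only needs to create its driver with webdriver.Remote.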

Generate Reports for Test Execution

Selenium itself doesn’t provide reports, but you can use:

  • Allure Reports (for detailed execution reports)
  • Extent Reports (for interactive HTML reports)
  • JUnit/TestNG Reports (for structured test results)

Popularity of Programming Languages in Selenium

Based on industry reports and survey data from Stack Overflow, GitHub, and Google Trends, the most commonly used languages for Selenium testing are listed below. The percentages are rough estimates:

Rank | Programming Language | Usage Percentage | Reasons for Popularity
1 | Java | ~40-45% | Strong IDE support (Eclipse, IntelliJ), TestNG integration, widely used in enterprises.
2 | Python | ~30-35% | Simplicity, faster test scripting, PyTest support, growing adoption in AI/ML testing.
3 | C# | ~10-15% | Preferred in Microsoft-based environments (Visual Studio, .NET frameworks).
4 | JavaScript | ~5-10% | Used in modern web automation, integrates well with WebDriverIO, Protractor.
5 | Ruby | ~2-5% | Used in niche markets but declining due to Cypress and Playwright.

Selenium Test Example

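Let’s take a sample test script written in Java with TestNG. The script opens the Google homepage, searches for “Selenium WebDriver,” validates the title of the search results page, and logs the test execution details to the console. It uses WebDriverManager, so the webdrivermanager dependency (group io.github.bonigarcia) must be on the classpath in addition to Selenium and TestNG.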

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.Assert;
import org.testng.annotations.AfterTest;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;
import io.github.bonigarcia.wdm.WebDriverManager;

public class SeleniumTestNGExample {

  WebDriver driver;

  @BeforeTest
  public void setup() {
    System.out.println("Initializing WebDriver...");
    WebDriverManager.chromedriver().setup();
    driver = new ChromeDriver();
    driver.manage().window().maximize();
    System.out.println("Browser Launched Successfully.");
  }

  @Test
  public void googleSearchTest() {
    System.out.println("Opening Google Homepage...");
    driver.get("https://www.google.com");

    // Locate search box and enter text
    WebElement searchBox = driver.findElement(By.name("q"));
    searchBox.sendKeys("Selenium WebDriver");
    System.out.println("Entered search term: Selenium WebDriver");

    // Submit search
    searchBox.submit();
    System.out.println("Search submitted... Waiting for results.");

    // Verify that the results page title mentions the search term
    String expectedTitle = "Selenium WebDriver";
    String actualTitle = driver.getTitle();
    System.out.println("Page Title: " + actualTitle);
    Assert.assertTrue(actualTitle.contains(expectedTitle), "Title validation failed!");
    System.out.println("Test Passed: Search results page loaded successfully.");
  }

  @AfterTest
  public void teardown() {
    System.out.println("Closing Browser...");
    if (driver != null) {
      driver.quit();
    }
    System.out.println("Test Execution Completed.");
  }
}

Understanding the code

The following statements import all the necessary dependencies.

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.Assert;
import org.testng.annotations.AfterTest;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;
import io.github.bonigarcia.wdm.WebDriverManager;

@BeforeTest is an annotation provided by TestNG. The annotated method runs once before any @Test method of the classes inside the enclosing <test> tag. Here, the setup method uses WebDriverManager to download and configure ChromeDriver, then launches and maximizes the browser.

@BeforeTest
  public void setup() {
    System.out.println("Initializing WebDriver...");
    WebDriverManager.chromedriver().setup();
    driver = new ChromeDriver();
    driver.manage().window().maximize();
    System.out.println("Browser Launched Successfully.");
  }

@Test is an annotation in TestNG that is used to identify a test method.

@AfterTest is an annotation provided by TestNG. The annotated method runs after all the test methods of the classes inside the enclosing <test> tag have been executed. Here, we are closing the browser instance.

@AfterTest
  public void teardown() {
    System.out.println("Closing Browser...");
    if (driver != null) {
      driver.quit();
    }
    System.out.println("Test Execution Completed.");
  }

Test Run Results:

Initializing WebDriver...
Browser Launched Successfully.
Opening Google Homepage...
Entered search term: Selenium WebDriver
Search submitted... Waiting for results.
Page Title: Selenium WebDriver - Google Search
Test Passed: Search results page loaded successfully.
Closing Browser...
Test Execution Completed.
===============================================
Default Suite
Total tests run: 1, Failures: 0, Skips: 0
===============================================
Process finished with exit code 0