Setting up Selenium and Chrome on EC2

To install Google Chrome and the necessary drivers for Selenium scraping on an EC2 instance running Ubuntu, you’ll need to follow a few steps. I’ll just assume you have an AWS EC2 instance running Ubuntu and you have SSH access to your instance. Here’s a step-by-step guide that is tried and tested:

Step 1: Connect to Your EC2 Instance

First, you’ll need to SSH into your EC2 instance. Open your terminal or command prompt and use the SSH command. You’ll need the .pem key file and the public DNS or IP address of your EC2 instance.

ssh -i ssh-key.pem ubuntu@instance-ip

Replace ssh-key.pem with the path to your PEM file, and ec2-ip with the public DNS or IP of your EC2 instance.

Step 2: Update Your Package Lists

Update the package lists for upgrades and new package installations.

sudo apt update

Step 3: Install Google Chrome

Download the latest version of Google Chrome.

wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb

Install Google Chrome.

sudo apt install ./google-chrome-stable_current_amd64.deb

Step 4: Install ChromeDriver

ChromeDriver is necessary for Selenium to interact with the Chrome browser. First, find out the version of Google Chrome you installed:

google-chrome --version

Then, download the corresponding ChromeDriver from https://chromedriver.chromium.org/downloads. Make sure the ChromeDriver version matches your Chrome browser version.

As an example, if your Google Chrome version is XX, replace XX with your Chrome version in the command below to download ChromeDriver:

wget https://chromedriver.storage.googleapis.com/XX.XX.XX/chromedriver_linux64.zip

Unzip the downloaded file.

unzip chromedriver_linux64.zip

Move the chromedriver to /usr/local/bin/ so it’s in the PATH.

sudo mv chromedriver /usr/local/bin/

Step 5: Install Selenium

You can install Selenium using pip. If you don’t have pip installed, you can install it by running:

sudo apt install python3-pip

Then, install Selenium:

pip3 install selenium

Step 6: Test Your Setup

Create a simple Python script to test your Selenium setup with Chrome.

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument('--headless')  # Run Chrome in headless mode (without a UI)
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')

driver = webdriver.Chrome(options=options)

driver.get("https://www.example.com")
print(driver.title)

driver.quit()

Save this script as test_selenium.py and run it:

python3 test_selenium.py

If everything is set up correctly, this script should print the title of the page without opening a graphical web browser.

This setup will allow you to run Selenium with Chrome on your EC2 Ubuntu instance for web scraping or automated testing.

I have wasted around 3 hours setting up chrome on EC2. Normally it works out of the box, but sometimes it will be a endless loop of issues we are solving, so if you are facing the issue, just clear every reference of the chrome in your instance and then follow this guide