DevsSecOps

Hello everyone! This is Shawal Ahmad. I got my Master's degree in Computer Science from Hazara University Mansehra. After that I started working as a freelancer on different platforms. Nowadays I work with a range of languages and tools (PHP, Laravel, Python, Odoo, JavaScript, Vue.js, ReactJS and React Native, Bash scripts, servers and automation, etc.). I started this platform to share knowledge, solutions to various IT problems, and news about trending technologies.

Multiple Proxy Servers in Selenium Web-driver Python

 

When you scrape data from a website, a large number of requests can lead the site to stop serving your IP address or block you from making further requests. In such cases a proxy is very useful, both for intensive web crawling and scraping and when you simply want to hide your identity (anonymization).

Content:

  1. Proxies
  2. Install http-request-randomizer
  3. Collect proxy list
  4. Use proxy in Selenium Web driver
  5. Iterate over proxy list
  6. Use Country specific proxy
  7. Next Tutorial

1.0 Proxies

Proxies provide a way to use server P (the middleman) to contact server A and then route the response back to you. In more nefarious circles, it's a prime way to hide your presence and pose as many clients to a website instead of just one. Websites often block IPs that make too many requests, and proxies are a way to get around this.
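To make the middleman idea concrete, here is a minimal sketch that uses only Python's standard library. The helper name `build_proxy_opener` is hypothetical, and the address is just the sample value used elsewhere in this post, so it is unlikely to be a live proxy.

```python
import urllib.request


def build_proxy_opener(address):
    """Return an opener that sends HTTP and HTTPS traffic through `address`.

    `address` is a "host:port" string (the value used below is a sample).
    """
    proxy_url = "http://" + address
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


opener = build_proxy_opener("179.127.241.199:53653")
# opener.open("http://example.com", timeout=10) would now go through the
# proxy, so the target site sees the proxy's IP address instead of yours.
```

No request is sent until `opener.open(...)` is called, so the sketch is safe to run even if the proxy is dead.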

2.0 Install http-request-randomizer using pip

pip install http-request-randomizer

3.0 Collect proxy list

from http_request_randomizer.requests.proxy.requestProxy import RequestProxy

req_proxy = RequestProxy()  # you may get a different number of proxies each time you run this
proxies = req_proxy.get_proxy_list()  # this will create the proxy list

3.1 Check IP address and country of proxy

>>> proxies[0].get_address()
'179.127.241.199:53653'
>>> proxies[0].country
'Brazil'

4.0 Use proxy in Selenium Web driver

4.1 Select proxy

PROXY = proxies[0].get_address()
print(PROXY)
# Output: 179.127.241.199:53653

4.2 Use in Firefox

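from selenium import webdriver

# Selenium 3 style: newer Selenium releases dropped executable_path and
# configure proxies through browser options instead.
PROXY = proxies[0].get_address()
webdriver.DesiredCapabilities.FIREFOX['proxy'] = {
    "httpProxy": PROXY,
    "ftpProxy": PROXY,
    "sslProxy": PROXY,
    "proxyType": "MANUAL",
}
# Adjust the path to wherever geckodriver lives on your machine.
driver = webdriver.Firefox(executable_path=r"C:\Users\siddhartha\Downloads\geckodriver-v0.25.0-win64\geckodriver.exe")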

4.3 Use in Chrome


from selenium import webdriver

PROXY = proxies[0].get_address()
webdriver.DesiredCapabilities.CHROME['proxy'] = {
    "httpProxy": PROXY,
    "ftpProxy": PROXY,
    "sslProxy": PROXY,
    "proxyType": "MANUAL",
}
driver = webdriver.Chrome(executable_path=r"C:\Users\siddhartha\Downloads\chromedriver_win32\chromedriver.exe")
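The snippets in this section use the Selenium 3 way of setting a proxy. As a rough sketch for Selenium 4, assuming Chrome is installed and a matching chromedriver can be located, the same proxy can be passed through Chrome's `--proxy-server` switch. The helper names `chrome_proxy_argument` and `make_chrome_driver` are made up for this example.

```python
def chrome_proxy_argument(address):
    """Build Chrome's --proxy-server switch for a "host:port" address."""
    return "--proxy-server=http://" + address


def make_chrome_driver(address):
    """Start Chrome behind the proxy (assumes Selenium 4 and Chrome are installed)."""
    # Imported lazily so the helper above still works without Selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument(chrome_proxy_argument(address))
    return webdriver.Chrome(options=options)


print(chrome_proxy_argument("179.127.241.199:53653"))
# prints: --proxy-server=http://179.127.241.199:53653
```

Calling `make_chrome_driver(proxies[0].get_address())` would then replace the `DesiredCapabilities` setup, though this has not been tested against every Selenium 4 release.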

4.4 Check your IP address in browser

driver.get('https://www.expressvpn.com/what-is-my-ip')

You will see the address of the proxy you selected as your IP address, not your actual IP address. To see your actual IP address, open https://www.expressvpn.com/what-is-my-ip in a regular browser without the proxy.

5.0 Iterate over proxy list

Now you can iterate over the proxy list: start the web driver with each proxy in turn, use it, and close it. Check how many requests a particular website allows and use a single proxy for that many requests before moving on to the next one.
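One possible way to schedule this rotation is sketched below. It only decides which address each request should use; `rotate_proxies` is a hypothetical helper, and the addresses are placeholders rather than real proxies.

```python
from itertools import cycle, islice


def rotate_proxies(addresses, requests_per_proxy):
    """Yield a proxy address for each request, moving to the next address
    after `requests_per_proxy` uses and wrapping around at the end."""
    for address in cycle(addresses):
        for _ in range(requests_per_proxy):
            yield address


sample = ["10.0.0.1:8080", "10.0.0.2:8080"]  # placeholder addresses
schedule = list(islice(rotate_proxies(sample, 2), 6))
print(schedule)
# ['10.0.0.1:8080', '10.0.0.1:8080', '10.0.0.2:8080',
#  '10.0.0.2:8080', '10.0.0.1:8080', '10.0.0.1:8080']
```

In a real scraper you would close the current driver and start a new one each time the yielded address changes.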

6.0 Use Country specific proxy



ind = []  # ind is the list of Indian proxies
for proxy in proxies:
    if proxy.country == 'India':
        ind.append(proxy)

A proxy usually slows down your connection. To reduce this problem you can use proxies from your own country, which may be faster than others. This can shrink the pool significantly, though: for example, if I have 800 proxies in total and select only those from India, I may get fewer than 50.

In this way you can create a list for any specific country.
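If you want lists for several countries at once, a small grouping helper is one option. In the sketch below, `FakeProxy` is a made-up stand-in for the library's proxy objects (only the `.country` attribute is used), and the addresses are placeholders.

```python
from collections import defaultdict, namedtuple

# Stand-in for the library's proxy objects; only `.country` is used here.
FakeProxy = namedtuple("FakeProxy", ["address", "country"])


def group_by_country(proxies):
    """Return a dict mapping each country name to its list of proxies."""
    by_country = defaultdict(list)
    for proxy in proxies:
        by_country[proxy.country].append(proxy)
    return by_country


sample = [
    FakeProxy("10.0.0.1:8080", "India"),
    FakeProxy("10.0.0.2:8080", "Brazil"),
    FakeProxy("10.0.0.3:8080", "India"),
]
groups = group_by_country(sample)
print({country: len(items) for country, items in groups.items()})
```

With the real list you would call `group_by_country(proxies)` and then pick `groups['India']` or any other country.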

This is how you can use proxies. You can also get proxy lists from sources other than the “http-request-randomizer” module; there are lots of proxy providers on the internet.

Note: These proxies require a good internet connection, and some of these proxies may not work, so try with different proxies.
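Since some proxies will be dead, it can help to filter the list before handing an address to Selenium. The sketch below only checks that a TCP connection to the proxy's port can be opened, which is a rough signal and not proof that the proxy actually forwards traffic. `is_reachable` is a hypothetical helper name.

```python
import socket


def is_reachable(address, timeout=3.0):
    """Return True if a TCP connection to "host:port" opens within `timeout`."""
    host, _, port = address.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, ValueError, OverflowError):
        # Connection refused or timed out, or the address was malformed.
        return False
```

With the list collected earlier, something like `working = [p for p in proxies if is_reachable(p.get_address())]` would keep only the proxies that accept connections.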

Thanks 🙂
