
27.3.20

Solving Google’s reCAPTCHA v2 with ParseHub Agent


ParseHub is a great point-and-click web scraping tool. While projects run on ParseHub servers, you can connect them to third-party proxies such as Luminati or to captcha-solving services such as 2Captcha.

In this tutorial, we will show you how to get past the Google reCAPTCHA v2 test page with ParseHub Agent and the 2Captcha service. You will need a 2Captcha account and an API key to complete this tutorial.

Don't hesitate to contact us if you want access to the ParseHub project, have questions, or need help implementing web scraping projects.




The ParseHub project consists mainly of four "Go To Template" commands. ParseHub's Go To Template command jumps to another page and/or runs a new template from the start, with its own list of commands. The current scope is preserved when jumping to another template.


The 2Captcha service works as follows:

  1. You upload a captcha to 2captcha.com/in.php
  2. 2captcha stores your captcha and returns the ID of your request
  3. 2captcha immediately distributes your captcha to a worker
  4. The worker solves the captcha and sends the answer back to 2captcha
  5. You send a request to 2captcha using the ID to get the answer

Our implementation plan has 12 steps. Send us a note if you need help with your web scraping projects.

Step 1: Create the MainTemplate 

Set the website https://www.google.com/recaptcha/api2/demo as the starting point.



Step 2: Find the siteKey variable

Find the HTML tag with the attribute data-sitekey and extract its value. The value is stored in the variable named siteKey.
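Outside ParseHub, the same lookup can be sketched in a few lines of Python. This is only an illustration of what the step does, not how ParseHub implements it. The class name, the function name and the sample markup (including the key value) are made up for the example.

```python
from html.parser import HTMLParser


class SiteKeyParser(HTMLParser):
    """Remembers the value of the first data-sitekey attribute it sees."""

    def __init__(self):
        super().__init__()
        self.site_key = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; attribute names arrive lowercased.
        if self.site_key is None:
            for name, value in attrs:
                if name == "data-sitekey":
                    self.site_key = value


def find_site_key(html):
    """Return the data-sitekey value found in html, or None if there is none."""
    parser = SiteKeyParser()
    parser.feed(html)
    return parser.site_key


# Hypothetical markup and key, for illustration only.
sample = '<form><div class="g-recaptcha" data-sitekey="EXAMPLE_SITE_KEY"></div></form>'
site_key = find_site_key(sample)
```

Returning None when no tag carries the attribute makes it easy to stop early on pages that do not show a captcha.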
 


Step 3: Extract the URL of the page requested in step 1 and store it in the variable named page


Step 4: Go To Template CaptchaRequest


Submit a request to the 2Captcha API with:

  • URL: http://2captcha.com/in.php
  • method: set to "userrecaptcha"
  • googlekey: the siteKey value found in step 2
  • pageurl: the page URL found in step 3


The final URL combines these parameters into a single query string.

If everything is fine, the server will return the ID of your CAPTCHA as JSON: {"status":1,"request":"2122988149"}.
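As a rough sketch, the request URL and the handling of the reply could look like this in Python. The parameters method, googlekey and pageurl come from the list above. The key parameter (your API key) and json=1 (to ask for a JSON reply) are assumptions based on how the 2Captcha API is commonly called, and the function names are invented for this example.

```python
import json
from urllib.parse import urlencode

IN_URL = "http://2captcha.com/in.php"


def build_submit_url(api_key, site_key, page_url):
    """Build the in.php URL that submits a reCAPTCHA v2 job."""
    params = {
        "key": api_key,             # assumption: the API key is passed as "key"
        "method": "userrecaptcha",
        "googlekey": site_key,      # siteKey from step 2
        "pageurl": page_url,        # page from step 3
        "json": 1,                  # assumption: requests a JSON reply
    }
    return IN_URL + "?" + urlencode(params)


def parse_request_id(body):
    """Return the request ID from a reply such as {"status":1,"request":"..."}."""
    reply = json.loads(body)
    if reply.get("status") != 1:
        raise RuntimeError("2Captcha rejected the request: %s" % reply.get("request"))
    return reply["request"]
```

urlencode percent-encodes the page URL, so characters such as ":" and "/" do not break the query string.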






Step 4.1. Extract the value of the request parameter and store it in the variable named requestid



Step 4.2. Wait 60-70 seconds to give the 2Captcha service time to solve the captcha



Step 4.3. Go To Template GetResolveCaptchaTemplate

Submit a request to the API at http://2captcha.com/res.php to get the result.

The requestid is the variable extracted in step 4.1

If the CAPTCHA has already been solved, the server will return JSON with a token that looks like this:

03AHJ_Vuve5Asa4koK3KSMyUkCq0vUFCR5Im4CwB7PzO3dCxIo11i53epEraq-uBO5mVm2XRikL8iKOWr0aG50sCuej9bXx5qcviUGSm4iK4NC_Q88flavWhaTXSh0VxoihBwBjXxwXuJZ-WGN5Sy4dtUl2wbpMqAj8Zwup1vyCaQJWFvRjYGWJ_TQBKTXNB5CCOgncqLetmJ6B6Cos7qoQyaB8ZzBOTGf5KSP6e-K9niYs772f53Oof6aJeSUDNjiKG9gN3FTrdwKwdnAwEYX-F37sI_vLB1Zs8NQo0PObHYy0b0sf7WSLkzzcIgW9GR0FwcCCm1P8lB-50GQHPEBJUHNnhJyDzwRoRAkVzrf7UkV8wKCdTwrrWqiYDgbrzURfHc2ESsp020MicJTasSiXmNRgryt-gf50q5BMkiRH7osm4DoUgsjc_XyQiEmQmxl5sqZP7aKsaE-EM00x59XsPzD3m3YI6SRCFRUevSyumBd7KmXE8VuzIO9lgnnbka4-eZynZa6vbB9cO3QjLH0xSG3-egcplD1uLGh79wC34RF49Ui3eHwua4S9XHpH6YBe7gXzz6_mv-o-fxrOuphwfrtwvvi2FGfpTexWvxhqWICMFTTjFBCEGEgj7_IFWEKirXW2RTZCVF0Gid7EtIsoEeZkPbrcUISGmgtiJkJ_KojuKwImF0G0CsTlxYTOU2sPsd5o1JDt65wGniQR2IZufnPbbK76Yh_KI2DY4cUxMfcb2fAXcFMc9dcpHg6f9wBXhUtFYTu6pi5LhhGuhpkiGcv6vWYNxMrpWJW_pV7q8mPilwkAP-zw5MJxkgijl2wDMpM-UUQ_k37FVtf-ndbQAIPG7S469doZMmb5IZYgvcB4ojqCW3Vz6Q

  

Step 4.3.1. Extract the value of the request parameter and store it in the variable named CaptchaToken
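Steps 4.2 to 4.3.1 amount to a wait-and-poll loop, which could be sketched in Python as below. The parameters key, action=get, id and json=1, and the "CAPCHA_NOT_READY" reply for an unsolved captcha, are assumptions about the 2Captcha API that are not stated in this tutorial. The fetch and sleep arguments are hypothetical hooks so that the loop can be tried without network access.

```python
import json
import time
from urllib.parse import urlencode

RES_URL = "http://2captcha.com/res.php"


def build_result_url(api_key, request_id):
    """Build the res.php URL that asks for the answer to request_id."""
    # assumption: these parameter names are not given in the tutorial
    params = {"key": api_key, "action": "get", "id": request_id, "json": 1}
    return RES_URL + "?" + urlencode(params)


def poll_for_token(fetch, url, attempts=10, delay=5.0, sleep=time.sleep):
    """Call fetch(url) until the reply carries a token or attempts run out."""
    for _ in range(attempts):
        reply = json.loads(fetch(url))
        if reply.get("status") == 1:
            return reply["request"]          # this is the CaptchaToken
        if reply.get("request") != "CAPCHA_NOT_READY":
            raise RuntimeError("2Captcha error: %s" % reply.get("request"))
        sleep(delay)                         # not solved yet, wait and retry
    raise TimeoutError("captcha was not solved in time")
```

For a real call, fetch could be something like `lambda u: urllib.request.urlopen(u).read().decode()`. Polling every few seconds is an alternative to the single fixed wait used in step 4.2.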

 


Step 5: Extract CaptchaToken and store it in the variable named Token

Step 6: Locate the form field with the ID "g-recaptcha-response" and set its value to the token retrieved in step 4.3.1
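To make clear what filling this field achieves, here is a small Python sketch of the form data a browser would send once the field is set. It assumes the form is submitted as a URL-encoded POST body. The function name and the extra_fields argument are hypothetical.

```python
from urllib.parse import urlencode


def build_form_body(token, extra_fields=None):
    """Return a URL-encoded form body carrying the solved captcha token."""
    fields = dict(extra_fields or {})
    # The token travels in the g-recaptcha-response field.
    fields["g-recaptcha-response"] = token
    return urlencode(fields)
```

In ParseHub no such code is needed, since setting the field and clicking submit has the same effect.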




Step 7: Click submit 




Step 8: Finally, after submitting the form, you will see the resulting page


Congratulations, you made it to the end! Contact us if you have questions, want to see the ParseHub project, or need help implementing web scraping projects.
