Tag Archive for 'google'

Google Captcha Extraction

Please start by reading this post where I explain everything about this code, thanks!

This is the code for the Google Image Captcha, and the one for the audio captcha is below the jump :)

//This script pulls each CAPTCHA's URL from $urlGoogle, then downloads the CAPTCHA and saves it to the folder $saveGoogle, for every index from $startImage to $endImage.
	$urlGoogle = "https://www.google.com/accounts/NewAccount?service=mail&continue=http%3A%2F%2Fmail.google.com%2Fmail%2Fe-11-10ba05aeaa8e9b701e5151437f9a44d3-64aeae753cc34f1c864f7edc97a046ccdc96987b&type=2";
	$saveGoogle = "google/";
	$startImage = 0;
	$endImage = 999;
 
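	//These lines force the output to be constantly flushed and updated for the user. (ideally)
	//Only close an output buffer if one is actually open, otherwise PHP raises a notice.
	ob_implicit_flush(true);
	if (ob_get_level() > 0) ob_end_flush();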
	echo "Script Started.\n";
 
	//Pull in the CAPTCHA image as a string with cURL, and save to a file. The curl extension must first be enabled in php.ini.
	for ($i=$startImage;$i<=$endImage;$i++) {
		//First extract a unique URL for each CAPTCHA from the $urlGoogle.
		$ch = curl_init();
		curl_setopt($ch, CURLOPT_URL, $urlGoogle."&rand=".$i);
		curl_setopt($ch, CURLOPT_HEADER, 0);
		curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
		curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
		//If you're having difficulties with SSL, this may need to be enabled.
		//curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
		$result = curl_exec($ch);
		//Enable this if you're having difficulties.
		//echo "Error is: ".curl_error($ch);
		curl_close($ch);
 
		//Parse out the URL, and retrieve the CAPTCHA for it. Skip this index if the marker is not on the page.
		$pos = strpos($result,"gaia captchahtml desc");
		if ($pos === false) continue;
		$result = substr($result,$pos);
		$resultArray = explode('"',$result);
		$ch = curl_init();
		curl_setopt($ch, CURLOPT_URL, rawurldecode($resultArray[2]));
		curl_setopt($ch, CURLOPT_HEADER, 0);
		curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
		curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
		//If you're having difficulties with SSL, this may need to be enabled.
		//curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
		$image = curl_exec($ch);
		//Enable this if you're having difficulties.
		//echo "Error is: ".curl_error($ch);
		curl_close($ch);
 
		//Save CAPTCHA to a file with the same name as $i.
		if(!is_dir($saveGoogle)) mkdir($saveGoogle);
		//Open in binary mode so the image bytes are written untouched on every platform.
		$fh = fopen($saveGoogle.$i.".jpg","wb");
		fwrite($fh,$image);
		fclose($fh);
 
		//Don't allow it to timeout.
		set_time_limit(40);
		//Output occasional progress.
		if ($i%10 == 0) {
			echo $i." CAPTCHA captured.\n";
			flush();
		}
	}
 
	echo "Script Complete.";
	//-maluc
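
The parsing step in the script above leans on a fixed marker string and on counting double quotes after it. Here is a minimal sketch of that same strpos/substr/explode idea pulled out into a standalone function, with a guard for the case where the marker is missing. The function name `extractQuotedAfterMarker` and the sample markup are made up for illustration; the real page markup isn't reproduced here, so treat the quote positions as an assumption.

```php
<?php
// Hypothetical helper (not part of the original script): returns the
// $index-th piece of the text that follows $marker when that text is split
// on double quotes, or null if the marker or the piece is missing.
function extractQuotedAfterMarker(string $html, string $marker, int $index): ?string {
    $pos = strpos($html, $marker);
    if ($pos === false) {
        return null; // marker not on the page, e.g. the request failed
    }
    $parts = explode('"', substr($html, $pos));
    return $parts[$index] ?? null;
}

// Made-up sample markup, only here to show how the quote positions line up.
$sample = '<div class="gaia captchahtml desc"><img src="https://example.com/captcha%3Fid%3D1"></div>';
$url = extractQuotedAfterMarker($sample, "gaia captchahtml desc", 2);
echo rawurldecode((string)$url), "\n";
```

Returning null instead of a garbage string makes it easy to skip a bad page rather than saving an empty file.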

About this captcha:

length: 5-8
range: a-z
case-sensitive: no
background: always white
overlay: none
text color: solid blue, green, or red. single color per captcha.
size: 2000-3900 bytes
width: always 200px
height: always 70px
other: tilting seemingly random, 5 chars is rare, red is rare, shade of solid colors may change between captchas
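
The numbers above are handy for spotting bad downloads. A minimal sketch, assuming the observations hold (200x70 pixels, roughly 2000-3900 bytes); the function name `matchesObservedImageCaptcha` and the file path are just placeholders, not anything from the original script.

```php
<?php
// Hypothetical check against the observations listed above:
// always 200x70 pixels, and roughly 2000-3900 bytes on disk.
function matchesObservedImageCaptcha(int $width, int $height, int $bytes): bool {
    return $width === 200
        && $height === 70
        && $bytes >= 2000
        && $bytes <= 3900;
}

// Usage with a saved file; the path is an assumption.
$path = "google/0.jpg";
if (is_file($path)) {
    $info = getimagesize($path); // [0] = width, [1] = height, or false on failure
    if ($info !== false && matchesObservedImageCaptcha($info[0], $info[1], filesize($path))) {
        echo "$path looks like a normal capture.\n";
    } else {
        echo "$path is outside the observed ranges.\n";
    }
}
```

Anything that fails the check is probably an error page saved with a .jpg extension, so it's worth a second look before it goes into a sample set.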

Here is the code for the Google Audio Captcha:

//This script pulls each CAPTCHA's URL from $urlGoogleAudio, then downloads the CAPTCHA and saves it to the folder $saveGoogleAudio, for every index from $startSound to $endSound.
	$urlGoogleAudio = "https://www.google.com/accounts/NewAccount?service=mail&continue=http%3A%2F%2Fmail.google.com%2Fmail%2Fe-11-10ba05aeaa8e9b701e5151437f9a44d3-64aeae753cc34f1c864f7edc97a046ccdc96987b&type=2";
	$saveGoogleAudio = "googleaudio/";
	$startSound = 0;
	$endSound = 999;
 
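	//These lines force the output to be constantly flushed and updated for the user. (ideally)
	//Only close an output buffer if one is actually open, otherwise PHP raises a notice.
	ob_implicit_flush(true);
	if (ob_get_level() > 0) ob_end_flush();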
	echo "Script Started.\n";
 
	//Pull in the CAPTCHA audio as a string with cURL, and save to a file. The curl extension must first be enabled in php.ini.
	for ($i=$startSound;$i<=$endSound;$i++) {
		//First extract a unique URL for each CAPTCHA from the $urlGoogleAudio.
		$ch = curl_init();
		curl_setopt($ch, CURLOPT_URL, $urlGoogleAudio."&rand=".$i);
		curl_setopt($ch, CURLOPT_HEADER, 0);
		curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
		curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
 
		//If you're having difficulties with SSL, this may need to be enabled.
		//curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
		$result = curl_exec($ch);
		//Enable this if you're having difficulties.
		//echo "Error is: ".curl_error($ch);
		curl_close($ch);
 
		//Parse out the URL, and retrieve the CAPTCHA for it.
		$result = substr($result,strpos($result,"wavURL"));
		$resultArray = explode('"',$result);
		$ch = curl_init();
		curl_setopt($ch, CURLOPT_URL, str_replace('\75',"=",$resultArray[1]));
		curl_setopt($ch, CURLOPT_HEADER, 0);
		curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
		curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
		//If you're having difficulties with SSL, this may need to be enabled.
		//curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
		$sound = curl_exec($ch);
		//Enable this if you're having difficulties.
		//echo "Error is: ".curl_error($ch);
		curl_close($ch);
 
		//Save CAPTCHA to a file with the same name as $i.
		if(!is_dir($saveGoogleAudio)) mkdir($saveGoogleAudio);
		if(strlen($sound) > 146) {
			$fh = fopen($saveGoogleAudio.$i.".wav","wb");
			fwrite($fh,$sound);
			fclose($fh);
		}
		//A response of 146 bytes or less is treated as a failed download, so retry the same index.
		else $i--;
 
		//Don't allow it to timeout.
		set_time_limit(40);
		//Output occasional progress.
		if ($i%10 == 0) {
			echo $i." CAPTCHA captured.\n";
			flush();
		}
	}
 
	echo "Script Complete.";
	//-maluc

And info about the audio captchas as well:

length: not certain (5-10?)
range: 0-9
case-sensitive: N/A
background: equally loud gibberish and noise, really gets in the way.
size: 200044-440044 bytes
other: way too hard for a human - don’t know how blind people do it. pace varies but pitch seems to remain fairly similar.
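
The audio script only keeps downloads longer than 146 bytes. A slightly stronger sanity check is to look for the RIFF/WAVE signature at the start of the data. This is a minimal sketch; the function name `looksLikeWav` and the test inputs are made up. As a side note, every size listed above ends in 044, which would be consistent with a 44-byte canonical PCM WAV header in front of the samples, but that's a guess on my part.

```php
<?php
// Hypothetical helper (not part of the original script): true if $data
// starts with a RIFF/WAVE header. Bytes 0-3 are "RIFF", bytes 4-7 hold a
// little-endian chunk size, and bytes 8-11 are "WAVE".
function looksLikeWav(string $data): bool {
    return strlen($data) >= 12
        && substr($data, 0, 4) === "RIFF"
        && substr($data, 8, 4) === "WAVE";
}

// Made-up inputs: a minimal header padded with zero bytes, and an HTML error page.
$fakeWav = "RIFF" . pack("V", 36) . "WAVE" . str_repeat("\0", 32);
var_dump(looksLikeWav($fakeWav));
var_dump(looksLikeWav("<html>error</html>"));
```

A check like this could sit next to the length test, so an error page that happens to be longer than 146 bytes doesn't get saved as a .wav.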

Google Human Reviewer Guidelines (Leaked)

When trying to decide if a page is Spam, it is helpful to ask yourself this question: If I remove the scraped (copied) content, the ads, and the links to other pages, is there anything of value left? If the answer is no, the page is probably Spam.

Download it here: quality-rater-guidelines-2007.pdf!

And before you start bitching… This file is everywhere now. At [YACG]’s forum, BlackHatWorld, Syndk8, IranJava…

Google Bowling + DDoS = Pwnage!

Guess you all know about the ongoing controversy about someone abusing Eli’s tool QUIT and a DDoS against BlueHatSeo.com. From what I’ve read, I think somebody is pissed with Eli because his blog is disclosing too much information (?). Who knows what the real motive is… But this made me realize that no matter how good things seem to be going for your sites and rankings… You can NEVER get too comfy.

I know a DDoS may not seem that bad: your site goes offline, you burn all your bandwidth, and after a couple of days you’re back online… right?

Well, maybe.. But what if the perpetrator of the DDoS decides to take it one step further… Like a DDoS + Google Bowling attack:

Fatality

That’s right, it would be a fatality. A long-lasting DDoS attack plus some Google Bowling that results in de-indexing could be your site’s worst nightmare… And if it’s executed correctly, there is no going back for your site… Ever.
Now if you want to make it suck even more, add to it some hacking and some imagination.