用戶:下一次登錄/rfget.pl
這是一個Wikipedia:維基工具,是我參照User:下一次登錄/mwpush.pl寫的,沒有版權。我只提供代碼,並不對代碼中存在的錯誤負責。請使用者自己小心。
描述
編輯這個工具是使用Perl腳本語言寫成的,在WindowsXPCygwin的ActivePerl環境中測試通過。
這個工具的功能是:
- 連接任意頁面Special:Random,讀取其中的條目名稱,輸出到「fname.txt」文件中。
- 讀取該條目中的內容,輸出到「fout.txt」文件中。
使用
編輯步驟參考User:下一次登錄/mwpush.pl中的說明。
保存文件名為「rfget.pl」,幾個變量賦值的說明:
my $username="username"; (username是已经申请的用户名,只能是英文,例如ilikewikipedia) my $password="****"; (****填入对应的密码) my $WIKI_PATH="zh.wiki.x.io"; (缺省是中文维基,可以修改成为其他语言,比如en.wiki.x.io)
最後的命令行為
rfget.pl
代碼
編輯#!/usr/bin/perl # rfget.pl - Get content from a random wikipedia article # By WikiPedia:User:下一次登录 # Portions largely taken or based on upload.pl by WikiPedia:User:Eloquence # and mwpush.pl by WikiPedia:User:KeithTyler # call requirements use Getopt::Std; use LWP::Simple; use LWP::UserAgent; use HTTP::Request; use HTTP::Response; use HTTP::Cookies; #use warnings; my $username="username"; my $password="****"; my $WIKI_PATH="zh.wiki.x.io"; my $onlyifempty=$opt_e; ### Login to wiki # Set up connection data my $browser=LWP::UserAgent->new(); my @ns_headers = ( 'User-Agent' => 'MediaWiki Pusher 0.02 by Gfdrtf', #Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20041107 Firefox/1.0', 'Accept' => 'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*', 'Accept-Charset' => 'iso-8859-1,*,utf-8', 'Accept-Language' => 'en-US', ); # Hold cookies $browser->cookie_jar( {} ); # Make login request $response=$browser->post("http://".$WIKI_PATH."/w/index.php?title=Special:Userlogin&action=submitlogin", @ns_headers, Content=>[wpName=>$username,wpPassword=>$password,wpRemember=>"1",wpLoginAttempt=>"Log in"]); # After logging in, we should be redirected to another page. # If we aren't, something is wrong. # if($response->code!=302) { print "We weren't able to login. This could have the following causes: * The username ($username) or password may be incorrect. Solution: Re-run script with correct credentials. * The MediaWiki software on the target host has been upgraded. Solution: Make some corresponding changes of the codes. * You are trying to hack this script for other wikis. The wiki you are uploading to has cookie check disabled. Solution: Try setting \$ignore_login_error to 1. Regardless, we will now try to write the output from the server to rfget.debug.out....\n\n"; open(DEBUG,">rfget.debug.out") or die "Could not write file.\n"; print DEBUG $response->as_string; print "This seems to have worked. Take a look at the file for further information or send it to moeller AT scireview DOT de if you need help debugging the script.\n"; close(DEBUG); exit 1; } my $URL="http://".$WIKI_PATH."/wiki/Special:Random"; $response=$browser->get($URL, @ns_headers); my $filename1=$response->as_string; # extract the filename my $filestartstr = "<li id=\"t-permalink\"><a href=\"/w/index.php?title="; my $filestart = index($filename1, $filestartstr); $filestart+=49; my $fileendstr="&oldid="; my $fileend = index($filename1, $fileendstr); my $filename=substr($filename1, $filestart, $fileend-$filestart); $URL="http://".$WIKI_PATH."/w/index.php?title=".$filename."&action=edit"; $response=$browser->get($URL, @ns_headers); my $content1=$response->as_string; $filestartstr="<textarea tabindex='1' accesskey=\",\" name=\"wpTextbox1\" id=\"wpTextbox1\" rows='25'"; $fileendstr="</textarea>"; $filestart= index($content1, $filestartstr); $filestart+=92; $fileend= index($content1, $fileendstr); my $content2=substr($content1, $filestart, $fileend-$filestart); open INPUT, ">fout.txt"; print INPUT $content2; close INPUT; open INPUT, ">fname.txt"; print INPUT $filename; close INPUT; print "Everything seems to be OK. Log will be written to rfget.log.\n"; open(LOG,">rfget.log") or die "Could not write file.\n"; print LOG $response->as_string;