NAME
    WWW::GoKGS - KGS Go Server (http://www.gokgs.com/) Scraper

SYNOPSIS
      use WWW::GoKGS;

      my $gokgs = WWW::GoKGS->new;

      # Game archives
      my $game_archives_1 = $gokgs->scrape( '/gameArchives.jsp?user=foo' );
      my $game_archives_2 = $gokgs->game_archives->query( user => 'foo' );

      # Top 100 players
      my $top_100_1 = $gokgs->scrape( '/top100.jsp' );
      my $top_100_2 = $gokgs->top_100->query;

      # List of tournaments
      my $tourn_list_1 = $gokgs->scrape( '/tournList.jsp?year=2014' );
      my $tourn_list_2 = $gokgs->tourn_list->query( year => 2014 );

      # Information for the tournament
      my $tourn_info_1 = $gokgs->scrape( '/tournInfo.jsp?id=123' );
      my $tourn_info_2 = $gokgs->tourn_info->query( id => 123 );

      # The tournament entrants
      my $tourn_entrants_1 = $gokgs->scrape( '/tournEntrants.jsp?id=123&sort=n' );
      my $tourn_entrants_2 = $gokgs->tourn_entrants->query( id => 123, sort => 'n' );

      # The tournament games
      my $tourn_games_1 = $gokgs->scrape( '/tournGames.jsp?id=123&round=1' );
      my $tourn_games_2 = $gokgs->tourn_games->query( id => 123, round => 1 );

DESCRIPTION
    This module is a scraper for KGS Go Server ("http://www.gokgs.com/").
    KGS lets users play go, a board game also known as baduk (Korean) or
    weiqi (Chinese). The web server generates resources such as the Game
    Archives dynamically, but serves them only as HTML. This module
    provides another representation of those resources: Perl data
    structures.

    This class maps a URI that starts with "http://www.gokgs.com/" to the
    appropriate scraper. The supported resources on KGS are as follows:

    KGS Game Archives (http://www.gokgs.com/archives.jsp)
        Handled by WWW::GoKGS::Scraper::GameArchives.

    Top 100 KGS Players (http://www.gokgs.com/top100.jsp)
        Handled by WWW::GoKGS::Scraper::Top100.

    KGS Tournaments (http://www.gokgs.com/tournList.jsp)
        Handled by WWW::GoKGS::Scraper::TournList,
        WWW::GoKGS::Scraper::TournInfo, WWW::GoKGS::Scraper::TournEntrants
        and WWW::GoKGS::Scraper::TournGames.

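    The mapping from a URI to a scraper described above can be pictured
    as a lookup keyed by path. The following is only a minimal sketch
    under stated assumptions, not the module's actual implementation:
    the %SCRAPER_FOR table and the scraper_class_for() helper are
    hypothetical names introduced for illustration, and the sketch
    returns class names rather than scraper objects so that it runs
    without any network access or non-core modules.

```perl
use strict;
use warnings;

# Hypothetical dispatch table (an assumption for illustration):
# path on KGS => scraper class named in this document.
my %SCRAPER_FOR = (
    '/gameArchives.jsp'  => 'WWW::GoKGS::Scraper::GameArchives',
    '/top100.jsp'        => 'WWW::GoKGS::Scraper::Top100',
    '/tournList.jsp'     => 'WWW::GoKGS::Scraper::TournList',
    '/tournInfo.jsp'     => 'WWW::GoKGS::Scraper::TournInfo',
    '/tournEntrants.jsp' => 'WWW::GoKGS::Scraper::TournEntrants',
    '/tournGames.jsp'    => 'WWW::GoKGS::Scraper::TournGames',
);

# Hypothetical helper: returns the scraper class name for a relative or
# absolute KGS URI, or undef if the path is not in the table.
sub scraper_class_for {
    my $uri = shift;

    # Accept absolute URIs by stripping the KGS scheme and host.
    ( my $path = $uri ) =~ s{\Ahttp://www\.gokgs\.com(?=/)}{};

    # Drop the query string and fragment, leaving only the path.
    $path =~ s{[?#].*\z}{}s;

    return $SCRAPER_FOR{$path};
}

print scraper_class_for('/top100.jsp'), "\n";
print scraper_class_for('http://www.gokgs.com/tournInfo.jsp?id=123'), "\n";
```

    Keying the table by path alone means the relative and absolute forms
    of a URI resolve to the same entry, and an unsupported path simply
    yields "undef".
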
ATTRIBUTES
    $UserAgent = $gokgs->user_agent
        Returns an LWP::UserAgent object which is used to "GET" the
        requested resource. This attribute is read-only.

          use LWP::UserAgent;

          my $gokgs = WWW::GoKGS->new(
              user_agent => LWP::UserAgent->new(
                  agent => 'MyAgent/1.00'
              )
          );

    $CodeRef = $gokgs->html_filter
        Returns an HTML filter. Defaults to an anonymous subref which just
        returns the given argument ("sub { $_[0] }"). The callback is
        called with an HTML string. The return value is used as the
        filtered value. This attribute is read-only.

          my $gokgs = WWW::GoKGS->new(
              html_filter => sub {
                  my $html = shift;
                  $html =~ s/<.*?>//g; # strip HTML tags
                  $html;
              }
          );

    $CodeRef = $gokgs->date_filter
        Returns a date filter. Defaults to an anonymous subref which just
        returns the given argument ("sub { $_[0] }"). The callback is
        called with a date string such as "2014-05-17T19:05Z". The return
        value is used as the filtered value. This attribute is read-only.

          use Time::Piece qw/gmtime/;

          my $gokgs = WWW::GoKGS->new(
              date_filter => sub {
                  my $date = shift; # => "2014-05-17T19:05Z"
                  gmtime->strptime( $date, '%Y-%m-%dT%H:%MZ' );
              }
          );

    $GameArchives = $gokgs->game_archives
    $gokgs->game_archives( WWW::GoKGS::Scraper::GameArchives->new(...) )
        Can be used to get or set a scraper object which can "scrape"
        "/gameArchives.jsp". Defaults to a
        WWW::GoKGS::Scraper::GameArchives object.

    $Top100 = $gokgs->top_100
    $gokgs->top_100( WWW::GoKGS::Scraper::Top100->new(...) )
        Can be used to get or set a scraper object which can "scrape"
        "/top100.jsp". Defaults to a WWW::GoKGS::Scraper::Top100 object.

    $TournList = $gokgs->tourn_list
    $gokgs->tourn_list( WWW::GoKGS::Scraper::TournList->new(...) )
        Can be used to get or set a scraper object which can "scrape"
        "/tournList.jsp". Defaults to a WWW::GoKGS::Scraper::TournList
        object.

    $TournInfo = $gokgs->tourn_info
    $gokgs->tourn_info( WWW::GoKGS::Scraper::TournInfo->new(...) )
        Can be used to get or set a scraper object which can "scrape"
        "/tournInfo.jsp". Defaults to a WWW::GoKGS::Scraper::TournInfo
        object.

    $TournEntrants = $gokgs->tourn_entrants
    $gokgs->tourn_entrants( WWW::GoKGS::Scraper::TournEntrants->new(...) )
        Can be used to get or set a scraper object which can "scrape"
        "/tournEntrants.jsp". Defaults to a
        WWW::GoKGS::Scraper::TournEntrants object.

    $TournGames = $gokgs->tourn_games
    $gokgs->tourn_games( WWW::GoKGS::Scraper::TournGames->new(...) )
        Can be used to get or set a scraper object which can "scrape"
        "/tournGames.jsp". Defaults to a WWW::GoKGS::Scraper::TournGames
        object.

INSTANCE METHODS
    $HashRef = $gokgs->scrape( '/gameArchives.jsp?user=foo' )
    $HashRef = $gokgs->scrape( 'http://www.gokgs.com/gameArchives.jsp?user=foo' )
        A shortcut for:

          my $uri = URI->new( 'http://www.gokgs.com/gameArchives.jsp?user=foo' );
          my $game_archives = $gokgs->game_archives->scrape( $uri );

        See WWW::GoKGS::Scraper::GameArchives for details.

    $HashRef = $gokgs->scrape( '/top100.jsp' )
    $HashRef = $gokgs->scrape( 'http://www.gokgs.com/top100.jsp' )
        A shortcut for:

          my $uri = URI->new( 'http://www.gokgs.com/top100.jsp' );
          my $top_100 = $gokgs->top_100->scrape( $uri );

        See WWW::GoKGS::Scraper::Top100 for details.

    $HashRef = $gokgs->scrape( '/tournList.jsp?year=2014' )
    $HashRef = $gokgs->scrape( 'http://www.gokgs.com/tournList.jsp?year=2014' )
        A shortcut for:

          my $uri = URI->new( 'http://www.gokgs.com/tournList.jsp?year=2014' );
          my $tourn_list = $gokgs->tourn_list->scrape( $uri );

        See WWW::GoKGS::Scraper::TournList for details.

    $HashRef = $gokgs->scrape( '/tournInfo.jsp?id=123' )
    $HashRef = $gokgs->scrape( 'http://www.gokgs.com/tournInfo.jsp?id=123' )
        A shortcut for:

          my $uri = URI->new( 'http://www.gokgs.com/tournInfo.jsp?id=123' );
          my $tourn_info = $gokgs->tourn_info->scrape( $uri );

        See WWW::GoKGS::Scraper::TournInfo for details.

    $HashRef = $gokgs->scrape( '/tournEntrants.jsp?id=123&sort=n' )
    $HashRef = $gokgs->scrape( 'http://www.gokgs.com/tournEntrants.jsp?id=123&sort=n' )
        A shortcut for:

          my $uri = URI->new( 'http://www.gokgs.com/tournEntrants.jsp?id=123&sort=n' );
          my $tourn_entrants = $gokgs->tourn_entrants->scrape( $uri );

        See WWW::GoKGS::Scraper::TournEntrants for details.

    $HashRef = $gokgs->scrape( '/tournGames.jsp?id=123&round=1' )
    $HashRef = $gokgs->scrape( 'http://www.gokgs.com/tournGames.jsp?id=123&round=1' )
        A shortcut for:

          my $uri = URI->new( 'http://www.gokgs.com/tournGames.jsp?id=123&round=1' );
          my $tourn_games = $gokgs->tourn_games->scrape( $uri );

        See WWW::GoKGS::Scraper::TournGames for details.

    $scraper = $gokgs->get_scraper( $path )
        Returns a scraper object which can "scrape" a resource located at
        $path on KGS. If no scraper is registered for $path, "undef" is
        returned.

          my $game_archives = $gokgs->get_scraper( '/gameArchives.jsp' );
          # => WWW::GoKGS::Scraper::GameArchives object

    $gokgs->set_scraper( $path => $scraper )
    $gokgs->set_scraper( $p1 => $s1, $p2 => $s2, ... )
        Can be used to set a scraper object which can "scrape" a resource
        located at $path on KGS. You can also set multiple scrapers in one
        "set_scraper" call.

          use Web::Scraper;
          use WWW::GoKGS::Scraper::FooBar; # isa WWW::GoKGS::Scraper

          $gokgs->set_scraper(
              '/fooBar.jsp' => WWW::GoKGS::Scraper::FooBar->new,
              '/barBaz.jsp' => scraper {
                  process '.bar', baz => 'TEXT';
                  ...
              }
          );

CLASS METHODS
    $class->mk_accessors( $path )
    $class->mk_accessors( @paths )
        Creates the accessor method for a scraper which can "scrape"
        $path. You can also create multiple accessors in one
        "mk_accessors" call.

          use parent 'WWW::GoKGS';

          # Generates foo_bar() whose builder is _build_foo_bar()
          __PACKAGE__->mk_accessors( '/fooBar.jsp' );

          # Build a scraper object which can scrape /fooBar.jsp
          sub _build_foo_bar {
              my $self = shift;
              ...
          }

    $CodeRef = $class->make_accessor( $path )
        Returns a subroutine reference which acts as an accessor for the
        scraper which can "scrape" $path.

    $accessor_name = $class->accessor_name_for( $path )
        Returns the accessor name of a scraper which can "scrape" $path.

          my $accessor_name = $class->accessor_name_for( '/fooBar.jsp' );
          # => "foo_bar"

    $builder_name = $class->builder_name_for( $path )
        Returns the builder name of a scraper which can "scrape" $path.

          my $builder_name = $class->builder_name_for( '/fooBar.jsp' );
          # => "_build_foo_bar"

WRITING SCRAPERS
    KGS scrapers should use a namespace which starts with
    "WWW::GoKGS::Scraper::" and should be a subclass of
    WWW::GoKGS::Scraper, so that users can either use the scraper on its
    own or add the scraper object to a "WWW::GoKGS" object as follows:

      use WWW::GoKGS::Scraper::FooBar; # your scraper

      # using set_scraper()
      $gokgs->set_scraper(
          '/fooBar.jsp' => WWW::GoKGS::Scraper::FooBar->new
      );

      # by subclassing
      use parent 'WWW::GoKGS';
      __PACKAGE__->mk_accessors( '/fooBar.jsp' );
      sub _build_foo_bar { WWW::GoKGS::Scraper::FooBar->new }

ENVIRONMENT VARIABLES
    AUTHOR_TESTING
        Some tests for scrapers send HTTP requests to "GET" resources on
        KGS. When you run "./Build test", they are skipped by default to
        avoid overloading the KGS server. To run those tests, you have to
        set "AUTHOR_TESTING" to true explicitly:

          $ perl Build.PL
          $ env AUTHOR_TESTING=1 ./Build test

        Author tests are run by Travis CI once a day. You can visit the
        website to check whether the tests passed or not.

ACKNOWLEDGEMENT
    Thanks to wms, the author of KGS Go Server, we can enjoy playing go
    online for free.

SEE ALSO
    KGS Go Server (http://www.gokgs.com/), Web::Scraper

AUTHOR
    Ryo Anazawa (anazawa@cpan.org)

LICENSE
    This module is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself. See perlartistic.