
rss reader in perl

Name: Anonymous 2011-01-19 23:04

Not sure if this board is helpful with this kind of stuff, but I'm trying to teach myself Perl by writing simple scripts. This is an RSS reader that prints items containing the text I want to the command prompt. It sometimes prints items that don't contain the text, and it also displays matching items at least twice.
Anything obvious?

#!/usr/bin/perl

$feed_link="http://boards.4chan.org/n/index.rss";
$feed_file="/scripts/rss/feed.txt";
@linkarray = (" ");
while(1) {
system("wget -q $feed_link -O /scripts/rss/feed.txt");

open(RSSFILE, "<", $feed_file);

while(<RSSFILE>)
{
    if(/<item>/../<\/item>/)
    {
      if(/<title>/../<\/title>/)
      {
        $whitespace = index $_, "<";
        $title_string = substr $_, $whitespace;
        $title_string =~ s/<title>//g;
        $title_string =~ s/<\/title>//g;
        chomp($title_string);
      }
      elsif(/<link>/../<\/link>/)
      {
        $whitespace = index $_, "<";
        $link_string = substr $_, $whitespace;
        $link_string =~ s/<link>//g;
        $link_string =~ s/<\/link>//g;
        chomp($link_string);
      }
      elsif(/<description>/../<\/description>/)
      {
        $whitespace = index $_, "<";
        $description_string = substr $_, $whitespace;
        $description_string =~ s/<description>//g;
        $description_string =~ s/<\/description>//g;
        chomp($description_string);
      } 
    }
  if($link_string ~~ @linkarray) { }
  else
  {

    if(($title_string =~ m/(deen|studio|a)/i) or ($description_string =~ m/(deen|studio|a)/i))
    {
      print "\n********************\n$title_string\n$link_string\n********************\n";
      push(@linkarray, $link_string);
    }
  }
}
sleep(60);
}

Name: Anonymous 2011-01-19 23:56

Parsing XML with regexen is never pretty.  Try something like this:

#!/usr/bin/perl

use 5.008001;    # 5.010 if you really care about ~~
use strict;
use LWP::UserAgent;
use XML::Twig;

our $feed_link="http://boards.4chan.org/n/index.rss";
our @links = ...;

my $parser = new XML::Twig twig_handlers => {
    item => sub {
        my ($twig, $elem) = @_;
        my $title = $elem->field('title');
        my $link = $elem->field('link');
        my $description = $elem->field('description');
        ... # do your printing here!
        $twig->purge;
    },
};

my $ua = new LWP::UserAgent;
my $response = $ua->get($feed_link);
$parser->parse($response->content) if $response->is_success;
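
For the printing step inside the handler, here is a minimal sketch rather than a drop-in fix. In the original loop the match test runs on every input line, so when a new <link> line arrives the title has been updated but the description still belongs to the previous item; deciding once per item, after all three fields are known, avoids that. The names %seen, $wanted and report_item are made up for illustration, and dropping the lone "a" from the pattern is an assumption on my part, since it matches nearly any title.

```perl
use strict;
use warnings;

# Hypothetical helper names; only the keywords come from the original script.
my %seen;                          # links that have already been printed
my $wanted = qr/deen|studio/i;     # assumption: the lone "a" was a leftover

# Returns the text to print for a new, matching item, or undef otherwise.
sub report_item {
    my ($title, $link, $description) = @_;

    # Treat missing fields as empty strings so the matches never warn.
    for ($title, $link, $description) {
        $_ = '' unless defined $_;
    }

    return if $seen{$link};        # this link was reported earlier
    return unless $title =~ $wanted || $description =~ $wanted;

    $seen{$link} = 1;
    return "\n********************\n$title\n$link\n********************\n";
}
```

Inside the item handler this would be called as my $out = report_item($title, $link, $description); followed by print $out if defined $out;. A hash lookup also works on 5.8, so ~~ is not needed.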

Name: Anonymous 2011-01-20 0:37

>>2
Says
Can't locate XML/Twig.pm in @INC (@INC contains: C:/strawberry/perl/lib C:/strawberry/perl/site/lib C:\strawberry\perl\vendor\lib .) at new.perl line 6.
BEGIN failed--compilation aborted at new.perl line 6.

So I downloaded Twig.pm and put it in the @INC folders. Still didn't work.
I suck at this

Name: Anonymous 2011-01-20 1:23

Name: Anonymous 2011-01-20 1:54

So I [ didn't use CPAN ]
I suck at this

Yes, you do. It's okay. Use CPAN.
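
Copying a lone Twig.pm around tends not to be enough, because XML::Twig sits on top of XML::Parser; the CPAN client (for example, cpan XML::Twig at the command prompt) pulls in dependencies for you. To see what Perl can actually load, a small check like the following may help. It is only a sketch, and module_available is a made-up helper name, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded from @INC.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # XML::Twig -> XML/Twig.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Report on the two non-core modules the script above depends on.
for my $module (qw(LWP::UserAgent XML::Twig)) {
    printf "%-16s %s\n", $module,
        module_available($module) ? 'found' : "missing (try: cpan $module)";
}
```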

Name: Anonymous 2011-01-20 11:57

>>3
I prefer XML::Parser. It gives you 4 ways of dealing with XML data and, IMO, is simpler.

Name: Anonymous 2011-03-21 22:58

[b]d[/b]
