1
Name:
Anonymous
2008-10-06 16:35
Hey EXPERT PROGRAMMERS
I want to make something that'll archive the imageboards I want. I know the text field is limited to 2000 characters, but what about the name/mail/subject ones? Oh, and how does it work with Unicode characters?
24
Name:
Anonymous
2008-10-07 15:31
>>23
> Can somebody explain now how I can use regexes?
Sure.
$text=~m! <td \s id="(\d+)"[^>]*> \s*
<input[^>]*><span \s class="replytitle">(?>(.*?)</span>) \s*
<span \s class="commentpostername">(?:<span [^>]*>)?(?:<a \s href="mailto:([^"]*)"[^>]*>)?([^<]*?)(?:</a>)?(?:</span>)?</span>
(?: \s* <span \s class="postertrip">(?:<span [^>]*>)?([a-zA-Z0-9\.\+/\!]+)(?:</a>)?(?:</span>)?</span>)?
(?: \s* <span \s class="commentpostername"><span [^>]*>\#\# \s (.?)[^<]*</span></span>)?
\s ([^>]*) \s \s* <span[^>]*> \s*
(?>.*?</span>) \s*
(?:
<br> \s*
<span \s class="filesize">File \s :
<a \s href="([^"]*/src/\d+\.\w+)"[^>]*>[^<]*</a> \s*
\- \s* \((Spoiler \s Image,)?([\d\sGMKB\.]+)\, \s (\d+)x(\d+)(?:, \s* <span \s title="([^"]*)">[^<]*</span>)?\)
</span> \s*
(?:
<br>\s*<a[^>]*><img \s+ src=\S* \s+ border=\S* \s+ align=\S* \s+ (?:width=(\d+) \s height=(\d+))? [^>]*? md5="?([\w\d\=\+\/]+)"? [^>]*? ></a> \s*
|
<a[^>]*><span \s class="tn_reply"[^>]*>Thumbnail \s unavailable</span></a>
)
|
<br> \s*
<img [^>]* alt="File \s deleted\." [^>]* > \s*
)?
<blockquote>(?>(.*?)</blockquote>)</td></tr></table>
!xs or $self->troubles("error parsing post\n------\n$text\n------\n") and return;
And that's how you parse a post using the power of [o][u][b]regexps[/b][/u][/o].
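If that wall of line noise is too much to read in one go, here is a cut-down sketch of the same idea. The markup in $text is made up and much simpler than real board HTML, and parse_post is just a name picked for this example. It only shows the pieces the big regex relies on: the ! delimiter, the /x and /s flags, and the (?>...) atomic groups.

```perl
use strict;
use warnings;

# Hypothetical, simplified post markup; real board HTML has many more cases
# (tripcodes, mailto links, file info, deleted images).
my $text = <<'HTML';
<td id="24" class="reply">
<span class="replytitle">regexes</span>
<span class="commentpostername">Anonymous</span>
<blockquote>Sure.<br>Like this.</blockquote></td>
HTML

# parse_post is a made-up helper name for this sketch.
sub parse_post {
    my ($html) = @_;

    # /x: literal whitespace in the pattern is ignored, so real whitespace
    #     in the input has to be matched with \s.
    # /s: "." also matches newlines, so a comment may span several lines.
    # (?>...) is an atomic group: once the lazy .*? has reached the closing
    #     tag, the engine will not backtrack into it again.
    $html =~ m!
        <td \s id="(\d+)"[^>]*> \s*
        <span \s class="replytitle">(?>(.*?)</span>) \s*
        <span \s class="commentpostername">([^<]*)</span> \s*
        <blockquote>(?>(.*?)</blockquote>)
    !xs or return;

    return { num => $1, title => $2, name => $3, comment => $4 };
}

my $post = parse_post($text) or die "error parsing post\n";
print "$post->{num}: $post->{name} wrote: $post->{comment}\n";
```

With the sample above this prints "24: Anonymous wrote: Sure.<br>Like this.". Anything that does not look like a post makes parse_post return nothing, which is the same "match or report trouble and bail out" shape as the full version.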