Name: Anonymous 2007-09-27 6:43 ID:x5s6Nq2a
I have just written this function in PHP. It is a small part of a search engine: the function searches through a string (in this case the content field of a page), attempts to pick chunks of text out of it that start at the keywords, and joins them using "...".
This is similar to how Google works, although I'm sure their algorithm is more advanced than this 5-minute job.
NOTES: getMeaningfulContent($s = string, $p = keywords (string|array), $n = int (number of characters to seek))
<?php
function getMeaningfulContent($s, $p, $n = 250) {
    if (strlen($s) <= $n) {
        return $s; // short enough to return unchanged
    }
    $buffer = '';
    if (is_array($p)) {
        foreach ($p as $v) {
            $t_poz_start = stripos($s, $v); // case-insensitive search
            if ($t_poz_start !== false) { // we've found an occurrence
                // give each keyword an equal share of the $n characters;
                // substr() takes a length, not an end position
                $t_chars_to_read = max(1, (int) floor($n / count($p)));
                $buffer .= '...' . substr($s, $t_poz_start, $t_chars_to_read);
            }
        }
    }
    else {
        $t_poz_start = stripos($s, $p); // case-insensitive search
        if ($t_poz_start !== false) { // we've found an occurrence
            $buffer = substr($s, $t_poz_start, $n); // at most $n characters
        }
    }
    return $buffer; // empty string if no keyword was found
}
?>
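To see the chunking idea on its own, here is a minimal sketch. excerptSketch() and its parameters are hypothetical names made up for this example, not the function above, and the sample text is invented. It only shows the core step: find the first case-insensitive match of each keyword, cut a fixed number of characters from there, and join the pieces with "...".

```php
<?php
// Hypothetical helper for illustration only (not the function from the post).
// Picks one chunk per keyword, starting at its first case-insensitive match,
// and joins the chunks with '...'.
function excerptSketch($text, array $keywords, $chunkLen) {
    $chunks = array();
    foreach ($keywords as $keyword) {
        $pos = stripos($text, $keyword);
        if ($pos !== false) {
            // substr() takes a start offset and a length
            $chunks[] = substr($text, $pos, $chunkLen);
        }
    }
    // Keywords that never match contribute nothing to the excerpt.
    return count($chunks) > 0 ? '...' . implode('...', $chunks) : '';
}

$page = 'PHP is a scripting language. A search engine indexes pages and ranks them.';
echo excerptSketch($page, array('search', 'SCRIPTING', 'missing'), 13), "\n";
// prints "...search engine...scripting lan"
```

The chunks come out in keyword order, not in the order they appear on the page, and a chunk keeps the casing of the page text rather than the keyword.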