
Automatic language classification

Name: Anonymous 2011-06-24 18:25



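#!/usr/bin/python2

import sys
import bz2

def classify(text, langs=('english', 'german', 'french')):
    """Rank langs from most to least likely for text.

    Expects a reference corpus named <lang>.txt in the working directory.
    """
    results = {}
    for lang in langs:
        with open(lang + '.txt') as f:
            corpus = f.read()

        # Extra bytes bz2 needs once text is appended to the corpus;
        # the smaller the increase, the more text resembles the corpus.
        compressed = len(bz2.compress(corpus))
        results[lang] = len(bz2.compress(corpus + text)) - compressed

    # Smallest increase first.
    return sorted(results, key=results.__getitem__)

if __name__ == '__main__':
    print "Most likely %s." % classify(sys.stdin.read())[0].capitalize()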

$ wget -qO - http://www.gutenberg.org/ebooks/31469.txt.utf8 | ./classific.py
Most likely English.
$ wget -qO - http://www.gutenberg.org/ebooks/22367.txt.utf8 | ./classific.py
Most likely German.
$ wget -qO - http://www.gutenberg.org/ebooks/4968.txt.utf8 | ./classific.py
Most likely French.
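
The script above scores each language by how many extra bytes bz2 needs once the input is appended to that language's corpus, and picks the smallest increase. Here is a rough Python 3 sketch of the same idea, assuming the corpora are passed in as in-memory byte strings instead of being read from english.txt and friends. The names compression_deltas and rank_languages and the toy sample strings are made up for illustration and are not from the original post; corpora this small are far too short to give trustworthy rankings.

```python
import bz2


def compression_deltas(text, corpora):
    """Map each language to the extra bytes needed to compress corpus + text."""
    deltas = {}
    for lang, corpus in corpora.items():
        baseline = len(bz2.compress(corpus))
        deltas[lang] = len(bz2.compress(corpus + text)) - baseline
    return deltas


def rank_languages(text, corpora):
    """Return the languages ordered from smallest to largest delta."""
    deltas = compression_deltas(text, corpora)
    return sorted(deltas, key=deltas.get)


if __name__ == "__main__":
    # Hypothetical toy corpora; real use wants large samples per language.
    corpora = {
        "english": b"the quick brown fox jumps over the lazy dog " * 50,
        "german": b"der schnelle braune fuchs springt ueber den faulen hund " * 50,
        "french": b"le rapide renard brun saute par-dessus le chien paresseux " * 50,
    }
    sample = b"the dog jumps over the fox"
    print(compression_deltas(sample, corpora))
    print(rank_languages(sample, corpora))
```

Everything is bytes here because Python 3's bz2.compress does not accept str; when piping from stdin, sys.stdin.buffer.read() gives the raw bytes.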

Name: Anonymous 2011-06-26 0:17

>>22>>23
lol, she thinks she won't lose her job for being a sexist pig with ugly penis.

One of British soccer's leading television commentators was fired Tuesday, a day after being taken off the air and temporarily suspended for making sexist remarks about a female match official.

Andy Gray, the face of Sky Sports' soccer coverage for the past two decades, was dismissed by the broadcaster after "new evidence of unacceptable and offensive behavior" that took place off-air last month.

The former Scotland striker and broadcast colleague Richard Keys had been reprimanded and removed from duty Monday for making derogatory comments about lineswoman Sian Massey, former referee Wendy Toms and West Ham executive Karren Brady.

In an off-air exchange with Andy Gray, Keys commented that "Somebody better get down there and explain offside to her." After Gray suggested that "Women don't know the offside rule", Keys remarked "Course they don't. I can guarantee you there will be a big one today."
