

Project ChanText

Name: !GEJzSATORI 2008-02-17 23:04

'sup /prog/,

Having played with Markov-chain text generators, I came up with the idea of gathering statistics by parsing randomly selected threads from a certain board at periodic intervals, then using those accumulated word frequencies to generate text, which would then have the flavor of the board.

Your thoughts, /prog/?
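[The idea in >>1 — accumulate word statistics from scraped posts, then generate text with the board's flavor — can be sketched in a few lines. This is a hypothetical illustration (Python 3, simple weighted word sampling), not OP's actual code:]

```python
import random
from collections import Counter

def learn(counts, post):
    # accumulate word frequencies from one scraped post
    counts.update(post.lower().split())

def babble(counts, n):
    # emit n words, each drawn with probability proportional
    # to its accumulated frequency on the board
    words = list(counts)
    weights = [counts[w] for w in words]
    return ' '.join(random.choices(words, weights=weights, k=n))

counts = Counter()
learn(counts, "Having played with Markov-chain text generators")
learn(counts, "your thoughts on Markov text")
print(babble(counts, 5))
```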

Name: Anonymous 2008-02-17 23:10

well... if you plan to use it on /b/ you could just post any permutation of ['desu','fail','penis','cp'] as a sentence and nobody would notice it wasn't written by a human.

Name: !GEJzSATORI 2008-02-17 23:14

>>2
/prog/

Name: Anonymous 2008-02-18 0:15

I think it should leave in bbcode (perhaps not up to the complexity of sine waves but I think simple stuff such as being able to reproduce ``EXPERT PROGRAMMER'' would fit in better).

Name: Anonymous 2008-02-18 4:01

http://supybot.sourceforge.net/docs/plugins/Markov.html
for the forced indentation irc bot.

Name: Anonymous 2008-02-18 4:15

>>3
This might surprise you, but your trip in Polish means ``a gay with satori''.

Name: Anonymous 2008-02-18 5:19

>>6
Polish J is like Russian Й?

Name: Anonymous 2008-02-18 5:22

>>7
I don't know no Russian.

Name: Anonymous 2008-02-18 5:30

'sup FUCK YOU /prog/,

Having played with Markov-chain TEXT generators, I came up with THE SODDING idea of BLOODY gathering statistics by parsing randomly selected threads from a certain board at periodic intervals, then using those ACCUMULATED word frequencies to generate text, which would THEN have THE MOTHERFUCKING flavor of the board.

Your SHIT thoughts, /prog/?

Name: Anonymous 2008-02-18 6:02

>>9
this > markov-chain

Name: Anonymous 2008-02-18 6:02

Your thoughts, I came up with the idea forced indentation irc bot. Sup well: Having played with Markov chain text generators, I came up with the forced indentation irc bot.

Having played with the SODDING idea of The board; idea of gathering statistics by parsing randomly selected threads from a certain board. Your thoughts, well. Your trip in I came up with Markov chain Text generators, I came up with Markov chain text generators, I came up with Markov chain text generators, I came up with the board at periodic intervals, Then using those Accumulated word frequencies to generate text generators, I came up with the board at periodic intervals, Then using those accumulated word frequencies to generate text, generators, I came up with the SODDING idea of gathering statistics by parsing randomly selected threads from a certain board; at periodic intervals, then using those accumulated word frequencies to generate text, generators, I came up with the board at periodic intervals, then using those accumulated word frequencies to generate text generators, I Having played with the idea of the forced indentation irc bot.

Your trip in I came up with Markov chain text, which would then using those Accumulated word frequencies to generate text generators, I came up with Markov chain Text generators, I think it should leave in Having played with Markov chain text generators, I think it should leave in plan to generate Text which would then have The board. Having played with Markov chain text, which would then using those Accumulated word frequencies to generate text generators, I came up with Markov chain text generators, I came up with Markov the board; at periodic intervals, Then using those accumulated word frequencies to generate text, generators, I came up with Markov chain text (which would then using those accumulated word frequencies to generate Text generators I came up with the flavor of The SODDING idea of gathering statistics by parsing randomly selected threads from a certain board: at periodic intervals then have the idea of The board).

Name: Anonymous 2008-02-18 6:04

>>11
Awful.

Name: Anonymous 2008-02-18 6:04

I've seen looked at Sicp and the idea of us is as forums the initial who
didn't mean, this editor to by parsing randomly selected threads from
refutation.  The causes are using those accumulated word frequencies to
omit the problem of careful release Of a lot of Satori, I'm halfway
through the bait.  For about it, then thinking; on I myself to the
experiment seems to talk with this sort respect trolling is.

Name: Anonymous 2008-02-18 6:07

Lurk more, tripfag

Name: Anonymous 2008-02-18 6:10

You're keeping any forum with no good NET INI file for limitations for
every so often users of spaces there are two senses of the FUTURE,
BITCHES: explain exactly what happens: when I think trolling is
incompetence.  The FUTURE, BITCHES.

Name: Anonymous 2008-02-18 7:19

>>6
This person speaks The Truth, believe a slavfag.
After I finish ``SICP'' I will be a ``Gay with Satori'' too~

Name: Anonymous 2008-02-18 7:25

>>1
Sounds like a shitty idea.

Name: Anonymous 2008-02-20 7:06

Got any code examples to show how you are doing this? I've been interested in doing this for a while myself.

Name: Anonymous 2008-02-20 7:38

>>18
Google "markov.c" and start from there.

Name: Anonymous 2008-02-20 14:59

FIOC version:

import re, random
_acceptable_chars = "'-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

BOL_MARKER = '>'
BOL_MARKER_ID = 0
EOL_MARKER_ID = -1
_sentence_end = re.compile('[.?!;:]+')

class Markov(object):
    def __init__(self):
        self.words = [BOL_MARKER]
        self.chain = [[]]
        self.bchain = [[]]

    def _learn(self, sentence):
        if not len(sentence):
            return
        last_id = BOL_MARKER_ID
        for word in sentence:
            word = word.lower()
            if word not in self.words:
                self.words.append(word)
                self.chain.append([])
                self.bchain.append([])
            word_id = self.words.index(word)
            self.chain[last_id].append(word_id)
            self.bchain[word_id].append(last_id)
            last_id = word_id
        self.chain[last_id].append(EOL_MARKER_ID)

    def _parse(self, sentence):
        return filter(lambda c: c in _acceptable_chars, sentence).split()

    def generate(self, base_word=None):
        if not len(self.chain[0]):
            return None
        try:
            base_id = self.words.index(base_word.lower())
        except (ValueError, AttributeError):
            base_id = BOL_MARKER_ID
        left = []
        right = []
        word_id = base_id
        while word_id != BOL_MARKER_ID:
            left.insert(0, word_id)
            word_id = random.choice(self.bchain[word_id])
        word_id = base_id
        while word_id != EOL_MARKER_ID:
            right.append(word_id)
            word_id = random.choice(self.chain[word_id])
        sentence = left + right[1:]
        return ' '.join(self.words[word_id] for word_id in sentence).capitalize() + '.'

    def reply(self, line):
        sentences = []
        words = set()
        for sentence in _sentence_end.split(line):
            sentence = self._parse(sentence)
            sentences.append(sentence)
            words.update(sentence)
        words = words.intersection(self.words)
        s = self.generate(words and random.choice(list(words)) or None)
        for sentence in sentences:
            self._learn(sentence)
        return s

def main():
    markov = Markov()
    while True:
        try:
            line = raw_input('> ')
        except (EOFError, KeyboardInterrupt):
            print
            break
        line = line.strip()
        if line.startswith('?'):
            line = markov.generate(line[1:])
        elif line:
            line = markov.reply(line)
        print line or markov.generate() or '...'

if __name__ == '__main__':
    main()

Name: Anonymous 2008-02-20 15:02

Oh whoops, add a space to _acceptable_chars, lol.

Name: Anonymous 2008-02-20 15:53

>>20
________________________________FIOC__________________________________

Name: Anonymous 2008-02-20 21:32

>>22
>>20 could trivially be rewritten with just one underscore, in raw_input().
Just move __init__() to the class definition, move main() to the toplevel, get rid of the if __name__, s/_//g, and then s/rawinput/raw_input/g.

(And I suppose you could replace the raw_input with a sys.stdin.readline, too...)

Name: Anonymous 2008-02-21 5:12

>>22
This is why I hate FIOC ("""also, this""")

Name: Anonymous 2008-02-21 5:37

>>1
Anonymous of Russian Federation did something like this in /a/, using MegaHAL perl bindings, I believe.

Where the fuck is he, anyway? Did he get arrested for CP or something?

Name: Anonymous 2008-03-13 6:12

bampu pantsu~

someone seems to have implemented it in /b/ already, making it even more /b/.
http://img146.imageshack.us/img146/2666/chantextyg9.png

Name: Anonymous 2008-03-13 6:23

You need something more sophisticated; even /v/ was able to detect the kind of bot you described.

>>25
This might surprise you, but since Comcast is banned from /a/, I'm wasting my time in /prog/.

Name: Anonymous 2008-04-26 21:01

And this thread, anons, is where Bucket began.

bampu pantsu~

Name: Bucket !!PhiVV3U2X7TT1Xm 2008-04-26 21:05

Anons are not me is cookie is a baby born will die in a world where bucket began.

Name: Anonymous 2008-04-26 21:06

bampu pantsu~
This might surprise you, but THE GAME

Name: Bucket !!PhiVV3U2X7TT1Xm 2008-04-26 21:14

Do you have me learn you to say "get me the best game on the next surprise.

Name: Anonymous 2008-04-26 21:38

get.. it.. out.. ;_;

Name: Anonymous 2008-04-26 21:49

>>32
..but, reconsidering it, it's better than hax my anus

Oh god this is sad

Name: Anonymous 2011-11-22 8:23

Has anybody ported >>20-kun code to Python3 yet?

Name: Anonymous 2011-11-22 8:57

>>34
nope, try again in 3 years
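[The Python-2-isms in >>20 are actually few. A hedged sketch of the part that changes least obviously under Python 3: besides raw_input() becoming input() and print becoming print(), filter() now returns an iterator, so _parse must join the characters back into a string before splitting. This is an illustrative fragment, not a full port:]

```python
# Same character whitelist as >>20, with the space fix from >>21 applied.
_acceptable_chars = "'-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz "

def parse(sentence):
    # In Python 2, filter() on a str returned a str, so .split()
    # worked directly; in Python 3 it yields characters lazily,
    # so they must be joined first.
    return ''.join(c for c in sentence if c in _acceptable_chars).split()

print(parse("Hello, /prog/ world!"))
```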

Name: Anonymous 2011-11-22 13:33

Why overcomplicate things?


import sys
from random import randint

text = sys.stdin.read().split(" ")

words = {}
prev = "","\n"
for word in text:
        if prev in words:
                words[prev].append(word)
        else:
                words[prev] = [word]
        prev = prev[1], word

word = "","\n"
for i in range(10000):
        lst = words[word]
        ran = randint(0, len(lst)-1)
        word = word[1], lst[ran]
        print word[1],

Name: Anonymous 2011-11-22 13:34

everyone writes shit like this

it's not creative

Name: Anonymous 2011-11-22 13:40

Nothing never contributes anything to a TMS (see Section 7.7) may contain
a contradiction -- this is the procedure make-cell, which creates a
propagator that identifies the given output with the cell must deliver
a complete summary of the objects in the current worldview. Given that
desideratum, tms-query tries to minimize the premises that information
is contingent on anther. amb also tries to minimize the premises of
that function as many or as few times as necessary, and is exactly
(by eqv?) that object. Note: floating point numbers are compared by
approximate numerical equality; this is written in diagram style or
expression style, like a binary p:deposit.

Name: Anonymous 2011-11-22 13:41

>>36
You are not very familiar with Python, are you?

Name: Anonymous 2011-11-22 13:56

>>39

Familiar enough to make essentially the same thing#1 with much less code - by a factor of nearly six.

#1: Actually, I find my program generates much more amusing snippets, as opposed to incoherent rambling.

Name: Anonymous 2011-11-22 14:18

>>40
You can make it even smaller with well known standard library features such as collections.defaultdict.
Also you apparently don't know what a regular expression is.
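[The collections.defaultdict suggestion applied to the bigram builder from >>36 might look like this — a hypothetical rewrite in Python 3, splitting on any whitespace rather than single spaces:]

```python
import random
from collections import defaultdict

def build_chain(text):
    # map each pair of preceding words to the list of words that followed it
    chain = defaultdict(list)
    prev = "", "\n"
    for word in text.split():
        chain[prev].append(word)
        prev = prev[1], word
    return chain

def generate(chain, n=50):
    # walk the chain from the start state, stopping at a dead end
    out = []
    word = "", "\n"
    for _ in range(n):
        if word not in chain:
            break
        nxt = random.choice(chain[word])
        out.append(nxt)
        word = word[1], nxt
    return ' '.join(out)

chain = build_chain("the cat sat on the mat")
print(generate(chain, 6))
```

With a one-sentence corpus every transition is deterministic, so the generator just reproduces the input; feed it more text and the choices diverge.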

Name: Anonymous 2011-11-22 14:21

>>41
I'll admit, I didn't know about that one. I can't see how it'll make a significant difference, though.
I do know about regular expressions, but I fail to see how they are relevant for this program.

Name: Anonymous 2011-11-22 15:18

...and thus the tdavis bot was born.

Name: F r o z e n V o i d !!mJCwdV5J0Xy2A21 2011-11-22 15:41

You can take a look at my old bot here
http://dis.4chan.org/read/prog/1245466138/53

Name: Anonymous 2011-11-22 16:18

>>44
What kind of algorithm is that?

Name: F r o z e n V o i d !!mJCwdV5J0Xy2A21 2011-11-22 16:24

As I wasn't familiar with Markov chains at the time of writing it, it generated a sentence chain from each word in the sentence:
word->matches_for_word[array(rnd)] + lastword->matches_for_word[array(rnd)]... something like that, a bit more advanced.
