
md5 shit in python

Name: Anonymous 2009-01-20 23:22

import os, hashlib

for root, dirs, files in os.walk(os.path.abspath('')):
    for f in files:
        e = os.path.splitext(f)[1]
        if e and e not in [".py", ".db"]:
            p = os.path.join(root, f)  # walk() yields bare names; join with root or this breaks outside the top dir
            h = hashlib.md5(open(p, 'rb').read()).hexdigest()
            n = os.path.join(root, h + e)
            try:
                os.rename(p, n)
            except OSError:  # on Windows, rename fails if the target already exists
                os.unlink(n)
                os.rename(p, n)
            print n + " " + f


How do I optimize this? I'm watching SICP right now

Name: Anonymous 2009-01-21 10:07

If you're working with files of any significant size, reading them in chunks will speed things up, save memory, and increase your number of lines of code, thus providing job security.

import hashlib

shit = hashlib.md5()
bitch = open(whatever, 'rb')
fuck = bitch.read(4096)  # whatever. look up the size of your disk blocks, make it a multiple of that
while fuck:
    shit.update(fuck)
    fuck = bitch.read(4096)
bitch.close()
dick = shit.hexdigest()
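If your Python has iter() with a sentinel (2.2 and later), the same chunked read collapses into a for loop with nothing to get wrong. A sketch; the function name is mine, not anything standard:

```python
import hashlib

def md5_file(path, blocksize=4096):
    h = hashlib.md5()
    f = open(path, 'rb')
    try:
        # iter() with a sentinel keeps calling f.read(blocksize)
        # until it returns empty bytes, then stops the loop
        for chunk in iter(lambda: f.read(blocksize), b''):
            h.update(chunk)
    finally:
        f.close()
    return h.hexdigest()
```

Same memory behavior as the while loop, but the termination condition lives in one place instead of two.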


But really, the speed of this code is almost entirely dependent on the implementation of your hashing algorithm and the speed of your disk, and since MD5 isn't especially processor-intensive, even a modest implementation will outrun typical disk read speeds. You can tell where the bottleneck lies by comparing the processor load and disk activity graphs in a performance monitoring tool; I'd be almost certain that, for your code, you'll spend more time waiting for I/O than actually computing anything, and only a tiny fraction of that time is in the Python interpreter.

... oh, fuck, I just accidentally started writing a serious answer. Sorry.

REWRITE YOUR CODE IN LISP
