Unique IDs for sorting images

Name: Anonymous 2009-09-14 3:51

I've got a problem. I've been using MD5 hashes as the names for my images, but they're too damn long. What would you guys suggest as a shorter alternative?

Name: Anonymous 2009-09-14 4:29

Are you aware the MD5 hash of an image bears no relation to any potential, natural, or possibly inferred "ordering" of images and that it is indeed debatable whether or not there even exists an ordering mechanism for images that a human would appreciate?
How about a descriptive naming scheme with a decent folder hierarchy?

Name: Anonymous 2009-09-14 4:45

why not just use a number that's incremented for each file?
seems like the sensible thing to do

Name: Anonymous 2009-09-14 4:56

If your database can't handle a 128-bit number, I suggest you move to another, stat.
That said, you really ought to be using a non-broken hashing algorithm for any new application.

Name: Anonymous 2009-09-14 5:24

CRC32?

Name: Anonymous 2009-09-14 5:49

>>5
Enjoy your collisions.

Name: Patrick the Ginger 2009-09-14 6:15

>>6
( ≖‿≖)

Name: Anonymous 2009-09-14 6:19

>>2
Sorry! I used the wrong term, I did not mean sorting. Just for the sake of giving them a unique identifier.

>>4
Has nothing to do with the database, just my own personal preference for naming my files. I just wanted something shorter and fixed-length because it looks less pig disgusting.

Name: Anonymous 2009-09-14 6:34

unix time, until 2038, then you're fucked

Name: Anonymous 2009-09-14 6:35

rot13 date with day/year ratio: zba-frc-0.006968641.jpg
LAWL

Name: Anonymous 2009-09-14 6:47

What about tripcodes?

Name: Anonymous 2009-09-14 6:56

>>10
if you're gonna do date
why not just, 20090914035249.jpg (YYYYMMDDHHMMSS)

less cryptic and easy if you want to sort your pic by date
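
A minimal sketch of that YYYYMMDDHHMMSS scheme, in Python rather than the PHP used elsewhere in the thread. The function name `timestamp_name` and the `.jpg` default are made up for illustration. Note that two files saved within the same second would get the same name, so this alone doesn't guarantee uniqueness.

```python
from datetime import datetime, timezone


def timestamp_name(when: datetime, ext: str = ".jpg") -> str:
    """Return a fixed-length YYYYMMDDHHMMSS file name for the given time."""
    return when.strftime("%Y%m%d%H%M%S") + ext


# The timestamp used in the post above.
example = datetime(2009, 9, 14, 3, 52, 49, tzinfo=timezone.utc)
print(timestamp_name(example))  # 20090914035249.jpg
```

Names built this way sort chronologically with a plain string sort, because every field is zero-padded and ordered from most to least significant.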

Name: Anonymous 2009-09-14 7:17

this is moron shit

Name: Anonymous 2009-09-14 8:13

>>12
> less cryptic and easy if you want to sort your pic by date
That's just too stupid. When you upload an image, the file is created on the server, so you can simply sort by creation time and get the same feature. OP has to consider things like how it will look as a link (http://.../1839128398120983912083921.jpg isn't likable).

Name: Anonymous 2009-09-14 8:34

>>14
but that's what image urls look like on /b/

Name: Anonymous 2009-09-14 9:00

>>14
Your link is broken.

Name: Anonymous 2009-09-14 9:22

>>6
For CRC32 that is 1 in 2^32.

>>1
Just take last $whatever characters from MD5.

Name: Anonymous 2009-09-14 9:32

>>15
/b/ *DOESN'T* want to be linked

Name: Anonymous 2009-09-14 9:38

>>17
Yes, in theory, if you have two images, the chance of the two images having the same hash might be 1 in 2³². Now, what if you have three images? Or four? Or 10000?

Name: Anonymous 2009-09-14 9:49

>>17
> For CRC32 that is 1 in 2^32.
You're going to love finding out about the birthday paradox.

Name: Anonymous 2009-09-14 10:02

>>20
...which is why CRCs are used for error checking, not for comparison.

Name: Anonymous 2009-09-14 10:05

Here's one that will fit into 4 bytes
<?php
$file = file_get_contents("image.jpg");
$hash = md5($file);
// Keep the first 4 hex characters (4 bytes as an ASCII string).
$hash = substr($hash, 0, 4);
?>

Name: Anonymous 2009-09-14 10:58

>>22

Here's one that will fit into 1 byte
<?php
$file = file_get_contents("image.jpg");
$hash = md5($file);
// Keep only the first hex character (1 byte as an ASCII string).
$hash = substr($hash, 0, 1);
?>

Name: Anonymous 2009-09-14 12:55

<?php $hash = substr(md5(file_get_contents("image.jpg")), 0, 1); ?>

You might as well be terse.

Name: Anonymous 2009-09-14 13:13

>>23
I wouldn't do this if I were you. If you're going to use a 1-byte hash, there's probably a faster way of calculating it than md5. For example, you could do something like this:

<?php
$file = file_get_contents("image.jpg");
$hash = 0;

// Sum every byte of the file, modulo 256.
for ($i = 0; $i < strlen($file); $i++) {
  $character = ord($file[$i]);
  $hash = ($hash + $character) % 256;
}

// Pack the 0-255 sum into a single byte.
$hash = chr($hash);
?>

Name: Anonymous 2009-09-14 13:33

This thread makes me mad inside my head.

Name: Anonymous 2009-09-14 13:58

Name: Anonymous 2009-09-14 14:04

>>26
It depicts the madness that is PRO PHP WEB DEVELOPERS

Name: Anonymous 2009-09-14 14:06

>>25

That's amazing

Name: Anonymous 2009-09-14 14:52

If you used a base64 representation of the hash instead of hexadecimal, it would only be 22 characters instead of 32. Still pretty long, though.

In your case, you only want the IDs to be unique, so you can just use an incrementing counter. (MD5's only benefit here¹ is that it produces IDs which aren't sequential, which you don't need.)

>>17
Birthday paradox.
> The birthday problem in this more generic sense applies to hash functions: the expected number of N-bit hashes that can be generated before getting a collision is not 2^N, but rather only 2^(N/2).
Yes, you expect a collision after 2^16 = 65536 items. Enjoy your collisions, Patrick.

_____
¹ Probably. Too lazy to think of any others.
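
To put rough numbers on the birthday argument, here is a small Python sketch using the usual approximation p ≈ 1 - exp(-n(n-1) / (2 * 2^bits)). It assumes hash values are uniformly distributed, which is only an approximation for CRC32 over real image files. The function name `collision_probability` is made up for illustration.

```python
import math


def collision_probability(n_items: int, bits: int) -> float:
    """Approximate chance that at least two of n_items share a bits-wide hash.

    Birthday approximation: 1 - exp(-n(n-1) / (2 * 2**bits)),
    assuming uniformly distributed hash values.
    """
    space = 2 ** bits
    exponent = -n_items * (n_items - 1) / (2 * space)
    # -expm1(x) equals 1 - exp(x) but stays accurate for tiny probabilities.
    return -math.expm1(exponent)


for n in (2, 10_000, 65_536, 100_000):
    print(n, collision_probability(n, 32))
```

Under that assumption, a 32-bit hash is at roughly a 39% chance of at least one collision by 65536 items, and past even odds not long after.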

Name: bbcode expert!!! 2009-09-14 15:07

just use the first few letters of hash

___
i am a [b][i]bcode expert!

Name: Anonymous 2009-09-14 15:09

>>9
I think file names can handle another digit

Name: Anonymous 2009-09-14 16:46

>>26
Really?! Just this one?

Name: Anonymous 2009-09-14 16:57

>>33
It's a very simple sentence and you still failed to parse it.

Name: Anonymous 2009-09-14 17:19

>>30
How is that a ``paradox''?

Name: Anonymous 2009-09-14 18:28

>>31
Your suggestion was already covered by >>22, >>23, >>25. Please read before posting.

Name: Anonymous 2009-09-14 18:31

>>15
4chan image boards, as well as many others, use unix time for the file name, which is 13 characters long

Name: Anonymous 2009-09-14 18:41

Get the last 6 characters of the MD5/SHA-1 hash, as humans easily remember up to 7 characters.
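
A minimal sketch of the truncation idea in Python, using SHA-1 from the standard `hashlib` module. The function name `short_id` is made up for illustration. Keep in mind that 6 hex characters is only 24 bits, so by the birthday argument earlier in the thread, collisions become likely once you have a few thousand files.

```python
import hashlib


def short_id(data: bytes, length: int = 6) -> str:
    """Return the last `length` hex characters of the SHA-1 digest of data."""
    return hashlib.sha1(data).hexdigest()[-length:]


# Stand-in bytes; in practice this would be the contents of the image file.
image_bytes = b"not really a jpeg"
print(short_id(image_bytes) + ".jpg")
```

If you go this way, check whether the name is already taken before saving, since a truncated hash can't promise uniqueness.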

Name: Anonymous 2009-09-14 19:08

Name: Anonymous 2009-09-14 19:16

>>36
yhbt.
     have a nice day!

Name: Anonymous 2009-09-14 19:17

>>37
But.. you're wrong. Unix time is only 10 characters. Filenames on 4chan are 13 chars because the last 3 chars are there in case two files are uploaded at the same time.

Name: Anonymous 2009-09-14 19:56

just a simple increment.  That's all you need.
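
Something like this, as a rough Python sketch. The names `make_namer` and `next_name` are made up, and the counter here lives only in memory; a real setup would have to store the last number used somewhere (a file, a database row) so it survives restarts.

```python
import itertools


def make_namer(start: int = 1, width: int = 6, ext: str = ".jpg"):
    """Return a function that hands out zero-padded sequential file names."""
    counter = itertools.count(start)

    def next_name() -> str:
        # Zero-padding keeps names fixed-length until the counter outgrows width.
        return f"{next(counter):0{width}d}{ext}"

    return next_name


next_name = make_namer()
print(next_name())  # 000001.jpg
print(next_name())  # 000002.jpg
```

Short, fixed-length, and guaranteed unique, which is all OP asked for.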

Name: Anonymous 2009-09-14 20:06

Use the full contents of the file as an ID.

Name: Anonymous 2009-09-15 18:51

>>43
Some sort of reversible AND unique hash?
BRILLIANT!

Name: Anonymous 2009-09-15 20:04

>>41
I always assumed the last three numbers were milliseconds.

Name: Anonymous 2009-09-15 21:19

the number of bytes in the file converted to base36

Name: Anonymous 2009-09-15 21:23

>>46
actually, why not just use MD5 instead?
assuming you're using a hex string representation of the hash value, converting that to base 36 shortens it from 32 characters to 25.
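
For anyone who wants to count characters, here is a Python sketch comparing the encodings mentioned in this thread for one 128-bit MD5 digest. The helper `to_base36` and the sample bytes are made up for illustration. The base36 string is left-padded to 25 characters, because small digests would otherwise encode shorter and the names would stop being fixed-length.

```python
import base64
import hashlib

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base36(value: int) -> str:
    """Encode a non-negative integer using digits 0-9 and a-z."""
    if value == 0:
        return "0"
    digits = []
    while value:
        value, remainder = divmod(value, 36)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


# Stand-in bytes; in practice this would be the contents of the image file.
digest = hashlib.md5(b"example image bytes").digest()

hex_id = digest.hex()
b36_id = to_base36(int.from_bytes(digest, "big")).rjust(25, "0")
b64_id = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

print(len(hex_id), len(b36_id), len(b64_id))  # 32 25 22
```

The URL-safe base64 alphabet is used here because plain base64 includes "/", which can't appear in a file name.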

Name: Anonymous 2009-09-16 0:16

>>47
Why use a string representation at all?

Name: Anonymous 2009-09-16 1:00

>>48
because most OS's won't let you use something like öÕöw—éçæAÞê¿ê in a file name

Name: Anonymous 2009-09-16 2:10

>>49
Insomuch that most OS's don't have a file system at all. But the ones that do, including all the most popular desktop and server OS's, tend to be permissive to a fault, requiring you only to escape somewhere between one and a dozen control characters.

Name: Anonymous 2010-12-26 17:11

Name: Anonymous 2011-02-03 0:04


Name: Anonymous 2011-02-03 4:54

