
Finding Duplicate File Names in Many Folders

Name: Anonymous 2009-01-19 18:45

Recently, a bug was found in the software my employer makes in which the same record number was issued to two or more clients, so that there would be different records with the same number on more than one client.

The bug was fixed, but we wanted to find out whether any of our customers had been affected by it. A normal directory compare wouldn't work, because all the software we found could only compare two folders at a time; we needed to compare the folders of multiple clients to see if any of them shared file names with any others, which would indicate that the bug had occurred.

So, I came up with an algorithm and wrote the code to do just that. The C# program I wrote can find 200,000+ duplicated records in 60+ network folders in about two minutes.

Challenge: come up with an algorithm that, given multiple folders, gives a list of file names shared between two or more folders, and for each duplicated file name found, a list of folders in which it is found.

Example:

Folders:

Folder1\
        1.txt
        2.txt
        3.txt
Folder2\
        3.txt
        9.txt
        100.txt
Folder3\
        2.txt
        9.txt
        55.txt


Results:

2.txt
      Folder1
      Folder3
3.txt
      Folder1
      Folder2
9.txt
      Folder2
      Folder3
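
One possible approach to the challenge, as a minimal sketch and not the C# program described above: build a map from each file name to the set of folders that contain it, then keep only the names seen in two or more folders. The function name `find_shared_names` and the dict-of-lists input are assumptions made for illustration, and names are compared exactly (case-sensitive).

```python
from collections import defaultdict


def find_shared_names(folder_contents):
    """Map each file name found in two or more folders to those folders.

    folder_contents: dict of folder name -> iterable of file names
    (a hypothetical input shape chosen for this sketch).
    """
    folders_by_name = defaultdict(set)
    for folder, names in folder_contents.items():
        for name in names:
            folders_by_name[name].add(folder)
    # Keep only names that occur in at least two different folders.
    return {
        name: sorted(folders)
        for name, folders in folders_by_name.items()
        if len(folders) >= 2
    }


# The example folders from the post above.
example = {
    "Folder1": ["1.txt", "2.txt", "3.txt"],
    "Folder2": ["3.txt", "9.txt", "100.txt"],
    "Folder3": ["2.txt", "9.txt", "55.txt"],
}

for name, folders in sorted(find_shared_names(example).items()):
    print(name)
    for folder in folders:
        print("      " + folder)
```

Each file name is visited once, so the work grows with the total number of files rather than with the number of folder pairs. For real directories the input could be built with something like `{path: os.listdir(path) for path in paths}`, though `os.listdir` also returns subdirectory names, so some filtering would likely be needed.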

Name: Anonymous 2009-01-20 5:03

What difference does it make if an algorithm finishes in a second or a millisecond? Wouldn't the fact that you can write the program in a fraction of the time matter more?

Name: =+=*=F=R=O=Z=E=N==V=O=I=D=*=+= !FrOzEn2BUo 2009-01-20 5:10

>>41
"What difference does it make if an algorithm finishes in a second or a millisecond?"
What difference would it make if an algorithm finishes in 100 hours or 6 minutes?

_________________________
orbis terrarum delenda est

Name: =+=*=F=R=O=Z=E=N==V=O=I=D=*=+= !FrOzEn2BUo 2009-01-20 5:12

1 second = 1000 milliseconds
100 hours = 100*60 = 6000 minutes (about 4 days)
6 minutes = 6000/1000 minutes
Is the difference between 4 days and 6 minutes obvious now?
_________________________
orbis terrarum delenda est

Name: =+=*=F=R=O=Z=E=N==V=O=I=D=*=+= !FrOzEn2BUo 2009-01-20 5:25

Another example would be 100 years vs 36.5 days
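
The figures in these posts all apply the same factor of 1000 (one second versus one millisecond). A quick check of the arithmetic, assuming a 365-day year; the variable names are made up for this sketch:

```python
speedup = 1000  # 1 second / 1 millisecond

slow_minutes = 100 * 60                 # 100 hours expressed in minutes
fast_minutes = slow_minutes / speedup   # the same job, 1000 times faster
slow_days = 100 / 24                    # 100 hours expressed in days

slow_year_days = 100 * 365              # 100 years in days (365-day years assumed)
fast_year_days = slow_year_days / speedup

print(slow_minutes, fast_minutes, round(slow_days, 1), fast_year_days)
```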


_________________________
orbis terrarum delenda est

Name: Anonymous 2009-01-20 5:41

Enjoy wasting time writing your programs to scale to billions of items and then never actually using it for that

Name: Anonymous 2009-01-20 6:20

>>42
>>43
>>44
Expert unemployed Javascripter, no-one with a meaningful job can spend so much time posting that much shit.

Name: Anonymous 2009-01-20 6:31

recursive md5

Name: Anonymous 2009-01-20 6:43

>>43
Compilers can optimize better than you.

Name: Anonymous 2009-01-20 6:50

>>48
unless you're using gcc

Name: Anonymous 2009-01-20 7:44

>>48
unless you're using yhbt

Name: Anonymous 2009-01-20 9:00

Why would a performance obsessed retard have Javascript as his favorite language?
IAGTBT

Name: Anonymous 2009-01-20 10:04

bump for trollage

Name: Anonymous 2009-01-20 10:04

>>48
[spoiler]That is true, you should use a compiler, and then do further optimizations by hand.

Name: Anonymous 2009-01-20 10:09

>>53
Failure to preview post detected.

Name: Anonymous 2009-01-20 12:40

>>30
Actually, the total program is about 350 lines, but that includes all the GUI shit. The algorithm is about 40 lines, but that includes extra shit to handle creating TreeNodes and filtering to only look at certain files.

Name: Anonymous 2009-01-20 13:14

>>55
shit, C# still sucks. Also why write a GUI for it? Isn't a command line tool enough for your needs?

Name: =+=*=F=R=O=Z=E=N==V=O=I=D=*=+= !FrOzEn2BUo 2009-01-20 13:26

>>56
are you going to open a command line prompt and type out the command with parameters every time you use the tool?
For a single-purpose, one-use program it's OK, but what about a more permanent solution, like a convenient GUI?


_________________________
orbis terrarum delenda est

Name: Anonymous 2009-01-20 13:29

>>57
While your main point is correct, >>1's program fits in the category of single or rare use

Name: =+=*=F=R=O=Z=E=N==V=O=I=D=*=+= !FrOzEn2BUo 2009-01-20 13:39

>>58
However, it's not confined to the problem domain of >>1
and can be used with any duplicated files.
_________________________
orbis terrarum delenda est

Name: Anonymous 2009-01-20 13:47

>>59
But how often would it need to be used?

Name: Anonymous 2009-01-20 14:11

>>60
The more important question is, who will use it? I rightly assume that our clients and my manager aren't familiar enough with the command line, so I made a nice GUI. If this were purely a one-time internal tool, I might not care to write a GUI, but this tool will be used by multiple people at multiple times, to ensure that they aren't affected by the bug, and that they won't be in the future if the bug regresses.

Name: Anonymous 2009-01-20 15:22

Write a man page. Thread over.

Name: Anonymous 2009-01-20 15:58

>>57
Leave the command line open all the time. /thread

Name: Anonymous 2009-01-20 19:54

Stop replying to nonexistent posts, you dumbasses.

Name: Anonymous 2011-01-31 20:14

<-- check em dubz

Name: Anonymous 2013-01-19 23:45

/prog/ will be spammed continuously until further notice. we apologize for any inconvenience this may cause.
