Name: Anonymous 2005-10-11 17:08
Hi, I need help.
Though I'm looking for a specific program rather than a programming solution, I figure this is the most appropriate board to ask on.
The situation is, I have a huge text file that goes generally like this:
----- cut -----
1 1234 - Lorem ipsum dolor sit amet
2 1235 - consectetuer adipiscing elit
3 1236 - Pellentesque pellentesque vehicula velit
4 1237 - Nunc sit amet sapien at libero euismod auctor
5 1238 - Aenean turpis
6 1239 - Ut nec ipsum
7 1240 - Lorem ipsum dolor sit amet
8 1241 - Aenean turpis
9 1242 - consectetuer adipiscing elit
10 1243 - Aenean turpis
11 1244 - Nunc sit amet sapien at libero euismod auctor
----- cut -----
Many lines are randomly repeated throughout the file. What I need to do is remove all duplicates, leaving only one occurrence of each line. The numbers are there, but they are to be ignored (and they're not exactly sequential either); only the text strings are to be compared.
I suppose this task could be achieved by means of a Killer PERL One-liner Of Doom, which should be fine, as I have a Linux distro handy. But I'm rather looking for a Windows-based solution, and one that is usable by a generally programming-ignorant user. I don't mind configuring or writing a *simple* script for an all-purpose text parser, but I'd rather it didn't require me to dive into the deepest depths of regexp syntax.
I hope you get the idea: I'm looking for an app capable of the task that's rather easy to handle on the user side. Any help will be greatly appreciated.
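For reference, the task described above can be sketched in a few lines of Python without any regexp. This is only a rough sketch under some assumptions: every line has the layout shown in the sample (two numbers, then " - ", then the text), the first occurrence of each text is the one to keep, and the file fits in memory. The function names and the file names in the usage note are made up for illustration.

```python
def text_key(line):
    """Return the part of a line that is compared: the text after the first " - "."""
    _, sep, text = line.partition(" - ")
    # Assumption: lines without a " - " separator are compared as a whole.
    return text.strip() if sep else line.strip()


def dedupe_lines(lines):
    """Keep only the first line for each distinct text, preserving the original order."""
    seen = set()
    kept = []
    for line in lines:
        key = text_key(line)
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept


def dedupe_file(src_path, dst_path):
    """Read src_path, drop lines whose text was already seen, write the rest to dst_path."""
    with open(src_path, encoding="utf-8") as src:
        kept = dedupe_lines(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(kept)
```

With Python installed, calling `dedupe_file("input.txt", "output.txt")` (hypothetical file names) would write the deduplicated lines to a new file and leave the original untouched. The numbers of the surviving lines are kept as they are, so for the sample above only the lines numbered 1 to 6 would remain.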