Category: Dev Shed Lounge
Eliminating duplicate lines in TXT files

Does anybody know of a program to eliminate duplicate lines from CSV files (usually ranging from 1-20MB)? I've found one, but it is actually an HTML parser and doesn't support files over 2.5MB in size.

For example:


Barbara Meyers,Associate,97401
Barbara Yvette Santiago,Member,90210
Barbara J Green,Associate,67541
Barbara Jean Hall,Moderator,43521
Barbara Yvette Santiago,Member,90210
Barbara Yvette Santiago,Member,90210
Barbara Jean Hall,Moderator,43521
Barbara Yvette Santiago,Member,90210


After being processed, I'd want all original records intact, but the exact duplicates removed:


Barbara Meyers,Associate,97401
Barbara Yvette Santiago,Member,90210
Barbara J Green,Associate,67541
Barbara Jean Hall,Moderator,43521


Thanks,
Zach Sniezko

I don't know if you've ever dabbled in VBA, but it's not at all difficult, and you could probably write that program in 10 minutes. Hit Alt+F11 in Microsoft Word, and you're in the IDE.
There should be a few tutorials that do exactly what you're after.
Other than that, I don't know of any programs out there.

Give it a go - Visual Basic for Applications (VBA).
It basically automates anything in the Office Suite and then some.

Adam Mellor
www.chamele.com
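
For anyone who would rather script this outside of Office, here is a minimal sketch of the kind of short program described above. It is written in Python rather than VBA, which is my substitution and not something suggested in the thread. The function names dedupe_lines and dedupe_file are made up for illustration, and the UTF-8 encoding is an assumption about the input files. It keeps the first occurrence of each line and drops later exact duplicates, so the original order is preserved. Note that repeated blank lines are collapsed to one as well.

```python
from typing import Iterable, Iterator


def dedupe_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield each distinct line once, in order of first appearance."""
    seen = set()
    for line in lines:
        # Compare without the line ending, so a last line that lacks
        # a trailing newline still matches its earlier copies.
        key = line.rstrip("\r\n")
        if key not in seen:
            seen.add(key)
            yield key


def dedupe_file(src_path: str, dst_path: str) -> None:
    """Stream src_path to dst_path, dropping exact duplicate lines.

    Assumes UTF-8 text; change the encoding if the files use another one.
    """
    with open(src_path, encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in dedupe_lines(src):
            dst.write(line + "\n")
```

The file is read one line at a time, so only the set of distinct lines is held in memory, which should be comfortable for inputs in the 1-20MB range. A call would look like dedupe_file("members.csv", "members_unique.csv"), where both file names are placeholders.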

If you are using Office VBA, then you might as well use Access to load the file into a recordset and then write the output of a query (no duplicates) back to the text file.
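
The load-then-query idea can be sketched without Access as well. The example below swaps in Python's built-in sqlite3 module and an in-memory database in place of an Access recordset, so treat it as an illustration of the approach rather than the Access procedure itself. The function name dedupe_with_sql and the table and column names are hypothetical. Each line is loaded as one row, and the query groups identical lines and orders them by the first row in which they appeared.

```python
import sqlite3
from typing import Iterable, List


def dedupe_with_sql(lines: Iterable[str]) -> List[str]:
    """Load lines into a temporary table and query back the distinct ones."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # id increases with insertion order, so it records the original position.
        conn.execute(
            "CREATE TABLE records (id INTEGER PRIMARY KEY, line TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO records (line) VALUES (?)",
            ((line.rstrip("\r\n"),) for line in lines),
        )
        # One row per distinct line, ordered by where it first appeared.
        rows = conn.execute(
            "SELECT line FROM records GROUP BY line ORDER BY MIN(id)"
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()
```

A plain SELECT DISTINCT would also remove the duplicates, but it makes no promise about row order, which is why the sketch groups and sorts by the smallest id instead.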









