![]()
The Word VBA program was very slow. That program tried to access each line individually using the paragraphs collection object. However, when you delete a line (paragraph), Word has to update that entire collection. This is a terrible breakdown.
Using kixtart, a script could also parse the log file. Eventually, that took 3 hours! I started wondering if there is a better method to this all...
Welcome to Perl
Withing half a hour I wrote the script, while my computer was downloading the file. I added some Benchmark tests to the script, and started it up. Guess what? The script needed 1.5 seconds to parse the entire file ;)
Here it is!
P.S. UNIX people would properly laugh about this story,
knowing AWK provides a simple way of handing this problem.
Running awk '/^\QDirectory / {print $0}' bk2_Vr.log would be my first guess.
#!/usr/bin/perl -w ## UNIX: Put path to perl here. ##### # Remove lines from a logfile. # # Copyright (c) 2001 - All rights reserved. # Diederik van der Boor ("vdboor" at "hotmail.com") # # Permission to use/edit this code as long # as the copyright notice remains intact. # # # Note that... # # There is no command-line, or user-iterface. # Below here some settings van be defined. # # Building this core took about 15 minutes. # Fine-tuning (+variable-setting-features) # another 15 minutes ;-) Go go go Perl! # #### #################################################################### ## Requirements use strict; use Benchmark; use Fcntl qw(:flock); use constant KEEP => 1; use constant REMOVE => 0; use constant TRUE => 1; use constant FALSE => 0; #################################################################### ## settings my $File = 'bk2_Vr.log'; # The File my $Match = qq[^\QDirectory ]; # Regular expression what to match (see `perldoc perlre` manpage) my $Action = REMOVE; # What to do when match (KEEP | REMOVE) my $Display = FALSE; # Should those lines be displayed? (TRUE | FALSE) my $Overwrite = FALSE; # Overwrite the original file when done? (TRUE | FALSE) #################################################################### ## Regexp test eval{ "aa" =~ m[$Match] }; if($@) { $@ =~ s/at .+ line \d+\.\n$//; die "Bad regexp: '$Match': $@\n"; } #################################################################### ## Opening the files my $Temp = "$File.new"; open(LOG, $File) or die "Can't open $File: $!"; open(TEMP, ">$Temp") or die "Can't open $Temp: $!"; flock(LOG, LOCK_EX); flock(TEMP, LOCK_EX); #################################################################### ## Benchmark test my $BM_StartTime = new Benchmark; END { if(defined $BM_StartTime) { my $TimeString = timestr(timediff(new Benchmark, $BM_StartTime)); print qq[Debugging: Execution time is $TimeString\n]; } } #################################################################### ## Finding the data. { $SIG{'__WARN__'} = $SIG{'__DIE__'}; my $Keep = ($Action == KEEP); my $Remove = ($Action == REMOVE); # ! $Keep my $NumKept = 0; my $NumRemoved = 0; while(my $Line = <LOG>) { if($Line =~ m[$Match]) { if($Keep) { # FOUND=KEEP $NumKept++; print TEMP $Line; } else { # FOUND=REMOVE $NumRemoved++; print " $Line" if $Display; } } else { if($Keep) { # NOT_FOUND=NOT_KEEP $NumRemoved++; print " $Line" if $Display; } else { # NOT_FOUND=NOT_REMOVE $NumKept++; print TEMP $Line; } } } print qq[\nSearch complete; $NumKept line(s) kept, $NumRemoved line(s) removed\n]; } #################################################################### ## File close and update flock(TEMP, LOCK_UN); flock(LOG, LOCK_UN); close(TEMP); close(LOG); if($Overwrite) { unlink $File or die "Can't unlink: $!"; rename $Temp => $File or die "Can't rename: $!"; } #################################################################### print qq[\n\nDone.\n\n]; print qq[File saved as $Temp\n\n] unless $Overwrite;
Written by Diederik van der Boor at 30 January 2002