Perl Modules: File::PlainIO

Name
File::PlainIO - Useful routines for file input and output with plain text files

Synopsis
use File::PlainIO;

# Write some data into a file
$file1 = new File::PlainIO( "file1.txt", MODE_WRITE_NEW, "Can't write to file1.txt" );
$file1->writeline("Hello there");
$file1->writelines("This is a test...", "...using File::PlainIO", "The End");
undef $file1;


# Read some lines
$file2 = new File::PlainIO( "file2.txt", MODE_READ, "Can't open file2.txt" );
$file2->seeklineno(3)           or die "Can't seek line 3: $!";
defined( $line3 = $file2->readline() ) or die "Can't read line 3: $!";  # an empty line is valid
@lines = $file2->readlines(3)   or die "Can't read 3 lines: $!";
@rest  = $file2->readlines()    or die "Can't read remaining lines: $!";
undef $file2;


# Update some lines
my $changelines = sub
                  {
                     $_ = ""  if($. == 3);
                     $_ = "!" if($. == 4);
                  };
$file3 = new File::PlainIO( "file3.txt", MODE_RDWR, "Can't open file3.txt" );
$file3->update($changelines)    or die "Can't update lines: $!";
undef $file3;

Description
This object-oriented module provides methods that simplify access to plain text files. It wraps the standard Perl file functions and adds extra features in some places. Files are automatically opened and locked when you create a new object.

Creating a new Object
$fileobj = new File::PlainIO( FILENAME, MODE[, ERROR_MSG][, SHOULD_EXIST]);

Arguments
FILENAME - The name of the file on the user's system
MODE - One of the MODE_ constants exported by this package
ERROR_MSG - Error message to 'die' with when the file can't be opened
  When not provided, the constructor returns undef instead.
SHOULD_EXIST - Set to true if the file should already exist when opening

Constants
MODE_READ - Open the file in read-only mode
MODE_WRITE - Open the file in write mode
MODE_RDWR - Open the file in read/write mode
MODE_WRITE_ADD - Open the file in append mode
MODE_WRITE_NEW - Open the file in write mode (clears the file first)
MODE_RDWR_NEW - Open the file in read/write mode (clears the file first)
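
The constructor arguments and mode constants above could be approximated with core Perl as follows. This is only a sketch: the mapping from mode names to sysopen flags, the helper name open_plain, and the use of string keys instead of exported constants are all assumptions, not taken from the module's source. File locking is left out.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY O_WRONLY O_RDWR O_CREAT O_TRUNC O_APPEND);

# Hypothetical mapping from the documented mode names to sysopen flags.
my %FLAGS = (
    MODE_READ      => O_RDONLY,
    MODE_WRITE     => O_WRONLY | O_CREAT,
    MODE_RDWR      => O_RDWR   | O_CREAT,
    MODE_WRITE_ADD => O_WRONLY | O_CREAT | O_APPEND,
    MODE_WRITE_NEW => O_WRONLY | O_CREAT | O_TRUNC,
    MODE_RDWR_NEW  => O_RDWR   | O_CREAT | O_TRUNC,
);

# open_plain(FILENAME, MODE_NAME[, ERROR_MSG][, SHOULD_EXIST])
# Returns a filehandle on success. On failure it dies with ERROR_MSG
# if one was given, and returns undef otherwise.
sub open_plain {
    my ($path, $mode, $errmsg, $should_exist) = @_;
    my $flags = $FLAGS{$mode};
    defined $flags or die "Unknown mode '$mode'\n";

    # Without O_CREAT, sysopen fails when the file does not exist yet.
    $flags &= ~O_CREAT if $should_exist;

    my $fh;
    return $fh if sysopen($fh, $path, $flags);

    die "$errmsg: $!\n" if defined $errmsg;
    return undef;
}
```

With an error message the caller needs no check at all; without one, the usual "or die" idiom applies to the returned value.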

Method Notes
Most methods are wrappers for Perl file functions, but they usually provide extra functionality or error checking. In addition, some methods are easier to use than the standard Perl functions.

All methods return undef on failure. You can use that to check for errors, for example like this:

  $file->seekpos(34) or die "Can't seek in the file: $!";
  $file->clear()     or die "Can't clear the contents: $!";

For file I/O methods, it's very important that you do so. Otherwise, you might get unexpected program results. Some Perl programs are filled with such problems. :'(

Basic Methods
filename
Returns the filename used when creating the object.

mode
Returns the mode constant used.

stat()
Returns the list produced by the Perl stat function. This may sound useless, but the method exists because I don't want anyone to access the hidden private fields, which prevents bugs. That's what OO is all about.

close()
Closes the file before the object is destroyed. Fortunately, you don't need to call this method every time: every file opened with this module is closed when the object is destroyed, in other words, when the last reference to it goes out of scope.
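
The close-on-destroy behaviour can be pictured with a tiny class of its own. The package name Local::AutoClose and its fields are hypothetical; the module's real internals are not shown in this manual. Perl closes lexical filehandles on its own as well, so the DESTROY method here mainly makes the mechanism visible.

```perl
use strict;
use warnings;

# Hypothetical minimal class, not the module's actual implementation.
package Local::AutoClose {
    sub new {
        my ($class, $path) = @_;
        open(my $fh, '>', $path) or return undef;
        return bless { fh => $fh }, $class;
    }

    sub writeline {
        my ($self, $line) = @_;
        return print { $self->{fh} } "$line\n";
    }

    # Explicit close; safe to call more than once.
    sub close {
        my ($self) = @_;
        my $fh = delete $self->{fh} or return undef;
        return CORE::close($fh) ? 1 : undef;
    }

    # Runs when the last reference to the object goes away,
    # so buffered data is flushed even without an explicit close().
    sub DESTROY {
        my ($self) = @_;
        $self->close if $self->{fh};
    }
}
```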

unlink()
Closes the file if it's still open and removes it from the file system.

countlines()
Counts the number of lines in the file. This is done very efficiently, and might be very useful. Well, I already programmed the method for you ;-) It simply counts the occurrences of the newline character, *NOT* of the $/ record separator. If the last line is not terminated by a newline character, it is still included in the returned count.
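
The counting rule described for countlines() can be sketched with core Perl like this. The helper name count_lines and the 64 KB chunk size are assumptions; the module's own implementation may differ.

```perl
use strict;
use warnings;

# count_lines(FILEHANDLE): counts "\n" characters chunk by chunk.
# The $/ variable plays no role here.
sub count_lines {
    my ($fh) = @_;
    my ($count, $last, $buf) = (0, "\n", '');
    while (read($fh, $buf, 65536)) {
        $count += ($buf =~ tr/\n//);   # tr returns the number of matches
        $last = substr($buf, -1);      # remember the final character seen
    }
    $count++ if $last ne "\n";         # unterminated last line still counts
    return $count;
}
```

Reading in fixed-size chunks keeps memory use constant, which is one reason counting newlines this way tends to be cheaper than splitting the file into lines first.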

Methods for Reading Data
The read/write methods use $_ whenever that is appropriate. In other words, you can use these coding structures as well:
# Reading line by line
while(defined $file->readline())
{
  print "Line = $_\n";
}

# Copying a file
$file2->write     while         $file1->read(1024);
$file2->writeline while defined $file1->readline;

Note that the defined test is required here, because the readline method returns an empty string (= false) when it finds an empty line!

read([NUMBER_OF_BYTES])
Reads the specified number of bytes from the file, or a default amount when no number is given. Normally, you're better off using the other read methods provided by this package.

readall()
Reads all bytes from the current position in the file. This is the "slurp" reading method.

readline()
Reads one line from the file. That is, it keeps reading until a \n or EOF is found. The \n character is automatically removed from the line read. This is quite useful, but remember that an empty line evaluates to false in a boolean expression. That means that while($obj->readline()) stops when an empty line is found. Use while(defined $obj->readline()) instead.
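
The empty-line pitfall is easy to reproduce with core Perl. In this sketch, read_line is a hypothetical stand-in for the readline method (it returns the line without its \n, or undef at EOF), and an in-memory filehandle takes the place of a real file.

```perl
use strict;
use warnings;

# read_line(FILEHANDLE): next line without its "\n", or undef at EOF.
sub read_line {
    my ($fh) = @_;
    my $line = <$fh>;
    return undef unless defined $line;
    chomp $line;
    return $line;
}

my $text = "first\n\nthird\n";   # the second line is empty

# Truth test: the loop ends at the empty line, not at EOF.
open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";
my @truthy;
while (my $line = read_line($fh)) { push @truthy, $line }
close $fh;

# defined test: the loop runs until EOF.
open($fh, '<', \$text) or die "Can't open in-memory file: $!";
my @all;
while (defined(my $line = read_line($fh))) { push @all, $line }
close $fh;

print scalar(@truthy), " line(s) with the truth test, ",
      scalar(@all), " with defined\n";
```

Perl adds the implicit defined test only for its built-in readline operator, not for method or subroutine calls, which is why the explicit defined is needed here.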

readlines([NUMBER_OF_LINES][, MINIMAL_ARRAY_SIZE])
Returns an array with the lines read from the file. All end-of-line characters are automatically removed from the lines.

If the NUMBER_OF_LINES parameter is specified, the method reads only that number of lines, or fewer when EOF is reached first.

The MINIMAL_ARRAY_SIZE parameter can be used to ensure that the returned array has at least a certain size. Any elements added to pad the array contain a zero-length string (""). This can be quite useful when your program uses the -w switch and you don't want to check all elements first. Note that, when using this parameter, the returned result always evaluates to true!
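
A minimal sketch of the two optional parameters, written against a plain filehandle. The function name read_lines is hypothetical, and the padding logic is only one way the described behaviour could be implemented.

```perl
use strict;
use warnings;

# read_lines(FILEHANDLE[, NUMBER_OF_LINES][, MINIMAL_ARRAY_SIZE])
sub read_lines {
    my ($fh, $max, $min_size) = @_;
    my @lines;
    while (!defined $max || @lines < $max) {
        my $line = <$fh>;
        last unless defined $line;     # EOF reached before $max lines
        $line =~ s/[\r\n]+\z//;        # strip all end-of-line characters
        push @lines, $line;
    }
    # Pad with empty strings so every expected element is defined.
    if (defined $min_size && $min_size > @lines) {
        push @lines, ('') x ($min_size - @lines);
    }
    return @lines;
}
```

Because a padded list is never empty, an "or die" check after such a call can never fire; the padding trades error detection for warning-free element access.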

Methods for Writing Data
The write methods convert the specified string so that all line breaks are set correctly and match the line break type of the current OS. This can be very useful when your CGI program writes the contents of a <TEXTAREA> field into a text file. (That string might contain Internet line breaks, i.e. \r\n.)
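
One way to picture the conversion: normalize every line break style to Perl's logical "\n" and let the text-mode output layer translate that to the platform's native form. The helper name normalize_linebreaks is an assumption, not part of the module.

```perl
use strict;
use warnings;

# normalize_linebreaks(TEXT): turns \r\n and lone \r into "\n".
# When the result is printed to a text-mode filehandle, Perl maps
# "\n" to the line break of the current OS.
sub normalize_linebreaks {
    my ($text) = @_;
    $text =~ s/\r\n|\r/\n/g;   # try \r\n first so it is not split in two
    return $text;
}
```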

write([TEXT])
Writes the text (or $_) back into the file. No linebreaks will be removed or added.

writeline([LINE])
Writes one line back into the file. This method removes any duplicate line breaks at the end of the string, so that a silly bug in your program doesn't cause two lines to be printed.
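
A sketch of that trailing-linebreak cleanup with core Perl. It assumes the simplest interpretation: strip every line break at the end of the string, then append exactly one. The name write_line is hypothetical.

```perl
use strict;
use warnings;

# write_line(FILEHANDLE[, LINE]): writes LINE (or $_) followed by
# exactly one line break, whatever the string ended with.
sub write_line {
    my ($fh, $line) = @_;
    $line = $_ unless defined $line;
    $line =~ s/(?:\r\n|\r|\n)+\z//;   # drop all trailing line breaks
    return print {$fh} $line, "\n";
}
```

A caller can then pass "Hello\n" or "Hello" and get the same single line in the file either way.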

writelines(ARRAY_WITH_LINES)
Does the same thing as the writeline method, but for each element in the array. This method is also a little more efficient when you have a large array filled with scalars.

Moving the Read/Write Pointer
These methods move the pointer that marks where data is read from or written to. They don't need a detailed explanation, so this summary will have to do:

tellpos() - Returns the current position
seekpos(POS) - Seeks to that position
movepos(OFFSET) - Seeks relative to the current position
seekbegin() - Seeks to the beginning of the file
seekeof() - Seeks to the end of the file
seeklineindex(I) - Seeks to a line, based on an array index (the first line is 0)
seeklineno(N) - Seeks to a line, based on a natural number (the first line is 1)
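
The two line-based seeks differ only in where counting starts. A rough equivalent with core seek and line reads is shown below; the function names are hypothetical, and the sketch assumes that index 0 and line number 1 both mean the first line.

```perl
use strict;
use warnings;

# seek_line_index(FILEHANDLE, I): positions the handle at the start
# of the line with array index I. Returns undef if the file is too short.
sub seek_line_index {
    my ($fh, $index) = @_;
    return undef if $index < 0;
    seek($fh, 0, 0) or return undef;
    for my $skip (1 .. $index) {
        my $line = <$fh>;               # skip one line
        return undef unless defined $line;
    }
    return 1;
}

# seek_line_no(FILEHANDLE, N): same, but N counts from 1.
sub seek_line_no {
    my ($fh, $n) = @_;
    return seek_line_index($fh, $n - 1);
}
```

Plain text files have no line index, so a line seek like this has to scan from the start of the file; a byte seek such as seekpos(POS) does not.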

Read/Write Operations
rewind()
Same as seekbegin() for now. In other words, you can start reading the file from the beginning again.

truncate([TO_BYTE_SIZE])
Without an argument, clears the contents of the entire file. When the size argument is provided, the file is truncated to that size, preserving the first bytes/lines in the file.

clear()
Almost the same as calling the previous truncate() method without any arguments. The difference is that this method also moves the read/write pointer back to the beginning of the file. This method should be used when working with MODE_RDWR files.
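
Why the extra seek matters can be seen with the core truncate and seek functions. The helper below is a hypothetical equivalent of clear(), not the module's code.

```perl
use strict;
use warnings;

# clear_file(FILEHANDLE): empties the file and rewinds the pointer.
# Truncating alone leaves the pointer at its old offset, so the next
# write would land there and the gap before it would read back as
# NUL bytes.
sub clear_file {
    my ($fh) = @_;
    truncate($fh, 0) or return undef;
    seek($fh, 0, 0)  or return undef;
    return 1;
}
```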

update(UPDATE_FUNCTION_REFERENCE)
This is a very powerful method, and very useful for update operations on read/write files. The method loops through the lines of the file, calling the subroutine whose reference the caller provided.

The subroutine doesn't receive any parameters. The line is provided in the $_ variable, so it can be examined by a regexp directly. The \n character at the end has already been removed, so don't worry about that. You can get the line number through the $. variable. If $_ is changed, that line in the file changes as well. If you set $_ to undef, the line is removed from the file.

This method automatically determines which kind of update is the most efficient: either slurping the entire file into memory, or using an extra temporary file to store the result in.

Anyway, using this method saves you a lot of coding with sysopens, reading, storing the new data elsewhere, and writing it all back into the file. An example can be found in the Synopsis section at the top of this manual.
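
The callback protocol ($_ holds the line, $. the line number, undef deletes the line) can be sketched as a loop from an input handle to an output handle. This resembles the temporary-file variant; the name update_stream and the two-handle interface are assumptions made for the example.

```perl
use strict;
use warnings;

# update_stream(IN, OUT, CODEREF): calls CODEREF once per line of IN
# and writes the (possibly modified) lines to OUT.
sub update_stream {
    my ($in, $out, $code) = @_;
    while (defined(my $line = <$in>)) {
        chomp $line;
        for ($line) { $code->() }      # aliases $_ to $line for the callback
        next unless defined $line;     # callback set $_ to undef: drop the line
        print {$out} $line, "\n" or return undef;
    }
    return 1;
}
```

Inside the callback, $. refers to the input handle because that is the handle most recently read from, so the callback from the Synopsis works unchanged with this loop.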

P.S. Maybe, this method can even be optimized by using tests to determine if anything should be written back, or by remembering that the first 4765 bytes of the file weren't changed at all. And so on. Just let me know if you can implement anything!

Author
Copyright (c) 2001, Diederik van der Boor - All Rights Reserved