We have seen how to manipulate files as a whole. Now let's see how to manipulate their content!
We will start by discussing input and output (I/O) using the java.io package.
Data for a program may come from several sources and may be sent to several destinations.
The connection between a program and a data source or destination is called a stream.
In the picture, each "O" is a piece of data. The data is streaming from the source into the program. Computed results are streaming from the program to the destination.
There are many types of I/O devices. Some can be a source, others a destination, and some can be both!
Some devices switch roles depending on what program is running. For example, a disk file might be the destination for the output of one program, and later it may be the source for another program.
Often the destination for one stream is the source for another.
Here are some examples.
Object | source, destination or both? |
---|---|
disk file | both |
running program | both |
monitor | destination |
keyboard | source |
internet connection | both |
image scanner | source |
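To make the "both" entry for a disk file concrete, here is a minimal sketch in which the same file is first a destination and then a source. The class name `FileRoles`, the method `roundTrip`, and the use of a temporary file are assumptions made for illustration only.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

class FileRoles {
    // Writes one line of text to a temporary file, then reads it back from that file.
    static String roundTrip(String text) throws IOException {
        File file = File.createTempFile("stream-demo", ".txt");
        file.deleteOnExit();

        // Here the file is the destination of an output stream.
        try (FileWriter out = new FileWriter(file)) {
            out.write(text);
        }

        // Now the same file is the source of an input stream.
        try (BufferedReader in = new BufferedReader(new FileReader(file))) {
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("computed result"));
    }
}
```

In a real setting the writing and the reading would usually happen in two different programs; they are combined here only to keep the sketch short.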
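A stream object may be classified according to several dimensions, i.e., whether it handles input or output, whether it carries characters or bytes, and whether it is a processing stream that operates on another stream.
These dimensions can be combined to form many kinds of stream objects, e.g., a BufferedReader is a character-oriented input stream that processes the data supplied by another stream.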
Not only are there many types of streams, there are also many ways to combine them!
A processing stream operates on the data supplied by another stream.
Often a processing stream acts as a buffer for the data coming from another stream. A buffer is a block of main memory used as a work area. For example, disks usually deliver data in blocks of 512 bytes, no matter how few bytes a program has asked for. Usually the blocks of data are buffered and delivered from the buffer to the program in the amount the program asked for.
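The buffering idea can be sketched with a BufferedInputStream wrapped around another stream. In this sketch a ByteArrayInputStream stands in for the disk, and the buffer size of 512 simply echoes the block size mentioned above; the names `BufferDemo` and `sumFirstBytes` are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class BufferDemo {
    // Reads up to `count` bytes, one per call, through a 512-byte buffer and sums them.
    static int sumFirstBytes(byte[] data, int count) throws IOException {
        InputStream source = new ByteArrayInputStream(data); // stands in for a disk
        try (InputStream in = new BufferedInputStream(source, 512)) {
            int sum = 0;
            for (int i = 0; i < count; i++) {
                int b = in.read(); // the program asks for a single byte
                if (b == -1) {     // end of stream
                    break;
                }
                sum += b;
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5};
        System.out.println(sumFirstBytes(data, 3));
    }
}
```

The program only ever asks for one byte at a time; the processing stream decides how much to fetch from the underlying stream in one go.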
In the picture, data from the keyboard passes through the InputStream System.in, then through an InputStreamReader stream, then through a BufferedReader stream.
System.in is a stream object that the Java system automatically creates when your program starts running.
The data is transformed along the way.
The raw bytes from the keyboard are grouped together into a String object that the program reads using stdin.readLine().
A program can set all this up by declaring a BufferedReader as follows.
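    // System.in supplies bytes, InputStreamReader turns them into chars,
    // and BufferedReader groups the chars into lines.
    BufferedReader stdin =
        new BufferedReader(new InputStreamReader(System.in));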
This may seem like an unnecessary complication, but java.io gives you a collection of parts that can be assembled to do nearly any I/O task you need.
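Here is a small complete program built around that declaration, as a sketch. The class name `ReadLineDemo` and the helper `firstLine` are assumptions; the helper accepts any InputStream so that the same chain of parts can be attached to sources other than the keyboard.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

class ReadLineDemo {
    // Wraps a byte-oriented InputStream and returns its first line as a String.
    static String firstLine(InputStream bytes) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(bytes));
        return reader.readLine(); // null if the stream is already at its end
    }

    public static void main(String[] args) throws IOException {
        System.out.print("Enter a line: ");
        String line = firstLine(System.in);
        System.out.println("You typed: " + line);
    }
}
```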
Character streams and byte streams
Fundamentally all data consist of patterns of bits grouped into 8-bit bytes. So, logically, all streams could be called "byte streams". However, streams that are intended for bytes that represent characters are called character streams and all others are called byte streams.
Character streams are optimized for character data and perform some other useful character-oriented tasks. Often the source or destination of a character stream is a text file--a file that contains bytes that represent characters.
Data sources and destinations often contain non-character data. For example, the bytecode file created by the Java compiler contains machine instructions for the Java virtual machine. These are not intended to represent characters, and input and output of them must use byte streams.
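The difference shows up when the same data is read both ways. The sketch below assumes the bytes are UTF-8 encoded text; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ByteVsChar {
    // Counts the items delivered by a byte stream: one per 8-bit byte.
    static int countBytes(byte[] data) throws IOException {
        try (InputStream in = new ByteArrayInputStream(data)) {
            int n = 0;
            while (in.read() != -1) {
                n++;
            }
            return n;
        }
    }

    // Counts the items delivered by a character stream over the same bytes,
    // assuming the bytes are UTF-8 encoded text.
    static int countChars(byte[] data) throws IOException {
        try (Reader in = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int n = 0;
            while (in.read() != -1) {
                n++;
            }
            return n;
        }
    }

    public static void main(String[] args) throws IOException {
        // "caf\u00e9" is the word "cafe" with an accented final letter.
        byte[] data = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(countBytes(data) + " bytes, " + countChars(data) + " chars");
    }
}
```

The accented letter occupies two bytes in UTF-8, so the byte stream delivers one more item than the character stream does.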
The diagram below shows the top of the hierarchy for the java.io package. The dotted clouds are abstract classes, which act as base classes for specialized streams.
- InputStream: byte-oriented input stream
- OutputStream: byte-oriented output stream
- Reader: character-oriented input stream
- Writer: character-oriented output stream
We will not instantiate these classes directly, but their subclasses. For example, a BufferedReader is a Reader.
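As a brief sketch of what "a BufferedReader is a Reader" means in code, the method below is written against the abstract class and works with any subclass. The names `ReaderHierarchy` and `firstChar` are hypothetical, and StringReader is used only as a convenient in-memory source.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReaderHierarchy {
    // Accepts any Reader subclass; returns the first char, or -1 at end of stream.
    static int firstChar(Reader reader) throws IOException {
        return reader.read();
    }

    public static void main(String[] args) throws IOException {
        // Reader is abstract, so we create objects of its subclasses instead.
        Reader plain = new StringReader("abc");
        Reader buffered = new BufferedReader(new StringReader("xyz"));
        System.out.println((char) firstChar(plain));
        System.out.println((char) firstChar(buffered));
    }
}
```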
Reader is an abstract class from which all character-oriented input streams are derived.
Readers deliver 16-bit char data to a program. The data may come from a variety of sources and formats, such as a disk file in UTF format.
The diagram below shows several concrete Reader classes.
Writer is an abstract class from which all character-oriented output streams are derived.
Writers receive 16-bit char data from a program and send it to some destination, which may use a different character format, such as UTF format on a disk file.
The diagram below shows several concrete Writer classes.
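The conversion a Writer performs can be sketched with an OutputStreamWriter. In this sketch a ByteArrayOutputStream stands in for the disk file and UTF-8 is assumed as the destination format; `WriterDemo` and `encode` are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class WriterDemo {
    // Sends chars through a Writer and returns the UTF-8 bytes that reach the destination.
    static byte[] encode(String text) throws IOException {
        ByteArrayOutputStream destination = new ByteArrayOutputStream(); // stands in for a disk file
        try (Writer out = new OutputStreamWriter(destination, StandardCharsets.UTF_8)) {
            out.write(text);
        } // closing the writer flushes any chars it was still holding
        return destination.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String text = "na\u00efve"; // five chars, one of them outside ASCII
        System.out.println(text.length() + " chars became " + encode(text).length + " bytes");
    }
}
```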
InputStream is an abstract class from which all byte-oriented input streams are derived.
Its descendant classes are used for general-purpose input (non-character input).
These streams deliver data to a program in groups of 8-bit bytes, but these bytes can be grouped into the size necessary for the type of data.
For example, if a disk file contains 32-bit int data, data can be delivered to the program in 4-byte groups in the same format as Java primitive type int.
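One way to get this 4-byte grouping is the DataInputStream subclass, sketched below. The byte array stands in for the contents of a disk file, and the names `ReadIntDemo` and `readInt` are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class ReadIntDemo {
    // Groups the first four bytes into one int, most significant byte first.
    static int readInt(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x00, 0x00, 0x01, 0x02}; // stands in for four bytes of a disk file
        System.out.println(readInt(data));      // 0x00000102, i.e. 258
    }
}
```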
OutputStream is an abstract class from which all byte-oriented output streams are derived.
Its descendant classes are used for general-purpose output (non-character output).
These streams are aimed at writing groups of 8-bit bytes to output destinations. The bytes are in the same format as Java primitive types. For example, 4-byte groups corresponding to type int can be written to a disk file.
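The output direction can be sketched with DataOutputStream, which writes an int as a group of four bytes. A ByteArrayOutputStream stands in for the disk file so that the bytes can be inspected; `WriteIntDemo` and `writeInt` are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

class WriteIntDemo {
    // Writes one int to a byte-oriented destination and returns the bytes it produced.
    static byte[] writeInt(int value) throws IOException {
        ByteArrayOutputStream destination = new ByteArrayOutputStream(); // stands in for a disk file
        try (DataOutputStream out = new DataOutputStream(destination)) {
            out.writeInt(value);
        }
        return destination.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(writeInt(258))); // [0, 0, 1, 2]
    }
}
```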
We have used PrintStream many times already, because System.out is an object of that type.
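As a short sketch, a PrintStream can also be created around another output stream, not only obtained from System.out. Here it is attached to a ByteArrayOutputStream so the printed text can be captured; the names `PrintStreamDemo` and `printToString` are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class PrintStreamDemo {
    // Prints a value to a PrintStream that writes into memory, and returns the text written.
    static String printToString(int value) {
        ByteArrayOutputStream destination = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(destination, true); // true turns on automatic flushing
        out.print(value);
        out.flush();
        return destination.toString();
    }

    public static void main(String[] args) {
        PrintStream console = System.out; // System.out is itself a PrintStream
        console.println(printToString(42));
    }
}
```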