Bytes and streams in Java

Ugh, again I fight this battle. While hacking on my XML compression tool ‘rngzip’, I'm subclassing Java's input/output stream hierarchy. The single-byte read and write methods use int instead of byte; for read, that leaves room for -1 to represent end-of-file.
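For reference, here is a minimal sketch of the usual read loop that relies on that -1 convention. The class name `ReadLoop` and the helper `countBytes` are made up for illustration; only `InputStream.read()` and `ByteArrayInputStream` come from the standard library.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadLoop {
    // Hypothetical helper: counts bytes by calling read() until it returns -1.
    static int countBytes(InputStream in) throws IOException {
        int count = 0;
        // read() returns 0..255 for a real byte and -1 only at end-of-file.
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { 0x01, (byte) 0xFF, 0x02 };
        // The 0xFF in the middle comes back as 255, so the loop does not stop early.
        System.out.println(countBytes(new ByteArrayInputStream(data))); // prints 3
    }
}
```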

But surprisingly, Java bytes are signed. There is no way to declare an unsigned byte, and the primitive casts automatically do sign extension. So when you cast a byte containing 0xFF into an int you get 0xFFFFFFFF, which is -1, exactly the value that means end-of-file. This can cause a great many bugs, some of them not apparent until you're processing a binary stream with bytes equal to 0xFF.
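A small sketch of the problem and the usual workaround, masking with 0xFF when widening. The names `SignExtension`, `ArrayStream`, `widenNaive` and `widenMasked` are invented for this example and are not taken from rngzip.

```java
import java.io.InputStream;

public class SignExtension {
    // Plain widening conversion: the sign bit is extended, so 0xFF becomes -1.
    static int widenNaive(byte b) {
        return b;
    }

    // Masking keeps only the low 8 bits, so 0xFF becomes 255.
    static int widenMasked(byte b) {
        return b & 0xFF;
    }

    // Hypothetical InputStream subclass backed by a byte array.
    static class ArrayStream extends InputStream {
        private final byte[] buf;
        private int pos = 0;

        ArrayStream(byte[] buf) {
            this.buf = buf;
        }

        @Override
        public int read() {
            if (pos >= buf.length) {
                return -1; // genuine end-of-file
            }
            // Without the mask, a 0xFF byte would be returned as -1
            // and the caller would think the stream had ended.
            return buf[pos++] & 0xFF;
        }
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF;
        System.out.println(widenNaive(b));  // prints -1
        System.out.println(widenMasked(b)); // prints 255

        ArrayStream in = new ArrayStream(new byte[] { b });
        System.out.println(in.read()); // prints 255
        System.out.println(in.read()); // prints -1
    }
}
```

On Java 8 and later, `Byte.toUnsignedInt(b)` performs the same conversion as the mask.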

The decision to use int here is questionable too, but Java has no lightweight way to specify ‘byte option’.

© 2002–2015 Christopher League