Bits in Haskell
Bitwise arithmetic is not exotic to Haskell; it works just like any other language.
The Haskell Prelude doesn’t have any bitwise arithmetic operations, so to the C or Java programmer it may initially appear that writing Haskell doesn’t or can’t involve bit manipulation. But it is right there in the base library, not an exotic thing, in the Data.Bits module.
The fixed-width integer types too, rather than being built into the language, come from the standard library. In Data.Int we have the signed integers, and in Data.Word we have the unsigned integers. Personally, I really appreciate Haskell’s naming convention for these types; I have an awful time remembering what C’s type names mean, particularly because C and Java disagree on terminology. Java’s “int” is C’s “long”, and Java’s “long” is C’s “long long”.
A typical Haskell program certainly doesn’t do as much bit work as a typical C program, but it’s not at all unusual. In this article I give a few examples to demonstrate that it works just about the same in Haskell as any other language.
Demo setup
Even for small demonstrations, I find it’s easier to use a Cabal package. Here’s my bits-demo.cabal file:
cabal-version: 3.0
name: bits-demo
version: 0
common base
default-language: GHC2021
build-depends: base, bits-show ^>= 0.0.0
library
import: base
exposed-modules: BitsDemo
hs-source-dirs: library
Note that I’m using GHC 2021. I’m also using bits-show, a small package that I published for the sake of this article; it gives a convenient way to print numbers in binary representation using the showFiniteBits function. That functionality isn’t in base presumably because it’s not such a common thing to want unless one is writing a tutorial.
Preliminaries in library/BitsDemo.hs:
module BitsDemo where
import Bits.Show
import Data.Bits
import Data.Int
import Data.Word
import Data.Foldable
printBits :: Int8 -> IO ()
printBits x = putStrLn (showFiniteBits x)
I’ll be playing with the Int8 type. This is an 8-bit signed (two’s complement) integer. In Java this is the “byte” primitive, and in C they call it “signed char.”
If you run cabal repl from the directory where the .cabal and .hs files reside, then for a quick reminder of what 8-bit two’s complement integers look like, you can try things like this:
λ> traverse_ printBits [-2 .. 2]
11111110
11111111
00000000
00000001
00000010
As I add definitions to the .hs file, I’ll use the :reload command to pull them into the REPL.
Bit operations compared with Java
For comparison, let’s look at an introduction to Bitwise and Bit Shift Operators in Java.
Complement
The first thing Oracle tells us about Java’s bit utilities is:
The unary bitwise complement operator "
~
" inverts a bit pattern; […]
Since I don’t do this very often, I am glad that Haskell doesn’t make me remember what a character like “~” means. Instead our function is the word “complement.”
printComplement :: Int8 -> IO ()
printComplement x = putStrLn $
showFiniteBits x <> " -> " <>
showFiniteBits (complement x)
λ> traverse_ printComplement [-2 .. 2]
11111110 -> 00000001
11111111 -> 00000000
00000000 -> 11111111
00000001 -> 11111110
00000010 -> 11111101
Shift
Next we having shifting:
The signed left shift operator "
<<
" shifts a bit pattern to the left, and the signed right shift operator ">>
" shifts a bit pattern to the right.
We call those functions shiftL and shiftR.1
printLeftShift :: Int8 -> Int -> IO ()
printLeftShift x howFar = putStrLn $
"Shifted left by " <> show howFar <> ": " <>
showFiniteBits (shiftL x howFar)
λ> traverse_ (printLeftShift 0b00000101) [0 .. 6]
Shifted left by 0: 00000101
Shifted left by 1: 00001010
Shifted left by 2: 00010100
Shifted left by 3: 00101000
Shifted left by 4: 01010000
Shifted left by 5: 10100000
Shifted left by 6: 01000000
Observe those bits sliding toward the left as we shift farther.
Bitwise arithmetic
The bitwise
&
operator performs a bitwise AND operation.The bitwise
^
operator performs a bitwise exclusive OR operation.The bitwise
|
operator performs a bitwise inclusive OR operation.
For these Haskell gives us infix operators. They’re named using the same conventions as Java and C, but slightly different because in Haskell those particular symbols are already used for other things: & is flipped function application, ^ is exponentiation and the pipe |
is a language keyword used for pattern guards. The Haskell bitwise operators instead look like this:
(.|.) :: Bits a => a -> a -> a -- OR
(.&.) :: Bits a => a -> a -> a -- AND
(.^.) :: Bits a => a -> a -> a -- Exclusive OR (XOR)
λ> putStrLn $ showFiniteBits (0b01010101 .|. 0b00001111 :: Int8)
01011111
λ> putStrLn $ showFiniteBits (0b01010101 .&. 0b00001111 :: Int8)
00000101
λ> putStrLn $ showFiniteBits (0b01010101 .^. 0b00001111 :: Int8)
01011010
The Java introduction then gives us a demo which prints the number 2:
class BitDemo { public static void main(String[] args) { int bitmask = 0x000F; int val = 0x2222; System.out.println(val & bitmask); } }
Since Java’s int corresponds to Haskell’s Int32 and Java’s (&) corresponds to Haskell’s (.&.). Hexadecimal literals are written the same way, prefixed by 0x
. The equivalent code in Haskell would therefore be:
main :: IO ()
main = do
let bitmask :: Int32 = 0x000F
val :: Int32 = 0x2222
print (val .&. bitmask)
Translating from C++
I’ve recently been toying around with graphics using Vulkan. The version number for the Vulkan API is described in the documentation as:
[…] the version numbers are packed into a 32-bit integer as follows:
The variant is a 3-bit integer packed into bits 31-29.
The major version is a 7-bit integer packed into bits 28-22.
The minor version number is a 10-bit integer packed into bits 21-12.
The patch version number is a 12-bit integer packed into bits 11-0.
For a Haskell program that wants to deal with Vulkan version numbers, there are therefore two datatypes we might want to write: One that represents the version as a Word32 as the Vulkan API does, and another where the four numbers are broken out and labeled as record fields.
newtype VulkanVersionPacked =
VulkanVersionPacked Word32
deriving (Eq, Ord, Show)
data VulkanVersionUnpacked =
VulkanVersionUnpacked
{ variant :: Word32
, major :: Word32
, minor :: Word32
, patch :: Word32
}
deriving (Eq, Ord, Show)
The packed version is what I need when I’m making calls to the Vulkan API (such as when initializing a Vulkan instance). The unpacked version is what I would probably want for any other purpose, like for printing information to a log file:
printVulkanVersion :: VulkanVersionUnpacked -> IO ()
printVulkanVersion v = putStrLn $
"Using Vulkan version " <>
List.intercalate "."
(
(\f -> show (f v)) <$>
[variant, major, minor, patch]
)
λ> printVulkanVersion (VulkanVersionUnpacked 0 1 3 245)
Using Vulkan version 0.1.3.245
The Vulkan documentation tells us in the language of C++ how version number information is converted to and from the packed 32-bit representation. The code to turn four numbers into the packed form is given as:
#define VK_MAKE_API_VERSION(variant, major, minor, patch) \ ((((uint32_t)(variant)) << 29U) | \ (((uint32_t)(major)) << 22U) | \ (((uint32_t)(minor)) << 12U) | \ ((uint32_t)(patch)))
To translate this to Haskell, it doesn’t take much more than swapping in Haskell’s names for the functions. The bit shift operator (<<
) changes to shiftL
, and the bitwise OR operator (|
) changes to (.|.
).
pack :: VulkanVersionUnpacked -> VulkanVersionPacked
pack v = VulkanVersionPacked $
shiftL (variant v) 29 .|.
shiftL (major v) 22 .|.
shiftL (minor v) 12 .|.
id (patch v)
To try out an example:
λ> let VulkanVersionPacked v = pack (VulkanVersionUnpacked 0 1 3 245) in putStrLn (showFiniteBits v)
00000000010000000011000011110101
Unpacking is given in C++ terms as four separate functions, one to read each part of the packed form.
#define VK_API_VERSION_VARIANT(version) \ ((uint32_t)(version) >> 29U) #define VK_API_VERSION_MAJOR(version) \ (((uint32_t)(version) >> 22U) & 0x7FU) #define VK_API_VERSION_MINOR(version) \ (((uint32_t)(version) >> 12U) & 0x3FFU) #define VK_API_VERSION_PATCH(version) \ ((uint32_t)(version) & 0xFFFU)
In a style that is more canonical for Haskell, I’ll instead write this as a single function that returns a record of all four components. Again the translation of the bit manipulations is rather straightforward. The shift operator (>>
) turns to shiftR
, and the bitwise AND (&
) changes to (.&.
).
unpack :: VulkanVersionPacked -> VulkanVersionUnpacked
unpack (VulkanVersionPacked x) = VulkanVersionUnpacked
{ variant = shiftR x 29
, major = shiftR x 22 .&. 0x7F
, minor = shiftR x 12 .&. 0x3FF
, patch = x .&. 0xFFF
}
We can test that unpack is the inverse of pack:
λ> unpack (VulkanVersionPacked 0b00000000010000000011000011110101)
VulkanVersionUnpacked {variant = 0, major = 1, minor = 3, patch = 245}
When we have a pair of functions that convert back and forth like this, it’s often a great candidate for property testing.
Property testing
Keep reading with a 7-day free trial
Subscribe to Type Classes to keep reading this post and get 7 days of free access to the full post archives.