Bits in Haskell

Bitwise arithmetic is not exotic to Haskell; it works just like any other language.

Apr 03, 2023

∙ Paid

The Haskell Prelude doesn’t have any bitwise arithmetic operations, so to the C or Java programmer it may initially appear that writing Haskell doesn’t or can’t involve bit manipulation. But it is right there in the base library, not an exotic thing, in the Data.Bits module.

The fixed-width integer types too, rather than being built into the language, come from the standard library. In Data.Int we have the signed integers, and in Data.Word we have the unsigned integers. Personally, I really appreciate Haskell’s naming convention for these types; I have an awful time remembering what C’s type names mean, particularly because C and Java disagree on terminology. Java’s “int” is C’s “long”, and Java’s “long” is C’s “long long”.

Table comparing fixed-width integer types in Haskell, C, and Java

A typical Haskell program certainly doesn’t do as much bit work as a typical C program, but it’s not at all unusual. In this article I give a few examples to demonstrate that it works just about the same in Haskell as any other language.

Demo setup

Even for small demonstrations, I find it’s easier to use a Cabal package. Here’s my bits-demo.cabal file:

cabal-version: 3.0
name: bits-demo
version: 0

common base
    default-language: GHC2021
    build-depends: base, bits-show ^>= 0.0.0

library
    import: base
    exposed-modules: BitsDemo
    hs-source-dirs: library

Note that I’m using GHC 2021. I’m also using bits-show, a small package that I published for the sake of this article; it gives a convenient way to print numbers in binary representation using the showFiniteBits function. That functionality isn’t in base presumably because it’s not such a common thing to want unless one is writing a tutorial.

Preliminaries in library/BitsDemo.hs:

module BitsDemo where

import Bits.Show
import Data.Bits
import Data.Int
import Data.Word
import Data.Foldable

printBits :: Int8 -> IO ()
printBits x = putStrLn (showFiniteBits x)

I’ll be playing with the Int8 type. This is an 8-bit signed (two’s complement) integer. In Java this is the “byte” primitive, and in C they call it “signed char.”

If you run cabal repl from the directory where the .cabal and .hs files reside, then for a quick reminder of what 8-bit two’s complement integers look like, you can try things like this:

λ> traverse_ printBits [-2 .. 2]
11111110
11111111
00000000
00000001
00000010

As I add definitions to the .hs file, I’ll use the :reload command to pull them into the REPL.

Bit operations compared with Java

For comparison, let’s look at an introduction to Bitwise and Bit Shift Operators in Java.

Complement

The first thing Oracle tells us about Java’s bit utilities is:

The unary bitwise complement operator "~" inverts a bit pattern; […]

Since I don’t do this very often, I am glad that Haskell doesn’t make me remember what a character like “~” means. Instead our function is the word “complement.”

printComplement :: Int8 -> IO ()
printComplement x = putStrLn $
    showFiniteBits x <> " -> " <> 
    showFiniteBits (complement x)

λ> traverse_ printComplement [-2 .. 2]
11111110 -> 00000001
11111111 -> 00000000
00000000 -> 11111111
00000001 -> 11111110
00000010 -> 11111101

Shift

Next we having shifting:

The signed left shift operator "<<" shifts a bit pattern to the left, and the signed right shift operator ">>" shifts a bit pattern to the right.

We call those functions shiftL and shiftR.1

printLeftShift :: Int8 -> Int -> IO ()
printLeftShift x howFar = putStrLn $
    "Shifted left by " <> show howFar <> ": " <>
    showFiniteBits (shiftL x howFar)

λ> traverse_ (printLeftShift 0b00000101) [0 .. 6]
Shifted left by 0: 00000101
Shifted left by 1: 00001010
Shifted left by 2: 00010100
Shifted left by 3: 00101000
Shifted left by 4: 01010000
Shifted left by 5: 10100000
Shifted left by 6: 01000000

Observe those bits sliding toward the left as we shift farther.

Bitwise arithmetic

The bitwise & operator performs a bitwise AND operation.
The bitwise ^ operator performs a bitwise exclusive OR operation.
The bitwise | operator performs a bitwise inclusive OR operation.

For these Haskell gives us infix operators. They’re named using the same conventions as Java and C, but slightly different because in Haskell those particular symbols are already used for other things: & is flipped function application, ^ is exponentiation and the pipe | is a language keyword used for pattern guards. The Haskell bitwise operators instead look like this:

(.|.) :: Bits a => a -> a -> a  -- OR
(.&.) :: Bits a => a -> a -> a  -- AND
(.^.) :: Bits a => a -> a -> a  -- Exclusive OR (XOR)

λ> putStrLn $ showFiniteBits (0b01010101 .|. 0b00001111 :: Int8)
01011111

λ> putStrLn $ showFiniteBits (0b01010101 .&. 0b00001111 :: Int8)
00000101

λ> putStrLn $ showFiniteBits (0b01010101 .^. 0b00001111 :: Int8)
01011010

The Java introduction then gives us a demo which prints the number 2:

class BitDemo {
    public static void main(String[] args) {
        int bitmask = 0x000F;
        int val = 0x2222;
        System.out.println(val & bitmask);
    }
}

Since Java’s int corresponds to Haskell’s Int32 and Java’s (&) corresponds to Haskell’s (.&.). Hexadecimal literals are written the same way, prefixed by 0x. The equivalent code in Haskell would therefore be:

main :: IO ()
main = do
    let bitmask :: Int32 = 0x000F
        val :: Int32 = 0x2222
    print (val .&. bitmask)

Translating from C++

I’ve recently been toying around with graphics using Vulkan. The version number for the Vulkan API is described in the documentation as:

[…] the version numbers are packed into a 32-bit integer as follows:
The variant is a 3-bit integer packed into bits 31-29.
The major version is a 7-bit integer packed into bits 28-22.
The minor version number is a 10-bit integer packed into bits 21-12.
The patch version number is a 12-bit integer packed into bits 11-0.

For a Haskell program that wants to deal with Vulkan version numbers, there are therefore two datatypes we might want to write: One that represents the version as a Word32 as the Vulkan API does, and another where the four numbers are broken out and labeled as record fields.

newtype VulkanVersionPacked =
    VulkanVersionPacked Word32
    deriving (Eq, Ord, Show)

data VulkanVersionUnpacked =
    VulkanVersionUnpacked
      { variant :: Word32
      , major :: Word32
      , minor :: Word32
      , patch :: Word32
      }
    deriving (Eq, Ord, Show)

The packed version is what I need when I’m making calls to the Vulkan API (such as when initializing a Vulkan instance). The unpacked version is what I would probably want for any other purpose, like for printing information to a log file:

printVulkanVersion :: VulkanVersionUnpacked -> IO ()
printVulkanVersion v = putStrLn $
    "Using Vulkan version " <>
    List.intercalate "."
      (
        (\f -> show (f v)) <$>
            [variant, major, minor, patch]
      )

λ> printVulkanVersion (VulkanVersionUnpacked 0 1 3 245)
Using Vulkan version 0.1.3.245

The Vulkan documentation tells us in the language of C++ how version number information is converted to and from the packed 32-bit representation. The code to turn four numbers into the packed form is given as:

#define VK_MAKE_API_VERSION(variant, major, minor, patch) \
    ((((uint32_t)(variant)) << 29U) | \
    (((uint32_t)(major)) << 22U) | \
    (((uint32_t)(minor)) << 12U) | \
    ((uint32_t)(patch)))

To translate this to Haskell, it doesn’t take much more than swapping in Haskell’s names for the functions. The bit shift operator (<<) changes to shiftL, and the bitwise OR operator (|) changes to (.|.).

pack :: VulkanVersionUnpacked -> VulkanVersionPacked
pack v = VulkanVersionPacked $
    shiftL (variant v) 29 .|.
    shiftL (major   v) 22 .|.
    shiftL (minor   v) 12 .|.
    id     (patch   v)

To try out an example:

λ> let VulkanVersionPacked v = pack (VulkanVersionUnpacked 0 1 3 245) in putStrLn (showFiniteBits v)
00000000010000000011000011110101

Unpacking is given in C++ terms as four separate functions, one to read each part of the packed form.

#define VK_API_VERSION_VARIANT(version) \
    ((uint32_t)(version) >> 29U)
#define VK_API_VERSION_MAJOR(version) \
    (((uint32_t)(version) >> 22U) & 0x7FU)
#define VK_API_VERSION_MINOR(version) \
    (((uint32_t)(version) >> 12U) & 0x3FFU)
#define VK_API_VERSION_PATCH(version) \
    ((uint32_t)(version) & 0xFFFU)

In a style that is more canonical for Haskell, I’ll instead write this as a single function that returns a record of all four components. Again the translation of the bit manipulations is rather straightforward. The shift operator (>>) turns to shiftR, and the bitwise AND (&) changes to (.&.).

unpack :: VulkanVersionPacked -> VulkanVersionUnpacked
unpack (VulkanVersionPacked x) = VulkanVersionUnpacked
    { variant = shiftR x 29
    , major = shiftR x 22 .&. 0x7F
    , minor = shiftR x 12 .&. 0x3FF
    , patch = x .&. 0xFFF
    }

We can test that unpack is the inverse of pack:

λ> unpack (VulkanVersionPacked 0b00000000010000000011000011110101)
VulkanVersionUnpacked {variant = 0, major = 1, minor = 3, patch = 245}

When we have a pair of functions that convert back and forth like this, it’s often a great candidate for property testing.

Property testing

Keep reading with a 7-day free trial

Subscribe to Type Classes to keep reading this post and get 7 days of free access to the full post archives.