Writing a Server in Brainfuck Writing a Server in Brainfuck | Muhammed Ali CAN

Writing a Server in Brainfuck

image
Muhammed Ali CAN 25.04.2024

Introduction

Esoteric languages have always been an interesting subject for me. The challenge of creating something out of thin air and doing so with wild constraints is intriguing. But more than writing in those languages, writing those languages have always been more interesting to me. Writing tokenizers and parsers, creating an interpreter or even trying to compile a code that I designed is thrilling.

Another passion of mine lately has become the 'Brainfuck' language. It is so simple yet so capable. Maybe not as capable as I would like it to be but I digress! Or do I? No I don't! I want it to be more capable and I am going to do it here with you. But first, I need to explain the language to you.

Brainfuck is a dead simple Turing machine that adds some simple I/O commands to another dead simple Turing machine Pā€²ā€². Suppose you have a limited tape, and in that tape you have cells that can hold a byte of value. In BF you can crawl around these cells, mutate their values and maybe output their values or even read a value from the terminal.

8 commandments

You have a byte array, at least 30.000 in length, and a pointer that starts at the beginning of this list. With commands > and <, you can change the position of your pointer and with + and - you can increment or decrement the value a cell holds to your heart's content. With the . command you can write the value of your current cell, the cell your pointer points to, to the terminal and with the , command you can read a single byte from the terminal and store it in the cell your pointer points to. And last but not least, we have the loops. With the [ and ] commands you can create loops. What is inside these brackets becomes your loop to run. If the cell your pointer points to is zero, you skip the loop and continue with the next command. If it is not zero, you execute the commands inside the loop until the cell your pointer points to becomes zero.

They Are All Just Files!

To accomplish what I want to do, I need another command and hear me out, this will all make sense. I want to make an HTTP server with Brainfuck and serve a webpage with it. Brainfuck already has the primitives to deal with I/O, but it is not enough so let me take you aside while I explain what I know about I/O.

Sidenote, they are not actually just all files, but they behave like so, so in our case there is no functional difference.

The Blasphemy

You are probably seeing where I am going with this. What if, in Brainfuck, I could specify the destination of my I/O operations so that I can read from an HTML file and write to an HTTP connection.

The idea is simple, I will add another command to the Brainfuck language to change its target file descriptor. I will then create a simple server in another language and accept connections. Whenever I get a connection, I will pass the connection's and my HTML file's descriptors to my own Brainfuck runtime and read and write switching between my descriptor targets.

The * command will read the value the pointer is currently pointing to and set that as the target file descriptor. Any . or , operations from that point on, will work on that target.

Runtime

I will create the Brainfuck language from scratch with Go, doing what I love to do, writing lexers and parsers, creating an interpreter. We'll have fun I promise.

Writing lexers is easy anyone can do it, especially programmers.

Sike! I'm going to use this tokenizer to lex my input stream. Contrary to Rob Pike's belief, I am not capable of such things. I am a soy dev react fan boy who day dreams about writing svelte in his full time job (Currently unemployed btw šŸ˜‰šŸ˜‰). Jokes aside, I just want to get over this part so I can get to the HTTP part. So this was a lie. Don't believe everything you read on the internet.

Here is the lexer, it takes an input and splits it into these tokens so that I can parse it.

        
func new_lexer() *tokenizer.Tokenizer {
  lexer := tokenizer.New()
  lexer.DefineTokens(PlusToken, []string{"+"})
  lexer.DefineTokens(MinusToken, []string{"-"})
  lexer.DefineTokens(ChevronRightToken, []string{">"})
  lexer.DefineTokens(ChevronLeftToken, []string{"<"})
  lexer.DefineTokens(OpenBracketsToken, []string{"["})
  lexer.DefineTokens(CloseBracketsToken, []string{"]"})
  lexer.DefineTokens(DotToken, []string{"."})
  lexer.DefineTokens(CommaToken, []string{","})
  // Here is our new command
  lexer.DefineTokens(StarToken, []string{"*"})
  // Here is a cheeky little debug token to help me write this insanity
  lexer.DefineTokens(DebugToken, []string{"debug"})

  return lexer
}
        
      

We will need AST for parsing so, here is the simplest ast you can see.

        
type op_kind string

type op interface {
  kind() op_kind
  String() string
}

type ast struct {
  body []op
}
        
      

The op interface describes an operation like moving left on the tape or incrementing a cell. I then create each operation as a struct that implements this interface. Here is the move operation.

        
// a direction for moving the pointer
type dir string

const (
  move_left  dir = "move:left"
  move_right dir = "move:right"
)

type move_op struct {
  dir dir
}
        
      

I am sure you can imagine the rest of the structs yourself.

And here is my parse function, although heavily edited for your viewing pleasure, it captures the gist of the action.

        
func parse(input []byte) ast {
  // I create a lexer and feed my input to it
  lexer := new_lexer()
  stream := lexer.ParseBytes(input)
  defer stream.Close()

  // Create my ast
  program := ast{body: []op{}}

  // Iterate over the tokens and create operations
  // Brainfuck is a relatively easy language to parse
  for stream.IsValid() {
    field := stream.CurrentToken()

    switch field.Key() {
    case PlusToken:
      program.body = append(program.body, mutate_op{mutation: increment})
    case MinusToken:
      program.body = append(program.body, mutate_op{mutation: decrement})
    case ChevronRightToken:
      program.body = append(program.body, move_op{dir: move_right})
    case ChevronLeftToken:
      program.body = append(program.body, move_op{dir: move_left})
    case DotToken:
      program.body = append(program.body, io_op{io: write})
    case CommaToken:
      program.body = append(program.body, io_op{io: read})
    case StarToken:
      program.body = append(program.body, target_op{})
    case DebugToken:
      program.body = append(program.body, debug_op{})
    default:
      // If token is none of these, then it is a comment
      program.body = append(program.body, comment_op{raw: field.ValueString()})
    }

    stream.GoNext()
  }

  return program
}
        
      

Now we are in the crux of it all. We have our AST, we will create an interpreter and run the monstrosity. This time I will bore you with the details.

Here, I have an Interpreter struct that holds our machine.

        
type Interpreter struct {
  // our tape
  tape    [30000]byte
  pointer int
  ast     ast
  // this is the important part
  fd      int
}

// let's create a function that initializes our interpreter
func NewInterpreter(program ast) Interpreter {
  return Interpreter{
    tape:    [30000]byte{},
    pointer: 0,
    ast:     program,
    fd:      1, // 1 is stdout's file descriptor, so initially everything will be written to or read from the terminal
  }
}
        
      

Simple enough right? Now we need to run our program. We will iterate over our AST and execute the operations. Here is the run_context function that interprets a slice of ops.

        
func (i *Interpreter) run_context(body []op) {
  for _, op := range body {
    switch op := op.(type) {
    // Increment or decrement, this is simple
    case mutate_op:
      if op.mutation == increment {
        i.tape[i.pointer]++
      } else {
        i.tape[i.pointer]--
      }
    // Let's move our pointer
    case move_op:
      if op.dir == move_right {
        i.pointer++
      } else {
        i.pointer--
      }

      // I'm going to clamp the pointer to the tape's bounds.
      // I am not sure what is the behaviour of Brainfuck 
      // but why not create some fragmantation, 
      // beats having index out of bounds errors.
      if i.pointer < 0 {
        i.pointer = 0
      }
      if i.pointer > 30000 {
        i.pointer = 30000
      }
    // Loop operations have a body field, let's recursively run 'em
    case loop_op:
      for i.tape[i.pointer] != 0 {
        i.run_context(op.body)
      }
    // This is the important part
    case io_op:
      if op.io == write {
        // We look at our tape and write its value to 
        // the current file descriptor with syscall
        syscall.Write(i.fd, []byte{i.tape[i.pointer]})
      } else {
        // Again, we read from our source with syscall
        b := [1]byte{}
        syscall.Read(i.fd, b[:])
        i.tape[i.pointer] = Cell(b[0])
      }
    // This is the command that changes our file descriptor
    case target_op:
      i.fd = int(i.tape[i.pointer])
    case debug_op:
      // prints some debug info
    }
  }
}
        
      

As you can see, we convert our single byte to an int to keep it as a file descriptor. This then raises the concern of 'what happens if our file descriptor does not fit 8 bits?'. Come to think of it, how would we read from a utf-8 file that does not just hold 8 bit characters?

I could make the runtime use uint32 values instead of bytes but that would break other programs written in Brainfuck, cause you see, when a program runs a - command on a cell that has a 0 value, the byte wraps around and you get a 255 value on the cell or vice versa.

But then again, if I made the cell values bunch of uint32s it would be easy to support unicode, because Go has a nifty rune type that I can just use. So... more fragmantation. I will use uint32 instead of a byte, but maybe we can add a flag to the program to use bytes as well.

        
type Interpreter struct {
-  tape    [30000]byte
+  tape    [30000]uint32
}
        
      

With that, our runtime is ready! I cooked up a custom http server that just uses syscalls and gives me the control over the connection. And when I say 'I', I mean I found an excellent github gist that does what I want. So let me show you the server, where it matters.

Run Time

        
// let's create the server
socket, err := create_socket()
if err != nil {
  panic(err)
}
defer socket.Close()

for {
  // and one by one accept connections
  w, e := socket.Accept()
  if e != nil {
    log.Println(e.Error())
  }

  // here I have my file and I can get its descriptor with .Fd() function
  content, _ := os.OpenFile("./content.html", os.O_RDONLY, 0644)
  // and here I have my connection's descriptor
  w.fd

  // I need to run a Brainfuck program that uses these to respond to the client
	
  content.Close()
  w.Close()
}
        
      

I will pass the file descriptors I acquired to the programs tape, so it will have a head start. Let's write Brainfuck!

        
our program will have the HTML file's descriptor at index 0 and the http descriptor at index 1
we will read the entire file and write to the connection
so as not to mess with our descriptors I will first go to 
cell index 2

* switch to HTML source
>> go to empty cell

+ increment by 1 because we will go into a loop and we don't want to skip it
we will loop until the byte we read from the file is 0
[
  , this will read the byte from the file into this cell
  < * go back to http target and switch to it
  > come back to data cell
  . write the data cell to the connection
  << * go back to the file and switch to it
  >> come back to the data cell
  we leave the pointer state and the file descriptor state as we started the loop
]
        
      

Because Brainfuck deals with I/O 1 byte at a time, we switch between targets and sources and loop over each byte of the HTML file and write to the http connection. One quick test with my browser of choice shows that this code is not working. Why? Because the raw implementation of the http server requires us to send the http headers, as it is only a socket and nothing else. So I am going to preare a file that contains the http headers and before sending the HTML file, we will send the headers. So let me show you the final Go code.

        
// let's create the server
socket, err := create_socket()
if err != nil {
  panic(err)
}
defer socket.Close()

for {
  // and one by one accept connections
  w, e := socket.Accept()
  if e != nil {
    log.Println(e.Error())
  }

  // here I have my header file and I can get its descriptor with .Fd() function
  header, _ := os.OpenFile("./header.html", os.O_RDONLY, 0644)
  // here I have my file and I can get its descriptor with .Fd() function
  content, _ := os.OpenFile("./content.html", os.O_RDONLY, 0644)
  // and here I have my connection's descriptor
  w.fd

  // I will run my server.bf file, with descriptors as arguments
  bfx.RunFile("./server.bf", uint32(header.Fd()), uint32(content.Fd()), uint32(w.Fd()))
	
  header.Close()
  content.Close()
  w.Close()
}
        
      

Here is the header file.

        
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: // I will hard code this manually based on the content


        
      

Let's go over the final Brainfuck code once and for all.

        
let's go to cell index 3 this time because we have
header at 0
content at 1
http at 2

* switch to HTML source
>>> go to empty cell

+ increment by 1 for the loop
we first read/write the header
[
  , this will read the byte from the file into this cell
  < * go back to http target and switch to it
  > come back to data cell
  . write the data cell to the connection
  <<< * go back to the header and switch to it (notice the triple left)
  >>> come back to the data cell
]

the last written byte was 0 so our current cell is 0
+ increment by 1 for the loop
we now write the content
[
  , this will read the byte from the file into this cell
  < * go back to http target and switch to it
  > come back to data cell
  . write the data cell to the connection
  << * go back to the header and switch to it (only double left)
  >> come back to the data cell
]
        
      

And with that, our Brainfuck program is serving the HTML file to our clients. Don't believe me? This blog post is served with Brainfuck.

Conclusion

It is possible to serve HTML files with Brainfuck.

It is possible to serve a lot of things with Brainfuck.

If you can manage to parse it, you also can read the request and do routing as well, I guess.

In my M2 Macbook, my Brainfuck server serves this HTML file just under 100ms which is not that bad. If someone wants make a benchmark I would love to take a look. The whole code is in my github so go check it out if you want.