Setting up a FreeBSD playground for the Elbrus 2000 architecture

Posted on September 5, 2022

Elbrus 2000 (E2K) is a Russian microprocessor architecture based on an uncommon VLIW instruction set. It is now possible to play with its assembly language by running binaries under QEMU user-mode emulation.

Setting up stuff

To avoid polluting the system ${LOCALBASE} (/usr/local), we will install everything into a custom prefix designated by the following environment variable. Change the directory to your liking:

# for csh
setenv E2KDIR /home/arr/e2k
# for sh/bash
export E2KDIR=/home/arr/e2k

# create the prefix directory if it does not exist yet
mkdir -p $E2KDIR
cd $E2KDIR

Let's start by building the binutils fork with E2K support.

# install required dependencies
pkg install -A devel/gettext-runtime devel/gettext-tools devel/gmake lang/perl5.32 math/gmp math/mpfr ports-mgmt/pkg print/indexinfo print/texinfo lang/python39 devel/meson devel/ninja shells/bash

# get the source
git clone --depth=1 https://git.mentality.rip/OpenE2K/binutils-gdb.git binutils-e2k

# configure, compile and install
cd binutils-e2k
./configure CFLAGS=-isystem/usr/local/include CXXFLAGS=-isystem/usr/local/include LDFLAGS=-L/usr/local/lib --target=e2k-mcst-linux-gnu --prefix=$E2KDIR
gmake -j4 && gmake install

This will install the usual set of binutils tools into $E2KDIR/bin. We're mostly interested in the e2k-mcst-linux-gnu-as assembler and e2k-mcst-linux-gnu-ld linker.

Now we need an emulator to run compiled binaries on:

cd $E2KDIR
git clone -b e2k-bsd-user-blitz https://git.mentality.rip/OpenE2K/qemu-e2k.git
cd qemu-e2k
mkdir build
cd build
../configure --extra-ldflags=-L/usr/local/lib  --extra-cflags=-I/usr/local/include --extra-cflags=-Werror=implicit-function-declaration --enable-debug  --enable-debug-info --disable-docs --disable-capstone  --static --target-list=e2k-bsd-user
ninja

After a successful build we'll get the qemu-e2k executable, which is capable of running and debugging Elbrus ELF binaries.

Assembler “Hello World”

It is now possible to play with E2K assembly language. Let's compile and run a simple “Hello World” program:

.section ".data"

$hello_msg:
    .ascii    "Hello World\n\000"

.section ".text"
    .global _start

_start:
    {
      sdisp %ctpr1, 0x3
      addd, 0 0x0, 12, %b[3]
      addd, 2 0x0, [ _f64, _lts1 $hello_msg ], %b[2]
      addd, 1 0x0, 0x1, %b[1]
      addd, 3 0x0, 0x4, %b[0]
    }

    {
      call %ctpr1, wbs = 0x4
    }

    {
      sdisp %ctpr2, 0x3
      addd, 0 0x0, 0x0, %b[1]
      addd, 1 0x0, 0x1, %b[0]
    }

    {
      call %ctpr2, wbs = 0x4
    }

# qemu-e2k has to be run from its build dir
cd $E2KDIR/qemu-e2k/build/

$E2KDIR/bin/e2k-mcst-linux-gnu-as -mcpu=elbrus-v3 -o hello.o hello.s
$E2KDIR/bin/e2k-mcst-linux-gnu-ld -o hello hello.o
./qemu-e2k ./hello
Hello World

Here is a short description of what's going on in this program. Because E2K is a VLIW architecture, the minimal computational unit in its assembly language is not a single instruction but a block of instructions wrapped in { and }, called a wide command.

The first wide command in our program consists of sdisp and addd instructions. The former prepares a special call, a system call in our case. Up to 3 calls can be prepared simultaneously by using the %ctpr1, %ctpr2 and %ctpr3 registers. The second operand of this instruction is the type of the call being made; for a system call we use 0x3.

The addd instruction performs addition, but it can also be used like the x86 mov instruction by adding zero. In our case it is used to fill the %b[] registers with arguments for the syscall:

The %b[0] register is filled with the value 0x4, which is the number of the write() syscall. You can find syscall numbers in the sys/kern/syscalls.master file of the FreeBSD source tree.
The %b[1] register is set to 1, the descriptor of the stdout I/O stream.
The %b[2] register is filled with the address of the string constant.
The last register, %b[3], contains the string's length.
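
For readers more at home in C, the same call can be sketched on the host system with the generic syscall(2) wrapper. This is only an analogy, not E2K code: raw_write and say_hello are hypothetical helper names, and SYS_write comes from the host's <sys/syscall.h> (on FreeBSD it is 4, the value loaded into %b[0]).

```c
#define _DEFAULT_SOURCE         /* expose syscall() on glibc-based hosts */
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: passes the same four values that the first wide
 * command loads into %b[0]..%b[3]. */
long raw_write(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    /* syscall number, descriptor, buffer address, length */
    return syscall(SYS_write, fd, buf, len);
}

/* Host-side equivalent of the assembly: write(1, "Hello World\n", 12). */
long say_hello(void)
{
    static const char hello_msg[] = "Hello World\n";
    /* sizeof counts the trailing NUL, so subtract it to get 12 */
    return raw_write(STDOUT_FILENO, hello_msg, sizeof(hello_msg) - 1);
}
```

Calling say_hello() from a host program prints the greeting and returns the number of bytes written.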

Integer values right after the instruction name define the execution unit that will be in charge of running it. The programmer can actually omit these and let the assembler distribute instructions between units. Some units can execute instructions that other units cannot. Detailed information on this matter can be mined from this document.
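
The idea of pinning an instruction to a unit versus letting the assembler choose can be pictured with a toy model in C. It is a sketch under loud assumptions: six ALU channels (inferred from the addd,0 to addd,5 suffixes used in the listings in this post), a first-free-channel policy, and hypothetical names (wide_command, place_on, place_auto). It ignores the per-unit restrictions a real assembler has to honor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ALU_CHANNELS 6   /* assumption: channels 0..5, as in addd,0 .. addd,5 */

/* Toy wide command: one "occupied" flag per ALU channel. */
struct wide_command {
    bool used[ALU_CHANNELS];
};

/* Explicit placement, like "addd,2 ...": fails if the channel is
 * out of range or already taken within this wide command. */
bool place_on(struct wide_command *wc, int channel)
{
    assert(wc != NULL);
    if (channel < 0 || channel >= ALU_CHANNELS || wc->used[channel])
        return false;
    wc->used[channel] = true;
    return true;
}

/* Automatic placement, like an instruction written without a unit number:
 * take the first free channel. Returns the chosen channel, or -1 when the
 * wide command is already full. */
int place_auto(struct wide_command *wc)
{
    assert(wc != NULL);
    for (int ch = 0; ch < ALU_CHANNELS; ch++) {
        if (!wc->used[ch]) {
            wc->used[ch] = true;
            return ch;
        }
    }
    return -1;
}
```

In this model a wide command simply runs out of room after six instructions; the real rules about which unit accepts which instruction are stricter.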

The second wide command consists only of the call instruction, which actually performs the prepared call. It takes not the callee address but one of the %ctprX registers. The wbs = 0x4 parameter specifies how many registers of the register window should be shifted before entering the callee. The register window mechanism is quite a complex topic, so I'll omit any further details here.

The last two wide commands perform the exit() syscall in the same way.

Compiling C

The official Elbrus C/C++ compiler, called lcc, is closed source, so to compile C for E2K we’d need something else, for example the Elbrus fork of the Little C Compiler. Follow the steps from its README to build it. Another roadblock to playing with C is the absence of the FreeBSD CSU, the part of libc that initializes the C runtime. This means that we’d have to write the entry point and the system call code ourselves:

    .text
    .global syscall1
syscall1:
{
    setwd wsz=0x8, nfx=1
    setbn rbs=0x4, rsz=0x3, rcur=0x0
    sdisp %ctpr1, 0x3
}
{
    addd,0,sm 0, %r0, %b[0]
    addd,1,sm 0, %r1, %b[1]
    addd,2,sm 0, 0, %b[2]
    addd,3,sm 0, 0, %b[3]
    addd,4,sm 0, 0, %b[4]
    addd,5,sm 0, 0, %b[5]
}
{
    nop 2
    addd,0,sm 0, 0, %b[6]
    addd,1,sm 0, 0, %b[7]
}
    call %ctpr1, wbs=0x4
{
    nop 5
    return %ctpr3
}
    ct %ctpr3

    .text
    .global syscall3
syscall3:
{
    setwd wsz=0x8, nfx=1
    setbn rbs=0x4, rsz=0x3, rcur=0x0
    sdisp %ctpr1, 0x3
}
{
    addd,0,sm 0, %r0, %b[0]
    addd,1,sm 0, %r1, %b[1]
    addd,2,sm 0, %r2, %b[2]
    addd,3,sm 0, %r3, %b[3]
    addd,4,sm 0, 0, %b[4]
    addd,5,sm 0, 0, %b[5]
}
{
    nop 2
    addd,0,sm 0, 0, %b[6]
    addd,1,sm 0, 0, %b[7]
}
    call %ctpr1, wbs=0x4
{
    nop 5
    return %ctpr3
}
    ct %ctpr3

The setwd/setbn incantations at the start of each procedure are another part of register window operation that is not covered in this post.
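
What the two stubs do with their arguments can be modeled in plain C. This is a rough sketch, not a translation of the assembly: marshal_syscall and SYSCALL_SLOTS are hypothetical names, and the array merely stands in for the %b[0]..%b[7] registers that the stubs fill before the call.

```c
#include <assert.h>
#include <stddef.h>

#define SYSCALL_SLOTS 8   /* stands in for %b[0]..%b[7] in the stubs */

/* Hypothetical model of the stubs: slot 0 gets the syscall number, the next
 * nargs slots get the arguments, and every remaining slot is zeroed. */
void marshal_syscall(long slots[SYSCALL_SLOTS], long nr,
                     const long *args, size_t nargs)
{
    assert(slots != NULL);
    assert(nargs < SYSCALL_SLOTS);
    slots[0] = nr;
    for (size_t i = 1; i < SYSCALL_SLOTS; i++)
        slots[i] = (i <= nargs) ? args[i - 1] : 0;
}
```

With nargs set to 1 this mirrors syscall1, and with nargs set to 3 it mirrors syscall3; the only difference between the two stubs is how many slots receive real arguments instead of zeros.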

On the C side we have this:

#define _NR_exit    1
#define _NR_write   4

#define STDOUT_FILENO 1

long syscall1(long nr, long a0);
long syscall3(long nr, long a0, long a1, long a2);

long write(int fd, const void *buf, long count) {
    return syscall3(_NR_write, fd, (long) buf, count);
}

void exit(int status) {
    syscall1(_NR_exit, status);
}

const char msg[] = "Hello, world\n";

void _start(void) {
    /* sizeof(msg) counts the terminating NUL, which should not be printed */
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    exit(0);
}
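
One detail about the length passed to write() is worth spelling out: sizeof applied to a character array includes the terminating NUL, so the text itself is one byte shorter. A small host-side check, with hypothetical helper names msg_storage_size and msg_text_length:

```c
#include <assert.h>
#include <stddef.h>

const char msg[] = "Hello, world\n";

/* Compile-time check: 13 visible characters plus the terminating NUL. */
static_assert(sizeof(msg) == 14, "13 characters plus the NUL");

/* Bytes occupied by the array, trailing NUL included. */
size_t msg_storage_size(void) { return sizeof(msg); }

/* Bytes of actual text, i.e. what should reach the terminal. */
size_t msg_text_length(void) { return sizeof(msg) - 1; }
```

Passing the storage size to write() would also send the NUL byte to the terminal, which is usually invisible but still not part of the message.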

Assuming the assembly stubs are saved as start.s and the C code as hello.c, we can now compile, assemble and link both files:

lcc-build/lcc -target=e2k/linux hello.c -S -o hello.s
$E2KDIR/bin/e2k-mcst-linux-gnu-as -mcpu=elbrus-v3 -o start.o start.s
$E2KDIR/bin/e2k-mcst-linux-gnu-as -mcpu=elbrus-v3 -o hello.o hello.s
$E2KDIR/bin/e2k-mcst-linux-gnu-ld -o hello start.o hello.o
./qemu-e2k ./hello
Hello, world

Hacking on the emulator

The QEMU E2K port on FreeBSD is a very early work in progress. To give you an idea of how raw it is, the e2k-bsd-user-blitz branch is based on:

The e2k-v5-v7.0.0 branch, which is a work-in-progress E2K target for QEMU.
The blitz branch from the https://github.com/qemu-bsd-user/qemu-bsd-user repository. This is a QEMU fork that implements FreeBSD user-mode emulation. This branch is in the process of being upstreamed.
The e2k-bsd-user branch, which contains the bsd-e2k specific code.

If you want to hack on the emulator, start with e2k-bsd-user-blitz and send me your patches.