Setting up a FreeBSD playground for the Elbrus 2000 architecture
Elbrus 2000 is a Russian microprocessor architecture based on an uncommon VLIW instruction set. It is now possible to play with its assembly language by running it under QEMU user-mode emulation.
Setting up stuff
To avoid polluting the system ${LOCALBASE} (/usr/local), we will install everything into a custom prefix designated by the following environment variable. Change the directory to your liking:
# for csh
setenv E2KDIR /home/arr/e2k
# for sh/bash
export E2KDIR=/home/arr/e2k
cd $E2KDIR
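The cd above fails if the directory does not exist yet. A minimal sh sketch that creates the prefix first; e2k_prepare_prefix is a hypothetical helper name made up for this post, and it assumes a POSIX sh (csh users would need to adapt it):

```shell
# Hypothetical helper: create the prefix (and any missing parent
# directories), then change into it.
e2k_prepare_prefix() {
    mkdir -p "$1" && cd "$1"
}

# Example call, using the variable defined above:
#   e2k_prepare_prefix "$E2KDIR"
```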
Let's start by building the binutils fork with E2K support.
# install required dependencies
pkg install -A devel/gettext-runtime devel/gettext-tools devel/gmake lang/perl5.32 math/gmp math/mpfr ports-mgmt/pkg print/indexinfo print/texinfo lang/python39 devel/meson devel/ninja shells/bash
# get the source
git clone --depth=1 https://git.mentality.rip/OpenE2K/binutils-gdb.git binutils-e2k
# configure, compile and install
cd binutils-e2k
./configure CFLAGS=-isystem/usr/local/include CXXFLAGS=-isystem/usr/local/include LDFLAGS=-L/usr/local/lib --target=e2k-mcst-linux-gnu --prefix=$E2KDIR
gmake -j4 && gmake install
This will install the usual set of binutils tools into $E2KDIR/bin. We're mostly interested in the e2k-mcst-linux-gnu-as assembler and the e2k-mcst-linux-gnu-ld linker.
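Before moving on, it may be worth checking that both tools actually landed in the prefix. A small sketch under the assumption that the install used the paths named above; e2k_check_tools is a hypothetical helper, not part of binutils:

```shell
# Hypothetical helper: report which of the two expected tools are
# missing (or not executable) under a prefix. Prints one line per
# missing tool and returns non-zero if anything is missing.
e2k_check_tools() {
    prefix=$1
    missing=0
    for tool in as ld; do
        path="$prefix/bin/e2k-mcst-linux-gnu-$tool"
        if [ ! -x "$path" ]; then
            echo "missing: $path"
            missing=1
        fi
    done
    return "$missing"
}

# Example call:
#   e2k_check_tools "$E2KDIR" && echo "binutils look fine"
```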
Now we need an emulator to run compiled binaries on:
cd $E2KDIR
git clone -b e2k-bsd-user-blitz https://git.mentality.rip/OpenE2K/qemu-e2k.git
cd qemu-e2k
mkdir build
cd build
../configure --extra-ldflags=-L/usr/local/lib --extra-cflags=-I/usr/local/include --extra-cflags=-Werror=implicit-function-declaration --enable-debug --enable-debug-info --disable-docs --disable-capstone --static --target-list=e2k-bsd-user
ninja
After a successful build we'll get the qemu-e2k executable, which is capable of running and debugging Elbrus ELF binaries.
Assembler “Hello World”
It is now possible to play with E2K assembly language. Let's compile and run a simple “Hello World” program:
.section ".data"
$hello_msg:
.ascii "Hello World\n\000"
.section ".text"
.global _start
_start:
{
sdisp %ctpr1, 0x3
addd, 0 0x0, 12, %b[3]
addd, 2 0x0, [ _f64, _lts1 $hello_msg ], %b[2]
addd, 1 0x0, 0x1, %b[1]
addd, 3 0x0, 0x4, %b[0]
}
{
call %ctpr1, wbs = 0x4
}
{
sdisp %ctpr2, 0x3
addd, 0 0x0, 0x0, %b[1]
addd, 1 0x0, 0x1, %b[0]
}
{
call %ctpr2, wbs = 0x4
}
# qemu-e2k has to be run from its build dir
cd $E2KDIR/qemu-e2k/build/
$E2KDIR/bin/e2k-mcst-linux-gnu-as -mcpu=elbrus-v3 -o hello.o hello.s
$E2KDIR/bin/e2k-mcst-linux-gnu-ld -o hello hello.o
./qemu-e2k ./hello
Hello World
Here is a short description of what's going on in this program. Because E2K is a VLIW architecture, the minimal computational unit in its assembly language is not a single instruction but a block of instructions wrapped in { and } symbols, called a wide command.
The first wide command in our program consists of sdisp and addd instructions. The former is used to prepare a special call, a system one in our case. Up to 3 calls can be prepared simultaneously by using the %ctpr1, %ctpr2 and %ctpr3 registers. The second operand of this instruction is the type of the call being made. We are doing a system call and using 0x3 for that.
The addd instruction performs addition, but can also be used like the x86 mov instruction by adding zero. In our case it is used to fill the %b[] registers with arguments for the syscall:
- The %b[0] register is filled with the value 0x4, which is the ID of the write() syscall. You can find syscall numbers in the sys/kern/syscalls.master file.
- The %b[1] register is passed 1, the descriptor for the stdout I/O stream.
- The %b[2] register is filled with the address of the string constant.
- The last %b[3] register contains the string's length.
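The syscall numbers can also be looked up mechanically. A rough sketch, assuming a syscalls.master layout in which every entry starts with its number at the beginning of a line and the C prototype follows on the same or a later line; syscall_number is a hypothetical helper, and the path in the usage line assumes the FreeBSD source tree lives in /usr/src:

```shell
# Hypothetical helper: print the number of the first syscall whose
# prototype contains " NAME(" in the given syscalls.master-style file.
# Assumption: entries begin with "<number><space or tab>".
syscall_number() {
    awk -v name="$1" '
        # Remember the number of the entry we are currently inside.
        /^[0-9]+[ \t]/ { nr = $1 }
        # The prototype line names the syscall right before "(".
        index($0, " " name "(") { print nr; exit }
    ' "$2"
}

# Example call:
#   syscall_number write /usr/src/sys/kern/syscalls.master
```

The match is deliberately naive: a prototype whose return type is a pointer (no space before the name) or an entry that uses a differently named implementation function would be missed, so treat the result as a hint and check the file itself.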
Integer values right after the instruction name define the execution unit that will be in charge of running it. The programmer can actually omit these and let the assembler distribute instructions between units. Some units can execute instructions that other units cannot. Detailed information on this matter can be mined from this document.
The second wide command consists only of the call instruction, which actually performs the prepared call. It takes not the callee address but one of the %ctprX registers. The wbs = 0x4 parameter specifies how many registers of the register window should be shifted before entering the callee. The register window mechanism is quite a complex topic, so I'll omit any further details here.
The last two wide commands perform the exit() syscall in the same way.
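One way to see whether the argument of exit() took effect is to look at the exit status of the emulator process. This is only a sketch: it assumes that qemu-e2k passes the guest's exit status through as its own, which is not verified here, and report_status is a hypothetical helper name:

```shell
# Hypothetical helper: run a command, print its exit status and
# return that same status to the caller.
report_status() {
    "$@"
    status=$?
    printf 'exit status: %d\n' "$status"
    return "$status"
}

# Example call (from the qemu-e2k build dir):
#   report_status ./qemu-e2k ./hello
```

Changing the 0x0 that is loaded into %b[1] before the second call and re-running should then change the reported status, if the assumption above holds.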
Compiling C
The official Elbrus C/C++ compiler, called lcc, is closed source, so to compile C for E2K we’d need something else, for example the Elbrus fork of the Little C Compiler. Follow the steps from its README to build it. Another roadblock to playing with C is the absence of the FreeBSD CSU, a part of libc that initializes the C runtime. This means that we’d have to write the entry point and system call code ourselves:
.text
.global syscall1
syscall1:
{
setwd wsz=0x8, nfx=1
setbn rbs=0x4, rsz=0x3, rcur=0x0
sdisp %ctpr1, 0x3
}
{
addd,0,sm 0, %r0, %b[0]
addd,1,sm 0, %r1, %b[1]
addd,2,sm 0, 0, %b[2]
addd,3,sm 0, 0, %b[3]
addd,4,sm 0, 0, %b[4]
addd,5,sm 0, 0, %b[5]
}
{
nop 2
addd,0,sm 0, 0, %b[6]
addd,1,sm 0, 0, %b[7]
}
call %ctpr1, wbs=0x4
{
nop 5
return %ctpr3
}
ct %ctpr3
.text
.global syscall3
syscall3:
{
setwd wsz=0x8, nfx=1
setbn rbs=0x4, rsz=0x3, rcur=0x0
sdisp %ctpr1, 0x3
}
{
addd,0,sm 0, %r0, %b[0]
addd,1,sm 0, %r1, %b[1]
addd,2,sm 0, %r2, %b[2]
addd,3,sm 0, %r3, %b[3]
addd,4,sm 0, 0, %b[4]
addd,5,sm 0, 0, %b[5]
}
{
nop 2
addd,0,sm 0, 0, %b[6]
addd,1,sm 0, 0, %b[7]
}
call %ctpr1, wbs=0x4
{
nop 5
return %ctpr3
}
ct %ctpr3
The setwd/setbn incantations at the start of each procedure are another part of register window operation that is not covered in this post.
On the C side we have this:
#define _NR_exit 1
#define _NR_write 4
#define STDOUT_FILENO 1
long syscall1(long nr, long a0);
long syscall3(long nr, long a0, long a1, long a2);
long write(int fd, const void *buf, long count) {
    return syscall3(_NR_write, fd, (long) buf, count);
}
void exit(int status) {
    syscall1(_NR_exit, status);
}
const char msg[] = "Hello, world\n";
void _start(void) {
    /* sizeof(msg) also counts the trailing NUL, which should not be written */
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    exit(0);
}
Now we can assemble and link both files:
lcc-build/lcc -target=e2k/linux hello.c -S -o hello.s
$E2KDIR/bin/e2k-mcst-linux-gnu-as -mcpu=elbrus-v3 -o start.o start.s
$E2KDIR/bin/e2k-mcst-linux-gnu-as -mcpu=elbrus-v3 -o hello.o hello.s
$E2KDIR/bin/e2k-mcst-linux-gnu-ld -o hello start.o hello.o
./qemu-e2k ./hello
Hello, world
Hacking on the emulator
The QEMU E2K port on FreeBSD is a very early work in progress. To give you an idea of how raw it is, the e2k-bsd-user-blitz branch is based on:
- The e2k-v5-v7.0.0 branch, which is a WIP E2K target for QEMU.
- The blitz branch from the https://github.com/qemu-bsd-user/qemu-bsd-user repository. This is a QEMU fork that implements FreeBSD user-mode emulation. This branch is in the process of being upstreamed.
- The e2k-bsd-user branch, which contains bsd-e2k specific code.
If you want to hack on the emulator, start with e2k-bsd-user-blitz and send me your patches.