kparc.com/b/

I have been studying this code for a while and I think I have understood most of it, although in general I am quite far out of my comfort zone. I find it fascinating. It is terse code, not your typical C, that’s for sure, but beautiful in its own way. It makes use of several clever tricks that are not what I would call self-explanatory. I hope the following notes help other interested people.

Although authorship is not mentioned anywhere, several references indicate that the code in kparc.com/b/ was written by Arthur Whitney. It seems to be his style. Love it or hate it.

Introduction

The following files define an interpreter for a language called b, isomorphic to C. When compiled with the k.c file instead of b.c, you would obtain the k interpreter. Unfortunately, the k.c file is not publicly available.

In the readme file, these notes are included:

A.S kernel interface (map unmap ..)
a.c replaces
libc (fp and pf need more work)
b.c compiles b code

Additionally, the c.h file defines some macros to write C in a terser manner, and a.h is the header file that accompanies a.c. There are also some test files included in C and b (t.c and t.b) and a Makefile.

c.h

Included by a.h, this file consists of some basic definitions for a terser C language.

Single letters for common types:

typedef unsigned char C,*S,*K;
typedef int I;
typedef long J;
typedef double F;
typedef void V;

Z is added for static types. For static inline, a preceding _ is used (why not either a Z for static or define _I, _J, and so on?):

#define ZI static I
#define ZC static C
#define ZS static S
#define ZJ static J
#define ZF static F
#define ZK static K
#define ZV static V

#define _ static inline

Control statements are replaced by single letter macros:

#define R return
#define P(b,a...)   if(b)return(a);
#define N(n,a...)   {I i=0,_n=(n);while(i<_n){a;++i;}}
#define W(b...)     while((b))              //while
#define $(b,a...)   if(b){a;}else           //cond
#define C(i,a...)   case i:{a;}break;       //case
#define S(i,c,a...) switch(i){c default:a;} //switch

Signatures of built-in functions:

V*memcpy();
J
strlen(const char*);

Maximum and minimum are also defined as macros, which return a copy of the max or min value:

#define MN(a,b)    ({typeof(a)_a=(a);typeof(a)_b=(b);_a<_b?_a:_b;})
#define MX(a,b)    ({typeof(a)_a=(a);typeof(a)_b=(b);_a>_b?_a:_b;})

AB is the abort macro. It uses functions from a.h: os(s) prints the string s and a new line to stdout, exit(1) exits with an error. In case exit fails, (V*)0L will produce a segmentation fault.

#define AB(s)      (os(s),exit(1),(V*)0L) //abort string

A.S

System calls (read, write, open, close, fstat, mmap, munmap and exit) are called using assembler. The S(n) macro defines a new global name n such that it first moves the content from rcx (4th parameter in function calls) to r10 (4th parameter in syscalls), then sets rax to the desired syscall number (defined in unistd_64.h), makes the syscall, and returns:

#include<asm/unistd_64.h>
#define S(n) .globl n;n:;mov %rcx,%r10;mov $__NR_##n,%rax;syscall;ret;
S(read)S(write)S(open)S(close)S(fstat)S(mmap)S(munmap)S(
exit)

(I find it slightly surprising that system calls numbers are taken from the unistd_64.h header file instead of given as integers. I guess it probably is done so to avoid problems with changes between versions, but this is one of the few concessions to defined constants).

a.h

Header file for C library (a.c).

Basically, all values (chars, integers and floats) are of type K (which is a char*). The type of the K value is determined in basis to the following constants:

#define KC 1L // char
#define KI 3L // int
#define KF 6L // float
#define KS 7L // symbol (char)

Additionally, NL represents a new (not assigned) value, and QQ represents an error string. X2 and X3 are only used in commented code (fixed arrays are used instead in this version of the code):

#define NL (K)(X1<<48)
#define X1 11L
#define X2 12L
#define X3 13L
#define QQ 15L

There are two main kinds of values: unique values and array values.

Unique values are encoded in a single K with the actual type in the first 16 bits (a pointer only uses the last 48 bits), except floats. Floating point numbers in 64 bits use the first 12 bits for the sign and the exponent. The exponent will only be zero for negative zeros and subnormal numbers, but these are not supported, since DAZ (denormal-as-zero) and FTZ (flush to zero) are set at init time (see _start, below).

Arrays are implemented by adding 8 extra bytes with metadata before the actual array elements: memory (bucket), references, type, unique type, and 4 bytes for array size.

Extern definitions (public interface to C library in a.c):

extern J nt[],ip();extern K G[],r1(),tn(),j2(),k0(),sS(),enm(),o();V exit(),w2(),r0();S pi(),pf(),px();F fp();

Some other functions are defined as static inlined functions:

The X function replaces the value pointed to by k with the value in y and returns an empty K value (NL). It decreases the reference counter of the value pointed to by k (with r0, see below), so memory will become available if this was the only reference:

_ K X(K*k,K y){R r0(*k),*k=y,NL;}

Output functions print to stdout (using the w2 function from a.c) chars (given as integers), newlines, strings (with an added newline), integers, floats and integers in hexadecimal (the last 3 functions use the print functions pi, pf and px, defined in a.c). All these functions (except nl, to print newlines) return the printed value (nb: chars are given and returned as integers):

_ I oc(I i){R w2((S)&i),i;}

_ V nl(){w2("\n");}

_ S os(S s){R w2(s),nl(),s;}

_ J oi(J j){R os(pi(j)),j;}

_ F of(F f){R os(pf(f)),f;}

_ J ox(J j){R os(px(j)),j;}

Scan functions to find, in the string s, a character c and return the pointer, or zero if not found. The function scn limits the search to a number of characters n, and returns this number if c is not found, or the index:

_ S sc(S s,I c){W(*s-c)P(!*s++,0)R s;}

_ I scn(S s,I c,I n){N(n,P(c==s[i],i))R n;}

Next, there are several functions to declare arrays and unique values.

The P1 to P3 functions define default parameters (x, y, z), but it looks like they have been replaced by a fixed array (see struct/fixedarray experiment in a.c):

//_ K P1(J x){R(K)(X1<<48|x);}
//_ K P2(J x){R(K)(X2<<48|x);}
//_ K P3(J x){R(K)(X3<<48|x);}

Array values are declared using tn(t,n), from a.c, which allocates memory for n elements of the type t. The 0 type is used for pointers. An array of pointers is equivalent to a list of arbitrary values:

_ K kK(I n){R tn(0,n);}
_ K
kC(I n){R tn(KC,n);}
_ K
kI(I n){R tn(KI,n);}
_ K
kS(I n){R tn(KS,n);}

These functions create a new char array from a string, either with n chars (pn) or zero terminated (kp):

_ K pn(S s,I n){R memcpy(kC(n),s,n);}
_ K
kp(S s){R pn(s,strlen(s));}

More functions to work with character arrays: the functions c0, c1, c2 and c3 create char arrays of 0, 1, 2 and 3 characters, respectively; the function jc appends a character c to the char array in x using the j2 function (from a.c), which joins two arrays:

_ K c0(){R kC(0);}

_ K c1(C x){K r=kC(1);R*r=x,r;}

_ K c2(C x,C y){K r=kC(2);R*r=x,r[1]=y,r;}

_ K c3(C x,C y,C z){K r=kC(3);R*r=x,r[1]=y,r[2]=z,r;}

_ K jc(K x,C c){R j2(x,kc(c));}

Functions to create new unique values such that the type is stored in the first 16 bits, except for floating point values:.

_ K kc(J x){R(K)(KC<<48|x);}

_ K ki(unsigned x){R(K)(KI<<48|(J)x);}

_ K kf(F f){R*(K*)&f;}

_ K ks(J x){R(K)(KS<<48|x);}

_ K qs(S s){R(K)(QQ<<48|(J)s);}

The A(x) macro will return the type of x if it is a unique value or zero for arrays. I(x) returns the value of a unique integer. Q(x) returns with x if it is an error string, and Qs(c,s) will return s as an error string if the condition c is true. The macro N1 is similar to the macro N, but does a do-while instead of a while loop (why isn’t it defined in c.h?):

#define A(x) ({J _j=(J)(x);!_j||_j>>52?KF:15&_j>>48;})
#define I(x) (I)(J)(x) //(-1UL>>16&(J)(x))
#define Q(x)        P(QQ==A(x),x)

//#define Q(c,i)      P(c,qi(i))  //error index(nyi,rank,length,type,..)
#define Qs(c,s)     P(c,qs(s))  //error string
#define N1(n,a...)  {I i=0,_n=(n);do{a;}while(++i<_n);}

The macros Ax and xi are shortcuts for A(x) and I(x), while xf access x as a float value (why isn’t there a F(x) macro?):

#define Ax A(x)
#define xi I(x)
#define xf *(F*)&x

Array attributes are stored before the actual array values, and therefore accessed with negative indices with the macros xm, xr, xu, xt, xn. The 8 bytes preceding the array elements store the attributes mturnnnn corresponding to memory bucket, type, return value type (for user functions), reference counter, and a 32 bits integer which indicates the size of the array:

#define xm x[-8] //mem
#define xr x[-7] //ref
#define xu x[-6]
#define xt x[-5]
#define xn xI[-1]

The macros xx, xy and xz are used to access the first 3 values in an array of K values, while xI, xF and xK give access to x as an array of integers, floats and K values, respectively (xC is not needed because K values are already defined as a char*). The macros Xc, Xi and Xx allow to access the i-th element of an array of the C, I or K type. Finally, X0(e) returns a copy of e after decreasing the reference counter of x:

#define xx xK[0]
#define xy xK[1]
#define xz xK[2]
#define xI ((I*)x)
#define xF ((F*)x)
#define xK ((K*)x)
#define Xc  x[i]
#define Xi xI[i]
#define Xx xK[i]
#define X0(e)  ({typeof(e)_e=(e);r0(x);_e;})

In addition to the above macros that work for the x variables, they are defined for the variables y, z and r. The variables x, y and z usually correspond to function parameters, while r is the return value:

#define Ay A(y)
#define yi I(y)
#define yu y[-6]
#define yt y[-5]
#define yn yI[-1]
#define yx yK[0]
#define yy yK[1]
#define yI ((I*)y)
#define yK ((K*)y)
#define Yc  y[i]
#define Yi yI[i]
#define Yx yK[i]
#define Y0(e)  ({typeof(e)_e=(e);r0(y);_e;})
#define Az A(z)
#define zi I(z)
#define zu z[-6]
#define zt z[-5]
#define zn zI[-1]
#define zx zK[0]
#define zy zK[1]
#define zI ((I*)z)
#define zF ((F*)z)
#define zK ((K*)z)
#define Zc  z[i]
#define Zi zI[i]
#define Zx zK[i]
#define Z0(e)  ({typeof(e)_e=(e);r0(z);_e;})
#define Ar T(r)
#define ri I(r)
#define rr r[-7]
#define rt r[-5]
#define rn rI[-1]
#define rI ((I*)r)
#define rK ((K*)r)
#define rx rK[0]
#define ry rK[1]
#define rz rK[2]
#define Rc  r[i]
#define Ri rI[i]
#define Rx rK[i]
#define R0(e)  ({typeof(e)_e=(e);r0(r);_e;})

In the following macros for function definitions, the letter indicates the return type, with a preceding Z for static functions, and the final number from 1 to 3 indicates the number of arguments:

#define ZV1(f) ZV f(K x)
#define ZS1(f) ZK f(S s)
#define ZK1(f) ZK f(K x)
#define ZK2(f) ZK f(K x,K y)
#define ZK3(f) ZK f(K x,K y,K z)
#define V1(f)   V f(K x)
#define I1(f)   I f(K x)
#define K1(f)   K f(K x)
#define K2(f)   K f(K x,K y)
#define K3(f)   K f(K x,K y,K z)

The k1, k2 and k3 functions return an array of K with 1, 2 or 3 values:

_ K1(k1){K r=kK(1);R rx=x,r;}
_ K2(k2){K r=kK(
2);R rx=x,ry=y,r;}
_ K3(k3){K r=kK(
3);R rx=x,ry=y,rz=z,r;}

These functions join several elements in a single array (j2, defined in a.c, joins two arrays):

_ K3(j3){R j2(j2(x,y),z);}      // join 3 K values (x,y,z) in K array
_ K2(jk){R
j2(x,k1(y));}        // append y K value to x
_ K
cj(C c,K y){R j2(c1(c),y);} // prepend c char to y

a.c

This is the C library. It includes basic memory management and I/O functions, as well as functions to help building interpreters.

Function calls

Signatures of the functions defined in A.S to make system calls:

J read(),write();I open(),close(),fstat(),munmap();S mmap();V exit();

Memory allocation and file I/O

Memory is allocated using mmap. The function ma allocates n bytes of memory if d is zero, or maps memory for the file with file descriptor d. The initial memory address used is 0x70000 (I think this is a partially arbitrary choice, just large enough to leave room for the stack in lower addresses). This address will be rounded to the nearest memory page boundary, and it is increased after each allocation (nb: it is kept in a static variable):

ZS ma(I d,J n){ZJ p=0x700000;p+=d?0:n;R mmap(
   D?
0:p-n,  // memory address
   n,        
// number of bytes
   
0x7,      // permissions (rwx)
   d?
2:0x22, // flags: 2 for private changes, 0x20 for anonymous (not files)
   d?d:
-1,   // file descriptor (-1 for anonymous)
   
0         // offset is zero
)
;}

Files are also opened using mmap. The string s contains a file name, and the number of bytes (the file size given by fstat) will be stored in n:

ZS mf(S s,J*n){
   J b[
18];       // to store results of fstat
   I d=open(s,
0); // open file
   Qs(
0>d,s)      // return with error string if open failed
   R
fstat(d,b),  // call fstat
     s=(*n=b[
6])?ma(d,*n):s, // allocate memory if size (b[6]) is > 0, else
                             
// will return string with name (n will be zero)
     close(d),    
// close file
     s;          
// return allocated memory (or string with name)
}

The function w2 writes a zero-terminated string to stdout, while r2 reads a maximum of 256 input characters from stdin (while also printing them to stdout) and returns the address of the zero-terminated string with these characters, stored in the static buffer b:

V w2(S s){write(2,s,strlen(s));}
ZS
r2(S s){ZC b[256];R w2(s),b[read(0,b,256)-1]=0,b;}

Formatted strings

All the format functions (named pi for integers, px for hexadecimal integers and pf for floating point values) use the static buffer b, with 24 bytes, to write characters from the end of the buffer:

ZC b[24];

The static function ng, which precedes a string with a minus sign, is used for formatting negative values:

ZS ng(S s){R*--s='-',s;}

The pi function returns a string with a formatted integer in decimal base with the help of the static function pu, which prints an unsigned integer (in decimal base) from the least significant digit at the end of the buffer until the most significant one:

ZS pu(S s,J i){J j;do*--s='0'+i-10*(j=i/10);W(i=j);R s;}
S
pi(J i){R 0>i?ng(pi(-i)):pu(b+23,i);}

Integers can also be printed in hexadecimal base using px. This function calls hh(s,c), which writes the hexadecimal representation of the byte c to the string s. The symbol to use is selected on basis to the property that, in ASCII, ‘W’ is defined as ‘a’-10:

ZS hh(S s,C c){N(2,C a=i?c&15:c>>4;s[i]="0W"[9<a]+a)R s;}
S
px(J j){S s=b+23;unsigned long k=j;do hh(s-=2,k);W(k>>=8);R s;}

Floating point values are formatted with pf:

S pf(F f){
   P(
0>f,ng(pf(-f))) // negative values
   P(!f,
"0.0")       // fixed format for zero value
   
// n is the exponent
   I n=
6; W(f<1e6)--n,f*=10; W(f>=1e7)++n,f/=10;
   
// print (integer) exponent if different of zero
   S p=n?p=pi(n),*--p=
'e',p:b+23;
   
// only one non-decimal digit (n is the mantissa now)
   n=f+
.5;W(99<n&&!(n%10))n/=10;
   
// move non-decimal digit to the left and add the dot
   R p=pu(p,n),p[
-1]=*p,*p--='.',p;
}

Parsing functions read values from the string passed as the first argument with a maximum of n characters. The function ip reads integers and fp reads floating point values.

J ip(S p,I n){P('-'==*p,-ip(p+1,n-1))J j=0;N(n,j=10*j+p[i]-'0')R j;}
ZF
x(I n){F e=1;N(n,e*=10)R e;}  // n-th power of 10
F
fp(S p,I n){
   P(
'-'==*p,-fp(p+1,n-1)) // negative values
   I l=scn(p,
'e',n), // position of e, or last character if not found
     m=scn(p,
'.',l), // position of dot, or l if not found
     f=l<n?ip(p+l+
1,n-l-1):0, // exponent (zero if e not found)
     j=ip(p,m),
// non-decimal part
     k;
   m+=m<l,
// increase m if there was a dot to point to decimal digits
   k=ip(p+m,MN(
9,l-=m)); // decimal digits
   k+=j*x(l),
// integer number with all digits
   f-=l;
// adjust exponent depending on number of decimal digits
   
// make it a floating point value multiplying or dividing by
   
// the power of 10 determined by the exponent (f)
   R
0>f?k/x(-f):k*x(f);
}

Arrays memory management

Unique values are stored in K values, with the type in the first 2 bytes. Array values are stored in memory from a memory pool structured as a hash table, where the hash is related to the approximate array size. The address of the buckets is stored in the static array M. The total amount of assigned memory bytes (which is usually lower than the allocated one) is stored in the workspace variable W. I am not sure why the initial value is -32 (may it be because of the 32 bytes need for a fixed array?):

ZK M[31];

ZJ W=-32;

The M[i] value points to a memory region with space available for 16<<i bytes. If more than one such region is available, the addressed memory will contain a pointer to it, in a linked list.

The function m1(n) returns a pointer from the memory pool with available space for n bytes. If there is a region available in a bucket, it is returned, and the bucket set to zero. If there is not, but there is memory in a larger bucket, it is returned from there and the remaining memory is assigned to a lower size bucket. If no memory is available, it is allocated with ma:

ZK m1(J n){
   K x,r;
   
// i is the bucket number (set only once)
   I i=
60-__builtin_clzl(n+7),j;
   
// If there is a memory region available, return it and
   
// set bucket address to the next one in the linked list
   P(x=M[i],M[i]=xx,x)
   
// Search for a memory region in a bucket of larger size
   
// (in this case, M[j] will be != 0), or allocate the necessary
   
// memory, with a minimum of 4 Mb (16L<<18)
   
// 8 additional bytes are needed at the beginning for the attributes
   j=i;W(!(x=++j<
31?M[j]:8+ma(0,16L<<(j=MX(18,i)))));
   xm=i,    
// set memory bucket attribute
   M[j]=xx,
// set bucket address to next region in linked list
   r=x;    
// r is the return value
   
// Adjust larger buckets such that:
   W(i<j)
// j-i are the memory regions that need to be assigned to buckets
       x+=
16L<<xm,          // the next available memory region x
       M[*(J*)(x
-8)=i++]=x, // is assigned to the corresponding bucket
                           
// with 8 preceding bytes for attributes
       xx=
0;                // and no more elements in linked list
   R r;
}

Arrays of type t and size n are allocated using the function tn(t,n). The size for each type is stored in the nt array:

// types are chijefs:
// pointer (list), char, hex byte?, int, long, e?, float, symbol (char as int)
J nt[]={
8,1,2,4,8,4,8,4};
K
tn(I t,I n){
   
// allocate memory (minimum for 1 value) for every non-string type
   K x=m1(MX(
1,n)*nt[KS<t?0:t]);
   R
       W+=
16L<<xm,     // add assigned memory to workspace
       xu=
0,xt=t,xn=n, // set metadata (not a function, type, length)
       x;              
// return memory address of first element
}

The static functions l1 and l0 respectively allocate and free fixed arrays for three pointer values and I think l2 should return an array of K strings with their values but has not been implemented yet. These fixed arrays have type 8 (there should probably be a #define KL 8L in a.h, but a comment indicates that these fixed arrays are an experiment). When a fixed array is freed with l0, its y and z elements are freed calling l0 recursively (I think the l0(xz) is unnecessary, it will be called from l0(xy)) and its memory is returned to the pool, making it the first element in the corresponding bucket and referencing the previous one as a linked list:

ZK1(l2){R kp("[]");} // I think this is a TODO
ZV1(l0){
if((J)xy)l0(xy),l0(xz);xx=M[xm],M[xm]=x;}
K3(l1){K r=m1(
24);R rt=8,rn=3,rx=x,ry=y,rz=z,r;}

More generally, arrays increase and decrease their reference counters with r1 and r0, respectively. If there is an overflow, r1 will abort (Is this expected? Why are there no similar checks anywhere else?). When r0 is called and there are no remaining references, it is returned to the pool:

K1(r1){
   P(Ax,x)          
// just return for unique values
   R++xr?x:AB(
"r1"); // increase ref counter or abort on overflow
}
V1(r0){
   
if(Ax||!x)R;        // just return for unique values
   
if(8==xt){l0(x);R;} // call l0 for fixed array types
   
if(xr){--xr;R;}     // if there are remaining refs after decreasing, return
   
if(!xt||KS<xt)N1(xn,r0(Xx))  // for arrays, call r0 for each value
   W-=
16L<<xm,           // subtract from workspace
       xx=M[xm],M[xm]=x;
// return memory to pool
}

In practice, what happens is the following: the first time m1 is called, it will allocate enough space for n bytes, but in fact allocating a memory region whose size is a power of 2 multiplied by 16 bytes (with a minimum of 4 Mb). The not used memory is assigned to different buckets in the M variable, such that M[0] points to a 16 bytes memory region, M[1] points to a 32 bytes memory region, M[2] to a 64 bytes memory region, and so on. Next time m1 is called it will return an address from the smallest possible bucket. If the corresponding bucket does not contain a pointer (is zero), it will take the pointer from the next bucket, divide the region in two, and return one while assigning the other to M. Memory is never really deallocated, it is just returned to the pool. This is done storing pointers to regions of the same size in a linked list, such that a pointer is stored in the memory region pointed to by the address in M, and so on. Notice that, although a memory region can be divided in smaller ones, they are never merged back.

The additional function j2(x,y) is used to merge two arrays in one, copying the elements of y at the end of x or copying the elements of both in a new array if there is not available space or there is some reference to x. For this, it uses the static function xiy, which copies values from an array in another, assuming there is enough memory available:

ZK xiy(K x,I i,K y){
   
// copy contents of y after the i-th element of x (NX=size of x element)
   
memcpy(x+NX*i,y,NX*yn);
   
if(!yt)N(yn,r1(Yx)) // increase references to elements in y (see below)
   R
Y0(x); // decrease reference to y and return x (this will also decrease
           
// the references to each of the elements of y, that is the reason
           
// they need to be increased before)
}
K2(j2){
   
// m is the old size, n is the new one
   I m=xn,n=m+(Ay?
1:yn); // increase 1 if y is a unique value
   P(!m&&!xt,X0(y))
// decrease ref to x and return y if x is empty
   
// if no available space or x is referenced, copy it to a new array
   $(xr||
8+NX*n>16L<<xm,x=xiy(tn(xt,n),0,x))xu=0,xn=n;
   
// copy contents of y to the end of x
   Ay?
memcpy(x+NX*m,&y,NX):xiy(x,m,y);
   R x;
}

REPL functions

The file a.c also includes several functions that can be used to build an interpreter. For this, it makes use of the ev() and ps() functions, defined elsewhere (ie. in the the b.c or k.c file):

K ev(),ps();

There is space available for 26 global variables (including functions), stored in the G array. Each of them will be named by a single letter from a to z. There is also a K0 value which is not used by b.c, although it is initialized (in the km function, see below):

K G[26]; ZK K0;

The interpreter program will start with the _start function, avoiding the fatter (!) main function. This function first writes a 32 bit integer before the stack pointer (at address -4(%rsp))  and calls ldmxcsr to set the 32 bit register MXCSR, which contains flags for control and status information regarding SSE instructions. Then, it moves the pointer at 8(%rsp), which corresponds to argv, to %rdi (first function argument) and jump to km.

V _start(){

    asm("movl $0x9fc0,-4(%rsp);ldmxcsr -4(%rsp);lea 8(%rsp),%rdi;jmp km");

}

The flags of the MXCSR register are the following:

FZ        bit 15        Flush To Zero
RN        bits 13 and 14 are 0        Round To Nearest
PM        bit 12        Precision Mask
UM        bit 11        Underflow Mask
OM        bit 10        Overflow Mask
ZM        bit 9        Divide By Zero Mask
DM        bit 8        Denormal Mask
IM        bit 7        Invalid Operation Mask
DAZ        bit 6        Denormals Are Zero

The flags FZ (also called FTZ) and DAZ make possible to code unique values of other types as a special case of floating point values, since they imply, in practice, that a number with a zero exponential part is not a valid floating point number.

Inside km, G values are initialized to be new values (NL), and K0 to be a zero length char array (this value is never used later). Then, the first argument of the program (a is argv[0], so ++a is argv[1]), if given, is loaded with the ld function. Finally, the program enters in an infinite loop reading user input with r2, evaluating the obtained string with es and printing the result with pr:

V km(S*a){
   N(
26,G[i]=NL)*(K*)   // init globals to empty values

    (K0=kK(0))=c0();     // init K0 to empty char array
   
if(*++a)pr(ld(*a));  // load file in first argument
   W(
1)pr(es(r2(" "))); // REPL
}

The loading function ld reads a file line by line, processing its input with pr and es:

ZS1(ld){
   J n;          
// file size
   Q(s=mf(s,&n))
// mmap file or return file name as error string
   S t=s,u;      
// t=first char in line, u=first char in next line
   I a,d=
0;
   
// parse until EOF or an exit command is found
   W(t<s+n&&d>=
0){
       
// finish line with a null character and set u to the first character
       
// of next line. a will be 1 for a single / character (may be a comment)
       
// and -1 for a single \ character (exit command)
       u=sc(t,
10),*u++=0,a=t[1]?0:(*t=='/')-(*t=='\\');
       
// if not a comment (starts with //) and not exiting, evaluate string

        // and print result, possibly returning an error string
       
if(!d&&!a&&'/'-*t)Q(pr(es(t)))
       d+=a,t=u;
// next line
   }
   
if(n)munmap(s,n); // unmmap non-zero file (zero files are not mmapped)
   R NL;
// return empty value
}

Strings (zero-terminated) are evaluated, both inside ld and the infinite loop in km, by es. This function will call different functions for system commands (\\ or \ to exit, \w to return workspace size, \v for global variables, \f for functions, \t to time a command and \l to load a file) or ev (defined in b.c or k.c) for everything else:

ZS1(es){
   K x;
   
// return empty value from empty line, else evaluate with ev
   
// everything that is not a system command (does not start by \).
   
// If there is a parsing error, ps will return an error string
   
// and will be returned, else it will return NL, I think
   P(
'\\'-*s,!*s?NL:(x=ps(s))&&NL-x?X0(ev(x)):x)
   
// exit with \ or \\ command
   
if(!*++s||'\\'==*s)exit(0);
   R !s[
1]? // single system commands with no arguments
       
'w'==*s?ki(W):           // return workspace size as an integer K value
       sc(
"vf",*s)?vf('f'==*s): // run vf function for \v or \f
       qs(s)                    
// return error string with command
   :
// system commands with arguments
       
't'==*s?tm(s+1):         // time command just in next character
       
'l'==*s?ld(s+2):         // load file (after a space)
       qs(s);                  
// return error string for everything else
}

The commands \v and \f call the vf function (with argument 1 for functions and zero otherwise). It returns an array with the numbers of the variables used with type string (char*). This will be understood by the pr function to format it as a list of variables with single letter names:

ZK vf(I f){
   K r=kS(
0); // strings array for output, initially empty
   N(
26, // 26 is the number of global variables
       K x=G[i];
       
if(NL-x) // skip empty values
           
if(f^(Ax||xt)) // functions do not have a type
               r=j2(r,ks(i))
// append variable number with string type
   )
   R r;
}

Command \t times the following command calling tm. If it is called as \t:n, the function is repeated n times. The static function ms, called from tm, returns the time from the last reset of the tsc counter in milliseconds, assuming that each cpu cycle lasts 0.58e-6 ms. The number of cycles is got calling the asm instruction rdtsc, which stores the 64 bits number in two 32 bits one (in this case, a and d):

ZF ms(){
   J a,d;
   
asm volatile("rdtsc":"=a"(a),"=d"(d));
   R((d<<32)+a)*.58e-6;
}

ZS1(tm){
   S t=sc(s,
' '); // find space
   Qs(!t,s)      
// return error if space not found
   *t=
0;          // replace space with null character
   
// n is the number of times the command to time is run
   I n=
':'-*s++?1:           // only once if : not used
       
10u>*s-'0'?ip(s,t-s): // parse integer number
       (J)es(s);            
// evaluate an expression
   
// parse string if system command and assign result to x (zero otherwise)
   K x=
'\\'-*++t?ps(t):0,r;
   F a=ms();
// get initial "time"
   
// run command n times, ending returning an error if it happens.
   
// It is assumed that every system command is \l, but it is not checked.
   
// The returned value ref counter is decreased at the end of each iteration
   N(n,Q(r=x?ev(x):ld(t+
3))r0(r))
   
if(x)r0(x); // decrease ref counter again (just in case, I think)
   R
ki(ms()-a); // return time difference
}

The result of es when loading files or in the REPL is printed by pr, which does not do much more than calling the output function o, which will call, for non-errors, the more complex se function:

ZK1(pr){if(NL-x)r0(o(x));R x;} // print non-empty values
K1(o){
   K y=QQ-Ax?se(x):
// if not an error, store string representation of x in y
       
// else, y is "ERROR:" and the returned string
       
// (removing the error type from first 16 bytes of the pointer)
       j2(kp(
"ERROR: "),kp((S)(-1UL>>16&(J)x)));
   R
Y0(write(2,y,yn)),nl(),x; // print it to stderr
}
ZK1(se){
   
// function c returns a quoted string, or its characters in hexadecimal
   
// if it contains non-printable characters (which means it is a function),

    // formatted as TxX, where T is the type number of the return value and X

    // is its address in hexadecimal
   
// (why is this function inside se instead of being a static one?)
   K1(c){
       N(xn,
if(94u<Xc-32){ // non-printable characters (not from 32 to 126)
           K r=kC(
2*xn);           // each char requires 2 hex digits
           N(xn,hh(r+
2*i,Xc))      // fill r with x representations
           R j2(c2(
'0'+xu,'x'),r); // format
       })
       R
cj('"',jc(r1(x),'"'));    // return quoted string (or list)
                                   
// increasing the ref counter of x
   }
   P(Ax,
// Unique values
       KS==Ax?c2(
'`','a'+xi):  // KS unique chars preceded by a back quote
       KC==Ax?X0(c(x=c1(xi))):
// quoted single char
       kp(                
// string as K value for
           KI==Ax?pi(xi):      
// integer values
           KF==Ax?pf(xf):      
// floating point values
           (S)
"+"+!xi          // operators
       )
   )
   
// from here, x is an array
   P(
8==xt,l2(x)) // return "[]" for fixed array (a TODO, I suppose)
   P(
1==xn,cj(',',se(Li(x,0)))) // preceed single value arrays by a comma
   I t=KS<xt?
0:xt; // t is the type of x
   P(KC==t,c(x))
// an array of chars is returned as a quoted string
   P(!xn,c2(
"!("[!t],")  0   `"[t])) // empty values: "()" for list, "!0" for
                                     
// integers, "!`" for KS values, and "! "
                                     
// for everything else
   x=sS(
";      "[t],e1(se,x)); // separate elements formatted as strings with
                               
// semicolons for lists, with spaces otherwise
   R!t?cj(
'(',jc(x,')')):x; // enclose lists in parens
}

These additional functions are called from se:

// Return item in position i as unique element
ZK
Li(K x,I i){R
   !xt||KS<xt?Xx:
// list element
   KC==xt?kc(Xc):
// char
   KI==xt?ki(Xi):
// integer
   ks(Xi);        
// string
}
// Evaluate f for every element of x
ZK
e1(K(*f)(),K x){
   K r=kK(xn);
// result array, of same length as x
   N1(rn,Rx=f?f(Li(x,i)):Li(x,i))
// e1 is only called with an f, so this check
                                 
// (to just return the array elements if f is
                                 
// null) should not be necessary
   R r;
}
// Return string with elements of x separated by c
K
sS(I c,K x){
   I n=c?xn:
0; // will double size with separators
   N(xn,K y=Xx;n+=yn)
// add length of each element
   K r=kC(n);
// make array of total length needed
   
if(c)--rn; // ignore the last separator
             
// (although it is added and memory for it is allocated)
   
// form return string r (s is the current position in the N loop)
   S s=r;
   N(xn,K y=Xx;s=
memcpy(s,y,yn)+yn;if(c)*s++=c)
   
// return string decreasing the ref counter of x
   R
X0(r);
}

Finally, there is an enm function, which creates an array of integers from 0 to xi:

K1(enm){K r=kI(xi);N(rn,Ri=i)R r;}

The a.c file also includes a rand static function to generate (semi)random numbers, but it is not used (maybe a missing \r system command? It may be easily implemented multiplying by the the result of ms and dividing to have a float between 0 and 1). The digits of pi are used as seed, together with the value 4294957665 (recommended in numerical recipes):

ZI rand(){ZJ j0=-314159;R j0=4294957665L*j0+(j0>>32);}

b.c

That was the easy part. Now it gets complicated.

The b interpreter (I assume the k one is not too different) works generating strings with machine code and executing them casting to a function. It is basically a compiler. Of course, no constants are used, just opcodes in hexadecimal. There are also no comments, except for this gem:

// :+-*% ^&|<=>  x64 JJ Jj o2 cc tst RET cll psh pop acdbsbsd89..  o[m[s|d]] c3 eb+1 e8+4 e9+4 [f2/66/4*][0f] 5* 7*+1 b*+4 0f8*+4  03 23 2b 3b (6b 83) 89 8b ..

This means:

  • :+-*% ^&|<=> dyadic functions
  • JJ Jj o2 cc tst RET cll psh pop: x64 specific functions
  • acdbsbsd89…: registers ordered by address (rax is 0, rcx 1, rdx 2, …)
  • o[m[s|d]]
  • c3: return
  • eb+1: JMP rel8
  • e8+4: CALL rel32
  • e9+4: JMP rel32
  • ...

Some useful references:

There are two functions: ps parse a string to generate an array of K values that ev executes, casting to a function, the type number in yu indicates the resulting type of the function (a generic K value, a float value or an integer):

// Evaluate a function, the x argument must be a K* array,
// where the second argument is the function (a character string)
// and the return type is its u attribute
K1(ev){
   x=xy,
   x=
       KS<xu?((K(*)())x)():
       KF==xu?kf(((F(*)())x)()):
       ki(((J(*)())x)());
   R x;
}

The ps function is where all the fun is. It includes several embedded functions:

K ps(S s){
   K z;
   I a=
0,
     
// addresses of registers in function calling convention order
     A[]={
0,7,6,2,1,8,9,10,11,3,12,13,14,15,5,4},
     B=
5,
     RET=
0xc3, // return opcode
     
// jumps and conditional jumps
     
//    jmp  jb   jz   jnbe jmp32 jnb  jnz  jbe   jnb32
     JJ[]={
0xeb,0x72,0x74,0x77,0xe9, 0x73,0x75,0x76, 0x0f83};
   
// convert octal number abc to integer,

    // used to fill mod(2),reg(3),r/m(3) byte
   I
m(I a,I b,I c){R 64*a+8*(7&b)+(7&c);}
   
// if required, adds additional opcode for access to new 8 bit registers
   K
rex(I r,I x,I b,K y){
       
// rex instructions:
       
//    0x41=rex.b,  0x42=rex.x,  0x43=rex.xb, 0x44=rex.r,
       
//    0x45=rex.rb, 0x46=rex.rx, 0x47=rex.rxb
       R(r=
7<A[r])+(x=7<A[x])+(b=7<A[b])?cj(0x40+4*r+2*x+b,y):y;
   }
   
// opcode o, arguments x,y (not rex: x should never be >15)
   K
h(I o,I x,I y){
       R
j2(
           
256>o?c1(o): // 1 byte opcode
           
c2(o>>8,o),  // 2 byte opcode
           
// 3 bit for mod, 3 bit for reg, 3 bit for r/m
           
// mod 3 => register addressing mode:
           
//     first arg (x) in reg
           
//     second arg (y) in r/m
           16>y?
c1(m(3,x,y)):
           
// mod 0 and r/m 5 => displacement only addressing mode:
           
//     first arg (x) in reg
           
//     second arg (y) in 32 bit address following the instruction
           
c5(m(0,x,5),y)
       );
   }
   
// opcode o, arguments x,y (maybe rex)
   K
i(I o,I x,I y){
       R
rex(
           
16>x?x:0,
           
0,
           
16>y?y:0,
           h(o,
16>x?A[x]:x-16,16>y?A[y]:y)
       );
   }
   
// function call
   K
cll(I c){R c5(0xe8,c);}
   
// push from register A[x]
   K
psh(I t,I x){R rex(0,0,x,c1(0x50+(7&A[x])));}
   
// pop to register A[x]
   K
pop(I t,I x){R rex(0,0,x,c1(0x58+(7&A[x])));}
   
// condition (set byte on conditon functions: 0x0f90,...)
   K
cc(I o,I x){R j2(i(0x0f20+JJ[o],16,x),i(0x0fb6,x,x));}
   
// test (sets sz if x is not zero), abort for fp x
   K
tst(I t,I x){R KF==t?AB("tst"):i(0x85,x,x);}
   K
Jj(K x,I n){R cj(0x0f,c5(16+x[xn],n-4));}
   
// return object code to execute opcode o with arguments x and y
   
// and leave the argument, of type t, in register r
   K
o2(I t,I o,I r,I x,I y){
       K z;
       P(KF==t,
8u>y-8?AB("vex"):j2(
           
// 0xc5 is an invalid instruction, next char is ???
           c2(
0xc5,16*(8&~r)+8*(15&~x)+(5-o?3:1)),
           
// for fp (with 0f prefix): i2f is integer to float
           
//      mov  add  sub  mul  div  cmp      i2f
           h((C[]){
0x10,0x58,0x5c,0x59,0x5e,0x2e,0,0,0x2a}[o],r,y)
       ))
       K
f(I o,I x,I y){
           
// for ints:    mov  add  sub  imul     cmp  and
           R
127>y?i((I[]){0x8b,0x03,0x2b,0x0faf,0,0x3b,0x23}[o],x,y):
           rex(
0,0,x,
               
// reg indicates the operation: 05741=add,sub,cmp,and,or
               o?c3(
0x83,m(3," \0\5\7\4\1"[o],A[x]),y-128):
               c5(
0xb8+(7&A[x]),y-128) // move to register x
           );
       }
       I a=
126<y,s;
       P(
0<=o&&r==x&&(!a||3-o),
            // 4 is %, which forces arguments to be floats, when used with

            // integers, it means shift right (/)

            4-o?f(o,r,y):
           
129-y?AB("/"): // shift only by one, or abort
           
//shift one to the right
           i(
0xd1,16+7,r)
       )
       P(
0<o&&r==y,
           z=f(o,r,x),

            //           neg
           
2-o?z:j2(z,i(0xf7,16+3,r))
       )
       P((a?
3:1)<o,j2(f(0,r,x),f(o,r,y)))
       R
           s=
0<o?0:3+(o+1)/2,
           rex(
               R,a?
0:y,x,

                //         mov  mov      lea  imul
               c3(
0>o?1&o?0x8b:0x89:3-o?0x8d:0x6b,

                   m(3-o?a:3,A[r],a?A[x]:4),
                  a?(
2-o?y-128:128-y)<<s:m(s,A[y],A[x])
               )
           );
   }
   
// compare
   K
cm(I t,I x,I y){R o2(t,5,x,x,y);}
   
// convert to float
   K
cv(I x,I y){R o2(KF,8,x,x,A[y]);}
   K
sh(I t,I r){R AB("sh");}
   
// length of instruction
   I
ln(S s){
       I o=*s++,
// opcode
         h=o/
16, // first hex digit of opcode
         
// add 2 for invalid opcode, 1 for 2 byte opcodes
         p=
0xc5==o?2:0x0f==o;
       R
           
4==h?1+ln(s):     // h=4 => rex
           RET==o||
5==h?1:   // h=5 => push/pop
           *JJ==o||
7==h?2:   // h=7 => conditional jumps
           
0xe==h||0xb==h?5: // h=0xe => call, h=0xb => mov
           p&&
8==*s/16?6:    // conditional jumps

            // 3==s[p]/64 => mod == 11 => r/m is a register
           
//               add      imul
           p+(
3==s[p]/64?2+(0x83==o||0x6b==o):
             
// 5==(0xc7&s[p]) => displacement addressing mode

               //                   (mod=00, r/m=101)

               // with displacement addressing mode the instruction

               // will have one instruction byte, one for mod,reg,r/m

               // and four for the address, else instruction, mod,reg,r/m

               // and one byte for the register
             
5==(0xc7&s[p])?6:3);
   }
   
// linker: fix relative addresses
   V1(lnk){
       S s=x;
       W(s<x+xn){
           
// 4==*s/16 => *s is a REX instruction, skip it
           I n=ln(s+=
4==*s/16),
             p=
0xc5==*s?2:0x0f==*s;
           S r=s+n
-4;

            // check if instruction uses relative address argument:
           
if(

               0xe8==*s|| //function call

               (p?8-s[1]/16:4>*s/16||8==*s/16)&&5==(0xc7&s[1+p])

            )
               *(I*)r=(
                   
0xe8==*s?a-'a'==*r?x:
                           
26==*r?(S)l1:
                            ((K*)G[*r])[
1]:
                   
32>*r?(S)&zF[2+*r-16]:
                         (S)(G+*r-
'a')
               )-r
-4;
           s+=n;
       }
   }
   
// disassemble (only used for debugging, commented)
   V1(dis){
       
// print output type
       w2(px(xu)),oc(
':');
       S s=x;
       
// print code
       W(s<x+xn
-2){N(ln(s),w2(px(*s++)))oc(' ');}
       
// print D[0] and D[1]
       N(
2,w2(px(*s++)))
       nl();
   }
   
// assign return type to function value
   K
u(I u,K x){R xu=u,x;}
   
// jump
   K
jmp(I n){
       R n<
-128||n>127?
            c5(JJ[
4],0>n?n-3:n): // long jump
            c2(*JJ,n);          
// short jump
   }
   K
O2(I t,I f,I r,K x,K y){
       I i=Ay?yi:yu;
// y can be a value or a function
       R
u(r,j3(
           Ay?c0()
:y,
           x,
           
// function
           10>f?
o2(t,f,r,xu,i):
           
// comparison
           
j2(cm(t,xu,i),16>r?cc(f-9,r):c1(f-9))
       ));
   }
   K
SH(I t,K y){R u(yu,j2(y,sh(t,yu)));}
   
// set r, of type t, to zero (2 is sub, and r-r=0)
   K
ZR(I t,C r){R u(r,o2(t,2,r,r,r));}
   
// move value y with type t to register r
   
// (func 0 is mov, the first argument is an empty array)
   K
MV(I t,I r,K y){R O2(t,0,r,u(r,c0()),y);}
   
// parse numbers, returning either an integer or a fp value
   
// NB: float values as 1e6 are not allowed, they must contain a dot
   K
Na(){
       S r=s;
       
// set s to end of number (exponential form not considered)
       W(
10u>*++s-'0'||'.'==*s);
       I f=
0;
       
// check if it is floating point
       N(s-r,f|=
'.'==r[i])
       
// the two loops may actually be merged in one:
       
// W(10u>*++s-'0'||f|='.'==*s);
       
// return float or int
       R f?kf(fp(r,s-r)):ki(ip(r,s-r));
   }
   
// N is the counter variable (initially i, then j,k,l,...)
   
// D[0] are local vars (and arguments), D[1] local floats
   
// L is the D value for that variable
   
// T is the type of local variables
   C N=
8,D[2]={1,1},L[26],T[26];
   N(
26,L[i]=T[i]=0)
   
// set r to arguments list for function definitions
   S r=
'['==s[1]&&(r=sc(s,']'))&&*++r?r:0;
   I M=
0; // registers in use
   
// set a to global name for assignments and function
   
// definitions and point k to its address
   K*k=r||
':'==s[1]?a=*s,s+=2,G+a-'a':0;
   
// fill with array using enm
   
// NB: !2+2 or !(2+2) is not !4; just ints, not expressions
   P(
'!'==*s,++s,X(k,enm(ki(ip(s,strlen(s))))))
   
// return the type of a value
   I1(t){
       I a=xi-
'a';
       R !Ax?xu:                        
//function (return type)
         
126<xi?KI:                      //integer (coded in single byte)
         
26u>a&&T[a]?T[a]:               //variable
         A(x=
26u>a?G[a]:zK[2+xi-16])?Ax: //variable or argument as an
                                         
// additional element of z
         xt+
8;                           //array (element type + 4th bit)
   }
   I1(q){
       I i=xi-
'a';
       R Ax?
26u>i&&L[i]?L[i]:0:
         
// why not ':'-*x like everywhere else?
         
':'==*x?I(xy):0;
   }
   
auto K d();
   
// expression x to register r
   K
e(I r,K x){
       K y=d(r,x);
       
// just pass y for arrays and functions. Else
       
// if y is zero, either coded as a small int (128),
       
// or in the z array, set register to zero (ZR),
       
// else move value to register
       R Ay?
128==yi||32>yi&&!zK[2+yi-16]?ZR(t(x),r):MV(t(x),r,y):y;
   }
   K
b(I f,K x){
       K y=d(
16,x);
       R Ay?AB(
"b"):
         
16==yu?y[yn-1]=JJ[y[yn-1]+f*4],y:
         j3(y,tst(t(x),yu),c1(JJ[f?
2:6]));
   }
   
// transform values and other expressions into object code,
   
// except conditional expressions, and leave result in register r
   K
f(I r,K x){
       K y=e(r,x);
       
// if not a function, move value y to register r
       R r-yu?MV(t(x),r,y):y;
   }
   K
E(I r,K x){
       I i=xn
-1;
       K z=e(r,Xx),y=kK(i--);
       r=zu,Yx=z;
       W(i--)Yx=e(
0,xK[i+1]);
       R
u(r,sS(0,y));
   }
   
// transform conditional expressions $[] into object code
   K
v(I r,K x,I n){
       K y=xz,z;
       I c=!n&&!Ay&&a==*y,l=M;
       K1(h){R++s,
1<n?e(0,x):f(r,x);}
       z=h(xK[
3]),M=l,y=h(y),x=b(1,xy);
       y=j2(y,n||c?yn-=c*B,jmp(
1-n?n-xn-yn-3:zn):c1(RET));
       R
j3(jc(x,yn),y,z);
   }
   
// loop (W or N)
   K1(w){
       I i=
'N'==*x?L[N++]:0,j=0;
       V
mm(K x){
           I i;
           $(Ax,
               
if(26u>xi-'a'&&L[i=xi-'a'])M|=1<<L[i]
           )$(
':'==*x&&A(xy),
               i=I(xy)-
'a',M&=~(1<<L[i]),mm(xz)
           )N(xn,mm(sc(
"{WNC",*x)?xK[xn-1-i]:Xx))
       }
       mm(x);
       K y=xy,z=xz;
       P(!i&&!Az&&
'$'==*z,
           x=b(
1,y),
           z=v(
0,z,-xn-1),
           j3(jc(x,zn+
2),z,jmp(-xn-1-zn-2))
       )
       x=i?M|=
1<<i,
       jc(cm(
0,i,j=(j=q(y))?j:*D),JJ[1]):b(0,y),
       z=i?j2(e(
0,z),o2(0,1,i,i,129)):e(0,z);
       I n=-zn-xn
-1;
       z=j3(jmp(zn),z,n<
-128?--xn,j2(x,Jj(x,n)):jc(x,n));
       R i?--N,M&=~(
1<<i),j3(f(j,y),ZR(0,i),z):z;
   }
   K
g(I c,K x){
       K y=c0(),z,r=c0();
       I i=
0,l=a?M:0;
       W(++i<xn){
           z=Xx;
           I l=M&
1<<i,
             b=Az||
128>*z&&':'-*z||i-L[I(zy)-'a'],
             h=
2-i||Az||26>c;
           z=f(i,z);
           
if(!h)z=j3(psh(0,3),z,pop(0,3));
           
if(l)z=j2(b?psh(0,i):z,b?z:psh(0,i)),r=j2(r,pop(0,i));
           y=j2(z,y);
       }
       z=cll(c);
       W(i<
16){if(l&1<<i)z=j3(psh(0,i),z,pop(0,i));++i;}
       R
j3(y,z,r);
   }
   
// lookup c character in s and return index, or zero if not found
   I
l(S s,I c){S t=sc(s,c);R t?t-s:0;}
   
// return kind of character c
   
// NB: broken to fit into several lines, it was a single
   
//     128>c with a very long string in the original
                 
// !"#$%&'()*+,-./0123456789:;<=>?@
   I
c(I c){R 65>c?"   +$++ ()++ + +0000000000+;+++  "[c-32]:
                 
//ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`
             
97>c?"aaaaaaaaaaaaaNaaaaaaaaWaaa[+]  `"[c-32]:
                 
//abcdefghijklmnopqrstuvwxyz{|}~
             
128>c?"aaaaaaaaaaaaaaaaaaaaaaaaaa{+} "[c-32]:0;
   }
   
// function number corresponding to character i
   I
U(I i){R l(" +-*% &|  <=>",i);}
   
// to register, constant or global
   K
d(I r,K x){
       P(Ax,
           (r=q(x))?M|=
1<<r,u(r,c0()):
           x
       )
       I1(h){

            I t=T[xi-'a'];

            R 14==t?-2:13==t?-4:2*t-26;

        }
       I s=
15&r,a;
       K y,z;
       S((y=xy,c(a=*x)), // a is the function, y the first argument
           
case'N':
           C(
'W',R w(x))
           C(
'$',R u(r,v(r,x,1)))
           C(
'{',R E(r,x))
           C(
'[',R g(26,x))
           C(
'a',R T[a-'a']?O2(0,1+h(xx),s,d(0,xx),d(0,y)):g(a-'a',x))
           C(
0,R y=d(0,y),O2(t(x),U(a-128),yu,y,d(s,xz))),
           
// default
           
if(':'==a){
               P(Ay,r=L[yi-
'a'],M&=~(1<<r),f(r,xz))
               y=d(
0,yy),z=e(s,xz),x=xy;
               R r=zu,u(r,j2(z,O2(
0,h(xx),r,d(0,xx),y)));
           }
           I m,b=t(y);
           P(
3>xn, // monadic functions

                // this will eventually abort (sh not implemented)
               
'&'==a?SH(b,e(0,y)):
               
// conversion to float
               
'%'==a?y=e(0,y),u(s,j2(y,cv(s,yu))):

                // shift left, implemented as an addition

                // (a shift left is a multiplication by 2, which is the

                // same as adding it to itself)
               
'\\'==a?y=e(s,y),O2(b,1,s,u(yu,c0()),y):
               O2(
0,'#'==a?-3:     // number of elements
                   
'*'==a?1+h(y): // dereference pointer
                   
'/'==a?4:      // shift right (4 is division for fp,

                                    // but it means shr applied to ints)
                    U(a),          //
dyadic function
                  s,

                   e(s,y), // first argument is y

                   // second argument is one (kc(129)) or

                   // kc(127) for # (why?) or zero for * (why?)

                   kc(128+('#'==a?-1:'*'!=a))
               )
           )
           z=xz; // z is the second argument

            // a is func number, r will be zero for

            // the first literal long argument if the

            // op is not a comparison
           a=U(a),r=
9<a||16-r?r:0;
           P(!Ay&&!q(y)&&!Az&&!q(y),
// bug? Last one: q(z)
               M|=
1<<(m=D[KF==b]++),
               y=e(
0,y),
               M&=~(
1<<m),
               z=O2(b,a,r,y,f(m,z)),
               --D[KF==b],
               z
           )
           R Ay&&!q(y)&&
2-a&&4-a?O2(b,9<a?11-a+11:a,r,e(s,z),y):
             O2(b,a,r,e(s,y),d(s,z))
       )
   }
   
// parse s
   K
p(){
       
// check if s is an ending character (one of ;})] or zero)
       S
q(){R sc(";})]",*s);}
       
// encode integers up to 127 as bytes with the highest bit set,
       
// other numbers are appended at the end of z and return index
       
// (indices <16 correspond to registers, zn-3 is the index in
       
// the z array of K vals: z[0] is the string and z[1] is NL)
       K1(n){R
kc(KI==Ax&&129u>1+xi?128+xi:(z=jk(z,x),16+zn-3));}
       
// expressions: c is the command, a an optional type (or zero)
       K
E(I a,I c){
           K r=k1(kc(c)),x;
// store command as char in K array
           
// append to r the result of parsing the following
           
// expressions separated by semicolons, or a zero if
           
// the expression finishes with ]}; or EOL
           
do r=jk(r,x=q()?n(ki(0)):p());W(';'==*s++);
           
// set given type, or the type result of the expression
           
// (ie. the type resulting of the last command)
           R
u(a?a:t(x),r);
       }
       K x,y;
       I a,b;
       
// character kind switch, if it starts with a minus sign
       
// or a dot, it may be a number
       S(
'0'-c('-'==(a=*s++)?s['.'==*s]:'.'==a?*s:a)?c(a):'0',
           
// in loops, the type of counter variables will be an integer
           
case'N':T[N++]=KI; // fallthrough
           C(
'W',R
                   ++s,x=p(),++s,    
// x is the conditional expression
                                     
// (enclosed in parens)
                   x=k3(kc(a),x,p()),
// 3rd element is the body
                   N-=
'N'==a,         // "free" loop variable
                   x
           )
           
case'$':++s;
           C('
{',R E(0,a)) // return enclosed expression
           // the type of the function is int when getting number of elements
           // of an array (#), float when dividing (%), the type of the
           // expression (t(x)) or of an element when referencing (*)
           C('
+',R x=p(),u('#'==a?KI:'%'==a?KF:t(x)-8*('*'==a),k2(kc(a),x)))
           C('
[',R E(12,a))
           C('
(',x=p(),++s) // parse expression inside parens
           C('
0',
               
// transform multiplication by two into right-shift
               P(
'2'==a&&'*'==*s,++s,x=p(),u(t(x),k2(kc('\\'),x)))
               --s;
               x=n(Na())
// parse number
           )
           
// return single character for globals. If followed by [exp],
           
// it is an array indexing or a function call, so exp
           
// needs to be evaluated by E
           C(
'a',
               x=
'['==*s?++s,E(
                   
// if no type, it is a function call
                   T[b=a-
'a']?T[b]-8: // unset high bit for arrays
                       (x=G[b],
                        x=xy,
// xx is the string, xy the code
                       
// D[0] and D[1] are stored after RET
                        *D=MX(*D,x[xn
-2]),
                        D[
1]=MX(D[1],x[xn-1]),
                        xu
// return type of the function
                       ),
                   a
// the "command" is the var or func name
                 ):kc(a)
           ),
           
// default: abort
           
// NB: this includes whitespace
           AB(s
-1)
       )
       
// if next char is ])}; or 0 return what has been parsed till now
       P(q(),x)
       
// else, next character should be a function
       
if('+'-c(a=*s++))AB(s-1);
       
// for operators with assignment, set highest bit of op character byte
       
if(':'==*s)++s,a+=128;
       
// parse right argument and get its type
       b=t(y=p());
       
// for assignments, set variable type to the type of the
       
// right expressions, for divisions set it to float, and
       
// else take the largest one (KF>KJ>KI>KC, etc)
       $(
':'==a&&Ax,T[xi-'a']=b)b='%'-a?MX(b,t(x)):KF;
       K1(f){
// cast to float if needed
           R KF-b||KF==t(x)?x:
             Ax&&
126<xi?n(kf(xi-128)):
             u(KF,k2(kc(
'%'),x));
       }
       
// if U(a) is not <9, it is a comparison with integer return type
       R
u(9<U(a)?KI:b,k3(kc(a),f(x),f(y)));
   }
   z=k2(kp(s),NL);
   
// inc ref counter of referenced global variable
   
if(!s[1]&&26u>*s-'a')r1(G[*s-'a']);
   
// function definition
   
if(r){
       X(k,k2(r1(zx),u(KI,c2(
1,1))));
       
// set type of the x,y,z function arguments
       
// more than 3 arguments will produce a segfault because T size is 26
       N(r-s
-1,L[23+i]=D[0]++,T[23+i]=l(" chijefs CHIJEFS",s[i]))
       s=r;
   }
   
// parse s and assign to x
   K x=p();
//o(x);
   
// set address of global vars (vars for i>23 are args: x,y,z)
   N(
23,if(T[i])L[i]=D[KF==T[i]]++)
   {
       I a=t(x);
// type
       
// z will contain:
       
//   - the evaluated string in x
       
//   - a function value (string of machine code followed by D[0] and
       
//     D[1]) with return type a
       
//   - arguments that do not fit in registers
       lnk(zy=u(a,j2(X0(Ax||
'$'-*x?f(0,x):v(0,x,0)),c3(RET,*D,D[1]))));
   }
//dis(zy);
   
// If k points to something, it is a definition, assign to it either
   
// the "compiled" function if it is a function definition or the
   
// result of evaluating it if it is a variable. Else, return the
   
// function so it is evaluated and printed later
   R k?X(k,r?z:Z0(ev(z))):z;
}