Copyright © 2004-2009 by
Marcin 'Qrczak' Kowalczyk
(QrczakMK@gmail.com)
Permission is granted to copy, distribute and/or modify this
document under the terms of the
GNU Free
Documentation License, Version 1.2 or any later version
published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license is included
here.
Kokogut, a compiler of Kogut
Kokogut is an implementation of Kogut
in Kogut itself. It translates Kogut source into C source, which
is then compiled using a C compiler to produce an executable.
It’s a work in progress; in particular, the documentation
is still being written.
Kokogut is hosted on SourceForge.
Here you can download the current version. It includes the
contents of these web pages. It is developed under Linux, and
tested under various Unix-like systems.
Please contact me if
you encounter building or porting problems, so I can make it
more portable.
Choose one of the two package variants:
- Get kokogut-0.7.0.tar.bz2
  if you already have Kokogut version 0.6.0 or later installed.
- Get kokogut-0.7.0-boot.tar.bz2,
  which can be built without another version of Kokogut installed
  (it includes precompiled C files for bootstrapping the compiler).
Building instructions:
- If you got Kokogut from CVS: autoconf && autoheader
- Optional: see ./configure --help for building options.
- ./configure
- make
- Become root.
- make install
The compiler proper and example programs are licensed under the
GPL, and the libraries under the LGPL, with a “linking exception”.
The linking exception allows you to link a “work that uses
the Library” with a publicly distributed version of the
Library to produce an executable file containing portions of the
Library, and distribute that executable file under terms of your
choice, without any of the additional requirements listed in
section 6 of LGPL version 2 or section 4 of LGPL version 3.
Kokogut principles
Note: these are principles of the current implementation, not of
the language.
-
There are no arbitrary limits on the nesting levels of
expressions, magnitude of integers, number of parameters of
a function, object size, recursion depth, etc. Stack overflow
is checked and the stack is resized as needed; the heap also
grows as needed.
-
Passing and receiving a known number of arguments between 0
and 8 is efficient. In other cases it’s equivalent to
building and deconstructing a list of arguments.
-
Integers which fit in a machine word are tagged in the lowest
bit. All other values are represented by pointers. Objects
are allocated statically when possible, otherwise they are
on the heap.
-
The garbage collector accurately traces pointers to objects,
without conservatively assuming that some random memory
locations (like the system stack) might point to Kogut
objects. It is a copying collector with two generations and
a software write barrier.
-
Compilation doesn’t stop when an error is detected; all
detected errors are reported.
-
You can embed C code fragments directly in Kogut source to
implement primitive types and operations.
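The low-bit tagging of small integers mentioned above can be illustrated with a minimal C sketch. This is not code from Kokogut: the names value_t, make_small_int and so on are hypothetical, and the tag polarity (low bit set means integer, low bit clear means pointer) is an assumption inferred from the remark elsewhere on this page that pointers are even.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical representation: one machine word per value. */
typedef uintptr_t value_t;

/* Assumed polarity: low bit set = small integer, clear = pointer. */
static int is_small_int(value_t v) {
    return (v & 1u) != 0;
}

/* One bit is used by the tag, so only half the word range fits. */
static int fits_small_int(intptr_t n) {
    return n >= INTPTR_MIN / 2 && n <= INTPTR_MAX / 2;
}

/* Shift the payload left by one and set the tag bit. */
static value_t make_small_int(intptr_t n) {
    assert(fits_small_int(n));
    return ((uintptr_t)n << 1) | 1u;
}

/* Relies on two's complement and a sign-preserving right shift. */
static intptr_t small_int_value(value_t v) {
    assert(is_small_int(v));
    return (intptr_t)v >> 1;
}

/* Pointers are stored unchanged; they must be at least 2-byte aligned. */
static value_t make_pointer(void *p) {
    assert(((uintptr_t)p & 1u) == 0);
    return (uintptr_t)p;
}
```

In such a scheme, an integer that does not pass fits_small_int would have to be stored as a heap object (a big integer) and referenced by pointer instead.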
Current limitations, to be lifted in the future
-
User macros are not implemented (and not fully designed).
-
There are no companion programs like a profiler, a debugger,
or an interactive interpreter.
-
There is no portable FFI. The current integration with C relies on
compilation to C and on various low-level details.
-
Arithmetic on very large numbers can crash the program. This
is caused by the GMP library, which allocates temporary objects
on the stack without overflow checking.
-
Only some builtin functions are expanded inline.
Non-portable assumptions in the generated C code
Kokogut aims to produce fairly portable C code, but assuming
some reasonable properties of the environment makes it easier to
generate good-quality code. Please tell me if any of these
assumptions are unreasonable and it would make
sense to port Kokogut to platforms where they are not satisfied.
-
Signed integers use two’s complement arithmetic and
don’t signal errors on overflow (overflow is checked
after the fact); for new versions of gcc the -fwrapv
flag is used to obtain this behavior.
Integer division and remainder round towards zero (this is
unspecified by C89 but required by C99). Right-shifting a
negative number preserves the sign.
-
All pointers have the same size. There is an integer type
of the same size as a pointer. Odd integers can be cast to
pointers and back. Pointers into arrays of pointers are even.
Incrementing a pointer past the end of an array doesn’t
cause errors.
-
There is no unexpected padding in structures consisting of
pointers and pointer-sized integers, and types other than
double don’t have stricter alignment requirements than
a pointer.
-
Pointers have at least 32 bits. While it would be easy to
lift this restriction, as it affects only the generation of
integer literals and the default stack and heap sizes, Kokogut
libraries would not fit on a 16-bit platform anyway because the
generated code is too large.
-
Sometimes an object is accessed using a different type
than it was created with. This mostly concerns pointer
types and structs with a similar layout. I believe
this happens only in places where applying C99’s
type-based aliasing rules (option -fstrict-aliasing in GCC,
turned on by default) does no harm in practice.
-
Limits of the C compiler regarding issues like the number of
significant characters in identifiers, the level of nesting of
blocks, the number of external definitions in a translation
unit etc. may affect the limits of Kogut code. Kokogut
doesn’t impose artificial limits itself but it can
inherit them from the C compiler used.
-
The generated code uses the GMP library for
big integers, so it must be available for the target
platform. It’s assumed that GMP limbs have the same
size as pointers and that values of type mpz_t
may be moved (not copied) using memcpy.
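Several of the assumptions listed above can be checked mechanically on a candidate platform. The following is a minimal sketch in C11, not code generated by or shipped with Kokogut; the function names are hypothetical. The overflow test shows one common way of checking "after the fact": the addition is done in unsigned arithmetic, which wraps in the same way that -fwrapv makes signed arithmetic wrap, and the signs are inspected afterwards. Whether Kokogut uses this exact test is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Compile-time checks of the pointer-related assumptions. */
_Static_assert(sizeof(void *) == sizeof(intptr_t),
               "an integer type must have the same size as a pointer");
_Static_assert(sizeof(void *) * CHAR_BIT >= 32,
               "pointers must have at least 32 bits");
_Static_assert(sizeof(void *) == sizeof(void **),
               "object pointers are assumed to share one size");

/* Run-time check of the arithmetic assumptions.
   volatile keeps the compiler from folding the expressions away. */
static int arithmetic_assumptions_hold(void) {
    volatile int a = -7, b = 2, c = -8, m = -1;
    return a / b == -3       /* division truncates towards zero */
        && a % b == -1       /* remainder has the sign of the dividend */
        && (c >> 1) == -4    /* right shift preserves the sign */
        && (m & 3) == 3;     /* two's complement representation */
}

/* Wrapping addition with overflow detected after the fact.
   Returns nonzero if x + y overflowed; *sum holds the wrapped result. */
static int add_overflows(intptr_t x, intptr_t y, intptr_t *sum) {
    *sum = (intptr_t)((uintptr_t)x + (uintptr_t)y);
    /* Overflow happened iff both operands differ in sign from the sum. */
    return ((x ^ *sum) & (y ^ *sum)) < 0;
}
```

A port to a new platform could call arithmetic_assumptions_hold() once at startup and refuse to run if it returns zero; the compile-time checks fail the build outright.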
Kogut and Kokogut history
There is no definite starting point for the development of Kogut. Its
first incarnation, an interpreter in Haskell, was written in
August 2001. It was very different from Kogut, though.
The second incarnation, a compiler in Haskell which generates
OCaml code, was written in November 2001 and touched in June
2002.
The third incarnation, an interpreter in OCaml, was abandoned
in September 2002. Parts of it were reused in the fourth
incarnation, another interpreter in OCaml. This interpreter
started working in January 2003 and was used to bootstrap
further implementations written in Kogut. It was being tweaked
during 2003 to follow small changes in the language, and now
implements mostly a subset of Kogut.
The fourth implementation, a compiler in Kogut which generates
C code, was abandoned at the beginning of November 2003. Large
parts of it were reused in the fifth incarnation, which started
on November 2, 2003 and is the current Kokogut.
On January 12, 2004 Kokogut started producing executable
programs. The compiler was mostly finished then, but the library
was nearly non-existent.
On March 13, 2004 Kokogut was able to compile itself for the
first time. Later effort concentrated on improving the libraries
and the building system.
Kokogut has been hosted on SourceForge since May 26, 2004.