Copyright © 2004-2005,2008 by Marcin 'Qrczak' Kowalczyk (QrczakMK@gmail.com)

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included here.

The Kogut Programming Language

Kogut is an experimental programming language which supports impurely functional programming and a non-traditional flavor of object-oriented programming. Its semantics is most similar to Scheme or Dylan, but the syntax looks more like ML or Ruby.

The name “Kogut” means “Rooster” (“Cock”) in Polish and is pronounced like [KOH-goot].

The paradigm

  1. The language is dynamically typed: consistent usage of types is not enforced statically.
  2. The language is mostly functional: object contents and name bindings are immutable by default. You can request a variable binding explicitly. The standard library includes both immutable and mutable collections, with the most important compound types (lists, strings, tuples) being immutable.
  3. The language doesn’t prevent arbitrary side effects from occurring during evaluation of an expression (it’s not purely functional), and the evaluation order is deterministic.
  4. Objects are deallocated implicitly when no longer referenced (garbage collection).

Names, definitions and scopes

  1. The language is lexically scoped: an occurrence of a name refers to a definition determined statically from program source, not dynamically by control flow.
  2. The same syntax and semantics of definitions are used globally in a module scope, locally in a function, and for specifying the fields of an object.
  3. There is a single namespace: each identifier in a given scope has one meaning, independent of the context of usage.
  4. Definitions are evaluated and names are defined in the order the definitions are written. Expressions may refer to names defined above or below, as long as names defined below are used only inside functions which are not called before the names are defined.
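
The ordering rule in item 4 can be illustrated with a small sketch. It is written in Python rather than Kogut, purely as an analogy: Python closures happen to behave similarly, and the names demo, area and pi are made up for the illustration.

```python
def demo():
    def area(radius):
        # pi is defined further down in the enclosing scope;
        # it is looked up only when area() is actually called.
        return pi * radius * radius

    try:
        early = area(1.0)      # called before pi has been defined
    except NameError:
        early = None           # the premature call fails cleanly

    pi = 3.14159               # the definition is evaluated here

    late = area(1.0)           # called after the definition: works
    return early, late

early, late = demo()
```

Referring to a name defined below is harmless as long as the function using it is not called too early. In this analogy a premature call raises an error rather than observing an uninitialized value.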

Errors

  1. There is no undefined behavior (an error can neither trash memory nor make the processor execute unpredictable code), and there is little unspecified behavior. The meaning of a program is almost deterministic. In particular strings and lists are immutable, so they can be freely shared without problems with modifying literals.
  2. Errors generally cause an exception to be thrown, rather than an implicit conversion of an argument to another type, a null or unspecified return value, silently ignored excess arguments, or a guess at what the programmer could possibly mean. In particular a condition must be either True or False, and trying to get a non-existent element of a collection throws an exception.
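
The error policy above can be sketched in Python as an analogy (this is not Kogut code). The helpers require_bool and outcome are hypothetical. Python itself accepts non-boolean conditions, so require_bool only imitates the rule that a condition must be either True or False.

```python
def require_bool(condition):
    """Hypothetical helper: accept only True or False as a condition."""
    if condition is True or condition is False:
        return condition
    raise TypeError(f"condition must be True or False, got {condition!r}")

def outcome(thunk):
    """Run thunk and report its result or the name of the exception thrown."""
    try:
        return ("ok", thunk())
    except Exception as exc:
        return ("error", type(exc).__name__)

results = [
    outcome(lambda: {"a": 1}["b"]),      # missing key: exception, not a null
    outcome(lambda: [10, 20][5]),        # missing element: exception
    outcome(lambda: "1" + 1),            # no implicit conversion between types
    outcome(lambda: require_bool(0)),    # 0 is not silently treated as False
    outcome(lambda: require_bool(True)), # a real boolean is accepted
]
```

Each failing case reports an exception instead of returning a placeholder value, which is the behavior item 2 describes.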

Execution

  1. An object conceptually consists of three parts:
  2. A function (or generally any object) takes a list of arguments and either returns a single result or throws an exception. Keyword parameters and multiple results are simulated in terms of this model.
  3. Tail calls are properly implemented: before execution of a tail call, the memory which was implicitly allocated for the caller’s execution state is deallocated.
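
The effect of proper tail calls can be imitated in a language that lacks them. The sketch below uses Python, which does not eliminate tail calls, so a trampoline stands in for what Kogut is described as doing natively. The names count_down and trampoline are made up for the illustration.

```python
def count_down(n, acc=0):
    """Sum the integers 1..n; the recursive step is in tail position."""
    if n == 0:
        return acc
    # Return a thunk instead of calling directly, so this frame is
    # released before the next step runs.
    return lambda: count_down(n - 1, acc + n)

def trampoline(result):
    """Keep invoking returned thunks until a non-callable value appears."""
    while callable(result):
        result = result()
    return result

# 100,000 steps would exceed Python's default recursion limit if
# written as plain recursion; here the stack depth stays constant.
total = trampoline(count_down(100_000))
```

With real tail calls no trampoline is needed: the caller’s state is simply discarded before the callee starts, so a loop expressed as a tail-recursive function runs in constant stack space.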

Miscellaneous issues with functions

  1. Data objects are primarily used by applying functions to them, rather than by sending them some messages. Objects themselves are applied only to access their core functionality, e.g. to access fields of a record.
  2. In case of a generic interface common to several types with different implementations, the functions specified by the interface are realized by generic functions, which dispatch their implementation on the types of arguments. In particular many operators are generic functions.
  3. New objects are constructed by applying functions which are designed to return new objects, rather than by using some distinct syntactic notion of constructors.
  4. There is a single most important equality operator ==, which generally compares values of immutable objects and identity of mutable objects, and can also be defined manually for particular types.
  5. The comparison used for sorting, for dictionary lookup, and the corresponding hash function, are generally specified once per type, not once per sorting operation or once per dictionary. Instead, these operations take a transformation function which extracts or transforms the part of the key used for comparison.
  6. Locking and unlocking synchronization objects, blocking and unblocking asynchronous signals, changing values of dynamic variables (usually), installing signal handlers (usually) are done for the duration of execution of given code, not as a permanent effect of an imperative operation.
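
Items 2, 5 and 6 have rough counterparts in other languages. The following Python sketch is an analogy only, not Kogut code, and the names describe, words, by_name, lock, held_inside and held_after are illustrative. Note that functools.singledispatch dispatches on the type of the first argument only, whereas the generic functions described above dispatch on the types of arguments in general.

```python
from functools import singledispatch
import threading

# Item 2: one generic function, with implementations selected by type.
@singledispatch
def describe(value):
    # Fallback when no implementation is registered for the type.
    raise TypeError(f"no implementation for {type(value).__name__}")

@describe.register
def _(value: int):
    return f"integer {value}"

@describe.register
def _(value: str):
    return f"string {value!r}"

# Item 5: the string comparison itself is fixed by the type; the sort
# only receives a function that transforms each key before comparing.
words = ["pear", "apple", "Fig"]
by_name = sorted(words, key=str.lower)

# Item 6: the lock is held only for the duration of the enclosed code.
lock = threading.Lock()
with lock:
    held_inside = lock.locked()
held_after = lock.locked()
```

Here by_name orders the words case-insensitively without defining a new comparison, and the lock is released automatically when the block ends, even if the block exits by throwing an exception.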

Syntax

  1. Names are case-sensitive.
  2. Function application is denoted by separating the function from the arguments and the arguments from one another with spaces, but functions are not curried; all arguments are passed at once and the function can determine how many of them were given.
  3. Breaking of program text into lines and indentation are insignificant. Definitions and statements are separated by semicolons.
  4. Mutable variables are first-class objects. The meaning of accessing particular variables or accessing fields of an object can be programmed.
  5. The set of operators with their priorities is fixed, which makes it possible to parse a module independently of the contents of other modules. You can make your own binary operators from ordinary names with a fixed priority (they look like %Foo or Foo%).
  6. The only keyword is the underscore. Other identifier-like names which are used in core syntactic constructs are macros, and these names can be redefined.
  7. Parentheses () are used for grouping subexpressions and subpatterns. Brackets [] are used for making and matching lists. Braces {} are used for delimiting other parts of the syntax (function bodies, if and case branches, object definitions etc.).

Conventions

  1. Most global names are written LikeThis, except type names which are LIKE_THIS (because they often coexist with a function or constant of similar name) and names of important macros like let and if. Local names and field names are usually written likeThis.
  2. In the author’s opinion the indentation width of 3 spaces looks nice. Since the standard tab width, 8, is not even divisible by 3, tabs are better avoided. Using a non-standard tab width would be evil.