Urn

Description

Urn is a Turing-complete esoteric programming language by Keymaker, its design finalized and published in 2013. Most of the language was thought out in 2011 and 2012. (Obviously there was no hurry with it.)

The language is based on one instruction that moves data, signals in this language's terminology, from one source to another. An instruction may contain any finite (zero included) number of other instructions, which are executed conditionally, depending on the current signal (always either 1 or 0). If something is executed, the current travelling signal is killed.

There are three in-sources for signals: input (denotated with emptiness (input itself is taken from the input channel, using ASCII '0' and '1', and EOF)), a register (any string of a-z characters, with the minimum length of one), or a static binary string (consisting solely of 0s and 1s, with the minimum length of one). The signal is either killed on the way, or will end up in either of the two out-sources: a register (a-z strings like with in-source) or the output channel (denotated with emptiness). Registers work with the same logic as binary queues; first in is first out. It is not possible to leave an instruction until its in-source is depleted (EOF in case of input).

An instruction begins with a '('. All parentheses must be matching. An instruction has four phases, separated from each other with a ':' (so there are three ':' in each instruction). The first phase is the in-source, the second is the conditioned code for 1s, the third is the conditioned code for 0s, and the fourth is the out-source. The conditioned codes, depending on the current signal, are executed as long as there are signals arriving from the in-source (nothing is ever executed without a signal). The instruction ends with a ')'. The program may have spaces and new-lines anywhere at all, even breaking a register name. These characters are simply ignored when the program is run. Comments are marked with a ';' as the last character of a line (not counting spaces or the new-line). Otherwise, all other data is considered an error, as are register names that include other characters than a-z, binary strings that include other characters than 0 and 1.

Examples

1. (111:::) would print "111" and terminate. The looping part is executed three times, since there are three signals.

2. (00:::e)(1:::e) would move "00" and "1" to register 'e', making it have "001" in the end.

3. (10:::a)(a:::b)(a:::c) would first move "10" to 'a', then everything from 'a' to 'b'. "(a:::c)" will never be executed since 'a' at that point has no signals.

4. (1001::(0:::zeroes):ones) would move all the 1s into 'ones' register, all the 0s into 'zeroes' register. In case the signal is 1, it has no conditioned code, and it will travel to the end, and go to 'ones'. 0, however, has the conditioned code "(0:::zeroes)", and that signal is killed. However, in the code that is executed, a 0 signal will be sent to 'zeroes' register.

5. (:::a) moves input to register 'a'. The looping part is executed as long as (and only when) there are signals coming from input. Once EOF comes, the instruction's execution stops.

6. Below is a larger and commented example that reads input, inverts the input, and outputs the result only if it's at least four characters/signals long. Input "101" yields nothing as output, input "11011" yields "00100".

input, and send the opposite signal to 'a' ;
notice how the out-source is output channel (marked with emptiness) but it doesn't matter as no signal ever reaches it, there being conditioned code for both signal types ;
(:(0:::a):(1:::a):)

take a copy of 'a' into 'b' and 'c' ;
notice that while moving into 'b' everything becomes 1 (this to simplify the coming part where the actual data isn't needed, but its length is) ;
(a:(1:::b)(1:::c):(1:::b)(0:::c):)

four nested instructions with 'b' as their in-source take four signals from 'b', and if the fourth is reached, ;
everything from 'c' is moved to output channel, and the remaining data in 'b' (if any) is moved to 'x' so that the looping is quit ;
(it does nothing but add a little speed (which might show if the input is a billion signals), no incorrect output would be done as 'c' becomes empty during the first time the code is executed) ;
(b: (b: (b: (b: (c:::)(b:::x) ::) ::) ::) ::)
Programs

A couple of programs I've written.

tm.urn - A truth-machine. With input '0' it prints 0 and terminates, with input '1' it keeps printing 1 infinitely. (Note, this like all Urn programs with input expects to receive EOF after the actual input is done, otherwise the input loop can never end.)

binary.urn - A binary counter. Input a binary number (such as 100, 10111, etc.) and the program increases it by one and prints the result.

prime.urn - Prints prime numbers in unary format of 1s, each string's end marked with a 0. Never terminates, even if it might look so from the slowness...

Interpreter

There is an interpreter written in Python, here. Give the Urn program as an argument. Any more arguments (doesn't matter what they are) launch the debug mode. If any other character than '0' or '1' is given as input, the interpreter stops because of faulty data.

-- Keymaker, 20.9.2013