Learning J – Part I

2007-04-09

The preferred way to learn J seems to be via a series of half-truths and tutorials. I hate learning like that. I want The Truth. (Since I originally wrote that I’ve discovered the cryptic documentation for J. Life is good.)

The basic types in J are the number and the array. There are other types too, like character, and I suspect more that I don’t know about. I expect verbs and adverbs (functions and higher-order functions) will turn out to be members of some type. Values are called nouns.

The array appears more fundamental in the sense that every other noun exists as an element in some array (I think). As in Lisp, an array has N dimensions or axes. The number of axes, N, is called the rank; N >= 0.

An array of rank 0 contains one value. An array of rank 0 is called an atom or scalar. Compare this to #0A7 in Lisp (bet that has you looking up reader syntax).

Each axis has a non-negative length and is indexed using a 0-based index.

Arrays of rank 1 are called vectors and also lists. They have a convenient literal syntax:

1 2 3 NB. vector of length 3

Aside: comments are introduced with «NB.»; this is kind of cool and kind of perverse.

Computation is performed by applying verbs to nouns. Various verbs have a literal syntax; verbs can also be produced from adverbs. A given name, such as $, can stand for a monadic verb (1-ary) or dyadic verb (2-ary) according to how it is used. Syntax for monadic application is verb noun, for dyadic application is noun verb noun. Adverbs come after the verb: verb adverb. Evaluation and grouping are generally right to left, with no precedence among verbs. So 2 * 3 + 4 groups as 2 * (3 + 4) and evaluates to 14 (dyadic * and + have the conventional meaning).
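For readers more at home in Python, the right-to-left rule can be sketched as a tiny evaluator. This is only a model under stated assumptions: the token-list input, the VERBS table and the name eval_right_to_left are all made up for illustration, and it handles nothing beyond a flat chain of noun verb noun verb noun.

```python
import operator

# Hypothetical table: dyadic verbs modelled as two-argument Python functions.
VERBS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_right_to_left(tokens):
    """Evaluate noun verb noun verb noun ... with no precedence, grouping from the right."""
    # The rightmost noun is the first right argument.
    result = tokens[-1]
    # Walk leftwards over the verbs; each verb takes the noun on its
    # immediate left and the result of everything to its right.
    for i in range(len(tokens) - 2, 0, -2):
        verb = VERBS[tokens[i]]
        left = tokens[i - 1]
        result = verb(left, result)
    return result

print(eval_right_to_left([2, "*", 3, "+", 4]))  # groups as 2 * (3 + 4)
print(eval_right_to_left([1, "-", 2, "-", 3]))  # groups as 1 - (2 - 3)
```

The second call shows why grouping matters even without precedence: with a non-associative verb like -, right-to-left and left-to-right give different answers.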

Monadic $ returns the shape of an array, which is a vector of its dimensions (like Lisp’s array-dimensions). Monadic # returns the length of a vector (like Lisp’s length).

In the following example my typing is indented; the results printed out by J are exdented. This is conventional for J.

   $ 1 2 3
3
   # 1 2 3
3
   $ 9 NB. Result is a 0-length vector.  Which prints as a blank line:

Note the last result. $ applied to the scalar 9 (an array of rank 0) yields a vector of length 0.

Applying # to the result of $ will give us the rank:

   # $ 1 2 3
1
   # $ 9
0
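The relationship between shape and rank can be mimicked in Python with nested lists. This is a rough sketch, not how J represents arrays: shape and rank are hypothetical helper names, the lists are assumed to be rectangular, and a nested list cannot record axes that come after a zero-length axis (so it could not tell a 2 0 3 array from a 2 0 one).

```python
def shape(noun):
    """Model of monadic $: the list of axis lengths; [] for a scalar."""
    dims = []
    while isinstance(noun, list):
        dims.append(len(noun))
        if not noun:      # zero-length axis: nothing deeper to inspect
            break
        noun = noun[0]    # assume rectangular data, so the first item is typical
    return dims

def rank(noun):
    """Model of # $ : the number of axes, i.e. the length of the shape."""
    return len(shape(noun))

print(shape([1, 2, 3]), rank([1, 2, 3]))  # a vector
print(shape(9), rank(9))                  # a scalar: empty shape, rank 0
```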

Dyadic $ can be used to construct arrays of arbitrary shape (left argument) filled in with some value (right argument):

   2 3 $ 1
1 1 1
1 1 1
   2 0 3 $ 9    NB. an axis is 0, hence 0 elements
   # $ 2 0 3 $ 9
3

Note that the second array in this example has a zero-sized axis. It contains no values but still retains its shape. This is very similar to the situation in Lisp: (array-dimensions (make-array '(2 0 3))) ⇒ (2 0 3).

Strings have a literal syntax, 'foo', and are vectors. Except (hack!) a literal string with just one character is a scalar:

   'foo'
foo
   $'foo'
3
   $''
0
   $'f' NB. Recall a scalar's shape is a 0-length vector.

Dyadic $ can take a vector on the right (and arrays of higher rank, but something slightly hairy happens then):

   2 3 4 $ 'foo'
foof
oofo
ofoo

foof
oofo
ofoo

Observe how 'foo' is used to fill in the array, how the rank 3 array is displayed, and how the elements are ordered lexicographically by index (row major, but that gets slightly confusing above rank 2).
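The cyclic fill in row-major order can be modelled in a few lines of Python. Treat this as a sketch under assumptions rather than a description of J's implementation: reshape is a hypothetical name, the right argument is assumed to be a non-empty list or string, and the result is a plain nested list.

```python
from itertools import cycle

def reshape(dims, items):
    """Model of dyadic $ with a vector on the right: reuse the items cyclically."""
    source = cycle(items)  # endless f, o, o, f, o, o, ...

    def build(axes):
        if not axes:                      # no axes left: take one atom
            return next(source)
        # Fill the first axis in order, so the last axis varies fastest (row major).
        return [build(axes[1:]) for _ in range(axes[0])]

    return build(list(dims))

# Print each plane of the rank 3 array, with a blank line between planes.
for plane in reshape([2, 3, 4], "foo"):
    for row in plane:
        print("".join(row))
    print()
```

Because the 12 atoms of the first plane use up 'foo' exactly four times, the second plane starts again at f, which is why both planes look the same.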

The adverb / is what we know and love as foldr (if only every higher order function had its own domain name). / turns a dyadic verb into a monadic verb that folds the dyadic verb along a list (“inserts into” according to the J crowd). It can be used to yield the number of atoms in an array:

   2 * 3
6
   */ 2 3
6
   */ 2 3 4
24
   */ $ 2 2 5 $ 'foo'    NB. same as */ ($ (2 2 5 $ 'foo'))
20
   */ $ 7
1

Note the last example: the scalar’s shape ($ 7) is a 0-length vector, which yields 1 when * is folded over it. 1 is the identity for *. Recall that in Lisp (*) ⇒ 1 for similar reasons.
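The same fold, including the empty case, can be written in Python. This is a minimal sketch with made-up names (insert, atom_count), and the identity is passed in by hand, whereas J knows the identity of its built-in verbs.

```python
from functools import reduce
import operator

def insert(verb, items, identity):
    """Model of verb/ : a right fold; an empty list yields the identity."""
    if not items:
        return identity
    # Start from the last item and work leftwards:
    # x0 verb (x1 verb (... verb xn))
    return reduce(lambda acc, x: verb(x, acc), reversed(items[:-1]), items[-1])

def atom_count(dims):
    """Model of */ $ : multiply the axis lengths together."""
    return insert(operator.mul, dims, 1)

print(atom_count([2, 2, 5]))  # shape of 2 2 5 $ 'foo'
print(atom_count([]))         # shape of a scalar is the empty vector
print(insert(operator.sub, [1, 2, 3, 4, 5], 0))  # 1 - (2 - (3 - (4 - 5)))
```

For * the direction of the fold makes no difference; the subtraction line is there to show that it is a right fold and not a left one.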

There’s plenty more to learn about arrays, and that’ll come in part II.

PS. If J intrigues then see how I learnt to multiply in colour using it.

6 Responses to “Learning J – Part I”

  1. glorkspangle Says:

    You write:

    */ $ 2 2 5 $ 'foo'
    20

    Why is this? Is this dyadic $ making the value (with shape 2 2 5):

    foofo
    ofoof

    oofoo
    foofo

    and then monadic $ returning the shape 2 2 5, and then */ folding multiplication along that to say 20?
    I guess I’m a bit confused about precedence. What sort of language always starts evaluating at the right?

  2. drj11 Says:

    Yes, your explanation is correct. Parentheses have the usual meaning (of forming expression groups). */ $ 2 2 5 $ 'foo' is the same as */ ($ (2 2 5 $ 'foo')). I’ve added a comment to the example now. I agree that the right-to-left evaluation is a bit bonkers, but I think I’m already used to it.

    I don’t know much about J syntax but I think the lexer labels each token as noun, verb, or adverb (its “part of speech”). 2 2 5 becomes a single noun (magic syntax for sequence of literal numbers). The part of speech then drives the parsing.

  3. Skip Cave Says:

    The rationale behind the strict right-to-left evaluation precedence of J becomes obvious when you realize that J has over 100 “verbs” or operators such as + * | = ^ ~ # $ etc., etc.

    J’s designers knew from their experience with APL that it would be disingenuous to try to prioritize all these verbs in any particular hierarchy. In any case, it certainly would be difficult for programmers to remember any precedence scheme that included over 100 items. Right-to-left evaluation, which is the same as current mathematical notation (for example, f g h (x)), solves the problem elegantly.

  4. drj11 Says:

    Yes, yes, bazillions of operators means that a traditional precedence hierarchy is madness. I get that from C. It’s just that if I was going to throw out the hierarchy then I would either do what Lisp did and put brackets everywhere or go with left-to-right evaluation.

    In any case, since learning more of J, the “right-to-left” mantra turns out not to be so useful once conjunctions and trains come into play.

    I find your appeal to current mathematics notation unattractive. When I did my maths degree I found that “f applied to x” was denoted «xf», «f(x)», or «fx», according to the taste of the particular course. My Cohn Algebra uses the «xf» notation, for example.

    In any case doesn’t the associativity (left-to-right versus right-to-left) only come into play when using dyadic operators? Everybody that programs a computer believes that «f g h x» should be «f(g(h(x)))». No argument there.

  5. YAJer Says:

    For left to right… I think the original motivation was that alternating signs for stuff like 1-2-3-4-5 (or its equivalent: -/1 2 3 4 5) was more interesting and useful than having it be equivalent to 1-(2+3+4+5).

    ~:/\ on boolean arrays stands out as a notable example of this kind of thing.

    But, even if you don’t find such arguments convincing (and, personally, I am not sure I do) at this point, it’s way past the “water under the bridge” point.


  6. As you’ve quickly discerned, thinking about J evaluation as “right-to-left” works only if you’re looking at verbs together with nouns/pronouns. But there’s a different, more fundamental way of thinking about this which yields that “right-to-left” evaluation as a consequence. Namely, what the argument(s) to a verb are: roughly, the right argument is the result of everything to its right, whereas the left argument, if any, is the thing just immediately to its left. For an adverb (which necessarily has just one argument), its argument is the entire verb to its left. (Of course these are just slightly rough ways of saying in English words the much more precise rules given by the explicitly stated parsing rules.)

