Learning J – Part III

2007-04-27

Please see Learning J Part I and Part II.

Syntax

At the lexical level J is fairly unsurprising. A program consists of a series of sentences, one sentence per line (I think we can already see how the issue of breaking long lines never comes up). Comments are introduced with NB. and finish at the end of the line. Sentences are decomposed into words.

Examples words:

blue
blue_green
7
_7
-
=:
'foo'
i.
$
"
\\
[:

Names and numbers are pretty standard (but note that negative numbers are denoted using _, like ML, because - is a verb), but apart from those there is a bewildering array of punctuation available including items that are usually found as brackets, delimiters, and escapes in other languages.

There’s an additional quirk. A sequence of numbers separated by spaces forms a single word (a list). This only works for numbers. So 3 1 2 is a single word (a length 3 list), but a b c is not.

Each word is categorised into a part of speech. J deliberately uses terminology from natural language grammar.

  • Nouns: values to you and me. 1, 2 3 4, 'foo', and so on.
  • Verbs: functions or procedures. +, $, i..
  • Adverbs: monadic higher order functions. /, \.
  • Conjunctions: dyadic higher order functions. ", and perhaps lots more.
  • Punctuation: ( and ) (which have their conventional meaning) and maybe some more.
  • Copula: What J calls =: and =. for assigning names to things.

As we’ve seen, verbs can either be monadic or dyadic, so that’s: term verb term or verb term. I’m being deliberately vague about what a term is (mostly because I don’t know).

Adverbs are always monadic and follow the verb: term adverb.

Conjunctions are always dyadic: term conjunction term.

Let c be a conjunction, v be a verb, and a be an adverb.

There’s an obvious x v y v z ambiguity, as in 2^3^4. As previously discussed things group to the right, so this is 2^(3^4):

   2^3^4  NB. some number much larger than 256 ((2^3)^4)
2.41785e24

Conjunction Adverb Precedence

What about v c v a? Is that (v c v) a or v c (v a)?

Let’s find out. Consider *: @ + / 1 2 3. There’s two new symbols I need to explain. Monadic *: is square (it squares its argument). @ is a conjunction called Atop, it’s a kind of compose operator.

So *: @ +/ 1 2 3 could mean either (*: @ +) / 1 2 3 which would fold the dyadic (*: @ +) over the list 1 2 3 or it could mean *: @ (+/) 1 2 3 which would apply the monadic *: @ (+/) to the list 1 2 3. (It could also mean *: @ (+/ 1 2 3), but it doesn’t.) The @ conjuction is defined so that when used dyadically (as in the first possible interpretation) x (*: @ +) y means the same as *: (x + y) (square the sum); when used monadically *: @ (+/) y means the same as *: (+/) y.

So the first interpretation would have the meaning 1 (*:@+) 2 (*:@+) 3 which is 676 (262). The second interpretation would have the meaning *: (+/ 1 2 3) which is 36. Let’s see:

   *:@(+/) 1 2 3
36
   (*:@+)/ 1 2 3
676
   *:@+/ 1 2 3  NB. Same as second example.
676

So we can see that the @ conjunction binds more tightly than the / adverb. Conjunctions have higher precedence.

Operator precedence is not enough

Sadly, whilst it’s tempting to think that you can parse J using an operator precedence parser, you can’t. That’s because the reductions to apply depend not on the syntactic category of the items being parsed but on their runtime class.

Consider what might happen if you had a conjunction whose result was sometimes a verb and sometimes a noun. This is extremely unconventional, but possible. I introduce the notion here to show how certain methods of parsing are not possible. I borrow from the future and show you my evil conjunction:

   ec =: conjunction define
if. n = m do. *: else. 4 end.
)

The evil conjunction ec takes two (noun) arguments and has the result *: (which is a verb) if they are equal, and the result 4 (which is a noun) if not. Now consider n ec m - 3. Is this ((n ec m)(- 3)) (monadic verb applied to - 3) or ((n ec m) - 3) (dyadic - applied to two nouns)? Actually it could be either:

   7 ec 0 - 3    NB. 7 ec 0 evaluates to the noun 4
1
   7 ec 7 - 3    NB. 7 ec 7 evaluates to the verb *: (square)
9

So the parsing of J is mingled with its execution. The way it actually works is that the sentence forms a queue of words. Words are moved from the right-hand end of the queue onto a stack. After every move the top four elements of the stack are considered and one of 9 possible reductions is applied. When no more reductions are possible another word is moved across to the top of the stack and the cycle repeats. This is all reasonably well described by the J dictionary appendix E.

It strikes me as a mix of elegant simplicity, stomach churning madness, and pragmatic considerations of actually implementing something on 1950′s hardware.

Next, part IV covers more syntax (trains) and introduces some more practical examples of conjunctions.

About these ads

7 Responses to “Learning J – Part III”

  1. YAJer Says:

    I haven’t read through to see if you’ve sorted this out yet, but conjunctions bind exactly as tightly as adjectives.

    The proper interpretation of your test results would have been: conjunctions and adjectives group to the left (and bind tighter than verbs, which group to the right).

    (A related issue is that nouns are the only terms accepted by verbs, while adjectives and conjunctions accept both nouns and verbs as terms. But I think you had figured that out by the time you wrote this blog post, even if you did not explicitly state it.)

    This is all covered in that “appendix E” you referenced.

    Finally, about “certain kinds of parsing are not possible”… the “J way of doing things” would be to simply not support the ambiguities which cause conflicts, if you had some context where you needed to care. This would dump the hard support work on the people who use evil words, but that’s probably OK.

  2. godspiral Says:

    >

    Actually, the rule is that adverbs and conjunctions bind to the entire “verb phrase” to its left.

    A verb phrase is a series of words where some left binding occurs to interupt the normal right to left parsing that happens when only verbs and nouns are used.

    so:
    v v v c v c v a a
    parses as:
    v v ((v c (v c v)) a) a

    You can also make an adverb by “currying” a conjunction.
    so (c v) or (v c) makes an implicit adverb.

    so:
    v v v c v c v a (v c)
    parses as:
    v v ((v c (v c v)) a) (v c)

  3. drj11 Says:

    @godspiral:

    Are you sure? Doesn’t «v c v c v» parse as «(v c v) c v»? Have you checked?

  4. drj11 Says:

    @YAJer: do you mean adverb when you say adjective? I hope you do, because I don’t really want more Lurking Horrors.

    I can’t see that saying that conjunctions bind exactly as tightly as adverbs is very helpful. It doesn’t help you decide whether «v c v a» is «(v c v) a» or «v c (v a)». An analagous situation occurs in traditional programming languages (awk, say): Is «-x^4» the same as «-(x^4)» or the same as «(-x)^4». Traditionally we use operator precedence to decide (and in the case of awk «^» has higher precedence than unary «-»).

    “grouping to the left” is helpful, but still deserves a bit more explanation to say that it includes the case when adverbs mix with conjunctions.

    Saying that conjunctions and adverbs bind to the entire verb phrase to their left seems most helpful, but by then you may as well have just memorised the Appendix E reduction table, in my opinion.

    In any case none of those informal rules are enough to tell you how «n c n v n» parses because, as my post illustrates, you have to actually evaluate «n c n» to know.

    That’s what I mean when I say certain methods of parsing are not possible. In a traditional language you can describe the semantics by first parsing and then evaluating. Note that the parsing phase doesn’t require any evaluation (in a tradition language); you can determine the syntactic category of something (lexing) and determine which reduction rule to apply (parsing) without evaluating any code. That’s simply not true in J.

  5. YAJer Says:

    Yeah, I meant “adverb” when I wrote “adjective”. Call it a bad tab expansion… I hate it when I do that.

    Anyways, you are right about how «n c n v n» parses. Technically, you’ve got four possibilities, based on the type of «n c n»:

    noun — v is a dyad
    verb — v is a monad
    adjective — syntax error
    conjunction — syntax error

    Practically speaking, you should know enough about what’s going on to figure out this part without much effort. Either that, or just run the expression and see what it does…

  6. godspiral Says:

    re: Doesn’t «v c v c v» parse as «(v c v) c v»?

    Yes it does, sorry for mistake.

    so:
    v v v c v c v a a
    parses as:
    v v (((v c v) c v) a) a

    My example doesn’t contradict your conclusion that conjunctions bind ahead of adverbs, and its a fair rule of thumb, but it could lead to confusion on this example.

    v v c v a c v a
    parses as:
    v (((v c v) a) c v) a

  7. YAJer Says:

    To be fair…

    v v v c v c v a a
    parses as
    v v (((v c v) c v) a) a

    only when the intermediate results from those adverbs and conjunctions are not themselves adverbs and conjunctions.
    This is almost always the case (and you have to go out of your way to construct a counter example), but almost always is not the same thing as always.

    For example, if the left most c returned an adjective, you wind up parsing this as

    v (((v (v c v)) c v) a) a

    To really understand adverbs and conjunctions you need to understand both their type and their result type. And, when their result type is adverb or conjunction you need to understand the result type of the derived function. Etc.

    Worst case, this would be like trying to solve the halting problem. (Practically speaking, though, no one is going to construct a sentence with an infinite number of words. So even the most evil worst case you could imagine isn’t really unknowable.)

    Anyways, practically speaking, if you have an adverb or conjunction which returns another adverb or conjunction, this is worthy of a nice big chunk of documentation… Like, maybe:

    Note”
    BLAH is an adverb which creates another adverb
    where the derived adverb expects its argument to
    be a gerund, or a verb which will be converted to
    a gerund. The argument verb to BLAH operates on
    this gerund and is expected to produce a new gerund.
    This new gerund is then converted to a verb which is
    the result of the derived adverb produced by BLAH
    )

    Or… maybe something a little more comprehensible than
    this off-the-top-of-my-head example.

    By the way: a gerund is the noun form of a verb.

    For instance, to get the gerund for +
    ing=: `”
    + ing

    (conjunctions automatically curry themselves into an
    adverb if you only give them one argument.)


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: