Fear and Loathing and YACC

Ho hum, it's YACC

In the circles I frequent, it's commonly accepted that a parser done with YACC* is better than a recursive descent (“RD”) parser done without YACC. In fact, I bet some people wouldn't even know how to write one without using YACC.
----------
*Or some other automated syntax-directed translation system, I'll just call them all YACC here.

I'm Biased

I will admit, I'm a bit biased; I've done lots of projects with and without YACC and I'm happier, all-in-all, not using it. In this essay I try to justify that position, but without complete success.

No Conclusion

So, this essay has no conclusion. It doesn't seem possible to completely choose YACC or RD, so I will settle for listing the advantages and disadvantages.

Good Things About YACC

Ease of Maintenance

I think the main advantage of using YACC is: lower life-cycle cost. After a number of years of maintenance by a series of authors, it will obviously be helpful that the YACC grammar is much smaller textually and laid out in BNF-like precision. It's easy to find any construct and understand it, and it's sometimes easier to tweak it or insert new syntax. If new time-pressed maintainer Joe Codebanger comes along, he can pop a fairly small file into his editor, see the grammar, and hack away.

Of course, one doesn't often change the grammar...

Easy Error Handling

You get really easy error handling in YACC: you hit a syntax error, the parser fails and returns an error status. (Actually, it executes one of your error productions that it chooses randomly .. well, not really, but it seems like that.)

The problem is, it's really bad, easy, error handling.

Good Things About RD

Implementation Effort

Believe it or not, the RD parser can be quicker to write.

It's generally thought that using YACC makes it much easier and faster to write parsers. I would submit that no one who seriously says this has ever done a large parser both ways. In the old days, when YACC was written, it was uncommon for programmers to produce many thousands of lines of code on a short schedule. If you could do that, you were a wizard. But today, everyone really ought to be able to do that. (What's different now? Beats me .. that would be a good essay subject for old-timers though.)

Anyway, writing code isn't actually hard, it's writing hard code that's hard. It's debugging difficult problems with limited visibility that's hard. RD parsers are easy. Dealing with YACC wackiness and the inevitable shift/reduce conflicts is hard. LALR(1) isn't a terribly powerful class of grammars, and the disambiguating rules actually differ between YACC versions and platforms. (Yes, the same YACC grammar might work on system A and not on system B.) Bison even has a kludge statement where you specify the number of shift/reduce conflicts you expect. If the number comes out different, you get an error return. Older systems would often have the Makefile print the number of shift/reduce conflicts that was considered normal. Gee, if I change the grammar and get more, or fewer, do I just change the expect number?

What typically happens is that the initial YACC grammar is quickly banged out, but then the project bogs down in grammar conflicts and weird bugs where, for no obvious reason, the system isn't doing what you want. In an RD parser, it's essentially impossible to get stuck on a parse problem; with a little bit more code, arbitrarily conflicting grammars are easily parsed. Furthermore, you will always get exactly what you ask for .. but in YACC, the generated parser decides.

More Predictable Implementation Effort

With RD, at least you know exactly what you are in for when you start. You have to write a bunch of code, but it's easy, and when you are done, it works. With YACC, you don't have to write nearly as much, but you have absolutely no clue how long it will take. Maybe less, maybe more. Maybe you will never get it to work. It's a bit like driving: would you rather be stuck in traffic for 9 minutes of a 10-minute trip? Or would you rather take a 15-minute trip with no traffic jam? And, oh yes, there is some chance you would be sitting behind that wrecked car for hours, not just for 9 minutes. Your YACC parser has 12 shift/reduce conflicts and your test cases work. Are you done? Beats me.

Much Better Error Messages

Error recovery is more work in RD, because you have to either buy into the exception racket, or constantly check for error returns, or do it Wirth-style with extra parameters that follow the descent chain down. But, on the other hand, you get arbitrarily great error messages and precision error recovery. You can state exactly what's wrong with the input if you want; you could even fix the input for the user and report exactly what your change was.

In fact, you could even write the fixed program back, but no one does this, perhaps because all the parsers are now written in YACC or a YACC descendant.
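To make the contrast concrete, here is a minimal sketch (mine, not taken from any real compiler) of a recursive descent parser for toy expressions like 1+2*(3+4). Every parsing function knows exactly where it is and exactly what it expected, so precise messages cost almost nothing:

    /*
     * Minimal recursive descent parser sketch: one function per
     * grammar rule (expr -> term {+ term}, term -> factor {* factor},
     * factor -> number | ( expr )), with precise error reporting.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <ctype.h>

    static const char *input;           /* text being parsed */
    static const char *cur;             /* current position */

    static void
    syntax_error(const char *msg)
    {
        fprintf(stderr, "syntax error at column %ld: %s\n",
            (long)(cur - input) + 1, msg);
        exit(1);
    }

    static long parse_expr(void);

    static long
    parse_factor(void)
    {
        long v;

        if (*cur == '(') {
            cur++;
            v = parse_expr();
            if (*cur != ')')
                syntax_error("expected ')' to close parenthesis");
            cur++;
            return v;
        }
        if (!isdigit((unsigned char)*cur))
            syntax_error("expected a number or '('");
        for (v = 0; isdigit((unsigned char)*cur); cur++)
            v = v * 10 + (*cur - '0');
        return v;
    }

    static long
    parse_term(void)
    {
        long v;

        for (v = parse_factor(); *cur == '*'; )
            cur++, v *= parse_factor();
        return v;
    }

    static long
    parse_expr(void)
    {
        long v;

        for (v = parse_term(); *cur == '+'; )
            cur++, v += parse_term();
        return v;
    }

    int
    main(int argc, char **argv)
    {
        long v;

        input = cur = (argc > 1) ? argv[1] : "1+2*(3+4)";
        v = parse_expr();
        if (*cur != '\0')
            syntax_error("unexpected trailing input");
        printf("%ld\n", v);
        return 0;
    }

A production parser would tokenize first and recover rather than exit, but the shape is the same, and there is never a conflict to resolve: if two constructs collide, you just look a little further ahead.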

Final Thoughts

It may be that in the long run, YACC is better, but from the point of view of the original author, RD is better. Guess what? The original author often gets to choose...and what if we include the life-cycle costs of the pathetic error messages and brute-force-and-ignorance error recovery that YACC parsers always seem to feature?

The IEEE 754 Floating Point Standard

What's Wrong With IEEE 754 Floating Point?

Many people in the circles I frequent simply assume that a standard must have had all the right inputs and must be much better than what any real-world computer manufacturer would have come up with on their own. I disagree, and here is why.

  1. The standard was controversial at the time.

    At the time - 20 years ago - all of the world's scientific computing had been done on Cray, VAX, IBM, and CDC computers. Of these companies, only the manufacturer of the VAX sent a committee member.

    There is a perception that only work done recently in computers matters. This may be true in systems -- the old kernels and user environments were awful, except for unix -- but it's not true in the number-crunching racket. There, most of today's codes actually date back to the older eras.

    Quite a bit of advanced work had been done in aerodynamics, finite-element analysis, computational fluid dynamics, and many other fields. The state of the art in floating point support was not in any way primitive, yet none of those companies, except DEC, sent a committee member. The standard only applied to microprocessors, and at the time, they were useless for this kind of work.

    The one committee member from DEC led the opposition to the standard.

  2. Serious opposition and debate were stabbed in the back when Professor Kahan stacked the committee with his own graduate students, and with his former students who were by then employed by microprocessor companies with committee representatives.

  3. A fundamental confusion over domain. The justifications for 754 floating point often involve mathematical identities that are preserved when division by zero or other strange operations are allowed to produce infinities. By extension, then, it is argued that 754 arithmetic is on a better mathematical foundation.

    But in fact, computerized floating point arithmetic is no more about internally consistent theoretical mathematical systems than a line on a drafting board is a mathematical line. Both are simply tools used to solve problems, and the line on the paper is only in a vague way related to a line in algebra.

    If the line hits the edge of the paper, it is acceptable for the engineer to start over with a different scale factor. Likewise, when computerized arithmetic overflows the supported number formats it is acceptable for the program to trap and reject the attempted operation.

    Very little can be accomplished by stretching the piece of paper. Likewise, little is gained by the small band-aids that IEEE 754 applies to computerized arithmetic. It remains only a distant approximation to any abstract real number system, and all of the IEEE 754 features change that in only small ways.

  4. It is expensive to implement. When the standard was ratified, exactly two companies produced floating point arithmetic chips that attached to microprocessor CPU chips. Intel sold the 8087 and National Semiconductor sold a chip for the NS32 processors.

    Here is how they compared on multiply:

    Example two, about 17 years later...

    Now almost all computing is done on microprocessor based systems. Intel is still the only vendor that implements 754 in hardware. The DEC Alpha is a very 754-hostile microprocessor that, like the National NS32 17 years earlier, implements virtually none of 754 in hardware. (Both claim 754 compatibility because they use the same exponent widths and because tons of kernel trap handling software can in theory handle all of the traps, but for the most part neither chip is ever used in that mode other than to demonstrate conformance, and I'm not aware that the NS32 ever did that, even if it was theoretically possible.)

    Here is how they compared on SpecFP95:

    Today, Intel has broken into double digits and claims some results in the 15-30 range, but the Alpha is now into triple digits.

  5. The targeted users have ignored it for 20 years. Supposedly, the target audience is the number crunchers. Yet these users have never used the standard, even 20 years after the stacked committee of W. Kahan's graduate students forced its adoption over the objections of the minority of independent committee members.

  6. No RISC microprocessor has ever implemented IEEE 754 floating point. Since the RISC processors were always by far the leaders in floating point performance, the fact that for twenty years they never implemented the standard is telling. (They generally use the IEEE format, and there is usually a software-fixup technique that allows them to claim complete conformance. No one uses them in that mode, however. It's way too slow.)

  7. Buyers and manufacturers of scientific (number crunching) computers are often IEEE-754 hostile. They are typically neutral toward the standard or actively dislike it. It's obviously not a selling point, or the computers made for these users would not universally scorn the standard.

  8. Much of the standard has not been used for 20 years, even on the PeeCee HW that provides all of the features. I would argue that if a standard is not being used, then it is very wrong to implement it in hardware. This is true regardless of the merits of the standard ... if it isn't used, it's just empirically stupid to build hardware to accelerate it. How do I know it isn't used? It couldn't have been (and Kahan even admits this) because of the problem described in the next item.

  9. The standard went 20 years without an API! Much of the standard revolves around the flags and the exception masks. These all need to be set and cleared by programs in order to be of any value, yet for the first 20 years there was never a standardized API for them, so no portable program could use them. (C99 ratified a proposal submitted by, who else, W. Kahan, that finally defines an API; a sketch of that interface appears after this list.)

  10. The exception mechanism is incompatible with modern kernels and computing hardware. It was never possible to quickly take exceptions, and this has become relatively slower over the years as all kernels have adopted heavy-weight mechanisms such as virtually mapped and protected user environments. The standard's trap mechanism is not terribly useful as a result.

  11. The standard did not address any contemporary concern from the number-crunching community. The motivation for this standard came entirely from the mind of the committee chairman: W. Kahan.
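
As an aside to item 9: the API that finally arrived is C99's <fenv.h>. Here is a minimal sketch of the compute-first, check-flags-later style it permits (my illustration, not anything from the standard itself; it assumes a C99 hosted environment, and on some systems the fe*() functions live in libm):

    /*
     * Sketch of C99 <fenv.h> flag checking: clear the flags, do the
     * arithmetic, then ask what went wrong instead of testing ranges
     * up front.
     */
    #include <fenv.h>
    #include <stdio.h>

    #pragma STDC FENV_ACCESS ON

    int
    main(void)
    {
        volatile double x = 1.0, y = 0.0;   /* volatile: defeat constant folding */
        double r;

        feclearexcept(FE_ALL_EXCEPT);
        r = x / y;              /* raises FE_DIVBYZERO, yields +inf */

        if (fetestexcept(FE_DIVBYZERO))
            printf("division by zero occurred (r = %g)\n", r);
        if (fetestexcept(FE_INVALID | FE_OVERFLOW | FE_UNDERFLOW))
            printf("other exceptions were raised\n");
        return 0;
    }

Whether this style is a good idea in practice is taken up below, under What's Good About IEEE 754 Floating Point.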

What's Good About IEEE 754 Floating Point?

I tried to say something good about IEEE 754 Floating Point here, but I realized that it is actually the worst item of all.

  1. Continuing computations after errors does have a certain value. It is possible to attempt a computation and then check for overflow, inexact, whatever you want. While some of these features impose a price unacceptable to many in terms of speed of execution, it is a useful model in that programs need not be slowed down with explicit range checks and programs do not need exception mechanisms that are not always portable.

    Unfortunately, the net result of this new behavior in practice is something much darker than a useful new program organization and a more convenient error checking paradigm. No, instead what happened -- and with 20-20 hindsight this seems inevitable -- is that people aren't finding the stupid divide-by-zero bugs any more.

    Is this really a big problem? It's bigger than you might think. There are 2,000 programs in The NetBSD Packages Collection. Many of these programs use at least a little bit of floating point, often for noncritical functions such as statistics or image conversion.

    Almost every last one of these is developed or is maintained on the PeeCee. Consequently, it isn't very important for those developers to fix the stupid math bugs, especially if it just affects the statistics in cycle zero, or the color chosen when the intensity bits are zero, or whatever. The problem here is that the breakage just isn't serious enough, when you do have error-ignoring IEEE 754 floating point.

    As a result, at any given time many of these packages are broken on a system that doesn't implement the IEEE 754 floating point standard. When run on such a system, the program traps and stops. The fact that the trapping routine didn't even matter in the case at hand is now of no help. While one can argue that it's just great that 754 enabled the buggy program to run, before 754 this kind of error was immediately caught and corrected.

    So I conclude that 754 is a virus, infecting individual programs, and making them unable to run on non-IEEE-754 hardware.


The Theory and Practice of Code Sweeps

There are a few developers who spend their NetBSD development time doing code sweeps. As everyone knows, these are done to clean up the code for some -Wthis or -Wthat gcc option, or to make some section of code lint(1)-clean.

Good Things About Code Sweeps

The code sweep concept is full of grey areas. There certainly is some really awful code out there. I can fully understand the desire of the code sweepers to clean this up. However, code sweeping isn't as easy as it sounds. While everyone has the general idea that lint-clean == good, actually understanding why lint complains, or why the nonconforming code is bad, or even what conforming code is and how to get it ... these things aren't so easy.

Why is there bad code in the first place? The problem, I think, is that few people even attempt to understand the actual C standards. Many standards are badly done, as in fact the previous essay on this page complains. But C89 and C99 are well written standards. (Another nicely done standard is PCI.)

Not only do people not know what conforming code is, they don't know why the rules are there in the first place. Interestingly enough, conforming code is always portable code. That's one of the reasons for the rules. The most common confusion seems to be between what C allows you to do with expressions, and what C requires you to do with data. Because the compilers are required to accept all C expressions, they must compile programs even when those programs violate the rules on using data. Believe it or not: all C standards require you to dereference pointers and access data in the data's original type or as char *. Type punning is simply prohibited, except when the new type is char *.

You are allowed to legally change the type of pointers, you can move pointers around using memcpy(3), you can store pointers in scalars, you can write them to files, you can move them around as void *, you can do arithmetic on them and even make them up in scalar expressions. But when you finally dereference one, it has to point to a thing of its type, except for char *, which is exempted.

That one rule means that the compiler has control of all data alignment. Imagine a data file or network protocol that streams together many objects of disparate types. The obvious and wrong programming approach involves casting a pointer into a buffer to first one type and then another in order to process the stream. This breaks all sorts of rules and is nonportable, yet it's what most people do.

The alternative almost no one uses, but which is about the only conforming approach, is to declare objects of the various types and then memcpy(3) sections of the stream into these objects before accessing them. I don't care whether you like that approach or not, or whether you think it's ugly. It's the only conforming way, and anyway memcpy(3) is a built-in almost everywhere now. Hint: make the memcpy(3) destination a union and your code may even look beautiful, appear to play all kinds of games with types, but in fact be perfectly conforming and portable.
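
Here is a minimal sketch of that approach (the record layout -- a 32-bit length followed by a double, in the host's native representation -- is invented for the example):

    /*
     * Conforming way to pull typed objects out of a byte stream:
     * memcpy(3) the bytes into properly typed, properly aligned
     * objects, instead of casting a char pointer to some other
     * pointer type and dereferencing it.
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    struct record {
        uint32_t len;
        double   value;
    };

    /*
     * The obvious and wrong version, for contrast, would be
     *     r.len = *(const uint32_t *)buf;
     * which may fault on alignment and violates the typing rules.
     */
    static struct record
    decode(const unsigned char *buf)
    {
        struct record r;

        memcpy(&r.len, buf, sizeof(r.len));
        memcpy(&r.value, buf + sizeof(r.len), sizeof(r.value));
        return r;
    }

    int
    main(void)
    {
        unsigned char wire[sizeof(uint32_t) + sizeof(double)];
        struct record in = { 42, 3.14 }, out;

        /* Build the "stream" the same way: memcpy, not casts. */
        memcpy(wire, &in.len, sizeof(in.len));
        memcpy(wire + sizeof(in.len), &in.value, sizeof(in.value));

        out = decode(wire);
        printf("len=%u value=%g\n", (unsigned)out.len, out.value);
        return 0;
    }

Making the memcpy(3) destination a union of the possible record types, as suggested above, works the same way: the compiler still controls layout and alignment, and the bits still arrive through memcpy(3).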

You can extrapolate from this into a way to discriminate between breaking rules foolishly and breaking rules reasonably. It doesn't make sense for me to drastically exceed the speed limit when I drive into work over California 154, a narrow, twisting mountain road. Although I work at sea level in California, there is occasionally snow on top of San Marcos Pass, the route of California 154. On the other hand, if I take the slightly longer way around on US 101, a nice, straight multi-lane US highway traversing the much lower Gaviota Pass, I can go 100 MPH or more without risking running off the road.

Both approaches break the speed limit and are prohibited, but the second still makes far more sense than the first. If you want to change the type of something for a good reason: go ahead and break the rules; C almost comes out and specifies that even programs which break the rules must compile. (To restate the observation above: the compiler would have to prohibit a legal expression in order to prohibit an illegal data conversion, so it can't realistically happen.) But break the rules by using memcpy(3) or char *, and in such a way that when you do finally dereference or otherwise use the data, it is in an object of the appropriate type, allocated somewhere by the compiler with the correct alignment, and make sure the bits got there via memcpy(3) or some char * manipulation.

Bad Things About Code Sweeps

The first problem is obvious; the second problem is serious.
  1. It doesn't support any new devices or give us any new features. It doesn't help us get Java support, it doesn't support any new protocols, it doesn't enable or provide any new graphics or games. It doesn't make the system easier to install or better documented. It doesn't make anything run faster.

    What's really sad is: it seems to add at least as many bugs as it fixes. It may be that it adds way more bugs than it fixes: we have no way of knowing. Perhaps it fixes more than it inserts. I doubt it.

  2. They are almost never done right.

    The problem is that, despite the protests of the sweepers, they end up making the following priority decision: having untested but linted code is more important than having tested but unlinted code.

    Now, if you actually were to put that in words and ask them to admit it, no one would agree. But when they commit untested code, in the rare cases that someone complains, the complainer is invariably chastised for causing trouble, and he is assured that ``I tested it as well as I could.''

    The philosophy seems to be: yes, it is critical to test, but ``in this case, I didn't have that device, so I couldn't,'' or ``I didn't have any way to get the program to execute that case.''

    What these excuses really boil down to is: it compiled on i386. But the way it is usually phrased, in the rare cases where someone dares to question the bad decisions, is: ``I tested it as well as I could.''

    This is the king of all phony excuses. Let's analyze this. What is being said is:


    1. If it was easy to test, I did test.
    2. But if I didn't have the device, I did not test, but I checked in the mod anyway.

    In other words: a priority ordering was in fact made that concluded that having untested linted code was more important than having tested unlinted code. The priorities came out ordered like this: (A) linted, tested code, when testing was easy; (B) linted, untested code, when testing was hard; (C) linted, tested code, when testing was hard, was the lowest priority, because it was harder to arrange for, and so it didn't happen.

    Do I really have to spell out how stupid this is?

    Well, maybe I do! The most completely innocuous changes can and will cause failures from time to time, usually by uncovering a separate, latent bug. Even if you do know your change is correct by inspection, that's not good enough.

I think one problem with sweeping through the system editing files is that people see certain expert developers do that for particular reasons and feel that as a general practice it is then OK.

When I tried out the arguments from this essay on someone recently, he said ``well, Joe Turbodeveloper does that too, when he changes kernel interfaces on every port''.

My response to that was:

  1. Interfaces change for functional reasons. If that has to happen, there is no other way to do it. But a lint change is nonfunctional: you have the option of not checking it in, and everything will continue to work; no progress is held in check.

  2. Joe Turbodeveloper does test the interface changes on a couple of architectures, then the same change is made on the others.

  3. If you are Joe Turbodeveloper, the rules are different. Most of us cannot safely work like that, because we are not Joe.




How Contracts and Licenses Really Work

Contract law comes from English common law, so it's actually much the same in the UK, Australia, the US, and a few other places. It's not written down in statutes (`common' vs. `statutory') but rather in the history of decisions over many centuries. Many states in the US specifically refer to `the Common Law of England and her Colonies' or some such thing, but it doesn't matter if they say it or not.

Invariably, a contract (say, a license) is a “memorandum of the understanding”. That is, it is the written record of the actual understanding between the parties, and not the actual definition of the understanding.

Most people don't understand this; they think that only the written words are important, and that they represent the “real” or “core” agreement. Software types are particularly prone to misunderstand this, because they are used to dealing with invariant technicalities: it certainly makes no difference what you intended a program to do, only what it actually does.

But with law, it's the opposite. For example, if parties A and B make an agreement, and party A drafts the contract and inserts into the middle of it a clause that effectively reverses the deal, that clause will be thrown out by every court in most countries, and certainly by every court in the countries I listed. In fact, one principle of law is that courts interpret ambiguities and contradictions in favor of the party that did not draft the contract. This is a consequence of the fact that the contract itself is just the written record of the real intention and understanding. It is the understanding between the parties that is what's important, not the actual written words of the contract or license.

It's still important to get the wording right, of course. For one thing, everyone generally agrees on what the written words are. For another, contracts stated in writing, while no more valid in principle than oral contracts, are not as easily disavowed or creatively reinterpreted. The parties with well-written contracts tend to avoid needing to appear in court at all. Sometimes, the only thing you have to work with is the written words. But it's important to never forget that the contract itself is just the written record of the true agreement.

So, when people who read a license or contract ask “what did you intend?”, they are really asking a reasonable question .. i.e., what is it you want? One can't simply refer them back to the written agreement for clarification, since it is the understanding of the parties to the contract that is the primary source of authority for the details.

Although it's very good practice to get the wording precise and correct, the agreement is the understanding, not the technicalities of the wording.

Example: This is a case in statutory, criminal law, and not common law, but it illustrates the way courts apply rules.

Several years ago, OpenBSD moved crypto code from the USA to Canada, then felt they could export it freely. They thought the technicalities that “the first hop is legal” and “the second hop is not covered by US law” would protect them.

Actually, that's not true at all; many people have been put in jail for violating US export restrictions via that exact step .. USA to Canada, Canada to elsewhere. An interesting case of exactly this was that of master artillery engineer Gerald Bull, where the export went USA .. Canada .. South Africa. Gerald Bull spent a year or two in Club Fed for that.

Anyway, technicalities are no defense. The court just said “started in the US, ended in South Africa, guilty.” It was only the intention of the exporter (and of the regulation) that mattered. In the case of software exports, the courts were very suspicious of the BXA regs, djb beat them with his EFF-backed defense, and no one at OpenBSD ever paid for their wrongdoing. But it was clearly illegal, nonetheless. That doesn't mean you shouldn't do it; I see no moral or ethical conflict per se with defying a State, but usually other people are involved, so there are ethical issues with exposing them to prosecution.

It's important to realize that OpenBSD crypto exporters within reach of USA courts could easily have been convicted if anyone cared and if the regs themselves had withstood the constitutional test. That is, they won simply because (1) they were under the radar, no one cared about a few BSD exports because the regs were written to stop Microsoft from equipping every Windows PC with encryption; (2) the regs were unconstitutional and (3) they were ultimately withdrawn. But it had absolutely nothing to do with any technicalities the BXA regs may have had in their application.

In fact, the popular picture of defendants being let off on technicalities by liberal judges is entirely false. In most jurisdictions, well over 90% and often even over 95% of all defendants are convicted. Basically, you go to court, you get convicted. It's best, it would seem, to stay out of court.



