For anyone who hasn't browsed through Peter Seibel's "Coders at Work," one of his subjects is Fran Allen. It's kind of funny, because I do agree that learning C has been valuable to the high-level programming I do today (but only because I was forced to learn it in school). But there's always another level below you that can be valuable. Allen says C killed her interest in programming, not because it was hard, but because, in her opinion, it led engineers to abandon work in compiler optimization (her focus was high-performance computing):
Seibel: When do you think was the last time that you programmed?
Allen: Oh, it was quite a while ago. I kind of stopped when C came out. That was a big blow. We were making so much good progress on optimizations and transformations. We were getting rid of just one nice problem after another. When C came out, at one of the SIGPLAN compiler conferences, there was a debate between Steve Johnson from Bell Labs, who was supporting C, and one of our people, Bill Harrison, who was working on a project that I had at that time supporting automatic optimization...The nubbin of the debate was Steve's defense of not having to build optimizers anymore because the programmer would take care of it. That it was really a programmer's issue....
Seibel: Do you think C is a reasonable language if they had restricted its use to operating-system kernels?
Allen: Oh, yeah. That would have been fine. And, in fact, you need to have something like that, something where experts can really fine-tune without big bottlenecks because those are key problems to solve. By 1960, we had a long list of amazing languages: Lisp, APL, Fortran, COBOL, Algol 60. These are higher-level than C. We have seriously regressed, since C developed. C has destroyed our ability to advance the state of the art in automatic optimization, automatic parallelization, automatic mapping of a high-level language to the machine. This is one of the reasons compilers are ... basically not taught much anymore in the colleges and universities.
Actual song from PLDI'07 where she received the Turing Award, which we all sang to the tune of Take Me Out to the Ball Game:
Let’s all sing to Fran Allen
For the great things she’s done.
PTRAN and Blue Gene and E C S
Fran, we’ve gathered to toast your success.
As we ponder all you’ve accomplished,
Our colleague extraordinaire,
Here’s to you, Fran Allen, you’re truly beyond compare
Optimizing compilers
Parallel transforms too
Intervals, call graphs, and data flow
Keep our programs from running too slow
So we root, root, root for Fran Allen
Her heart, and spirit, and voice
For she’s won the Turing Award and we all rejoice!
She said that she believes functional languages and programs are key for scalable parallelism on massively multicore processors, because of the unavoidable performance problems associated with synchronization on shared data dependences in imperative languages.
The back story you quoted about C motivates her position a little, so thanks. It seemed like few of the hardcore compiler people that I spoke to at the conference seriously believed functional languages were going to be the future because of performance reasons, although personally it doesn't seem like an entirely unreasonable proposition.
Using a functional paradigm with immutable types does take care of a lot of the complexities of parallel computing. It allows the underlying platform to make a lot of assumptions and to break up work in very interesting ways, not just across CPU cores, but even in distributed computing environments. If you combine this with a scriptable language, you can even ship the code along with the data it operates on, which takes things a bit farther. You could create a literal farm of workers that grab a bit of data, process it against the defined load, and then return the result with the next step to run against the data.
I've wanted to create such a system for a while. Right now the closest I could come up with would be to use Node.js with JSON and a message queue to handle requests/loads. Unfortunately, there's a pretty serious cost to the JSON serialization, and there are other issues, but it's an interesting idea that has merit. I think in the end such a system will have a quickly serializable binary expression of both the data and the work to be done.
The catch is that most developers I know aren't used to breaking work up in such a way that it could be very parallelized.
The kicker for me that makes me think that working too hard to make easily parallelizable constructs isn't worth it is that communication is expensive and single cores are actually pretty darn powerful on their own. The implication is that you don't want to be parallelizing at too low a level. You want to be doing it at a much higher level if you possibly can.
If you have an 8-core machine and thirty independent tasks at the top-level, then all the functional programming research and parallel algorithm wizardry in the world isn't going to make you want to parallelize anything except the top-level tasks. So even if that's a verbose and error-prone task due to procedural programming constructs, at least you only need to do it once. Heck, in my own (admittedly limited) experience 90% of the hard work I have needed done has been handled by my OS's process scheduler and Redis.
The fact is that business applications, the web, and consumer software are all perfectly happy to accept the current limitations of hardware -- 12 processes running on 4 cores is efficient and easy, even with no parallelization at all below the OS process level (with the possible exception of GUI threads in desktop applications).
The only really interesting work happens in areas like HPC where you have 1 task and 900 cores and if you can't parallelize you are dead in the water, and latency-critical applications where the sacrifices made to run intra-task parallelization pay big dividends.
I think that with the increasing availability of more, weaker CPUs with a decent thermal envelope, and efficient processors like ARM, distributing work makes even more sense. In an SOA system, where a lot of processing may well be IO bound, why not break out the load even more? I think the future will be thousands of compute nodes/workers along with hundreds of service nodes handling millions of simultaneous requests.
Functional constructs make such scaling nearly effortless once the practical issues of breaking up work are handled. Yes, communication has a cost, but there are faster channels available than the ones in common use, and distributing data persistence into redundant, sharded clusters can yield a lot of other benefits.
Granted, most line-of-business applications are fine on current hardware. The problem is scaling to 10x or 100x the workload. You can do this by creating a system that can scale horizontally, or by squeezing more performance out of current hardware. One solution gains you a single generation of increased output; the other gets you N-scale expansion.
I don't think it's just macro parallel tasks, but also many micro tasks, that can work within such a system.
You are missing the per-CPU parallelization via vectorization that lies at the core of every modern video decoding library. The conundrum with C is that it is one of the very few languages that let you use CPU vectorization instructions directly (via assembler callouts), but at the same time it does so in a manner that absolutely prevents automatic optimizations (i.e. by observing data dependencies).
I think she might have given up on programming a bit prematurely. The pendulum has obviously swung completely the other way with high level languages like Haskell pushing forward compiler optimization, and JIT VMs pushing forward in other directions. It's actually an exciting time for "smart compilers".
> I think she might have given up on programming a bit prematurely. The pendulum has obviously swung completely the other way with high level languages like Haskell pushing forward compiler optimization
C compilers were optimizing long before Haskell. From her interview, I don't understand why she couldn't keep working on optimizers even if someone else argued that optimization was the programmer's responsibility.
Right, and they've come up with some excellent tricks too. But the reason she felt optimization wasn't going to progress as far is that C is a lower-level language than some of the other languages out at the time, and there are necessarily fewer tricks you can do in a lower-level language, because you have to infer the intent of the programmer more and rely on optimizing idioms rather than optimizing actual constructs of the language.
How does a compiler optimize a single "goto"? There isn't much it can do unless, for instance, it notices that the goto is found in an idiomatic pattern that results in a loop. Then it can make a decision whether to unroll the loop or not. If the language gives you the loop construct, it can skip the "recognize the idiom" step (and the associated risk of guessing wrong), and go right to optimizing loops. Similarly, in higher-level languages than C, the programmer can express their intent more directly, and therefore the compiler takes fewer risks when guessing "Ah, I see what you're trying to do, here's the fastest assembly that accomplishes that."
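To make the goto point concrete, here is a minimal sketch in C of the same summation written both ways. The function names are made up for illustration. In the first version the loop exists only as a pattern of jumps that the compiler has to rediscover; in the second the loop is stated outright. In practice, optimizing compilers usually do recover simple loops like this one from the control-flow graph, so treat it as an illustration of the extra inference step rather than a claim about what any particular compiler emits.

```c
#include <stddef.h>

/* Loop expressed with goto: the compiler only sees labels and jumps,
 * and must recognize the idiom before it can apply loop optimizations. */
long sum_goto(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;
top:
    if (i >= n)
        goto done;
    total += a[i];
    i++;
    goto top;
done:
    return total;
}

/* Same computation with the language's loop construct: the bounds and
 * the induction variable are explicit in the source. */
long sum_for(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

Both functions return the same result for any input; the only difference is how much the compiler has to infer.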
To convince C users that Haskell has the potential (but currently only the potential) to optimize better, it is best to point to one optimization that C can never hope to perform: deforestation.
C will never be able to do that. Before optimization: the programmer requests that a list be created, fills it by calling functions, and then passes the completed data structure (list or tree) along to another function, which executes commands according to what the list contains.
After optimization there is no more list. Instead, the function the list was passed to calls a generated function that produces exactly the needed elements of the list (or tree) just in time. Result: no list, no memory (aside from one element on the stack), no allocation, no clearing of memory afterwards.
Of course, the downside is that it's very tempting (and encouraged) to write programs that leave out the optimizations you'd have to do manually in C/Java/... and just have them run. What you'll miss as a Haskell programmer is that the program effectively depends on those optimizations for its complexity. For example: the optimized program is O(n), while the program as written is O(n^n). Then you insert what looks like a tiny change, say, sorting the list, which prevents the optimization from happening, and boom, your binary switches from O(n) to O(n^n). All tests will obviously pass, yet your boss is unlikely to be happy. At this point it is extremely hard to figure out what just happened.
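For readers who mostly write C, here is a rough sketch of what doing that optimization by hand looks like. The example (summing the squares of 1..n) and the function names are invented for illustration, and returning -1 on allocation failure is just a convention assumed for this sketch. The first function builds the intermediate list and then consumes it; the second is the hand-fused version, which is roughly the shape a deforesting compiler aims to produce from the first.

```c
#include <stdlib.h>

/* Unfused: materialize an intermediate array of squares, then consume it. */
long sum_squares_with_list(int n)
{
    if (n <= 0)
        return 0;
    long *squares = malloc((size_t)n * sizeof *squares);
    if (squares == NULL)
        return -1; /* allocation failed (convention assumed for this sketch) */
    for (int i = 0; i < n; i++)
        squares[i] = (long)(i + 1) * (i + 1);   /* producer fills the list */
    long total = 0;
    for (int i = 0; i < n; i++)
        total += squares[i];                    /* consumer walks the list */
    free(squares);
    return total;
}

/* Fused by hand: each element is produced and consumed immediately,
 * so there is no intermediate array and no allocation. */
long sum_squares_fused(int n)
{
    long total = 0;
    for (int i = 1; i <= n; i++)
        total += (long)i * i;
    return total;
}
```

In C the programmer writes the second version directly and can see what it costs; the trade-off described above is that in Haskell you write something closer to the first and rely on the compiler to get you to the second.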
While I'm very sympathetic to Allen's viewpoint, it seems that there was, and still is, a large class of problems where C-like languages (where the programmer does the optimizations) are better than high level languages with optimized compilers.
The best example for her case would be Fortran, where the language is both higher level and faster, because the compiler can make many more assumptions (my understanding is that the restrict keyword somewhat evens the playing field with C, but that is kind of a hack).
However, plenty of numerical work is also done in C, in spite of Fortran's availability.
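A small sketch of the aliasing issue behind the restrict remark, assuming a C99 compiler; the function names are hypothetical. Fortran lets the compiler assume that array arguments being written do not overlap the other arguments. In C the compiler has to assume they might, unless the programmer promises otherwise with restrict.

```c
#include <stddef.h>

/* Without restrict: `out` might overlap `k`, so a store to out[i] could
 * change *k. The compiler must allow for reloading *k on every iteration,
 * which can also get in the way of vectorizing the loop. */
void add_scalar(float *out, const float *in, const float *k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] + *k;
}

/* With restrict: the caller promises that the three pointers do not alias,
 * so the compiler is free to load *k once and vectorize. Calling this with
 * overlapping arguments is undefined behavior. */
void add_scalar_restrict(float *restrict out, const float *restrict in,
                         const float *restrict k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] + *k;
}
```

The two bodies are identical; the only change is the promise in the signature, which is why it reads as a bit of a hack compared with a language where non-aliasing is the default assumption.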
Writing optimizers for C is much harder than for Fortran because of the reduced abstractions and the need to reason about memory and pointers. In Fortran they were working on automatic multi-threaded optimizations, where you would write Fortran and it would auto-parallelize into multiple threads ... before C was invented.
C did kill a lot of work on making computers easier to program and safer. It was a devil's bargain for speed, and we paid for it with decades of crappy code with buffer overflows and shared-state race conditions. It's tricky to know if it was worth it.
(Excerpted from: Peter Seibel. Coders at Work: Reflections on the Craft of Programming (Kindle Location 6269). Kindle Edition: http://www.amazon.com/Coders-Work-Reflections-Craft-Programm... )