For anyone who hasn't browsed through Peter Seibel's "Coders at Work," one of hi...

foobarbazqux · on June 2, 2013

Actual song from PLDI'07 where she received the Turing Award, which we all sang to the tune of Take Me Out to the Ball Game:

  Let’s all sing to Fran Allen
  For the great things she’s done.
  PTRAN and Blue Gene and E C S
  Fran, we’ve gathered to toast your success.
  As we ponder all you’ve accomplished,
  Our colleague extraordinaire,
  Here’s to you, Fran Allen, you’re truly beyond compare

  Optimizing compilers
  Parallel transforms too
  Intervals, call graphs, and data flow
  Keep our programs from running too slow
  So we root, root, root for Fran Allen
  Her heart, and spirit, and voice
  For she’s won the Turing Award and we all rejoice!

She said that she believes functional languages and programs are key for scalable parallelism on massively multicore processors, because of the unavoidable performance problems associated with synchronization on shared data dependences in imperative languages.

The back story you quoted about C motivates her position a little, so thanks. It seemed like few of the hardcore compiler people that I spoke to at the conference seriously believed functional languages were going to be the future because of performance reasons, although personally it doesn't seem like an entirely unreasonable proposition.

tracker1 · on June 2, 2013

Using functional paradigm with immutable types does take care of a lot of the complexities of parallel computing... It allows you to make a lot of assumptions (from the underlying platform's perspective) and can break up work in very interesting ways, not just across cpu cores, but even in distributed computing environments. If you combine this with a scriptable language, you can even carry the workload with the code to operate on the load.. which takes things a bit farther. You could create a literal farm of workers that grab a bit of data, process it against the defined load, and then return the result with the next step to run against the data...

I've wanted to create such a system for a while.. right now the closest I could come up with would be to use nodejs with json and a message queue to handle requests/loads. Unfortunately, there's a pretty serious cost to the json serialization, and other issues... but it's an interesting idea that has merit. I think in the end such a system will have a quickly serializable binary expression of both the data and the work to be done.

The catch is that most developers I know aren't used to breaking work up in such a way that it could be very parallelized.

sirclueless · on June 3, 2013

The kicker for me that makes me think that working too hard to make easily parallelizable constructs isn't worth it is that communication is expensive and single cores are actually pretty darn powerful on their own. The implication is that you don't want to be parallelizing at too low a level. You want to be doing it at a much higher level if you possibly can.

If you have an 8-core machine and thirty independent tasks at the top-level, then all the functional programming research and parallel algorithm wizardry in the world isn't going to make you want to parallelize anything except the top-level tasks. So even if that's a verbose and error-prone task due to procedural programming constructs, at least you only need to do it once. Heck, in my own (admittedly limited) experience 90% of the hard work I have needed done has been handled by my OS's process scheduler and Redis.

The fact is that business applications, the web, and consumer software are all perfectly happy to accept the current limitations of hardware -- 12 processes running on 4 cores is efficient and easy, even with no parallelization at all below the OS process level (with the possible exception of GUI threads in desktop applications).

The only really interesting work happens in areas like HPC where you have 1 task and 900 cores and if you can't parallelize you are dead in the water, and latency-critical applications where the sacrifices made to run intra-task parallelization pay big dividends.

tracker1 · on June 3, 2013

I think with the increasing availability of more, weaker CPUs with a decent thermal envelope, and efficient processors like ARM, then distributing work makes even more sense... in a SOA system, where a lot of processing may well be IO bound, why not break out the load even more... I think the future will be thousands of compute nodes/workers along with hundreds of service nodes handling millions of simultaneous requests.

Functional constructs make such scaling nearly effortless once the practical issues of breaking up work are handled. Yes, communications has a cost, but there are faster channels available than what are used... and distributing data persistence into redundant, sharded clusters can yield a lot of other benefits.

Given, most line of business applications are fine on current hardware... the problem is scaling to 10x or 100x the workload. You can do this by creating a system that can scale horizontally, or to be more performance oriented with current hardware. One solution gains you a single generation of increased output... another gets you N scale expansion.

I don't think it's just macro parallel tasks... but many-micro tasks that can work within such a system.

_pmf_ · on June 3, 2013

You are missing the per-CPU parallelization via vectorization that lies at the core of every modern video decoding library. The conundrum with C is that it is one of the very few languages that allow (via assembler callouts) to directly use CPU vecorization instructions, but at the same time does so in a manner that absolutely prevents automatic optimizations (i.e. by observing data dependencies).

habitue · on June 2, 2013

I think she might have given up on programming a bit prematurely. The pendulum has obviously swung completely the other way with high level languages like Haskell pushing forward compiler optimization, and JIT VMs pushing forward in other directions. It's actually an exciting time for "smart compilers".

irahul · on June 2, 2013

> I think she might have given up on programming a bit prematurely. The pendulum has obviously swung completely the other way with high level languages like Haskell pushing forward compiler optimization

C compilers were optimizing long before Haskell. From her interview, I don't understand why she couldn't work on optimizers even if someone else advocated optimization being programmer's repsonsibility?

habitue · on June 2, 2013

> C compilers were optimizing long before Haskell

Right, and they've come up with some excellent tricks too. But the reason she felt optimization wasn't going to progress as far is because C is a lower level language than some of the other languages out at the time, and there are necessarily less tricks you can do in a lower level language because you have to infer the intent of the programmer more, and rely on optimizing idioms etc, rather than optimizing actual constructs of the language.

How does a compiler optimize a single "goto"? There isn't much it can do unless, for instance, it notices that the goto is found in an idiomatic pattern that results in a loop. Then it can make a decision whether to unroll the loop or not. If the language gives you the loop construct, it can skip the "recognize the idiom" step (and the associated risk of guessing wrong), and go right to optimizing loops. Similarly, in higher level languages than C, the programmer can express their intent more directly, and therefore the compiler can take less risks when guessing "Ah, I see what you're trying to do, here's the fastest assembly that accomplishes that"

waps · on June 3, 2013

To convince C users why Haskell has the potential (but currently only potential) to optimize better, it is best to just point to one example that C can never hope to optimize : deforestation.

http://book.realworldhaskell.org/read/profiling-and-optimiza...

C will never be able to do that. Before optimization : the programmer requests a list to be created, fill it by calling functions and then passes the completed datastructure list (or tree) along to another function, which executes commands according to what the list contains.

After optimization there is no more list. Instead the function the list is passed to will call a generated function that generates exactly the needed elements of the tree just-in-time. Result: no list, no memory (aside from 1 element on the stack), no allocation, no clearing of memory afterwards.

Of course the downside is that it's very tempting (and encouraged) to write programs that don't contain these optimizations you'd have to do manually in C/Java/... and just have them run. What you'll miss as a Haskell programmer is that the program is effectively dependent on those optimizations for it's complexity (for example: optimized program is O(n), program as written is O(n^n). Then you insert what looks like a tiny change, say, sorting the list, which prevents optimization from happening and boom, your binary switches from O(n) to O(n^n). All tests will obviously pass, yet your boss is unlikely to be happy ... At this point it is extremely hard to figure out what just happened)

1123581321 · on June 2, 2013

I believe she worked on optimizers her entire career as a logician, scientist and manager rather than as an implementing programmer.

http://www-03.ibm.com/ibm/history/witexhibit/wit_hall_allen....

http://amturing.acm.org/award_winners/allen_1012327.cfm

anuraj · on June 3, 2013

There is no pendulum swing - Haskell remains a niche - C and its off springs are still mainstream. Have you ever checked statistics?

http://www.tiobe.com/index.php/content/paperinfo/tpci/index....

http://www.langpop.com/

C still remains the only accessible low level language to do systems programming. C compilers have been optimizing for long.

mc-lovin · on June 2, 2013

While I'm very sympathetic to Allen's viewpoint, it seems that there was, and still is, a large class of problems where C-like languages (where the programmer does the optimizations) are better than high level languages with optimized compilers.

The best example for her case would be Fortran, where the language is both higher level, and faster, because the compiler can make much more assumptions (my understanding is the restrict keyword somewhat evens the playing field with C, but that is kind of a hack).

However, plenty of numerical work is also done in C, in spite of Fortran's availability.

munin · on June 3, 2013

writing optimizers in C is much harder than in Fortran because of reduced abstractions and reasoning about memory and pointers. in Fortran they were working on making automatic multi-threaded optimizations where you would write Fortran and it would auto-parallelize into multiple threads ... before C was invented.

C did kill a lot of work on making computers easier to program and safer. it was a devils bargain for speed and we paid for it with decades of crappy code with buffer overflows and shared state race conditions. it's tricky to know if it was worth it.