Hacker News
Big changes to Erlang (joearms.github.io)
325 points by waffle_ss on Feb 1, 2014 | 133 comments


I love this:

"Key := Val Updates an existing Key The key must be present in the old map. If Key is not in the old map Erlang will complain loudly. Using two operations instead of one has several advantages: A spelling mistake cannot accidentally introduce a new key"

Lord knows how many times I accidentally created a new key with a spelling typo, and then later when I wanted to retrieve the data, I was like "What the hell is wrong? I know I put data in that key, so why is nothing coming back? I guess I have discovered a rare bug in the compiler that no one has ever discovered before."
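The two operations can be mimicked in any language; here is a rough Python sketch (the `upsert`/`update` helper names are made up, not an Erlang API) of why a separate update-only operation catches exactly this typo:

```python
# Rough sketch: Erlang's Key => Val may introduce a new key, while
# Key := Val must hit an existing one, so a spelling mistake fails
# loudly instead of silently creating a new key.
def upsert(m, key, val):
    out = dict(m)
    out[key] = val          # inserts or updates
    return out

def update(m, key, val):
    if key not in m:        # the "complain loudly" part
        raise KeyError("no such key: %r" % (key,))
    out = dict(m)
    out[key] = val
    return out

conf = upsert({}, "timeout", 30)
conf = update(conf, "timeout", 60)   # fine: the key exists
# update(conf, "timeuot", 60)        # typo -> raises KeyError
```

With only an upsert operation, the typo'd call would quietly add a `"timeuot"` key and the bug would surface much later, at read time.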


I also like how such rather large changes are being introduced in a language that has been in production for many decades. That is pretty remarkable.


I think Erlang is sort of unique in this way. Many languages go through a lot of changes when they are not all that popular, and then blow up in popularity and finally have a lot of users who are wary of Big Changes. Ruby, for instance, was around for years and years before it started to get popular because of Rails. Python, too, was a product of the early '90s, but as late as some Linux conference in... '99 or 2000 or thereabouts, I recall sitting at a BoF table with Guido and just a few other people, and the language not being that big a deal.

Erlang is still not really all that popular, but has always had a lot of Important Code in production thanks to Ericsson, which has made it more difficult to change even though there are some things that even the creators acknowledge as not particularly elegant.


It is funny how Erlang is one of those languages and runtimes that is used quite a bit without people knowing about it. Its reliability properties put it at the core of many services (Facebook chat, Amazon, phone gateways, etc.).

It is both a blessing and a curse. If a service or program doesn't crash and cause a lot of drama and headaches, it also doesn't get noticed in a certain way.

I forgot who mentioned it (one of the Erlang Factory talks, probably), but for example if you access the Internet via your smart phone, there is like a 30% chance that it goes through Erlang at some point (due to going through Ericsson's servers).


> I forgot who mentioned it (one of Erlang Factory talks probably), but for example if you access the Internet via your smart phone, there is like a 30% chance that it goes through Erlang at some point (due to going through Ericsson's servers).

Yes, there is probably a > 30% chance of that. The space that you are looking for is mobile packet core, and it is dominated by Ericsson, Cisco (via the Starent acquisition), and to a lesser degree by ALU, NSN, etc.


I seem to recollect reading about Facebook Chat getting rewritten in C++. Maybe here somewhere?


I also remember seeing such comments on HN, but all the references I could find searching just now say a combination of Erlang and C++ is used for FB Chat.


> ... has always had a lot of Important Code in production thanks to Ericsson, which has made it more difficult to change...

Google basically solved this problem for themselves with gofix. I'm surprised Erlang doesn't have an equivalent, given that it would basically just be a reuse of the "parse transform" code.


Erlang dates back to sometime in the late '80s, so I don't know if it's quite as simple as all that. Some systems are things you really don't want to mess with, and a Pareto (80/20) tool won't cut it.


Rather large change? As far as I understand it's just adding a new built-in collection and a literal form for it (oh, and the pattern-matching machinery for it, obviously).

Python added a set builtin in 2.4 and a literal syntax in 2.7, Objective-C added NSNumber, NSArray and NSDictionary literals only last year.


I think it is a rather significant change.

It is like adding built-in maps to C++ or Java or C. Python set was similar. And there was talk about whether {} should be an empty set or a dict. It stayed a dict. I remember it.
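For the record, the way the `{}` question was resolved is still observable in any modern Python session:

```python
# {} has always meant an empty dict; the set literals added in 2.7/3.0
# did not change that, which is why there is no empty-set literal.
assert type({}) is dict
assert type({1, 2}) is set
assert type(set()) is set   # the only spelling for an empty set
```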

But also in mid 90's already large production systems were written in Erlang while Python was still experimental. Python 2.0 was released in early 2000s or so.


> I think it is a rather significant change.

"Significant" and "large" are different things. A change can be small and significant, and large and insignificant.

> It is like adding built-in maps to C++ or Java or C.

It's like adding built-ins to anything. Your point?

> Python set was similar.

Similar what? It was a new builtin and literal to a language which did not have a builtin (let alone literal) set.

> And there was talk about whether {} should be an empty set or a dict.

Which is relevant how? Also not really, `{}` becoming an empty set literal was taken off the table more or less immediately when the introduction of set literals was being discussed.

> But also in mid 90's already large production systems were written in Erlang while Python was still experimental. Python 2.0 was released in early 2000s or so.

You're joking right? Python 2.0 was released in 2000 (October), which would quite obviously mean that Python 1 had preceded it by a few years (6), and most things are usually considered not experimental by the time they reach version 1. Python had stopped being experimental more than a decade before sets were added.


> > It is like adding built-in maps to C++ or Java or C.

> It's like adding built-ins to anything. Your point?

If you'd read my comment you'd understand. The point was that adding a new built-in to an almost 30-year-old language is unusual. Adding a new built-in to Dart is probably a lot easier. So no, it is not "like adding built-ins to anything".

> > Python set was similar.

> Similar what?

Wild guess, but, well... similar to the topic we are discussing: adding maps to Erlang. And by extension adding a major built-in type to any established language.

> > And there was talk about whether {} should be an empty set or a dict.

> Which is relevant how?

It is not. Just making conversation remembering from years back. Sheesh...


You're comparing apples to oranges. Implementing a data structure in one language cannot be compared to implementing a data structure in a different one.


I don't want to sound pessimistic, but adding a map structure to a language sounds rather small to me.


Off my head:

Support the runtime, support the syntax and semantics, change the pattern matcher, write extensive tests, add it to the term_to_binary/binary_to_term protocol, add it to distribution, enable reading of maps in various function calls, enable NIFs to inspect maps, define how to copy maps between processes, change the core compiler.

It isn't so simple.


For a new and experimental language it would be a small change. Try adding a built-in map to C or Java. Lambda expressions took years to add to Java, for example.

Now if, say, Nimrod or Dart acquired built-in sets (I don't know, maybe they already do) it wouldn't be too remarkable. They are pretty experimental.

Also there is a difference between having a library and a language built-in. All those languages including Erlang, have libraries that provide associative data structures.


I don't understand. By built-in map do you mean "map literals"?


Yes, as built-ins I consider literals (with specific language syntax). Otherwise, most modern useful languages have at least a library-level implementation of some associative data structures. Erlang has them too: dict, gb_trees and a few others.


You'd have to understand more about Erlang semantics to appreciate it. Those who have used Erlang or hacked the Erlang language itself know this to be nontrivial.


It's not the grammar footprint that makes it a large change. It's that a lot of people will change the way they write Erlang programs now that maps exist.


many = 2 ?


To counteract such bugs I usually try to use names as short as possible. I mean, there is no point in writing productionDatabase when prodDb works just as well. It has several advantages, but one of them is that the latter is much less likely to be misspelled.


I'd say the latter is much more likely to be misspelled, since you abbreviated it arbitrarily. It's easy to write prodDB or proddb, or was it productionDb?

Using actual English names actually minimizes the risks of mistakes.

And of course, using a statically typed language makes all these concerns moot since the compiler won't let you make these mistakes in the first place.


> using a statically typed language makes all these concerns moot

No, you can make exactly the same bug with a static type system. Structurally typed records are just that – structurally typed. If you add an extra field you never use, the type system doesn't care. You need further static analysis to catch that.

(Besides, Erlang ships with a decent static analyzer.)


Depends on your brand of static typing.


In fact, Erlang's records are something like this. They are tuples under the hood (with the first element being the atom name of the record "type"), and the pre-processor translates matching properties by name into matching tuple elements at the proper position. So you can't make a typo in this scenario - the code won't compile.

I.e., you declare:

    -record(foo, {bar = ?default_bar, baz = ?default_baz}).
And every time you write #foo{bar = 1} you get {foo, 1, ?default_baz} under the hood. Obviously, records are unusable as general-purpose dictionaries, as they have a fixed structure.

One big inconvenience with records is the fact that you have to type the record's name every single time you do anything with it. And writing MyFoo#foo.bar is cumbersome compared to MyFoo.bar in most other languages.

Another issue is that records exist only before pre-processing. Even the compiler is unaware of member names, let alone the runtime. This makes introspection and live code operations (by connecting a shell to a running instance) quite inconvenient.
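For intuition, a rough Python analogue (an analogy only, not Erlang internals beyond what is described above) is collections.namedtuple: field names resolve to fixed tuple positions, and a misspelled field name fails fast instead of silently creating data:

```python
# Field names map to fixed positions in a plain tuple, much like an
# Erlang record compiles down to a tagged tuple.
from collections import namedtuple

Foo = namedtuple("Foo", ["bar", "baz"])
f = Foo(bar=1, baz="default")

assert tuple(f) == (1, "default")   # the plain tuple "under the hood"
assert f.bar == 1                   # name -> position lookup
# f.bra -> AttributeError: the typo cannot introduce a new field
```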


The number of times I've had to type element(3, Record) (or whatever ordinal) for a filter/map, after manually -counting- the fields in the head of a list of records, because I didn't have the source with the record definition handy to rr() in my shell...


It'll still yell at you for typos, though. You can confuse positional with nominal, but you can't make invalid nominal statements.

Which is not at all to say that Erlang's tuple/record trick is anything but a clever trick. It is kind of neat to see how far you can go with just that, though.


The abbreviation is not arbitrary; it is very conventional among programmers. An arbitrary abbreviation would have been "ProDaBa", but no one abbreviates like that. The casing is also standardized: ProdDB in Python, prodDB in Java and (I believe) ProdDb in C#. Though that is beside the point, because you still have to decide which letters to capitalize whether the name is long or short.


In Python, I would use `prod_db` as a dictionary key and not a name that looks like a class name. That said, I do agree that some very common abbreviations are okay in some contexts.


Static typing does nothing for hashes/maps. That would only help with attributes on objects.


Right, but the example code would probably, in a statically typed language, be implemented with an object rather than a map.


It certainly does if your keys are not strings but objects.


I prefer to counteract these potential bugs by locking the keys of a hash.

E.g. in Perl:

  use Hash::Util 'lock_keys';

  my %hash = (key1 => 'foo', key2 => 'bar', key3 => 'baz');  
  lock_keys %hash;
Now any attempt to look up or amend a key that doesn't exist produces a runtime error...

  $hash{key1} = "FOO";   # updates fine
  $hash{key4} = "XXX";   # throws error
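A similar key-locking guard can be sketched in Python (LockedDict is a hypothetical class, not anything in the stdlib):

```python
# A hypothetical Python analogue of Perl's Hash::Util lock_keys: a dict
# subclass that refuses to grow new keys after construction. (Note that
# __init__ and update() bypass __setitem__; this is only a sketch.)
class LockedDict(dict):
    def __setitem__(self, key, val):
        if key not in self:
            raise KeyError("key %r is locked out" % (key,))
        super().__setitem__(key, val)

h = LockedDict(key1="foo", key2="bar", key3="baz")
h["key1"] = "FOO"          # updates fine
try:
    h["key4"] = "XXX"      # throws
except KeyError:
    pass
```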


But now the culprit typo is typing => when you meant :=

Maybe less likely, but if you've been writing a lot of code using => and then in one case need :=, I can see it happening quite easily.


The system would complain loudly in an error message with a line number.

The problem with accidental key typos is that at their worst they are completely silent, and if the value you intend to replace was another valid value, finding the bug might not be easy.


That syntax for that purpose was totally stolen from Go. Nice to see the cross pollination going in both directions.


Go's := (short variable declaration) is closer to Erlang's => (introduce a new key) than it is to Erlang's := (update an existing key).


... and in that sense, Erlang is following existing practice better than Go did — := is "traditionally" an assignment operator, not a binding operator.


"This trick is well known to old-style functional programmers, they waffle on about Y combinators and eat this stuff for breakfast, but it’s the kind of stuff that gives functional programming a bad name. Try explaining this to first year students who had a heavy night out at the pub the evening before."

Priceless (and completely true). I really think this attitude is why Erlang is such a success.


Meh, the Y combinator is a funny hack. A lot of people waffle on about it because they're blown away by how crazy untyped lambda calculus is. There are all kinds of ways to introduce recursion into a language, though. It's simply interesting that you might get it by accident in a language with as little restriction as UTLC.


Oh don't worry, you can make a Y combinator in a typed lambda calculus too, it's just a little more involved: http://en.wikipedia.org/wiki/Fixed-point_combinator#Type_for...


Depends on the type theory. You need recursive types. Or you can just posit fix.


I think there's definitely a rule against dismissing Y combinators on news.ycombinator.com.


They're important in CS theory, sure, but you sure don't want to encounter one in raw form lurking in a regular codebase. (Just like you wouldn't want to encounter monadic IO in raw >>= form.)


Using the y combinator outside of some ridiculously abstract code would be irritating. Using (>>=) makes sense sometimes though. For instance if you just wanted to echo something you could do

  do stuff <- getLine
     putStrLn stuff
but the stuff variable is then unnecessarily introduced. You could instead just write

  getLine >>= putStrLn
or the flipped version if you want it to read more like function application

  putStrLn =<< getLine


How do neither of you get the obvious joke OP is making with "I think there's definitely a rule against dismissing Y combinators on news.ycombinator.com"? (He's making a joke out of the fact that Y Combinator is named after the construct.)


That wasn't a joke.. People have been hellbanned for speaking ill of Y combinators.


Really?


Maybe that was the joke.


>Just like you wouldn't want to encounter monadic IO in raw >>= form

Yes I would. I would much rather see "getLine >>= putStrLn" than a 2 line do block.


There are two side notes. 1. Should we really be constrained by frequent pub visitors when teaching CS? Adapting to stupidity (OK, mediocrity) gave us Java and the switch from Scheme to Python for intro courses at MIT (with the great help of Mr. Guttag, I suppose). 2. Should we ever hire a person who couldn't get what the Y combinator is, or isn't able to write one in Scheme?

Update: Python is heavily used in biology and related research (so they said), and the transition at MIT seemed reasonable, but I still believe that Scheme should be the language for serious introductory courses, like the classic Berkeley CS61A was. Now the PLT/Racket folks are doing a great job of introducing proper concepts and developing "good habits" in students.


I am in the process of writing an OTP-ish dialect of Erlang that compiles to JavaScript and has its own runtime environment (http://luvv.ie/mission.html), so maps are more work for me; I am a bit yay!/sigh!...

The thing that I am less clear about with maps is their relationship with Mnesia tables going forward. Mnesia tables take their schema from records. I suspect we will go down the route of having records that contain 'privileged' fields (effectively ones you can have indexes on) and fields that contain maps, which allow you to 'extend' the schema without performing difficult state transforms.

It will certainly help when you need to add a field to a record and thread that change through a running system without stopping it.


Hey, luvvie looks awesome. Thanks for all your great work. I am following it. Sorry don't have much insight as to what happens with Mnesia. Just wanted to thank you for your work.


Cool, just slogging away; it is still early doors...


Whoa. This makes the language feel much more approachable. I love that they've provided both an "upsert" and a semantically distinct "update" for their maps. Working without upsert is painful, but the problem of accidentally inserting when you meant to update is an irritating pitfall. They've really picked a good middle way here.

Does anyone know how often Learn You Some Erlang is updated? I'd definitely take another pass at the language with these features in place. Kudos, team!


Author here. I haven't yet updated it, but plan to eventually update the website with these features. The thing is a lot of it won't need to change majorly.

Maps should be a replacement for data structures like dicts and gb_trees, but I personally do not see them as a replacement for records within a module, where I feel their restrictiveness is welcome, for two main reasons:

1. They crash early in live code upgrades, which I see as a feature, despite, I'm sure, a lot of people disagreeing with me.

2. The module isolation inherent to records makes people think at a protocol level and about their API much, much better than the common pattern of saying "screw it", sharing the state around, and breaking abstraction all over. I like how records constrain the programmer to think about what should be passed around in messages, and I worry that maps may remove that "think hard" part of the problem.

Maps should be especially nice for more complex dictionary manipulations, nested key/val mapping, and so on, and for terseness of operations. More elegantly, they could be a decent fix for the "use ETS to optimize K/V operations" pattern, although they won't benefit from the same parallel access.

I plan to explain this and possibly revisit some code snippets from the book in an add-on chapter, and also show what I wouldn't change.

Regarding funs, I will probably just add a section to the anonymous function part, and see if I ever used recursive anonymous functions before. It's likely that I avoided them on purpose in the book and, as such, won't need to add too much there.

Let me know if that sounds good to you.


> Maps should be a replacement of data structures like dicts and gb_trees

Why? Why do you need to pattern-match on a structure whose keys you don't know a priori? Why should the syntax prefer one map-like data structure over another?

Don't get me wrong, I love that maps are in the language now – as a replacement for records, where syntax and pattern matching make sense and are used all the time, and the data structure issue is much less sensitive. Please don't hamper their performance by forcing them to support the use case for which dicts and gb_trees are intended!


Maps literally have a module definition that ends up being similar to dicts if you want to swap them in directly.

You can know a data structure's keys in advance while still having it be dynamic and not a record: proplists, key/val lists, and so on are like that. The user sometimes has to know the keys to use them. The data structure definition doesn't.

The weakness of dicts and gb_trees is twofold: they're both slower than maps promise to be, and they do not have pattern matching.

Records and maps are distinct in my opinion. For example, I won't expect to be able to use Dialyzer and do type checking of specific keys' values (much like with dicts, trees, k/v lists), but I will do so with records.

Maps are pretty much dicts and trees with pattern matching added more than anything. They even have the same issues with considering the equality of integers and floats.

Records still have their use case as is right now. Maps have more in common semantically with dicts than they have with records, where the similarity is syntactical. I prefer to prioritize semantics over syntax.


> Maps literally have a module definition that ends up being similar to dicts if you want to swap them in directly.

So? Having a similar interface doesn't mean they have comparable implementations. Even in the maps runtime implementation, we see a benefit of knowing that a structure is homogeneous: run-time type information can be lifted out of the structure, saving memory and time.

> You can know a data structure's keys in advance while still having it dynamic and not a record: proplists,

proplists are, like, the poster child for static heterogeneous structures, precisely because I do know all the keys in advance! It's just that in a given instantiation, some of them map to the value "undefined"!

> The weakness of dicts and gb_trees is two-faced: they're both slower than maps promise to be

Then reimplement the dict module. How does syntax help here?

> and they do not have pattern matching.

Again, why do you need to pattern match something whose keys are determined at runtime? I've never wanted such an ability in a language; I don't see the benefit.

> Records and maps are distinct in my opinion. I, for example, won't expect to be able to use Dialyzer and do type checking of specific key's values (much like with dicts, trees, k/v lists), but will do so with records.

Why not? OCaml can perform typechecking of structurally typed values. That's a solved problem. If there's no plans to support it in Dialyzer I'll add it myself.

> Maps have more in common semantically with dicts than they have with records, where the similarity is syntactical.

I think we disagree on what a "map" is. You say a map is one thing (a replacement for a key-value store), but when I see syntax that looks, walks, and quacks like a structurally typed record, I see a structurally typed record, and I'm utterly baffled why those things should have anything to do with each other.


You know that maps as envisioned in the EEP (http://www.erlang.org/eeps/eep-0043.html) should let you do pattern matching using variables, right? It's not there now because it's a provisional implementation, but it will be there.

We will be able to do X = 3, #{X := Val} = Map, ... to extract the value of 'Val' under the unknown key 'X'. This just hasn't been implemented yet, but is part of the total specification.

What you see is the incomplete implementation of maps, not up to the total spec, because this is an RC1 implementation. They will support part of the record use case, but the entirety of the dict use case with more flexibility (including comprehensions, from_list, etc.).

They are, by specification and once complete, going to be far more feature-equivalent with dicts than records.


> We will be able to do X = 3, #{X := Val} = Map, ... to extract the value of 'Val' under the unknown key 'X'.

Yes. I don't understand the use case for that.


I think you may be reading too much into it. The goal is to be able to replace dict/orddict it seems like. For that, we need some way to be able to get values out for keys that are unknown at compile time (and pattern matching is a lovely way of doing that).

That is, let's say I have an in memory map of IDs to counters (for whatever reason).

I can write an implementation to increment and get the counter value for a given ID (assuming this is a gen_server and this is the call implementation) like so (well, possibly like so. I haven't used maps yet; haven't looked at the early r17 builds yet) -

  handle_call({incrementAndGet, Id}, _, State) ->
    #{Id := Val} = State,
    NewVal = Val + 1,
    {reply, NewVal, State#{Id := NewVal}}.
The comparable implementation right now, using a dict/orddict, might look like -

  handle_call({incrementAndGet, Id}, _, State) ->
    Val = orddict:fetch(Id, State),
    NewVal = Val + 1,
    {reply, NewVal, orddict:store(Id, NewVal, State)}.
TL;DR - Do you need to be able to get out a value, given a key? Yes. How do you read out values given a key? Pattern match, as described.

PS - Because it pattern matches, if I decided I didn't want to throw if the thing doesn't exist, I can do cool stuff like -

  handle_call({incrementAndGet, Id}, _, State) ->
    NewVal = case State of
      #{Id := Val} -> Val + 1;
      _ -> 1
    end,
    {reply, NewVal, State#{Id => NewVal}}.


    #{Id := Val} = State

    Val = orddict:fetch(Id, State)
Sorry, I don't see the benefit of one over the other. I do know that one ties me to a specific implementation of a map that may or may not work well for my data if I use it, and that merely by existing, prevents O'Keefe's frames from being implemented in an optimized fashion.


Okay, so what you meant was not that you don't see a use case, but rather that you don't see why they implemented maps as a replacement for dict/orddict, and in lieu of frames.

Those are different questions, the latter of which is not one I feel capable of answering, given that I wasn't involved with the discussions.

I assume the reason for wanting to replace dict/orddict is that they feel very much like second-class citizens in the language (see the bottom for a comparison of why pattern matching trumps separate modules), dicts are inefficient for small collections, and orddicts are inefficient for anything but small collections.

As to why they went this route instead of frames, I recall seeing it discussed (I believe on the Erlang mailing list, but I can't recall), but I never paid that much attention to it.

As to the first question ("why not just keep the dict/orddict syntax?"), here is a completely frivolous example, created solely to demonstrate why pattern matching is cleaner than dict/orddict -

  % This increments the counter if the password passed in matches, sets the
  % counter to one and stores the password if the map does not contain a
  % password, and throws if a password exists but does not match.
  incrementAndGetIfAllowed(Map, Password) ->
    case Map of
      #{password := Password, count := Count} ->
        {Count + 1, Map#{count := Count + 1}};
      #{password := _} -> throw(bad_password_can_not_increment);
      _ -> {1, Map#{count => 1, password => Password}}
    end.


  % We have to extract first, then do our logic separately. Behaves the
  % same as the prior function.
  incrementAndGetIfAllowed(OrdDict, Password) ->
    CachedPassword = orddict:find(password, OrdDict),
    CachedCount = orddict:find(count, OrdDict),
    case {CachedCount, CachedPassword} of
      {{ok, Count}, {ok, Password}} ->
        {Count + 1, orddict:store(count, Count + 1, OrdDict)};
      {_, {ok, _}} -> throw(bad_password_can_not_increment);
      _ ->
        {1, orddict:store(password, Password,
              orddict:store(count, 1, OrdDict))}
    end.


And that's just with a map/dict holding two things. If you have complicated logic based on the internals of the map/dict, it could remain pretty elegantly matched upon with the map, while getting exponentially uglier using a dict/orddict.


The example you give uses no feature that frames don't have.

To demonstrate why maps need syntax support, you need an example which pattern-matches keys.


A structurally-typed record would be a frame, as defined by Richard O'Keefe in his paper [1]. Maps aren't frames. Maps fit where dict and friends do.

[1] http://www.cs.otago.ac.nz/staffpriv/ok/frames.pdf


OK, great. I want frames. But how can I get them if there's already similar syntax for something that can kind of do what frames do but isn't optimized for that use case? That's a difficult sell: "Oh, yes, Erlang has structurally typed records! No, it's not the syntax that looks like a record without a name, it's this other syntax."

Maybe I'm being pessimistic, but I think frames have pretty much zero chance of being implemented so long as maps have syntax support.


Interesting! I'm surprised to hear that maps aren't viewed as a replacement for records, as Joe Armstrong (probably jokingly, I now realize) declared. Your explanation makes complete sense. Thanks for sharing your perspective, and thanks for maintaining LYSE!


I definitely agree on records forming the basis for a protocol. Maps do not serve the same purpose; I could see maps _used_ in a protocol, but not as the basis for it.

Long live records!

It would be nice if records became a concrete thing in Erlang.


Not a direct answer to your question but Joe's "Programming Erlang" book has been fully updated, and includes a chapter on the new map feature.


As someone who owns both of his PragProg books, the second edition is also a big improvement over the first. If you ever tried learning Erlang, and didn't stick with it, but still want to learn it, I highly recommend the second edition.


>If we now build a large list of objects, all with the same set of keys, then they can share a single key descriptor. This means that asymptotically maps will be as efficient in terms of storage as records.

> Credit for this idea comes from Richard O'Keefe who pointed this out years ago. I don't think any other programming language does this, but I might be wrong.

That is a nice language feature to enforce; I love it :) This is, however, basically the same thing that V8 / most JavaScript runtimes do under the hood[1]. Essentially, as you construct objects and add/remove keys, the engine backs them with more-efficient shared objects that have the same structure. Add a key D and it moves from <shared structure with A,B,C> to <shared structure with A,B,C,D>. Since most code ends up doing the same kind of operations on a bunch of similar bits of data, you can save a lot of compute time by assuming that and having a slower fallback when it fails (like using an actual dictionary instead of a Struct-like thing).

That said, JavaScript has zero enforcement for this, and the runtimes may have already shifted to something different. They are quite different beasts. Just pointing out that the idea has been around.

[1] it has been a while since I've looked closely, and I may be mistaken. Call it 95% certainty.


These are great changes, and a reminder that I must find a reason to write more Erlang.

For the record, one of these is another thing JavaScript got right, sort of, with named function expressions:

  setTimeout(function innerName() {
    if (shouldRecurse())
      innerName();
  });
  typeof innerName; // "undefined" -- the name is only in scope inside
I don't think I've ever seen anyone actually use this in production code, of course. And you're free to have your own issues with function declarations and named function expressions having the exact same syntax, which causes a big problem in IE<=8 that is probably responsible for this feature not being in common use (and omitted from CoffeeScript, alas).


You can also do that in Clojure, since you'd usually self-recurse with (recur) and a function will create a recursion point.

    (set-timeout (fn []
      (if (should-recurse)
        (recur))))


You can also name fns in clojure

(map (fn whatever [a] (println a)) [1 2 3])

ps: this code is stupid but I'm too lazy to write useful clojure from my phone, you get the gist


And you can also do that in Perl:

  set_timeout sub {
      __SUB__->() if should_recurse();
  };
NB. __SUB__ was added in Perl 5.16 and so requires either:

  use 5.016;  # or newer

  use feature 'current_sub';


It seems odd that we haven't standardized on a name for maps/tables/dictionaries/associative-arrays/hashes.


Well, most of those really conflate two things: static heterogeneous "maps" and dynamic homogeneous maps. The former is a structure whose keys are known at compile time (static) and whose values may be of different types (heterogeneous); the latter is a structure whose keys are not known until run time (dynamic) and whose values must all be of the same type (homogeneous).
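The two kinds can be put side by side in Python's type-hint vocabulary (a sketch only; Point and Counters are made-up names):

```python
# A TypedDict is static heterogeneous: keys are fixed at type-check time,
# value types may differ. A Dict[str, int] is dynamic homogeneous: keys
# appear at run time, but all values share one type.
from typing import Dict, TypedDict

class Point(TypedDict):
    x: float
    label: str

Counters = Dict[str, int]

p: Point = {"x": 1.0, "label": "origin"}   # a checker knows p["x"] is a float
c: Counters = {}
c["hits"] = 1                              # key unknown until run time
```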

Most languages conflate the two because they look similar, but they serve entirely different use cases, have different usage patterns, and permit different optimizations; their generalization (a dynamic heterogeneous structure) is pretty useless.

The relational model [1], which has been around since 1969, does distinguish these. The static heterogeneous structure is called a "tuple", and the dynamic homogeneous structure is called a "relation".

If only language designers would stop conflating the implementation of these structures with how they are used, we could drop the cringeworthy names "associative array" and "hash", stop calling tuples "maps" (which is a specific kind of relation), and stop calling relations "tables" (which are two-dimensional arrays, which are another kind of relation).

Erlang at least as of R17 has mostly kept this distinction (Erlang maps mostly support the static heterogeneous use case). However the majority of users are keen to jump on the "maps" bandwagon as a way to get syntax support for dynamic homogeneous use (which makes utterly no sense to me, as most of the syntax is for pattern matching and construction, which makes little sense in the dynamic case!). I am glad R17 does not include many of the proposals for such syntax that I've seen on the mailing list; and instead sticks closely to the original static heterogeneous design proposed by Richard O'Keefe.

[1] http://en.wikipedia.org/wiki/Relational_model


I agree with what you said, but I don't see how a heterogeneous dynamic structure is "pretty useless".


In a dynamic structure, you're adding and removing key/value pairs at runtime, when you generally don't know what the keys are. When you later access the structure, since you don't generally know what's in it, you can't know the type of any of the values, and hence you can't know what operations are valid on them.

The way around this of course is to tag the values with their type. This is usually done implicitly in dynamically typed languages (e.g. Javascript), and always must be done explicitly in statically typed languages (e.g. OCaml). But once you do that, you have a single type for all the things in the structure (a tagged union type) and it's now a homogeneous structure.

Either that, or you can split your mapping into multiple mappings, one for each type of thing. That has the benefit of permitting better optimization (you don't need to store per-value tags, among other benefits).

(Even in dynamically typed languages, it's better to explicitly tag things anyway. If you have a structure that can legitimately contain two kinds of things, one which happens to be an integer, and one which happens to be a string, then it's pretty likely it might later need to contain a third kind of thing, and if that one also happens to be an integer or a string, well, you're screwed.)
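A quick sketch of that explicit-tagging approach in JavaScript (all the names here are made up for illustration):

```javascript
// Each value carries a {kind, value} wrapper, so the map is homogeneous
// over a single tagged-union shape, even though the payloads differ.
const stats = new Map();

function putTagged(map, key, kind, value) {
  map.set(key, { kind, value });
}

putTagged(stats, "requests", "counter", 1024);
putTagged(stats, "hostname", "label", "web-01");

// Consumers dispatch on the tag instead of guessing the payload's type.
function render(entry) {
  switch (entry.kind) {
    case "counter": return entry.value.toString();
    case "label":   return JSON.stringify(entry.value);
    default:        throw new Error("unknown kind: " + entry.kind);
  }
}

console.log(render(stats.get("requests"))); // "1024"
```

Drop the tag and store raw values, and you get back exactly the problem described above: the consumer can no longer tell a counter from a label.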


I don't follow how a relation is a "dynamic homogeneous map". Can you elaborate?


In the context of relational databases, a relation is "dynamic" in that the keys (i.e. values of the primary key column(s)) are not known a priori – they are generally derived from external input to the system.

A relation is "homogeneous" because its rows are all of the same type (i.e. they have the same columns and types).

This is in direct contrast to the rows/tuples, a set of which forms a relation: the "keys" of a row (i.e. the column names) are determined a priori and are fixed ("static"). (Yes, even in a schemaless database – your program almost certainly operates on a fixed set of column-keys.) Whereas the values in a row are almost always of different types ("heterogeneous").

There are of course two other combinations – static homogeneous and dynamic heterogeneous. The static homogeneous case is the degenerate form of both the static-heterogeneous and the dynamic-homogeneous: the keys are fixed and the values are all of the same type. You can view such a structure as either dynamic-homogeneous (ex.: a structure which tracks run-time statistics) or static-heterogeneous (ex.: a structure containing configuration settings that just happen to all be of the same type) without losing anything.

The dynamic heterogeneous case is not really a useful one: generally, dynamic structures support some form of aggregation or iteration, since the keys by definition are not known beforehand. But if the types of each value may differ, such operations are ill-defined: how can you operate on a value whose type you do not know? I contend that any claimed use of such a structure is best broken down as a combination of the two: either a static-heterogeneous structure containing one dynamic-homogeneous map for each value type, or a single dynamic-homogeneous map whose values are static-heterogeneous tuples which include a tag indicating the type of the value.


I was already with you on the other points. And, technically, you can look at a relation as a dynamic homogeneous map, and that's not wrong per se.

But there seem to be some relevant points left out.

First of all, a relation is a set of tuples. You can say that a set is just the degenerate case of a dynamic homogeneous map in which the values are empty and you use only the keys. That seems like a useless generalization though, and more likely to cause confusion than clarity.

Let's say you have a simple relation of person to product: this person has bought this product. It would be awkward to try to group them like: "this person has bought these products" because you could just as well invert it to "this product has been purchased by these people" and there's not a good reason to choose one over the other. Making it arbitrarily asymmetrical just makes it harder to reason about, so that's not a good choice. And neither attribute is unique. So, you are left saying that the key is {person, product} and the value is empty (or "unit", if you prefer).

And yes, that is the key of the relation, so that's not wrong. But what have we gained through this exercise? We are calling a binary relation a "map", but confusingly it's not a map of the first attribute to the second -- it's a map of both attributes to nothing!

You could say that relations representing something more map-like are common enough. But even in that case, it's common to have multiple candidate keys (which can be true even for normalized relations), and it's hard to say what's mapping to what. Again, more confusion than clarity.

I think it's best to just call a relation the set of all tuples which satisfy the relation predicate. That's the model.


I disagree with nothing you said – I'm very familiar with the relational model. I've used keys in my examples, but nothing prevents the concepts from applying to the more general cases of tuples and sets. (A set is "dynamic" in that its individual values are not addressable in a manner which is determinable at compile time – you (generally) don't know what will be in the set.)

To give separate examples without using keys, you can compare a (integer-indexed) tuple to an array. The former is static-heterogeneous; the latter dynamic-homogeneous. And thankfully, most languages (even mainstream dynamically-typed languages like Python) seem to get the distinction, even though ignoring types or implementation, they are nearly identical.


Well, "hash" in particular implies a very specific kind of map/table (one that uses hashing). I think the most common non-implementation-dependent names are maps, dict(ionarie)s, tables, and associative arrays.


The reason it's in the OP's list is that that's just what Perl calls associative arrays. (http://en.wikibooks.org/wiki/Perl_Programming/Hash_Variables)

Hence it's literally the exact same thing as a dictionary, but is named a hash in Perl.


Hah, in the second paragraph the author forgot, called it "associative array", and then remembered "hash" for the latter half of the sentence.


And in JS, it's an object.

Figuring out how to store arbitrary key-value pairs should be one of the first steps in learning a new language, given that every language has a different name for it.

Associative Array is probably my least favorite name for it, since it implies arrays aren't associative. It was a Clojure book that first made me realize an array is a map that only allows ints for keys.


Actually, an object in JS stores string keys only; if you try to insert something else, it's auto-converted to a string. An Array is an object which stores integer keys. You don't really have a data structure that stores arbitrary keys in JS (without third-party libraries). This will change with ECMAScript Harmony, though (a dedicated Map type).
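For instance (runnable in any modern engine; the dedicated Map type landed with ES2015):

```javascript
// Plain objects coerce every key to a string.
const obj = {};
obj[1] = "a";
console.log(Object.keys(obj)); // [ '1' ]  (the number key became a string)

// Map keeps keys as-is, including object keys, with no coercion.
const m = new Map();
const k = { id: 1 };
m.set(1, "number key");
m.set("1", "string key");
m.set(k, "object key");
console.log(m.get(1)); // "number key" (distinct from the string "1")
console.log(m.get(k)); // "object key"
```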


> an array is a map that only allows ints for keys

Actually, it only allows nonnegative ints for keys.

Also, it has the added undesirable property of requiring memory proportional to the largest key value, rather than the number of keys.


Not if it is Fortran.


Well, there is the distinction between a sparse array and a contiguous one, both of which are technically associative. I associate the former with the term "associative array" though.


It's also an object in Python, Pharo, and most likely Ruby too.


He meant that — before ES6 anyway — JavaScript's "associative arrays" are objects. As in, they're the basic objects of the language, generally instances of Object:

    > {}.__proto__ === Object.prototype
    true
although nowadays you can use `Object.create(null)` to get a bare object, with no prototype:

    > Object.create(null).__proto__
    null


As someone who's completely foreign to Erlang, can anyone explain the rationale behind replacing records with maps? At a first glance it seems as though records are nominal types and maps are structural types, and if true that would suggest to me that these are complementary features rather than supplementary ones.


> As someone who's completely foreign to Erlang, can anyone explain the rationale behind replacing records with maps?

Simplicity and flexibility. Erlang records have to be declared and they're nothing more than syntactic sugar for tuples.

> At a first glance it seems as though records are nominal types

They're not, not in the way you'd usually expect anyway. They're little more than macros for accessing tuple fields by name:

    -module(test).

    -record(foo, {bar, baz}).

    main(_) ->
        R = #foo{bar=1, baz=2},
        io:format("~w~n", [R]).
will print:

    {foo,1,2}
the record information does not exist at runtime. In fact, you can just create the corresponding tuple and tell Erlang to interpret it as a record:

    S = {foo, 2, 3},
    T = S#foo{bar=3},
    io:format("~w ~w~n", [S, T]).
will generate no warning from either erlang itself or dialyzer, and will print

    {foo,2,3} {foo,3,3}
And although dialyzer (erlang's "type checker") can use records as type specs, it will be updated to handle map specs[0], allowing for the exact same static expressivity since you can define a map with explicitly specified keys e.g.

    -type foo() :: #{ 'status' => 'update' | 'keep', 'c' => integer() }
[0] http://www.erlang.org/eeps/eep-0043.html section "Dialyzer and Type specification", sadly not directly linkable.


Having the record not exist at runtime can lead to issues if you serialise your records using term_to_binary. If the format of the record changes, old records can no longer be accessed.

    1> rd(person, {name, email}).
    person
    2> P = #person{name = 'John', email = 'john@example.com'}.
    #person{name = 'John',email = 'john@example.com'}
    3> P#person.email.
    'john@example.com'
    4> rd(person, {name, email, age}).
    person
    5> P#person.email.
    ** exception error: {badrecord,person}


That is actually good in my opinion, because records are expected to be static in structure. You shouldn't be able to reload or proceed with data dating from an earlier version as if nothing happened.

You have to think about what changed and prepare for an upgrade from any version you may encounter, rather than move along with incorrect data, similar to if it were corrupted.

I fear Erlang programmers will use maps in a way that allows them to be sloppy, rather than just exploiting the flexibility they allow. Double-edged sword, in a way.


I suspect what you will find is that APIs are defined with records, one or more of whose fields is a map. You kind of do that already with Mnesia tables, where you want to specify the indexable keys up front but need the flexibility to extend the schema - a nice kv list does the trick.


The templating of the key descriptors is neat, but not original. NewtonScript did this in the early 90s (mostly in response to the severe memory pressure on the Apple Newton). I believe that Self did it, too (but those guys were on Sun workstations, and just showing off as far as I'm concerned :-) )


I've got a passing interest in Erlang, and also saw the existence of Elixir http://elixir-lang.org/ , which uses the Erlang VM. Could someone from either language's community comment on if they see a strong future for Elixir?



It runs on Erlang's VM, so if you pick up one or the other, chances are you'll be able to interoperate at some level.

It is up to you; try one and then the other. Some people have a big issue with Erlang's syntax. I actually kind of like it, so I prefer Erlang.

Others find Elixir more approachable, it reminds them of Ruby for example. Elixir has macros so you can do some nifty things with them. Those kind of blow my mind when I read them so I am afraid I would get too "clever" with them.

You can call Erlang from Elixir fairly easily so you can take advantage of libraries you find for Erlang.

So kind of up to you. Whatever you like or whatever gets you more productive and interested.


Think of Elixir as a nicer and more approachable Erlang. It compiles to the same byte code. It's very young, but I think once it hits v1 it will have a very bright future.


This is great. I can attack the few chapters I skipped over in Joe's most recent book[1] that required R17.

[1]: http://pragprog.com/book/jaerlang2/programming-erlang


I have to say, this is one of my favourite programming books, even though I've never been paid to write any Erlang at all. Just the insights into how to go about building reliable systems are inspiring and profoundly affected my work in other languages.

That and it's written in a brilliantly approachable style.


> Just the insights into how to go about building reliable systems are inspiring

I found that too; I wonder if the programming community would be better off if more people were pushed towards Erlang? Similar to how every programmer should know C, so that when they write Python they have some idea what's happening under the hood, it would be good for every programmer to know Erlang/OTP so they learn how large-scale reliable systems should work?


I agree. It is just useful to have those ideas and paradigms in one's toolbox. Concurrent isolated units working together by sending messages: that is interesting and can be implemented in other systems, say by using a message broker like RabbitMQ or 0mq and isolated processes. Or the idea of supervision: have a supervisor OS process watch and restart its workers, and so on. Those can be implemented and applied without even touching Erlang itself.
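As a very rough sketch of the supervision idea outside Erlang (plain JavaScript; all names are invented for the example):

```javascript
// One-for-one restart loop: run a worker; if it throws, restart it
// up to maxRestarts times instead of crashing the caller.
async function supervise(startWorker, maxRestarts) {
  for (let attempt = 0; attempt <= maxRestarts; attempt++) {
    try {
      return await startWorker();
    } catch (err) {
      console.log(`worker crashed (${err.message}), restart #${attempt + 1}`);
    }
  }
  throw new Error("too many restarts, giving up");
}

// A flaky worker that fails twice, then succeeds.
let failures = 2;
async function flakyWorker() {
  if (failures-- > 0) throw new Error("boom");
  return "done";
}

supervise(flakyWorker, 3).then(result => console.log(result)); // prints "done"
```

A real OTP supervisor is richer (restart intensity over a time window, restart strategies, supervision trees); this only counts total restarts, but it shows the shape of the idea.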


Yes, it's really good. Don't worry. I can't write any erlang either. I'm still in the try-hard phase.


Thank you for mentioning this book. I intend to learn Erlang after I finish studying Haskell, and I'll probably use this book because of your recommendation.


The EEP (Erlang Enhancement Proposal) about maps is really neat and superb documentation.

http://www.erlang.org/eeps/eep-0043.html


This was a great read, and a very considered document.


EEP-43 [0] standards draft, with an extensive overview of new Maps features, plenty of examples, and discussion on a few different syntax proposals they had.

(Is there any newer official documentation for maps? I couldn't find anything.)

[0]: http://www.erlang.org/eeps/eep-0043.html


Kudos to all ppl involved! Thanks for the effort.

Compared with maps, adding a fun-call chaining operator like |> should be child's play, right? ;)

Seriously, what are the functional programmer's tricks to handle long function call chains in languages without the |> or similar operators?

Anyway, maps are a more important addition to the environment for sure. So thanks again.
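(In JavaScript-land the usual trick I've seen is a tiny reduce-based pipe helper; all names below are made up:)

```javascript
// Without |>, long chains read inside-out: trim(upper(greet("erlang"))).
// A one-line helper threads a value through a list of functions instead.
const pipe = (x, ...fns) => fns.reduce((acc, f) => f(acc), x);

const greet = s => `  hello, ${s}!  `;
const upper = s => s.toUpperCase();
const trim  = s => s.trim();

console.log(pipe("erlang", greet, upper, trim)); // "HELLO, ERLANG!"
```

Note this is plain application (each function takes exactly one argument), which sidesteps the partial-application question discussed below only by pushing it onto the individual functions.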


An F#-like operator |> would be useless without a convenient partial application syntax. Macro-like operators aren't well received in Erlang, certainly not when they obfuscate what really is going on.


really useless?

http://joearms.github.io/2013/05/31/a-week-with-elixir.html -> "The Pipe operator"

I really get several arguments against such "syntax sugar". I would not call it useless, though.

So how does one handle/structure/avoid code with function call chains in the absence of such operators?


First, I am never against adding good syntax into a language, I implemented the named funs. Second, just because Joe says something doesn't mean I agree. Third, here is my tentative answer to the piping syntax:

An F#-like pipe operator:

    X |> F.
    F(X).
A partial app syntax:

    fun map(F)/1.
    begin _1 = F, fun (_2) -> map(_1, _2) end.
These coupled together allow concise yet explicit function chaining and make use of the most common convention, which is to pass the most-changing argument last.

The problem with what Joe likes is that in X |> foo(bar), the call foo(bar) is transformed to foo(X, bar), making what really happens less obvious, a big no-no for Erlang.


Kudos for the named funs.

Care to provide some "marketing" about what they allow for? (besides and in addition to what is already in the eep-0037(?))

Why are they important? How will they support developers in writing and understanding Erlang code? Do they even provide some internal or performance benefits?

Reg. the piping syntax: how does your proposal make the "transformation" more obvious for the reader of some application code?

To me it seems partial application is the more powerful feature for sure. Together with a rather "simple" |> operator it could provide the fun-piping functionality. It looks like a map function is needed to wrap other functions taking part in the piping, right? Is that the "make it obvious" part you mentioned?

The piping syntax from Elixir (from the Joe blog above), on the other hand, binds the |> operator to a more specialized, less flexible "fun call" transformation, providing fun piping only, right?

Anyways, thanks for insights and again really kudos for the named funs.


With my two orthogonal syntaxes, there is no transformation. "|>" doesn't need to look at its operands to be implemented. With a macro, you have a call to foo/N end up a call to foo/N+1.

Also, coming from a functional background, where I read "X |> F()", I think "(F())(X)", not "F(X)".



Thanks to Anthony Ramine for his hard work on the named `fun' feature, and to the OTP guys as always :^)


You're welcome.


  code {background: #F7F7F9; color: #999999}
Egad, the color scheme he uses in the code blocks is horrible. The snook.ca contrast calculator says that's only a 2.66:1 contrast ratio, which is terribly unreadable. It needs to be at least 7:1.

(CSS edited for clarity)
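For reference, that ratio falls out of the WCAG 2.0 relative-luminance formula; a quick sketch of the calculation:

```javascript
// WCAG 2.0 contrast ratio between two sRGB colors given as #RRGGBB hex.
function channel(c) {
  const s = c / 255;
  return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
}

function luminance(hex) {
  const n = parseInt(hex.slice(1), 16);
  const [r, g, b] = [(n >> 16) & 255, (n >> 8) & 255, n & 255].map(channel);
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

function contrastRatio(a, b) {
  const [hi, lo] = [luminance(a), luminance(b)].sort((x, y) => y - x);
  return (hi + 0.05) / (lo + 0.05);
}

console.log(contrastRatio("#F7F7F9", "#999999").toFixed(2)); // "2.66"
```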


There's been lots of chatter about maps on the mailing list; but the named inline functions thing is totally out of the blue to me, and totally welcome. Erlang does such a great job of fixing everything that OCaml almost but didn't quite get right.


Implementor here. Richard O'Keefe deserves all the credit as he had the idea and suggested the syntax (EEP 37). I just noticed that it was easy to implement it through let rec expressions in Core Erlang. No VM changes were required.


Could you elaborate on what you feel OCaml didn't quite get right WRT recursive definitions?


Maps look really interesting. They can also be saved in Mnesia, right?


Yes, and passed via distribution


Way to go, Erlang! As a developer, I really appreciate it.


Great! Maps look really nice :) Will Mnesia be extended to use maps as an alternative to records?


Brilliant! Erlang gets even better.



