Sylph: the programming language I want (eev.ee)
101 points by alixmartineau on March 1, 2015 | 116 comments


A lot of syntax / spacing / etc. problems would go away if we stopped using completely plain text as the medium for programming languages. Or rather, if we extended plain text, or got our editors to understand the languages a bit more thoroughly than just highlighting keywords and giving us autocompletion or whatever. This has been explored a bit by michaelw and others (http://www.foldr.org/~michaelw/emacs/) for lisp.

Imagine a syntax-aware editor which could display code from one language using the syntax or style of another. (This might only work from a single base language; there are too many "odd" features in languages which couldn't translate universally.)

So, since I like python, it could be displayed PEP8 style. Someone else could see it with Lisp parens everywhere. Another could have it with 2 spaces and 'end' keywords, or {curlybraced;} (in K&R style, or whatever you like).


This really assumes that syntax is a purely superficial thing, but in reality syntax reflects the underlying semantics of the language as well.

For example, a typical C program (sequence of imperative statements) displayed with lisp syntax will look awful, yes, but so will a typical lisp program (deeply nested expressions) rendered with C syntax.

Getting rid of text doesn't change the fundamental issues: Languages would still have different semantics (and thus suit different/custom representations) and it wouldn't make any more sense to mix-and-match than it does now.


Different semantics can be approached by several different methods, though - one can simulate an interpreter for a "guest" language in a host language, or the languages can provide an FFI for interoperability. While we have various means to combine the semantics of two different languages, we have no means to combine their syntax without encountering ambiguity problems. This is why storing code in a structured format rather than plain text could be so valuable: it would enable us to mix and match languages in whatever way we wanted, rather than resorting to putting code inside strings, loading up new files, or complicating the syntax of our language to provide the expressivity we would like (e.g., LINQ).

Diekmann and Tratt have a theory of Language Boxes[0][1] which provides the groundwork for achieving this kind of editing. Language boxes enable quoting one language inside another without per-language delimiters, without wrapping code in strings, and without escaping characters. The editor knows where the language boundaries occur because the programmer specifies them; the semantics of hosting one language in another are likewise left to the programmer.

[0]:http://lukasdiekmann.com/pubs/diekmann_tratt__parsing_compos...

[1]:http://soft-dev.org/pubs/html/diekmann_tratt__eco_a_language...


Composing a larger program, by combining functions in different languages in a new framework, seems to take things in the direction of large and complicated programs. Would it not be simpler to write smaller, independent, individually named and reusable programs in several languages, according to each language's strengths, and pipe the output from one of these programs to the other? As an added bonus not all the programs need to understand the entire problem domain.


This line of thinking ignores simple domain-specific languages like SQL, which nine times out of ten are held in strings in the host language (and thus undergo no static checking, because the contents of strings are basically ignored by the compiler). There are other examples too: HTML files contain nested JavaScript, CSS, etc. PHP files host HTML. C++ effectively hosts C and assembly.
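To make the point concrete, here's a minimal sketch in Python using the stdlib sqlite3 module (the table and query are made up for illustration): the typo'd keyword sails straight past the host language's compiler and only blows up at runtime.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('ada')")

# "SELCT" is a typo, but to Python it's just bytes in a string literal:
# the module imports and runs fine until this statement reaches sqlite.
try:
    rows = conn.execute("SELCT name FROM users").fetchall()
except sqlite3.OperationalError as err:
    print("caught only at runtime:", err)
```

A structured-editing scheme like language boxes would hand that query to an SQL grammar instead of burying it in an opaque string.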

Haskell hosts a few dozen languages called "extensions" - specified in a {-# LANGUAGE #-} pragma at the top of the file. This one is of particular interest because the language appears to have some kind of extensible syntax which allows these extensions to occur - except when you look under the surface, they're all combined into the same grammar and they all interact with each other, such that other extension developers basically need an entire understanding of all of them to know where conflicts may lie.

Using a framework like Language Boxes instead to host these kinds of extensions would allow individual developers to put their own, independent extensions into the language, without having to hack on the compiler and rebuild it.

Also, writing several independent programs and using IPC to communicate between them is ideal in theory, but in practice it is often unsuitable - because Unix processes are heavyweight. They take time to initialize and use far more memory than necessary for what amounts to running code briefly and then discarding it. Perhaps if we had a more lightweight process model, à la Erlang, this kind of reusability would be practical and not just a good philosophy to follow.


Sure - not all syntax is superficial. But a goodly proportion of it is, I suspect.

That's why I was trying to get at the point that you /COULDN'T/ have a universal language-syntax translator - but for a (new?) language, you could create (possibly) different interfaces to the central logic and structure of the code to suit different preferences.

Sort of how we have different skins and fonts and so on now.


Lisp programs are data for the reader - in contrast to text for a lexer, as is typical in other language families. So long as ASTs are in the pipeline from source to execution, the options are lexing [or its equivalent] or writing an AST directly.


This was tried. It was called APL.

http://en.wikipedia.org/wiki/APL_%28programming_language%29

It was very, very clever.

It was so clever hardly anyone understood it, and it's (mostly) forgotten.

Having said that - I kind of agree. It seems like most languages are attempts to:

1. Create a human-readable text-based representation of symbolic logic.

2. Chunk the symbolic logic in (hopefully) useful ways.

3. Cross-correlate symbolic levels, so at one extreme you have actual bytes in physical memory, at the other you have arbitrarily complex symbolic data structures.

4. Build simple error checking into the language so it's impossible to write obvious nonsense. (This includes, but isn't limited to, all type systems.)

5. Constrain the operations that are possible for similar reasons. ('State' etc.)

This is all more or less taken for granted. I'm not sure it should be. Basically it's a bottom-up approach to language design - reduce operations to binary algebra, add constraints.

But top-down languages like Prolog and (kind of...) Erlang try to do interesting, powerful, things, and have at least a few features that Just Work.

So I think the top-down approach is underexplored. There's a sweet spot between top-down simplicity, expressiveness with minimal cognitive load, and processing efficiency.

It's unlikely current syntax conventions are close to it.

I think most people know that languages are different points on various trade-off scales, so there will never be one single best language.

But more variety might not be a bad thing.


>This was tried. It was called APL.

APL is absolutely nothing like the parent described.


Iverson's design of APL was well thought out, to the degree that he won a Turing Award. His lecture Notation as a Tool of Thought lays out some of his thinking:

http://www.jsoftware.com/papers/tot.htm

Three decades after first conception, Iverson developed the language J using only ASCII characters and based on what he learned over roughly two decades of APL's deployment in the field. IMO, J is worth looking at because it may change the way a person looks at programming languages and their design.

http://www.jsoftware.com/


Interesting that you use the past tense. I am using such a language in the present. And it is working just fine. I have no need for Python but so many people are seemingly trying to coax me to use it. Such is the case for most of the verbose languages.

I do not understand all the constant discourse about languages. Personally I prefer to choose something simple and small, and stick to it. To increase convenience when working with these terse languages, I write code generators.

To me, it is not the language that matters (except as below), it is the programs that the author chooses to write and how those programs perform.

It is probably my own selective bias but I find that the smaller programs written in relatively terse languages tend to be of higher quality. When I see a verbose language used to write a program, I am reluctant to use the program. But I would never suggest that someone else make the same decisions. I believe one should think for herself.

There is not so much discussion about the terse languages I use. I am inclined to think this is a good thing. Then I can focus on the choice of the programs the author chooses to write instead of her choice of language.


What are these terse languages you speak of?


APL, J (http://www.jsoftware.com/), k (http://kparc.com/k.txt).

www.reddit.com/r/apljk/


I made a start on it http://sediment.io

There's a pretty exhausting amount of work to get to the same toolset we have for text-based languages though.


This reminds me of reading a typical zoning ordinance: more an historical sack of band-aid reactions to specific but unrelated past problems than a positive program generating a bounded creative domain. Addressing gripes is not evaluating tradeoffs.

Successful languages start with a clear idea of what problem they want to solve, creating coherency at a high level. Values are ranked, and "might be nice" is distinguished from what is held dear. The starting point is identifiable and the passion positive. Saying "the ending point is not here" isn't enough.


It actually reminds me of a first draft mind dump. From here, some aggressive refactoring could turn this into a very nice language specification.

Brain-dumps are valuable for discussion and as a first step. From there, we just need to follow our normal programming routines: Write, refactor, test, repeat.


I hated sounding negative about the article. It represents a lot of admirable effort. My point is that the effort is not directed in a particularly constructive direction in regard to creating a new programming language.

There's a clear goal when writing a language that is both a floor wax and a dessert topping. On the other hand, writing a language with the goal of being neither floor wax nor dessert topping just gets us a floor topping and a dessert wax. There's no problem solved. Griping isn't brainstorming.

Brainstorming is structured. Productive brainstorming is constrained by reality. By which I mean that brainstorming about something like type inference acknowledges that type inference is subject to the Halting problem and that dynamic typing as in Python does not entail type inference.

Starting with assembly, all higher-level programming languages are DSLs over machine code. "Things I hate" is not a domain that leads to insight or a coherent minimal language. Python's core was addressing pedagogical problems such as beginners learning to format code well. Google's Closure compiler addresses the problem of JavaScript performance. Rust is intended to address client-server programming. For each there is something that can be measured objectively and improved.

Don't misunderstand me: there's nothing wrong with writing a programming language as a learning exercise, or to scratch one's own itch, or to solve the world's problems. But it is an engineering design exercise, not a poem. The facts of computation make it so.


Brainstorming is not a structured exercise; it's a bunch of people throwing out ideas. How productive it was is not measured by constraints.


One of the structural features of brainstorming is the prohibition on criticism.

Etc.


Etc, what?

You mean that structural features like prohibition of criticism in brainstorming leads to limited and unimaginative ideas, which is the point of brainstorming in the first place, and that no one agrees on what the structural features of brainstorming actually are in the first place?

In any case, I don't understand your writing very well because it doesn't show a lot of clarity, but if you really wanna keep up a debate, I don't mind.


We should add another stage into that routine: research.

There's no point in re-attacking problems that other people have already solved (or proven can't be solved) when we could be looking at new ways to approach problems instead. A little research would tell us, for example, that "statically inferred duck typing" is already well studied: it's called structural typing.
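For reference, this is roughly what structural typing looks like when retrofitted onto Python via `typing.Protocol` (a later addition to the language, shown here purely as an illustration): conformance is decided by shape, not by inheritance.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class SupportsLen(Protocol):
    def __len__(self) -> int: ...

class Bag:
    """Never mentions SupportsLen - it matches by structure alone."""
    def __init__(self, items):
        self.items = list(items)
    def __len__(self):
        return len(self.items)

print(isinstance(Bag([1, 2, 3]), SupportsLen))  # True: it has __len__
print(isinstance(42, SupportsLen))              # False: ints don't
```

A static checker like mypy makes the same structural judgment at analysis time, with no runtime check needed.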


For some of the stuff about types and optimisation in particular, you should check out Julia's type system [0]. First-class types, avoiding inheritance, optional type restrictions, and (albeit early) support for static checking [1] - it's all there.

It'd be worth taking a look at things like its macros as well – I actually think Julia hits on a lot of the points in this post really well, though of course no language is perfect.

[0] http://docs.julialang.org/en/latest/manual/types/ [1] https://github.com/astrieanna/TypeCheck.jl

Minor nitpick:

> Tut, tut. That should really be a set! It’s much faster, O(1).

Structures like sets have lower complexity but higher constants (i.e. the overhead of hashing the value, etc.). When you have a small number of values, a straight linear search over an array will often be faster (which is why it's useful to have the explicit choice).
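A rough, machine-dependent sketch of that trade-off (the numbers will vary; only the shape matters):

```python
import timeit

big = list(range(10_000))
big_set = set(big)

# Same answer either way; the containers differ only in cost model.
assert (9_999 in big) == (9_999 in big_set)

# Worst-case lookup: the list scans all 10,000 elements, the set hashes once.
list_time = timeit.timeit("9_999 in c", globals={"c": big}, number=1_000)
set_time = timeit.timeit("9_999 in c", globals={"c": big_set}, number=1_000)
print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```

For a three-element collection the gap shrinks and the winner can flip, because hashing carries a constant overhead that a tiny linear scan doesn't pay.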


>If your language has braces, then you are indenting for the sake of humans (because humans are good at noticing alignment and edges), and bracing for the sake of computers. That’s double the effort for something as mundane as where a block ends. If they get out of sync, then your code naturally breaks in a way that’s very difficult for you to detect.

No, I use braces for the computer, and the computer automatically and unambiguously indents it for me.

If anything the wasted effort happens in whitespace significant languages like Haskell and Python where the computer can't tell what you want and you have to cycle through all sensible indentations (for every line!)


You're using a double standard, here: if the computer is good at indenting brace based languages, then it is equally good at indenting languages without braces just based on the grammar.


Not at all. In this case, by 'languages without braces' we mean languages where indentation is part of the grammar. Python's grammar includes explicit INDENT/DEDENT tokens. It is impossible to reindent this mangled Python:

    if z:
    print 'a'
    print 'b'


Why would it be improperly indented in the first place? How is removing the indentation any different from removing braces?

Sure, when you paste code somewhere that expects prose (Facebook status updates, for example), it will consolidate consecutive whitespace and thereby remove the indentation, but you shouldn't be pasting code into a place like that anyway, because no matter what you do it'll be unreadable, and most people aren't going to be interested enough to put it into their editor to autoformat.


I did not make the value judgment you are arguing against.


Editors know that a python line ending in a colon must be indented and so indent it automatically. It wouldn't be hard to use the same method to reindent it.


No, lines following a line ending in a colon must be indented.

But the OP's point is that you can't know how many lines make up the indented block, since there is no explicit block-end-marker left if the indent is gone.


>No, lines following a line ending in a colon must be indented.

Except if you have, for example, an if statement. if x: y() is valid.


Yep, that's what I meant.


The computer is really quite proficient at inferring block structure from indentation, and if you use a modern editor this structure is manipulated rather effortlessly.

The main problem with indent-oriented languages is that not everyone is using, or wants to use, a modern editor. Then there are Lispers who think parens everywhere are a swell idea.


> The computer is really quite proficient at inferring block structure from indentation, and if you use a modern editor this structure is manipulated rather effortlessly.

Nonsense.

Try pasting a block of Python into Hacker News and you get something that starts off like this:

import os import random def do_something(x): if x: fd = os.open(x)

and only gets worse. A language with braces/semicolons has no such problem; it is easier to communicate with others, and the absolute value of being able to communicate with others when starting a new language is exceptional.


> A language with braces/semicolons has no such problem; it is easier to communicate with others

If someone posts more than two lines of practically any curly-brace language as one line, it's illegible. Yes, you can pull it into an editor and fix it automatically, but it's a huge waste of time that is best fixed by simply posting the code properly in the first place.


> any curly-brace language as one line, it's illegible.

You might not be able to read it, but others can, so it is by definition not illegible.

"Indention language", on the other hand, has actually lost its meaning and therefore is illegible.

> it's a huge waste of time that is best fixed by simply posting the code properly in the first place

Helping others is never a waste of time, but telling people they are too stupid to ask for help correctly is cruel.


> Helping others is never a waste of time, but telling people thy are too stupid to ask for help correctly is cruel.

Asking others politely to help you help them is not cruel in any way. If you allow a community to become a place where people can ask questions without putting any effort at all in themselves, you end up with all the people who can actually help leaving, and your community being full of people who will only ever put in the bare minimum of effort.


"Help you help them" however is only necessary in the python example because data is lost. The indention doesn't mean anything to C programmers, and thus I need no help from them to help them.


> The indention doesn't mean anything to C programmers

So you could read and debug a 5000-line program written all on one line without reformatting it?


What does "5000 lines all on one line" mean?

If it fits on a page I don't have trouble reading it. Line breaks tend to make it harder to fit on a page and not easier.

Example:

http://www.nsl.com/papers/origins.htm


You can honestly read and understand that as-is? And adding line breaks and proper indentation would make it absolutely no easier? Impressive. I don't think there's many C programmers who have that ability.


Yes I can read this, and adding line breaks would not make it easier.

In fact I think it would make it worse.

This has come up before, if you're interested:

* https://news.ycombinator.com/item?id=8476189

* https://news.ycombinator.com/item?id=8476702

It is also something I set out to learn how to do because I saw someone else do it: If someone produces smaller and faster code than me, then I should want to learn from it.

When working with others, I will often extemporise my code for their sake. I sometimes make mistakes when doing this, but sometimes making friends is more important than fast and correct programs.


> it is by definition not illegible

illegible, adjective: not clear enough to read.

Says nothing about "others being able to read". A doctor's handwriting can be illegible to everyone but the doctor, and not be a contradiction to the definition of illegible.

> Helping others is never a waste of time

If someone cares so little as to paste unformatted code on a forum which provides simple tools to display formatted code, how much value can you really put in their ability to provide quality code? How is requiring someone else to decipher something you didn't care enough to post correctly not ultimately a waste of everyone's time?


"If someone posts more than two lines of practically any curly-brace language as one line, it's illegible." "...you can pull it into an editor ... but it's a huge waste of time..."

Well, I know it may sound a little bit awkward, but sometimes I just want to see it work before looking into it, and copy-pasting curly-braced code is perfect for that. So the curly-braced style avoids a lot of such content distortions and actually saves me a huge amount of time.


[Note: if you add two spaces at the beginning of each line, it will look like code:

  import os 
  import random 
  def do_something(x):
    if x:
      fd = os.open(x)
Note2: If one line with two spaces is too long, it will mess up the whole thread, so if you ever want to quote, use italics instead of spaces.]


Sorry about being pedantic.

PEPpy Python should have four spaces at each level of indentation...not two.

Now I feel better


> Try pasting a block of Python into Hacker News and you get something that starts off like this:

You're talking about a web site that seems to have been written by web developers who don't know JavaScript exists.

Any Python parser will be able to format your snippet correctly.


It's only problematic because the comment grammar used on HN for non-code folds consecutive whitespace down to one space. Ironically, this was done to work around those who feel that text documents should be hard-wrapped at ~70 characters. If the grammar did not apply this folding, then it would be happy to preserve the structure.

If I were designing a language, its compatibility with naive commenting systems (or catering to users who refuse to use code blocks provided by most modern commenting systems) would be pretty low on my list.

Whitespace, even disregarding indentation, is already very significant to a number of modern languages. Why, then, is this particular form of significance bad?


C code with whitespace and new lines stripped is equally unreadable.

You are arguing that braces should be kept so people who don't indent their code properly on HN can be understood.


Just put two spaces before each line (https://news.ycombinator.com/formatdoc: "Text after a blank line that is indented by two or more spaces is reproduced verbatim. (This is intended for code.)"):

  import os
  import random
  
  def do_something(x):
    if x:
      fd = os.open(x)


Why the downvote?


Because petulant children can downvote here. It was a good comment, and I learned something from it. So thanks!

I wonder how a comment system that made downvotes public would do. And further, if the downvoter were required to give a one-line justification. And further yet, if the downvote itself could be downvoted.

Oh, scrap all that. I'd like to see what would happen if downvotes were eliminated entirely, so things sorted to the top purely by upvotes.


The only thing superior to Sexps is Forth. Mechanical syntax forever.


A token is a token. PEPpy Python's four spaces are isomorphic to Lisp's left banana and C++'s curly brace. An ASCII <bell> would do just as well for a lexer.

White space has significance for humans. The road to the Turing tarpit is paved with human intentions. Describing computation mathematically is a solved problem. Expressing it clearly for and by humans, not so much.


What's effortless is putting begin and end of blocks symbols only where they matter.

Haskell is particularly guilty of creating several different indenting possibilities, and does require constant intervention. Python is less bad, but still requires more effort than C, for example.


Er… you know Haskell's syntax is defined as braces-and-semicolons, right? The brace-free, semicolon-free syntax is just the addition of a few layout rules[0] for human convenience.

[0] https://www.haskell.org/onlinereport/syntax-iso.html#sect9.3


The problem is that the addition of "a few layout rules" turns the context-free syntax of the language into a context-sensitive one. For example, if you were to write a parser using just the information from the link you've posted, you could end up in a scenario where the same piece of code pasted in two different places in a code base means two different things.

Case in point: putting if _ then _ else on separate lines is fine, but inside do-notation it causes problems, because the layout stage inserts semicolons between the parts, and the production rules listed at the bottom of the page do not include optional semicolons in the "if" alternation. You need to indent the then/else to overcome this. The problem is fixed with a hack[0], which fortunately has no side effects for the rest of the language, but it shows that throwing together ad-hoc parsing rules, rather than using well-understood theory like LR parsing, invites problems.

With CFG subsets like LR/LL, every production rule in these grammars is also a valid LR/LL grammar - in other words, it's not only the entire grammar of a language that is context-free, but all the parts are too. An indentation-sensitive language is only context-free when looked at as the entire language grammar; the individual production rules are context-sensitive. You can't just copy and paste some code into the syntax without providing this context (by means of manually indenting the code).

[0]:https://ghc.haskell.org/trac/haskell-prime/wiki/DoAndIfThenE...


Yes, I know that. Currently I'm having better results using parentheses instead, to escape completely from the semantically significant whitespace. It's cleaner than using semicolons everywhere.

Anyway, Haskell is a very instructive example about how semantic indentation creates problems.


You really have to design your language around it for it to work well. Haskell has a lot of syntactic craziness in the name of simplicity.


What? Haskell has a base language expressed in terms of braces and semicolons, and a relatively simple set of layout rules defining how the compiler can insert these braces and semicolons automatically.


Right. I meant that Haskell has a whole lot of other weird syntax to consider, which makes it hard to read compared to Python. Its use of indentation is not one of the problems.


Charitably speaking, that is very confusing considering you responded to:

> Haskell is particularly guilty of creating several different indenting possibilities, and does require constant intervention

with:

> You really have to design your language around it for it to work well. Haskell has a lot of syntactic craziness in the name of simplicity.

Where "it" would seemingly be indentation, implying Haskell wasn't designed around making indentation simple.


I like writing compilers - how do languages like Haskell and Python handle blocks-by-indentation? I figure you have some sort of INDENT token, but how do you know its width? Also, in specifying the grammar, with explicitly delimited blocks you just say `block := OPEN stmt_list CLOSE`. What about with indentation?


I think they specify "INDENT foo DEDENT" and the lexer tracks indentation and issues INDENT/DEDENT as appropriate.


This is correct. A bit of state (an indentation-width stack) in the lexer creates the block boundaries as clearly for the interpreter as normal braces do.

Any problems in parsing are typically only encountered by stateless parsers.
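That indentation-stack trick can be sketched in a few lines of Python (the token names are made up; a real lexer also handles tabs, blank lines, comments, and inconsistent dedents):

```python
def layout_tokens(lines):
    """Convert leading-space changes into INDENT/DEDENT tokens."""
    stack = [0]                   # widths of the currently open blocks
    tokens = []
    for line in lines:
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:     # deeper than enclosing block: open one
            stack.append(width)
            tokens.append("INDENT")
        while width < stack[-1]:  # shallower: close blocks until widths match
            stack.pop()
            tokens.append("DEDENT")
        tokens.append(("LINE", line.strip()))
    while len(stack) > 1:         # close any blocks still open at EOF
        stack.pop()
        tokens.append("DEDENT")
    return tokens

print(layout_tokens(["if x:", "    a()", "b()"]))
```

With the tokens in hand, the grammar can indeed say `block := INDENT stmt_list DEDENT`, just like the brace-delimited case.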


I have one (probably quite naive) implementation here:

https://github.com/djc/runa/blob/master/runac/parser.py#L54

(Runa uses tabs for indentation, not spaces, but it's basically the same thing.) Whitespace at the start of a line is a separate token, then I add a thin layer of processing that turns a change of indentation level (i.e. count chars in the token contents) into INDENT or DEDENT tokens.


> you have to cycle through all sensible indentations

What? My editor indents stuff for me as per PEP8. I don't have to think about it. Why do you?


  def fun():
      for x in xs:
          if a:
              b
            c
There's no way for an editor to know where to re-indent c to. You have to cycle through the indents manually.


Perhaps I misunderstand, but vim, not a particularly uncommon editor, treats backspace at the start of a line by removing one indent level.

So, you're at `b`, you hit enter: next line, same indent level. Now you hit backspace: indent level matches `if`. Again: indent level matches `for`. Type c, there it is.

If you had curly braces, you'd use } instead of backspace. Literally the same number of keystrokes, no?

Or were you talking about something else and did I misunderstand?


How did you end up with c there, honestly? It's not a thing that's ever happened to me.


Auto indent, tab, and shift tab are your friends. Really, it isn't that bad, considering the curly brace alternative is even worse from an effort perspective.


To each their own. I came to C++ from Haskell and found braces to be superior (except visually).


> it’s sometimes asked why len in Python is a function, rather than a method. The answer is that of course it is a method, called __len__. The real answer is that Python pointedly and deliberately does not reserve any method

That's some comical post rationalization.

The reason why len is a function is because it was in Python before Python started receiving OO features.

As a result, Python is a crazy hodgepodge of imperative, OO and functional semantics where you never know if you should be calling `o.foo()` or `foo(o)` unless you've been writing Python since the late '90s.


This is a myth. Python supported classes close to the start of its existence, way before it became popular.


If you have some evidence to back that up then you'll want to correct http://en.wikipedia.org/wiki/History_of_Python

"In February 1991, Van Rossum published the code (labeled version 0.9.0) to alt.sources. Already present at this stage in development were classes with inheritance, exception handling, functions [...]"


I really like the approach of allowing either kind of call, with the "dot" style mere convenience over the function style. Rust has that sort of thing:

    let v = &vec![1u32,2,3];
    println!("{}", v.len());
    println!("{}", Vec::len(v));


In general this feature is sometimes called "Universal Function Call Syntax" (UFCS), though interestingly Rust does it exactly the opposite of most other languages with this feature. In Rust, defining a method lets you use it as a free function; in C# and D, defining a free function lets you use it as a method.


I always thought this was a weird nod to typeclasses


This bothered me a lot:

> I seem to have a knack for trying to write things in Rust that rely on intertangled references and mutability: first a game, then a UI library…

First of all, a UI library is EXACTLY the kind of thing I want you running through Rust and having a long, drawn out fight over mutability. Every UI we currently have absolutely sucks in a multi-threaded context. I want the compiler to make you think long and hard that maybe, just maybe your ideas about how to architect a GUI library are very broken and that you have to restart from a clean slate.

As for a game, do you really have that much mutable state, or are you just conditioned to use mutable state by default? Games do have lots of mutable state, but games also have lots of bugs due to multi-threading that being forced to analyze mutable state exposes.


This is a great bag of "Hum, these things irritate me, let's do better". A couple of critiques come to mind:

- too Rust/Python focused - the author needs a more rounded PL experience. I would suggest spending some more time with Haskell/ML and then some time with Common Lisp/Scheme. Reread the programming-languages survey textbooks from college.

- not really enough familiarity with the theoretical side of PL. A metric ton of work has been done in academia; some of the listed problems have been solved, and at worst it will give the author a nice way to go to sleep at night. This kind of relates to the previous point.

The concept of locality, I think, is an important one and a real take-away from this essay. You want to be able to ensure that your code is meaningful locally, without bouncing around half a dozen modules and inheritance trees just for basic understanding.


Whenever I see something like this, I feel it's someone who has thought a lot about language design but unfortunately hasn't heard of Julia. His list is almost a checklist of the reasons Julia was created.


The inference example is actually a good illustration for why you want to add types to a function's signature.

As the author mentioned, a machine can look at how the arguments are used and make some educated guesses. The big problem is that humans have to do the same.

That's why I like optional types à la Dart. Not only do they help with static analysis, they also act as documentation.

With a good editor, this documentation is right at your fingertips. When you type a function's name, a call-tip will remind you of the expected arguments. If you've added type annotations, the types will be there, too.
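Python's optional annotations illustrate the point; a sketch with a hypothetical apply_tax function (the name and behavior are mine, purely for illustration). The annotations are what editors read to build those call-tips:

```python
def apply_tax(price: float, rate: float = 0.2) -> float:
    """Return price with the given tax rate applied."""
    return price * (1 + rate)

# The annotations are introspectable at runtime, which is
# exactly what tooling uses to surface them as documentation:
assert apply_tax.__annotations__["price"] is float
assert apply_tax.__annotations__["return"] is float
```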


> a machine can look at how the arguments are used and make some educated guesses. The big problem is that humans have to do the same.

Why? Just get the machine to do it.

It's like when you see programmers doing arithmetic in their heads. "You're sitting in front of a glorified calculator!"


> Why? Just get the machine to do it.

I meant the writing stage, not the checking stage.

Imagine you don't know which type to pass. So, you'd have to pass something (null or whatever) in order to generate an error which gives you a clue.

That doesn't sound very convenient.

Well, the machine could check which types match that fingerprint, which might work okayish if the list isn't too long, but it won't tell you anything about the writer's intent.

Is a list of doubles, a list of ints, a Float32x4List, or a Float64x2List really the same thing?

What if it's actually meant to be used with something else which also supports the [] operator and whose items support the + operator?

Static analysis might not be able to tell you because it doesn't know all packages in existence. That package might not have been referenced, because that's not required to use a type from that package.

What does the generated documentation look like? It takes one argument "foo" which is something which has a "bar" field and a "baz" method? I'd rather have a concrete type there. "int x". Done.

What if you optimize the function a bit and now the fingerprint doesn't match one or more types anymore. Was this a breaking change?

Your intention didn't change. You always had one particular type in mind.


I'd encourage you to try writing some Haskell. It's a good example of type inference that works.


You didn't answer my questions. I have no doubt that untooled unitypedness can be improved upon.


> Just get the machine to do it.

Exactly. Many people already use IDEs, or programs which provide auto-completion or hinting of the types a function accepts when writing out a function name - the type does not need to be explicitly written for the machine to identify the type of the function.


What types does the following function expect for the arguments a and b, and what type does it return?

What should an IDE tell the user about the types of a and b in the completion popup?

    function foo(a, b) { return a + b; }


In some static languages, the inferred type for a and b could be Numeric, and foo's type could be:

    (Numeric, Numeric) -> Numeric
Which seems good enough to me.

But what if + is also String concatenation? Or any other overloading of "+". Then maybe the type of a and b is Something_that_can_be_+ed. The user can then think "ok, I'll pass a couple of Ints to obtain an Int, or a couple of Strings to obtain a String!". This also seems useful to me.
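One way to actually write down "Something_that_can_be_+ed" is a structural type. A sketch in Python using typing.Protocol (PEP 544); the Addable name is my own invention:

```python
from typing import Protocol, TypeVar

class Addable(Protocol):
    def __add__(self, other): ...

T = TypeVar("T", bound=Addable)

def foo(a: T, b: T) -> T:
    # A checker reads this as: both arguments share some
    # addable type, and the result has that same type.
    return a + b

assert foo(1, 2) == 3
assert foo("a", "b") == "ab"
```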


Now consider the following buggy code:

    var price = document.querySelector('input#price').value; // "10"
    var tax = document.querySelector('input#tax').value; // "1"
    var total = foo(price, tax); // "101"
This will concatenate two strings together, even though the author intended to add two numbers. If the foo function had been annotated as numeric the IDE could have provided a warning about the incorrect types and put a squigly line under foo.

I find this type of fast feedback improves programmer productivity. It means that you get warned straight away, and don't have to do a trip through the debugger to find the problem.

Just because the compiler can infer that a specific type could be passed to a function, that does not mean that the developer intended that function to be used with that type.
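The same trap is easy to reproduce in Python, where form or CSV input also arrives as strings. A minimal sketch (no DOM here; plain strings stand in for the input elements):

```python
def foo(a: int, b: int) -> int:
    return a + b

price = "10"   # values read from a form arrive as strings
tax = "1"

# Annotations aren't enforced at runtime, so this "works"
# and silently concatenates:
assert foo(price, tax) == "101"

# A static checker such as mypy would flag the call above:
# str is not compatible with the annotated int parameters.
assert foo(int(price), int(tax)) == 11
```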


Oh, agreed. I assumed foo was meant to be generic, i.e. useful over more types than just Int. If it was just meant to be used for Int and nothing else, it should be annotated accordingly. Otherwise the compiler, quite correctly, determines that foo is more "generally useful", which may have unintended consequences.


In this case, the issue is with using + for string concatenation, a very bad idea IMHO.


Note that the interface between the machine and the human is one of the slowest. Being able to learn things about numbers, or about code, without touching the keyboard / mouse is important.


Well have the machine annotate the code then.


What the machine annotates and what the programmer wants to read are two very different things, for we often condense some long, confusing type signatures into things we can talk about - for example, a lens.

    type Lens s t a b = forall f. Functor f => (a -> f b) -> s -> f t 
If we want to compose several of these, for example, with our regular function composition operator, then we check the resulting type - it's not what you really want to see.

    fun l1 l2 = l1 . l2
    :t fun
The signature given is less than helpful - if you manage to parse it, you'll probably have forgotten what you were trying to do next. When all we really wanted was a signature like

    Lens' b y -> Lens' a b -> Lens' a y
Making the machine pick the "right" representation is probably a futile task - there could be any number of ambiguous representations of the same type, but the only meaningful one is the one the original programmer intended.


If you follow the train of this thread, you're now arguing against type inference by citing an example from the canonical type-inferred language.


I'm not arguing against type inference, I'm arguing against omitting type signatures.


> It's like when you see programmers doing arithmetic in their heads. "You're sitting in front of a glorified calculator!"

At the company where I work, most engineers have real, actual calculators on their desks. I think I even saw a slide rule once. So they sit there, sometimes, doing arithmetic on their calculators and then type the results into an Excel spreadsheet.


> With a good editor, this documentation is right at your fingertips. When you type a function's name, a call-tip will remind you of the expected arguments. If you've added type annotations, the types will be there, too.

Without type annotations, the types will be there too for a language with type inference. The editor just has to ask the language implementation "hey, what is the type for this function?" and show the result back to the programmer.


I kind of like languages that require "a complete upheaval of [one's] mental model of the universe". :)


I think it's necessary to look for such alternative approaches when the ones we're using are resulting in ~5 bugs per 1k lines of code on average, frequent security issues, crashes, race conditions, etc. Plus the hundreds of man hours spent fixing these problems.

With technology coming more and more into our lives, such that our lives will depend upon it behaving as was intended, we're really going to reach a point where we've got to say enough is enough. The languages (and approaches to development) we're using are not fit for purpose anymore.


The language I want would be a mixture of C, C++, Go, Rust, Julia, JavaScript 6, PHP 7, Swift, Lua.

  * Optional typing (like PHP7/Hack/ES7)
  * support for compilation (statically linked native
    binaries) and JIT (like Visual Basic 6 with its P-Code)
  * memory safety (no null/dangling pointers like Rust)
  * procedural & object oriented & functional style 
    (like JavaScript/PHP/C++)
  * modern base standard library / API (like 
    C/Go/JavaScript/PHP but more modern with better naming, 
    not as heavy as Java class libraries or .Net Framework)
  * third party libraries (like Nodejs NPM)
  * online documentation with code samples and 
    community comments (like PHP.net)
  * good debugger
  * IDE plugins for IDEA/Eclipse/VS
  * good OS support (32 & 64bit) for:
    Windows & OSX & Linux & BSD & iOS & Android.
Edit:

  * Actor model (like Erlang)
  * Built-in concurrency primitives:
    ° light-weight processes (like Fibers in WinAPI, 
      Coroutine in Lua, Goroutine in Go), 
    ° Channels (interprocess communication and 
      synchronization via message passing like in Go 
      and OCaml)


You might want to check out Dart (https://www.dartlang.org)

* Optional types

* The VM compiles source into machine code on the fly, or you can embed the VM into your app.

* No pointers.

* Core library is full-featured (https://api.dartlang.org/apidocs/channels/stable/dartdoc-vie...)

* Third-party libs and packages in https://pub.dartlang.org

* Docs at api.dartlang.org and www.dartdocs.org

* Debugger works in Eclipse, IntelliJ, WebStorm

* Works in Win, Mac, Linux, 32 and 64 bit.

* Also runs on ARM and MIPS

* Isolates for memory-safe concurrency

* Async primitives like Future, Stream, and async/await

(disclaimer: I'm a PM on the Dart team)


Other than IDE plugins for other IDE's, I think Racket has most of this covered.

I don't however think that's what was desired.


D?

Had it not billed itself as a C++ replacement (which pissed off the C++ community = bad rap), I think it would have been a very good candidate. D minus all the unsafe features looks better than Go, IMHO. Go's lack of expressiveness forces developers into copy/paste mode.


The language closest to what you have described I believe is Dart, but it does have a VM.


"modern base API", what does that mean?


I meant "language standard library" (I updated the text).

C/C++/Objective-C/PHP have short functions with names like strlen, strstr, etc. Java/C#/JavaScript have .length() and things like System.out.println()/Console.Write(). Ideally, I would like a middle ground. The Java/C# standard library is too verbose and the C standard functions are a bit cryptic.

It's great one can use different programming styles in JavaScript/PHP/C++. In PHP many functions are available in procedural and object oriented style, e.g. http://php.net/manual/en/mysqli.query.php


My own point of view is biased, admittedly, but I couldn't stop thinking that Common Lisp was relevant for each item in your list.

Of course, much isn't built-in, but this is not as bad as it sounds: concurrent channels are not part of the language specification, but you can install the "calispel" or "cl-actors" libraries, which give you what look like primitive constructs.

Regarding "modern" languages, I like this quote from the author of pgloader: "I switched from Python to Lisp because I wanted a modern language" (also transcribed as: "[...] in searching for a modern programming language the best candidate I found was actually Common Lisp."; see http://tapoueh.org/blog/2014/05/14-pgloader-got-faster.html)


I would add the Actor model of Erlang.


The section on loops, in particular, really captured a small but ubiquitous frustration in every language I've used. We really should be able to say "this is the first iteration", "this is the last iteration" (though that one might not always be possible) and "did the loop run at all?" without getting into index comparisons or manually setting and modifying flags.
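For what it's worth, this can be packaged once as a helper in most languages rather than re-derived with flags every time. A Python sketch (the mark_ends name is my own):

```python
def mark_ends(iterable):
    """Yield (is_first, is_last, item) for each item.

    An empty iterable yields nothing, which also answers
    "did the loop run at all?".
    """
    it = iter(iterable)
    try:
        prev = next(it)
    except StopIteration:
        return
    first = True
    for item in it:
        yield (first, False, prev)
        first = False
        prev = item
    yield (first, True, prev)

assert list(mark_ends("ab")) == [(True, False, "a"), (False, True, "b")]
assert list(mark_ends([5])) == [(True, True, 5)]
assert list(mark_ends([])) == []
```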


Perl 6 has phasers to do exactly this.


Beautiful. I really should take the time to explore Perl 6.


I find myself nodding along as I'm reading this.


I would use tabs instead of spaces, so everyone can adjust their indentation width.


That's fine for indentation, but then all of it must be exclusively tabs, no spaces. And to align other things, you still have to hold the space bar and do it manually.


In many respects I think what you want is Rebol.



