
1-based indexing? Is there any particular reason Julia uses 1-based indexing? It seems like a pointless way to differ from other languages.


Math/Science focused languages have usually used 1-based indexing, historically. Fortran, MATLAB, R, etc. In the primary fields where Julia is expected to be used, 0-based indexing would be a differentiator, not the other way around.


> We have chosen 1 to be more similar to existing math software.

https://github.com/julialang/julia/issues/558#issuecomment-4...


It's common in the area: R and MATLAB use 1-based indexing. Fortran does it by default too.


There seems to be a kind of Godwin's law for Julia that states that "When someone mentions Julia in an online discussion, the probability of a complaint about 1-based indexing is 1".


Or maybe it should be "When someone mentions Julia in an online discussion, the probability of that discussion turning into a 0-based versus 1-based indexing discussion is 1".


Personally, I find it annoying considering it's just based on convention from other math software. Otherwise, an amazing language.


When I started Julia I was annoyed by it.

Then I realized that 90% of my mental gymnastics with indices got simplified. Explaining indexing to newbies is now immediate where before it took a slide, several examples, and a clever picture. Slicing is also more natural:

"1:3 picks out index 1, 2 and 3, which are the first, second and third element of the array."

Instead of the python version of

"0:4 picks out index 0, 1 and 2, which are the first, second and third element of the array."


> "0:4 picks out index 0, 1 and 2, which are the first, second and third element of the array."

I don't think that's right? It should pick out four things, not three.

Of course, you could argue that getting that wrong proves your point! I personally still prefer 0-indexing, but I definitely agree with you that having an exclusive upper bound is confusing.


You're right of course. Not intentional.

0:3 picks out index 0, 1, 2.

So 1:3 picks out the second and third element, which are indexed by 1 and 2 respectively.
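For anyone who wants to check the corrected claim, here is a minimal Python sketch. The list `letters` and the variable names are made up for illustration; only the slicing behavior itself comes from the discussion above.

```python
# Python slices are half-open: the start index is included, the stop index is not.
letters = ["a", "b", "c", "d", "e"]

first_three = letters[0:3]       # indices 0, 1, 2
second_and_third = letters[1:3]  # indices 1, 2

print(first_three)       # ['a', 'b', 'c']
print(second_and_third)  # ['b', 'c']

# With a half-open range, the slice length is simply stop - start.
assert len(letters[0:3]) == 3 - 0

# Adjacent slices share a boundary and tile the list with no overlap.
assert letters[0:2] + letters[2:5] == letters
```

In Julia, by contrast, `1:3` is inclusive on both ends, so the same three elements are selected by the indices 1, 2 and 3.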

My guess is that what one prefers probably depends entirely on the data structures one works with most....


I like how you got it wrong, proving your point.


Fewer off-by-one errors, for one.


Whereas 0-based indexing is based on convention from other languages, where it arguably made more sense because it was just a convenience over pointer arithmetic. What reason other than convention is there for 0-based array indexing in any language where an array isn't just a memory location and the index a calculation of type-size times index bytes?
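To make the "type-size times index" point concrete, here is a rough Python sketch that reads elements out of a raw byte buffer by computing the offset by hand. The data and the helper name `element_at` are hypothetical, and it assumes `array`'s `"i"` typecode and `struct`'s native `"i"` format both mean a C `int`, which is what the standard library documents.

```python
import struct
from array import array

# A contiguous block of C ints, viewed as raw bytes.
values = array("i", [10, 20, 30, 40])
raw = values.tobytes()
size = values.itemsize  # bytes per element


def element_at(index: int) -> int:
    """Read element `index` (0-based) as: base + index * element size."""
    (value,) = struct.unpack_from(values.typecode, raw, index * size)
    return value


print(element_at(0))  # 10, offset 0: the index is the distance from the start
print(element_at(3))  # 40, offset 3 * size
```

With 1-based indices the same lookup would need `(index - 1) * size`, which is the extra subtraction that 0-based indexing avoids when an array really is just a memory location.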


Edsger Dijkstra on why numbering should start at zero: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/E...


Which is a valid argument for numbering starting at zero, not just in programming languages but in general, including human spoken languages. The common counter-argument is "People don't generally think that way unless trained to do so". Like many things in life, convention and momentum often trump reasoned argument.

In general, much of what's chosen for language design is influenced to some degree by convention. This is a valid design choice, because you're maximizing use of existing domain knowledge of your audience.

In this specific case, I would argue that in the many languages where an array isn't a thin shim over memory management (Ruby, Python, Perl, JS, etc.) and where user friendliness is at a premium over performance, 1-based array indexing would make better sense... if most people adopting those languages weren't already fairly well acquainted with 0-based indexing from CS programs or exposure to other languages (and if it weren't a simple concept to understand). Since many new users are familiar with it, and it is easy to understand, it's simplest to go with what some percentage of people already know, so convention wins out.


The One True Wiki has a fairly complete discussion of this question:

http://wiki.c2.com/?WhyNumberingShouldStartAtZero

What I find most persuasive is that 0 based indexing brings measurement into accord with enumeration.


There are many rebuttals to this claim by Dijkstra, too.


Funny how the argument about veering into the unnatural numbers is voided by a Maybe type.


Maybe this is a stupid idea, but at this point the economy is tied to 0-based indexing so much, that I would rather get math moved to 0-based indexing, than computing to 1-based indexing.

Being able to easily port code between languages (and math articles) is important, when there's no clear winner, I like consistency.


Good luck with that. 0-based indexing is non-existent outside of programming. No one who doesn't do pointer arithmetic would come up with it.

In mathematics 1...N is a perfectly good notation. Julia expresses the same idea with 1:N. This is consistent with spoken language (1 denotes the 1st element). Consistency, if anything, would not come down in favor of the software engineering choice here.


A lot of new, very important papers in, e.g., machine learning and quantitative finance come _after_ being implemented in computer code. I just looked at some, and they already seem to use index 0 for the initial states / base layer... here's just one example, where 0- and 1-based indexing are mixed, but I see it everywhere:

https://arxiv.org/pdf/1804.02717.pdf


Except European hotels, where the first floor is one up from the ground floor, making floor numbering effectively zero-based.


It's bad for me too. I also miss some other Ruby features (for example, the `or` and `and` keywords; `if` as a modifier; using blocks to pass a function as a parameter), but there were some great surprises: the Windows support seems superior to Ruby's, where I get a lot of compile errors; and, of course, there are more and better math libraries.


Because they're designed for people, not machines. People count from 1.


There is no more correct. There is simply more to purpose.

0-based indexing won in programming because it is simply more to purpose in programming. We rarely operate on ranges, but we operate on offsets all the time.

And 0-based wasn't solely "C won so 0 won". We had 1-based languages for a LONG time, and, if they were sufficiently superior, they should have displaced C. They did not.

In addition, in proper programming languages you don't count--you iterate, fold, accumulate, etc.--and avoid the index altogether because it is error-prone.

1-based indexing causes all kinds of havoc in circular ranges. In particular when you try to access things in a circular manner (very common in programming--uncommon in mathematics), it causes grief.

  // 1-based
  new_index = index % N + 1                     // next
  new_index = (index - 1) % N                   // previous. Careful: the parentheses are REQUIRED
  new_index = (new_index == 0 ? N : new_index)  // previous, continued: map 0 back to N

  // 0-based
  new_index = (index + 1) % N                   // next
  new_index = (index - 1) % N                   // previous

Range discussion from Dijkstra in 1982: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/E...


Actually what's funny here is that (index - 1) % N doesn't work in C, as it can be negative. You have to use (index+N-1) % N.
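A small sketch of the `(index + N - 1) % N` form, written in Python only so it is easy to run. The function names `next_index` and `prev_index` are invented for this example. The point is that the left operand never goes negative, so the result should not depend on how a language's `%` treats negative numbers.

```python
def next_index(index: int, n: int) -> int:
    """Step forward in a 0-based ring of n slots."""
    return (index + 1) % n


def prev_index(index: int, n: int) -> int:
    """Step backward in a 0-based ring of n slots.

    Adding n first keeps the left operand non-negative for any index in
    0..n-1, so truncating and flooring remainders give the same answer.
    """
    return (index + n - 1) % n


n = 5
forward = [0]
backward = [0]
for _ in range(n):
    forward.append(next_index(forward[-1], n))
    backward.append(prev_index(backward[-1], n))

print(forward)   # [0, 1, 2, 3, 4, 0]
print(backward)  # [0, 4, 3, 2, 1, 0]
```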

I still prefer 0 based indexing (for example for calculating the size of an array), but the worst thing is inconsistency between languages. 0 won, that's it.


Oh, that's hilarious. I tested this in Python where it worked fine and forgot that C has a "broken" modulo operator:

Python:

  >>> -1 % 20
  19

C:

  printf("%d\n", (-1 % 20));
  -1

Even worse, it still fails even on unsigned:

  printf("%u\n", ((unsigned int)0 - (unsigned int)1) % 20);
  15

So you still need to add N:

  printf("%u\n", ((unsigned int)0 - (unsigned int)1 + 20) % 20);
  19

Thanks for the reminder of humility.

(Side Note: For those reading this, the C operator isn't "broken", per se. There are three properties that modulo can adhere to but two of the three are mutually exclusive.)


C does not have a modulo operator. It has a remainder operator, which works perfectly fine. Calling C's `%` a 'broken modulo operator' is like calling `+` a 'broken minus operator'. Quite a few languages have separate operators, keywords, or functions for 'modulo' and 'remainder'.
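The difference between the two operations can be sketched in Python. This is an illustration, not a quote from any standard: `trunc_rem` is a hypothetical helper that mimics a remainder paired with division truncating toward zero (the behavior C99 and later specify for `%` on integers), while `floor_mod` just wraps Python's own `%`, which pairs with floor division.

```python
def trunc_rem(a: int, b: int) -> int:
    """Remainder paired with division that truncates toward zero (C-style)."""
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    return a - b * quotient


def floor_mod(a: int, b: int) -> int:
    """Modulo paired with floor division (Python's built-in %)."""
    return a % b


for a, b in [(-1, 20), (7, -3), (-7, 3), (7, 3)]:
    print(a, b, trunc_rem(a, b), floor_mod(a, b))
# -1 20 -1 19
# 7 -3 1 -2
# -7 3 -1 2
# 7 3 1 1
```

The truncating remainder takes the sign of the dividend and the flooring modulo takes the sign of the divisor; they agree whenever both operands have the same sign.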


> We rarely operate on ranges, but we operate on offsets all the time.

I would disagree with that assertion. Maybe it was true historically, but just look at how often ranges are used today - the fact that many languages have an abstraction for them in the core library is a testament to that.

I would also argue that having ranges (and underlying iterators) as opaque abstractions is preferable to conflating them with indices. Then you can have your cake and eat it too - the elements are counted naturally, but if you have an iterator to the first element, you can deal with 0-based offsets just as naturally.
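As a rough illustration of that argument in Python (the list `readings` is made-up data): the elements can be folded, accumulated, or counted "naturally" from 1 without ever indexing, and an iterator can still be advanced by a 0-based offset when one is needed.

```python
from itertools import accumulate, islice

readings = [3, 1, 4, 1, 5]

# Fold and accumulate without touching an index.
total = sum(readings)
running_totals = list(accumulate(readings))

# Count the elements the way people do, starting from 1.
numbered = list(enumerate(readings, start=1))

# Offsets still work on the iterator: skip 2 elements from the start.
element_at_offset_2 = next(islice(readings, 2, None))

print(total)                # 14
print(running_totals)       # [3, 4, 8, 9, 14]
print(numbered[0])          # (1, 3)
print(element_at_offset_2)  # 4
```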


Could also be that C is superior in other ways, that there isn't really a big difference between 0- and 1-based indexing, and thus that 0-based won on the back of C's other strengths.

The Dijkstra discussion is only aesthetic preference, nothing more.

Edit:

In Julia the examples would also idiomatically be written in terms of the provided mod1 function:

    new_index = mod1(old_index + 1, N)
    new_index = mod1(old_index - 1, N)
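For readers without Julia at hand, here is a Python sketch of the same idea. This `mod1` is a hypothetical re-implementation for positive `n`, not Julia's actual source, and it relies on Python's `%` returning a non-negative result for a positive divisor.

```python
def mod1(i: int, n: int) -> int:
    """Wrap i into the range 1..n (assumes n > 0)."""
    return (i - 1) % n + 1


n = 5
print(mod1(5 + 1, n))  # 1, stepping forward from the last slot wraps to the first
print(mod1(1 - 1, n))  # 5, stepping backward from the first slot wraps to the last
print(mod1(3, n))      # 3, values already in range are unchanged
```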


> Could also be that C is superior in other ways, there isn't really a big difference between 0 and 1 based indexing, and thus 0 based won on the back of C's other strengths.

Possibly, but then 1-based indexing certainly isn't enough of a positive to overcome the other stuff. And that's evidence, too.

> The Dijkstra discussion is only aesthetic preference, nothing more.

Dijkstra's comment says that people using the other 3 conventions were committing more errors--that's data.

> In Julia the examples would also idiomatically be written in terms of the provided mod1 function:

Agreed. The proper way is to encapsulate that behind a function so you don't have to think about it.

However, if you have to unpack that and repack it all the time (for example, Lua calling C), then you can't just encapsulate and forget about it.


Are you counting the fence posts or sections of the fence?

Edit: I think I screwed up that analogy.


They shouldn't. It's annoying that we live in the 21st century when the years start with "20".



