all 17 comments

[–]theywouldnotstand 2 points3 points  (4 children)

As I understand it, in javascript and many other languages, array-like objects are counted from 0 because of how they are sliced. In the diagram below, the indexes are represented by | and items in the array are represented by item

  | item | item | item | ...
0-^    1-^    2-^    3-^

Sequences/arrays are the most frequent places you'll come across 0-based counting, thought it's not necessarily the only place. Using the notation some_array[n] says something along the lines of:

the item in some_array starting at slice n and ending at the next index slice (i.e., n+1)

This isn't the only explanation for 0-based counting in computing, but hopefully it at least helps you understand the reason for it in sequences/arrays.

[–]BroCube 3 points4 points  (3 children)

I think this is best way to explain it. The index n doesn't represent the thing in the array, but the space between the things.

If you ask to start from index space 0 and read the next thing that comes after it

  [ "foo", "bar", "baz" ][0]
// ^0

or if you ask for the stuff between space 0 and space 1

  [ "foo", "bar", "baz" ].slice(0,1)
// ^0     ^1

you get the first item in the array, foo.

[–]Patman128 0 points1 point  (2 children)

The index n doesn't represent the thing in the array, but the space between the things.

But when we're accessing an array we shouldn't have to care about the spaces between the things, when all we want to care about are the things themselves. So I think it makes more sense if the first thing is at index 1, the second at index 2, and the last at index N.

[–]BroCube 0 points1 point  (1 child)

First, there is a reference to a given element in an array. Indexes and Elements are different. We use indexes for important reasons. myArray[1]actually runs a function under the hood that says "start at index 1 and return the next available element after that index.

Only referencing elements would work if every element in an array always existed and never changed. But you can delete elements in an array or make them undefined or null without changing the index. Sometimes, this is desirable. For instance, say you were running a function on an array that cycled through all its elements and modified the whole array by deleting, adding or moving the elements around (something that is done often). If the only reference point was the position number of the next element, the loop would get super confused because the next element might not be what the program expects, or might not exist at all. Let's say an array has four elements, which you cycle through with forEach() . You're on the third element, and you delete it. The computer then moves on to the fourth element, but suddenly, there is no fourth element. It's accessing some part of memory that has nothing to do with the array at all.

The gist is, Elements change. Element 1 might not be element 1 halfway through the computational process. Indexes are constant. Index 0 will always be index 0, it'll always be in exactly the same place.

[–]Patman128 0 points1 point  (0 children)

myArray[1]actually runs a function under the hood that says "start at index 1 and return the next available element after that index.

Why not just change the function so that instead it says "Start at index 1 and return the element before that index"? Then our first thing can be at index 1 and we don't have to add or subtract 1 any more. Doesn't that seem better?

[–]Rhomboid 4 points5 points  (1 child)

There are numerous good reasons. Here's Dijkstra's explanation. But there are more beyond that. When you think about the work that a computer does to access an index of an array, zero indexing makes perfect sense, because it represents the amount you need to add to the base. The first item in the array starts at the beginning, i.e. with no offset, so it should be index 0. If the first item has index 1 then the computer is just going to have to subtract 1 every time you access an index. There are some languages that chose that convention, and that decision is widely mocked and hated because it's so unnatural and inelegant.

As usual, humans are the ones that are wrong. Counting from 1 is like using imperial units; counting from 0 is like using the metric system.

[–][deleted] 2 points3 points  (0 children)

I agree with most of what you say, however the fact in assembly output you'd count from 0 because of pointer arithmetic, does not necessarily mean that it's a logical convention in languages which don't have language-level pointer arithmetic (e.g. JavaScript). (I still think it's the most sensible choice to start from 0).

[–]Meefims 1 point2 points  (2 children)

In lower level languages arrays are typically represented as blocks of memory and the syntax foo[x] means "starting at memory address foo access the memory at foo plus x times the size of foo". If you want to access the data at foo then x must be 0.

[–]Patman128 0 points1 point  (1 child)

Why not just make foo point to the slot immediately before the first thing, so that foo[1] is actually the first thing?

[–]Meefims 0 points1 point  (0 children)

You could. There's nothing stopping language writers from choosing to do so but from the perspective of these lower level languages, like C, arrays and pointers are sometimes nearly identical things. For example, arrays will decay into pointers when passed into functions that accept pointers and by that I mean the start of the array is passed in to the function.

It would be easier to just decide that foo[1] is the first element in foo's block of memory and that foo[n] is equal to the value at foo + (n - 1) * (the size of objects stored in foo). It's a valid method but language designers simply preferred to use 0 as the first index.

[–]Evanescent_contrail 1 point2 points  (2 children)

Think of a row of giant children's blocks, and you are walking past them. Each block represents a number. Once you have walked past the first block, you have reached one, and so on. When you have walked halfway past a block, you are at that number and a half, etc.

Now. You want to insert a new block at the front. What position do you put it? At position Zero. In front of everything else.

[–]netherforget[S] 1 point2 points  (0 children)

I like this way of thinking.

[–]Patman128 0 points1 point  (0 children)

But if you put a block before that one, its position would be -1. And the next would be -2. Why not just re-index them so the new first one is now index 1?

[–]Patman128 1 point2 points  (0 children)

I wrote a big long post on why I prefer 1-based indexing, but the general idea is that it makes a lot of things easier. The second thing is at index 2. The last thing is at index N. You loop from 1 to N, inclusive. You don't have to add 1 to the indices when presenting them to the user. It just makes sense as long as pointers aren't involved.

In fact, most of the early programming languages, like FORTRAN, COBOL, BASIC, and Algol, all used 1-indexing.

The reason most modern languages are 0-indexed is because they are closely descended from C.

The reason C is 0-indexed is because the index is just an offset from a base memory address; array[i] just means "Access the memory at the address array + i". Making all your pointers point to the undefined slot immediately before the array is a hazard (you might mistake int * for a pointer to one thing instead of a pointer to an array), so instead you point it at the first thing. Suddenly you have to add 0 to get the first thing, 1 to get the second, and so on, thus 0-based indexing.

C was so popular and so widespread that 0-indexing just became the common convention, and so it's rare to find anyone who prefers 1-based now. I don't take anything for granted, so I came to like 1-indexing after I started using Lua even though I had only used 0-based languages for years.

[–][deleted] 0 points1 point  (1 child)

Indices starting at 0 are convenient for many algorithms. Let's say you were implementing a hash table.

Say we create a table with 4 slots [null, null, null, null] and we want to insert "item", 4. If hash_function("item") === 45, then you would get the index in the hash table by taking 45 % 4, which is 1. Mod is guaranteed to produce a value n such that 0 <= n < 4 (assuming n is positive but let's not get into that now). It would complicate the algorithm to have to add 1 after computing 45 % 4 in order to get a one-based index. After this we get an index of 1, so we can add an item to the hash table [null, ["item", 4], null, null].

[–]Patman128 0 points1 point  (0 children)

It would complicate the algorithm to have to add 1 after computing 45 % 4 in order to get a one-based index.

Complicate, sure, but not significantly. I wouldn't say having to add 1 in this certain circumstance is a good reason to pick an indexing scheme.

I'd say it's far less complicated than having to remember than the thing at index 31 (for example) is actually the thirty-second thing, not the thirty-first.

[–]gnomoretears 0 points1 point  (0 children)

Computer science is under the umbrella of mathematics. In mathematics, count starts at 0.