Should the next Propeller be code-compatible? - Page 24 — Parallax Forums

Should the next Propeller be code-compatible?


Comments

  • rokicki Posts: 1,000
    edited 2008-10-27 17:04
    Eh, only big endian makes sense; from an engineering perspective, with real people writing code, the confusion engendered by
    little endian is not worth the trivial amount of circuitry required to support big endian.

    I come from a 68K background; it is such a relief to be working in a big endian world after spending too much time with the
    little endians.
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2008-10-27 17:32
    Paul Baker said...
    The only reason why we view big endian as "correct" is the fact our language is read left to right, but from an engineer's perspective it has (practically) no merit.
    Paul, you're spot on with that observation! But there's more to it than that. We inherited our system of number representation from India by way of the Mideast, whose languages (e.g. Arabic) read right-to-left. In Arabic, numbers are represented, as in European-derived languages, with the most significant digit leftmost on the page. But here's the important distinction: when read embedded in text, the least-significant digit is encountered first. So the only reason that little-endian representations seem unnatural to some is that we've borrowed our number system from a right-to-left language and scabbed it onto a left-to-right language. If early European adopters of our number system had troubled themselves to reverse the digits to the more logical low-to-high order, we wouldn't be having this discussion.

    -Phil

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    'Just a few PropSTICK Kit bare PCBs left!
  • Paul Baker Posts: 6,351
    edited 2008-10-27 17:56
    @rokicki, Sorry I wasn't more explicit about which viewpoint I was taking. I wasn't taking the perspective of the programmer of the chip, but the designer's. Imagine you have a byte-addressable architecture that supports word and long sized variables as well. Let's take the example of adding two word-sized variables together. Say we have 2 registers containing $1234 and $5678. In big endian they are stored $12,$34 and $56,$78. To perform the addition these are the steps that need to be taken:

    1) point source to register 1
    2) add 1 to source pointer
    3) point destination to register 2
    4) add 1 to destination pointer
    5) add bytes pointed to by source and destination and affect the C flag
    6) store value in register pointed to by destination pointer
    7) subtract 1 from source pointer
    8) subtract 1 from destination pointer
    9) add bytes pointed to by source and destination with C flag as input and affect the C flag
    10) store value in register pointed to by destination pointer

    Now if the same variables are stored in little endian (that is $34,$12 and $78,$56) this is the algorithm for adding them together:

    1) point source to register 1
    2) point destination to register 2
    3) add bytes pointed to by source and destination and affect the C flag
    4) store value in register pointed to by destination pointer
    5) increment source pointer
    6) increment destination pointer
    7) add bytes pointed to by source and destination with C flag as input and affect the C flag
    8) store value in register pointed to by destination pointer

    As you can see, the logic to support little endian is less complicated than big endian. This is the point I was trying to make: big endian is purely for the benefit of the programmer's bias, and complicates chip design (not by a huge amount, but every gate costs some amount of money to implement).

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Paul Baker
    Propeller Applications Engineer

    Parallax, Inc.
  • evanh Posts: 16,121
    edited 2008-10-27 21:16
    Paul Baker (Parallax) said...
    The Propeller is little endian.
    It might be useful to treat it as LE in the programming model then.
    Paul Baker (Parallax) said...
    If you look at the physical construction of mathematical operations on individually addressed multi-byte variables it is little endian that makes sense and big endian is a big pain.
    Except that this doesn't happen. Unless there is a 64 bit hardware adder somewhere in the Prop?
    And the opposite can be said of the stack, LE stacking of variables is burdened in the same way, btw.
    Paul Baker (Parallax) said...
    The only reason why we view big endian as "correct" is the fact our language is read left to right, but from an engineer's perspective it has (practically) no merit.
    Damn good reason I'd say. As Phil said, our number systems are all Big Endian.
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2008-10-27 22:11
    evanh said...
    As Phil said, our number systems are all Big Endian.
    When you say "our", I presume you refer only to Western left-to-right languages. The selfsame numbering system in the right-to-left languages it's native to (e.g. Arabic) is little endian. My point was that our numbering system is a mistake of history, when Europe adopted the Arabic numbering system as-is without reversing the digit order. Printing numbers least-significant digit first is more logical, which the designers of Arabic numerals and of little endian storage systems realized.

    -Phil

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    'Just a few PropSTICK Kit bare PCBs left!
  • Paul Baker Posts: 6,351
    edited 2008-10-27 22:21
    The cog may be a native 32 bit processor that doesn't directly support other sizes, but the hub is byte addressable; this is where endianness comes into play on the Propeller. As much as some people would wish otherwise, endianness is too major a horse to switch midstream. The Propeller will always be a little-endian machine. It would be far simpler for someone to create a preprocessor which converts big-endian-entered numbers into little endian.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Paul Baker
    Propeller Applications Engineer

    Parallax, Inc.
  • evanh Posts: 16,121
    edited 2008-10-27 23:27
    Now sounds like a good time to me. Just when the decision to go non-compatible has been made.
  • evanh Posts: 16,121
    edited 2008-10-27 23:58
    Phil Pilgrim (PhiPi) said...
    The selfsame numbering system in the right-to-left languages it's native to (e.g. Arabic) is little endian.
    Would be interesting to know how they deal with, say, 15,000 as words. Do they say something like thousands fifteen?
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2008-10-28 00:30
    Remember "four and twenty blackbirds"? It's not unheard of, even in English. :)

    Seriously (at the risk of verging slightly OT), if there are any native speakers of Semitic or other right-to-left languages out there, I'd also be curious as to the way long numbers are vocalized.

    -Phil

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    'Just a few PropSTICK Kit bare PCBs left!
  • rokicki Posts: 1,000
    edited 2008-10-28 01:10
    Paul,

    Of course I understand all that. But the Propeller doesn't do *any* of that; it adds/subtracts *longs* and only longs.
    Sure, it may load bytes and words into a register and store them back, but the datapath is 32-bits.

    You can *implement* arithmetic either little-endian or big-endian on this programming model with similar ease. The
    difference in code is trivial.

    It's what we users see, understand, and manipulate that is important, and as a programming model, I think most
    people simply find big-endian easier to work with.

    But of course that horse has long gone.

    (There have been architectures that have supported both endian-ness and fairly easily too.)
  • Roy Eltham Posts: 3,000
    edited 2008-10-28 07:46
    I've been programming for 28 years, using machines from 8-bit to 64-bit and both little and big endian.

    The argument of little vs big endian has been around for a long, long time, and it's not going to be resolved here.

    Honestly, I couldn't care less what the memory representation is, as long as it's known and consistent. Besides, these days you rarely have to read the byte stream directly. Any decent hex editor or memory view tool can show byte, word, dword, and qword data as the whole number properly transposed to human readable form. In fact, Visual Studio's memory view (and I'm sure other tools do too) can also show you floats and doubles "decoded" into human readable form.


    Seriously, it really just doesn't matter.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Check out the Propeller Wiki and contribute if you can.
  • evanh Posts: 16,121
    edited 2008-10-28 10:51
    Seriously, it does matter.

    While .net'ers might feel M$ has it nicely hidden away, here we are dealing with the hardware quite directly. And intentionally I might add.
  • heater Posts: 3,370
    edited 2008-10-28 12:47
    @evanh : I have to ask you again, can we have an example of the Propeller/Spin/PASM having a "mixture of both Big Endian and Little Endian" ?

    Just now I'm not seeing the problem.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • evanh Posts: 16,121
    edited 2008-10-28 13:21
    Phil kind of answered it about our view of numbers. The editors are presenting everything in BE. The formatting of the bit order is displayed as BE. And any subsequent grouping of those bits are BE formatted also.
  • evanh Posts: 16,121
    edited 2008-10-28 13:32
    Actually, to respond to Phil's sort-of query about western adoption of the exact format of the Arabic number system: one really obvious reason is that trade would be possible with no common spoken language. Just the numerically written receipt of the transaction was needed.

    This also demonstrates that the west was the backwash back in those days.
  • heater Posts: 3,370
    edited 2008-10-28 14:39
    @evanh: Blimey, I think I finally cottoned on to what you are getting at. The binary dump of a number in Spin can come out either big endian or little endian depending on where the numbers are used. For example:

    DAT
      A long $12345678
    
    PUB  main
      A := $12345678
    
    



    The binary dump is:

    00 1b b7 00 00 12 10 00 24 00 2c 00 1c 00 30 00
    14 00 02 00 0c 00 00 00 78 56 34 12 3b 12 34 56
    78
    c5 08 08 32 ff ff f9 ff ff ff f9 ff

    This is, well, crappy.

    But given this talk of Arabic I can see now that the problem is that the numbers are OK but our language is backwards.
    We should write $1234 = A and read it from right to left.

    Perhaps this is why many children have a problem with arithmetic from the get-go. First we teach them to read from left to right, then we have them adding up big numbers from right to left. This sort of thing, like non-phonetic English spelling, really confuses the logic of young minds.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • BradC Posts: 2,601
    edited 2008-10-28 15:50
    heater said...


    The binary dump is:

    00 1b b7 00 00 12 10 00 24 00 2c 00 1c 00 30 00
    14 00 02 00 0c 00 00 00 78 56 34 12 3b 12 34 56
    78
    c5 08 08 32 ff ff f9 ff ff ff f9 ff

    No, if you look at the interpreter source, this makes perfect sense. In this particular case pragmatism wins the day.

    And to be honest, these are details the compiler takes care of and really should have little bearing on what the user sees (unless they are trying to do something clever, in which case they should be more than capable of coping with it)

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • heater Posts: 3,370
    edited 2008-10-28 17:24
    @BradC: You might have to elaborate on "No" for me. I may not live long enough to read and understand the interpreter source but I'm all for pragmatism.

    I notice that for "A := $12345678" the literal constant here is not long aligned in the byte code or even word aligned. This implies to me that the interpreter must be doing rdbyte 4 times to load it. Seems awful slow.

    Am I on the right track here ?

    Is it possible to speed up code by defining constant values as items in a DAT section, assuming they are then aligned correctly and picked up atomically?

    Also if the same constant is used many times code size may shrink by having it as a DAT item. (not CON)

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • BradC Posts: 2,601
    edited 2008-10-28 18:06
    heater said...
    @BradC: You might have to elaborate on "No" for me. I may not live long enough to read and understand the interpreter source but I'm all for pragmatism.

    I do apologise for my terseness there; my mind may well not be 100% on the job.
    heater said...

    I notice that for "A := $12345678" the literal constant here is not long aligned in the byte code or even word aligned. This implies to me that the interpreter must be doing rdbyte 4 times to load it. Seems awful slow.

    Am I on the right track here ?

    You are absolutely correct. The 1st byte of the constant defines how many bytes make up the following data, allowing 8/16/24/32 bit constants to be efficiently packed.
    It is my observation (and mind you, I'm not even close to knowing the truth) that the interpreter does many things in the interest of code density that perhaps are not in the interest of speed. To be honest, to do what it does in the space it does is in some ways a minor miracle in itself.

    Anyway, I digress...
    heater said...

    Is it possible to speed up code by defining constant values as items in a DAT section. Assuming they are then aligned correctly and picked up atomically.


    Also if the same constant is used many times code size may shrink by having it as a DAT item. (not CON)

    It will most certainly make a size difference, and at a glance I'd hazard a guess to say it'd be faster. The two people who can tell you with any certainty are Ale and Cluso99, as they have the clever stuff set up to time it. *OR* do what I do: knock out a quick test app that performs each load 1000 times inside a waitcnt period and prints the timing results to a tv_text object :)

    I'd do it now but I have no way of loading my propeller until Mr UPS phones me up to tell me my little box of Parallax goodies has arrived. I feel like an addict waiting for his next hit..

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • heater Posts: 3,370
    edited 2008-10-28 18:49
    OK BradC, so the interpreter picks up a byte, shifts it 8 places, picks up another byte, shifts 8 places, as many as required. This accounts for the pragmatic big-endian representation in the byte code. I guess the length is not in the first byte of the constant because in my example there is only one byte code before the constant, which must be some kind of load operation with the length encoded in it, whatever.

    Yes, given the constraints, Spin and its interpreter are amazing.

    Normally I would knock up some experimental code for exactly this sort of thing, however just now my home-made demo-board-style board is bust and my soldering iron is back in my old apartment...

    Cheers.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • BradC Posts: 2,601
    edited 2008-10-28 18:56
    heater said...
    OK BradC, so the interpreter picks up a byte, shifts it 8 places, picks up another byte, shifts 8 places, as many as required. This accounts for the pragmatic big-endian representation in the byte code. I guess the length is not in the first byte of the constant because in my example there is only one byte code before the constant, which must be some kind of load operation with the length encoded in it, whatever.

    mmm.. sorry, by 1st byte I meant the command itself. There are effectively 4 commands for constant loads. 1/2/3/4 bytes..

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Pull my finger!
  • evanh Posts: 16,121
    edited 2008-10-28 21:13
    BradC said...
    ... these are details the compiler takes care of and really should have little bearing on what the user sees (unless they are trying to do something clever, in which case they should be more than capable of coping with it)
    LE is added complexity for no good reason. It's that bucket of water from the doorway that falls on me every morning. Sure, I can deal with it, but what were they thinking!?

    Brad, time for an Arabic editor. :P
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2008-10-28 23:16
    I think all this could be resolved if memory dumps were displayed right-to-left. After all, the values are displayed least-significant (== lowest numbered) -bit or -nybble rightmost. Why not display the least-significant (== lowest addressed) byte/word/long rightmost as well? Viewed in this fashion, the sense of LE order becomes (painfully to some, possibly) obvious. By "sense" I mean that the bit order from the beginning to the end of memory is monotonic, not disjoint as it would be in BE order. Bit12 of a long, for example, is byte1 bit4 — very easy to compute.

    I have to confess that I was once a "biget" (big-endian ... um ... advocate). But this is only because it was the first system I was exposed to, and I became imprinted by it. Logic can sometimes take a while to trump habit. :)

    -Phil

    Addendum: I suppose BE order could still have logical consistency if we reversed the way we number bits in a byte/word/long. Instead of having the MSB of a long be bit31, for example, it would be bit0, and the LSB would be bit31. That way, in BE order, and reading from left to right, the bit order for the entire memory would be monotonic. (I'm not advocating this, BTW, but just saying that's what it would take for BE order and bit order to be consistent with each other.)

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    'Just a few PropSTICK Kit bare PCBs left!

    Post Edited (Phil Pilgrim (PhiPi)) : 10/29/2008 12:05:02 AM GMT
  • evanh Posts: 16,121
    edited 2008-10-29 10:54
    Phil Pilgrim (PhiPi) said...
    By "sense" I mean that the bit order from the beginning to the end of memory is monotonic, not disjoint as it would be in BE order.
    It's only disjointed because some dork engineered the hardware as LE. And, yes, I'm talking about Intel now.
    Phil Pilgrim (PhiPi) said...
    I have to confess that I was once a "biget" (big-endian ... um ... advocate). But this is only because it was the first system I was exposed to, and I became imprinted by it. Logic can sometimes take a while to trump habit. :)
    Or lack of choice forced you to fool yourself.
    Phil Pilgrim (PhiPi) said...
    Addendum: I suppose BE order could still have logical consistency if we reversed the way we number bits in a byte/word/long. Instead of having the MSB of a long be bit31, for example, it would be bit0, and the LSB would be bit31. That way, in BE order, and reading from left to right, the bit order for the entire memory would be monotonic.
    Not likely. That setup might suit LE having the bit order the same as the significance. But it still doesn't change the fact that the programming environments are BE. And until we are all talking Arabic, big endian is a no-brainer.
  • heater Posts: 3,370
    edited 2008-10-29 12:07
    There is no correct answer to this. Why do stacks grow downward? Why do electrons travel against the conventional description of current flow? Why do the Brits drive on the wrong side of the road?

    There are places in India where writing has swapped from right to left to left to right, more than once if I understand correctly.

    Intel have done a lot of architecturally ugly things, which I won't go into here, but I'm not sure the byte ordering is one of them.

    Like Phil I was put out by Intel's little endianness as my first exposure to programming micros was with Motorola. But really this is only a little problem when examining hex dumps in bytes.

    "until we are all talking Arabic" - Well there might be the problem: when we adopted the Arabic number system we did not turn our writing around to match. Had we done so, Phil's suggestion would be what we have.

    In answer to someone's question above about how numbers are spoken in Arabic/Hebrew etc., my Hebrew-speaking girlfriend tells me this: the writing goes from right to left, and the numbers are written least significant digit on the right (as we do). So far so good and logical. But when speaking the numbers they come out most significant digit first (as we do), so 24 is spoken as "twenty and four", not "four and twenty". Strange.

    I am a little disturbed by your discovery of a mixture of big and little endian in Spin but as discussed it is the pragmatic way to go.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • evanh Posts: 16,121
    edited 2008-10-29 13:18
    Heh, yeah, it's not like LE is practical for a programming model. I was poking fun by suggesting otherwise.

    That sure is a surprise about the spoken endianness. Certainly puts a question mark on how palatable LE would be anywhere. Given this discovery, there is no way the west would have or should have reversed its endianness to LE.

    There is a correct answer to this: design the Prop II hardware to be big endian. It's the very reason why I've posted in this thread, which, in turn, is the very reason why I've stirred the subject up at all.

    The code is already incompatible. There is no external legacy data bus to deal with. There is nothing to lose and everything to gain by changing to BE.
  • heater Posts: 3,370
    edited 2008-10-29 13:28
    @evanh: Are we missing the point that from an Arabic/Hebrew right-to-left point of view the written numerals ARE little endian?
    They only look big endian from a western point of view.

    Anyway this is my last word on the subject.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    For me, the past is not over yet.
  • evanh Posts: 16,121
    edited 2008-10-29 13:39
    I understand. The thing is, they still perceive the numbers in order of significance. They kind of read their own numbers backwards, and in doing so maintain big endianness the same as the west.
  • Roy Eltham Posts: 3,000
    edited 2008-10-29 16:22
    Just FYI, Intel wasn't the first or only little endian architecture. The DEC VAX came before the 8086.

    And as has been stated, it is simpler in the hardware logic to deal with little-endian layouts for many operations, which is likely why it was chosen. It may seem like a trivial difference now, but in the mid/late 70s I don't think it was.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Check out the Propeller Wiki and contribute if you can.
  • HighJump Posts: 3
    edited 2008-10-29 22:16
    Phil, et al,

    I, too, am a Biget. However, since the mistake has been made, I have adapted to it. The dump in my rForth-64 implementation displays in little-endian format:

    > ' quit 100 dumpl  
           C        8        4        0 <__A_d_d_r_e_s_s_> 0   4   8   C   
    FFFF377C 904865FF                   <000000010020CE60> _________eH_|7__
    FFFF374C FFFF40B8 FFFFBF0C 00000048 <000000010020CE70> H________@__L7__
    FFFF5B04 FFFF6DF0 FFFF40CC 00000038 <000000010020CE80> 8____@___m___[noparse][[/noparse]__
    FFFF3704 FFFF4118 00000010 FFFF3738 <000000010020CE90> 87_______A___7__
    FFFFC1FC FFFF8E68 FFFF8CD4 0000000C <000000010020CEA0> ________h_______
    FFFF4B2C FFFF63E8 FFFFFFC0 FFFF36F0 <000000010020CEB0> _6_______c__,K__
    FFFF4B1C FFFFD068 FFFF4B24 FFFFD040 <000000010020CEC0> @___$K__h____K__
    FFFFFE44 0000008C FFFF3714 FFFF65A0 <000000010020CED0> _e___7______D___
    FFFF74D4 00000020 4B4F2004 FFFF3548 <000000010020CEE0> H5___ OK ____t__
    FFFF36CC FFFF59F0 FFFF4014 FFFFFEC8 <000000010020CEF0> _____@___Y___6__
    6174530F FFFF3520 FFFF745C 00000024 <000000010020CF00> $___\t__ 5___Sta
    FFFF74A4 776F6C66 7265646E 55206B63 <000000010020CF10> ck Underflow_t__
    FFFF4024 FFFF8E08 FFFF751C FFFF6578 <000000010020CF20> xe___u______$@__
    FFFF4014 FFFF74E8 00000080 FFFF3420 <000000010020CF30>  4_______t___@__
    FFFFFC7C FFFF9D48 00000014 FFFF3688 <000000010020CF40> _6______H___|___
    FFFF3644 FFFF4070 00000008 FFFF3650 <000000010020CF50> P6______p@__D6__
                      FFFF33B4 FFFFFF7C <000000010020CF60> |____3__________
     OK 0 
    



    HighJump
    Phil Pilgrim (PhiPi) said...
    I think all this could be resolved if memory dumps were displayed right-to-left. After all, the values are displayed least-significant (== lowest numbered) -bit or -nybble rightmost. Why not display the least-significant (== lowest addressed) byte/word/long rightmost as well? Viewed in this fashion, the sense of LE order becomes (painfully to some, possibly) obvious. By "sense" I mean that the bit order from the beginning to the end of memory is monotonic, not disjoint as it would be in BE order. Bit12 of a long, for example, is byte1 bit4 — very easy to compute.

    I have to confess that I was once a "biget" (big-endian ... um ... advocate). But this is only because it was the first system I was exposed to, and I became imprinted by it. Logic can sometimes take a while to trump habit. :)

    -Phil

    Addendum: I suppose BE order could still have logical consistency if we reversed the way we number bits in a byte/word/long. Instead of having the MSB of a long be bit31, for example, it would be bit0, and the LSB would be bit31. That way, in BE order, and reading from left to right, the bit order for the entire memory would be monotonic. (I'm not advocating this, BTW, but just saying that's what it would take for BE order and bit order to be consistent with each other.)