Behavior of RDBYTE

escher · 2018-02-03 05:23

I'm witnessing bizarre behavior in my code with reading a byte via rdbyte. cpptr is a hub RAM address of a long which contains 4 colors e.g.

colors     long    %11000011_00110011_00001111_11111111

which are from left to right red, green, blue, white.

If I perform

rdbyte          pixbyte,  cpptr          ' Load color

and display pixbyte, the lowest byte is being displayed i.e. the white 11111111 color. This doesn't make sense to me, as the API states the byte at the address should be the one read (i.e. 11000011), not the lowest byte of the long pointed to by the address.

Stranger still, if I add 1 to cpptr:

add             cpptr,   #1
rdbyte          pixbyte,  cpptr          ' Load color

Then pixbyte is being loaded with 00001111, the blue color... which makes even less sense as incrementing the hub RAM address by 1 byte has now somehow gotten me one byte lower. Same thing for adding 2 and getting green, 3 and getting red.

The full source is a bit too complex to post here, but I have verified 100% that cpptr absolutely contains the address of the colors. What am I missing here?

Phil Pilgrim (PhiPi) · 2018-02-03 05:31

The Propeller uses "little-endian" byte order in its words and longs. IOW, the least significant byte gets the lowest address in a word or long.

-Phil
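
As a rough illustration of what Phil describes, here is a small host-side model in Python (not Propeller code). The `rdbyte` function and the `hub` buffer are hypothetical stand-ins for the RDBYTE instruction and hub RAM; the only thing taken from the thread is the `colors` constant and the little-endian layout.

```python
import struct

# The long from the original post: red, green, blue, white, written left to right.
COLORS = 0b11000011_00110011_00001111_11111111

# "<I" packs a 32-bit unsigned value least significant byte first (little-endian),
# which is the layout described above for hub RAM.
hub = struct.pack("<I", COLORS)


def rdbyte(memory: bytes, address: int) -> int:
    """Hypothetical stand-in for RDBYTE: return the single byte at `address`."""
    return memory[address]


# Walking upward from the base address visits the bytes from least to most significant.
for offset, name in enumerate(["white", "blue", "green", "red"]):
    print(f"cpptr+{offset}: {rdbyte(hub, offset):08b} ({name})")
```

Under this model, offset 0 holds white and offset 3 holds red, which matches the behavior reported in the first post.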

escher · 2018-02-03 05:35

Phil Pilgrim (PhiPi) wrote: »

The Propeller uses "little-endian" byte order in its words and longs. IOW, the least significant byte gets the lowest address in a word or long.

-Phil

That is uniquely counter-intuitive and infuriating. What is the I'm sure incredibly obvious reason for it totally making sense and making everything easier?

Phil Pilgrim (PhiPi) · 2018-02-03 06:32

This matter has been discussed ad nauseam in other threads, so there's no point in rehashing it here. Just enter "little endian" in the forum's search box, and I'm sure you'll find more info, opinion (both pro and con), and -- yes -- sometimes vitriol than you might wish to read. As a disclaimer, I'm firmly in the little-endian camp and not ashamed to say so.

-Phil

escher · 2018-02-03 06:39

Phil Pilgrim (PhiPi) wrote: »

This matter has been discussed ad nauseam in other threads, so there's no point in rehashing it here. Just enter "little endian" in the forum's search box, and I'm sure you'll find more info, opinion (both pro and con), and -- yes -- sometimes vitriol than you might wish to read. As a disclaimer, I'm firmly in the little-endian camp and not ashamed to say so.

-Phil

Haha fair enough. I'm working on execution-time-sensitive code, so having to fangle addresses to compensate is a negative. Thanks for the quick response!

ozpropdev · 2018-02-03 07:16

Instead of

colors     long    %11000011_00110011_00001111_11111111

do this

colors     byte    %11000011,%00110011,%00001111,%11111111
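
A minimal sketch of why the byte table helps, again as a host-side Python model rather than Propeller code. It assumes the bytes are stored at ascending addresses in the order they are written in the source, so the first listed byte sits at the base address.

```python
import struct

# The byte table from the post above, in the order the bytes appear in the source.
colors_bytes = bytes([0b11000011, 0b00110011, 0b00001111, 0b11111111])

# Index 0 is now red, so stepping the address upward walks red, green, blue, white.
red, green, blue, white = colors_bytes
print(f"offset 0: {red:08b}, offset 3: {white:08b}")

# Reading the same four bytes back as one little-endian long puts the first
# byte (red) in the least significant position, the reverse of the original constant.
as_long = struct.unpack("<I", colors_bytes)[0]
print(f"{as_long:032b}")
```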

escher · 2018-02-03 17:30

ozpropdev wrote: »

Instead of

colors     long    %11000011_00110011_00001111_11111111

do this

colors     byte    %11000011,%00110011,%00001111,%11111111

Yeah that was the band-aid I ended up using. Seems like there is some religious fervor over the two different endian-ness formats... for wanting to be able to index specific sequential bytes from a base address, big would be more useful.

One thing that obscured this behavior from me was the fact that rdlong and even rdword seem to load the value at the requested address in big-endian format. The extensive address and value manipulation I have done for my project on values loaded these ways ran into zero inaccuracies until I started working with individual bytes.
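
One plausible reason full-width reads never exposed the byte order: a value written and read back at the same width comes out unchanged whatever the storage order is. The sketch below is a Python model of that round trip, not Propeller code; the pack and unpack calls are assumed stand-ins for a long write and RDLONG.

```python
import struct

COLORS = 0b11000011_00110011_00001111_11111111

# Store and reload the long using the same byte order both ways.
little = struct.unpack("<I", struct.pack("<I", COLORS))[0]
big = struct.unpack(">I", struct.pack(">I", COLORS))[0]

# Both round trips return the original value, so the byte order stays invisible
# until memory is accessed in units smaller than the stored value.
print(little == COLORS, big == COLORS)

# The difference only shows up byte by byte.
print(struct.pack("<I", COLORS).hex(), struct.pack(">I", COLORS).hex())
```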

Electrodude · 2018-02-03 21:37

escher wrote: »

Yeah that was the band-aid I ended up using. Seems like there is some religious fervor over the two different endian-ness formats... for wanting to be able to index specific sequential bytes from a base address, big would be more useful.

One thing that obscured this behavior from me was the fact that rdlong and even rdword seem to load the value at the requested address in big-endian format. The extensive address and value manipulation I have done for my project on values loaded these ways ran into zero inaccuracies until I started working with individual bytes.

If you just read constants in little-endian order, i.e. right-to-left, all the weirdness goes away - sequential bytes will seem sequential in little-endian. The only problem is in the compiler's number parser (and in the way Arabic numerals were imported into the European world, way back when). Convincing yourself that the least significant bits come first, even though they're on the right, should help make things easier by making them more consistent, and make you less afraid of leaving the underscores and reversing the order of the four bytes instead. If you just read all constants backwards, everything else will feel like it's the right way around. If you find yourself needing any extra code as a result of little-endianness, you're doing something wrong.

evanh · 2018-02-04 00:39

Deluding yourself of little-endian's readability in whatever manner is not the problem. The problem is the resulting confusion of some people not caring or not realising the difference until long after the protocol is in use. The original documentation doesn't mention the endian details because the authors were all unaware/uncaring. The most common outcome on little-endian hardware is a mixture of endianness that generates untold bugs and needs very careful post-release documenting!

Big-endian hardware simply never had nor has this problem (well, not until it has to deal with one of the above hodgepodges), for the very simple reason that all humans read numbers as big-endian. It's the one thing in human languages that was universally adopted thousands of years ago.

It's way past time Intel sorted their mistake.

Peter Jakacki · 2018-02-04 01:27

Wow! It's like people really do think in boxes in that they "expect" things to be their way around rather than the way it is, a bit like people think of Oz as "down under" etc.

As far as I see it endianness only exists when you want to access a number in smaller chunks which in this case is simply because you can rdbyte on a long in hub memory. Now internally in the cog you can't read a byte of cog memory, they are all 32-bit registers, and what happens when you read that "long" and want to extract that "left-hand" byte?

C'mon, I think you know you are going to have to right-shift that by 24 bits to get that "first" byte.
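
The shift Peter mentions can be sketched in Python as follows. `byte_of` is a hypothetical helper, and the value stands in for a 32-bit cog register that already holds the long.

```python
COLORS = 0b11000011_00110011_00001111_11111111


def byte_of(register: int, index: int) -> int:
    """Return byte `index` of a 32-bit value; index 0 is the least significant byte."""
    return (register >> (8 * index)) & 0xFF


# The "left-hand" (most significant) byte needs a 24-bit right shift.
red = byte_of(COLORS, 3)
# The least significant byte needs no shift at all, only the mask.
white = byte_of(COLORS, 0)
print(f"{red:08b} {white:08b}")
```

Inside a register there is no byte addressing, so the choice of which byte counts as "first" only matters once the value is laid out in byte-addressable memory.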

However I use big-endian in my bytecode because that allows a number to be built up much like we enter digits into a calculator. Horses for courses.

If however we allow ourselves to googleearth from any position we see things from that relative position as they are rather than from the perspective of an artificial fixed absolute position (as in north at top).

evanh · 2018-02-04 01:46

Peter,
I'm not sure if that was replying to me at all. But if it was then as long as little-endian continues to exist in hardware then, yes, coders will continue to make a mess through either ignorance or lack of vigilance. Little-endian hardware requires constant developer vigilance, for no good reason.

We all are taught to think big-endian, and all non-computer applications of numbers are purely big-endian. End of story.

Fundamentally, it's the hardware architecture that's at odds, for no good reason. It's way past time Intel sorted their mistake.

Peter Jakacki · 2018-02-04 01:57

evanh wrote: »

Peter,
I'm not sure if that was replying to me at all. But if it was then as long as little-endian continues to exist in hardware then, yes, coders will continue to make a mess through either ignorance or lack of vigilance. Little-endian hardware requires constant developer vigilance, for no good reason.

We all are taught to think big-endian, and all non-computer applications of numbers are purely big-endian. End of story.

Fundamentally, it's the hardware architecture that's at odds, for no good reason. It's way past time Intel sorted their mistake.

No, it wasn't to you personally, but I taught myself about hardware and software from an early age and to me they are not separate, just inter-woven parts of a whole, so I wasn't taught what was proper to think. Thankfully this helped me to not know that something was impossible, so in my ignorance I made impossible things possible.

Perhaps that is why I don't have any problems switching between traditional languages and Forth as I just look at them as they are rather than have difficulty because of preconceived ideas. The problem I think is in the way many are taught, it's all so packaged and easier for educators with more emphasis on teaching about things rather than thinking about how and why etc.

evanh · 2018-02-04 02:04

Just as I opened my reply to Electrodude - the problem is not about how one individual sees it. It's the general messes that do occur.

Little-endian is completely unneeded, and it'll never stop making new messes as long as it's kept around.

evanh · 2018-02-04 02:05

It's way past time Intel sorted their mistake.

Phil Pilgrim (PhiPi) · 2018-02-04 03:20

Little-endian

long bit 11 == byte #1 bit 3 == byte# * 8 + 3

Bit least in byte least.
___________________

Big-endian

long bit 11 == byte #2 bit 3 == (3 - byte#) * 8 + 3

Bit least in byte most?
___________________

Now tell me which is the most natural and least confusing. Or should we also number bits now, starting with the most-significant as bit 0?

-Phil
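
Phil's two formulas can be checked with a short Python sketch. `locate_bit` is a hypothetical helper that maps a bit number of a 32-bit long to the byte offset that holds it and the bit position within that byte.

```python
def locate_bit(bit: int, little_endian: bool = True) -> tuple:
    """Return (byte offset from the base address, bit within that byte)."""
    significance = bit // 8  # 0 means the least significant byte of the long
    offset = significance if little_endian else 3 - significance
    return offset, bit % 8


for order, flag in (("little", True), ("big", False)):
    offset, bit_in_byte = locate_bit(11, little_endian=flag)
    print(f"{order}-endian: long bit 11 is byte #{offset} bit {bit_in_byte}")
```

In the little-endian case the long bit number is simply `offset * 8 + bit_in_byte`; the big-endian case needs the extra `3 - offset` step.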

escher · 2018-02-04 05:35

To me, this entire argument comes down to whatever is personally most intuitive.

If I want the nth byte of a long, I personally think about the stored value with the highest byte being addr+0 and lowest addr+3, because I'm a much more visual person and when I visualize the memory where this variable is stored that's what makes sense to me: big-endianness.

I completely understand how to others it makes far more sense to prefer little-endian as it makes more sense bitwise: your offset is one-to-one with the byte starting at zero.

But when I want to iterate through contiguous bytes which are part of words or longs (and were stored in little-endian format as a result), the fact of the matter is that I have to execute more operations to parse the bytes than if they were big-endian.

I think @evanh makes the clincher argument (for me at least) by pointing out the terribly-documented nature of the hardware's endianness and its effects on operations such as rdbyte, and the resulting headache it has caused throughout the industry due to either lack of awareness or lack of understanding or both.

ozpropdev · 2018-02-04 06:45

BTW the Propeller manual does mention for WORD and LONG the following

... since the data is stored in little-endian format.

escher · 2018-02-04 07:07

ozpropdev wrote: »

BTW the Propeller manual does mention for WORD and LONG the following

... since the data is stored in little-endian format.

Seems like an afterthought for something that defines the very fabric of the memory model :P

Phil Pilgrim (PhiPi) · 2018-02-04 18:26

Endianness is where language and numbers collide. The fact that this is even a debate derives from borrowing our numbering system from a culture whose language reads from right-to-left, while ours reads from left-to-right. The collision occurs when we verbalize the numbers we see.

When we see a number like 43, we pronounce it "forty-three," reading from left to right. In languages that read right to left, the "3" is pronounced first, then the "4." But even some Western languages adhere to this order. In German, it's "dreiundvierzig," "three and forty." Not even English escapes this convention at times, viz: "Four and twenty blackbirds ..."

But that convention seems to end with the tens. Even in Arabic, which is written right-to-left, a large number like 56789 is pronounced starting from the most-significant (leftmost) digit, only reverting to right-to-left order for the last two digits.

I think it's for this reason that the thousands separator (comma in English) was invented. It makes scanning ahead (or back, depending) easier, in order to determine the significance of the first digit being pronounced.

Even in dealing with numbers in math, which digit to start with depends upon the operation being performed. When adding a column of figures, we start on the right. But in long division, we start on the left. It's entirely possible that little-endianness derived from the order that early eight-bit computers needed to address memory for computing multi-byte sums.

-Phil
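
The multi-byte addition Phil speculates about can be sketched like this in Python. `add_multibyte` is a hypothetical helper; it models an eight-bit machine that adds one byte per step and carries into the next, which is easiest when the least significant byte comes first in memory.

```python
def add_multibyte(a: bytes, b: bytes) -> bytes:
    """Add two equal-length little-endian numbers one byte at a time."""
    result = bytearray()
    carry = 0
    for x, y in zip(a, b):  # ascending addresses, least significant byte first
        total = x + y + carry
        result.append(total & 0xFF)
        carry = total >> 8
    return bytes(result)  # the final carry out is dropped, like a fixed-width add


a = (0x12FF).to_bytes(2, "little")
b = (0x0001).to_bytes(2, "little")
total = int.from_bytes(add_multibyte(a, b), "little")
print(hex(total))
```

With little-endian storage the loop walks memory in increasing address order and the carry always flows forward; a big-endian layout would have to start at the highest address and walk backward.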

localroger · 2018-02-04 23:06

The Endian Wars go back to the 1970s at least. Little-endian, as the Prop uses, has some algorithmic advantages, such as that the pointer address always contains bits 0-7 of the target whether it's a byte, word, or long. Intel's early processors such as the 8080 and those that emerged from them like the Z80 and of course x86 are all little-endian. Big-endian has an advantage in readability of object code, and Motorola's 68000 and thus all early Macintosh computers were big-endian, as are the protocols of Ethernet and the Internet. But while readability of object code was a thing in the 1970s, it's less of a thing as code gets bloated and we depend more on compilers and other tools.
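
The "pointer always contains bits 0-7" property can be illustrated with a small Python model (not Propeller code). The buffers below are assumed stand-ins for memory holding the same small value in each byte order.

```python
import struct

value = 42  # small enough to fit in a single byte

little = struct.pack("<I", value)  # least significant byte at the base address
big = struct.pack(">I", value)     # most significant byte at the base address

# Little-endian: byte, word, and long reads at the same address all agree.
as_byte = little[0]
as_word = struct.unpack("<H", little[:2])[0]
as_long = struct.unpack("<I", little[:4])[0]
print(as_byte, as_word, as_long)

# Big-endian: the low byte sits at a different offset depending on the width.
print(big[0], big[3])
```

In the little-endian buffer a narrower read at the base address still returns the low bits of the value, whereas in the big-endian buffer the address has to be adjusted by the width difference first.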

AntoineDoinel · 2018-02-05 09:47

here, strictly right to left:

56789 = sivy amby valopolo sy fitonjato sy enina arivo sy dimy alina = nine (is) the rest (of) eighty(,) and seven hundred(,) and six thousand(,) and fifty thousand

P.S.: Google Translate tries hard but ends up with a slightly wrong answer when fed that; better to split the rows like this:

sivy
ambin'ny
valopolo
sy fitonjato
sy enina arivo
sy dimy alina

Phil Pilgrim (PhiPi) · 2018-02-05 20:25

What language is that, Antoine?

-Phil

escher · 2018-02-06 00:10

Sounds like Malagasy or something Barito

AntoineDoinel · 2018-02-06 00:20

That's Malagasy. I've been studying it for more than a year, but I'm still a ways away from a basic conversational level.

I've been somewhat inspired by those guys:

(video has English subtitles)
It's crazy they reached that level in about two years!
