Plucking BYTE out of a LONG
Erlend
In the code below I am breaking a LONG value into BYTEs to fit the format of the SPI data transfer to the chip. It works fine to do this by AND'ing with masks, but shouldn't it also work with the alternative code, using the BYTE command? When I try it, it does not produce the expected values. Where do I go wrong? And is there an even more elegant way?
PUB Run(Dir, Speed) | parSpeed                  'Dir 1=Forward 0=Reverse, Speed steps/S, max 15625
  LONG[@parSpeed] := Speed * 67                 'Convert from steps/Sec to some internal format
  NoOpSynch
  Cmd3Bytes(cmdRun + Dir, (parSpeed & $FF0000)>>16, (parSpeed & $FF00)>>8, (parSpeed & $FF))  'SpeedByte2(nibble), SpeedByte1, SpeedByte0
' Cmd3Bytes(cmdRun + Dir, (BYTE.parSpeed[2], BYTE.parSpeed[1], BYTE.parSpeed[0]))             'Alternative syntax?
  RETURN (GetStatus & 1<<7) >> 7                'Returns the CMD_ERROR bit (=1 for command error, else =0)
Erlend
Comments
LONG[@parSpeed]:= Speed * 67
is the same as
parSpeed:= Speed * 67
BYTE[@parSpeed] gives you the equivalent of (parSpeed & $FF)
BYTE[@parSpeed+1] gives you (parSpeed &$FF00)>>8
and so on
I even think
parSpeed.Byte[0] gives you the equivalent of (parSpeed & $FF)
parSpeed.Byte[1] gives you (parSpeed &$FF00)>>8
and so on
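Untested, but a quick sketch showing all three forms side by side (ByteDemo and the local names are just made up for illustration):

PUB ByteDemo | parSpeed, b2, b1, b0
  parSpeed := 15625 * 67                     'same scaling as in Run()

  'mask-and-shift version
  b2 := (parSpeed & $FF0000) >> 16
  b1 := (parSpeed & $FF00) >> 8
  b0 := parSpeed & $FF

  'the same three bytes via byte indexing - index 0 is the LSB
  b2 := parSpeed.byte[2]
  b1 := parSpeed.byte[1]
  b0 := parSpeed.byte[0]

  'or via a byte pointer to the variable in memory
  b2 := BYTE[@parSpeed + 2]
  b1 := BYTE[@parSpeed + 1]
  b0 := BYTE[@parSpeed]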
Enjoy!
Mike
I never seem to fully get the stuff about variables and pointers. I started out with the straightforward parSpeed := Speed * 67, but when the code did not work, putting in LONG was my first 'fix'. Bummer - as so often with hasty fixes.
The parSpeed.Byte[1] form, giving you (parSpeed & $FF00)>>8, definitely is elegant, except it is a bit surprising that the [index] counts 'backwards', i.e. beginning with the LSB.
But it works!
Erlend
It's not backwards. The Propeller is a little-endian processor, and stores lower bytes in lower memory locations. Therefore, byte[ptr] == (long[ptr] & $FF), and byte[ptr + 3] == (long[ptr] & $FF00_0000) >> 24.
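If you want to convince yourself, an untested sketch along these lines shows it (the names sample and CheckLayout are made up):

DAT
sample    long  $DDCCBBAA                    'ends up in hub RAM as the bytes AA BB CC DD

PUB CheckLayout : ok
  ok := (byte[@sample] == $AA) and (byte[@sample + 3] == $DD)   'LSB at the low address, MSB three bytes up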
This has been argued before, here and elsewhere, many times, but AFAICT the only good reason big-endian seems to make sense to anyone is that we write numbers in big-endian order. However, even that's a poor excuse in a way (but, unfortunately, almost certainly unfixable at this point) because when we borrowed our numbers from the Arabs (who read right to left, and use little-endian numbers), we failed to flip the numbers around, and thus the ugly beast that is Big-Endianness was born. Whether or not that's a good argument against big-endian, there is no technical reason that I know of why big-endian might be better. In fact, the human-readability argument is mostly moot at this point, since nobody manually reads core dumps anymore.
Note that when you look at *registers* on a little-endian CPU, the values read 'big-endian' too, just as above. It's only memory that is little-endian. I remember how that created some trouble for a guy designing for a VAX DR11-WA DMA board back in the eighties.. it communicated over 16-bit DMA with a different type of mini which was big-endian, and the developer thought he had to physically swap the bytes on the cable to handle that. But no, the DMA register on the VAX reads big-endian too.. it is only stored in memory as little-endian. So all was good: just connect, transfer, use.
Little endian was better than big endian for 8-bit processors because when adding 16-bit values or larger the processor can start adding the least significant byte while fetching the next byte. So that operation was faster on a little-endian 6502 than on an otherwise comparable big-endian 6800.
On 32-bit processors which can read at least 32 bits at a time it makes no difference. It could just as well be big-endian. But a subset of FORTRAN programmers on VAX liked to be able to pass 4-byte integer values to a function expecting 2-byte integers.. remember that in FORTRAN, function parameters are passed as addresses.. with little-endian, the address of the integer always points to the LSB, so it works fine to pass a 4-byte integer instead of the 2-byte one, if you know what you're doing. You can't do that in FORTRAN on a big-endian CPU. Some non-portable FORTRAN came out of that.
So which endianness we are most comfortable with is just a cultural/linguistic bias, nothing more.
-Phil
Here's an illustration of how a Westerner (left-to-right bias) encountering little-endian notation sees the situation, compared with how an Eastern-language (right-to-left) reader sees it:
So, you see, there's no logic to the selection. It's entirely cultural. Well, that is, unless you consider the advantages of having the least-significant (i.e. lowest-addressed) bytes holding the least significant bits. Or the advantage of being able to say
byte[@long_var] == word[@long_var] == long_var
when long_var contains a value ranging from 0 to 255.
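A quick untested sketch to check it (SmallValueDemo is a made-up name; long_var is an ordinary global long):

VAR
  long  long_var

PUB SmallValueDemo : ok
  long_var := 200                            'any value from 0 to 255
  ok := (byte[@long_var] == long_var) and (word[@long_var] == long_var)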
-Phil
I think you proved my point. You had to shuffle the bytes to make reading from right to left get the bits in sequence.
And, as I said before, I worked a lot with bitstreams in the past, and that was on systems where you worked directly with memory. The big endian systems were clearly the winners.
-Phil
All of your bitstream work obviously sent data MSB first. You can send data LSB first just as easily.
Data defined in DAT and VAR is big endian but constants in CON and literals in your source code are little endian. Or is it the other way around? I forget.
Presumably the Spin interpreter uses an endianness that optimizes the amount of code it needs to do the job. I can't imagine Chip choosing a non-optimal endianness.
Normally when one speaks of endianness, the order of bits in a byte is not under discussion. After all, your program cannot address anything below the 8-bit boundary, so the order of bits in physical RAM is irrelevant.
Of course bit order does matter when you are talking about the order in which UARTs and such shift bits onto the line.
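For example, picking the bits off LSB-first versus MSB-first is just a matter of which end you start from - a sketch only (made-up method name, no actual pin I/O):

PUB ShiftOrderDemo(value) | i, lsbFirstBit, msbFirstBit
  repeat i from 0 to 7
    lsbFirstBit := (value >> i) & 1          'LSB-first order: bit 0, 1, 2, ...
    msbFirstBit := (value >> (7 - i)) & 1    'MSB-first order: bit 7, 6, 5, ...
    'a real soft UART would put one of these on the TX pin here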
Also I'm not sure about the cultural references here. I thought Arabic and/or Hebrew and/or Indian and Chinese wrote their numbers from left to right, MSD to LSD, as we do. Even if other text was right to left.
The Propeller is a little-endian processor. The less significant bytes of a word or long are stored in the lower memory locations. VARiables and DATa are stored little-endian once compiled. However, since we (i.e. humans who are used to LTR languages) are used to writing numbers in big-endian order, the Spin compiler, for convenience, lets us write "long $76543210" when we want "byte $10, $32, $54, $76".
However, PNUT stores constants/literals (there's no difference as far as PNUT is concerned; a constant is just a name for a literal) in big-endian order. That way, it can do the equivalent of "x := (x << 8) | byte[pcurr++]" multiple times, which saves on code space. This allows a single function to handle any size of constant (1..4 bytes). TACHYON does the same thing for the same reason.
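In Spin terms the whole trick amounts to something like this - a sketch of the technique only, not the actual interpreter source, and the names are made up:

PUB ReadLiteral(pcurr, size) : x
  'read a big-endian literal of size 1..4 bytes starting at hub address pcurr
  repeat size
    x := (x << 8) | byte[pcurr++]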
RTL languages do write their numbers in the same direction we do: MSD left, LSD right. An RTL-reading person is reading some text and comes across a number. Which digit of the number do his eyes hit first? The least significant one? Sounds like little-endian to me!
Human brains are VERY bad at reversing sequences of information. Try reciting a poem you have memorized backwards - it's nearly impossible to do without cheating by remembering small sequences forward and gluing the results back together in reverse. When you add numbers on paper, you do it right to left, and if you read numbers left to right (like practically everyone who natively reads an LTR language), it's a pain to do arithmetic in your head because you have to process the two input numbers backwards. The Arabs write their numbers right to left, LSD first (i.e. they write them the same way we do but think about them in the opposite order), meaning that when they add numbers, they get to add the digits that come first (in their minds) first, which is significantly easier.
I read a very useful book on mental arithmetic tricks that I highly recommend (Secrets of Mental Math by Arthur T. Benjamin), and it says (and I agree with it) that the fastest way for a LTR-reading person to do mental addition is to do it MSD-first, ignoring carry initially, and then adjusting each already-calculated digit for carry as necessary. Yes, it's more work, but it's actually faster, because you get to process data in the same order you think about it.
No, it's not so simple in Spin/PASM. Perhaps I did not make my point clearly.
Let's ignore the order in which bytes, words and longs are written in the source code for a moment.
Write some Spin/PASM with some constants in CON, variables in VAR and DAT and literals in the Spin source.
Now compile that with BST and look at the listing it produces. Sure enough some longs are stored in actual memory one way around and some the other.
Normally of course we don't see this. One is not going to get a pointer to a literal in one's Spin code without trying hard.
I don't much care how humans do arithmetic here. That is what the computer is for. Right?
Consider the following Spin code:
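A minimal version along these lines is enough to show the point (data1, data2 and func are just the names referred to below):

DAT
data1   long  $76543210
data2   byte  $10, $32, $54, $76

PUB func
  return $76543210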
Let's look at the listing BST produces:
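Stripped down to just the relevant bytes (addresses and the rest of the listing omitted, so take this as a rough sketch):

data1   10 32 54 76                'long $76543210 stored LSB-first in hub memory
data2   10 32 54 76                'identical bytes, identical order

func    3B 76 54 32 10             'Push Constant 4 Bytes - the literal is stored MSB-first
        33                         'Return Value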
Note that data1 and data2 are the same 4 bytes in the same order - 10 32 54 76 - even though they're represented differently. In both cases, the $10 is at a lower memory location than the $76. long[@data1] == long[@data2]. When you think about (hex) numbers, you think of them in big-endian notation, because that's how you read. Therefore, the number $76543210's MSB is $76, and its LSB is $10. Since the LSB is stored at the lowest address in this case, we conclude that it is little-endian. Local and global variables are stored in the same way - the result of the @ operator applied to anything you can legally apply it to points to data stored in the same manner, little-endian.
Now, let's examine the Spin method "func". It contains two instructions: a $3B Push Constant 4 Bytes, followed by 4 bytes of data, and then a $33 Return Value. Note that the 4 bytes of data for the Push Constant 4 Bytes instruction are in big-endian order - MSB in the lowest memory location. If you examine the source of the Spin Interpreter, you'll find that it reads constants by running "x := (x << 8) | byte[pcurr++]" the correct number of times. This reads a big-endian constant, and is the most compact way of reading a variable-length literal.
TACHYON Forth also stores literals big-endian, for the same reason. However, it does everything else little-endian, due to the fact that the Propeller's native byte order is little-endian.
The only tricky part here is due to syntactic sugar in Spin.
I don't care how humans do arithmetic either when I'm programming, which is why I prefer little-endian in almost every case. (By the way, you should still read that book, even though the majority of it won't help with computer programming)
I think the only slight puzzle for me is this:
If literals are big-endian because it makes the PNUT interpreter's internals simpler, then why is everything else stored the other way around?
Just because using big-endian in one place allows an optimization doesn't mean that everything else is backwards; in fact, it means that the optimization does it backwards. But that's not surprising, since optimizations often do things backwards when it happens to be faster, such as running loops backwards to take advantage of the djnz instruction. How does the existence of an optimization that does something backwards make everything else backwards?
Little-endian is the default because it is the native format, but that is not necessarily the native format for the Spin or Tachyon bytecode interpreter, the core of which must fit within a cog's memory. So we could have 4 functions, or perhaps 3, for byte, word, and long literals, which are not necessarily aligned either. If we had plenty of code space we might just encode the literals as little-endian, although the lack of alignment would not allow us to handle them efficiently.
However, since code space is restricted and the literals sit on byte boundaries, the shift-and-accumulate method works if we use big-endian, and that is what the relevant part of the Tachyon bytecode interpreter does.
Seems that at the end of the day it's because we actually have two different architectures in the Propeller. One is the actual native processor, the COGS, and the memory layout in HUB dictated by the instructions that read and write bytes, words and longs. The other is the virtual machine of the Spin byte code interpreter.
I have not got to the bottom of why it's optimal to have a different endianness for those two machines yet. Without looking at the details I can imagine why the interpreter would prefer one endianness over another.