Yet another Arduino vs Propeller - what am I doing wrong?
Konstantin
Posts: 4
Hi all,
I am trying to make an application that transmits sensor data to a PC via serial over USB.
It is supposed to transmit a lot of data.
So I am trying to compare the performance of different boards.
An Arduino Mega with a 16 MHz clock transmits ~13000 readings in 10 sec.
A Propeller board with an 80 MHz clock transmits 8700 readings in 10 sec.
I somehow expected better speed from Propeller.
Is this correct or am I doing something wrong?
Here is Arduino code:
//random output speed test
int x;

void setup() {
  Serial.begin(115200);
  randomSeed(analogRead(0));
}

void loop() {
  x = random(300);
  Serial.print("DATA");
  Serial.println(x);
}
and here is Propeller code:
CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

OBJ
  pst : "Parallax Serial Terminal"

PUB main | rand
  rand := 300
  pst.Start(115200)
  repeat
    pst.str(string("DATA"))
    pst.dec(||(?rand) // 300)
    pst.NewLine
    pst.LineFeed

PS: It has nothing to do with the different implementations of random number generation; I get the same results when transmitting just a hardcoded number.
Comments
- The cogs actually run at 20 MIPS, since instructions (with a few exceptions) take four clock cycles on the Prop.
- Spin compiles to interpreted bytecode, while the Arduino compiles C to the native instruction set.
- Using PropGCC would be a fairer comparison.
- The maximum speed for both is limited by the relatively low serial bit rate; try using 1 Mbps.
My guess is this is a bit of an unfair comparison.
As I recall the Arduino has a dedicated serial output buffer.
(If this is not true then "Never Mind".)
So there is little overhead coming up with the next character to send.
Your program uses the same cog to both send the character and find the next character.
I would use 2 cogs, 1 for the main routine and a second to send the character.
Duane J
You can reality-check the baud limit along these lines (assuming the value prints average 5 chars in both cases):
(115200/10)*10/(4+5) = 12800 in 10 sec
(115200/10)*10/(4+5+2) = 10472 in 10 sec
Notice the second case shows the impact of 2 additional chars.
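The ceiling arithmetic above can be sketched as a small host-side C++ helper. It assumes the /10 in the formulas stands for 10 bits on the wire per character (1 start + 8 data + 1 stop), and the name readingsAtBaudCeiling is made up for illustration; it is not part of any Arduino or Propeller library.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 8N1 framing, i.e. 1 start + 8 data + 1 stop bit per character.
constexpr std::uint32_t kBitsPerChar = 10;

// Hypothetical helper: upper bound on complete messages sent in `seconds`
// when the serial line itself is the only bottleneck.
// Integer division truncates, matching the figures quoted in the post.
constexpr std::uint32_t readingsAtBaudCeiling(std::uint32_t baud,
                                              std::uint32_t charsPerMessage,
                                              std::uint32_t seconds) {
    return (baud / kBitsPerChar) * seconds / charsPerMessage;
}

// "DATA" (4 chars) + 5 assumed value chars, 10 seconds at 115200 baud.
static_assert(readingsAtBaudCeiling(115200, 4 + 5, 10) == 12800,
              "9-char message");
// Same message plus 2 line-ending chars.
static_assert(readingsAtBaudCeiling(115200, 4 + 5 + 2, 10) == 10472,
              "11-char message");
```

Plugging in a higher baud rate, for example readingsAtBaudCeiling(1000000, 11, 10), shows how far the ceiling moves when only the line speed changes.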
It seems the first code is not sending the NewLine/LineFeed, as the time is close to a continual 9 bytes.
- so it seems close to half of the skew looks to be message variation?
If you want to minimise the SW overhead, use larger packets.
It also looks like the 2nd case is being pulled down by SW speed, whilst the first one is at Baud Ceiling, and so could run even faster at higher baud rates.
Also note the AVR has a hardware UART, so whilst a byte is being sent the core can be doing other tasks, but a Prop has no hardware UART, so a single-cog design has to bit-bang the byte and THEN get the next one.
That means a Prop will benefit from the highest possible BAUD speed, as that saves bit-bang time.
Your 8700 can be modeled as two delays, a Char-send, plus a SW overhead,
1/((1/26303)+(1/13000)) = 8700.0737
ie if you can shrink that comms time to zero, by fastest-possible-baud, (or using a second COG), then the SW overhead can approach 26303 readings sent.
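The two-delay model above can be written out as a pair of host-side C++ functions. This is only a sketch of the arithmetic in the post, assuming the send time and the software time add up strictly one after the other (no overlap); the function names are invented for illustration.

```cpp
#include <cassert>

// Combined rate when each reading costs comms time plus software time
// in sequence:  1/combined = 1/commsRate + 1/softwareRate
constexpr double combinedRate(double commsRate, double softwareRate) {
    return 1.0 / (1.0 / commsRate + 1.0 / softwareRate);
}

// The same relation solved for the software-only rate, given a measured
// combined rate and an assumed comms ceiling.
constexpr double softwareOnlyRate(double combined, double commsRate) {
    return 1.0 / (1.0 / combined - 1.0 / commsRate);
}
```

With the figures from the thread, combinedRate(26303.0, 13000.0) comes out at roughly 8700 readings per 10 sec, and softwareOnlyRate(8700.0, 13000.0) recovers a value just above 26302, which is where the 26303 figure comes from.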
Look at FT232H devices, which can run up to 12 MBaud; it looks like 4 MBaud is the highest legal fraction of 80 MHz.
If all we are concerned with is being fast, try Tachyon Forth. Peter reports 3 Mbps over Bluetooth. Can't beat that with a stick.
As I suspected.... Is this comparison done?
But I will need to use I2C to get real data. As I understand it, there are ready-to-use libraries in Spin, but how do I handle it in GCC?
Do I need to write the whole I2C protocol engine from scratch? (no way)
I was totally unaware that there is no hardware UART on the Propeller... well, if it has to implement the RS232 protocol in software, that explains everything. Thank you for the replies!
There is an I2C library driver included with GCC. I don't know what devices it supports other than memory (EEPROMs). You'll have to look at it. Most likely it has the necessary low-level functions to support nearly any I2C device, but you'll have to write the high level operations. For best performance, you may want to have the I2C I/O routines run in their own cog in parallel with your main program and communicate through a buffer of some sort. It all depends on what kind of I2C device and data you're talking about.
Welcome to the forum by the way.
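The "buffer of some sort" between the main program and an I/O cog could look roughly like the single-producer / single-consumer byte queue below. It is a generic host-side C++ sketch, not the API of the GCC I2C driver or of any Propeller library; the class name ByteRing and its methods are hypothetical, and std::atomic is used here only because this version is meant to compile on a PC.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical byte queue: one side (the main routine) only calls push(),
// the other side (the I/O routine) only calls pop().
template <std::size_t Capacity>
class ByteRing {
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
public:
    // Returns false when the buffer is full; the caller may retry later.
    bool push(std::uint8_t byte) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;
        data_[head & (Capacity - 1)] = byte;
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Returns false when there is nothing waiting to be sent.
    bool pop(std::uint8_t& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) return false;
        out = data_[tail & (Capacity - 1)];
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }

private:
    std::uint8_t data_[Capacity] = {};
    std::atomic<std::size_t> head_{0};  // total bytes ever pushed
    std::atomic<std::size_t> tail_{0};  // total bytes ever popped
};
```

The point of the design is that the producer never waits for the slow side unless the queue is actually full, so formatting the next reading can overlap with sending the previous one.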
Not exactly. A cog-based serial object, i.e. one written in PASM like PST or FullDuplexSerial, can drive the serial line at 115200 baud just as well as a hardware UART. I have even written a software UART in C that runs in a COG and can do 115200 using propgcc.
In your case you are using pst.str, pst.dec etc methods in Spin and I suspect that is where all the slow down is.
Moving to GCC has a substantial speed benefit over Spin as you see.
It's a design choice that there are no hardware UART, I2C or other such blocks in silicon on the Propeller; all such things are to be "soft". This makes for maximum flexibility of the device, even at the expense of a little speed here and there.
Loopy, I'm nowhere near a Propeller chip, but the code below compiles, so it might do the trick.
Memory model COG, optimization O2
Arduino code (see first message) produced 13000.
Arduino code does include CRLF (this is what println is for).
Maybe you guys can port that to Arduino so that those folks can have a fair comparison and fall in love with the easiness and greatness of Forth ... if anyone needs to be Forth-proselytized, it's those Arduino folks.
Actually in the COG case quoted it is one and only one COG doing all the work.
Mind you, he could change to the quad serial driver, and handle four serial ports... Arduino would not fare as well.
Not according to this:
My basic point to him was that his original comparison kept the Arduino 100% busy, while using 25% of a Prop.
As Bill says, the Arduino was going full speed while most of the Propeller was still idle. A better benchmark would entail interrupts versus cogs and some tasks requiring 32 bit arithmetic. An ISR would take cycles from the main loop, while two cogs can run concurrently, so that should work decisively for the Propeller.
Well, I was responding to what you had specifically quoted.
Anyway, I grokked the comparison and the need for a reasonable one.
As for printf, yes, it's pretty big and slow except in the COG case.
Still using a separate full duplex serial COG with simple wrappers should be faster - see attached.
A more accurate benchmark would be to compare the Prop to the Pic32 or the mbed (LPC1768), since these are 32-bit processors. Comparing to an 8-bitter is apples vs. oranges.
I don't think comparing Propeller to PIC32 is a useful comparison either. They are certainly very different.
I'd say the Arduino is waiting 90% of the time for the UART-TX-READY flag, and can do other things in this time, especially if the UART transmit is done with interrupts.
This code cannot be seen as a benchmark; the limiting factor here is the baud rate and not the execution speed of the processor / language.
Andy
I don't use Pic32's or mbed's, but I use both the Arduino and the Propeller. Both are about the same cost and I would be genuinely curious which had more oomph. Frankly the 8 versus 32 bits isn't as important as oomph per $.
http://forums.parallax.com/showthread.php?129972-fft_bench-An-MCU-benchmark-using-a-simple-FFT-algorithm-in-Spin-C-and-...
And don't forget that fft_bench only uses one COG to do its work. I'm working on a parallel fft to see how far we can push performance there.
Another kind of "oomph" is a bit more tricky to compare. Response to multiple, external real-time events. Clearly having a COG handle each such event can be a winner over using interrupts on a single CPU.
A less obvious kind of "oomph", which does not immediately hit you from reading the data sheets and manuals, is programmer productivity. With the Prop one can easily throw together a bunch of objects from OBEX or elsewhere, add a little of your own secret spice and you have a working project. No worries about real-time tasks fighting for interrupt or thread priority, slowing each other down, or introducing odd timing glitches.
Forth already has an ATmega model in full function, complete with an Arduino pre-written template image. The only thing is that it overwrites the boot loader, as the flash memory needs to be written one byte at a time for the Forth dictionary. You would have to reinstall the Arduino boot loader to get back to the nest.
Why not use Tachyon Forth, as Peter has gotten some pretty fast benchmarks that may knock the socks off an Arduino? The Propeller will trump the ATmega's serial in many different programming languages.
The main thing is that Konstantin is being introduced to the fact that you just don't need a hardware USART to get a good serial port. Assembly language can optimize the speed. It is programming at its finest.