View Full Version : Parallel processing



Veronis
05-21-2008, 12:14 AM
I am a prof at GWU and plan to write a manual on parallel processing using the Propeller as the main platform. Could someone direct me to the following:

(1) Articles on this forum or elsewhere discussing the parallel processing capabilities of the Prop.
(2) Example code setting up parallel processing between cogs in one chip and intercommunication between chips.

I would be very grateful for any assistance I receive.

Andrew

Mike Green
05-21-2008, 12:25 AM
Have you looked at the Propeller page on Parallax's website, particularly the downloads page? There are a lot of magazine articles on the Propeller. The Propeller Education Kit's tutorials are also useful.

There's very little on intercommunication between multiple Propellers. Only a few people have done this for high-speed general-purpose communications, and they have focused mostly on how to get the data from Propeller to Propeller using only a few I/O pins and not much in the way of other resources.

There are all sorts of examples of communication between cogs, since most I/O drivers contain a mix of high-level (Spin) and low-level (assembly) code, and the assembly code occupies a cog when running.

Have a look through the various "sticky threads" in this Propeller forum. They contain a lot of information including examples of Spin and assembly.

ErNa
05-21-2008, 02:22 AM
Take a look at: http://forums.parallax.com/showthread.php?p=725595 It is just a beginning, it is not translated (nobody asked me), and I will expand the mechanism to synchronise cogs of different chips too. I just have to finish my current application first.

Cluso99
05-21-2008, 07:19 AM
Take a look at my DataLogger (just released). It uses all 8 cogs. Four of them are interleaved to achieve a sample rate of one sample per clock cycle (12.5 ns @ 80 MHz).
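
The arithmetic behind the interleaving described above can be sketched in C. This is not the DataLogger's code: it assumes each cog can capture one sample every 4 system clocks (most Propeller instructions take 4 clocks) and that the cogs are started one clock apart. The names `sample_period_ps` and `cog_for_clock` are hypothetical.

```c
#include <assert.h>

/* Assumption: one cog can take a sample every 4 system clocks. */
#define CLOCKS_PER_SAMPLE 4u

/* Effective sample period in picoseconds when n_cogs cogs are
   interleaved, each offset by one clock from the previous one. */
unsigned long long sample_period_ps(unsigned long long clock_hz, unsigned n_cogs)
{
    assert(clock_hz > 0 && n_cogs > 0 && n_cogs <= CLOCKS_PER_SAMPLE);
    unsigned long long clock_ps = 1000000000000ULL / clock_hz; /* one clock period */
    return clock_ps * CLOCKS_PER_SAMPLE / n_cogs;
}

/* With CLOCKS_PER_SAMPLE cogs interleaved, the cog (numbered 0..3 within
   the group) that samples on a given system clock. */
unsigned cog_for_clock(unsigned long long clock_index)
{
    return (unsigned)(clock_index % CLOCKS_PER_SAMPLE);
}
```

Under these assumptions, `sample_period_ps(80000000ULL, 4)` gives 12500 ps, which matches the 12.5 ns figure, while a single cog would manage only one sample every 50000 ps.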

rjo_
05-21-2008, 10:01 AM
Andrew,

I am not an expert... so, please consider what I say with the grain of salt that it deserves:)

I found one of your PowerPoint presentations about PL design using VHDL, but I have not been able to find much else online.

http://books.google.com/books?q=andrew+veronis&source=citation&sa=X&oi=print&ct=title&cad=bottom-3results&hl=en

produces a very impressive list of your books and articles.

This forum is full of highly talented and very serious thinkers... don't be afraid to challenge them!!! What seems to work best are very specific questions, which always get answered in a highly informative manner.

You should consider Andre's book for one of your classes.

To make the best argument for the PROP, I think you will have to choose a problem whose grain is coarse enough to justify the time required to read from or write to uSD, but small enough for the algorithm to fit into RAM. I have a short list of good problems.

I think a "PROP only" network is ideal... but to get there quickly is quite a challenge. I would love to see an intermediate (educational) networking product based on an FPGA/CPLD to generate the clock, handle the housekeeping and share the memory problem. Andre LaMothe has done some good work in this area.

You might consider using ImageCraft's C compiler in one of your courses. This will allow you to test more extensive algorithms.

Mike Green's various implementations of Femtobasic are a treasure trove of technical solutions and programming solutions.


Rich

rjo_
05-21-2008, 10:11 AM
I personally think that if you want the Prop to shine you have to emphasize the deterministic nature of the entire implementation and explore the clocking solutions that the integrated solution allows...

... given all of "that" ... you might consider the possibility of inviting a NIST guru to send you a super-duper clock... which might also provide you a long term funding solution:) No doubt there is a technology transfer officer waiting for your call:)

Post Edited (rjo_) : 5/21/2008 2:24:41 AM GMT

ErNa
05-21-2008, 02:34 PM
I once worked with a Z80 and built my own interrupt controller. This controller could check inputs and generate a call to certain addresses. The point was that the controller also had output pins, which were patched to the inputs. So the Z80 ran short routines, which did small tasks and activated pins before "HALT". The interrupt controller, polling the inputs, then woke up the processor to do the next required task. That was what we called a "software interrupt", and it made it very simple to run a multitasking system. Later I implemented TurboDOS and had a combination of multitasking and multiprocessing on Z80 processors. Then I fell in love with the transputer. (David May is still up and running: http://www.cs.bris.ac.uk/~dave/index.html ) Mine were not the only efforts that crashed against Intel's and MS's advantage (no, MS is not the disease). The Propeller brought back the ability to play with parallel concepts. The difference is: at nearly no cost!
What I want to say is: it doesn't matter how you teach people to work with parallel processes; what is important is to show them the advantages before they are caught by the promises of the virtual world of easy living.

ImageCraft
05-21-2008, 03:28 PM
... And now you can do it with Propeller C :)

#include <stdio.h>

/* stack[] and the BlinkLEDnn functions are assumed to be defined
   elsewhere in the program. */
void main()
{
    extern int _textmode;
    int i;

    _textmode = 1;
    asio_init(57600);   /* 57600 is presumably the baud rate */
    asm("cogid %i");    /* read this cog's ID into i */
    printf("My COGID is %d waiting for a key()\n", i);
    getchar();

    /* Start one LED blinker per cog, each given its own region of stack[] */
    printf("starting COG%d, return value %d\n", 0, coginit(0, BlinkLED22, &stack[20]));
    printf("starting COG%d, return value %d\n", 2, coginit(2, BlinkLED21, &stack[40]));
    printf("starting COG%d, return value %d\n", 3, coginit(3, BlinkLED16, &stack[60]));
    printf("starting COG%d, return value %d\n", 4, coginit(4, BlinkLED17, &stack[80]));
    printf("starting COG%d, return value %d\n", 5, coginit(5, BlinkLED18, &stack[100]));
    printf("starting COG%d, return value %d\n", 6, coginit(6, BlinkLED19, &stack[120]));
    printf("starting COG%d, return value %d\n", 7, coginit(7, BlinkLED20, &stack[140]));

    /* Echo received characters forever */
    printf("entering echo loop\n");
    while (1)
        putchar(getchar());
}


A COG to blink one LED. Talk about a waste of power :) I will release BETA4 in the next few hours with this capability.

// richard

Leon
05-21-2008, 08:43 PM
David May is behind the new XMOS architecture:

http://www.xmos.com/

I've been thinking of designing my own board for the XS1 chip, like I did for the transputer when it first came out, and have started creating the PCB part.

Leon

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Amateur radio callsign: G1HSM
Suzuki SV1000S motorcycle

Post Edited (Leon) : 5/21/2008 12:50:22 PM GMT

pharseid
05-23-2008, 02:56 AM
Leon,

Would you consider creating a Yahoo group or some other venue to discuss your XMOS board? I think there would be a lot of interest in it, and until it dovetails with a Parallax product it probably isn't appropriate to discuss it in depth here. A number of applications immediately come to mind (one current Propeller thread discusses using other micros as intelligent memory/port expanders, and the XMOS chip would be perfect for that). I wrote a simple z-buffer renderer for transputers in the early 90's and I think there might be just enough on-chip RAM to implement a 16-bit z-buffer for Hydra users.

-phar

allanlane5
05-23-2008, 03:10 AM
The best book I've run into so far is the Andre LaMothe book that comes with the HYDRA system for $200, or is sold by itself for $50 or so.

The Parallax reference books tell you what the keywords are and do, but the LaMothe book tells you exactly what's going on when you load 'Spin' code and then use that code to load Assembly into other Cogs, and how those Cogs can use "Main Memory" to communicate. It also uses a multi-Cog approach to implement computer games utilizing an off-the-shelf PS2 PC Mouse, off-the-shelf PS2 Keyboard, Composite Video, VGA Video, and NES controller. With excellent performance and all source code included.
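
The idea of cogs communicating through "Main Memory" can be illustrated with a single-slot mailbox. The sketch below is not taken from the LaMothe book or from any Propeller library: it is portable C11 that models one producer and one consumer sharing a flag and a value, and the names `mailbox_t`, `mailbox_init`, `mailbox_post` and `mailbox_take` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One-slot mailbox living in memory that both sides can see. */
typedef struct {
    atomic_int full;   /* 0 = empty, 1 = holds a value */
    int value;
} mailbox_t;

void mailbox_init(mailbox_t *mb)
{
    assert(mb != NULL);
    atomic_init(&mb->full, 0);
    mb->value = 0;
}

/* Producer side: returns false if the previous value has not been taken yet. */
bool mailbox_post(mailbox_t *mb, int value)
{
    assert(mb != NULL);
    if (atomic_load_explicit(&mb->full, memory_order_acquire))
        return false;
    mb->value = value;
    /* Publish the value before marking the slot full. */
    atomic_store_explicit(&mb->full, 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if nothing is waiting. */
bool mailbox_take(mailbox_t *mb, int *out)
{
    assert(mb != NULL && out != NULL);
    if (!atomic_load_explicit(&mb->full, memory_order_acquire))
        return false;
    *out = mb->value;
    /* Mark the slot empty only after the value has been copied out. */
    atomic_store_explicit(&mb->full, 0, memory_order_release);
    return true;
}
```

In this sketch each side simply polls: the producer retries `mailbox_post` until it returns true, and the consumer retries `mailbox_take`. It is only safe for exactly one producer and one consumer.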

The HYDRA system board also has already wired a simple RJ-11 phone jack for implementing board-to-board communication.

Now, mind you, I've only been working with the Hydra for a month or so. And Parallax does sell a nice Propeller board-with-chip for $20 for multi-chip/multi-board exercises (you'll need to order a $25 Prop-Clip or Prop-Plug to program the $20 board).

Leon
05-23-2008, 04:03 AM
I was thinking of starting an XMOS forum, so I've done it:

http://groups.yahoo.com/group/XMOS_UG

XMOS themselves will probably form their own group before long, like Inmos did with the Occam Users Group.

Leon


rjo_
05-25-2008, 09:34 AM
Andrew,

this is off topic; if that is a problem, please feel free to say so.

Leon et al.,

I'm struggling to understand this. So, I'm not trying to act ignorant, I actually am ignorant:)

Do you see the potential to use the XMOS FPGA for hooking up a network of Props?

Wouldn't it be better to use an industry standard for this sort of thing?

For example, I've seen imaging solutions for under $150 from Altera, and the development tools look unreal.

Where does XMOS fit?

Rich

pharseid
05-25-2008, 10:50 AM
This is probably a question better posed in Leon's Yahoo group, but I'll give my take on it here. The XMOS chip isn't an FPGA; it's a multicore chip that seems to be positioning itself as an alternative to FPGAs (as the TILE chip also seems to). Perhaps this reflects the fact that more people have software skills than VHDL skills, or maybe they're just looking at a no man's land between micros and FPGAs. But philosophically it's much like the Propeller: just provide a whole bunch of processing power that is configured through software. XMOS adds communication links that seem to hark back to the transputer, but better. If I understand the Xlink model, it would seem to be very simple to implement a paradigm like Distributed Shared Memory on a chip or a network of chips. The advantage of using the XMOS chip as a network processor would be that it has up to 16 Xlinks per chip and a huge on-chip interconnect bandwidth, so for the sort of humble applications that interest me, I can consider the bandwidth infinite and worry about other things. In my case, though, I wasn't thinking about a network at the moment; one chip would seem to be enough to occupy me for a while.

-phar

Leon
05-25-2008, 08:04 PM
The XMOS is a fast microprocessor intended for parallel processing, with high-speed communication links between the chips. Being so fast, and extensible, it could replace FPGAs in a lot of applications, at much less cost. It's intended for different markets than the Propeller, and will be much cheaper (down to $1 in quantity) and very much faster than the Propeller. The first chip (available from Digi-Key in a few weeks) is the XS1-G4000 which has four 400 MIPS processors on one 512 ball BGA chip. Smaller single-core chips will be in 44-lead QFP packages. The development software - C compiler, XC language, debugger, simulator and profiler - will cost $1,000.

The chip is attracting a lot of interest and they have plenty of VC funding, so it should be quite successful.

Leon


Post Edited (Leon) : 5/25/2008 12:21:11 PM GMT

rjo_
05-25-2008, 10:53 PM
Doctor Veronis,

One of the most attractive features of the forum is the free flow of information and opinion... I will now try to track back to your topic:)

Guys,

Thanks for the response.

IF we were to compare the MIPS equivalent of the counters on the PROPI and PROPII to the XS1, where are we in terms of overall bandwidth? (No cheating allowed...)

How do we quantify the MIPS effect of interrupt driven logic to state of the art Propeller logic?

On the software side... $1000 seems very reasonable:)

Doctor Veronis,

Notice that my attempt to return to your topic didn't actually work... but I got close!!!

There will be a PropIII... I have a feeling that it will look a lot like multiple PropIIs on a single chip...
AND VC will not be required:)

You are in the right place... with the right skill set... at the right time.

Rich

Leon
05-26-2008, 12:13 AM
rjo_ said...
Doctor Veronis,

One of the most attractive features of the forum is the free flow of information and opinion... I will now try to track back to your topic:)

Guys,

Thanks for the response.

IF we were to compare the MIPS equivalent of the counters on the PROPI and PROPII to the XS1, where are we in terms of overall bandwidth? (No cheating allowed...)

How do we quantify the MIPS effect of interrupt driven logic to state of the art Propeller logic?

On the software side... $1000 seems very reasonable:)


Rich


It's a completely different approach. For instance, a single XS1 core can have up to eight threads, with single threads devoted to SPI, PIO, I2C, UART etc. Cores can operate in parallel, with their own threads. Peripherals are implemented in software, like the Prop, and only take a few lines of code. Here is a UART transmit function written in XC:




/* g_bittime (timer ticks per bit) is assumed to be defined elsewhere. */
unsigned UartTransmitByte(unsigned char byte, unsigned time, out port txd)
{
    unsigned i;
    unsigned data = byte;

    txd @ time <: 0;            // Start bit
    time += g_bittime;
    for (i = 0; i < 8; i += 1)
    {
        txd @ time <: >> data;  // Output the low bit, then shift data right
        time += g_bittime;
    }
    txd @ time <: 1;            // Stop bit
    time += g_bittime;
    return time;
}
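
For readers without XC tools, the framing that the function above produces can be sketched in portable C. This is only a host-side model, not XMOS code: it records the line level for each bit slot instead of driving a port, and the names `uart_frame_8n1` and `uart_frame_end_time` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* 1 start bit + 8 data bits + 1 stop bit */
#define UART_FRAME_BITS 10

/* Fill levels[] with the line level for each bit slot of one byte:
   start bit (0), eight data bits LSB first, stop bit (1). */
void uart_frame_8n1(unsigned char byte, unsigned char levels[UART_FRAME_BITS])
{
    unsigned data = byte;
    size_t i;

    assert(levels != NULL);
    levels[0] = 0;                              /* start bit */
    for (i = 0; i < 8; i++) {
        levels[1 + i] = (unsigned char)(data & 1u);
        data >>= 1;                             /* next bit moves into the low position */
    }
    levels[9] = 1;                              /* stop bit */
}

/* Time at which the next frame may start, mirroring the XC function's
   return value: the start time advanced by ten bit times. */
unsigned uart_frame_end_time(unsigned start_time, unsigned bit_time)
{
    return start_time + UART_FRAME_BITS * bit_time;
}
```

For the byte 0x41 the slots come out as 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, that is, a low start bit, the data bits least significant first, and a high stop bit.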





Leon


Post Edited (Leon) : 5/25/2008 4:20:13 PM GMT

Mike Green
05-26-2008, 01:32 AM
Just looking at the documentation on the XMOS site, it seems like the XMOS is conceptually very similar to the Propeller with a different allocation of resources. In particular, there is a "state" associated with each thread consisting of a full register set, but the threads timeshare a processor. Each core consists of this processor, the round-robin scheduler, and the set of thread states. Associated with the core is the "main memory" and the communications (link) hardware. The core can be duplicated on-chip with sharing of other resources like memory and communications links.

The Propeller has a much larger state (2K bytes) associated with each processor and the processors are not scheduled unless they attempt to use one of the hub resources (like the shared memory). In other words, they run at full speed simultaneously unless they're explicitly blocked.
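
A rough way to put numbers on this difference is sketched below in C. It is back-of-the-envelope arithmetic, not a benchmark, and it rests on stated assumptions: the 400 MIPS per core and 8 threads per core quoted earlier in the thread, an even round-robin split among active threads (which ignores any per-thread ceiling the hardware may impose), and a Propeller cog executing one instruction every 4 clocks at 80 MHz. The function names are hypothetical.

```c
#include <assert.h>

/* Time-shared core: active threads split the core's instruction rate evenly
   (assumption: ideal round-robin with no per-thread limit). */
unsigned timeshared_mips_per_thread(unsigned core_mips, unsigned active_threads)
{
    assert(active_threads > 0);
    return core_mips / active_threads;
}

/* Independent processors: each one runs at clock / clocks-per-instruction,
   regardless of how many others are running. */
unsigned independent_mips_per_cpu(unsigned clock_mhz, unsigned clocks_per_instr)
{
    assert(clocks_per_instr > 0);
    return clock_mhz / clocks_per_instr;
}

/* Aggregate rate when n_cpus independent processors all run at once. */
unsigned independent_mips_total(unsigned clock_mhz, unsigned clocks_per_instr,
                                unsigned n_cpus)
{
    return independent_mips_per_cpu(clock_mhz, clocks_per_instr) * n_cpus;
}
```

Under these assumptions a fully loaded 400 MIPS core gives each of 8 threads about 50 MIPS, while each of 8 cogs at 80 MHz gets about 20 MIPS no matter what the others are doing, for roughly 160 MIPS per chip. Time spent waiting for hub access is not modelled.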

Leon
05-26-2008, 02:38 AM
Each Xcore Tile actually has 64k bytes of SRAM, 64 I/Os and Xlinks, as well as 8k bytes of OTP memory. Resources aren't shared by other cores.

Leon


ErNa
05-26-2008, 03:04 AM
This processor is just a different world. May acquired capital and hopes to bring his vision into reality. The transputer, an occam machine, was a different and better way to design systems, but people could think of nothing to do with it except write parallel C compilers. The Propeller is a cheap and simple chip, and that will give us the power to convince people who play with other, "cheaper" processors to discover the advantages of true parallel processing. Thinking "parallel" is not a question of the processor. It is a matter of becoming aware that the real world runs in parallel and that serial solutions are simple only under very restricted conditions. When it comes to stacked interrupts, the world looks different.
May has to find the "killer application", and he will never earn the money back if he addresses amateurs like us.

Luis Digital
05-26-2008, 03:25 AM
rjo_ said...

On the software side... $1000 seems very reasonable:)
Rich


It is software plus the full hardware. (http://www.amazon.com/exec/obidos/ASIN/B001706YLG/ref=nosim/findnet0f-20)

pharseid
05-26-2008, 03:42 AM
The transputer pretty much appeared at just the wrong time. You had half a dozen RISC processors competing with the 2 established CISC processors, and not many would make it. The supercomputer industry would take an enormous hit with the end of the cold war. So you had an architecture which ran no existing applications competing with architectures that ran a hundred billion dollars' worth of existing applications. Look where we are now in the desktop arena.

On the other hand, in the area of microcontrollers and DSPs, software compatibility doesn't count for that much. If you just look at TI DSPs, they have conventional DSPs, transputer-like DSPs, VLIW DSPs. I think there's plenty of room for XMOS; they just have to outperform the competition in enough applications. And I don't think they should cater to amateurs either, but that doesn't mean being unaccommodating. Microchip, for instance, has done pretty well giving away software. I don't imagine that decision was based on the hobbyist market, but it certainly makes it nice if you are a hobbyist.

From my perspective, the XMOS chip is an experiment, and if people like us aren't going to experiment, who is?

-phar

evanh
05-26-2008, 06:44 AM
The XMOS brochures hint strongly at a specific target market of networking boxes, switches/access points/routers. That will be why it's getting any VC investment at all. They will have all their tools and libraries streamlined for that market. Anything else will be a bonus to them.

I note Veronis has not replied.


Evan

ErNa
05-26-2008, 01:38 PM
And I think we didn't answer his questions:


(1) Articles on this forum or elsewhere discussing the parallel processing capabilities of the Prop.
(2) Example code setting up parallel processing between cogs in one chip and intercommunication between chips.

I twice tried to point to the CmapTools, which can be downloaded for free from http://cmap.ihmc.us/. With the help of this mapping tool it could be possible to bring better order into the object exchange, document software, describe programming patterns, and so on. As a professor you can have the server software too.

http://cmapspublic3.ihmc.us:80/servlet/SBReadResourceServlet?rid=1181572927203_421963583_ 5511&partName=htmltext
just points to the WWW representation of the actual cmap. The advantage is that anyone can have access and can contribute his knowledge. The drawback is that the result comes from cooperation ;)

I didn't go on with this work because there was no response, only various other efforts to bring more order into the chaos, and as long as there is no real working together, this only adds to the chaos.

Post Edited (ErNa) : 5/26/2008 1:51:54 PM GMT

peterz
05-26-2008, 07:46 PM
The XMOS web site has had no changes for many months, so I doubt they will release anything in a 'few weeks'. Has anyone seen an actual XS1 chip?

Leon
05-26-2008, 09:25 PM
The evaluation kit is available, and they told me that they have supplied early silicon to some customers. My contact there offered me a used demo kit, at a discount.

Leon
