Well, jazzed, very good question. It was kinda suggested by Heater, and I gotta give him the credit because it's, let's say, a challenge.
And, the SDRAM hardware may also alleviate the storage memory issue, but I may have to squeeze some data down to 2,048 bytes, which is the RAM size in COGs. It has been done on PICs and ARM Stellaris MCUs, suggesting that it's also possible on P8X32A.
I am going to read through the rest of uCLinux's main OS kernel source code file tonight to see what can be done, and what can't, for the P8X32A chip. I may have to sacrifice some bells & whistles unless I also have either a hard drive or SD card available for extra software. I will have to wade through it as it's a 332MB source BZIP, but it's worth it because it's a treasure to behold - containing the tidbits I would never have thought I could live without.
If uCLinux is compiled too big, then it's onto a FAT32 formatted IDE hard drive, and have the firmware loader on a serial FRAM load the Linux boot image outta this hard drive, like a PC. After all, I have bunches of older hard drives I would use for something, although I am saving better one for this prototype for some reasons. For now, I may build the master controller board, with a single P8X32A chip with 32MB Micron SDRAM, and several IO controllers, like Vinculum II, RS232 and USB to IDE controller - I will also leave the header for it to plug into the additional P8X32A boards - it's just a prototyping hardware.
I finally settled on a 500mA switching voltage regulator from Microchip, http://search.digikey.com/scripts/DkSearch/dksus.dll?Detail&name=MCP1602-330I/MF-ND - I will find a 1A regulator if it needs to be that way. I also included a Panasonic 10uF 6.3V SMD Aluminum Polymer capacitor to filter as much of that nasty switching noise as possible.
I am going to order the last batch of active components for a prototype controller board, then order the headers and the PCB. After soldering, it's definitely playtime!
I have formatted my 40GB hard drive with FAT32 using 32KB clusters. It was difficult, until I figured something out in my WD Data Lifeguard software, and I forced it to format 100% of the hard drive data area with 32-bit FAT tables, so now it should be easier for the P8X32A to scan through the hard drive and find whatever it wants. For some reason I decided to skip the 16KB cluster option and picked the biggest one possible.
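For anyone curious what the bigger allocation unit buys, here is a rough sketch in Python of the arithmetic. It assumes "40GB" means 40 decimal gigabytes and ignores the reserved area and the space the FATs themselves take up, so treat the numbers as ballpark only. The function name fat32_layout is made up for this example.

```python
def fat32_layout(volume_bytes, cluster_bytes):
    """Return (cluster count, bytes in one FAT) for a FAT32 data area."""
    clusters = volume_bytes // cluster_bytes
    fat_bytes = clusters * 4  # FAT32 keeps one 32-bit entry per cluster
    return clusters, fat_bytes


VOLUME = 40 * 10**9  # assumption: 40 decimal gigabytes

for size_kb in (16, 32):
    clusters, fat_bytes = fat32_layout(VOLUME, size_kb * 1024)
    print(f"{size_kb} KB units: {clusters} clusters, one FAT is {fat_bytes} bytes")
```

Doubling the allocation unit halves the number of FAT entries the Propeller would have to walk, at the cost of more wasted space per small file.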
Well, jazzed, very good question. It was kinda suggested by Heater
Not really, I just focused on it from your list of suggestions.
Thing is to get any Linux running on the prop you will need:
0) A GCC C compiler - Linux relies on some odd GCC extensions that a lot of other compilers don't have. No problem, the only solution just now is the zpugcc tool chain and the Zog ZPU virtual machine to run the binaries. Will be very slow. Prop II looks like it will be getting a new GCC generating LMM which will be much quicker.
1) A big load of RAM - No problem we have the 32MB solution in hand from GG.
2) A timer interrupt for the scheduler - Zog does not support interrupts but could be persuaded to do so I guess. Linux will need modding to use this I think.
4) A serial console port - No problem but a Linux driver will need to be written for it.
5) A block device for file system - No problem we have SD card support, probably needs a block device driver for Linux to be written.
6) Memory Manager - Probably not something you want to create an emulation for on the Prop, it would be horribly slow. Better to get back to a Linux that works without virtual memory. For that there is the Embeddable Linux Kernel Subset (ELKS). No idea if that is still being maintained.
7) Probably a bunch of things I have not thought of yet.
After all that you end up with the slowest Linux implementation in the world:) Perhaps interesting as an academic exercise but not of any use.
Now I'm known here for tackling pointless projects, a Z80 emulator and CP/M for example, but Linux on the Prop is way beyond my pointlessness threshold:)
Memory Manager? Just have to use very small RAM paging from the boot firmware on the I2C flash (or on my board, 64KB (or 128KB?) Ramtron Ferroelectric RAM) for the main RAM memory - it won't be pretty, but I just need to cheat a bit, though.
A block device.... What I may use would be a hard drive for this prototype. The serial console would be done via a USB to RS232 bus and a modified TTY console driver. Timer? I will have to make a virtual nested vector interruption machine, which will take a bit of resources, and I can't do Out-of-Order execution tricks on a few programs either, as it will make uCLinux quite unhappy - the kernel itself won't really care if it's run in-order or not, just that a few pieces of software will need to be run in-order. And, Propeller II Cogs are very likely to be superscalar, so you get raw speed-ups as a bonus from upgrading from P8X32A to Prop II. A big difference between those COGs' CPU cores.
It can be very slow, but still it is going to be a difficult OS to crash, unless the overflow stacks in it is obviously screwed up like from badly written VMMU software.
Like what you said, not a project for easily-intimidated programmers.
Memory Manager? Just have to use very small RAM paging
No. What I meant was that Linux uses protected virtual memory. That would require
emulating the paging hardware in the Propeller. Including raising exceptions on
page faults etc etc. Having Linux and its apps access memory through that
emulated virtual memory layer would make it glacially slow.
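To make that cost concrete, here is a minimal sketch in Python of what an emulated paging layer has to do on every single memory access. The page size, the dict-based page table and the names PAGE_SIZE, PageFault and translate are all made up for illustration. None of this is taken from Linux or from any Propeller code.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class PageFault(Exception):
    """Raised when a virtual page has no physical frame mapped."""


def translate(page_table, vaddr):
    """Map a virtual address to a physical one via a dict-based page table."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table.get(page)
    if frame is None:
        # A real MMU raises this in hardware; an emulator has to test for it
        # in software on every load, store and instruction fetch.
        raise PageFault(f"no mapping for virtual page {page}")
    return frame * PAGE_SIZE + offset


page_table = {0: 7, 1: 3}  # virtual page -> physical frame
print(hex(translate(page_table, 0x1010)))  # virtual page 1, offset 0x10
```

Even this toy version is a division, a lookup and a test per access, and on the Prop each of those would be several PASM instructions on top of the external RAM access itself.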
A block device.... I may use would be a hard drive for this prototype.
Better to use an SD card initially, requires few pins and works, just needs a
Linux driver to interface to it.
I will have to make a virtual nested vector interruption machine
As the Prop has no interrupts your VM has to check for them all the time,
including those page faults. Even more glacially slow.
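A rough sketch of what that polling looks like, in Python rather than PASM and with a made-up two-opcode instruction set. The function name run and the list-based interrupt queue are assumptions for the example, not how Zog or any real VM is laid out.

```python
def run(program, pending_irqs, max_steps=100):
    """Run a toy VM that polls for interrupts before every instruction."""
    pc, acc, handled = 0, 0, []
    for _ in range(max_steps):
        # With no hardware interrupts the VM itself must poll here, once per
        # emulated instruction, whether or not anything is pending.
        if pending_irqs:
            handled.append(pending_irqs.pop(0))
        if pc >= len(program):
            break
        op, arg = program[pc]
        if op == "add":
            acc += arg
        elif op == "halt":
            break
        pc += 1
    return acc, handled


print(run([("add", 2), ("add", 3), ("halt", 0)], ["timer"]))
```

The check is paid on every pass through the loop, which is the overhead being described: the common case (nothing pending) still costs time.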
Not sure why you are worrying about out of order execution all the time. Either
a processor does it or it does not. The programmer cannot change that and the
programs being run don't know or care. All the results come out in the same
order as a simpler CPU so there is no issue.
You could put out of order execution into an LMM or Zog or whatever virtual
machine but that would just make it an order of magnitude or two slower than it
is already for no benefit apart from being able to say "look I emulated an out
of order execution CPU on a Prop".
All in all Linux on the Prop is not going to happen.
Good point.... Better to do it on the Propeller II chip as it could possibly have superscalar CPUs in all 8 COGs, even though RAM is still a limiting factor. I am a bit confused now, thinking "how the heck do they do it on the ARM Cortex M3 microcontroller from Luminary Micro / TI, the LM3S102 chip???"
If it's severely watered-down, then it could make sense. uCLinux won't need strict Virtual RAM paging, though (while full-blown Linux, which I could see booting up startlingly fast on my computer containing a Phenom II Deneb CPU, requires the MMU to be fully brought up and paged for 4k prefetching)...
I am starting to think it will be better to make my own OS for this chip, while Out-of-Order software would allow it to speed up greatly, owing to flexible branch prediction selection as I would know what's in both of the softwares.
uCLinux, although originally designed for microcontrollers, will have to be out of the picture for a while. (Fun fact: 30 to 56% of post-2005 MP3 players use uCLinux as a boot firmware, and I have Toshiba Gigabeat F10 that contain this type of firmware OS kernel.)
Why not SD? Well, for some software stuff, I would prefer to use the hard drive, and Vinculum II has many flavors of buses, RS232, I2C, SPI and FIFO - and it's also quite cheap. I can also remove the hard drive, and plug in the SD reader into the USB port.
I will try to explain here. You know that P8X32A will have to execute everything in-order, right? It's nice and great, but there are some problems with it: there are such things as bad branches, which cause it to set back and redo everything all over again, or in rare cases, the CPU core inside the COG will have to drop the instructions onto the 2KB RAM memory, do something else first until the loop or djnz is found, and the CPU's branch prediction circuitry finds out that those sets of codes go together - it then executes them together. That also kinda explains why the emulators are slow. They have to carry out a limited set of codes until all the pieces are together, and keep them if the branch prediction matches, and if not the CPU is screwed.
But if I allow a bit time for Prop to analyze the branch predictions in a sacrificial COG, and couple hits would be reported back to the other COGs, and if there are misses, it can test that by reshuffling the instruction calls / flags and report it to the COGs too. All COGs will have the same software to ensure they are all kept on their toes all the times, as it is already well known that codes can change without any warning - sometimes without any inputs. AI just don't need too much branch prediction, while for better FPU emulators will take tolls on branch prediction circuitry. In my software I may write soon, it would pick the best chance all of the machine codes can be done shorter is put on priority list (scoreboarding methods that some early Out-of-Order processors used, and probably some today's OOOE CPUs still do) until the shorter ones are exhausted, they are saved onto the SDRAM or COG RAM depending on size, then do complicated machine codes - they are all later pieced together. Branch predication do much better, because the wrong branch won't harm the processor's chances of finishing as much.
Lastly, I wouldn't be surprised if I find out that Prop II's COGs has what is a true superscalar, 2-issue Out-of-Order CPU core.
And, you can also try delayed-issue on P8X32A if you're either bored or lazy, you would notice difference in speed if certain threads are delayed somewhat carefully.
You know that P8X32A will have to execute everything in-order, right?
I hope so. I want it to execute things in the order I specify and no other.
It's nice and great, but there is some problems with it: there are such things
as bad branches,...
What is a "bad branch", can you give an example?
...which cause it to set back and redo everything all over again,...
Not as far as I know. Such a mechanism would seriously mess up the Prop's deterministic timing.
...or in rare case, the CPU core inside the COG will have to drop the
instructions onto the 2KB RAM memory, do something else first until the loop or
djnz is found,...
Also not as far as I know. Do you mean 2k HUB or COG memory? No matter, COGs don't write anything to anywhere except as specified by the code I run on them. I hope:)
... and CPU's branch prediction circuitry finds out that those set of
codes go together - it then execute them together.
Again as far as I know the Prop has no branch prediction circuitry.
That's also kinda explain why the emulators are slow. They have to carry out limited set of codes, until all pieces together, and keep them if the branch prediction matches, and if not the CPU is screwed.
Emulators are slow because no matter how you slice it, it is going to take many PASM instructions to fetch an opcode, decode and dispatch it, execute it and write back the results, setting whatever status flags are needed in the emulated CPU.
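The steps just listed can be sketched like this. It is Python standing in for PASM, and the two-opcode 8-bit accumulator CPU, the opcode values and the names step, DISPATCH, op_load and op_sub are all invented for the example.

```python
def step(cpu, memory):
    """Emulate one instruction of a made-up 8-bit accumulator CPU."""
    opcode = memory[cpu["pc"]]        # 1. fetch the opcode
    operand = memory[cpu["pc"] + 1]   #    and its operand byte
    handler = DISPATCH[opcode]        # 2. decode and dispatch
    handler(cpu, operand)             # 3. execute, writing back the result
    cpu["zero"] = cpu["a"] == 0       # 4. update the emulated status flag
    cpu["pc"] += 2                    # 5. advance to the next instruction


def op_load(cpu, operand):
    cpu["a"] = operand


def op_sub(cpu, operand):
    cpu["a"] = (cpu["a"] - operand) & 0xFF  # wrap like an 8-bit register


DISPATCH = {0x01: op_load, 0x02: op_sub}

cpu = {"pc": 0, "a": 0, "zero": False}
memory = [0x01, 5, 0x02, 5]  # load 5, then subtract 5
step(cpu, memory)
step(cpu, memory)
print(cpu)
```

Every emulated instruction pays for all five numbered steps, so one guest instruction turns into many host instructions before any real work is done.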
But if I allow a bit time for Prop to analyze the branch predictions in a
sacrificial COG,...
But the Prop does not do branch prediction (as far as I know).
...and couple hits would be reported back to the other COGs,
and if there are misses, it can test that by reshuffling the instruction calls /
flags and report it to the COGs too.
Whatever that means it sounds like you would be doing more work counting hits/misses, reshuffling code and communicating between COGs than if you just executed the code normally.
All COGs will have the same software to ensure they are all kept on their toes
all the times, as it is already well known that codes can change without any
warning - sometimes without any inputs.
Is it? What do you mean "codes can change without any warning"? A lot of PASM programs rely on self-modifying code, which hopefully is under the author's control, so the "same" code in two COGs may not always be the same at all times depending on the processing going on. Unless the data and timing is the same in both, in which case there is no point to having two copies running.
AI just don't need too much branch prediction, while for better FPU emulators
will take tolls on branch prediction circuitry. In my software I may write soon,
it would pick the best chance all of the machine codes can be done shorter is
put on priority list (scoreboarding methods that some early Out-of-Order
processors used, and probably some today's OOOE CPUs still do) until the shorter
ones are exhausted, they are saved onto the SDRAM or COG RAM depending on size,
then do complicated machine codes - they are all later pieced together. Branch
predication do much better, because the wrong branch won't harm the processor's
chances of finishing as much.
Can't make heads or tails out of that but if you can write a faster floating point object as a result of these ideas I'm sure we would all be forever grateful.
Lastly, I wouldn't be surprised if I find out that Prop II's COGs has what is a true superscalar, 2-issue Out-of-Order CPU core.
I would be very surprised, as I said such things as pipelines and branch prediction seriously mess up deterministic timing. We like that every (most) instruction always takes the same number of clocks no matter what. We also like that pretty much all instructions take the same number of clocks.
And, you can also try delayed-issue on P8X32A if you're either bored or lazy, you would notice difference in speed if certain threads are delayed somewhat
carefully.
Love to, do you have a code example that demonstrates this effect?
Okay. You win. Apparently, I thought there was branch prediction in the CPU core in the COGs, as the datasheet mentioned dynamic branch picking - now you pointed out that it's totally static. I guess I am shot. And, for a couple of reasons I like to use an Out-of-Order processor as it tends to finish up complicated code quickly, and there is apparently no such thing as a free lunch, you know?
I have been busy for a while... I will try to start ASAP. Just taking my time, though...
Anyways, I am wondering personally about this: via the 74LV573 after the SDRAM tied to P8X32A, it would be linked to another P8X32A and copy some content to another RAM memory, or just directly with CS pulled up to temporarily disable the SDRAM's IO (just so it can ignore what's going on), as I won't have many pins available after filling the controller board with the goodies. You guys think I can try it, or will I end up having some weird glitch (phantom signals) at the IO pins or being greeted with weird memory glitches?
I had to ponder how to get ALL 8 pins tied together with all P8X32A, and something to copy the data onto SDRAMs quickly if I perform broadcast copying.
Okay.... This thread went silent. I understand - maybe this project isn't worth pursuing - I could be wrong, though. I could just continue without any assistance, if you think it's best this way. No offense taken... -____-;
I have been wondering about the IO layout, trying to get better resolution between the numbers of pins and bitrate.
Either way, I may have to tie the SDRAM IO with other P8X32A's SDRAM IO with enable lines tied to the RAM chip, to be able to independently control the condition of the RAM and CPU to CPU IO.
How many Cogs will you sacrifice? I'm wondering how many sacrificial Cogs I will need in my machine. I hope I don't need any. I'd like to merely put the cog to work and then reinitialize the cog (after the organizational work is complete) with new code to keep it on the up and green. But you know how it goes..the Machine Gods may think otherwise.
The project is definitely worth pursuing. You just need to start. Put one foot in front of the other. After a while you will have nice linear progressive progress. Two months later, it could go "development exponential."
One to two COGs, depending on how complex is the FP / integer routines - they have to read through very complex branch hints. Afterward, they are free to do whatever they want, like accessing RAMs or the SD / hard drive storage.
About "Do it or lose it" routine, it's pretty much how the advanced branch prediction in PowerPC (including AMD Phenom II based on PowerPC G6 core - a definite advantage in term of speeds and I am loving this CPU!) works - if it's not fulfilled, the CPU evict it out - many time, CPU may elect to save it on L2 / L3 cache RAM to try again if necessary, though.
I think I may turn the first prototype into a type of personal mainframe with a couple of P8X32A, tethered to RS232 to USB converter (DLP-TXRX dongle) for output until I can make my own video card for it, and the video card would follow parallelism - all four P8X32A chips for graphic processing (three for high-speed analog RGB 3D calculation - kinda like older workstation video cards, such as professional version of Matrox video cards - each with its own SDRAM for video buffering) - I will let something bounce in my head for a bit.
BTW, the RGB VGA video card was just something out of curiosity to see if I can squeeze enough color signature out of each P8X32A chips for individual RGB wires outta the VGA monitor and to see if it could do reasonably good graphic (SNES or Playstation 1 3D graphics) with those processors. I also may want to clock it at 100 or 120MHz to be able to do it quickly.
I have built the "mother of all prototype" which I may work with for a while now - I also got to convert the SiT8003 8MHz MEMS oscillator into DIP-4 oscillator. The experimenter board is for the purpose of testing the DO firmware with very high-precision clock (I also have crystal pack I could insert in there too: 4, 8, and 12MHz speeds for three in anti-static pack.) I know it's not as beautiful as SMD ones, though. =P It gotta do for now.
BTW, I apologize for the crappy image quality... And, about mounting the SMD MEMS oscillator live-bug style, it was frustrating trying to get it right, as the solder wasn't behaving well, so I had to keep at it, and I succeeded in the end: an instant DIP-4 MEMS oscillator.
Very nice to see the oscillator board and for things to get under way.
If you put a convex lens in front of the camera and focus it, the resulting macro lens image would be much more clear. Nice experiment for trying out a number of precision clocks and the end effect is obviously for over-clocking to faster speed. I look forward to reading about your results.
I'm working on a low cost version super coolant agent that can help in high speed over-clocking projects. This is part of the Cryogenics Project over here:
I was at the electronics and computer surplus center that has around 200 stores in one central location and went around searching, and collecting parts / supplies this past week. I can report on more results soon.
That's quite good! However, if you have been using the QFN-44 package, you could cool the P8X32A die rather quickly as you would already have the thermal vias on the GND die attach pad, so you would have an easier time cryogenically cooling it. R134a is difficult at best, but the results would be spectacular (as in higher core frequency as you lower the transistor's RS_on resistance. In my opinion - take it with a grain of salt.... I think going to 300MHz or over that is possible as long as you get better ceramic 0.1uF low-ESR MLCCs around the Vss / Vdd so you won't have to worry too much about the CPU DC rail stability, and going to 1 GHz would be just jaw-dropping, but the heat-wall is going to prevent it anyways so we need to focus on what we can reach easily.)
I also am considering buying a handful of QFN-44 Propeller chips (8 or 16) for the second part of Dendou Oni prototyping and putting Peltier TEC plates on the Copper backing, then putting the board on it. I would love to see how fast the P8X32A can go! (I have done OC before, if you have been wondering. I did it with a Phenom II temporarily just for the thrill: I went from 3.0GHz to 3.8GHz without changing the Northbridge, just the core multiplier, on air using a modified stock heatpipe coming from AMD, with a Sunon 80mm fan rated for 110CFM sitting atop the fan bracket with an 80mm / 70mm funnel, and I had to let it cruise at low speed as it screams loudly once it hits 6,000 RPMs.... Yet, it was interesting how much I can get away with on Black Edition CPUs...)
Oh, and you will want to use special foams (Neoprene foam, the black, non-porous stuff used in fridge's plumbings to cover the cold line from getting all encrusted in ice.) and your P8X32A chips will thank you for not letting the condensation shorting it out. (hence the reason I recommended QFN-44 package as it can also make foam applying process somewhat easier, at least for me.) DIP-40 just take a while before finally touching the base of sub-zero temperature and it just make it interesting (Remember kids, interesting = bad!), the differing temperature can shear the plastic package, so it need to be done slowly - within 4 hours to go from about 30 C down to - 70C.
P.S. I eventually uploaded a nice closeup of SiT8003 MEMS oscillator in DIP-4 socket that I put in for to reduce the chance of damaging this oscillator - you can see why the soldering job on turning it into a bug was very tricky. And also convenient as I can also pop it in the place of quartz crystal oscillator.
And, I just picked up a camcorder battery from Radio Shack on discount - it's a 6V 2.7 Ah Nickel Metal Hydride battery - I am wondering how I would crack it open without shorting whatever's inside (I did crack it partly - the batteries are enormous - the diameter is about the same size as a US Quarter Dollar (25 cents) and / or a 10 Yen coin. No wonder it's so powerful....) so I could use it in the Dendou Oni prototype or the final version of the Propeller II Dendou Oni for power backup of "keep-alive computing", which is basically a board that is kept alive even when power goes out and the power button wasn't pressed to shut down when that occurred (if it was pressed, the batteries would be disconnected, though.)
I really wanted a nice and BIG Lithium-ion camcorder batteries so I could do the same, only it will be able to go on for few days active, and possibly 6 years with ALL P8X32A (or PII) on standby and SDRAMs shut down. But, there were none on discount, so I had to hit NiMH up as it was for $4.00 here, just use what I can use... (And with that kind of Ampere - about 45A estimated short-circuit power, I could be able to keep P8X32A on for probably a week until it hits 4.2 V then needs to be recharged up to 6.4V. Not too bad.)
That's quite good! However, if you have been using QFN-44 package, you could cool the P8X32A die rather quickly as you would already have the thermal viases on the GND die attach pad so you would have easier time cryogenically cooling it. Oh, and you will want to use special foams (Neoprene foam, the black, non-porous stuff used in fridge's plumbings to cover the cold line from getting all encrusted in ice.) and your P8X32A chips will thank you for not letting the condensation shorting it out. (hence the reason I recommended QFN-44 package as it can also make foam applying process somewhat easier, at least for me.) DIP-40 just take a while before finally touching the base of sub-zero temperature and it just make it interesting (Remember kids, interesting = bad!), the differing temperature can shear the plastic package, so it need to be done slowly - within 4 hours to go from about 30 C down to - 70C.
Dr. Mario, good approach to plate the underside of the QFN-44 chip and empower it with the Peltier junction. However, that form factor prop takes some special equipment to solder. As you know, with the P8X32A-D40 and the dry ice method there is no condensation to short out close or wide-spaced pins. That's because of pure sublimation. No water involved. Dry Ice is non toxic, non corrosive, non conductive and does not damage any metal surfaces. It goes directly from a solid to a gas with no liquid-state melting in between. It also does not require on board or external power consumption like the Peltier.
As it's electrically nonconductive, it's used for cleaning by pellet blasting surfaces, and for surface to chip deposition in just the right amount on boards in diagnostic troubleshooting when the board is powered up.
Yea. Dry ice is ideal. However, I have a concern about them shearing the plastic package due to thermal difference. Yet, dry ice is also valuable as refrigerant too.
I have been wondering about the battery power for backup UPS for a board - you think I am cool with Ni-MH or should I build a Lithium : Iron-Phosphate battery pack? (I have enough experience dealing with them to know what can be done, and what can't be - regular Lithium-ion batteries are just too moody and touchy, so I felt I would give 3.2V Lithium Iron Phosphate batteries a shot, as they don't care about short-circuit abuses.)
BTW, I had to turn my LED flashlight down a bit so I don't blind the CCD sensor (Some Kodak digital cameras are just so picky.... Oh well, I gotta work with what I got.) as well. And, yep, this picture shows up very well. (I had to use old camcorder front lens like you suggested, and took the picture within an inch between the lens and DIP-4 MEMS oscillator - the result was surprisingly high-resolution image.)
Lastly, if no special equipment is present, I just use an Infrared toaster oven, the one that doesn't have a magnetron which can fry the chips with a nasty dosage of microwave emission - the IR soldering method is what I would usually get away with - and it really takes a lot of patience placing all those SMD parts - I just take my time with that stuff. Then oven it to 215 - 245 C until the solder is fully melted (for no longer than 10 - 30 minutes as described in the datasheet outlining recommended soldering methods - I just leave out SMD solid-polymer electrolytic capacitors until the last moment, then solder them with a hand soldering iron as they HATE heat.)
After few hours of carefully-planning and breaks from the dangerous battery surgery (or butchering, depending on your POV), I finally got five HUGE batteries out of the pack (which I discarded the torn plastic pack) - which you can see in the picture with one Energizer AA NiMH for a size. I have done few "Kids, do not do that at home!" stuffs before, so I am more experienced, but still I have to treat Nickel - Metal Hydride batteries like they're Lithium : Cobalt Oxide batteries (the most dangerous type of batteries next to Lithium Polymer batteries) however, in the end, I got what I wanted.
What am I going to do with them? I may use them for the battery backup unit in the Dendou Oni prototype supercomputer (For Prop II ones, I felt it would be best to go with 6.4V 10 Ah Lithium : Iron Phosphate batteries so it could wait out long enough until power is restored.)
BTW, I want to inform the readers, even though the Lithium : Iron Phosphate batteries won't care about the short-circuit current, they will be more than happy to blow up a beefy motor, I would still have to emphasize: They still can explode. How?
If the Li:FePo4 battery's Anode top gets dinked, the vent is done for, making recharging process now more dangerous as there is nowhere for released Oxygen gas to go. Still, respect the warning labels printed on those batteries, and you will be fine. They're safer when treated with due respects.
Speaking of using a Lithium : Iron Phosphate batteries for Propeller II-based supercomputer for redundant boards that may need to be kept alive even when experiencing black-outs, I think I may need to go much deeper in researching much as possible to create better battery management system (BMS) board to ensure the batteries being recharged are kept in check, that the battery voltage are kept constant. That may necessitate the limited selection of microcontrollers, I would like to use Propeller in this one, but I may need to decide if cheap PIC18 will work as it has for certain Japanese camcorder batteries (most likely for Sony Handycam) before buying probably 12 Li : FePo4 batteries to be able to even provide enough juice for laptop hard drive and the separate board to be kept alive for a good portion of times until either power is restored or the computer on BMS has decide to shut down due to batteries being spent (from 6.4 V sufficient charged down to 5.4 V discharged / trip-out).
Anyone who have done BMS are welcome to put in what you have done in constructing the BMS, as I have to research few things first before building an intelligent battery pack. (I am familiar with how Li:Ion battery protection circuitry, but is willing to learn a bit more so I could make better ones in the future.)
Your BMS sounds very interesting. All batteries require care, more or less, but LIPOs need the most attention. A couple questions. 1) Do you expect blackouts to occur often? 2) Approximately how many props do you expect will need backup? We have found ways to continually lower the power consumption in an array of props including a standby mode that can last a long time and hold programs. Plus now we have circuits that are Joule savers. I can see a circuit which detects loss of power and diverts to a joule saving device with a battery. Of course a lot of the power requirements may abruptly change when hypering an array of Propeller chips.
It was just as a safe-guard, and when shutting down the computer for 90% (up to 240 PII) to be maintained, while 10% (32 PII, within 10A switchmode VRM limits I chose for safety margins for battery - I would allow 40A for start-up of VRMs, including a separate USB / SATA laptop hard drive) is just isolated (which the critical data is to be processed, and available and ready when ATX SMPS is switched on, and 90% of other boards booted up) - LiFePo4 batteries can stand being punished so I chose this type for that reason, to be able to withstand huge current draw to the point either the MOSFET on the BMS explodes or fuses just blow (although on a properly-designed BMS, the individual fuses for each battery will never pop, just the CPU in the MCU has shut the MOSFET driving rail down to trip out on short-circuit condition) - I am willing to do 20A max. on the backup Li:FePo4 battery pack specifically created for this supercomputer (it also would have RS232 that can be checked upon by another PC when being tested under load).
Lastly, I avoid Li : Po at all costs, even though I have a few around - they hate being banged on a table, they explode right away. Sometimes I wonder why some Youtube users said I was wrong about abuse intolerance (while I am actually right because I am familiar with battery warnings and safety). A couple of people say that Li : Po are much safer. I actually doubt it for a couple of reasons: they're still made so flimsy (insulated by a SURPRISINGLY THIN plastic isolator that is easily torn), and packaged in pouches - no protection at all except for the over / discharge protection board. They are still being used anyways (because they are actually cheaper than bullet-proofed Lithium Ion varieties. The old saying, "You get what you pay for" never dies.) They are fine for usage, as long as the current draw never exceeds 100% of the battery rating (better to derate that by 60% anyways, for safety reasons) and mechanical shocks are never applied to the Li : Po batteries. Even a hand-slap onto the Li : Po battery is enough to tick it off.
I am surprised that Li:FePo4 batteries assembled in 11.85V pack can start a car. O______O 250A cranking current?! Now that's really impressive, but I would still prefer to treat mine as it is the weakest batteries in the world (so I can get longer life expectancy outta them), hence the reason for BMS. I got the general feelings of BMS, as I have done the surgery on laptop battery pack (again, I need to stress, it's still very dangerous - I just have enough experience to deal with it. I got free still-good Lithium Ion 18650 cells outta dead pack, and am still using them.) - the BMS in laptop battery pack is smart enough to know what's going on - basically a built-in DMM and circuit breaker combined on a board.
I didn't think too much about battery backup because of living in a country with 220 VAC standard where remarkably has not had a single power outage in four years . I don't know how they achieve that kind of power regulation. In the USA, there were frequent outages due to tests, rewiring and storms. Backup for the hard drive due to code eating virus is another story.
Well, both in USA (Montana) and Japan (Tokyo), I would experience frequent black-outs (before then, black-out in Japan was so rare, now thank to Fukushima Daiichi getting destroyed in the process, it's getting a bit too much for black-outs in Tokyo.... I am hoping they will correct future black-outs - probably convert to ALL 60Hz links, although it's still considered a wishful thinking.)
Even so, I personally would like to have the batteries as an insurance policy, even when I want to be able to pull out a cabinet containing the 32 PII chips still operating to isolate something, or just peeping into what it is thinking, you know?
Of course, that's the recommendation - and you will want to read through the whole datasheet to ensure safety (building a BMS just doesn't leave room for any mistakes; a properly built BMS is a must).
BMS requirement (for people curious about using Li:FePo4 batteries):
For each 3.2 V Li:FePo4 battery, the battery charger / controller must -
Cut off at 3.6 - 3.7 Volts, NEVER up to 4.2 V - at that point, charging process should stop.
Cut off at 2.4 V (some needs to be cut off at 2.8V) when discharged.
However, you can also put a 10A fuse on each battery, before the circuit breaker IGBTs/MOSFETs. (Li:FePo4 batteries are extremely tough - the only thing that will explode under stress would be the higher-ampere fuses.) For smaller Li:FePo4 batteries, like the CR123A size common in expensive digital cameras, a 2.5A fuse is fine and would be recommended.
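For anyone who wants to see the cut-off rules above as logic, here is a minimal sketch in Python. It is only an illustration of the thresholds listed in this post, not firmware for any real BMS chip: the function name, the returned strings and the choice of 3.6 V / 2.8 V (the conservative ends of the ranges above) are my own assumptions, and the 10A figure simply mirrors the per-battery fuse rating.

```python
# Hypothetical per-cell check for a 3.2 V Li:FePo4 cell.
# Thresholds come from the post above; names are illustrative only.
CHARGE_CUTOFF_V = 3.6      # stop charging here (3.6 - 3.7 V, never 4.2 V)
DISCHARGE_CUTOFF_V = 2.8   # conservative; some cells tolerate 2.4 V
CURRENT_LIMIT_A = 10.0     # mirrors the 10A per-battery fuse


def bms_action(cell_voltage, current_a, charging):
    """Return what the protection logic should do for one cell."""
    # Over-current is checked first so the MOSFET opens before a fuse pops.
    if abs(current_a) >= CURRENT_LIMIT_A:
        return "trip"
    if charging and cell_voltage >= CHARGE_CUTOFF_V:
        return "stop_charge"
    if not charging and cell_voltage <= DISCHARGE_CUTOFF_V:
        return "stop_discharge"
    return "ok"
```

Calling `bms_action(3.65, 1.0, charging=True)` asks the charger to stop, while the same voltage during discharge is fine. A real pack would run this per cell and add temperature checks and hysteresis, which are left out here.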
Not really, I just focused on it from your list of suggestions.
Thing is to get any Linux running on the prop you will need:
0) A GCC C compiler - Linux relies on some odd GCC extensions that a lot of other compilers don't have. No problem, the only solution just now is the zpugcc tool chain and the Zog ZPU virtual machine to run the binaries. Will be very slow. Prop II looks like it will be getting a new GCC generating LMM which will be much quicker.
1) A big load of RAM - No problem we have the 32MB solution in hand from GG.
2) A timer interrupt for the scheduler - Zog does not support interrupts but could be persuaded to do so I guess. Linux will need modding to use this I think.
4) A serial console port - No problem but a Linux driver will need to be written for it.
5) A block device for file system - No problem we have SD card support, probably needs a block device driver for Linux to be written.
6) Memory Manager - Probably not something you want to create an emulation for on the Prop, it would be horribly slow. Better to get back to a Linux that works without virtual memory. For that there is the Embedded Linux Kernel Subset (ELKS). No idea if that is still being maintained.
7) Probably a bunch of things I have not thought of yet.
After all that you end up with the slowest Linux implementation in the world:) Perhaps interesting as an academic exercise but not of any use.
Now I'm known here for tackling pointless projects, a Z80 emulator and CP/M for example, but Linux on the Prop is way beyond my pointlessness threshold:)
Memory Manager? Just have to use very small RAM paging from the boot firmware on the I2C flash (or on my board, 64KB (or 128KB?) Ramtron Ferroelectric RAM) for the main RAM memory - it won't be pretty, but I just need to cheat a bit, though.
A block device.... What I may use would be a hard drive for this prototype. The serial console would be done via a USB to RS232 bus and a modified TTY console driver. Timer? I will have to make a virtual nested vector interrupt machine, which will take a bit of resources, and I can't do Out-of-Order execution tricks on a few programs either, as it will make uCLinux quite unhappy - the kernel itself won't really care if it's run in-order or not, just that a few pieces of software will need to be run in-order. And Propeller II Cogs are very likely to be superscalar, so you get raw speed-ups as a bonus from upgrading from P8X32A to Prop II. A big difference between those COGs' CPU cores.
It can be very slow, but still it is going to be a difficult OS to crash, unless the overflow stacks in it is obviously screwed up like from badly written VMMU software.
Like what you said, not a project for easily-intimidated programmers.
No. What I meant was that Linux uses protected virtual memory. That would require
emulating the paging hardware in the Propeller. Including raising exceptions on
page faults etc etc. Having Linux and its apps access memory through that
emulated virtual memory layer would make it glacially slow.
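To make the cost concrete, here is a rough Python sketch of what "emulating the paging hardware" means: every single memory access has to go through a software page-table lookup and may raise a page fault. The class and method names, the 4 KB page size and the dictionary page table are all assumptions for illustration; nothing here is Zog or Linux code.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class PageFault(Exception):
    """Raised when a virtual address has no mapping."""

    def __init__(self, vaddr):
        super().__init__(f"page fault at {vaddr:#x}")
        self.vaddr = vaddr


class EmulatedMMU:
    def __init__(self, ram_size):
        self.ram = bytearray(ram_size)
        self.page_table = {}  # virtual page number -> physical frame number

    def map(self, vpn, pfn):
        self.page_table[vpn] = pfn

    def translate(self, vaddr):
        # This lookup runs in software on EVERY load and store.
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.page_table:
            raise PageFault(vaddr)
        return self.page_table[vpn] * PAGE_SIZE + offset

    def read_byte(self, vaddr):
        return self.ram[self.translate(vaddr)]

    def write_byte(self, vaddr, value):
        self.ram[self.translate(vaddr)] = value & 0xFF
```

On real hardware the translate step is done by the MMU in parallel with the access; in an emulator it becomes extra instructions on top of an already interpreted load or store, which is where the slowdown comes from.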
Better to use an SD card initially, requires few pins and works, just needs a
Linux driver to interface to it.
As the Prop has no interrupts your VM has to check for them all the time,
including those page faults. Even more glacially slow.
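A tiny sketch of that polling cost, in Python and with made-up names: the loop below stands in for a VM's main loop, and the "timer" is simply a count of executed instructions. It is an assumption-laden illustration, not how Zog works.

```python
def run_polled(n_instructions, tick_every, on_tick):
    """Run n_instructions emulated steps, polling for a timer tick each step.

    Returns the number of polls performed. on_tick is called with the
    instruction count whenever a tick is due.
    """
    polls = 0
    for executed in range(1, n_instructions + 1):
        # ... one emulated instruction would execute here ...
        polls += 1  # the check itself costs time on every instruction
        if executed % tick_every == 0:
            on_tick(executed)
    return polls
```

With `tick_every=4` and ten instructions, the handler fires twice but the check runs ten times; the overhead is paid whether or not an interrupt is pending.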
Not sure why you are worrying about out of order execution all the time. Either
a processor does it or it does not. The programmer cannot change that and the
programs being run don't know or care. All the results come out in the same
order as a simpler CPU so there is no issue.
You could put out of order execution into an LMM or Zog or whatever virtual
machine but that would just make it an order of magnitude or two slower than it
is already for no benefit apart from being able to say "look I emulated an out
of order execution CPU on a Prop".
All in all Linux on the Prop is not going to happen.
If it's severely watered-down, then it could make sense. uCLinux won't need strict virtual RAM paging, though (while full-blown Linux, which I could see booting up startlingly fast on my computer containing a Phenom II Deneb CPU, requires the MMU to be fully brought up and paged for 4k prefetching)...
I am starting to think it will be better to make my own OS for this chip, while Out-of-Order software would allow it to speed up greatly, owing to flexible branch prediction selection as I would know what's in both of the softwares.
uCLinux, although originally designed for microcontrollers, will have to be out of the picture for a while. (Fun fact: 30 to 56% of post-2005 MP3 players use uCLinux as a boot firmware, and I have Toshiba Gigabeat F10 that contain this type of firmware OS kernel.)
Why not SD? Well, for some software stuff, I would prefer to use the hard drive, and Vinculum II has many flavors of buses, RS232, I2C, SPI and FIFO - and it's also quite cheap. I can also remove the hard drive, and plug in the SD reader into the USB port.
Could you please explain a little what you mean by this?
I will try to explain here. You know that P8X32A will have to execute everything in-order, right? It's nice and great, but there are some problems with it: there are such things as bad branches, which cause it to go back and redo everything all over again, or in rare cases, the CPU core inside the COG will have to drop the instructions onto the 2KB RAM, do something else first until the loop or djnz is found, and the CPU's branch prediction circuitry finds out that those sets of code go together - it then executes them together. That also kinda explains why the emulators are slow. They have to carry out a limited set of codes until all the pieces come together, and keep them if the branch prediction matches, and if not the CPU is screwed.
But suppose I allow a bit of time for the Prop to analyze the branch predictions in a sacrificial COG: a couple of hits would be reported back to the other COGs, and if there are misses, it can test that by reshuffling the instruction calls / flags and report that to the COGs too. All COGs will have the same software to ensure they are all kept on their toes at all times, as it is already well known that codes can change without any warning - sometimes without any inputs. AI just doesn't need too much branch prediction, while better FPU emulators will take a toll on branch prediction circuitry. In the software I may write soon, the machine codes with the best chance of being done sooner are put on a priority list (the scoreboarding method that some early Out-of-Order processors used, and probably some of today's OOOE CPUs still do); once the shorter ones are exhausted, they are saved onto the SDRAM or COG RAM depending on size, then the complicated machine codes are done - they are all pieced together later. Branch predication does much better, because the wrong branch won't harm the processor's chances of finishing as much.
Lastly, I wouldn't be surprised if I find out that Prop II's COGs has what is a true superscalar, 2-issue Out-of-Order CPU core.
And, you can also try delayed-issue on P8X32A if you're either bored or lazy, you would notice difference in speed if certain threads are delayed somewhat carefully.
I hope so. I want it to execute things in the order I specify and no other.
What is a "bad branch", can you give an example?
Not as far as I know. Such a mechanism would seriously mess up the Prop's deterministic timing.
Also not as far as I know. Do you mean 2k HUB or COG memory? No matter, COGs don't write anything to anywhere except as specified by the code I run on them. I hope:)
Again as far as I know the Prop has no branch prediction circuitry.
Emulators are slow because no matter how you slice it, it is going to take many PASM instructions to fetch an opcode, decode and dispatch it, execute it and write back the results, setting whatever status flags are needed in the emulated CPU.
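Here is what that fetch / decode / dispatch / write-back cycle looks like for a toy machine, sketched in Python. The four-opcode instruction set, the 8-bit accumulator and the single zero flag are invented for the example; a real Z80 or ZPU emulator has far more cases, but the shape of the loop is the same, and every pass through it costs many host instructions for one emulated instruction.

```python
def run(program, max_steps=1000):
    """Interpret a list of (opcode, operand) pairs on a toy accumulator CPU.

    Returns (accumulator, instructions_executed_before_HALT).
    """
    acc = 0
    pc = 0
    zero = False
    steps = 0
    while steps < max_steps:
        opcode, operand = program[pc]      # fetch
        pc += 1
        if opcode == "LOAD":               # decode and dispatch
            acc = operand & 0xFF
        elif opcode == "ADD":
            acc = (acc + operand) & 0xFF
        elif opcode == "SUB":
            acc = (acc - operand) & 0xFF
        elif opcode == "JNZ":
            if not zero:
                pc = operand
        elif opcode == "HALT":
            return acc, steps
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        if opcode in ("LOAD", "ADD", "SUB"):
            zero = (acc == 0)              # write back the status flag
        steps += 1
    raise RuntimeError("step limit reached")


# Count down from 3 to 0, then halt.
countdown = [("LOAD", 3), ("SUB", 1), ("JNZ", 1), ("HALT", 0)]
```

Running `run(countdown)` executes seven emulated instructions before the HALT, and each one went through the whole chain of comparisons above. No prediction or reordering is involved; the overhead is simply the interpreter loop itself.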
But the Prop does not do branch prediction (as far as I know).
Whatever that means it sounds like you would be doing more work counting hits/misses, reshuffling code and communicating between COGs than if you just executed the code normally.
Is it? What do you mean by "codes can change without any warning"? A lot of PASM programs rely on self-modifying code, which hopefully is under the author's control, so the "same" code in two COGs may not always be the same at all times, depending on the processing going on. Unless the data and timing are the same in both, in which case there is no point in having two copies running.
Can't make heads or tails out of that, but if you can write a faster floating point object as a result of these ideas I'm sure we would all be forever grateful.
I would be very surprised, as I said such things as pipelines and branch prediction seriously mess up deterministic timing. We like that every (most) instruction always takes the same number of clocks no matter what. We also like that pretty much all instructions take the same number of clocks.
Love to, do you have a code example that demonstrates this effect?
I have been busy for a while... I will try to start ASAP. Just taking my time, though...
I had to ponder how to get ALL 8 pins tied together with all P8X32A, and something to copy the data onto SDRAMs quickly if I perform broadcast copying.
I have been wondering about the IO layout, trying to get better resolution between the numbers of pins and bitrate.
Either way, I may have to tie the SDRAM IO with other P8X32A's SDRAM IO with enable lines tied to the RAM chip, to be able to independently control the condition of the RAM and CPU to CPU IO.
How many Cogs will you sacrifice? I'm wondering how many sacrificial Cogs I will need in my machine. I hope I don't need any. I'd like to merely put the cog to work and then reinitialize the cog (after the organizational work is complete) with new code to keep it on the up and green. But you know how it goes..the Machine Gods may think otherwise.
http://forums.parallax.com/showthread.php?125614-Propeller-supercomputer-hardware-questions&p=1018449&viewfull=1#post1018449
The project is definitely worth pursuing. You just need to start. Put one foot in front of the other. After a while you will have nice linear progressive progress. Two months later, it could go "development exponential."
See this post:
http://forums.parallax.com/showthread.php?124495-Fill-the-Big-Brain&p=1018114&viewfull=1#post1018114
About "Do it or lose it" routine, it's pretty much how the advanced branch prediction in PowerPC (including AMD Phenom II based on PowerPC G6 core - a definite advantage in term of speeds and I am loving this CPU!) works - if it's not fulfilled, the CPU evict it out - many time, CPU may elect to save it on L2 / L3 cache RAM to try again if necessary, though.
BTW, the RGB VGA video card was just something out of curiosity to see if I can squeeze enough color signature out of each P8X32A chips for individual RGB wires outta the VGA monitor and to see if it could do reasonably good graphic (SNES or Playstation 1 3D graphics) with those processors. I also may want to clock it at 100 or 120MHz to be able to do it quickly.
BTW, I apologize for crappy image quality... And, about mounting the SMD MEMS oscillator the live-bug style, it was frustrating trying to get it right, as the solder weren't behaving as well, so I had to do it right, and I succeed in the end: an instant DIP-4 MEMS oscillator.
Very nice to see the oscillator board and for things to get under way.
If you put a convex lens in front of the camera and focus it, the resulting macro lens image would be much more clear. Nice experiment for trying out a number of precision clocks and the end effect is obviously for over-clocking to faster speed. I look forward to reading about your results.
Here's another site which has free oscillator samples.
http://www.sitime.com/support/request-samples
I'm working on a low cost version super coolant agent that can help in high speed over-clocking projects. This is part of the Cryogenics Project over here:
http://forums.parallax.com/showthread.php?124495-Fill-the-Big-Brain&p=971277&viewfull=1#post971277
http://forums.parallax.com/showthread.php?124495-Fill-the-Big-Brain&p=971283&viewfull=1#post971283
http://forums.parallax.com/showthread.php?124495-Fill-the-Big-Brain&p=971454&viewfull=1#post971454
I was at the electronics and computer surplus center that has around 200 stores in one central location and went around searching, and collecting parts / supplies this past week. I can report on more results soon.
I am also considering buying a handful of QFN-44 Propeller chips (8 or 16) for the second part of Dendou Oni prototyping and putting Peltier TEC plates on the copper backing, then putting the board on it. I would love to see how fast the P8X32A can go! (I have done OC before, if you have been wondering. I did it with a Phenom II temporarily just for the thrill: I went from 3.0GHz to 3.8GHz without changing the Northbridge, just the core multiplier, on air, using the modified stock heatpipe cooler coming from AMD, with a Sunon 80mm fan rated for 110CFM sitting atop the fan bracket with an 80mm / 70mm funnel, and I had to let it cruise at low speed as it screams loudly once it hits 6,000 RPMs.... Yet, it was interesting how much I can get away with on Black Edition CPUs...)
Oh, and you will want to use special foams (Neoprene foam, the black, non-porous stuff used in fridge's plumbings to cover the cold line from getting all encrusted in ice.) and your P8X32A chips will thank you for not letting the condensation shorting it out. (hence the reason I recommended QFN-44 package as it can also make foam applying process somewhat easier, at least for me.) DIP-40 just take a while before finally touching the base of sub-zero temperature and it just make it interesting (Remember kids, interesting = bad!), the differing temperature can shear the plastic package, so it need to be done slowly - within 4 hours to go from about 30 C down to - 70C.
P.S. I eventually uploaded a nice closeup of SiT8003 MEMS oscillator in DIP-4 socket that I put in for to reduce the chance of damaging this oscillator - you can see why the soldering job on turning it into a bug was very tricky. And also convenient as I can also pop it in the place of quartz crystal oscillator.
I really wanted a nice and BIG Lithium-ion camcorder batteries so I could do the same, only it will be able to go on for few days active, and possibly 6 years with ALL P8X32A (or PII) on standby and SDRAMs shut down. But, there were none on discount, so I had to hit NiMH up as it was for $4.00 here, just use what I can use... (And with that kind of Ampere - about 45A estimated short-circuit power, I could be able to keep P8X32A on for probably a week until it hits 4.2 V then needs to be recharged up to 6.4V. Not too bad.)
Dr. Mario, good approach to plate the underside of the QFN-44 chip and empower it with the Peltier junction. However, that form factor prop takes some special equipment to solder. As you know, with the P8X32A-D40 and the dry ice method there is no condensation to short out close or wide-spaced pins. That's because of pure sublimation. No water involved. Dry Ice is non toxic, non corrosive, non conductive and does not damage any metal surfaces. It goes directly from a solid to a gas with no liquid-state melting in between. It also does not require on board or external power consumption like the Peltier.
http://www.cwrmglobal.co.za/dry-ice.htm
As it's electrically nonconductive, it's used for cleaning by pellet blasting surfaces, and for surface to chip deposition in just the right amount on boards in diagnostic troubleshooting when the board is powered up.
http://wiki.answers.com/Q/Is_dry_ice_a_good_electrical_conductor
I have been wondering about the battery power for backup UPS for a board - you think I am cool with Ni-MH or should I build a Lithium : Iron-Phosphate battery pack? (I have enough experience dealing with them to know what can be done, and what can't be - regular Lithium-ion batteries are just too moody and touchy, so I felt I would give 3.2V Lithium Iron Phosphate batteries a shot, as they don't care about short-circuit abuses.)
BTW, I had to turn my LED flashlight down a bit so I don't blind the CCD sensor (Some Kodak digital cameras are just so picky.... Oh well, I gotta work with what I got.) as well. And, yep, this picture shows up very well. (I had to use old camcorder front lens like you suggested, and took the picture within an inch between the lens and DIP-4 MEMS oscillator - the result was surprisingly high-resolution image.)
Lastly, if no special equipment is present, I just use an Infrared toaster oven, the kind that doesn't have a magnetron which can fry the chips with a nasty dosage of microwave emission - the IR soldering method is what I would usually get away with - and it really takes a lot of patience placing all those SMD parts - I just take my time with that stuff. Then I oven it to 215 - 245 C until the solder is fully melted (for no longer than 10 - 30 seconds at peak, as described in the datasheet outlining recommended soldering methods - I just leave out the SMD solid-polymer electrolytic capacitors until the last moment, then solder them with a hand soldering iron as they HATE heat.)
What am I going to do with them? I may use for the battery backup unit in Dendou Oni prototype supercomputer (For Prop II ones, I felt it would be best to go with 6.4V 10A/hr Lithium : Iron Phosphate batteries so it could wait out long enough until power is restored.)
BTW, I want to inform the readers, even though the Lithium : Iron Phosphate batteries won't care about the short-circuit current, they will be more than happy to blow up a beefy motor, I would still have to emphasize: They still can explode. How?
If the Li:FePo4 battery's Anode top gets dinked, the vent is done for, making recharging process now more dangerous as there is nowhere for released Oxygen gas to go. Still, respect the warning labels printed on those batteries, and you will be fine. They're safer when treated with due respects.
Anyone who has built a BMS is welcome to share what you have done in constructing it, as I have to research a few things first before building an intelligent battery pack. (I am familiar with how Li:Ion battery protection circuitry works, but am willing to learn a bit more so I can make better ones in the future.)