Hub Execution Makes Things Too Open-Ended
cgracey
Posts: 14,151
It's been creeping up on me that perhaps hub exec makes things so open-ended that it actually slows people's engagement, because they are no longer dealing with a limited space and anything is possible. Would you all concur?
When something is finite, it's fun to attack, since it poses a finite challenge and you are anxious to master it. When something becomes somewhat infinite, the drive to engage can become more diffuse, because of an awareness that anything is probably possible, but..... so what, then? Tight spaces invite challenges, while big spaces just languish like a floaty question mark.
Comments
I know I will be able to do a lot more - and a lot more interesting things - because of hubexec.
LMM was great on the P1 - it allowed porting C ... but it also had a huge performance hit. That performance hit is basically history.
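For anyone who hasn't seen it, the core of LMM was just a tiny fetch-execute loop - a minimal sketch of Bill Henning's technique, with illustrative label names:

lmm_loop      rdlong  lmm_instr, lmm_pc   ' fetch the next instruction long from hub
              add     lmm_pc, #4          ' advance the hub program counter
lmm_instr     nop                         ' the fetched instruction executes right here
              jmp     #lmm_loop           ' back for the next fetch
lmm_pc        res     1                   ' holds the hub address of the LMM code stream
                                          ' (LMM branches are done by writing lmm_pc, not by jmp)

Every LMM instruction costs at least a full hub window instead of four clocks, which is where that performance hit came from. Hubexec fetches from hub in hardware, so the loop simply disappears.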
Now we will need great docs, but frankly it will be a lot easier to explain hubexec to people than the limitations of having to fit into a cog.
No need to answer the "HUH? You have 256KB of ram, but you can only write <2KB programs????"
NOT as open-ended as you think.
That is why I asked my question in the other thread about LOCBASE.
I write COG-based code, but it addresses pointers that exist in HUB - and those pointers are filled in by COG code at runtime.
I agree, in a sense. So much more room just makes me have to think longer about what to do, whereas if we had an instruction area of only, say, 128 instructions, I would have all kinds of things I'd want to try to fit.
You know what's kind of boring to imagine? Imagine a processor with an infinite memory space and only a few simple instructions - ones that would lend themselves to compilers. That just doesn't seem interesting to me, as it's flavorless and infinite. Not until you built some libraries to do interesting things would it start to get interesting. The Prop2 is designed to be engaging to a human at the assembly-language level.
32k opcodes & 32k data
Given recent threads, Hub Execution may need to introduce the idea of memory-typing, so that the assembler can check that ORG, references, etc. really are where the user expected.
Jellybean equates may not be enough anymore.
Such memory-typing also helps symbolic debug, and watch handling.
The thing about all of us here is that we really enjoy and savor IMPLEMENTATION DETAILS. Modern engineering thinking seems to eschew this whole area of discipline, in favor of generic solutions that hide implementation. Implementation is almost a dirty word. There's no ingenuity in avoiding implementation matters, it seems to me.
256KB means 64K long opcodes, with no data.
Doh - that's what happens when fingers are faster than a distracted brain.... fixed above.
Hmm... setmap the memory into four 128-long chunks, and you effectively have four baby cogs in a cog.
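Going from memory of the current task instructions (so treat the exact mnemonics and slot encodings below as assumptions rather than a tested listing), four round-robin "baby cogs" would look something like:

            jmptask #task1, #%0010      ' point task 1's PC at task1
            jmptask #task2, #%0100      ' point task 2's PC at task2
            jmptask #task3, #%1000      ' point task 3's PC at task3
            settask #%%3210             ' rotate the time slots: tasks 3,2,1,0, repeating
task0       jmp     #task0              ' placeholder spin loops - each task now gets
task1       jmp     #task1              '   a quarter of the cog's cycles and would live
task2       jmp     #task2              '   in its own 128-long region
task3       jmp     #task3

Confine each task's code and data to its own 128-long quarter of cog RAM and there are your four baby cogs.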
Actually, I love the opportunities that tasks, hubexec, and (dare I say it) even threads open up.
Many classes of applications that were impractical on the P1 (i.e. vision processing, etc.) become feasible on the P2.
And if your application/driver fits... there is still a certain elegance to fitting in a cog.
Architectures that pander only to compilers are BORING.
So we are in total agreement here
Personally, I don't care if compilers don't use even 80% of those assembly language oriented instructions. Even if they could, it would not be economical to analyze for every possible special case.
But so what?
Assembly language programmers will use them all - when appropriate.
Exactly!
And no compiler generated code will be able to touch code hand crafted by assembly experts that uses all the features & tricks.
Given Assembler is going to be important on a P2 for the lowest level crafting, I have always liked the look of HLA
(Assembler control, without the tedium)
http://en.wikipedia.org/wiki/High_Level_Assembly
and a quick example
http://www.plantation-productions.com/Webster/HighLevelAsm/hla_examples/calc2.hla.txt
First an example...
We all know the 486 has lots of instructions.
Do the compilers use all these? Probably not!
How many write assembler for the 486? Probably not a lot!
We all know Chip writes in assembler for the 486+.
I wrote an emulation for the ICL mini I worked on - I used 486 Assembler and I targeted the new RISC (fast) instructions.
So, to answer your original question Chip, it will not detract from what we want to do in P2 assembler, it will just enhance it!
Now, we do have a problem looming. It has been touched on in different threads.
We need to be able to get at both the hub address and the cog address for various instructions. We ran into this on the P1, and @@@ was introduced. It is already an issue now with hubexec mode as well as cog mode: we have relative and absolute addresses, but we need both of them for hub mode and for cog mode.
This is a software issue, but one which needs to be addressed. I don't have any proposals yet - it is early days. But we are going to have mixed-mode programs, just like the monitor program.
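To make it concrete, here is the P1 situation today, using the @@@ operator as the BST/homespun tools implemented it (whether the P2 tools keep that syntax is exactly the open question - labels below are illustrative):

DAT           org     0
entry         mov     ptr, #table        ' #table = the COG register address (0..511)
              rdlong  value, hub_ptr     ' hub access needs a byte address instead
hub_ptr       long    @@@table           ' @@@ = absolute hub byte address (loader fixup);
                                         '   plain @table gives only an object-relative offset
table         long    1, 2, 3, 4
ptr           res     1
value         res     1

In hubexec mode the same label also has to resolve as a hub address for jumps, so the tools will need a clean way to say which interpretation you mean in mixed-mode source.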
Yesterday I spent some time looking at the monitor (including compiling it - thanks for posting the source Chip) to see what I could use of the monitor rom in hub. I thought I could use the tx_string but then I realised it had to call a cog routine to get each character. I used the hub strings successfully though. Certainly interesting times ahead
One thing to keep in mind is that the new limits are still small for some sorts of tasks. Think about doing an HDTV screen: even a 720p frame at one byte per pixel is 1280 x 720 = 921,600 bytes, several times the whole hub. That cramps the hub just as much as a simple VGA screen did on the P1.
Right now things seem big, but they won't be once we start actually doing stuff. Soon, we will need tasks to maximize COGS, soon we will need HUBEX to maximize memory, or swap in code or work around data, soon we will need to communicate across the PORTD to make effective use of math and DACS and counters.
The scope of challenging tasks doesn't need to get much bigger to start to squeeze, and data already will be that way. As usual, the video, audio and I/O capabilities far exceed the HUB RAM.
We have opened it up, but it is by no means open ended, IMHO of course.
No, it's a good thing. Not only does it save memory, it will often run faster as well. Leaves more memory and cpu power for other things. Waste not, want not.
I have been trying to follow the development, but honestly after all of the contributions and conversations, I don't really understand the limits or limitlessness of Hub Execution.
I like the concept of easy access to HUB RAM and expanded code space. It should actually make coding the kinds of things I am interested in far simpler, but as to whether this capability belongs in all of the Cogs or just a single Cog is beyond me. And (correct me if I am wrong) it seems that the performance boost is well worth the overhead.
I still don't have the skill to take a video driver and make it work from multitasking or take a driver that has been written for multitasking and make it single tasked.
BUT I think the video driver example is the exception rather than the rule. I suspect that most of the time, an object will simply have to be tagged as to whether it works under multitasking or requires HUB EXEC.
At this point I would be very concerned about how all of this is ever going to be made tangible to the wider audience, but I have full confidence that Parallax's world-class team of technical writers is more than up to the task.
I like the idea of not having to use any of the advanced features if I don't want to. As long as this remains largely the case, I don't think you should be concerned. And I certainly don't think you should try to explain all of this yourself! Keep it brief and to the point; if some of us don't quite get "it," that is not your problem.
Thank you.
Rich
There is a lot of MONEY involved in keeping the implementation HIDDEN. The whole semiconductor industry is now the ART OF HIDING.
Obfuscated RTL, intellectual property, layers and layers and layers of complexity to keep main memory (DDR, NAND flash), mass storage (SATA), and protocols (USB, etc.) away from ordinary use, and only in the hands of a small group of industry-associated leaders.
The few books that get their hands dirty with implementation are kept like gold. And it is not easy to find implementation detail even in university or academic circles, because they follow the same practice of hiding.
If some protocol gets reverse-engineered and used everywhere: replace it immediately with a new, great, fantastic, bright one ... but with a state machine exponentially more complex to implement, so that it stays controlled only by authorized hands.
If some company makes a product that bites a big industry: buy it, or destroy it (legally, through the law).
Engineering is not in the hands of engineers, but in the hands of business groups (and some businesses are in the hands of governments, too). Parallax seems to be the exception. And that is why I am here.
I think a lot of existing users value the propeller for its open ended approach - being free to put peripherals on any pin, use the cogs co-operatively or independently etc. The P2 is even more of a 'wide open road', and should be celebrated and marketed as such.
I wouldn't worry about hubexec being much of a source of disengagement. Here are a few reasons:
- code written to fit a tight constraint often isn't as easy to follow
- hub ex allows us to consider porting big slabs from alternative code bases, more easily. I'm now tempted to port across some algorithms currently performed in an upstream PC that I wouldn't consider tackling previously
- in many ways the hub execution is a more natural way for non P1 users to approach multicore processing. We existing P1 users need to step back a bit to appreciate this
- limits are arbitrary. Did the (initially expansive-seeming) 512-longs-per-cog / 32KB-hub limit of the P8X32A discourage engagement or problem-solving? Not really...
- there is nothing worse than not being able to access enough RAM when you need it (data, but code too).
- the cog hasn't grown - it's still 512 longs - but we now have 4 tasks. It'll be easier to fill a cog.
- Hub ex is an elegant growth option once you fill a cog the old fashioned way.
The possible downside to hubexec that I see is that it may fracture the way people share code destined for the OBEX. I don't know whether that is a real problem or not. And it complicates the addressing slightly, perhaps. I'm about to find out for sure.
On the other hand, there are many things that can be done to encourage engagement. The new range of instructions and peripherals goes a long way.
There's a bit of "choice paradox", or perhaps Aesop's "fox and cat", in all this. I'm very pleased you consider this kind of thing, Chip.
But most people do not engage with computers at the assembly language level. I guess we tend to forget that because we spend so much time at that level in forums like this - but we are the exception, not the rule!
Many (perhaps even most) P2 programmers will use a higher level language - SPIN, C or C++ (or FORTH or LISP or JAVA). For them, the P2 will be just another microprocessor (via HUB execution mode) that also has a quirky additional microcode-like facility (COG execution mode) that gurus can use to write hardware interfaces or device drivers.
Ross.
I posted a proposal on this in the "Blog" thread - here. The response so far has been underwhelming. I think most people don't yet get that this is going to be an issue, or that what I proposed is supposed to get rid of the need for @, @@ or @@@.
Ross.
Roger.
Update: My gut feeling is that applications that need to hit I/O speeds and process data in the ballpark of several tens of Mbps will become the more challenging ones for the P2 to solve, depending on what extra hardware capabilities we may get with any future "SerDes" feature. The hubexec model may not always be able to cater for these data rates, given the extra hub access cycles it may require to execute. It will primarily depend on how real-time hubexec can be kept using its instruction cache, and on how the time-critical I/O loops can be fashioned.
Hold on! The P2 is still TINY in comparison to pretty much anything with its level of power/ability. Working with it to take full advantage of its ability is going to be a real challenge for us all. I think you just aren't thinking big enough about what it can do. Imagine some of your real-time visualization dreams; some of those are going to need big data and big code.
Also, as a comparison, my day job involves dealing with hundreds of textures, all of which are too big to fit in the P2's memory (some are 5.5MB compressed, or 22MB in raw 32-bit RGBA form), combined with a multitude of other pieces of data like sound, meshes, collision information, and so on (many of which are individually also too big to fit in the P2). I've been spending most of my time lately making our stuff more efficient so we don't run out of address space in a 32-bit Windows environment (where a process gets 2GB). It's all just a matter of scale. When you are trying to render sufficiently detailed graphics and sound on modern machines at high resolution, you just need a lot more memory.
Trying to display sufficient detail at even 720p (let alone 1080p) can consume considerable space not just for the data, but for the code needed to manage, process, and display the data.
Anything that helps make Assembler source faster to visually scan, and helps the tools do the housekeeping, is a good idea.
Getting rid of syntax strangeness like @, @@ or @@@ is a step in the right direction.
You articulated here a lot of things that affect my motivation. I miss the days when disparate protocols could be addressed through flexibility, alone. Now, not only is dedicated transistor-level hardware often required (for unavoidable reasons), but there's the patronage to be paid to the standards owners who keep their protocol details hidden. Not fun. Not the future I had hoped for. It's like the expanding wavefront squirreled off into another dimension that is unreachable, or at least unpalatable. Only the desire for money and the willingness to forfeit freedom can take one there.
There's still much that can be developed with simple circuits in pins to deal with voltages, currents, resistances, and capacitive/inductive reactances, in order to realize novel sensors and control loops. The natural world is still our friend, and always will be. I would be content to just live there, myself.
When I first learned about computers, I had this notion that they would evolve to be benevolent helpers that would be, first of all, reliable and loyal. It's turned out that they are primarily conduits for a lot of sludge, with a whole bunch of diseases and treachery coursing through them. Definitely NOT our friends, anymore.
I want the power to achieve the goals of my product. The last thing I want is to wonder how quickly I'll exhaust code space before finishing something. I built a G-code interpreter and XY-table driver in SPIN once. It was complete except for a few very important details. Code space was gone. Sorry, but I'm not into programming just to don a straitjacket.
Making tech accessible is always worth doing, Chip. That's why I'm here - I picked up on it right away. It's not hard to see, when Parallax works as hard at being good humans as they are technically competent.
We are going to hit a coupla hundred MHz, multi-core, multi-task. That's a lot! Sure, the overall chip is small relative to many things people do, but it won't always be. Get near a GHz, and suddenly it's possible to just go and do your own thing.
I think that's going to happen to some degree with the open hardware movements out there gaining steam. We've got other kinds of open movements running well, and those are directly competing with closed stuff.
Personally, I think we are doing good things. Thanks for it. Not sure where the journey will take us all, but it sure is a fun ride!
One other thing. Big money tends to get kind of needy and heavy. Usually, the result is a lot of the value ends up taken out in terms of layers of cost. Whatever we all think about it isn't so important. People needing to do their own thing is growing more important, not less.
Perhaps there will be more enthusiasm when we start to produce drivers - guess we will see
However, what you also need to do with this chip is make it usable for the masses, who aren't as hard-core as all us Prop-heads are, and removing the 2K code+vars limit will help them get over this hurdle. The C stuff that's being done should help bring in new customers too, and hopefully help make the P2 the success story that it clearly already is to us all!