Amazing Sand Physics demo uses 6 cogs to animate 10,000 grains of sand in realt
Dennis Ferron
Posts: 480
Recently I extolled the virtues of a raster-based, non-tiled graphics system, and speculated about modifying the tv.spin object to make it non-tiled, to make it easier to do pixel-by-pixel raster graphics instead of tile-by-tile graphics. Well, I did that, and now here's a demo showing just what a non-tiled game engine can do!
If you run the demo, you will see a continuous "snow" of grains of sand falling from the ceiling, and piling up on the ground. If you watch it long enough, the grains will pile up into sand dunes, which shift and flow into a red sinkhole on the right. The really neat thing is that the interaction of the falling sand, the flowing sand, and the sinkhole is chaotic and unpredictable. Sometimes the sink hole absorbs almost all the sand, but then a little tiny dune will escape the sink hole and grow to enormous size. Then it flows and falls over the sink hole and completely covers it. For a while it looks as though sand is coming in faster than it can leave, but suddenly the whole thing collapses and gets sucked into the hole. Then the process starts again, but with variations every time! The level of sand in the terrarium never goes too high or too low.
And the best part is: all this complexity arises out of just two "rules" that govern the individual grains! I can animate 10,000 grains of sand with just 12K of memory and a small processor because the grains of sand are cellular automata. I got the idea from Conway's Life game. In the game of Life, cells are "alive" or "dead", and there are simple rules which govern invidual cells to create complex overall behavior. After seeing what arises in Conway's Life, I figured cellular automata would work for doing the physics for particle interactions, too, and guess what, it does!
There are only two basic rules which operate on the sand particles:
1. A sand particle will fall if there is an empty space beneath it.
2. If 3 or more particles are stacked up, the bottom particle will be pushed left or right out of the tower, if there is space.
This is the sand physics engine I'm developing for my ant farm game. I'll need some more rules to allow tunnels, but even as it is now it would be good for something like a scorched-earth game (two tanks shoot artillery at each other; the sand would make great craters where the shells hit).
Another good feature of cellular automata is that they are trivially parallelizable - you can assign as many or as few processors as you want to the task without changing your code. This is a perfect fit for the Propeller - as I demonstrate by assigning 6 cogs to run the sand engine at once. The grains of sand are so tiny and there are so many, if two cogs step on each other's toes and accidently erase a grain of sand or duplicate it, you'd never even notice anything.
Unfortunately it appears that after a while the 6 cogs *do* interfere with one another, as evidenced by the demo getting slower and slower over time until it is running at the same speed as it did with one cog. The algorithm doesn't care if two cogs try to read/write the same grains of sand at the same time, but maybe the Propeller hardware doesn't like it. When the demo is started, I use a waitcnt between starting each physics cog to try to keep the cogs looking at different zones of the screen so that they won't interfere with each other. Eventually one of the cogs will catch up with another one and *either* it is causing the cog to crash until only 1 is left, or the cogs are getting stuck in lock-step waiting for each other because of read and write locks. I'm not really sure how to fix this.
If you run the demo, you will see a continuous "snow" of grains of sand falling from the ceiling, and piling up on the ground. If you watch it long enough, the grains will pile up into sand dunes, which shift and flow into a red sinkhole on the right. The really neat thing is that the interaction of the falling sand, the flowing sand, and the sinkhole is chaotic and unpredictable. Sometimes the sink hole absorbs almost all the sand, but then a little tiny dune will escape the sink hole and grow to enormous size. Then it flows and falls over the sink hole and completely covers it. For a while it looks as though sand is coming in faster than it can leave, but suddenly the whole thing collapses and gets sucked into the hole. Then the process starts again, but with variations every time! The level of sand in the terrarium never goes too high or too low.
And the best part is: all this complexity arises out of just two "rules" that govern the individual grains! I can animate 10,000 grains of sand with just 12K of memory and a small processor because the grains of sand are cellular automata. I got the idea from Conway's Life game. In the game of Life, cells are "alive" or "dead", and there are simple rules which govern invidual cells to create complex overall behavior. After seeing what arises in Conway's Life, I figured cellular automata would work for doing the physics for particle interactions, too, and guess what, it does!
There are only two basic rules which operate on the sand particles:
1. A sand particle will fall if there is an empty space beneath it.
2. If 3 or more particles are stacked up, the bottom particle will be pushed left or right out of the tower, if there is space.
This is the sand physics engine I'm developing for my ant farm game. I'll need some more rules to allow tunnels, but even as it is now it would be good for something like a scorched-earth game (two tanks shoot artillery at each other; the sand would make great craters where the shells hit).
Another good feature of cellular automata is that they are trivially parallelizable - you can assign as many or as few processors as you want to the task without changing your code. This is a perfect fit for the Propeller - as I demonstrate by assigning 6 cogs to run the sand engine at once. The grains of sand are so tiny and there are so many, if two cogs step on each other's toes and accidently erase a grain of sand or duplicate it, you'd never even notice anything.
Unfortunately it appears that after a while the 6 cogs *do* interfere with one another, as evidenced by the demo getting slower and slower over time until it is running at the same speed as it did with one cog. The algorithm doesn't care if two cogs try to read/write the same grains of sand at the same time, but maybe the Propeller hardware doesn't like it. When the demo is started, I use a waitcnt between starting each physics cog to try to keep the cogs looking at different zones of the screen so that they won't interfere with each other. Eventually one of the cogs will catch up with another one and *either* it is causing the cog to crash until only 1 is left, or the cogs are getting stuck in lock-step waiting for each other because of read and write locks. I'm not really sure how to fix this.
Comments
I have been wondering for quite sometime if a pixel specific video was ever going to be creatively used. [noparse][[/noparse]Though there are vector graphics, like Asteroids]
I am downloading right now and taking a look as you certainly got my attention.
Nonetheless, tiles really offer other convienences. They allow alpha-numeric data to handled with a minimum of hardware and they are efficent for the creation of real-time games.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
"If you want more fiber, eat the package.· Not enough?· Eat the manual."········
Just wait until I get Mode-X emulation working, then I can show how it's possible for a non-tiled system to do smooth scrolling yet use only 1 video page (no double buffering)!
There might be rare cases where they are, but in general the potential interdependence of each cell on each other is going to cause memory issues like the one you noticed; there's simply no easy way around it.
Now what you could do is to split the simulation into six "buckets" where sand could fall, but was guaranteed not to touch sand from another bucket. Then you could store sand info in cog memory (you'd need some pretty tight packing to get it all in, but i think it's doable). I have a feeling that's not what you're after, though.
It is a cool looking demo though. I did one of my senior projects on the use of cellular automata to predict racial segregation; maybe I should port those sims over to the prop.
Post Edited (Bergamot) : 2/25/2007 6:55:26 PM GMT
Instead, I further modified the TV driver to accept an "x origin" that it uses to determine where to start each line as it scans. I then over-allocated the screen memory so that there are 64 more pixels per line than can be displayed on the screen. In other words, the display is a 256x192 rectangle, but this logical window actually resides in a physical rectangle in memory that is 320x192 pixels. There is now a wide strip of pixels that is present down the side of the large rectangle in memory that can be drawn on just like the rest of the display, but is not actually visible on the TV. You can then change the x origin to slide the logical 256x192 rectangle (what the TV is showing), left or right within the wider rectangle in memory. Scrolling is as simple as writing to just one parameter, changing the x offset the TV driver uses to start each scan. Since the TV driver only loads this parameter at the beginning of each frame, there is no tearing or shearing - this graphics scheme achieves all the benefits of double bufferring, without the RAM overhead of a second video page. You also don't need to wait for a v-sync to avoid shearing; you just change the parameter whenever you want to and the TV picks it up the next time it starts a new frame.
Obviously you can scroll left or right 64 pixels this way, but what happens when you scroll past the edge of the extra strip I added? Well I also made the TV driver automatically wrap back to the beginning if you go past the end of the line, so you can scroll past the edge of the rectangle and it will wrap around to the beginning. In this example it allows the demo to repeat the same background over and over, but in a game you could use this to scroll different things into view. You would simply write new scenery onto the strip of the video memory that is offscreen, and there would be no visual artifacts from that because it is not visible yet. As you scroll forward, that strip that was previously offscreen will come into view. While that strip is coming into view, old pixels are scrolling off screen. You can then overwrite the old pixels (which now can't be seen) with more new scenery, and so on. Since the display wraps around like a mobius strip, you never run out of room to scroll, and you never have to draw the whole screen at once, just tiny strips of it.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com
my two bits
marty
Pat
Lawson: I tend to agree, but I'm going to write some code to test this theory - I want to see if multiple cogs do in fact slow down when writing the same memory. If that is what is slowing it down, I think I can fix it, but I want to be sure that's what the problem is.
Edit: In the long run it might be better to just try to make a single sand physics demo simply run faster on just one cog (optimize the code).
Edit: Yes, they could work off a random x-y position. It might look a little strange though, because sand on different parts of the screen might fall at different rates if they randomly get more "hits". With the whole-screen scan, everything gets equal consideration.
Post Edited (Dennis Ferron) : 2/27/2007 2:01:08 AM GMT
The way the test program works is that it runs an assembly language program that increments two counters. One counter is private to the cog and records how many times the cog's program loop executed. The other points to a shared counter whose value is not important, but which is read to and written from to demonstrate "access contention" - all the cogs running the test try to simultaneously read and write this shared counter at once. Just to be doubly sure that there will be a "collision" will occur, each loop reads and writes the same data to the shared memory location dozens of times in a row, so that the loop spends most of it's time banging on the shared memory, and if a collision could possibly occur, it should happen often.
It runs 3 copies of the assembly language program in 3 different cogs, with 3 private counters and one shared counter. The Spin program running on a fourth cog (the original cog) takes a snapshot of the three counters, uses the hardware counter (waitcnt) to wait for a period of time, and then takes another snapshot. This gives you a delta value telling you how many times each cog looped during the pause. Now, if the cogs were having to wait upon one another for read or write access to the shared location, then with three of them contending they would have to wait longer for access. So you would expect that when you run three of them together, your delta values (differences in loop count) would be lower than one cog running by itself at full speed with no contention. This is not what you see.
On a Propeller running at an 80MHz clock speed, you should see a delta value of about 3E9C to 3E9D for all three cogs. If you comment out one cog-start line and run the test with just 2 cogs, it is still 3E9C to 3E9D. If you comment out two cog-start lines and run the test with just one cog, it is still 3E9C to 3E9D. The loop speed in the test is wholly unaffected by the number of cogs that are writing and reading the same shared memory location. Therefore, it is impossible that my sand-physics cogs are slowing each other down due to write contention - frankly, write contentions do not exist on the Propeller, due to its spiffy architecture.
That still leaves me in the dark about the actual cause of the slowdown on the sand demo. Given the results of this test, it seems unlikely that cog to cog interaction has anything to do with it.
Pat
http://forums.parallax.com/forums/default.aspx?f=33&m=185675
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Realize that I am really a mad scientist··· and
Don't forget it!
http://raydillon.com/Images/Illustration/GameArt/WildIsle/WildIsle-Ink-ScientistClose.jpg
·
I like the dropping sand effect.
rob7
This is excellent. I definitely looks like we're interested in the same sort of things. I don't know if you tried out Fantasia or not, but the second version has particles. And a few days ago when I released it, 256 particles looked impressive. Now you've blown it out of the water with 10,000!
Of course we've taken very different approaches. You store your particles as pixels in a frame buffer. Mine are a list of X,Y coordinates of pixels. Good for different things. Yours good for large numbers of particles with celluar interactions between neighbours, just as you've done. Mine for starfields and sparks - no interactions between particles, but indiviidual velocity and acceleration.
I had been thinking about another approach for particles, which again has it's uses and drawbacks. Storing them as run-length encoded, either vertical or horizontal. And that ties into what Baggers is doing with his run-length encoded TV driver.
Post Edited (CardboardGuru) : 6/15/2007 6:24:56 PM GMT
I don't think any of your cogs are dying, or that you're getting into any clash conditions, or any such thing. I think what you are seeing is that difference in speed of various paths through the sand-physics code. When the demo starts hardly any of the screen is sand. So that path through the code is really short. Then, falling sand or sand on the top level of the dunes is faster than buried sand because the IsTower subroutine has a loop in it. A really small loop, but it's multiplied thousands of times.
I've come to this conclusion by doing a quick blinkenlights hack. Changes are described below. This gives a column of pixel-pairs at the top of the screen, one for each cog. They cycle through the 4 colours, once per frame. The fact that they all keep flashing means all the cogs are still alive. And they keep up their same pattern, one colour change following the other, but just gradually slowing down as more sand appears.
See what you think.
C:\Hydra\downloads\SandDemo\SandDemo.spin
Change
< gy := 10
---
> gy := 20
C:\Hydra\downloads\SandDemo\sand_physics.spin
Add the following after the line: mov sandy, #(ScreenH - 1)
> cogid py
> sub py,#1
> mov px,#16
> call #GetPixel
> add pixelcolor,#1
> and pixelcolor,#$3
> call #SetPixel
> add px,#1
> call #SetPixel
C:\Hydra\downloads\SandDemo\sand_physics.spin
Change:
< djnz sandy, #:loop_y
---
> sub sandy,#1
> cmp sandy,#16 wz,wc
> if_a jmp #:loop_y
Post Edited (CardboardGuru) : 6/15/2007 6:25:20 PM GMT
"From glancing at the code I'd guess that the "sand physics" cogs are catching each other and getting stuck in lock step."
I was thinking of the high level logic when I said that. I assumed that the time to complete a scan line depended on the ammount of sand on that line. If that was the case, a physics cog could get bogged down enough for the cog behind it to catch it. Now, since (as you've proven) each cog in the Prop acts 'independently' no hardware failure occured. I think it's simply that the two cogs now read the same data in one hub cycle, take the same time processing the data, and then write out the same results in another hub cycle. Hence they're now operating in "lock step" and it looks like one cog crashed.
add a re-synchronization system or make the physics cogs repel each other and I think this'll go away.
Sorry I wasn't clearer before,
Marty
I don't see why they would "get stuck in lockstep", even if one somehow managed to catch up with another. There are no locks used. And no waiting loops. I imagine they all just get slower as the sand builds up at the bottom of the screen.
On the other hand, it's probably worth working out a scheme to keep the cogs more or less evenly spaced on the screen to minimise the jerkiness of the animation.
ever heard of Fireflies synchronizing? Certain species of fireflies will spontaneously synchronize, this results in whole trees blinking on and off at the same time. What I'm describing is the same effect, it just has different underlying causes. www.jcu.edu/math/faculty/dpalmer/Animations.html is the best link I could find with Google. Unfortunately the Java animation's won't load for me so don't treat it as the final word. (lots of nice cryptic technical info found with Google too)
It's pretty easy to test. Let's see if the color of the pixel changes inbetween each cog reading it and writing it.
Add the following lines at the top of the loop, after px and py is initialised.
call #GetPixel
mov steve,pixelcolor
Then these at the end of the loop before movex and movey are added in:
call #GetPixel
cmp steve,pixelcolor wz,wc
if_ne mov dira,#1
if_ne mov outa,#1 'lights the HYDRA debug LED
After a short while the LED comes on. But wait a second, because the DEMO file is also writing to the screen. Those red blocks.
Comment out the following:
' Create a barrier
repeat gx from 80 to 100
repeat gy from 100 to 102
SetPixel(gx, gy, 2)
' Create a sink-hole
repeat gx from 200 to 207
repeat gy from 180 to 188
SetPixel(gx, gy, 2)
Now a couple of things happen. First of all the LED doesn't come on. So at no point are the cogs processing the same pixel. Secondly, without the sink hole, the sand keeps building up, and the processing gets slower and slower, and again it's easiest to see that with the blinkenlights.
Post Edited (CardboardGuru) : 6/15/2007 10:40:59 AM GMT
I'm aware of the phenomenon of fireflies synchronizing; that is indeed an ingenious hypothesis for the cause of this situation. That is a very interesting observation that if they ever got to the same place, all the cogs would follow the same paths through the code and never break out of step. So it'd appear as though no work is being done by the other cogs. If you are correct, then we may have stumbled upon a very important discovery about parallel computation. (It seems to me that if this kind of thing can actually happen, it's a kind of problem that could be generalized & be very important to know about.)
However, Occam's razor (the simplest explanation is the most likely to be true) leads me to think maybe it is just that there is more sand than before. I can test this second hypothesis by simply running one cog with more sand and with less sand, and seeing what the speed difference is. OTOH, I can test the lock-step hypothesis by kicking the cogs out of step.
I'm not at home right now so I haven't had a chance to try Cardboardguru's blinken lights, to verify whether it does disprove that the cogs are in lock step.
BTW Cardboardguru, I'm not Brian, I'm Dennis. [noparse]:)[/noparse]
To the person who tried the demo on the Hydra, the blue streaks are not a bug; I did that deliberately. I don't have to do it that way, but I just thought it looked more interesting if the first bits of sand changed the colors of the sky from black to blue. The eventual intention is to have the background scroll and continuously generate new dunes, with new black sky. So instead of filling the whole sky with blue, the sky would be black and blue streaks all the time as you moved forward. However, it's possible the TV driver doesn't look quite right on the Hydra, so maybe what you're seeing isn't identical to what I have. I'll have to try it on my Hydra when I get home and see.
If not, I will see what it takes to get a "raster-vga".