Jazzed,
Are you still using a 386-based PC?
There is something very sick about your computer or your virtualbox. fft_bench completes in less than 100us on my PC running Debian. I have not tried under virtualbox yet but I would expect it to be about the same.
Run times will vary under a multi-user multi-tasking OS depending on how things get scheduled.
I'm not sure if that will change your results, but it may. It didn't here, but my host OS (Scientific Linux 6) is configured to use hpet, which must influence Vbox to some degree. My results either way were a range of 157-224ms, with the majority around 160ms or so.
Jazzed, sorry about that. As Heater mentioned, you can add it to the kernel boot parameters at boot time. To make it permanent, you can edit your grub config file. I can't tell you where to find it on Ubuntu, though.
It's almost universally in /boot/grub/menu.lst, or very nearby. Look for 'kernel param=value param=value param=value' (with different params and values; the first one is typically 'root=<some_path>').
(In case of lilo instead of grub it's in /etc/lilo.conf)
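To make that concrete, a legacy-grub menu.lst entry with the clock-source parameter appended might look like the sketch below. The kernel image name, root device, and initrd path are purely illustrative; only the clocksource=hpet parameter comes from this thread.

```
title   Debian GNU/Linux
root    (hd0,0)
kernel  /boot/vmlinuz root=/dev/sda1 ro quiet clocksource=hpet
initrd  /boot/initrd.img
```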
Gosh, propgcc is steaming ahead.
That Dhrystone result puts the Prop way ahead of any 68xxx machine. With overclocking to 100MHz it's up there with the CRAY-X-MP/12.
Have to find my old ZOG results somewhere...
Interesting - perhaps a bit of a cheat not using floating point or standard time functions, but hey - who's counting?
With my new optimizations, I think Catalina may be competitive with gcc on code size, but Catalina is definitely still slower (more work to do!).
Of course, I can't tell how much of your total upload size (11928 bytes) is program code - can you give us the actual C program segment sizes?
Not much cheating going on really. The Dhrystone guts do not use floating point math. Only the timing calculation does, so it would not make much difference. There used to be a lot of non-standard timing functions in use back in the day. Code size would be interesting to know.
That's what I love about having a standard language and a standard benchmark - everybody is free to modify either of them until they get the results they want.
I do agree in this case the benchmark results should not be affected much by the inclusion or exclusion of floating point - however, the calculation of the actual result does become slightly less accurate without it.
But more importantly - Catalina's code size for this benchmark changes by a significant amount depending on whether I include or exclude floating point, so it is relevant to know if we are comparing apples with apples before reading too much into the actual numbers.
On the Propeller (with its very limited RAM space) it is a constant battle to decide on the right tradeoff between code size and performance, and obviously Catalina and GCC will have made different decisions here - I'm more interested in understanding this than any individual benchmark result, since I'm fairly confident that (at the end of the day) GCC will win some and Catalina will win others.
With my new optimizations, I think Catalina may be competitive with gcc on code size, but Catalina is definitely still slower (more work to do!).
Most likely propgcc Dhrystones per second will drop from 6127 to 6112 so you won't have to sweat as much. Final code size for propgcc Dhrystone 2.2 is to be determined but the generated size dropped a little too so it's a bit of a wash.
Comparing code size on programs with significant library usage ends up comparing libraries as much as compiler output, so it's rather like trying to nail jello to the wall. What we'd need for comparing code size of different compilers/languages would be some application that doesn't use standard libraries -- something like the size of the smallest program to toggle a specific I/O pin.
Timing results can also depend on libraries, but not as severely (depending on the benchmark, of course).
Bah!
The original Dhrystone was written in Ada I believe. The original Whetstone in Fortran. How are we to make any comparisons when the language, the libraries and the entire run time change?
At the bottom of it all is the actual work that the benchmark performs and how fast you can do it.
Compilers have been known to observe that no actual useful result is produced by a benchmark and optimize the whole thing away! So there is some kind of "spirit" of the benchmark, bending it to work without actually being fraudulent. I believe propgcc Dhrystone is in that spirit.
P.S. I guess no decision or trade off has been made wrt floating point in propgcc and Dhrystone. It's just that at this early stage they don't have a float implementation.
At the bottom of it all is the actual work that the bench mark performs and how fast you can do it.
I think you're missing the point. Performance is not the only result you get from benchmarks like these - code size is another. Especially on the Propeller where we have to be very mindful of RAM usage because the total memory space is so limited. What use is fantastic performance at the cost of code sizes so large that no useful programs can be written? (Note: I'm emphatically not saying that is the case here - I am merely saying that we don't really know yet, since some of the numbers have not been explained).
We have looked at code size in every other benchmarking thread we've had here (e.g. Zog vs Catalina), and it's generally been very interesting (and sometimes surprising!). But with gcc we can't do that yet, which makes the otherwise impressive results a bit difficult to properly assess. But you're right that it is early days yet. The performance may change as the team implements more of the language and the inevitable compromises have to be made.
Comments
About the Debian test... you might try it with the following modifications...
Force the VM to use the high precision event timer for the clock source...
Force Debian to do the same with the boot parameter...
You can see which clock source is being used before the mods by running...
If anybody wants more info on timers in Linux, see chapter 6 of Understanding the Linux Kernel.
Using vboxmanage ... keeps run time at 150ms or less almost all the time.
I couldn't figure this one out though: clocksource=hpet
Any clues? GRUB maybe? init.d script?
That's where it is on my system, but it's using grub 0.97. Grub2 systems have a different configuration.
I guess we just have to wait a bit longer.
Ross.