C: Data Type Consistency

idbruce · 2015-04-13 18:44

While working on Teacup, I used most of the following data types: uint8_t, int8_t, uint16_t, int16_t, uint32_t, int32_t, char, and int. As I go through the various examples provided with SimpleIDE and the various library files, I see different, although equivalent, data types in use, such as int, char, unsigned int, etc. Of course these all show up blue in the IDE, and I would imagine these are preferred by other forum members...

So where can I find a copy of the keywords and preferred data types for SimpleIDE?

I would prefer that my code conform to the general usage throughout the forum and be consistent with program examples distributed with SimpleIDE, GCC, and by Parallax.

jmg · 2015-04-13 18:59

idbruce wrote: »

... I used most of the following data types, uint8_t, int8_t, uint16_t, int16_t, uint32_t, int32_t, char, and int.

I think the general trend is to include the bit size in the types, (as done here), as that avoids interpretation issues.

Also check the ASM output, as sometimes the native size on an MCU produces smaller code than a smaller type does, if that smaller type is less naturally supported.

Heater. · 2015-04-13 19:11

idbruce,

The issue is that the size of int is not defined in C. Except that it will be a minimum of 16 bits. Similarly C does not define whether a char is signed or unsigned.

If these things bother you, which they should, use the C99 types you have listed.

On the other hand I find things like "int32_t" to be really ugly sprinkled throughout the code. After all they mix up letters, numbers and "_" in the same way as is recommended for unguessable passwords!

So I will often use int and char in any place where it's not likely to be critical (the "i" on loop counters, general strings for example). And make the assumption my code is never going to make it back to a 16 bit system.

Then again, one can always define one's own types with typedef.

jmg · 2015-04-13 20:24

Heater. wrote: »

On the other hand I find things like "int32_t" to be really ugly sprinkled throughout the code. After all they mix up letters, numbers and "_" in the same way as is recommended for unguessable passwords!

For those averse to the _t names, I've seen some programmers take them down to a truly minimal u8, u32, etc. I can still understand what they used.

DavidZemon · 2015-04-13 20:54

As is evidenced by all the code in PropWare, I too am a fan of int32_t over int.

idbruce · 2015-04-14 02:44

Well guys...

I must admit that your responses surprise me.

Being accustomed to the MS way of doing things, it was a little weird using the uint8_t, int8_t, uint16_t, int16_t, uint32_t, and int32_t data types, but now I am starting to like this method because, as jmg points out, it includes the bit size, and this should avoid misinterpretation.

Okay, then I will keep using these.

Heater. · 2015-04-14 03:08

What are the surprising parts?

mindrobots · 2015-04-14 03:32

I'm glad MS came along to give programmers a "way of doing things" so we're no longer wallowing in a sea of confusion and ambiguity like we were before MS came along!

Heater. · 2015-04-14 04:04

The surprising thing is that the standard names for types are so wrong.

In the world of mathematics, you know, where numbers come from (or at least their definitions) we have:

"natural" numbers: 1, 2, 3, 4, 5, 6, ...

"whole" numbers: 0, 1, 2, 3, 4, 5, 6, ...

"integer" numbers: ..., -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, ...

"rational", or "fractional", numbers: p/q for integer values of p and q.

"irrationals": Those that cannot be written as p/q.

"real" numbers: The rationals and irrationals combined.

So we see that C has it a bit wrong. What C calls an "unsigned int" is actually a "whole". In fact "unsigned int" is a contradiction in terms.

We have float, well, not wrong exactly, but why not call it a "rational" or "rat"?!

"long" and "double" are just silly.

"char", "signed char", "unsigned char" are dumb. A character is not a number. It is neither signed nor unsigned.

What a mess.

idbruce · 2015-04-14 04:09

Heater. wrote: »

What are the surprising parts?

It is surprising to me that everyone basically supports the idea of using uint8_t, int8_t, uint16_t, int16_t, uint32_t, int32_t, as compared to "unsigned this and that". And it also surprises me that I have not found any of these data types being used in obvious Parallax documentation and coding examples.

idbruce · 2015-04-14 04:13

Heater. wrote: »

The surprising thing is that the standard names for types are so wrong.

In the world of mathematics, you know, where numbers come from (or at least their definitions) we have:

"natural" numbers: 1, 2, 3, 4, 5, 6, ...

"whole" numbers: 0, 1, 2, 3, 4, 5, 6, ...

"integer" numbers: ..., -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, ...

"rational", or "fractional", numbers: p/q for integer values of p and q.

"irrationals": Those that cannot be written as p/q.

"real" numbers: The rationals and irrationals combined.

So we see that C has it a bit wrong. What C calls an "unsigned int" is actually a "whole". In fact "unsigned int" is a contradiction in terms.

We have float, well, not wrong exactly, but why not call it a "rational" or "rat"?!

"long" and "double" are just silly.

"char", "signed char", "unsigned char" are dumb. A character is not a number. It is neither signed nor unsigned.

What a mess.

LOL I would have to agree 100%

Heater. · 2015-04-14 04:31

That's it then I'm going to adopt the following:

typedef unsigned char very_small_whole;

typedef unsigned short small_whole;

typedef unsigned whole;

typedef unsigned long big_whole;

typedef unsigned long long very_big_whole;

typedef float rat;

typedef double big_rat;

typedef long double very_big_rat;

davidsaunders · 2015-04-14 04:39

Well, the common simple data types in C are supposed to be:
int: Integer of system-specific size, though always greater than 16 bits.
short: Usually 16 bits, though that is not for sure.
long: At least 32 bits.
char: 8-bit integer (usually unsigned, though not always).
float: Real number represented in IEEE 32-bit floating point form.
double: High-precision real number in IEEE 64-bit floating point form.
unsigned: Modifies any integer data type to be unsigned.

The bit size extensions to the data types that were added with C99 are usually implemented as macros of the above.

idbruce · 2015-04-14 04:46

davidsaunders wrote: »

Well, the common simple data types in C are supposed to be:
int: Integer of system-specific size, though always greater than 16 bits.
short: Usually 16 bits, though that is not for sure.
long: At least 32 bits.
char: 8-bit integer (usually unsigned, though not always).
float: Real number represented in IEEE 32-bit floating point form.
double: High-precision real number in IEEE 64-bit floating point form.
unsigned: Modifies any integer data type to be unsigned.

This is basically what I have seen in Parallax documentation and coding examples, as well as in the libraries.

idbruce · 2015-04-14 04:59

I would really like to see the keyword index for SimpleIDE... and the control code.

In SimpleIDE, the data types that David mentions, int, short, long, char, float, double, and unsigned, all show up as dark blue, whereas uint8_t, int8_t, uint16_t, int16_t, uint32_t, and int32_t all show up as gold. So I am assuming, from a SimpleIDE viewpoint, that int, short, long, char, float, double, and unsigned are the preferred data types.

EDIT: Now considering the beginner's perspective, it could be very confusing to intermingle uint8_t, int8_t, uint16_t, int16_t, uint32_t, and int32_t with int, short, long, char, float, double, and unsigned.

EDIT: So we now have uint8_t, int8_t, uint16_t, int16_t, uint32_t, int32_t, int, short, long, char, float, double, and unsigned in two different colors. Oh my

The gold colored ones must be more powerful

Heater. · 2015-04-14 05:21

Let me clarify and correct that from the standard:

char - Smallest addressable unit of the machine that can contain the basic character set. May be signed or not. That is to say undefined.

short - Signed, at least 16 bits in size. That is to say undefined.

int - Signed, at least 16 bits in size. That is to say undefined.

long - Signed, at least 32 bits in size. That is to say undefined.

long long - Signed, at least 64 bits in size. That is to say undefined.

In summary:

The minimum size for char is 8 bits, the minimum size for short and int is 16 bits, for long it is 32 bits and long long must contain at least 64 bits.

i.e. they are all undefined apart from their minimum size.

Of course my saying "undefined" is a bit harsh. The C language standard leaves them as implementation specific. And the implementations do define them.

kwinn · 2015-04-14 05:25

jmg wrote: »

For those averse to the _t names, I've seen some programmers take them down to a truly minimal
u8 u32 etc - I can still understand what they used.

I like that. Short, clear, and to the point.

Heater. · 2015-04-14 05:26

idbruce,

The problem with the syntax highlighting is that C language keywords will be one colour. Non C language defined things will be another colour. Variables, typedefs etc.

int32_t and friends are not part of the language syntax exactly. You have a point that an exception should perhaps be made for them when syntax highlighting. I bet there is a config file in SimpleIDE where that can be tweaked.

idbruce · 2015-04-14 05:37

I never really took the time to dig that deep into the subject. MFC was always my preferred way of programming, and that kind of hid a lot of the C data type stuff. Of course I had to work with C and C++ independently from MFC, but it was always more difficult. It is actually quite different when you get into programming strictly in C, because there is no MFC to hide all the difficult stuff.

davidsaunders · 2015-04-14 05:42

Heater. wrote: »

Let me clarify and correct that from the standard:

char - Smallest addressable unit of the machine that can contain the basic character set. May be signed or not. That is to say undefined.

short - Signed, at least 16 bits in size. That is to say undefined.

int - Signed, at least 16 bits in size. That is to say undefined.

long - Signed, at least 32 bits in size. That is to say undefined.

long long - Signed, at least 64 bits in size. That is to say undefined.

In summary:

The minimum size for char is 8 bits, the minimum size for short and int is 16 bits, for long it is 32 bits and long long must contain at least 64 bits.

i.e. they are all undefined apart from their minimum size.

Of course my saying "undefined" is a bit harsh. The C language standard leaves them as implementation specific. And the implementations do define them.

Thank you, you worded that much better than I.

DavidZemon · 2015-04-14 07:21

Heater. wrote: »

idbruce,

The problem with the syntax highlighting is that C language keywords will be one colour. Non C language defined things will be another colour. Variables, typedefs etc.

int32_t and friends are not part of the language syntax exactly. You have a point that an exception should perhaps be made for them when syntax highlighting. I bet there is a config file in SimpleIDE where that can be tweaked.

That seems odd to me. Aren't the terms like "int" defined from "int32_t" in the first place (obviously specific definitions are impl. specific)?

Dave Hein · 2015-04-14 07:33

This would be tricky to implement in general. SimpleIDE would have to examine the code, including the header files that are included, to find all the typedefs. It might make sense to highlight all the types defined in stdint.h, but the C libraries contain other types that are defined in the header files, such as FILE and size_t. Maybe there could be a config file that contains a list of types that the user would like to highlight.

DavidZemon · 2015-04-14 07:40

Dave Hein wrote: »

This would be tricky to implement in general. SimpleIDE would have to examine the code, including the header files that are included, to find all the typedefs. It might make sense to highlight all the types defined in stdint.h, but the C libraries contain other types that are defined in the header files, such as FILE and size_t. Maybe there could be a config file that contains a list of types that the user would like to highlight.

Careful now. You might turn it into a full-blown IDE :P

I would think, considering the name of the program, a simple hardcoded list of values would be fine. No need to scan and support user-defined types. It would be really cool... but you're not doing any sort of source code scanning at the moment are you? It would be a huge undertaking. I would think, if you wanted something like that, you'd be better off starting with a real IDE like QtCreator, Code::Blocks, Eclipse, IntelliJ, etc and then simplifying it down. Oh wait... that's OmniaCreator :P

Heater. · 2015-04-14 09:02

It's the other way around. char, int, long, long long are the language defined types. Problem is the language definition leaves their sizes open ended.

The uint8_t etc. C99 types are defined in a header file, "stdint.h", in terms of the language-defined types. So uint8_t is defined as "unsigned char", just in case your machine has signed chars, for example. That header file may well be different on different compilers.

DavidZemon · 2015-04-14 09:09

Heater. wrote: »

It's the other way around. char, int, long, long long are the language defined types. Problem is the language definition leaves their sizes open ended.

The uint8_t etc. C99 types are defined in a header file, "stdint.h", in terms of the language-defined types. So uint8_t is defined as "unsigned char", just in case your machine has signed chars, for example. That header file may well be different on different compilers.

Oh! I'd been reading the typedefs the wrong way around the whole time :P

Heater. · 2015-04-14 09:12

Ha, ha, that has caught me out on occasion!

abecedarian · 2015-04-14 09:50

Explicit definitions, uint16_t and so on, as opposed to the implicit "int", "long" and such, remove architecture-related ambiguities and can make the code more easily portable to other architectures, particularly if you're relying on the data type's properties to provide certain functionality, like an int overflowing from 32767 + 1 to -32768. A 16-bit MSP430 will do this, but what is an int on the P8X32A... 32 bits, right? So would it overflow an int in the same way?

If nothing else, I don't think it hurts being verbose when it makes sense to do so.
