Data Types

Integral Types

An integral data type is a type that is fundamentally an integer; that is, it has no fractional portion. Integral types come in different sizes. There are 6 different integer types (8 including char), and their sizes depend on the computer. Most of the computers and compilers we use run 64-bit software. This is a diagram of the relative sizes:
Relative size of data types (typical 64-bit computer: LP64)

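Note that the C Standard does not specify the sizes of any of the types except char, which is always 1 byte. It only sets minimums that the types must support: short and int must be at least 16 bits wide and long at least 32 bits, so the sizes shown above are typical for LP64 rather than required.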

This table shows the range of values for the integral types on a 64-bit (LP64) computer:

Type                Bytes  Also called                     Range of values (Binary)  Range of values (Decimal)
signed char         1      char (compiler-dependent)       -2^7 to 2^7 - 1           -128 to 127
unsigned char       1      char (compiler-dependent)       0 to 2^8 - 1              0 to 255
signed short int    2      short, short int, signed short  -2^15 to 2^15 - 1         -32,768 to 32,767
unsigned short int  2      unsigned short                  0 to 2^16 - 1             0 to 65,535
signed int          4      int, signed                     -2^31 to 2^31 - 1         -2,147,483,648 to 2,147,483,647
unsigned int        4      unsigned                        0 to 2^32 - 1             0 to 4,294,967,295
signed long int     8      long, long int, signed long     -2^63 to 2^63 - 1         -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
unsigned long int   8      unsigned long                   0 to 2^64 - 1             0 to 18,446,744,073,709,551,615

This table shows the sizes of long integers used in Microsoft's compilers (LLP64). (The rest of the world uses 8 bytes as above.)

Type                Bytes  Also called                  Range of values (Binary)  Range of values (Decimal)
signed long int     4      long, long int, signed long  -2^31 to 2^31 - 1         -2,147,483,648 to 2,147,483,647
unsigned long int   4      unsigned long                0 to 2^32 - 1             0 to 4,294,967,295

This table includes the binary values:

Type            Binary Range                                                          Decimal Range
signed char     10000000 to 01111111                                                  -128 to 127
unsigned char   00000000 to 11111111                                                  0 to 255
signed short    1000000000000000 to 0111111111111111                                  -32,768 to 32,767
unsigned short  0000000000000000 to 1111111111111111                                  0 to 65,535
signed int      10000000000000000000000000000000 to 01111111111111111111111111111111  -2,147,483,648 to 2,147,483,647
unsigned int    00000000000000000000000000000000 to 11111111111111111111111111111111  0 to 4,294,967,295
signed long     1 followed by 63 zeros to 0 followed by 63 ones                       -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
unsigned long   64 zeros to 64 ones                                                   0 to 18,446,744,073,709,551,615

A signed integer that is 32 bits wide can store values in the range -2,147,483,648 to 2,147,483,647.

A signed char that is 8 bits wide can store values in the range -128 to 127.

What happens when you try to store a value that is too large for the data type? With unsigned values, it just "wraps" back around to 0. Think of the bits being sort of like an odometer on a car. Once the odometer gets to 999999, it will "wrap" back around to 0. So, adding 1 to an unsigned char with a value of 255 gives 0, because the carry into the ninth bit does not fit in 8 bits and is discarded:

 11111111
+       1
---------
100000000
With signed numbers, the result is undefined. It could be anything and do anything, including crashing the program. It's up to the particular compiler. The GNU gcc compiler does a sort of "wrapping" itself. The difference is that instead of going from the largest positive value back to 0, the bits go from the largest positive value to the smallest negative value. (e.g. 127 + 1 is -128).

Overflow example

64-bit data models

Here's a sample.

Why did Microsoft choose the LLP64 model instead of LP64 like everyone else?


Literal Constants

We know that a literal constant like 42 is an int and that a literal constant like 42.0 is a double. Don't forget that when you read or write shorts and longs with scanf and printf, you need a length modifier in the conversion specifier: h for short and l (lowercase L) for long.


Usually, we write literal integral values using decimal (base 10) notation. C provides two other forms: octal (base 8) and hexadecimal (base 16).

Floating Point Types

Unlike the integral types, floating point types are not divided into signed and unsigned. All floating point types are signed only. Floating point numbers follow the IEEE-754 Floating Point Standard. Here's more information about floating point numbers than you'll probably ever need. Here are the approximate ranges of the IEEE-754 floating point numbers on Intel x86 computers:
Type         Size  Smallest Positive Value  Largest Positive Value  Precision
float        4     1.1754 x 10^-38          3.4028 x 10^38          6 digits
double       8     2.2250 x 10^-308         1.7976 x 10^308         15 digits
long double  10*   3.3621 x 10^-4932        1.1897 x 10^4932        19 digits
Some floating point constants. These are all of type double:
42.0  42.0e0  42.  4.2e1  4.2E+1  .42e2  420.e-1  42e0  42.E0
To indicate that the type is float, you must append the letter f or F:
42.0f  42.0e0f  42.F  4.2e1F  etc...
To indicate that the type is long double, you must append the letter l (lowercase 'L') or L:
42.0L  42.0e0L  42.l  4.2e1l  etc...
In practice, NEVER use the lowercase L (which looks very similar to the number one: 1), as it will certainly cause confusion. (See above.)

*Here are the sizes of floating point numbers on various C compilers under 32-bit:

                GNU gcc  Borland  Microsoft
sizeof(42.0)    8        8        8
sizeof(42.0F)   4        4        4
sizeof(42.0L)   12       10       8
Under 64-bit, all of the sizes are the same except long double with gcc, which is 16 bytes.

Partial float.h listing


See this refresher on IEEE-754 notation for more information.

The typedef Keyword

Suppose we want to add a boolean type to C. (C89 doesn't have one, so we typically use int in place of a boolean.) We've already done it using #define.

To declare a variable, we simply do this:

int a;              /* Create an integer named a               */
unsigned char b;    /* Create an unsigned char named b         */
short int c;        /* Create a short integer named c          */
float d;            /* Create a float named d                  */
unsigned char *e;   /* Create an unsigned char pointer named e */

These cause the compiler to allocate space for each variable, based on its type.

If we want to create a new type (instead of a new variable), we add the typedef keyword:

typedef int a;              /* Create a new type named a */
typedef unsigned char b;    /* Create a new type named b */
typedef short int c;        /* Create a new type named c */
typedef float d;            /* Create a new type named d */
typedef unsigned char *e;   /* Create a new type named e */
You can think of these type definitions as aliases for other types. To create a new variable of type a:
a i; /* Create an 'a' variable named i */
b j; /* Create a 'b' variable named j  */
Of course, this makes no sense whatsoever. For any real use, you need to give the typedefs meaningful names. Compare to #define:
  /* Create new types using typedef */
typedef int Bool;
typedef unsigned char BYTE;
typedef short int FAST_INT;
typedef float CURRENCY;
typedef unsigned char * PCHAR;
  /* Create new types using #define */
#define Bool int
#define BYTE unsigned char
#define FAST_INT short int
#define CURRENCY float
#define PCHAR unsigned char *
Examples:
Bool playing, paused;   /* Booleans for a DVD player    */
BYTE next, previous;    /* For scanning bytes in memory */
CURRENCY tax, discount; /* To calculate total price     */
PCHAR inbuf, outbuf;    /* To manipulate strings        */
Summary: