Wednesday, February 01, 2012

Arduino Data Types

(Update 2013-01-23: this article is about the Arduino Uno and similar boards that use the eight-bit Atmel AVR ATmega microcontrollers. If you're looking for the Arduino Due that uses the thirty-two-bit ARM Cortex-M3 microcontroller, see Arduino Due Data Types.)

I think for every system I've ever worked on that had a C compiler, going all the way back to my PDP-11 days, I eventually wrote this program or its equivalent. I have had some surprises. Here it is for Arduino, as part of my Amigo project.


#include <stdint.h>
#include <avr/pgmspace.h>

void setup() {
  Serial.begin(9600);
}

void loop() {
  Serial.print("sizeof(byte)="); Serial.println(sizeof(byte));
  Serial.print("sizeof(char)="); Serial.println(sizeof(char));
  Serial.print("sizeof(short)="); Serial.println(sizeof(short));
  Serial.print("sizeof(int)="); Serial.println(sizeof(int));
  Serial.print("sizeof(long)="); Serial.println(sizeof(long));
  Serial.print("sizeof(long long)="); Serial.println(sizeof(long long));
  Serial.print("sizeof(bool)="); Serial.println(sizeof(bool));
  Serial.print("sizeof(boolean)="); Serial.println(sizeof(boolean));
  Serial.print("sizeof(float)="); Serial.println(sizeof(float));
  Serial.print("sizeof(double)="); Serial.println(sizeof(double));
  Serial.print("sizeof(int8_t)="); Serial.println(sizeof(int8_t));
  Serial.print("sizeof(int16_t)="); Serial.println(sizeof(int16_t));
  Serial.print("sizeof(int32_t)="); Serial.println(sizeof(int32_t));
  Serial.print("sizeof(int64_t)="); Serial.println(sizeof(int64_t));
  Serial.print("sizeof(uint8_t)="); Serial.println(sizeof(uint8_t));
  Serial.print("sizeof(uint16_t)="); Serial.println(sizeof(uint16_t));
  Serial.print("sizeof(uint32_t)="); Serial.println(sizeof(uint32_t));
  Serial.print("sizeof(uint64_t)="); Serial.println(sizeof(uint64_t));
  Serial.print("sizeof(char*)="); Serial.println(sizeof(char*));
  Serial.print("sizeof(int*)="); Serial.println(sizeof(int*));
  Serial.print("sizeof(long*)="); Serial.println(sizeof(long*));
  Serial.print("sizeof(float*)="); Serial.println(sizeof(float*));
  Serial.print("sizeof(double*)="); Serial.println(sizeof(double*));
  Serial.print("sizeof(void*)="); Serial.println(sizeof(void*));
  Serial.print("sizeof(prog_char)="); Serial.println(sizeof(prog_char));
  Serial.print("sizeof(prog_char*)="); Serial.println(sizeof(prog_char*));
  Serial.println();
  delay(5000);
}

And here is its output for one of my Arduino Uno boards with the ATmega328P microcontroller.


sizeof(byte)=1
sizeof(char)=1
sizeof(short)=2
sizeof(int)=2
sizeof(long)=4
sizeof(long long)=8
sizeof(bool)=1
sizeof(boolean)=1
sizeof(float)=4
sizeof(double)=4
sizeof(int8_t)=1
sizeof(int16_t)=2
sizeof(int32_t)=4
sizeof(int64_t)=8
sizeof(uint8_t)=1
sizeof(uint16_t)=2
sizeof(uint32_t)=4
sizeof(uint64_t)=8
sizeof(char*)=2
sizeof(int*)=2
sizeof(long*)=2
sizeof(float*)=2
sizeof(double*)=2
sizeof(void*)=2
sizeof(prog_char)=1
sizeof(prog_char*)=2

Pointers are sixteen bits. Just like, in fact, on my beloved PDP-11s. Sixteen bits can address at most sixty-four kilobytes of memory, addresses 0x0000 through 0xffff. This now seems outrageously large compared to the two kilobytes of SRAM available on the AVR microcontroller on my Arduino Uno board.
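
To put numbers on that comparison, here is a minimal sketch in standard C++. The names addressable_bytes and kUnoSramBytes are made up for illustration and are not part of any Arduino API. It computes how many byte addresses a pointer of a given size can name, and how many times the Uno's two kilobytes of SRAM would fit into that range.

```cpp
#include <stdint.h>

// Hypothetical constant: the two kilobytes of SRAM on the Uno's AVR.
constexpr uint32_t kUnoSramBytes = 2048;

// Number of distinct byte addresses a pointer of the given size can name.
// Only meaningful for pointer sizes below eight bytes, since the result
// has to fit in a uint64_t.
constexpr uint64_t addressable_bytes(unsigned pointer_bytes) {
    return uint64_t(1) << (8 * pointer_bytes);
}

// Two-byte pointers cover addresses 0x0000 through 0xffff.
static_assert(addressable_bytes(2) == 65536, "16-bit pointers span 64 KB");

// The 16-bit address space is thirty-two times the size of the Uno's SRAM.
static_assert(addressable_bytes(2) / kUnoSramBytes == 32, "64 KB / 2 KB");
```

On the target itself you would pass sizeof(void*) rather than the literal 2.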

Sometimes I miss Java.

Update 2012-02-06

The GCC tool chain for the AVR family has a command line switch -mint8 that can make the int data type eight bits wide; Arduino doesn't use it, but you might encounter it elsewhere in other AVR-based projects.

I strongly recommend using the types defined in stdint.h that have their widths explicitly defined, e.g. int8_t, uint32_t, etc., when such things matter (as they so often do in the embedded realm).
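
As a small illustration of why the explicit widths help, the sketch below assembles a sixteen-bit value from two bytes. The helper name make_word is hypothetical, chosen only for this example. Casting to uint16_t before shifting means the shift is done in a type at least sixteen bits wide, rather than in whatever width plain int happens to have under the compiler options in use.

```cpp
#include <stdint.h>
#include <limits.h>

// The fixed-width types say what they are; these checks simply confirm it.
static_assert(sizeof(int16_t) * CHAR_BIT == 16, "int16_t is sixteen bits");
static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "uint32_t is thirty-two bits");

// Hypothetical helper: combine a high byte and a low byte into one
// sixteen-bit value, for example two bytes read from a sensor register.
constexpr uint16_t make_word(uint8_t high, uint8_t low) {
    return static_cast<uint16_t>((static_cast<uint16_t>(high) << 8) | low);
}

static_assert(make_word(0x12, 0x34) == 0x1234, "high byte lands in bits 8-15");
```

The same function written with plain int arguments and an int return type would compile just as readily, but its behavior would then depend on how wide int is on the target.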

Update 2012-02-20

Here's the same for an Arduino Mega ADK board that uses the ATmega2560 microcontroller. Short answer: it's the same as the Uno.

sizeof(byte)=1
sizeof(char)=1
sizeof(short)=2
sizeof(int)=2
sizeof(long)=4
sizeof(long long)=8
sizeof(bool)=1
sizeof(boolean)=1
sizeof(float)=4
sizeof(double)=4
sizeof(int8_t)=1
sizeof(int16_t)=2
sizeof(int32_t)=4
sizeof(int64_t)=8
sizeof(uint8_t)=1
sizeof(uint16_t)=2
sizeof(uint32_t)=4
sizeof(uint64_t)=8
sizeof(char*)=2
sizeof(int*)=2
sizeof(long*)=2
sizeof(float*)=2
sizeof(double*)=2
sizeof(void*)=2
sizeof(prog_char)=1
sizeof(prog_char*)=2

The ATmega2560 has more input and output pins and more on-board flash (256 kilobytes instead of 32) and more SRAM (8 kilobytes instead of 2) than the ATmega328 used on the Uno. The flash and SRAM increases make the Mega a particularly interesting target for a multitasking real-time operating system (RTOS) where each task needs its own stack and context. Because of its larger program memory, the ATmega2560 requires three bytes instead of two for its program counter (PC), the hardware register that keeps track of the current instruction address.
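
The arithmetic behind that last point can be sketched as follows. The function names are invented for illustration. The one fact assumed beyond the text above is that the AVR program counter addresses flash in sixteen-bit instruction words rather than in bytes, so the number of words is half the number of bytes.

```cpp
#include <stdint.h>

// Smallest number of bits b such that 2^b >= n, for n >= 1.
// Written recursively so it remains a valid C++11 constexpr function.
constexpr unsigned bits_for(uint32_t n, unsigned bits = 0) {
    return ((uint32_t(1) << bits) >= n) ? bits : bits_for(n, bits + 1);
}

// Bits the program counter needs to address every instruction word.
constexpr unsigned pc_bits(uint32_t flash_bytes) {
    return bits_for(flash_bytes / 2);
}

// The same, rounded up to whole bytes.
constexpr unsigned pc_bytes(uint32_t flash_bytes) {
    return (pc_bits(flash_bytes) + 7) / 8;
}

// The UL suffixes matter on the AVR itself, where 256 * 1024 would
// overflow a sixteen-bit int.
static_assert(pc_bits(32UL * 1024UL) == 14, "ATmega328P: 16 K words");
static_assert(pc_bytes(32UL * 1024UL) == 2, "fits in two bytes");
static_assert(pc_bits(256UL * 1024UL) == 17, "ATmega2560: 128 K words");
static_assert(pc_bytes(256UL * 1024UL) == 3, "needs three bytes");
```

Seventeen bits is one more than two bytes can hold, which is why the larger part needs a third byte for its PC even though sizeof(void*) is still 2 in the output above.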

8 comments:

Bob Nye said...

Keep talking like that and we will have to take your Wizard card away! :)

Chip Overclock said...

Hey, Bob, stuff like this is how I got my Wizard card!

Bob Nye said...

Talking about the size of variables gets you a Wizard card. Talk about a yearning for Java and we won't believe you can read a schematic :)

Chip Overclock said...

A Java developer of my acquaintance once remarked (probably to his eventual regret) "I don't want to know how it works under the hood." So there's some truth in what you say. But I do believe that to control costs in development projects you have to have developers work at as high a level of abstraction as possible. This is why we all don't still work in assembler except when it's really necessary.

This is also why I really like Linux/GNU as an embedded platform when possible: it opens the possibility of doing a lot of development, for prototyping or even production, in scripting languages in which the think-type-compile-debug cycle can be very, very short. It also allows you to hire less expensive developers who have no idea what the sizes of variables are but don't need to know for the kind of work that they do.

Java falls into this category. If you can support it on your system, then you would be remiss not to do as much work in it as you can, with faster development iterations and less expensive developers.

Those organizations who insist on doing everything in C or even C++ when cheaper alternatives exist are quickly going to go the way of organizations who want to do everything in assembler on gigahertz ARM processors. Which is to say: they won't exist.

I keep meaning to write an article about this.

Bob Nye said...

I don't disagree in principle. I have done a bit of development in Python as a scripting language and I like it a lot. It lets me develop and test quickly, and it has great libraries that help with the heavy lifting.

I pick on Java in embedded particularly because of the Java developers I have crossed paths (swords?) with. In an embedded environment memory is always fixed (and quite often limited), so you have to pay close attention to memory utilization.

I appreciate what has taken place with Android, where you have a Linux kernel in charge of the hardware and a bunch of Java code running in isolated VMs doing the application work.

Writing the applications in Java is great; the design cycle is very efficient. Since the applications are isolated, they cannot affect the system stability to the point where the OS dies.

The reality is that the MIPS per dollar available, as well as the cost per megabyte of ROM/RAM today, is what allows applications to be written in higher and higher level languages. The amount of library code that a simple program requires to run today compared to what you had available in the PDP-11 days is amazing!

You should pen an article on this subject - look how much traction it has already!!

Chip Overclock said...

I think we're on the same page. Embedded development requires discipline and knowledge of the underlying platform. It doesn't matter whether you're writing code in Python, Java, C++, or C.

Years ago during my Bell Labs days someone in the Labs' CTO organization made the blanket statement that real-time code could not be written in C++. Standing right next to him were me and a colleague who had just spent the past several years writing real-time code in C++ for embedded products that the company was selling. She and I just looked at each other and shrugged. What ya' gonna do?

Ben said...

Hi,

What would the compiler do with something like:

short int var;

To me, each of these is its own data type, and this should result in a compiler error... but it doesn't. Still, what's the resulting size?

In the case of the Arduino Uno, a short and an int are both 2 bytes. BUT, what if they weren't (e.g. on the Arduino Due)? How much would a compiler allocate for a 'short int'?

In short, no pun intended, what's the reasoning behind qualifying one data type with another? (As an example context, TinyBasicPlus uses it a couple times).

Anonymous said...

I got..

4 = sizeof ( long int)
2 = sizeof ( short int)
2 = sizeof ( short )
2 = sizeof ( int)
4 = sizeof ( int long )
4 = sizeof ( double )
4 = sizeof ( long double )