We often advise against using String on 8 bit Arduino due to low memory...
... I was inspired to post this as the result of another discussion over on r/embedded about the same general topic: It is true that it is not recommended to use Malloc in embedded?.
The answer, as per most "computer things", is "it depends":
- For smaller memory systems, the answer is "yes" but it depends somewhat on how you go about it and whether you understand what is going on or not.
- For larger memory systems, the risks are lower, so it is usually fine, but you still need to know what you are doing. For mission critical systems, especially things that need to run reliably for long periods of time (e.g. years), "Yes, it is true that it is not recommended to use malloc".
Dynamic memory and String
The problem with String is that it dynamically allocates memory. What that means is that when you create one, it doesn't actually have any memory allocated to it. This is useful when, for example, you don't know how big the string needs to be in advance.
So, it has to go and find some from an unused area of memory known as the heap.
The bottom line is that this is done via a function called malloc().
The heap
The heap is what is remaining after the variables that your program defines have been allocated to memory.
If you have verbose turned on, you will see a message telling you about this during the upload process. It will be a line like this one:
Global variables use 955 bytes (46%) of dynamic memory, leaving 1093 bytes for local variables. Maximum is 2048 bytes.
So, in that example, the heap can theoretically have access to 1,093 bytes of memory.
I did say "theoretically", because in most systems there is another critical structure also trying to use that unallocated memory, which I will describe in the next section, "The stack".
As mentioned above, the heap is really useful when you don't know how "big" things will be in advance - for example, how many characters a user might enter into the Serial monitor in one go.
The heap grows upwards to higher memory locations. It typically starts from the first location in memory after all of the global variables defined in your code.
The stack
In the previous section "The heap", I gave an example and said "...theoretically ... the heap can use all of that unallocated memory". But, there is another critically important structure that is also trying to use that "unallocated" or "unused" memory. This structure is known as the stack.
The stack is an important structure because it tracks and manages the operation of your program. For example, if your code calls a function (e.g. pinMode, or digitalWrite, or Serial.println etc), the stack is used to keep track of where to return to when that function finishes.
For example consider this code:
    void setup () {             // Line #1
      pinMode (2, OUTPUT);      // Line #2
      pinMode (3, OUTPUT);      // Line #3
      pinMode (4, OUTPUT);      // Line #4
    }                           // Line #5
At line #2, when pinMode is called, an entry is made on the stack that basically says, "when you are done, return to line #3". It isn't quite as simple as that, but basically that is what happens. Similarly, when line #3 calls pinMode, a new entry is made on the stack that says "when done, return to line #4", and so on.
Also, after line #4's pinMode has finished, we will be at line #5, which is the end of the setup function. Prior to setup being called (which is done by another "hidden" function called main), an entry will be placed on the stack that says where to return to when setup is finished. This is what happens at line #5: the CPU will look at the stack, obtain the "when you are done, return to X" entry and do that.
The stack is used for other important stuff, but hopefully you can see from that that the stack is crucially important for the correct operation of your code.
The stack typically starts at the top of memory and grows downwards to lower memory locations. On an ATMega328P (Uno R3), this means the stack starts from the 2KB mark and grows down towards 0.
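To make the "return to line X" bookkeeping a little more concrete, here is a toy model of it. This is only a sketch: ToyCallStack and traceSetup are made-up names for illustration, and a real stack holds raw return addresses (plus saved registers and local variables) rather than line numbers.

```cpp
#include <stddef.h>

// Toy model of the return-address part of a call stack.
// Each entry is simply "the line to return to" (an illustration only).
class ToyCallStack {
public:
    // A function is called: remember where to come back to.
    bool call(int returnLine) {
        if (depth_ == kMaxDepth) return false;  // the toy stack is full
        lines_[depth_++] = returnLine;
        if (depth_ > deepest_) deepest_ = depth_;
        return true;
    }

    // A function finishes: pop the most recent entry (-1 if there is none).
    int ret() { return depth_ > 0 ? lines_[--depth_] : -1; }

    size_t depth() const { return depth_; }
    size_t deepest() const { return deepest_; }

private:
    static constexpr size_t kMaxDepth = 8;
    int lines_[kMaxDepth] = {};
    size_t depth_ = 0;
    size_t deepest_ = 0;
};

// Walks through the setup() example and reports the deepest nesting reached.
size_t traceSetup() {
    ToyCallStack stack;
    stack.call(0);                  // main calls setup; 0 stands for "back in main"
    for (int line = 2; line <= 4; ++line) {
        stack.call(line + 1);       // pinMode called at `line`: return to the next line
        stack.ret();                // pinMode finishes and that entry is removed
    }
    stack.ret();                    // line #5: setup returns to main
    return stack.deepest();
}
```

In this model the stack never gets deeper than two entries (main to setup, setup to pinMode), because each pinMode entry is removed before the next one is added. Deeply nested or recursive calls are what make a real stack grow.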
The three main memory structures.
Following is a diagram of the three main memory structures and how they are typically organised:
- Global variables (red)
- Unused memory (blue) consisting of:
  - Heap for dynamic memory allocation
  - Stack for managing the program's smooth operation
The Risk
If the stack and/or the heap grow out of control or even simply too much, then there is a potential problem. Can you see what it is? Hint: the arrows in the above diagram pretty much say it all.
Answer: >! If the arrows meet, then the heap, the stack or both will get corrupted. That means the "smooth operation" of the program is likely to be over and the dreaded "undesirable random behaviour" will be the result. !<
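As a rough worked example of that risk, the arithmetic can be sketched using the figures from the upload message quoted earlier (2048 bytes of RAM, 955 bytes of globals). The function names are made up for illustration, and the model is simplified: it ignores any bookkeeping overhead the allocator adds to each block.

```cpp
// Figures taken from the example upload message (Uno R3 / ATmega328P).
constexpr int kRamBytes = 2048;
constexpr int kGlobalBytes = 955;

// Bytes left between the top of the heap and the bottom of the stack.
// A negative result means the two regions have run into each other.
constexpr int gapBytes(int heapBytes, int stackBytes) {
    return kRamBytes - kGlobalBytes - heapBytes - stackBytes;
}

// True once the heap and the stack overlap.
constexpr bool hasCollided(int heapBytes, int stackBytes) {
    return gapBytes(heapBytes, stackBytes) < 0;
}
```

With nothing allocated and an empty stack, gapBytes(0, 0) gives the 1093 bytes from the message. A heap of 600 bytes plus a stack of 494 bytes is already one byte too many.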
Fragmentation
The way dynamic memory allocation works is that it (malloc) looks for some unused memory of a size specified by the programmer on the heap. If it finds it, it returns a pointer to that memory and you can then use it. If it doesn't find any, you will be informed and should take that into consideration.
When you are done using the memory, you should release it. This is done by the free call. By releasing it the memory can be returned to the "free list" (a list of unused memory chunks) for subsequent reuse by malloc.
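A minimal sketch of that allocate, check, use, release cycle might look like the following. The helper name duplicateText is made up for illustration. The "you will be informed" part is malloc returning a null pointer when it cannot find a big enough block.

```cpp
#include <stdlib.h>
#include <string.h>

// Copies `text` into a freshly allocated block on the heap.
// Returns nullptr if malloc could not find a large enough block.
char *duplicateText(const char *text) {
    size_t bytes = strlen(text) + 1;              // +1 for the terminating '\0'
    char *copy = static_cast<char *>(malloc(bytes));
    if (copy == nullptr) {
        return nullptr;                           // out of memory: caller must cope
    }
    memcpy(copy, text, bytes);
    return copy;
}
```

The caller owns the returned block and should hand it back with free(copy) once it is finished with it. Forgetting the null check or the free are the two classic mistakes.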
As indicated above, this is a really nice feature when you may be presented with an "unknown input size" that you need to take into account with things like String - which handles it quite nicely.
An aside
Some may say, "but Strings don't use malloc, they (internally) use new". Under the covers, new relies on malloc to get its memory. Either way, the String class in the Arduino HAL (for 8 bit systems) uses malloc to allocate memory when managing the buffer.
Back to Fragmentation
Now, think about what happens if we allocate a String of say 1 character, then append 1 character, followed by another 1 character. This is exactly what happens inside of functions like Serial.readString. This function is defined as follows (I added the comments):
    String Stream::readString()
    {
      String ret;
      int c = timedRead();    // Get a character from the Source.
      while (c >= 0)          // If there is one...
      {
        ret += (char)c;       // Append the 1 character to our string
        c = timedRead();      // Try to get another one if there are any
      }
      return ret;             // Return the string (which could be an empty string if no characters were read).
    }
From the above, it is hopefully clear that the line ret += (char)c; appends characters one by one as they are read from an input source such as the Serial monitor.
Again, think about what happens here:
- Initially we start with an empty string.
- We add a character to it. If you look at String, it will say "Oh, I only have 0 characters in my string, so I need to allocate more". It is a bit convoluted to follow through the String code, but it does this via a function called changeBuffer.
The changeBuffer function looks like this:
    unsigned char String::changeBuffer(unsigned int maxStrLen) {
      char *newbuffer = (char *)realloc(buffer, maxStrLen + 1);
      if (newbuffer) {
        buffer = newbuffer;
        capacity = maxStrLen;
        return 1;
      }
      return 0;
    }
Note the realloc call? That basically is a "two-fer". If it can extend the current block it does that and it is done.
If it cannot do that (and this is the important bit), it mallocs a new chunk of memory of the requested size (if it can) and frees the specified one (as specified by "buffer"). It will also copy the old contents to the new location.
Note that if realloc cannot simply increase the size of the current chunk of memory it must allocate a new one, copy the old one across then finally release the old one. That means that for a short time two copies of the buffer will exist.
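To get a feel for how often this happens, here is a small sketch that counts the realloc calls made when a buffer grows one character at a time. GrowBuffer and its helper functions are made up for illustration and are only loosely modelled on the changeBuffer pattern above. This is not the Arduino String class.

```cpp
#include <stdlib.h>

// Hypothetical minimal growable text buffer (illustration only).
struct GrowBuffer {
    char *data = nullptr;
    size_t capacity = 0;        // usable characters, excluding the '\0'
    size_t length = 0;
    unsigned reallocCalls = 0;  // how many times we had to ask for memory
};

// Make room for at least `chars` characters. On failure the old buffer is
// left untouched, because realloc does not free it when it returns nullptr.
bool reserveChars(GrowBuffer &buf, size_t chars) {
    if (buf.data != nullptr && chars <= buf.capacity) return true;
    char *grown = static_cast<char *>(realloc(buf.data, chars + 1));
    ++buf.reallocCalls;
    if (grown == nullptr) return false;
    buf.data = grown;
    buf.capacity = chars;
    return true;
}

// Append one character, growing to exactly the size needed when full.
bool appendChar(GrowBuffer &buf, char c) {
    if (!reserveChars(buf, buf.length + 1)) return false;
    buf.data[buf.length++] = c;
    buf.data[buf.length] = '\0';
    return true;
}

// Hand the memory back and reset the bookkeeping.
void releaseBuffer(GrowBuffer &buf) {
    free(buf.data);
    buf = GrowBuffer{};
}
```

Appending 16 characters one at a time to an empty GrowBuffer makes 16 realloc calls, and each one is a chance for the buffer to be moved. Calling reserveChars(buf, 16) first brings that down to a single call. The Arduino String class has a reserve() member for the same purpose, which helps when you can estimate the maximum length in advance.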
Even worse, a problem known as fragmentation can crop up. Fragmentation is the phenomenon where "holes" of unusable memory can pop into existence.
In the case where only one String exists, fragmentation is unlikely as realloc will simply increase the memory allocated.
But what happens if the String is initialised with one character (stored on the heap) and then something else calls malloc (or uses new) to allocate something else on the heap? I will refer to this as the "intruder". To avoid leaving space on the heap unused, the intruder will be placed immediately following the String's allocation. Let's assume the intruder also starts with just one byte.
So we have this in memory:
- String (1 byte)
- Intruder (1 byte).
But what happens if another character is received and appended to the String? The String needs to be expanded, but there is now no room for it to grow in place. So, a new one will be allocated following the "intruder" and the old one is released, leaving the first byte unused.
Now, let's say the "intruder" also needs to expand. It cannot grow in place because the expanded String now sits immediately after it. So, a new one will be allocated following the expanded String and, after copying, the old one will be released. The unused space is now the one byte from the initial String plus the one byte from the initial intruder. Now our memory will look like this:
- Unused (from the initial string and initial intruder - 1 byte each = 2 bytes).
- Expanded String (2 bytes)
- Expanded Intruder (2 bytes).
So we have a little hole of 2 bytes at the top of that list (the start of the heap), then our 2 structures.
Now, what happens if the String needs to accept another character? It will need to expand to 3 bytes. It cannot grow in place because the Intruder follows it, and 3 bytes won't fit in the 2 byte hole at the start of the heap either. So realloc will put our new 3 byte String after the Intruder, copy the contents across and free up the previously allocated memory. Now our memory will look like this:
- Unused (from above, 2 bytes plus the just released string 2 bytes = 4 bytes).
- Expanded Intruder (2 bytes).
- Twice Expanded String (3 bytes).
Note that we have a growing "hole" at the start of the heap?
Fortunately, malloc (and realloc) are smart enough that if the Intruder is expanded again by 1 more byte, it will fit into the hole and that space will be reused. But there will be a new hole between the 3 byte intruder and the 3 byte string.
So, as these things are resized this "dance" will continue and these holes of unused memory can start to pop up throughout the heap (not to mention the temporary need to maintain two copies of the structure if the buffer needs to be reallocated in a new location).
Also, the above only used two dynamic objects. The "challenge" is exacerbated as more and more dynamically allocated objects are used.
The result: the heap can grow to such an extent that it "collides with the stack", or it can grow so that it is "close to the stack" and a few more function calls cause the stack to collide with the heap. Either way the result is a collision. Some damage will occur to the contents of the stack, the heap or both, and things will start going off the rails in a random, unpredictable and often confusing manner that can be difficult to resolve.
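The String and Intruder "dance" can be replayed with a small desktop simulation. This is a sketch under simplifying assumptions and not how the real allocator is implemented: ToyHeap is a made-up class, it uses first fit, it tracks one character per byte and it ignores the bookkeeping bytes a real malloc adds to every block. It is meant to be run on a PC, not on the Arduino.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Toy heap: one character per byte. '.' means free, any other character is
// the tag of the block that owns that byte. '#' is reserved for internal use.
class ToyHeap {
public:
    explicit ToyHeap(std::size_t size) : bytes_(size, '.') {}

    // First fit: claim the first run of `count` free bytes for `tag`.
    bool allocate(char tag, std::size_t count) {
        if (count == 0) return false;
        std::size_t run = 0;
        for (std::size_t i = 0; i < bytes_.size(); ++i) {
            run = (bytes_[i] == '.') ? run + 1 : 0;
            if (run == count) {
                bytes_.replace(i + 1 - count, count, count, tag);
                return true;
            }
        }
        return false;
    }

    // realloc-style growth: extend in place if the bytes after the block are
    // free, otherwise allocate a new block first and then release the old one.
    bool grow(char tag, std::size_t newCount) {
        std::size_t start = bytes_.find(tag);
        if (start == std::string::npos) return false;
        std::size_t end = bytes_.find_first_not_of(tag, start);
        if (end == std::string::npos) end = bytes_.size();
        std::size_t oldCount = end - start;
        if (newCount <= oldCount) return true;

        std::size_t need = newCount - oldCount;
        std::size_t freeEnd = bytes_.find_first_not_of('.', end);
        if (freeEnd == std::string::npos) freeEnd = bytes_.size();
        if (freeEnd - end >= need) {              // room to extend in place
            bytes_.replace(end, need, need, tag);
            return true;
        }
        if (!allocate('#', newCount)) return false;  // old block is still held here
        bytes_.replace(start, oldCount, oldCount, '.');
        std::replace(bytes_.begin(), bytes_.end(), '#', tag);
        return true;
    }

    const std::string &layout() const { return bytes_; }

private:
    std::string bytes_;
};

// Replays the steps described above on a 12 byte heap.
std::string runDance() {
    ToyHeap heap(12);
    heap.allocate('S', 1);   // S...........
    heap.allocate('I', 1);   // SI..........
    heap.grow('S', 2);       // .ISS........
    heap.grow('I', 2);       // ..SSII......
    heap.grow('S', 3);       // ....IISSS...
    heap.grow('I', 3);       // III...SSS...
    return heap.layout();
}
```

The final layout has only 6 bytes in use, but the heap already reaches 9 bytes into memory, with a 3 byte hole stranded between the two blocks.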
In summary (TLDR)
The above is quite involved and quite technical. It also only skims the surface.
As a general rule, for small memory systems we generally do not recommend using dynamic memory (such as String, new or malloc and related functions) unless you know what you are doing.
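For something like reading a line of text, one common alternative is a fixed size buffer whose storage is decided at compile time. The following is a minimal sketch of that idea, not a drop-in replacement for String. LineBuffer is a made-up name, and it simply drops any input that does not fit.

```cpp
#include <stddef.h>

// Hypothetical fixed-capacity line buffer (illustration only). All of the
// storage lives inside the object, so nothing is taken from the heap.
template <size_t Capacity>
class LineBuffer {
public:
    // Feed one received character. Returns true once a newline arrives,
    // meaning text() now holds a complete line.
    bool push(char c) {
        if (c == '\n') return true;
        if (length_ < Capacity) {
            text_[length_++] = c;
            text_[length_] = '\0';   // keep the text terminated at all times
        } else {
            truncated_ = true;       // input was longer than the buffer
        }
        return false;
    }

    const char *text() const { return text_; }
    size_t length() const { return length_; }
    bool truncated() const { return truncated_; }

    // Get ready for the next line.
    void clear() {
        length_ = 0;
        truncated_ = false;
        text_[0] = '\0';
    }

private:
    char text_[Capacity + 1] = {};   // +1 for the terminating '\0'
    size_t length_ = 0;
    bool truncated_ = false;
};
```

In a sketch, a global such as LineBuffer<32> line; would be counted in the "Global variables use ..." figure at upload time, so its cost is known up front. In loop() you would feed it from Serial.read() while Serial.available() is greater than zero, act on text() when push returns true, then call clear(). The trade-off is that you have to pick a maximum length and decide what to do with input that exceeds it.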
Here is the link to the post that inspired me to create this post: It is true that it is not recommended to use Malloc in embedded?.
The discussion goes a bit deeper for those who may be interested in it.