Overloaded Return Values Cause Bugs

Wednesday, October 08, 2008 by Pascal Meunier
in General

In my secure programming class I have denounced the overloading of return values as a bad practice, and recently I discovered a new, concrete example of a bug (possibly a vulnerability) that results from this widespread practice. A search in the NVD and some googling didn’t reveal any mention of similar issues anywhere, but I probably just have missed them (I have trouble imagining that nobody else pointed that out before). In any case it may be worth repeating with this example.

The practice is to return a negative value to indicate an error, whereas a positive value has meaning, e.g., a length. Imagine a function looking like this:

int bad_idea(char *buf, unsigned int size) {
    int length;

    if (<some_error_condition>) {
        length = -ERROR_CODE;
    } else {
        length = size;  // substitute any operations that could overflow the signed int
    }
    return length;
}

This function could return random error values. Under the right conditions this could result in at least a DoS (imagine that this is a security-related function, e.g., for authentication). I suggest using separate channels to return error codes and meaningful values. Doing this lowers complexity in the assignment and meaning of that return value by removing the multiplexing. As a result:

There is an increased complexity in the function prototype, but the decreased ambiguity inside the function is beneficial. When the rest of the code uses unsigned integers, the likelihood of a signed/unsigned integer conversion mistake or an overflow is high. In the above example, the function is also defective because it is unable to process correctly some allowed inputs (because the input is unsigned and the output needs to be able to return the same range), so in reality there is no choice but to decouple the error codes from the semantic results (length). This discrepancy is easier to catch when the ambiguity of the code is decreased.
It does away with the bad practice of keeping the same variable for two uses: assigning error codes and negative values to a “length” is jarring;
It disambiguates the purpose and meaning of checking the “returned” values (I’m including the ones passed by reference, loosely using the word “returned”). Is it to check for an error or is it a semantic check that’s part of the business logic?

Integer overflows are a well-known problem; however in this case they are more a symptom of conflicting requirements. The incompatible constraints of having to return a negative integer for errors and an unsigned integer otherwise are really to blame; the type mismatch (“overflow”) is inevitable given those. My point is that the likelihood of developers getting confused and having bugs in their code, for example not realizing that they have incompatible constraints, is higher when the return value has a dual purpose.

Edit (10/13): It just occurred to me that vsnprintf and snprintf are in trouble, in both BSD and Linux. They return an int, but take an unsigned integer (size_t) as a size input, AND are supposed to return either the number of printed characters or a negative number if there’s an error. Even if the size can be represented as a signed integer, they return the number of characters that *would* have been printed if the size was infinite, so specifying a small size isn’t a fix. So it *might* be possible to make these functions seemingly return errors when none happened.

Note(10/14): I’ve been checking the actual code for snprintf. In FreeBSD, it checks both the passed size and the return value against INT_MAX and appears safe.

Tags for this post:

Leave a comment (3 so far) »

Comments

Posted by Andy Steingruebl
on Monday, October 13, 2008 at 03:47 PM

Don’t you find this to be a general problem with languages that don’t have exception handling built in? The C++/Java/C#, etc. paradigm here would be to throw an exception right? This makes is substantially harder for the caller to not handle/check the error cases.

Is this really a solvable problem in C?

Posted by Pascal Meunier
on Tuesday, October 14, 2008 at 12:43 PM

Andy,
Exception handling does create a handy information channel for errors. However, being in a language that supports exceptions doesn’t prevent someone to ignore that language feature and choose to overload return values with error codes.

Languages without support for exceptions can simply have more return parameters, so exceptions are not the only answer. In Ada, parameters can be clearly separated and identified as being “in” or “out”. For example (from the Lovelace tutorial):

procedure Split(
Source : in Unbounded_String;
First_Word : out Unbounded_String;
Rest : out Unbounded_String)

So it’s not difficult to define functions with many outputs without using exceptions. In “C” you would pass by reference.

The problems are the mentality of doing as much as possible with a few bytes or variables, habit, old insecure tutorials, lack of awareness of integer overflow issues, sloppy integer conversions, ambiguously defined types (is size_t the same as an unsigned int or an unsigned long, 32 or 64 bits?) and backward compatibility. Not everyone sees that a slightly more complex function prototype can lead to fewer bugs by using parameters with an unambiguous purpose. Novice programmers think that we are being clever by being efficient, regardless of complexity. Programmers generally need to be confronted with numerous bugs of their own making before becoming averse to sloppy type definitions, variable reuse, etc… I’d like to add overloaded return values to the list.

I surprise myself by liking Ada more and more as time goes by

Posted by Andy Steingruebl
on Tuesday, October 14, 2008 at 12:47 PM

Its a fair comment that this can be handled by passing references around. There are pluses and minuses to those approaches of course.

That said, I’m a much bigger fan of exceptions than I am of manual error checking by the developer.

And per your point about the standard C library, don’t even get me started about the stupidity inherent in things like strcpy().

Page 1 of 1 pages

Overloaded Return Values Cause Bugs

Comments

Leave a comment

Recent Blogs

Archive