What is your favourite C programming trick? (stackoverflow.com)
137 points by asto on Oct 14, 2011 | 144 comments



Recently, I learned that instead of saying

    long_struct_name *foo = malloc(sizeof(long_struct_name));
you can say

    long_struct_name *foo = malloc(sizeof(*foo));
since the variable's type is already known statically. That saves some typing and, more importantly, guards against bugs from changing one but not the other. I've been meaning to look it up in H&S to make sure it's always safe, but the guy who showed it to me is so strict about safe/standard C that it most likely is.

Most of my favorite tricks actually involve the preprocessor, though. I know it's significantly less expressive than the macro systems in Lisp, Scheme, or OCaml, but C would be a very different language without it, and tasteful CPP usage can ease many of C's pain points.

(My other other favorite C programming trick is knowing Lua, which is excellent for scripting C. :) )


It's always safe, and is also the correct way to write the expression.
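
To spell out why it is safe: sizeof does not evaluate its operand (variable-length arrays aside), so *foo is never dereferenced and the size comes purely from the declared type. A minimal sketch, assuming a made-up struct layout and helper names that are not from the thread:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical struct standing in for long_struct_name. */
struct long_struct_name {
    int id;
    double weight;
    char tag[16];
};

/* sizeof *p uses only the static type of p. The operand is not evaluated,
   so it does not matter that p has no valid value yet. */
struct long_struct_name *make_one(void)
{
    struct long_struct_name *p = malloc(sizeof *p);
    return p; /* NULL if the allocation failed */
}

/* Returns the size the idiom asks for, even though p is NULL here. */
size_t idiom_size(void)
{
    struct long_struct_name *p = NULL;
    return sizeof *p;
}
```

If the type of p changes later, both functions pick up the new size without any other edit.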


> (My other other favorite C programming trick is knowing Lua, which is excellent for scripting C. :) )

I especially love LuaJIT, whose ffi makes it even easier to interface with C than standard Lua does (and it's awesomely fast too). Never mind that Lua is just nice to work in anyway :)


Where would one start learning to script C with Lua?


The best book for Lua by a long shot is Roberto Ierusalimschy's _Programming in Lua_, second ed. (http://www.inf.puc-rio.br/~roberto/pil2/) It covers the core language and the C API with the same clear, erudite treatment as K&R. (He is one of the core Lua authors.)

Once you get the big ideas, there's a very detailed reference online (http://www.lua.org/manual/5.1/), and the mailing list and wiki at http://lua-users.org/ also have a lot of helpful info.

Lua is transitioning from 5.1 to 5.2 right now, which introduces some changes (improvements to the GC, adding to the standard libraries, and improving the package/module system). The main language and C API haven't changed significantly from 5.1; you should be fine if you learn 5.1 now and update later. Lua is small enough that you could add 5.1 to your projects as a library dependency and maintain it yourself, though - it's only about 16,000 lines of code.


Variables are in scope in their own initializers. This is fun when you inadvertently write something like:

    int length = ...;
    ...more code, lose your concentration...
    if(x) {
        int length = length / 2 + 1;
        ...
    }
This neither produces an error nor does what you'd expect, but just ends up being a creative way to initialize the inner length variable with garbage. I've done this more than I care to admit.


If you're using gcc, you can compile with "-Wshadow" to get a warning for this type of problem.

See: http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html


Definitely. Know your warning options! Also, try using multiple compilers. tcc compiles very quickly, clang often has better error messages, etc.

Speaking of variable shadowing: It's usually worth wrapping any preprocessor macros in a "do { ... } while (0)" block unless you deliberately want variable definitions to escape (in which case, token pasting a suffix is usually a good idea).


Isn't "do { ... } while (0)" the same as just "{ ... }"?

I sometimes use "{ ... }" to limit the scope of a variable that is only needed for a small section of code.


Former lets break'ing out, latter doesn't.


do { ... } while (0) makes your macro behave like a statement (i.e. the semicolon is mandatory)


Yes. I don't include the trailing semicolon in macro definitions, because I expect the macro to be followed by them:

    MACRO(foo);
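
A concrete case where the difference matters is an unbraced if/else. This is only a sketch with a made-up SWAP_INT macro and helper function; with a bare "{ ... }" body, the semicolon after the macro call would end the if statement and the else would fail to compile.

```c
/* Hypothetical macro, wrapped so that it expands to exactly one statement. */
#define SWAP_INT(a, b)          \
    do {                        \
        int swap_tmp_ = (a);    \
        (a) = (b);              \
        (b) = swap_tmp_;        \
    } while (0)

/* Puts *a and *b in ascending order; returns 1 if they were swapped. */
int sort_two(int *a, int *b)
{
    int swapped = 1;

    if (*a > *b)
        SWAP_INT(*a, *b);   /* the ';' here ends the do/while statement */
    else
        swapped = 0;        /* still pairs with the 'if' above */
    return swapped;
}
```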


({ ... }) does this too. (Is this GCC-specific?)


That is a gcc extension. The construct as a whole has the value of the last statement executed within. Usually an inline function is preferable since it achieves the same thing using only standard syntax.
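
For illustration, here is roughly what the inline-function route looks like for the usual "max" example. The names are mine, not from any particular codebase; the point is that each argument is evaluated exactly once, which is what people usually want from ({ ... }).

```c
/* Standard C99 alternative to a ({ ... }) macro: arguments are evaluated once. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Shows that side effects in the arguments happen exactly once. */
int demo_max(void)
{
    int i = 1, j = 5;
    int m = max_int(i++, j++);   /* m == 5, then i == 2 and j == 6 */
    return m * 100 + i * 10 + j;
}
```

The trade-off is that the function is tied to one type, whereas the macro form works for any arithmetic type.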


Would be nice if there was one which would warn for this specific case but not shadowing in general, which I occasionally like. (This probably makes me a bad person.)


It certainly doesn't make you a bad person. If lexical scoping wasn't the intent of the language authors, it wouldn't exist. I tend to agree that the warning is more of a hassle than it's worth. Though I guess it can be argued that if your scoping is so deep that you actually need to reuse names, that perhaps some refactoring should be in order.


Seeing a variable shadow something in an outer scope makes me cringe, simply because the decrease in readability far outweighs any benefit it could give. You might remember how things are, but the poor sap who has to keep extra scoping depths in his mind just to maintain your code will curse your name every day.


Also, sizeof is an operator, not a function, so you can write that as

    long_struct_name *foo = malloc(sizeof *foo);


This is true but there is a qualifier here:

If a type name is used, it always needs to be enclosed in parentheses, whereas variable names and expressions can be specified with or without parentheses.

So

    long_struct_name *foo = malloc(sizeof long_struct_name)
won't compile.


To avoid the duplication, with risk of mismatch, and to make it one less thing for the reader to check, I have

    #define NEW(a) ((a) = emalloc(sizeof(*(a)))) 
    NEW(foo);
Similarly with NEW0() and ecalloc(). Yes, the macro uses the parameter more than once, and yes it's another macro for the reader to grasp, but it's a simple one and NEW() is used so widely it's soon learnt.
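
For readers who have not met emalloc(): it is not a standard function, and the comment above does not show it. A minimal sketch, assuming it simply allocates or terminates the program (struct point and point_new are made up for the example):

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed behaviour: return memory or exit. Not a standard library call. */
static void *emalloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL) {
        fprintf(stderr, "out of memory (%lu bytes)\n", (unsigned long)n);
        exit(EXIT_FAILURE);
    }
    return p;
}

/* The macro from the comment: size is taken from what 'a' points to. */
#define NEW(a) ((a) = emalloc(sizeof(*(a))))

struct point { int x, y; };

struct point *point_new(int x, int y)
{
    struct point *p;

    NEW(p);         /* no NULL check needed: emalloc never returns NULL */
    p->x = x;
    p->y = y;
    return p;
}
```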


Indeed this is good style. The following, however, is not good style for allocating an array:

    long_struct_name *fooArray = malloc(count * sizeof * fooArray);


I particularly enjoy showing people the little known "goes to" operator:

  -->
with example code such as this:

  #include <stdio.h>
  int main()
  {
     int x = 10;
     while( x --> 0 ) // read "while x goes to zero"
     {
       printf("%d ", x);
     }
  }
The above code compiles and runs, listing the numbers from 9 to 0.

For more details on this little known operator, I recommend this stack overflow question:

http://stackoverflow.com/questions/1642028/what-is-the-name-...

Always a classic.
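
For anyone left puzzled: there is no "-->" operator. The tokenizer reads "x --> 0" as "x-- > 0", i.e. compare the old value of x with 0, then decrement. A small sketch (the function name and the recording array are mine) that captures what the loop body sees:

```c
/* Records the values the loop body sees; returns how many were recorded. */
int countdown(int start, int *seen, int cap)
{
    int n = 0;
    int x = start;

    while (x-- > 0) {       /* identical to: while (x --> 0) */
        if (n < cap)
            seen[n++] = x;  /* x has already been decremented here */
    }
    return n;
}
```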


It so happens that x is not continuous in 0 - the result is different if you approach 0 from the right, i.e.

  0 <-- x


Reminds me of a trick from Java Puzzlers- it works exactly the same in C:

  #include <stdio.h>
  int main() {
    http://www.google.com
    printf("Hello, World!");
  }
(Of course, if you have syntax highlighting that label followed by a single-line comment is easier to spot.)


You, sir, are evil! That's awesome!


The 4-byte (SMTP-command-style) lookup:

  if(*(u_int32_t *)cmdword == *(u_int32_t *)"EHLO") {
    handle_ehlo();
  }
The "extern inline" idiom for forcing inlining, which I picked up from Mike Stolarchuk.

Passing and assigning trivial structures by value instead of fiddley pointers.

Arena allocators.

Not so much a trick, but: you can safely free() NULL, which saves a conditional. In the same vein: not only is there no point to casting malloc()'s return value, but there are (admittedly rare) circumstances where doing so can be harmful. So save yourself the typing.

assert(!"message") instead of assert(0).


In your first example you must take care not to run afoul of alignment or aliasing rules. This is not always easy.

Using "extern inline" is a bad idea since no two compilers implement it the same way and none according to spec (that I know of). There is no standard way to force inlining of a function.


You're right, if you do the 4-byte lookup trick on a SPARC you'll get a SIGBUS if you do it at arbitrary offsets. It's not always easy to ensure your comparison target is e.g. at the start of your read buffer, or in its own array on the stack, or in the return value from strdup(). It's definitely a "trick", not a proper form.

I also wouldn't do "extern inline" on a random C compiler; it works on clang, gcc, and Sun's compiler, though.


You probably know this already, but if you pass -std=c99 to gcc, you have to have a separate definition for your inlined functions. Here's how I do it; I welcome suggestions for improvement:

module_a.h:

  #ifndef INLINE_
  # define INLINE_ inline
  #endif /* INLINE_ */

  INLINE_ void function_a()
  {
  }
inline_defs.c:

  #define INLINE_
  #include "module_a.h"


Regarding SMTP-command-style lookup: A more portable way to do that is to define a macro to assemble the 4 bytes into unsigned int:

  #define CHAR4_TO_UINT(a,b,c,d)       \
    (                                  \
        ((unsigned int)(a))      |     \
      ( ((unsigned int)(b))<<8  )|     \
      ( ((unsigned int)(c))<<16 )|     \
      ( ((unsigned int)(d))<<24 )      \
    )
then

  unsigned int  ui_cmdword = CHAR4_TO_UINT(
      cmdword[0], cmdword[1], cmdword[2], cmdword[3]
  );
  if( ui_cmdword == CHAR4_TO_UINT('H','E','L','O') )
      handle_helo();
  else if( ui_cmdword == CHAR4_TO_UINT('E','H','L','O') )
      handle_ehlo();
A compiler will generate the memory-fetching code once, fold the right-hand sides of the comparisons into constants, and make everything flow fast and safe. You can even use a switch statement if you like.
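
A sketch of the switch variant, for what it's worth. The macro is renamed and slightly changed from the one above (uint32_t plus an unsigned char cast, so the shifts stay well defined where int is 16 bits or char is signed); the enum and function names are made up for the example.

```c
#include <stdint.h>

/* Same idea as the macro above, but with uint32_t and an unsigned char cast
   so the shifts are well defined even where int is 16 bits or char is signed. */
#define CHAR4_TO_U32(a, b, c, d)                 \
    (  (uint32_t)(unsigned char)(a)              \
    | ((uint32_t)(unsigned char)(b) << 8)        \
    | ((uint32_t)(unsigned char)(c) << 16)       \
    | ((uint32_t)(unsigned char)(d) << 24))

enum smtp_cmd { CMD_UNKNOWN, CMD_HELO, CMD_EHLO };

/* cmdword must point at least 4 readable bytes. */
enum smtp_cmd classify(const char *cmdword)
{
    switch (CHAR4_TO_U32(cmdword[0], cmdword[1], cmdword[2], cmdword[3])) {
    case CHAR4_TO_U32('H', 'E', 'L', 'O'): return CMD_HELO;
    case CHAR4_TO_U32('E', 'H', 'L', 'O'): return CMD_EHLO;
    default:                               return CMD_UNKNOWN;
    }
}
```

The case labels work because the macro applied to character constants is an integer constant expression.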


The first trick is handy but unfortunately it breaks aliasing rules.


No, don't do this. It works on x86, but on architectures with load alignment restrictions it can crash on valid data.
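
If you still want the single-integer comparison, one way to keep it portable is to go through memcpy() instead of a pointer cast. This is a sketch with helper names of my own choosing; memcpy() has no alignment requirement and does not run afoul of the aliasing rules, and optimizing compilers typically reduce it to a plain load where that is legal.

```c
#include <stdint.h>
#include <string.h>

/* Copies 4 bytes into a uint32_t without any alignment assumption. */
static uint32_t load_u32(const char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* cmdword must point at least 4 readable bytes. */
int is_ehlo(const char *cmdword)
{
    return load_u32(cmdword) == load_u32("EHLO");
}
```

A plain memcmp(cmdword, "EHLO", 4) == 0 is simpler still and usually compiles to much the same thing.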


It works fine on SPARC too; you just have to know where the data is coming from.

You're right to point out that people should be cautious about this code. I'm just listing my "favorite tricks". I'm not recommending that people use integer casts as their go-to string comparison.


The string literal is (all but -- honestly I'm not sure what the standard says here) guaranteed to be aligned naturally for the platform. The thing on the left hand side is not. If it's a pointer to a heap block, you're fine. If it's a pointer to a token parsed out of string input, it can be anything.

Even architectures that support misaligned accesses can be configured to trap on them and generate unexpected fatal signals.


Sorry, I wasn't talking about the string literal; I was talking about the buffer you're comparing it to. I'm saying: yes, you're right, you have to be careful that it's aligned.


Natural alignment for string literals is generally 1 byte. Natural alignment for uint32_t is generally 4 bytes.


With possibly-inlined, SSE-optimized strcmp variants that compare 8 or 16 bytes of a string at a time (at least according to Valgrind), how much speed is gained these days by casting and using the standard integer comparison instructions?


Although if you want your C to compile with a C++ compiler, you will need to cast the result of malloc(). C++ is more strongly-typed than C, and the compiler will complain.


You're right, that was an oversight. This is one of the more frustrating decisions in C++; it defies the semantics of a "void pointer", breaks existing code, and provides no real safety (the thing we're enforcing "type safety" over being a simple register-width integer used to express any type in the whole system).

But you do sometimes want to cut-paste code from .c files into .cpp files, and this idiom will make your compiler yell.


There are much more subtle differences that will wreak total havoc without the compiler emitting a single squeak. For example, compare this compiled as C or C++:

  #include <stdio.h>
  
  int foo;
  
  int main(void)
  {
      struct foo {
          int a, b;
      } x;
      printf("%zu\n", sizeof(foo));
      return 0;
  }
One should always write code in the best possible way for the language actually in use, even if this is invalid in some other language with superficially similar syntax. In converting code from C to C++, adding a few pointer casts will be the least of your worries.


One-way implicit conversions for void * are actually a pretty big safety harness. You can still interpret any random pointer as "a register width thingamajig". Interpreting any random void * as a valid pointer to a particular type is obviously dangerous.


I like putting the constants on the left side of the comparison... cuts down on the 'missing one equals sign' errors. This can be a readability issue though.


GCC will warn about using an assignment as a truth value in if(), while(), etc., so it's probably safe enough to stick with whatever order you're familiar with and let GCC alert you if you miss an equals sign.


This is frequently called "Yoda conditionals."


That 4 byte lookup's useful. I needed to do it recently and had assumed it wasn't possible.

If memory serves, CodeWarrior for Mac (and possibly other classic Mac compilers) had syntactic sugar mapping single-quoted four-character strings to the ubiquitous OSType. Something like:

    OSType creatorCode = 'TTXT';
where OSType was typedef'd to uint32.


In addition to breaking strict aliasing and alignment as others have pointed out, your first example also assumes little endian and won't work on big endian architectures.

Just don't do it unless it's a quick hack that you absolutely know you'll rewrite correctly within the day, before committing. And even then, write it correctly the first time around.


This construct does not depend on byte order since both sides are using the same (unsafe) type-punning.


I brought up a thread on this on SO a while back. I wanted to find places where it didn't work:

http://stackoverflow.com/questions/328215/does-anyone-know-o...


> you can safely free() NULL, which saves a conditional

This makes for clearer code, but the conditional is still there, tucked in free's code. Obviously.


I think he means not having to test the pointer before handing it off to free().


Using the ternary operator to return a function pointer, that is then immediately called, where the functions in question are stubs to system calls and have identical declarations.

As in, specifically:

    if ((Lflag ? chown : lchown)(p->fts_accpath, s->st_uid, -1))
        (void)printf(" not modified: %s\n",
            strerror(errno));
It was a moment of enlightenment.
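
The same shape in a self-contained form, as a sketch: add() and sub() are stand-ins of mine for chown/lchown, chosen only because they share a signature. The conditional expression yields a function pointer, and the call operator is applied to whichever one was picked.

```c
/* Two functions with identical declarations, standing in for chown/lchown. */
static int add(int a, int b) { return a + b; }
static int sub(int a, int b) { return a - b; }

/* Picks the function with ?: and calls the result immediately. */
int apply(int subtract, int a, int b)
{
    return (subtract ? sub : add)(a, b);
}
```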


And (void) in front of printf() is what for exactly?


Philosophically, it's due to a famous system programmer's dictum: "Always check the return value of system calls".

The corollary of this is that one should clearly document when explicitly ignoring the return value. A simple way to do this in C is to cast the return value to void. Since printf does I/O, it qualifies.

Specifically, this code is from a patch to the FreeBSD source which is ruled by style(9); you will find this form throughout BSD source.


It's to stop lint(1) complaining that you're not checking the return value from printf().

I see you're an old-timer here but maybe you missed out on lint: http://en.wikipedia.org/wiki/Lint_(software)

That page says it dates from the late seventies; I was still using it mid-nineties. I don't remember the last time that I linted but today Ubuntu is lint-unaware. These days the compiler will pick up most of the things that lint used to.


C is full of surprises, and generally my favourite trick is trick du jour. While it's not strictly for C, Hacker's Delight[1] is my favourite collection of bit twiddling tricks. A highly recommended read if only for the intellectual value.

[1] http://www.hackersdelight.org/


asprintf. It's so much easier to write

    asprintf(&s, "%s.pid", progname);
than

    s = malloc(strlen(progname) + 5);
    strcpy(s, progname);
    strcat(s, ".pid");
and it avoids errors in buffer-size computation too.

(Unfortunately asprintf isn't C99; but you can construct it easily out of vsnprintf: http://code.google.com/p/libcperciva/source/browse/trunk/uti...)
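
For those who don't want to follow the link, the construction goes roughly like this. This is my own minimal sketch under the name my_asprintf, not the libcperciva code: ask vsnprintf() for the length with a NULL buffer, allocate, then format for real.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical asprintf look-alike built only from C99 pieces.
   On success stores a malloc'd string in *out and returns its length;
   on failure returns -1 and leaves *out untouched. */
int my_asprintf(char **out, const char *fmt, ...)
{
    va_list ap, ap2;
    int len;
    char *buf;

    va_start(ap, fmt);
    va_copy(ap2, ap);                       /* the list is consumed twice */
    len = vsnprintf(NULL, 0, fmt, ap);      /* C99: returns the needed length */
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return -1;
    }

    buf = malloc((size_t)len + 1);
    if (buf == NULL) {
        va_end(ap2);
        return -1;
    }

    len = vsnprintf(buf, (size_t)len + 1, fmt, ap2);
    va_end(ap2);
    if (len < 0) {
        free(buf);
        return -1;
    }
    *out = buf;
    return len;
}
```

Note that the string is rendered twice, which is where the cost mentioned elsewhere in this thread comes from.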


I've done something similar to your vsnprintf()-based code by using snprintf() into an undersized buffer, realloc()ing the buffer to the returned value, then calling snprintf() again. This works well for reused buffers that have a length associated with them and may need to grow over time, but clearly vsnprintf() could do the same thing.


asprintf is very expensive; one tends to use snprintf all over the place in one's code.

Not disagreeing with you; asprintf is a good thing to have around.


Oh, I wouldn't use asprintf in any performance-critical code. But I wouldn't be using character strings in any performance-critical code either, so that issue doesn't arise for me.


It's a shame they didn't offer msprintf(), which would use malloc(), and asprintf(), which would use alloca(), for cases where you only want a temporary string without heap usage/fragmentation overhead.


You can't use alloca from inside a library function, since it would allocate within the library function's stack frame and the allocation would no longer be valid when the function returned. (Or rather, you can use alloca within library functions, but you can't return a pointer to that allocation, so it wouldn't be useful here.)

Theoretically you could define an alloca()ed-pointer-returning Xsprintf as a macro, though... (but as tptacek notes, it's probably a bad idea).


Something like this (C99; given p, stack-allocates p_sasprintf_buf of size at most 16 if the string would fit, and uses asprintf otherwise):

    #include <err.h>
    #include <stdio.h>
    #include <stdlib.h>
 
    #define SASPRINTF_MAXLEN 16
    #define SASPRINTF_MERGE(a, b) a ## b
    #define SASPRINTF_LEN(p) SASPRINTF_MERGE(p, _sasprintf_len)
    #define SASPRINTF_BUF(p) SASPRINTF_MERGE(p, _sasprintf_buf)
    #define SASPRINTF(p, fmt, ...) \
        size_t  SASPRINTF_LEN(p) = snprintf(NULL, 0, (fmt), __VA_ARGS__); \
        char    SASPRINTF_BUF(p)[SASPRINTF_LEN(p) <= SASPRINTF_MAXLEN ? SASPRINTF_LEN(p) + 1 : 0]; \
        if (SASPRINTF_LEN(p) <= SASPRINTF_MAXLEN) { \
            snprintf(SASPRINTF_BUF(p), SASPRINTF_LEN(p) + 1, (fmt), __VA_ARGS__); \
            p = SASPRINTF_BUF(p); \
        } else { \
            if (asprintf(&p, (fmt), __VA_ARGS__) == -1) \
                err(1, "SASPRINTF_L at %s, %d", __FILE__, __LINE__); \
        }
    #define SASPRINTF_FREE(p) do { \
            if (p != SASPRINTF_BUF(p)) \
                free(p); \
        } while(0)
 
    /* Test harness */
    int main(void);
 
    int main(void) {
            char *p, *p2;
    
            SASPRINTF(p, "%s", "foo");
            SASPRINTF(p2, "%s", "Really long string, really.");
 
            printf("%s\n%s\n", p, p2);
 
            SASPRINTF_FREE(p);
            SASPRINTF_FREE(p2);
 
            exit(EXIT_SUCCESS);
    }
I was going to say "...but you have to be pretty insane to do this", but I haven't managed to get incorrect-but-compiling code out of the above macros. Of course, I'm not at all convinced that it's faster than asprintf... (even after the obvious optimizations.)


This is not valid C99. In C99, arrays must have at least one element. Also, what is err.h? Whatever it is, it is not C99.

As for speed, if asprintf() does something clever to avoid rendering the string twice, this is actually slower unless snprintf() is faster than a malloc() call (unlikely). Furthermore, some compilers implement variable-length arrays with malloc() so for these, this is definitely not an improvement.

Beyond the speed of this particular call, using variable-length arrays can have a performance hit in general since gcc is unable to inline functions using them.


That's a good point - I can see why they wouldn't exactly want to encourage macro hacks at this point.


Using alloca() with unbounded string lengths is terribly unsafe, which defeats the purpose of the idiom.


Array initializer syntax, particularly with ranges.

  static cmd_handler_t handlers[16] = {
      [0 ... 15] = handler_noop,

      [1] = handler_for_1,
      [3] = handler_for_3,
      [6 ... 8] = handler_for_6_through_8
  };
And suchforth. Very natural for parsers.

Particularly nifty is Jeremie Miller's js0n parser, which makes heavy use of this: https://github.com/quartzjer/js0n/blob/master/js0n.c


Range initialisers are a gcc extension. They should not be used in portable code.
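
A portable way to get the same table, sketched with placeholder handlers (their bodies and the dispatch() helper are invented for the example): single-index designated initializers are standard C99, and a short loop stands in for the "[0 ... 15]" default.

```c
#include <stddef.h>

typedef int (*cmd_handler_t)(int arg);

/* Placeholder handlers, named after the ones in the snippet above. */
static int handler_noop(int arg)            { (void)arg; return 0; }
static int handler_for_1(int arg)           { return arg + 1; }
static int handler_for_3(int arg)           { return arg + 3; }
static int handler_for_6_through_8(int arg) { return arg * 2; }

/* Plain designated initializers are standard C99; only the ranges are not.
   Slots left out here start as NULL and are filled by handlers_init(). */
static cmd_handler_t handlers[16] = {
    [1] = handler_for_1,
    [3] = handler_for_3,
    [6] = handler_for_6_through_8,
    [7] = handler_for_6_through_8,
    [8] = handler_for_6_through_8
};

/* Replaces the [0 ... 15] default with a loop; call once before dispatch(). */
void handlers_init(void)
{
    size_t i;

    for (i = 0; i < sizeof handlers / sizeof handlers[0]; i++)
        if (handlers[i] == NULL)
            handlers[i] = handler_noop;
}

/* Masks the index so it always lands inside the 16-entry table. */
int dispatch(unsigned cmd, int arg)
{
    return handlers[cmd & 15u](arg);
}
```

The cost is one init call at startup and a table that cannot be const.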


Any time I have ever, ever used a gcc extension in my own code, I have always been sorry for having done so later --- usually more than a year later, at a moment when I don't really have time budgeted for being sorry about binding my code directly to gcc.

(I see a difference between literally using GCC-dialect C and relying on dubious C constructs that happen to work well on GCC; for instance, I've never been bitten by "extern inline").


'extern inline' isn't a dubious C99 construct; it's well defined what it means.

The problem is that the 'extern inline' gcc extension means something else, and is enabled by default unless you specify -std=c99


I agree your code should not depend on GCC, but stuff like __attribute__((nonnull(1), format(printf, 3, 4))) - "the first argument may not be NULL, the third argument is a printf()-style format string whose arguments start at the fourth position" - can produce some useful warnings and can be trivially disabled on non-GCC compilers. Some of the most useful extensions don't lock you into GCC.
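
"Trivially disabled" usually means hiding the attribute behind a macro. A sketch, where ATTR_PRINTF and buf_printf are names I made up for the example; the macro expands to nothing on compilers that do not define __GNUC__.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#if defined(__GNUC__)
# define ATTR_PRINTF(fmt_idx, first_arg) \
    __attribute__((nonnull(1), format(printf, fmt_idx, first_arg)))
#else
# define ATTR_PRINTF(fmt_idx, first_arg)  /* expands to nothing elsewhere */
#endif

/* Argument 1 must not be NULL; argument 3 is the format, checked against
   the arguments from position 4 onward. */
int buf_printf(char *buf, size_t size, const char *fmt, ...) ATTR_PRINTF(3, 4);

int buf_printf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

With the attribute active, a call like buf_printf(b, sizeof b, "%s", 42) gets a format warning at compile time; without it, the code still builds unchanged.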

(Now, if only Microsoft would get off their asses and make their compiler C99-compliant, we could all write much nicer code.)


I have run into errors caused by weird uses of inline, I think it was with the MIPSpro compiler, in code happily swallowed by gcc.


When did we get the ellipsis in the named array initializer syntax? That's standard? Neat.


It's an old gcc extension. Don't use it.


It's a very useful gcc extension. Use it if you're building with gcc. Standards adherence is all well and good, but there are many problem areas where you can rely on a single compiler (surely others support it too: intel and clang seem like likely suspects).


There are standards issues and there are standard issues.

In reality, you're probably never going to see a 1's complement machine.

Your likelihood of needing to use a piece of code under a compiler other than GCC is deceptively high.

Meanwhile, the extensions we tend to think about when we think of GCC aren't subtle things like "can you use // comments in C code". They're constructs that require many additional lines of code to replace. It is a giant pain when you find them, later on, when you need to compile something under Visual C or SunWorks. Your fix to comment syntax isn't going to break code at runtime, but your fix for the missing ellipsis operator definitely can!

From bitter experience, I think 'mansr is right on this, and it's worth making an effort not to let GCC extensions creep into your code.


But like I said, it depends on what the code is. If it's a kernel module, then you'd be silly not to use the gcc extension. Likewise if it's a platform-locked linux or mac thing (Android middleware, maybe). You work in security, where you're expected to port stuff between platforms regularly. Not everyone does. And those extensions are there because someone likes them.


> If it's a kernel module, then you'd be silly not to use the gcc extension.

Sure, if you're not interested in anyone porting that code to another OS. This happens all the time with drivers -- just because the interface is OS-specific doesn't mean all the code is.

> And those extensions are there because someone likes them.

That doesn't make it a good idea.


Extensions have occasionally been dropped from gcc, sometimes because of conflicts with updated standards, sometimes for other reasons.


My favourite C trick is -Dcontinue=break in the Makefile.

Sarcasm aside, I like the (( x > 0 )&&( (x&(x-1)) == 0 )) trick to test if x is a power of two. But all arithmetic tricks like this need to be commented in detail, used rarely, and properly documented.
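
In the spirit of commenting it in detail, a sketch of the test wrapped in a function (the name is mine). Subtracting 1 flips the lowest set bit and everything below it, so x & (x - 1) clears that bit; the result is zero exactly when x had a single bit set.

```c
/* Returns 1 if x is a power of two (1, 2, 4, ...), 0 otherwise.
   x & (x - 1) clears the lowest set bit; a power of two has only one,
   so nothing is left. Zero is excluded explicitly. */
int is_power_of_two(unsigned int x)
{
    return x > 0 && (x & (x - 1)) == 0;
}
```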

Edit: I've also used "-Dfor=if(0);else for" to make some ancient C++ compiler obey C++0x scoping rules for variables declared in initializer list of 'for' statement.


I remember using the `#define for if (0); else for` trick, too - was it with MSVC 98 by any chance?


Maybe, I don't remember. It was long ago.


Multi-line strings:

  char* greeting = "Hello "
  "world";


A macro for calculating the size of a buffer for a C-style string containing a (signed/unsigned) integer:

  #define CS_INT_SIZE(int_type) ((size_t)(0.30103 * sizeof(int_type) * 8) + 2 + 1)

  int x = -1000;
  char buf[CS_INT_SIZE(x)]; // instead of char buf[100];
  snprintf(buf, sizeof(buf), "%d", x);
Another macro for calculating the length of a string literal in compile time (just like strlen would do). Note the extra check for string literal.

  #define CSLLEN(s)           (sizeof(s "") - 1)
  int len = CSLLEN("hello"); // len == 5 here
Another one for logging variable arguments or no arguments at all.

  #define Log_Trace(...) Log(__FILE__, __LINE__, __func__, LOG_TYPE_TRACE, "" __VA_ARGS__)
  void Log(const char* file, int line, const char* func, int type, const char* fmt, ...);
  Log_Trace(); // prints just the name of the file, line and func
  Log_Trace("Error %d", error); // prints the same as above and the error number
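
On the first macro: 0.30103 is log10(2), so bits * 0.30103 rounded down is one less than the maximum number of decimal digits; the "+ 2 + 1" covers that digit, a sign, and the terminating NUL. A sketch of an integer-only variant (INT_STR_SIZE and the helper are my own names), which has the side benefit of being an integer constant expression and so can size an ordinary, non-variable-length array:

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical integer-only variant of CS_INT_SIZE: 30103/100000 approximates
   log10(2); +3 covers the rounded-off digit, a sign, and the terminating NUL. */
#define INT_STR_SIZE(type) (sizeof(type) * CHAR_BIT * 30103 / 100000 + 3)

/* Formats the longest int value; returns 1 if it fit without truncation. */
int int_min_fits(void)
{
    char buf[INT_STR_SIZE(int)];
    int n = snprintf(buf, sizeof buf, "%d", INT_MIN);

    return n > 0 && (size_t)n < sizeof buf;
}
```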


Why not just declare "char buf[100]" for snprintf's target and be done with it?


Because then you'd have a buffer overflow in 2153 when we're using 512-bit integers.


No, you'd have a truncated string.

You really think we'll be dealing with 512 bit integers in 40 years? This is code that's turning integers into decimal strings.


Either my calendar is wrong, or 2153 is far more than 40 years away.

I wasn't being serious, though.


Sorry. I can't read today. :)


Your logging example is invalid. A variadic macro must be invoked with at least one argument matching the ellipsis. GCC is forgiving here, but some other compilers are not.


I don't know whether this qualifies as a "trick", but by far the most useful idiom I use in C code is:

  // in some header file somewhere
  #define CHECK_ERROR(rc) \
    if (rc != SOME_SUCCESS_VALUE) { report(rc, __FILE__, __LINE__); goto error; }
  #define CHECK_MALLOC(ptr, rc) \
    if (ptr == NULL) { rc = report_malloc_error(__FILE__, __LINE__); goto error; }
 
  // in code
  error_t somefunc( ... ) {
    error_t rc = SUCCESS;

    widget_p = make_a_widget(...);
    CHECK_MALLOC(widget_p, rc);

    rc = this_could_fail( ... );
    CHECK_ERROR(rc);
    // ...
    rc = this_could_fail_too( ... );
    CHECK_ERROR(rc);
    
    return SUCCESS;
  error:
    // ... error clean up ...    
    return rc;
  }
There are slight variants; like for sharing the cleanup code (e.g. "finally"), but that's the essence of it. I'm sure I've typoed something above, nit-pickers beware :-)
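
A compilable rendition of the pattern, as a sketch only: the status enum, report(), and step() are stand-ins I invented, and the macro is wrapped in do/while (0) as suggested elsewhere in this thread so that it behaves as a single statement.

```c
#include <stdio.h>
#include <stdlib.h>

enum status { SUCCESS = 0, FAILURE = 1 };

static void report(enum status rc, const char *file, int line)
{
    fprintf(stderr, "%s:%d: failed with status %d\n", file, line, (int)rc);
}

/* do/while (0) keeps the macro a single statement; it still jumps to a
   label named 'error' that the calling function must provide. */
#define CHECK_ERROR(rc)                             \
    do {                                            \
        if ((rc) != SUCCESS) {                      \
            report((rc), __FILE__, __LINE__);       \
            goto error;                             \
        }                                           \
    } while (0)

/* Stand-in for a call that can fail. */
static enum status step(int should_fail)
{
    return should_fail ? FAILURE : SUCCESS;
}

enum status run_steps(int fail_first, int fail_second)
{
    enum status rc = SUCCESS;
    char *scratch = malloc(64);      /* resource that needs cleanup */

    rc = (scratch != NULL) ? SUCCESS : FAILURE;
    CHECK_ERROR(rc);
    rc = step(fail_first);
    CHECK_ERROR(rc);
    rc = step(fail_second);
    CHECK_ERROR(rc);

    free(scratch);
    return SUCCESS;
error:
    free(scratch);                   /* free(NULL) is fine */
    return rc;
}
```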


CHECK_MALLOC is an anti-pattern (do you have CHECK_FOO for any given FOO which might behind the scenes allocate?).

I have a litany of reasons why you shouldn't bother checking malloc returns, and instead invest a little effort in making sure your platform malloc is configured to abort instead of returning NULL. The simplest and most compelling of those reasons is that it's easier and cleaner.

I'm also not a fan of a macro that introduces an implicit dependency on a goto label.

   do { } while(0) 
is a pretty convenient way of expressing single-return; you just use "break" instead of "return".
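
To make that concrete, a small sketch (parse_pair is a made-up example function): every early exit is a break, and control always falls through to the one return at the bottom.

```c
#include <stdio.h>

/* Parses "A,B" into *a and *b; returns 1 on success, 0 otherwise. */
int parse_pair(const char *s, int *a, int *b)
{
    int ok = 0;

    do {
        if (s == NULL)
            break;                          /* early exit, no goto */
        if (sscanf(s, "%d,%d", a, b) != 2)
            break;
        ok = 1;
    } while (0);

    /* shared cleanup would go here */
    return ok;
}
```

The caveat is that a break inside a nested loop or switch only leaves that inner construct, so this only works cleanly for flat code.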


I can accept (most of) your issue(s) with the MALLOC macro, MALLOC is a bad name here from previous habit. The point was that there's code one has no control over which indicates failure by returning a NULL value, this is where it's generally useful.

I strongly disagree about the goto label. "break" is not equivalent (consider a nested for/while/switch). Plus, it's not "implicit" if it's well-known and oft-used in the code in question.


You strongly disagree about something I don't strongly disagree about. I wouldn't use goto to get single-return, nor would I wrap drastic control flow in a macro, but whatever floats your boat.

I have much stronger opinions about checking malloc. It's something people spend a lot of effort to do that actually makes their code worse.


Would your opinions on checking malloc still hold if the system in question were an embedded system that should continue operating with partial functionality even when out of memory? Also, when you say "makes their code worse," are you saying harder to read, harder to debug, less secure, slower, buggier, or all of the above?

My habit of checking malloc() also comes from my distaste for audio software that randomly displays erratic behavior when memory starts getting tight, rather than displaying an alert that an allocation failed.


I think you can always find exceptions -- such is engineering. :-)

In my example above, one of the things I also failed to clarify about that test was that it was in the context of a system that could back off and restart or the call was returning NULL for reasons other than OOM. But like I said, we rigged malloc() to blow, because malloc() is the generic purpose allocator, and you're screwed if that goes.

I think tp gives good advice here: for your typical malloc() user, you're usually screwed if malloc returns NULL, because that's your heap allocator, and you have nothing else :-)

An embedded system, generally, will have a great deal more knowledge of how to back off -- in other words, the memory allocator is something that is under much more control -- it probably isn't malloc()...


Your boat will sink if you use do while break. I know, I've tried. :-) One aspect I failed to highlight is the logging the macro does, which can come in handy.

I wish I could erase the MALLOC thing, it was an after-thought ... and now I feel like I'm leading people astray (oh well...). Even in the code that used it, it was for existing calls that returned NULL instead of an rc. malloc() was rigged to blow in that code base. Sigh...


Any pointers as to how to make Ubuntu 10.10's malloc(3) never return NULL when asked for >0 bytes but instead abort?


Note, malloc(3) can return NULL if asked for zero bytes. calloc(3) similarly.


I mean this in a good way: wouldn't it be better to stack up all of this on the parent Stack Overflow link? This would enrich their community and be better for the reader here.

P.S. to the guy who down-voted my previous (now deleted) comment: you were right. It helped me think more precisely about why I was trying to crack a joke.


    {
      int n = 100;
      char *foo = malloc(n);
      ...;
      bar();
    }

    (gdb) break bar
    (gdb) r
    (gdb) up
    (gdb) p/x *(char (*)[100]) foo
Printing malloc'd arrays can be a pain. Using the "pointer to array of type" typecast, you can force gdb to tell you everything at once.


Can't you just do:

  p/x *foo@100


Using unions and structs to do data conversion/manipulation:

  union convert {
    unsigned char ch;
    struct bits {
      unsigned char bit7:1;
      unsigned char bit6:1;
      unsigned char bit5:1;
      unsigned char bit4:1;
      unsigned char bit3:1;
      unsigned char bit2:1;
      unsigned char bit1:1;
      unsigned char bit0:1;
    } bits;
  };
Set the character, toggle various bits, then retrieve the character. It saves a bunch of left and right shift and or/and of values. [My C is quite rusty, so I apologize if the syntax is a bit off.]


Since other people have been nitpicky in the comments here, I thought it might be worth pointing out that this assumes an 8-bit byte, so it's not general. ;) Remember, 8 bits is just a convention.


Even with 8-bit bytes, the layout of the bit-fields is unspecified. Using char as the declared type of a bit-field is also not defined by the standard (although may be supported by an implementation).
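
If the layout matters, the shifts and masks are the portable route after all. A sketch with helper names of my own; bit 0 is the least significant bit regardless of compiler or endianness.

```c
/* bit 0 is the least significant bit; 'bit' must be in the range 0..7. */
unsigned char bit_set(unsigned char v, unsigned bit)
{
    return (unsigned char)(v | (1u << bit));
}

unsigned char bit_clear(unsigned char v, unsigned bit)
{
    return (unsigned char)(v & ~(1u << bit));
}

unsigned char bit_toggle(unsigned char v, unsigned bit)
{
    return (unsigned char)(v ^ (1u << bit));
}

/* Returns 1 if the bit is set, 0 otherwise. */
int bit_test(unsigned char v, unsigned bit)
{
    return (int)((v >> bit) & 1u);
}
```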


8 bits per byte is the only convention that matters; I challenge any of you to prove otherwise.

throws down gloves


The Texas Instruments C55x DSP has 16-bit bytes. See the compiler documentation at http://www.ti.com/litv/pdf/spru281f section 5.3 if you do not believe me. This DSP has more unusual properties:

  type               size (bits)
  ------------------------------
  char               16
  short              16
  int                16
  long               32
  long long          40
  float              32
  double             32
  pointer (data)     16 or 23
  pointer (function) 24


That's hilarious. I wonder why "long long" is 40 bits?

And... A non-function-pointer is "16 or 23" bits? Nice.

What's the proper scenario to use "long" instead of "int"? I've never bothered to use it.


>What's the proper scenario to use "long" instead of "int"? I've never bothered to use it.

It was necessary with 16-bit processors, because ints were 16-bit shorts, and longs were 32-bits.

With modern processors and OSes, there isn't really a reason to use it. In fact, it's potentially dangerous if you're writing *nix code that's supposed to run on both 32-bit and 64-bit systems. In that case, you don't want to use longs, because they're 32 bits on a 32-bit compiler but 64 bits on a 64-bit compiler. On Windows, int and long are interchangeable 32-bit values, which is another reason to avoid using longs as much as possible when writing portable code.
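
When the width matters, one option is the C99 fixed-width types from <stdint.h>. A small sketch follows; the struct and function names are hypothetical, and it assumes the exact-width types exist, which is the case on mainstream platforms but is formally optional.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical record: the field widths are the same on 32-bit and
 * 64-bit builds because they do not depend on the size of long. */
struct sample {
    int32_t  id;
    int64_t  timestamp;
    uint16_t flags;
};

/* Width in bits of two of the field types. */
static int id_bits(void)        { return (int)(sizeof(int32_t) * CHAR_BIT); }
static int timestamp_bits(void) { return (int)(sizeof(int64_t) * CHAR_BIT); }
```

Here id_bits() is 32 and timestamp_bits() is 64 regardless of what the platform chose for long.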


>I wonder why "long long" is 40 bits?

The immediate reason is that the ALU is 40 bits wide. The reason it has this particular size is probably a tradeoff between computational power vs silicon area and power consumption.

>A non-function-pointer is "16 or 23" bits?

Near and far pointers, sort of.


Byte and char are not the same thing.

I had an OCD incident a few years back and did a lot of thoughtful reading of all things readable on the subject. In distilled form the sacred knowledge is this: as far as the C standard is concerned, a byte is always exactly 8 bits, and a char is AT LEAST 8 bits. In fact, there is a compiler that operates in terms of 60-bit chars, and that's the one on older Cray machines.


No, a byte is not always 8 bits. byte and char are basically synonymous.

    byte -- addressable unit of data storage large enough to hold any member of the
    basic character set of the execution environment.
    ...
    Note 2: A byte is composed of a contiguous sequence of bits,
    the number of which is implementation-defined.
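
In code, that implementation-defined number is CHAR_BIT from <limits.h>. A minimal sketch (the helper name `bits_in` is made up):

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits of storage in an object of `size` bytes on this
 * implementation. sizeof(char) is 1 by definition, so a char is
 * exactly one byte of CHAR_BIT bits. */
static size_t bits_in(size_t size)
{
    return size * CHAR_BIT;
}
```

The standard only guarantees CHAR_BIT >= 8, so code that needs exactly 8 should check for it rather than assume it.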


http://en.wikipedia.org/wiki/GSM_03.38

"Support of the GSM 7-bit alphabet is mandatory for GSM handsets and network elements..."


Alas, that's a character encoding; not a platform-specific bits-per-byte setting.


My fav is one that I've actually had to use in practice when using an (arcane) platform specific builtin. It's a way to arbitrarily align variables:

   #define ALIGN(type, name, align) __ALIGN_MASK(type, name, (align - 1))
   #define __ALIGN_MASK(type, name, mask) char __##name##_buffer[sizeof(type) + mask];\
   type * name = (type *)((size_t)(__##name##_buffer + mask) & ~mask);
Then you can just use it in the code like so:

   ALIGN(int, y, 16);
And you've just declared a pointer to a 16-byte aligned int on the stack. Very useful.


That's all very well, except that it's wrong. The last line needs to do "(buffer + mask) & ~mask" or you'll end up pointing outside the array.

You also violated the C standard in a less severe way. According to the standard, any identifier name starting with a double underscore (or underscore followed by upper-case letter) is reserved by the implementation for any use. These reserved names are frequently used in system header files, and encroaching on that namespace can easily lead to weird errors if the code is ever compiled on some other system.
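
For what it's worth, here is a sketch of the same idea without reserved identifiers. The names `align_up` and `ALIGNED_VAR` are hypothetical. It assumes that converting a pointer to uintptr_t gives a plain address, which holds on common flat-memory platforms but is implementation-defined, and it shares the original's caveat of accessing a char buffer through another type.

```c
#include <stddef.h>
#include <stdint.h>

/* Round pointer p up to the next multiple of align, which must be a
 * power of two. Assumes uintptr_t holds a plain address. */
static void *align_up(void *p, size_t align)
{
    uintptr_t mask = (uintptr_t)align - 1;
    return (void *)(((uintptr_t)p + mask) & ~mask);
}

/* Hypothetical macro: declares a backing buffer and a pointer `name`
 * to an aligned `type` inside it. The extra (align - 1) bytes leave
 * room for the worst-case adjustment. */
#define ALIGNED_VAR(type, name, align)                        \
    unsigned char name##_buffer[sizeof(type) + (align) - 1];  \
    type *name = (type *)align_up(name##_buffer, (align))
```

Usage is the same as before: `ALIGNED_VAR(int, y, 16);` and then `*y = 42;`. With a C11 compiler, `_Alignas(16) int y;` gets the same alignment without the buffer.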


Whoops, that's what I get for prettying things up after copy and pasting. Thanks for pointing that out.

As for the double underscore being reserved for implementation use, this was in fact part of the compiler (well, runtime in this case) implementation, again a side-effect of copy and paste.


So many to choose from ...

Convert single char c into an integer:

    int value = '0' - c;
To get the length of a static string, you can use sizeof() instead of strlen():

    int len1 = sizeof("hello");   // compile-time string length
A fast and simple ring buffer (borrowed from Quake code)

    UPDATE_MASK = ARRAY_SIZE - 1;
    array[i++ & UPDATE_MASK] = data;
Init all bits of a mask to 1 on any architecture:

    unsigned int flags = -1;


I think you meant

    int value = c - '0';
Also, remember that sizeof("string") includes the null terminator while strlen() does not.
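
Both corrections in one small sketch; `LITERAL_LEN` and `digit_value` are names invented for the example.

```c
#include <string.h>

/* Length of a string literal without the terminator, computed at
 * compile time. Only valid for literals and arrays, not pointers. */
#define LITERAL_LEN(s) (sizeof(s) - 1)

/* Convert a decimal digit character to its value, or return -1 if c
 * is not a digit. The standard guarantees '0'..'9' are contiguous. */
static int digit_value(char c)
{
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}
```

So sizeof("hello") is 6, while LITERAL_LEN("hello") and strlen("hello") are both 5.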


The ring buffer is a nice trick, but it only works on array sizes that are a power of 2. The generalized version would be "array[i++ % ARRAY_SIZE] = data". If ARRAY_SIZE happens to be a power of two, I'm sure most compilers would optimize to a bitmask anyway.


And if it is not a power of 2, the % operation will be anything but fast. If a size not a power of 2 is strictly required, a simple increment and compare is almost certainly faster.


Signed integer overflow is undefined, so make sure you're using an unsigned type in your ringbuffer example. (And an array with some-power-of-2 elements; you should probably use sizeof(array)/sizeof(*array) instead of ARRAY_SIZE too.)
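
A sketch that combines those points: unsigned index, power-of-two size, and an element count that can be checked with sizeof. The struct and function names are illustrative and not taken from the Quake source.

```c
#include <stddef.h>

#define RING_SIZE 8u  /* must be a power of two */

struct ring {
    int      data[RING_SIZE];
    unsigned head;  /* unsigned, so wraparound is well defined */
};

/* Store v in the next slot, overwriting the oldest entry when full. */
static void ring_put(struct ring *r, int v)
{
    r->data[r->head++ & (RING_SIZE - 1u)] = v;
}

/* Non-zero if n is a power of two. */
static int is_pow2(unsigned n)
{
    return n != 0 && (n & (n - 1u)) == 0;
}
```

After ten puts of 0 through 9 into an empty ring, slots 0 and 1 hold 8 and 9 and the remaining slots still hold 2 through 7.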


If you want all-ones, shouldn't you just use ~0?


No. ~0 has type signed int, which on a one's complement machine is a negative zero. Converting this to an unsigned type yields zero. On a sign-magnitude machine, ~0 is -0x7fffffff (assuming 32-bit for simplicity). Conversion to unsigned 32-bit is done by adding 0x100000000, yielding 0x80000001, again not what was desired. If the variable being assigned to has a width different from that of int, things get even more interesting.


Ah. Thanks! Is there a ones-complement or sign-magnitude architecture still in production anywhere? (I'm curious; I have no idea.)


As far as I know, Analog Devices still makes some ones-complement analog-to-digital converters. But I don't know of any modern processor that uses them.


How about ~0U?


That will only set as many bits as are in an int, which is wrong if the type of the variable you are setting is wider than int. For a long, you could of course use ~0ul, and so on for other types, but then you'd have to find and update every such assignment if the variable were ever changed to a wider type.
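
The reason -1 works for every width is that conversion to an unsigned type is defined modulo 2^N, whatever the signed representation. A quick sketch (function names invented for the example):

```c
#include <limits.h>

/* Converting -1 to any unsigned type yields that type's maximum
 * value, i.e. all value bits set. The same initializer works
 * unchanged if the variable's type is later widened. */
static unsigned int       all_ones_uint(void)  { return -1; }
static unsigned long      all_ones_ulong(void) { return -1; }
static unsigned long long all_ones_ull(void)   { return -1; }
```

Each function returns the corresponding UINT_MAX, ULONG_MAX or ULLONG_MAX.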


I think you meant:

    int value = c - '0';
Otherwise, value will be negative.


Here we create a dummy struct which makes copying arrays easy for us

  typedef struct { int array[SIZE]; }Array;
  Array a = {{10,16,2011}};
  Array b = a;
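
The same wrapper also lets you pass and return arrays by value. A small sketch, assuming SIZE is 3; the function `doubled` is made up for illustration.

```c
#define SIZE 3

typedef struct { int array[SIZE]; } Array;

/* Passing and returning the wrapper by value copies the whole array,
 * something a bare array cannot do. The caller's copy is untouched. */
static Array doubled(Array in)
{
    for (int i = 0; i < SIZE; i++)
        in.array[i] *= 2;
    return in;
}
```

Keep in mind that every such copy moves the full array, so this is best kept to small sizes.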


Since some people are posting macros (including some intentionally malicious ones), it is handy to know the -E compiler flag for debugging purposes. This will run only the preprocessor on the given files and print the result to stdout. (N.b. -E works at least on gcc and clang; the flag may be different elsewhere.)

Not really a C language trick, but useful nonetheless.


You can also call 'cpp' on the file. When doing heavy macrology (e.g. https://github.com/silentbicycle/kona/blob/master/scalar.h) I like to work in a file which is just

    #include "the_macro.h"

    THE_MACRO(example, args);
and call

    cpp macrotest.h | fmt -w72
(using fmt since multi-line macros will end up on one line otherwise).


The -E flag is specified by POSIX. Non-POSIX compilers sometimes support it, sometimes not.


Not a favourite, but one that isn't here so far...

!!foo to map 0→0 and everything else to 1. Always interesting to see if the compiler produces the best machine code for it.
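
A tiny sketch of where that is handy; the function names are hypothetical.

```c
/* Collapse any non-zero value to 1, leaving 0 as 0. */
static int to_bool(int x)
{
    return !!x;
}

/* Count how many of the arguments are non-zero. */
static int count_set(int a, int b, int c)
{
    return !!a + !!b + !!c;
}
```

The result of ! always has type int and is either 0 or 1, so the sum is a plain count.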


I dislike the term 'trick' here. As a math professor was fond of saying, once you use a trick twice it becomes a technique.


"Hey! Let me show you a magic technique!" :-)

I think of a trick as something only you or only few people know about regardless of how much it gets used. A technique is something that is more widely in use by people.


Duff's device... Not unique to C, but that's where I encountered it first.

Had a need for it recently, and found out it was not available in C#.
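
For anyone who hasn't seen it, a sketch of the usual memory-to-memory variant follows. The original wrote to a single output register rather than an incrementing pointer, and the function name `duff_copy` and the zero-count guard are additions for this example.

```c
#include <stddef.h>

/* Copy count bytes from `from` to `to`, unrolled eight times. The
 * switch jumps into the middle of the loop body to handle the
 * count % 8 leftover bytes on the first pass. */
static void duff_copy(char *to, const char *from, size_t count)
{
    if (count == 0)
        return;  /* the loop below assumes count > 0 */

    size_t n = (count + 7) / 8;  /* number of passes through the loop */
    switch (count % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

It works because case labels are just labels: C allows them inside the do-while body, so the switch acts as a computed jump into the unrolled loop.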


Even though it's one of my favourites, modern high performance hardware with branch prediction, out-of-order execution, and multiple issue generally makes this trick rather unnecessary. Combine that with the cleverer compilers we have these days and you get pretty much the same performance out of the simple copy.


Even worse is that modern x86 (and probably other) CPUs also have instructions for 16-byte vector registers that can be used to copy or compare data much faster than 1 byte per cycle. Recent versions of glibc use some linker magic to pick the optimal code to use for strcmp, memcpy and friends based on the instruction set available to the CPU at runtime. Of course gcc and glibcxx's developers must not trust glibc and will sometimes replace calls to these functions with their "optimized" builtin versions that use the lowest common denominator ISA. An easy way we got a 5% boost in throughput in mongodb was to force them to call into glibc. https://github.com/mongodb/mongo/blob/master/pch.h#L47-56


This is great for implementing stackless threads and multithreading on AVR. See protothreads, for example.


Why would you need to unroll a loop in C#? Isn't that what JITs are for?


I particularly like the C99 features mentioned there. I wonder if I can somehow have GCC enable them for use with C++?


Compile time asserts by expanding a macro to a switch-case statement.
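
One common way to write that, as a sketch. The macro name `CT_ASSERT` and the function around it are hypothetical, and the macro only works inside a function body.

```c
#include <limits.h>

/* If cond is false, both labels are `case 0` and the compiler rejects
 * the duplicate. If cond is true, the statement does nothing. */
#define CT_ASSERT(cond) \
    do { switch (0) { case 0: case (cond): ; } } while (0)

/* Hypothetical function that relies on long having at least 32 bits. */
static long checked_shift(void)
{
    CT_ASSERT(sizeof(long) * CHAR_BIT >= 32);
    CT_ASSERT(CHAR_BIT >= 8);
    return 1L << 30;
}
```

Changing a condition to something false, such as CT_ASSERT(CHAR_BIT == 7), should make the build fail with a duplicate case value error. C11 compilers offer _Static_assert for the same purpose.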


system(/bin/rm -rf /*);


My favourite C programming trick is having others do the C programming for me.



