Posts from March, 2010
No comment yet
March 30th, 2010
No comment yet
March 29th, 2010

8 hours all night for 600 lines of high quality c code. When last time I do this, it is 1500 lines in two days ...

No comment yet
March 25th, 2010

It worth a separate post to explain why I argue that Google's action by putting server in Hong Kong is still illegal in terms of Chinese domestic law.

Today's news in China went further by claiming, the domain registration agency is violating Chinese domestic law: In Google case, the ministry of industry and information in China claims "every company should comply the domestic law when doing business here." So, the question is, which law or if there are any law provide legal justification to block Facebook, Twitter, Blogger, Youtube and partially block Wikipedia, Google etc.

It turns out that there really is a domestic law for doing that: the Internet Information Services Management Approach ( How the domestic law extends across border? Checkout the second line (automatic translated by Google Translate):

"Second: in the PRC engaged in Internet information service activities, must comply with these measures.

"The term Internet information services, refer to Internet users through the Internet to provide information services activities."

Yes, I know the machine translation is absurd, but I don't want to lose any precision by translating this myself. It provides a definition of the entities to which this law applies to. It is: anyone who provides information services through Internet to the Chinese user. Further more, it explained what is "Internet information services in PRC": any activities that Internet users in China can engage in.

It is a confusing law because the definition is not complete by itself alone. It can be arbitrary services if the user in China can have access to. That essentially means, every website on the Internet should comply this domestic law.

Well, it is the classic Chinese approach: making everyone guilty, so that you can put anyone in jail as you wish.

No comment yet
March 23rd, 2010

发件人: 发送时间: 2002年9月7日 2:33:18 收件人: Thanks for your note about problems accessing Google. We have been notified by several users that our site has been blocked in the People's Republic of China. We're working with Chinese authorities to resolve the issue. We appreciate your concern and your taking the time to write to us.

There have been recent reports that we provide dyanamic IP addresses to users who email us. This information is incorrect. We apologize for any inconvenience this false report may have caused you. We appreciate your interest in using Google and look forward to helping you with your search needs in the future.

Regards, The Google Team

样子就快有dynamic IP addresses to users who email us了……

No comment yet
March 22nd, 2010


No comment yet
March 18th, 2010

Since I started the ccv project ( a few weeks ago, one motivation is to write a pure-c (C99 standard) library for computer vision. The trouble is, for a computer vision library, we deal with 4 types of matrix, unsigned char, integer, single precision, double precision. BLAS has separate functions to do calculation on the 4 types. But I found it is easier for user to just have one matrix structure with a flag to label types.

It is convenient unless you choose to play around with raw data (as the library writer did). And it did impose a problem to library writer, do I need to maintain 4 copies of code just with slight change of data type? That is one of the few times you miss the good part of C++, I mean, the template. However, with very abusive usage of macro, you can still get the candies off the shelf with the price of more compilation time.

The classic education of code optimization told everyone to put if branch outside of a deep for-loop, and everyone takes it as a common sense. However, behind the scene, compiler does a lot in such scenario and it does well. Let me show you an example (ccv.h file can be found here):

#include "ccv.h"

void copy_data_to_float(ccv_dense_matrix_t* x, float* out)
    float* out_ptr = out;
    unsigned char* m_ptr = x->data.ptr;
    int i, j;
    for (i = 0; i < x->rows; i++)
        for (j = 0; j < x->cols; j++)
            out_ptr[j] = ccv_get_value(x->type, m_ptr, i);
        out_ptr += x->cols;
        m_ptr += x->step;

int main(int argc, char** argv)
    ccv_dense_matrix_t* x = ccv_dense_matrix_new(100, 100, CCV_8U | CCV_C1, NULL, NULL);
    float* out = (float*)malloc(sizeof(float) * 100 * 100);
    copy_data_to_float(x, out);
    return 0;

When compile the code with gcc and -O3 flag, you can get some nice assembly that put the inner switch statement (hided by the ccv_get_value macro) outside the for loop. More officially, here we use the -funswitch-loop flag to optimize performance. It is a trick, but gives C some ability that template has (and cons: generating larger binary). In the end of the day, we do have a clearer C code with reasonable performance.

If the compiler is so clever every time, the world will be free of wars, hunger and disease. The problem is, you can never be certain that compiler will do the smart thing. For example, if in a case you have several switches, most of the time, compiler will fail to unswitch them. It is just too much mutations (even for no-nested switches, when you unswitch them, the mutations should be first switch cases * second switch cases * ... * n-th switch cases, in other word, exponential). Even that is the case, we do want to unswitch them all (4 switches, and each switch has 4 cases will result 256 different for-loops, still manageable on modern computers). How to do that semi-automatically?

First thought is to define these for-loops as macro, and somehow expanded them. It should look like some kind of:

#define get_value(x, i) (int*)(x)[i]
#define for_block \
    for (i = 0; i < 100; i++) \
        y[i] = get_value(x, i);

If only we can define different get_value for different x types, the for_block can be expanded. One way to do it is to use .def file. However, #include instruction cannot be implemented inside a macro, that means we cannot wrap our solution into one nice macro.

Another thought comes to mind is: instead of define a macro first, just pass the macro as parameter. The following code did this:

#define for_block(for_get) \
    for (i = 0; i < 100; i++) \
        y[i] = for_get(x, i);
#define get_int_value(x, i) ((int*)(x))[i]
#define get_float_value(x, i) ((float*)(x))[i]
switch (type)
    case INT:
    case FLOAT:

It is very close to the final goal, I can even imagine a nice wrapper around the method:

#define getter(type, block) switch (type) { \
    case INT: \
        block(get_int_value); \
        break; \
    case FLOAT: \
        block(get_float_value); \
        break; \
getter(type, for_block);

We can write out our getter, the setter for our matrix operation for loop immediately, hurrah. Except one thing, you cannot use them conjunctively. Then, what's the point of having one switch out? I want do something like set(y, i, get(x, i)), and how?

A nice feature in C99 standard is the support of variadic macros. Variadic macro, just like the name suggests, can take variable length of arguments. It is a good start point to write our joinable getter and setter. The code is:

#define getter(type, block, rest...) switch (type) { \
    case INT: \
        block(rest, get_int_value); \
        break; \
    case FLOAT: \
        block(rest, get_float_value); \
        break; \
getter(type, for_block);

The rest part makes all different, now you can do something like: setter(type_a, getter, type_b, for_block); and the for_block will take two arguments: (set, get) in the same order you call it. The thought process is left for readers, but there is one pitfall: for one getter case, the for_block will take the first argument as dummy (because of the rest argument inserted before the real get macro).

For multiple getter, it is trickier. Because all macros are not recursive, you have to define several getters that has the exact same code but different name in order to use multiple getter. And yes, like all the macros, it is hard to debug (in this case, worse, since one mistake can generate tens or hundreds compile-time errors).

Finally, I will show you some code in action (for pairwise addition of matrix a, b to c):

#include "ccv.h"
unsigned char* a_ptr = a->data.ptr;
unsigned char* b_ptr = b->data.ptr;
unsigned char* c_ptr = c->data.ptr;
#define for_block(__for_get_a, __for_get_b, __for_set) \
for (i = 0; i < a->rows; i++) \
{ \
    for (j = 0; j < a->cols; j++) \
    { \
        __for_set(c_ptr, j, __for_get_a(a_ptr, j) + __for_get_b(b_ptr, j)); \
    } \
    a_ptr += a->step; \
    b_ptr += b->step; \
    c_ptr += c->step; \
ccv_matrix_getter_a(a->type, ccv_matrix_getter_b, b->type, ccv_matrix_setter, c->type, for_block);
#undef for_block

If you want to use plain old C way and unswitch for-loop, you have to write 64 for-loops to cover all the cases.

No comment yet
March 16th, 2010

read few hundred pages of "The China Dream". Inspiring.

No comment yet
March 15th, 2010


直到晚上才起来,发现住处离Downtown Palo Alto还是挺近,是夜。



次日,吃了火锅,逛了逛“举世闻名”的Stanford Shopping Center,回,美联航再度晚点,赶作业。

No comment yet
March 5th, 2010

push myself hard enough.