Author Archives

Emacs smart split for programmers

I spend most of my waking hours in front of Emacs in a terminal window these days, and I share my configuration across all my computers. At work I have a huge monitor, so I split the Emacs frame into three side-by-side 80-column windows. At home I have a smaller screen with room for only two windows. To share the same configuration file, I use the following snippet:

;; Defined at top level so it is not redefined on every call to smart-split.
(defun smart-split-helper (w)
  "Helper function to split window W into two, the first of which has
80 columns. Recurse on the second while it is wide enough to split again."
  (when (> (window-width w) (* 2 81))
    (let ((w2 (split-window w 82 t)))
      (smart-split-helper w2))))

(defun smart-split ()
  "Split the frame into 80-column sub-windows, and make sure no window has
fewer than 80 columns."
  (interactive)
  ;; nil means the currently selected window.
  (smart-split-helper nil))

(smart-split)

The smart-split function splits the Emacs frame into as many 80-column windows as will fit. A very portable solution.

JSON++: A JSON parser for C++

I wrote a very simple JSON parser in C++ a while ago. It’s just a pair of header and source files, thus very easy to use (compile the .cc file and link with your source). Check it out here.

Here are a few examples:

string teststr(
        "{" 
        "  \"foo\" : 1," 
        "  \"bar\" : false," 
        "  \"person\" : {\"name\" : \"GWB\", \"age\" : 60}," 
        "  \"data\": [\"abcd\", 42]" 
        "}" 
               );
istringstream input(teststr);
Object o;
assert(o.parse(input));
assert(1 == o.get<long>("foo"));
assert(o.has<bool>("bar"));
assert(o.has<Object>("person"));
assert(o.get<Object>("person").has<long>("age"));
assert(o.has<Array>("data"));
assert(o.get<Array>("data").get<long>(1) == 42);
assert(o.get<Array>("data").get<string>(0) == "abcd");
assert(!o.has<long>("data"));

Caveat: It currently does not support floating-point numbers.

LINQ++: An embedded DSL for C++

LINQ has been the new hotness in Microsoft’s .NET platform. Well, you can have the same syntactic sugar in C++ without writing a new compiler. I wrote a small library (just a short header file right now) that can do some interesting things. (source available at github) The following snippets from the companion unit test show a few:

// Count the number of people older than 30
cout << from(guests)
        .where(&_1 ->* &Person::age > 30)
        .count();

// combine the people older than 30 with the person with name
// "joe" into one table.
DataSet<vector<Person> > results =
        insert(
                from(guests)
                .where(&_1 ->* &Person::age > 30))
        .into(
                from(guests)
                .where(&_1 ->* &Person::name == "joe"));

// select the age column from the previous table.
shared_ptr<vector<int> > ages = results
                                .select<int>(&_1 ->* &Person::age)
                                .get();

It should work with all STL-compatible sequence containers and requires the boost library. You can chain the clauses to form complicated queries.

I wrote it up on the shuttle from work to home. Hopefully I can find some time to polish it up and make it actually useful.

Coroutine and continuation in C

An article about the EVE Online game and Stackless Python got me thinking about how to implement coroutines and continuations in C. After a little research, I found a simple implementation (without using longjmp!).

Coroutines can be thought of as cooperative multithreading, sometimes also known as user threads. Two or more procedures can yield the CPU to each other in the middle of their execution, so that they appear to run concurrently. It’s a very well-known feature in more academic programming languages such as Scheme and other Lisp dialects that have the call/cc (call-with-current-continuation) mechanism. A continuation is a way of representing “the rest of the execution” at a certain point in time.

Here’s the C implementation of two functions that run “concurrently”. First, the required headers:

#include <signal.h>
#include <stdio.h>
#include <ucontext.h>

Because the two routines need to cooperate, they need to communicate with each other. I will just use a global variable to signal whose turn it is. The volatile keyword is needed to prevent a possible compiler optimization. I will explain later.

// Signals which routine should run.
volatile int g_turn;

Let the first routine print the elements of the array {1, 2, 3}, each on its own line.

void routineOne(ucontext_t* self, ucontext_t* other) {
    int numbers[] = {1, 2, 3};
    int i;
    for (i = 0; i < sizeof(numbers)/sizeof(int); ++i) {
        printf("Routine one: %d\n", numbers[i]);
        // Call other with current continuation
        if (g_turn != 1) {
            g_turn = 1;
            swapcontext(self, other);
        }
    }
}

A few things to notice here. ucontext_t is a struct that stores a user thread’s execution state, including its stack pointer and instruction pointer. We pass two contexts to the function, one for itself and one for the other routine. This is very similar to the continuation-passing programming style in functional languages such as Haskell. The swapcontext() call stores the current state of execution in its first argument and switches to the state stored in its second argument. Notice that in this function, there is only one assignment to g_turn. The compiler does not know the semantics of swapcontext(). To the compiler, it’s just a function, and there’s no way to know it switches execution context. Therefore the compiler may assume g_turn never changes within this function after the first assignment, and replace all subsequent references to g_turn with the constant 1. That’s why we need to use volatile in the declaration.

We let the second routine print the array {-1, -2, -3}. It’s completely symmetric to the first one.

void routineTwo(ucontext_t* self, ucontext_t* other) {
    int numbers[] = {-1, -2, -3};
    int i;
    for (i = 0; i < sizeof(numbers)/sizeof(int); ++i) {
        printf("Routine two: %d\n", numbers[i]);
        if (g_turn != 2) {
            g_turn = 2;
            swapcontext(self, other);
        }
    }
}

Unlike Stackless Python and most symbolic or functional languages, C is a stack-based language. The main function needs to set up a stack for each user thread before starting them.

int main() {
    // Continuations
    ucontext_t cont_one;
    ucontext_t cont_two;
    ucontext_t cont_main;

    // one stack for each thread
    char stack_one[SIGSTKSZ];
    char stack_two[SIGSTKSZ];

    // Initialize the continuations. getcontext() is called first because
    // it fills in the whole struct and may overwrite the stack and link
    // fields; those are set afterwards, just before makecontext().
    getcontext(&cont_one);
    cont_one.uc_link = &cont_main;
    cont_one.uc_stack.ss_sp = stack_one;
    cont_one.uc_stack.ss_size = sizeof(stack_one);
    makecontext(&cont_one, (void (*)())routineOne, 2, &cont_one, &cont_two);
    getcontext(&cont_two);
    cont_two.uc_link = &cont_main;
    cont_two.uc_stack.ss_sp = stack_two;
    cont_two.uc_stack.ss_size = sizeof(stack_two);
    makecontext(&cont_two, (void (*)())routineTwo, 2, &cont_two, &cont_one);
    g_turn = 0;

    // Call routineOne with current continuation. Continue from here
    // after routineOne finishes.
    getcontext(&cont_main);
    if (g_turn == 0) {
        setcontext(&cont_one);
    }
    return 0;
}

Run this program and you should see the following output (see note 1 below):

Routine one: 1
Routine two: -1
Routine one: 2
Routine two: -2
Routine one: 3
Routine two: -3

Hopefully you’ve found something interesting in this article. Now an exercise for the reader: If you step through this program in gdb, what do you think will happen? Will gdb get confused? Can you step from main() into routineOne() and then into routineTwo()? Try it. :-)
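As a variation on the two routines above, the same primitives can also be used for a generator-style coroutine that hands values back to its caller one at a time. The following is only a sketch under a few assumptions: it targets Linux with glibc, the names count_to_three, collect_values, and GEN_STACK_SIZE are made up for this illustration, and the stack is allocated on the heap instead of in main(). The coroutine takes no arguments, so no pointers have to be passed through makecontext().

```c
#include <stdlib.h>
#include <ucontext.h>

/* All names below are hypothetical and exist only for this sketch. */
#define GEN_STACK_SIZE (64 * 1024)

static ucontext_t g_caller;     /* where the generator yields back to */
static ucontext_t g_generator;  /* saved state of the generator */
static volatile int g_value;    /* value handed over at each yield */
static volatile int g_done;     /* set once the generator has finished */

/* Coroutine body: yields 1, 2, 3 and then finishes. */
static void count_to_three(void) {
    int i;
    for (i = 1; i <= 3; ++i) {
        g_value = i;
        /* Yield: save our state and resume the caller. */
        swapcontext(&g_generator, &g_caller);
    }
    g_done = 1;
    /* Returning from here resumes uc_link, which is g_caller. */
}

/* Runs the generator to completion and stores each yielded value in out.
 * Returns the number of values stored, or -1 if the stack allocation fails. */
int collect_values(int *out, int capacity) {
    int count = 0;
    char *stack = malloc(GEN_STACK_SIZE);
    if (stack == NULL) {
        return -1;
    }

    /* getcontext() first, then the stack and successor, then makecontext(). */
    getcontext(&g_generator);
    g_generator.uc_stack.ss_sp = stack;
    g_generator.uc_stack.ss_size = GEN_STACK_SIZE;
    g_generator.uc_link = &g_caller;
    makecontext(&g_generator, count_to_three, 0);

    g_done = 0;
    for (;;) {
        /* Resume the generator; control comes back here on each yield
         * and once more when the generator returns. */
        swapcontext(&g_caller, &g_generator);
        if (g_done) {
            break;
        }
        if (count < capacity) {
            out[count++] = g_value;
        }
    }

    free(stack);  /* safe: we are back on the caller's stack */
    return count;
}
```

Calling collect_values(out, 8) from main() with an int out[8] should return 3 and leave 1, 2, 3 in the first three slots. The caller drives the loop here, which is the main difference from the symmetric pair of routines in the article.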


  1. This program runs fine on Linux, but unfortunately it causes a bus error on Mac OS X. I’m not sure why. Perhaps there is some issue in the Mac OS X system library.

Building Thrift

Thrift is Facebook’s cross-language data-exchange and RPC library, similar to Google’s Protocol Buffers. In my opinion, it also seems to be better (supports more data structures and programming languages) than Protocol Buffers. Unfortunately, its documentation isn’t very comprehensive, and many people have trouble getting it to work. On Ubuntu/Debian Linux, besides the requirements listed on the official homepage, you need the following packages to compile Thrift:

libboost-dev
python-dev
ruby1.8-dev
byacc
flex

Hopefully the documentation will be more complete after Thrift comes out of the Apache Incubator.

Writing Tests First

The test-driven development (TDD) methodology advocates the following practice (in this order):

  1. Write tests for the feature you want to implement.
  2. Watch the tests fail.
  3. Write enough code for the tests to pass.
  4. Refactor your code.

I usually don’t care about the order in which the tests and the production code are written. I am used to a more traditional approach: write the code, and then the tests. Recently I realized one big benefit of writing tests first (in addition to all the other benefits the TDD advocates have been talking about). Writing the tests first and watching them fail makes sure the tests are indeed working, and after you write more code to make the tests pass, you can be sure that the new code is indeed doing the work. If you write the production code first and then write the tests, the tests pass, but then you cannot be sure whether your code is indeed correct or your tests are broken (passing when they shouldn’t).

In a few places in one of my personal projects, I mistakenly used Java’s assert statement instead of JUnit’s assertTrue() function. assert is only checked when assertions are enabled (for example with the JVM’s -ea flag), which works much like a debug mode, so those tests ended up useless.

Posted some photos from welcoming the torch in San Francisco

I kept forgetting to post a link here: click here

We Are Not Brain-Washed

This is from my post in an email discussion with some coworkers, edited for a more general audience.

Many Americans believe that people who grow up in China have been brainwashed. There were many times during the torch relay event that the phrase “because you are brainwashed” caused a conversation to end abruptly. Back when I was a college student in China, I disliked the government as much as any of you here do, probably more. The same is true for many other students. If the government intended to brainwash the students, then I can assure you it wasn’t very successful. I was thoroughly disgusted by many things, and I used to organize protests against the school authorities. Then I came to the US, both because the US has some of the world’s best universities and because I disliked the Chinese government. I studied for 5 years in a PhD program here and have worked for less than one year. During these 6 years, my attitude toward the Chinese government has gradually changed. I started to understand why things are the way they are in China, and many of the things I hated became understandable, not because the Chinese government can remotely brainwash me from the other side of the ocean but because I now have a basis for comparison. Don’t get me wrong. I am not saying that China is better than or anywhere close to the US. But I do see many of the things I disliked in China happening here, sometimes in more subtle ways, sometimes to a lesser extent. Seeing that the US, with its abundance of wealth and resources, a relatively small population, the strongest military force, and strong international influence, still has so many problems, it seemed to me that the Chinese government had been doing a decent job managing the country. There are plenty of Chinese people who support the government on many issues and policies, and believe it or not it is most likely because their lives are improving rapidly, not because of government propaganda. Sure, there is a lot of propaganda in the Chinese media, but it is so superficial and obvious that it largely gets ignored or made fun of. From this perspective, I would argue that the US has a more powerful propaganda machine to serve its interests and ideology (it probably started influencing me back when I was in China).

Many people I know have similar feelings. Many of those who went to San Francisco to protect the torch were also in Tian’an Men Square in 1989.

I think it is unfair and impractical to expect China to do the same things that the US does now with respect to issues like human rights, freedom of speech, etc. China has a long cultural history, but as a modern country, it has less than 60 years of history, while the social, governmental, and legal systems in the US have evolved for hundreds of years, not to mention China’s huge population and relatively small area of farmable land. Whenever a problem about China is discussed, there is always someone who comes out and says “Simple. Why not just let people vote for a decision?” It’s not that simple. Our laws have huge holes; different ethnic groups can have serious conflicts because of religious and cultural differences; our government is immature and afraid of uncertainty; our whole social system is too fragile and doesn’t have enough buffer to survive instability. Fixing these problems takes time, and we’ve come a long way. Voting is not a trivial process, otherwise there wouldn’t be so many debates in the US about the procedure and machinery of voting. Other governments can easily point fingers at China only because they don’t have to solve China’s problems. One example comes to mind: the one-child policy. It had been criticized for a long time by westerners as a human rights violation. Reagan questioned that policy when he visited China, but the conversation ended when Deng said the restriction could be lifted if the US could help by accepting 10 million Chinese immigrants per year. In the city where I grew up, I know a number of families who had two or more kids. They paid a fine, and did not get the monthly single-child stipend from the government. It’s that simple. But it would not surprise me if there were government officials in certain places who enforced this policy in ways that violated human rights and created tragedies. In the 80s and 90s, and even now, the low-level government officials in some rural areas didn’t get a chance to receive much education because of the Cultural Revolution. It’s a tragedy of that whole generation, a tragedy for both the victims and the wrongdoers. In the same way, I don’t doubt there are many Tibetans whose families suffered great pain and loss in certain periods, just like many people in other parts of China. They have my greatest sympathy. However, such tragedies are usually exaggerated to sound like systematically planned crimes in order to serve political goals.

I grew up in Yunnan province (next to Tibet), where many Tibetans live, so I’ve visited some Tibetan monasteries. My wife travelled across a large region of Tibet, from Lasa all the way to Everest, talked to many Tibetans, and lived in Tibetan homes. I don’t want to bring in too much opinion. Let me just say that from our experience we do not think the government is against Tibetan culture or has any plan to reduce the Tibetan population: Tibetans don’t have to pay tax (although many of them still choose to contribute much of their wealth to the monasteries); large amounts of money are spent maintaining the monasteries; Tibetans receive enough of a stipend that they can have a decent life without working (such a social benefit doesn’t exist in other parts of China); Tibetans can have 3 children (or more for a small fee) instead of one. My wife was also warned when entering Tibet that if she got involved in any kind of conflict with Tibetans, the local government and police would most likely not help her, to avoid stirring up tension between different ethnic groups.

We all agree that China has problems: human rights, freedom of speech, etc. But I don’t believe there is such a thing as one country or government helping to improve human rights in another country. A country only helps itself. The US government only supports freedom and self-determination when it serves its national interest. I’m not accusing the US; this probably applies to all governments. I have a pessimistic view of international politics, so when a government is paying money for something happening elsewhere, I always doubt its intentions. According to declassified US government documents, the Dalai Lama had been on the CIA payroll until the mid 70s, when the US and China established diplomatic relations; then he got transferred to another organization with the phrase “human rights” in its name. The same set of documents also shows that the US had been training a Tibetan army in Colorado and dropping them back into Tibet as guerrilla fighters. “Human rights” sounds so good and is so widely applicable that it is the most convenient phrase to use when a government needs to explain to taxpayers why their money is used to help overthrow another government. Yes, there are problems in the Chinese government; we all acknowledge that. There is no incentive for a Chinese person to hide the government’s shortcomings. After all, a better government means a better China and better lives for Chinese people. However, I believe the Chinese people have enough wisdom and courage to solve their own problems. We know the US government’s capability of “introducing” democracy into another country; there are plenty of examples to look at. Thanks, but no, thanks. China is unique, and there is more than one way to democracy.

I hope people from different nations don’t accuse each other of being “brain-washed” simply because they have different opinions or are not expressing their opinions clearly due to language or cultural barriers.

I believe most Americans who support “Free Tibet” have good intentions and have very good reasons for their attitude. But be sure to do enough research to make sure your good will is not misused by others. (Note that this is a suggestion, not an assertion.) I’m not trying to change anyone’s opinion; I just want to let others know what people with backgrounds similar to mine might think.

“Free Tibet” on the Golden Gate Bridge

See this link.

Quoted from a Digg comment:

Did anyone bother to tell them that Tibet is within the territorial grounds of mainland China and NOT California? Seriously, I think our Elemetary schools are failing us!

Joel Spolsky’s Talk at Yale

Joel Spolsky (author of Joel on Software) gave a talk at the Department of Computer Science at Yale last year. I had already graduated, so I wasn’t there. The transcript is an interesting read: Part 1, Part 2, Part 3.