Posts archived in English_Essays

Here is a rather funny picture I found on Digg today. It is called the Best Phishing Email, Ever. Funny as the letter is, has anyone actually noticed that it says “G mail” or “Gma il” instead of “Gmail”? (There is one extra blank between the letters “G” and “m”, or between “a” and “i”.)

gmail_scam.jpg

Last year, when I interviewed at Google, I reported this bug to the Gmail team. (Sad to say, they didn’t take it very seriously, as it’s still there now.) Maybe they can argue that it’s a feature, but let me take five minutes to explain why it happens. (I think it’s fairly simple and straightforward.)

Let’s start with a little background on HTML. We all know that when we send colorful text like “Google” via Gmail, the email format is actually HTML. Here is an important aspect of HTML: “An HTML user agent should treat end of line in any of its variations as a word space in all contexts except preformatted text.” (Page 20, RFC 1866) That is to say, a string in an HTML source file like “a\nb\nc” renders identically to “a b c”. That is why, confusingly, we have to use <br/> to make a new line within a paragraph of HTML text, while “\n” is equivalent to a white space in most cases.
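As a rough illustration of that rule, here is a minimal Python sketch that mimics how a browser collapses line breaks in ordinary (non-preformatted) text. The function name collapse_whitespace is my own invention, and real user agents handle many more cases (tags, preformatted blocks, CSS), so treat it as an approximation only.

```python
import re

def collapse_whitespace(text: str) -> str:
    """Approximate inline rendering: any run of spaces, tabs, or
    line breaks is displayed as a single word space."""
    return re.sub(r"[ \t\r\n]+", " ", text)

rendered = collapse_whitespace("a\nb\nc")
print(rendered)
```

The newline never survives as a line break; it only survives as a space, which is exactly what makes a stray “\n” visible inside a word.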

Now let’s go back to Gmail. When I sent myself an email containing a colorful string “Google” via Gmail, I got “Googl e”. Viewing the HTML source, we can actually find a “\n” between those letters. For example, this is a piece of HTML (embedded in JavaScript) excerpted from the Gmail source for this colorful Google:

\u003cfont color\u003d\"#000099\"\>G\u003c/font\>\u003cfont color\u003d\"#ff0000\"\>o\u003c/font\>\u003cfont style\u003d\"background-color:#ffffff\" color\u003d\"#ffcc00\"\>o\u003c/font\>\u003cfont color\u003d\"#3333ff\"\>g\u003c/font\>\u003cfont color\u003d\"#33cc00\"\>l\u003c/font\>\u003cfont color\u003d\"#ff0000\"\>e\u003c/font\>\n \u003c/div\>\n\u003cdiv\> \u003c/div\>\n\u003cdiv\>

Here \u003c is “<” and \u003d is “=”. Without the stray “\n” in this line, the result would be “Google”. So why is there an extra “\n” here? Who played this trick? The answer is simple: Gmail. For some reason, Gmail breaks a long line in the HTML source into multiple lines before sending the email out. (I haven’t figured out the rule that Google uses to break lines in the HTML source.) After several trivial experiments, like sending mail from Gmail to Hotmail and vice versa, I am now pretty sure the problem is caused by Gmail’s automatic line-breaking strategy. That is to say, the Gmail client automatically inserts a newline (“\n”) into the HTML source, and this causes the “visual bug”. Actually this bug is quite easy to fix: for instance, just break the line at the first blank after the tag name, like:

<span
style="color: rgb(255, 0, 0)">red</span><span
style="color: rgb(0, 255, 0)">green</span>

instead of, say,

<span style="color: rgb(255, 0, 0)">red</span>
<span style="color: rgb(0, 255, 0)">green</span>

or

<span style="color: rgb(255, 0, 0)">red</span><span style="color: rgb(0, 255, 0)">
green</span>

The first generates “redgreen”, and the last two give “red green”.
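To make the proposed fix concrete, here is a small Python sketch of a line breaker that only ever breaks inside a tag, at the first blank after the tag name. It is my own toy, not Gmail’s actual algorithm (which I don’t know); the names wrap_inside_tags and visible_text and the default width of 40 are assumptions for illustration.

```python
import re

# A tag is anything between "<" and ">"; the capture group keeps tags when splitting.
TAG = re.compile(r"(<[^>]+>)")

def wrap_inside_tags(html: str, width: int = 40) -> str:
    """Insert newlines only inside tags, at the first blank after the tag name."""
    out = []
    line_len = 0
    for token in TAG.split(html):
        if not token:
            continue
        is_tag = token.startswith("<")
        if is_tag and " " in token and line_len + len(token) > width:
            name, rest = token.split(" ", 1)
            out.append(name + "\n" + rest)  # the newline replaces a blank inside the tag
            line_len = len(rest)
        else:
            out.append(token)
            line_len += len(token)
    return "".join(out)

def visible_text(html: str) -> str:
    """Strip tags and collapse whitespace, roughly what the reader sees."""
    return re.sub(r"\s+", " ", TAG.sub("", html))

html = ('<span style="color: rgb(255, 0, 0)">red</span>'
        '<span style="color: rgb(0, 255, 0)">green</span>')
safe = wrap_inside_tags(html)
naive = html.replace("</span><span", "</span>\n<span")
print(visible_text(safe))
print(visible_text(naive))
```

A newline inside a tag sits between the tag name and an attribute, where whitespace is expected anyway, so it never reaches the rendered text; a newline between two tags does.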

BTW, here is a nice tip for interviewees: love your prospective employer and love their products. Eventually you will come to understand their culture and products very well. All companies are willing to hire people who actually love their culture and products (and can even find bugs :).

PS: While preparing this article, I found that the Gmail team has quietly updated the text formatting system from plain old <font> to fancy (and elegant) XHTML+CSS <span>.

PS2: http://www.opinionatedgeek.com/dotnet/tools/Base64Decode/Default.aspx is a nice online tool for decoding the base64 format.

When I was in college, I took two classes in computer science, namely “Algorithms” and “Data Structures”. These two concepts are universal in computer programs and software applications, whether on a rescued laptop or a million-dollar mainframe. Nowadays the Web has become tremendously popular and, of course, has grown significantly in scale. So are there still general concepts like algorithms and data structures for modeling the web? Here are some incomplete thoughts of mine about the computation and storage models of the web.

1. Google.

The Google search engine treats the web as a sorted list keyed by different keywords. Given some keywords, the web is sorted by relevance and the PageRank system. Google does both the computation and the storage. For general users, these lists are sorted by criteria that Google has studied extensively; we just take the result. It seems to me that this is the most successful model for users to access the web. However, a sorted list requires both a sophisticated sorting mechanism and advanced computational power. Although there is a fair number of search engines in the world, for most of them neither the “sorting quality” nor the “general coverage” is as good as Google’s.

2. Del.icio.us/YouTube/Flickr/Google Base, etc.

These and other similar sites basically organize the web, or some special media files on the web, with a tag-based system or, more precisely, an n-to-n mapping structure. All four sites in this category are leaders in their own fields. For instance, YouTube is the largest video-sharing website. I would conclude that an n-to-n mapping over shared media is suitable for social websites. We can find other successful models elsewhere.
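To show what I mean by an n-to-n mapping, here is a minimal Python sketch of a tag index: one item can carry many tags, and one tag can point to many items. The class TagIndex and the sample item names are hypothetical; none of these sites has told me how they actually store tags.

```python
from collections import defaultdict

class TagIndex:
    """A toy many-to-many mapping between items and tags."""

    def __init__(self):
        self._tags_of = defaultdict(set)     # item -> tags
        self._items_with = defaultdict(set)  # tag -> items

    def tag(self, item, *tags):
        for t in tags:
            self._tags_of[item].add(t)
            self._items_with[t].add(item)

    def items_with(self, tag):
        return set(self._items_with.get(tag, ()))

    def tags_of(self, item):
        return set(self._tags_of.get(item, ()))

index = TagIndex()
index.tag("video-1", "funny", "cats")
index.tag("video-2", "funny")
print(sorted(index.items_with("funny")))
print(sorted(index.tags_of("video-1")))
```

Keeping both directions of the mapping makes “what is tagged funny?” and “how is this video tagged?” equally cheap, at the cost of storing every link twice.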

3. Amazon, Facebook, Microsoft and Google’s Data Storage API.

According to R/WW, Microsoft announced Windows Live SkyDrive today. Facebook quietly released its Data Store API beta recently. Amazon has had its famous S3 service for a while. They all treat web storage as a lookup table. A closer look shows that all four of these data storage API sets try to let the user store heterogeneous media as “objects” and support random access via keys. For some startups this feature is critical, as storage scalability is usually an obstacle. By using these four APIs, the scalability problem is hidden or moved to these big-name companies. Treating the Web as objects will certainly simplify the storage model and remove a lot of scalability overhead. However, much like keeping all files in a single folder on a disk with the file name as the key, sooner or later users will need some search tool, index, or tag mechanism to escape the namespace nightmare. Additionally, as the meta-information has to be stored in the same system, a search will take two database accesses. Obviously developers have to do more than simply dump the data in. The bottom line is that data storage pools are not big trucks; developers have to maintain them. But I do see the dawn of grid computation here. For now this covers only storage tasks, but if we can later provide a similar interface for computational tasks, we will jump into the era of grid computation.
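The “two database accesses” point can be sketched with a toy in-memory lookup table in Python. ObjectStore, put_with_tag, find_by_tag, and the “index/” key prefix are all hypothetical names of mine; they are not the API of S3, SkyDrive, or the Facebook Data Store.

```python
class ObjectStore:
    """A toy stand-in for a key-addressed storage service."""

    def __init__(self):
        self._objects = {}
        self.reads = 0  # counts get() calls, to make the access cost visible

    def put(self, key, data):
        self._objects[key] = data

    def has(self, key):
        return key in self._objects

    def get(self, key):
        self.reads += 1
        return self._objects[key]

def put_with_tag(store, key, data, tag):
    """Store the object, then record its key in a per-tag index object."""
    store.put(key, data)
    index_key = "index/" + tag
    keys = store.get(index_key) if store.has(index_key) else []
    store.put(index_key, keys + [key])

def find_by_tag(store, tag):
    keys = store.get("index/" + tag)     # access 1: the meta-information
    return [store.get(k) for k in keys]  # access 2: the object itself

store = ObjectStore()
put_with_tag(store, "photos/cat.jpg", b"raw bytes", "cat")
store.reads = 0
found = find_by_tag(store, "cat")
print(found, store.reads)
```

Because the index lives in the same key-value store as the data, even a search that matches a single object costs one read for the index and one for the object.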

4. Yahoo! Pipe and Google Mashup Editor, FQL, etc.

There is an article about Yahoo! Pipe named “web as a database”. I would rather say that Pipe treats the web as a UNIX file with handy tools for dealing with it. Later we got the Google Mashup Editor, which is ugly but powerful (at least for me). GME is somewhat like Yahoo! Pipe but more natural for programmers. They both treat the Web as a special file (or the concatenation of several files). They provide the “operating system” on which you can run services like sorting and filtering over that particular file. They are building a sort of operating system and binary applications on a “Google Inside” (or “Yahoo! Inside”) web. FQL is another fun thing worth mentioning. It models Facebook data such as people and groups as a relational database, with FQL and the Facebook platform as the RDBMS. My conclusion in this section is that Yahoo! Pipe is like a GUI for mashup editing and GME is like a console-based editor. It’s hard to tell which one is better now. Facebook is quite aggressive in reorganizing the web’s information; the guys at Facebook are going to re-model the web in a Facebook manner via F8 and several million users.
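The “web as a UNIX file” idea can be imitated in a few lines of Python: concatenate several feeds, filter them, and sort the result, much like cat, grep, and sort in a shell pipeline. The feeds below are made-up data and pipeline is a hypothetical helper; this is not how Yahoo! Pipe or GME is implemented.

```python
from itertools import chain

def pipeline(feeds, keep, sort_key):
    """Concatenate feeds, drop unwanted items, and sort what is left."""
    items = chain.from_iterable(feeds)  # like `cat feed_a feed_b`
    return sorted((item for item in items if keep(item)), key=sort_key)

# Made-up feed items, for illustration only.
feed_a = [{"title": "Grid computing", "score": 3}, {"title": "Cats", "score": 9}]
feed_b = [{"title": "Web storage", "score": 7}]

result = pipeline(
    [feed_a, feed_b],
    keep=lambda item: item["score"] >= 5,
    sort_key=lambda item: item["title"],
)
print([item["title"] for item in result])
```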

5. SUN/SETI@home

While SUN is selling CPU power to the general public at $1 per hour, SETI@home utilizes idle CPU power all around the world. Since I haven’t done much research on computational models, I just list these two here. The former might become the future of grid computation; SUN is actually very good at grid computation. The latter is distributed computation. There are also other computation models, such as P2P computation and other decentralized models. While SUN treats itself as the CPU for the web, SETI@home treats the Web as the CPU. It’s hard to judge which is superior. SUN might be among the first companies to develop actual grid computation on the web, while the SETI@home project might be the most powerful computer on the web.

Keep in mind that there is no silver bullet for modeling the web. I do believe that anyone who wants to set up a big company needs a big picture of how to model the web and how to build that model.

Again, all the ideas here are immature and need to be refined. You are more than welcome to leave comments and suggestions.

Why I am writing blog posts in English.

The reason is elementary, my dear Watson:

0. “The British are coming!” (Just kidding)

1. I want to improve my English

2. While we wait for the Babel fish, I have to do something.


“You forgot Poland.” — George W. Bush

====Chinese Version==中文版==

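Why do I write in English?

Recently I have been writing the occasional article in English, and some readers have written to say they are unhappy about it. My native language is Chinese and most of the people who read this blog are friends, so writing in English makes reading harder for them. So why do I write in English? I could explain it in two English sentences, because my American friends are glad to see me write in English. But I owe my Chinese readers a fuller explanation: I am not doing this for the sake of it; I have my own reasons.

1. I want to improve my English through writing

My English is not good, and I have known that since I was young. In high school my English was worse than that of the kids from the city; in college it was worse than that of classmates who were studying hard for the GRE and TOEFL; I did pass CET-4 and CET-6, but that hardly counts as good; and after stumbling my way to the United States I realized even more clearly how poor my English was. So I feel I need to improve it deliberately, and writing in English helps a lot.

I subscribed to Time last year and noted down many new words and sentence patterns, but I never actually used them and forgot them almost at once. Writing in English makes me remember words and patterns, and with enough use they come to hand without effort. If you read carefully, you will notice that most of my sentence patterns come from recent issues of Time, and my quotations come from recent news on the Internet. I find imitation a fairly effective method. And since I am writing for other people to read, I have to check my text several times for grammatical errors and the like. After going through this nervously a number of times, I can now write error-free sentences without a grammar checker, which is no small progress.

2. I want more people to know me and talk with me

In IT circles everyone knows Wang Jianshuo (王建硕) and Isaac Mao (毛向辉). Both keep writing blogs in English. Their English may not be especially good, but both have many international readers. Ideas are the core; language is the carrier. Unfortunately, until the Babel fish arrives, we are all still constrained by language. Writing in English does not mean I do not love my native language; it means I want to carry what I would say in my native language to more people in another language. As for whether writing in English is fawning on foreign things, I would rather not argue. I try not to mix Chinese and English in my prose, because I think mixing the two is the worst insult to one’s native language. If you are uncomfortable reading English, just skip those posts. From now on I will try to write one or two English posts a week; you are welcome to cheer or to boo. What I hope is that writing an English blog will bring me relatively more international readers and make the exchange among readers more diverse.

Tip: you can use Google Translate to translate the English. The quality is not great, but it is enough to get the general idea.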

–My thoughts on making choices.

I have to admit that starting an article with the cliché “to be or not to be” [1] is somewhat awkward. But this actually is the topic of this article. No, I am not trying to answer the question Hamlet asked. Everyone asks the same question about different things over and over again every day, and so do I. Several bits and pieces came to my mind recently, so I am writing them down. Instead of seeking the answer to the ultimate question of life, the universe and everything, which is 42 [2], I want to figure out my principles for making choices.

Thought #0: Making a choice is itself a choice, or why only the paranoid survive.

Lots of people won’t make a decision unless they have sufficient information. But in the real world, information is necessary, yet never sufficient, for making decisions. The idea that you should never decide before you have sufficient information is common but misleading. A long period of deciding ends up hurting the outcome of the decision, because too little time is left to implement it.

In making choices, our goal is to choose the best option. However, the world is complicated and usually there is no obviously superior option, which is why information is needed to distinguish the alternatives. Keep in mind that information should be helpful rather than misleading. Sometimes conflicting information leaves people lost and makes the decision-making procedure very painful. So within the decision-making process one is still making decisions, like “should I accept this information?” or “should I wait for more information?”. I would call this procedure “meta-decision making”.

My idea here is: never let meta-decision making eat up the time for the actual decision. More than once I’ve seen someone stagger before an opportunity and hesitate before deciding. Needless to say, in the end they made a decision no better than a spin of the roulette wheel. There is a famous saying that only the paranoid survive. The paranoid usually make their decision at the very beginning and hold on straight to the end. There is no meta-decision making for them. Sometimes they make a worse decision, but sometimes they make a surprisingly superior one, and they survive. So please focus on the decision itself, and do not let meta-decisions murder your decision.

Thought #1: Occam’s razor, or why more is less.

Once upon a time in my life, I had three pretty good offers, and I had to choose one of them in the end. Frankly speaking, I had never imagined that. I began to realize that more is not always better. Sometimes we do need an Occam’s razor [3]. Why is that?

The reason was that I felt satisfied with every one of the choices, which means I could not simply nuke any of them. I shouldn’t keep mentioning my past achievements, but please allow me to explain briefly. My first choice was attending Peking University for graduate study. Before taking the graduate entrance exam, I just wanted to give it a try; I didn’t want anything more than an exam score. It turned out, amazingly, that I ranked first among all the students in that major, one of my favorite majors: bioinformatics. My second choice was Google China. At that time, Tina (I guess she is a senior assistant to Dr. Kai-fu Lee) told me that I had a 99.9% probability of getting an offer from Google China. Meanwhile, I got the offer here, at Washington University. Well, some distinguished students could probably decline all of these and choose Stanford or MIT. For me, however, all three were really, really good. I had my beloved girlfriend studying at Peking University at that time; Google China was (and is) shining and flourishing; I wanted to stay in Beijing, as lots of my relatives and friends were there; I wanted to found my own startup in Zhongguancun with some friends there, and Web 2.0 was a buzzword at that time; the USA is a free land and the major was computer science, my dream major; my advisor was (and is) doing excellent research in his field; the professors at Peking University were quite nice to me; staying in Beijing would definitely be better for my parents. Gee, there were tons of pros and cons in my mind at that time, all twisted together. As a result, I got serious insomnia and was in a blue funk over this decision. I would rather have hidden under a rock.

Then my uncle and my advisor gave me the Occam’s razor. My uncle suggested that I shouldn’t think too much about other people’s opinions, and my advisor simply told me that I could still choose Beijing and Google in the future. I’ve noticed that, unlike me, someone made a different decision [4]. There is no standard Occam’s razor, I would say. I absolutely admire him if he didn’t get insomnia while making that decision :). I am saying that more is not better not because I held three such good offers and want to show off; I just want to say that keeping life simple and stupid is indeed very necessary. For more details about why more is less, I recommend a Google Video [5] to you guys.

Thought #2: Murphy’s Law, or how to use a greedy algorithm.

This is about deciding between a worse choice now and a better choice in the future. Some people will take a risk for an opportunity in the near future that has an 80% probability of coming through and is 20% better than the current one. It sounds perfect, right? Since you can get a better one with a relatively high probability, why bother with the current one? Now let’s do some simple mathematics. The expected outcome of the future opportunity is 80% * (1+20%) = 96%. Boo, it’s less than 1 (the value of the current opportunity), so why not hold on to the current one?

Most people, if not all, are very optimistic about future opportunities, and this 80-20 principle is universally acknowledged. But simple mathematics reveals that one should never be so optimistic as to bet on the future unless it is at least 25% better than the current option (at an 80% chance, 80% * (1+25%) = 100% only just breaks even). In fact, in my opinion, 25% is not enough. If we take into account the time wasted waiting for the future, I won’t bet on it unless it’s at least 30% better than the current one.

I am not trying to persuade others to be conservative. In fact, I encourage taking risks on high-reward opportunities. But Murphy’s Law states that “things will go wrong in any given situation if you give them a chance.” [1] A future event always has a larger probability of going wrong than you expect. Therefore, if you want to be greedy, the best algorithm is to choose not the option with the best result, but the option with the best expectation. That’s what probability is for. :)
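For anyone who wants to play with the numbers, here is the same arithmetic as a small Python sketch. It keeps the simplifying assumption implicit in my calculation, namely that a failed bet leaves you with nothing and the current option is worth exactly 1; the function names are my own.

```python
def expected_value(p_success: float, gain: float) -> float:
    """Expected outcome of waiting, relative to a current option worth 1.
    Assumes the bet pays (1 + gain) on success and 0 on failure."""
    return p_success * (1.0 + gain)

def break_even_gain(p_success: float) -> float:
    """Smallest gain at which waiting is as good as taking the current option."""
    if not 0.0 < p_success <= 1.0:
        raise ValueError("p_success must be in (0, 1]")
    return 1.0 / p_success - 1.0

print(round(expected_value(0.8, 0.2), 2))   # the 80% / 20% case
print(round(break_even_gain(0.8), 2))       # the gain needed to break even at 80%
```

If a failed bet still left you with something, the break-even gain would be lower, so this is the pessimistic end of the estimate.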

Thought #3: No bargain choice, or don’t catch the deal if you don’t want it.

Some people decide to do something not because they need it, but because it is easy to do. In other words, they want to catch the deal. For instance, a friend of mine had two choices: one was joining a big company as an intern; the other was going to the US for graduate study. The former offer would delay his admission by half a year and, even if accepted, wouldn’t help his graduate study here much. However, he preferred the first one because he thought the latter was “difficult” for him at that moment. The former choice is therefore like a bargain: you can live without it, but if it comes along, you just grab it.

I am going to say “no” to bargain choices. First, a bargain choice will lead you away from the main road. Second, as Paul Graham pointed out, bargain choices will consume your energy [6], and you will end up controlled by them. If you don’t really want it, why get it? Remember that more is less, and too many bargain choices will degrade your vision in making choices.

I’ve put all four thoughts here. If you have other principles or ideas worth sharing, why not leave a comment? ;)

PS: I am not an expert in making choices per se; here I just summarize my thoughts on the subject. I would be very glad if someone could help translate this article back into Chinese, as I really have no time to do it.

References:

[1]: To be or not to be in Wikipedia
[2]: The ultimate answer
[3]: Occam’s Razor
[4]: http://blog.wangjunyu.net [A Google China employee's Blog, recommended]
[5]: The Paradox of Choice — Why more is less.
[6]: Stuff by Paul Graham

As an ESL (English as a Second Language) student, I usually have a fear of writing articles. Nevertheless, I have to write about one article per week, either to learn English or to record my ideas. For many people in China, the killer applications are Word and Kingsoft Ciba. They simply type a Chinese phrase into the electronic dictionary, copy and paste the English word, and run a grammar check in Word. Once all this is done and Word stops reporting spelling and grammar errors, they feel a grand sense of achievement. I used to be one of them.

Meanwhile, as a Linux diehard, I have an emotional dislike of M$ products. It seemed to me that the only way out was AbiWord or OpenOffice. I’ve used both for a while. Yet I have to say that they are helpful but not perfect. To use them, I have to prepare a plain text file, which is inconvenient when I am working on a TeX file. On Mac OS X, the other problem is that I have to install X11. Don’t get me wrong: *nix is industrial-strength and designed to do everything solely with the shell. (Well, WoW is the last thing on my mind.)

After some painful Googling, I now have at least four tools that help with ESL writing.

1. GNU Aspell.

GNU Aspell is a free and open source spell checker. It supports spell checking for source code, script comments, and TeX files, as well as HTML web pages and email. Aspell offers both an interactive and a batch mode. It has several advanced features that are missing in both M$ Office and OO, such as text-file-based user-defined dictionaries and “sounds like” suggestions (e.g., know and no). GNU Aspell is definitely for literate programmers or PhD students who want to write elegant code comments and academic articles.

2. GNU diction

GNU diction originated from the diction command on AT&T UNIX. It is actually a rule-based style checker. I’ve read the code thoroughly and found that almost every rule comes from the book “The Elements of Style” by William Strunk. That is to say, you now have “The Elements of Style” in your pocket. Please note that the simple grammar checker in Word has nothing to do with style checking. GNU diction is a charming complement to Word/OpenOffice if you insist on using them.

As it is rule based, it sometimes flags text even when your usage is in fact correct. As D.E. Knuth mentioned in “Mathematical Writing”, the analysis of diction is quite superficial. “However, said Don, these programs are kind of fun. And they do provide an excuse to read the document from another point of view. Even if the analysis is wrong it does prompt you to re-read your prose, and this has to be a good thing”.
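To give a feel for what “rule based” means here, below is a tiny Python sketch of a phrase checker. The three rules are examples I made up in the spirit of such a checker; they are not copied from GNU diction’s rule database, and check_diction is a hypothetical name.

```python
import re

# Hypothetical rules: wordy phrase -> shorter suggestion.
RULES = {
    "in order to": "to",
    "due to the fact that": "because",
    "at this point in time": "now",
}

def check_diction(text: str):
    """Return (offset, phrase, suggestion) for every rule that matches."""
    findings = []
    lowered = text.lower()
    for phrase, suggestion in RULES.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, lowered):
            findings.append((match.start(), phrase, suggestion))
    return sorted(findings)

for offset, phrase, suggestion in check_diction("In order to win, act now."):
    print(f'{offset}: "{phrase}" -> consider "{suggestion}"')
```

A checker like this knows nothing about context, which is why it will happily flag a phrase that is perfectly fine where it stands.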

3. GNU Style

GNU style is included in the GNU diction package. It reports the readability of your article using several well-known indexes. Native speakers use these to improve the readability of an article. For ESL students, however, these indexes can be read as a writing level, in terms of the school grade an average American needs to understand your article. In my opinion, we ESL students should avoid overly naive words and overly simple sentences in technical writing. Certainly, don’t use a million-dollar word where a one-dollar word will do. Yet for ESL students, trying out some new and sophisticated words will eventually boost writing ability.
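As an example of such an index, here is a Python sketch of the Automated Readability Index, which maps characters per word and words per sentence to an approximate US school grade. The word and sentence splitting below is my own naive approximation, so I am not claiming it reproduces the numbers GNU style prints.

```python
import re

def automated_readability_index(text: str) -> float:
    """ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43"""
    words = re.findall(r"[A-Za-z0-9]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text needs at least one word and one sentence")
    chars = sum(len(word) for word in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43

simple = "The cat sat. The dog ran."
dense = "Sophisticated vocabulary substantially complicates comprehension."
print(round(automated_readability_index(simple), 2))
print(automated_readability_index(simple) < automated_readability_index(dense))
```

Very short, simple text can score below zero, which is a reminder that the index is a rough gauge rather than a literal grade.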

4. LanguageTool (GPLed)

It is an open source, Java-based language checker for English and other languages. I began to use it recently. It’s better than the built-in grammar checker in OpenOffice. Moreover, it supports both a CLI mode and a web mode. This is the missing grammar-checking tool on the Linux platform.

I remember that when I was a college student, I struggled to write English articles with M$ Word or OpenOffice. My personal experience with English writing and the M$ Word grammar checker taught me that we should never, ever rely on the f**king damn grammar checker for quality. As a rule of thumb, the best way to improve ESL writing skill is to write and to practice.

BTW: In preparing this article, I used vim, aspell, diction, style, languagetool and other tools on the Linux and Mac platforms.

I came across a story on Solidot (a Chinese version of Slashdot) this morning: one of the developers of the Fcitx project, one of the top open source (GPLed) Chinese input method projects on the *nix platform, has finally decided to terminate it. English-speaking users might not fully realize the importance of an IME. For Linux users living in East Asia, the IME is more or less equivalent to the keyboard. IMEs are so critical to a software platform that Google has also developed a Chinese IME recently. So why did this developer decide to terminate such a significant (and well-known) project? According to the main page of the project, the core developer got pissed off by another developer who criticized the project for its “poor and ugly” coding style. So my question arises: can we developers in the open source community criticize others’ projects on the basis of “coding style”?

Further reading showed me that criticizing others’ coding style is very common in the open source community. I am not an expert in coding per se; however, I have at least 10 reasons why we shouldn’t wage a holy war over other developers’ coding or design style.

1) In the open source community, code is for running or for fun, not (merely) for reading.

Traditionally, most people thought that the skills required to write one’s own software were so advanced that one could never hope to write one’s own code. However, advanced programming tools such as IDEs and high-level scripting languages have fundamentally changed the picture. Now lots of beginners are willing to write some code to get things done, and they are as passionate as the gurus about releasing their code to the public. Consequently, their ugly coding/design style gets criticized by others in the community as “not readable” or “not beautiful”. But what is the purpose of the open source movement? I would say that the open source movement is about sharing and freedom: you can learn from others and do whatever you want. No one in the open source community aims to write “textbook” source code. We basically write code for a specific purpose. Therefore, people should not criticize from an aesthetic perspective. After all, code is written to run or for personal fun. Code for reading is not the purpose per se; at least it is not the original purpose. So we have to live with the fact that every developer has his own coding/design style, even if sometimes the style is goofy.

2) Style is highly constrained by language features, or by the ‘native’ programming language the developer uses.

It has been quite a long time since PERL was first called the Pathologically Eclectic Rubbish Lister. However, as far as I know, lots of researchers around the world use PERL on a daily basis. In our department, many colleagues use PERL as their ‘native’ language despite PERL being somewhat write-only. Given this, it is unfair to judge others’ coding style, as style differs from language to language. For example, my ‘native’ programming language is Java. When I first read the book “Text Processing in Python”, I found the “map” operation amazing for applying the same operation to every element of an iterable container. However, in practice, I still cannot help using a “for” loop instead of “map”. As syntactic sugar differs from language to language, it usually takes quite a bit of experience before one actually picks up the right coding style in a programming language other than one’s native one. Therefore, a holy war over coding style is like a holy war between English and Spanish: vulgar and intolerant.
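Here is the kind of thing I mean, as a minimal Python sketch: the same task written with a Java-flavored “for” loop and with “map”. The function names are mine, chosen only for this illustration.

```python
def shout_with_loop(words):
    """The style a Java programmer reaches for first."""
    result = []
    for word in words:
        result.append(word.upper())
    return result

def shout_with_map(words):
    """The same operation applied to every element via map."""
    return list(map(str.upper, words))

words = ["perl", "java", "python"]
print(shout_with_loop(words))
print(shout_with_map(words))
```

Both versions do the same job; which one looks “right” depends mostly on the language you grew up with.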

3) Design patterns and coding style reflect the underlying thinking, or different design purposes, which may differ from person to person.

If you Google “code style”, you will find tons of guidelines ranging from kernel programming to CSS coding style. This is because, for a robust and collaborative open source project, a good coding style significantly reduces communication overhead as well as time wasted on maintenance. However, what if someone is going to write a system or framework from scratch?

Basically, a coding/design style somehow reflects the underlying idea. For example, I guess I am not the only one who dislikes the wrapper design of the java.io package. In order to read Unicode strings from a file, you have to wrap the FileInputStream in several objects, while in C++ or Python a single statement settles all the chaos. However, the idea of Java IO is to make the fewest assumptions about the IO stream and give the programmer the most flexibility. Therefore, criticizing code merely for using Java IO, or criticizing the design of Java IO itself, is gratuitous. Frankly speaking, I do not like the design style of Apache Struts, as it makes deploying a small system very complicated for me (that’s why we have RoR); however, Struts strictly implements the MVC Model 2 pattern and thereby makes itself powerful for large systems. Everyone can make up his/her own style on this point. Therefore, a holy war over coding/design style is like choosing between apples and pears: there is no need to gauge which one is better, as they are just different fruits of different origins.
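For comparison, this is roughly what the “single statement” looks like on the Python side, as a small sketch. The helper read_text, the file name demo.txt, and the UTF-8 encoding are my own choices for the demonstration.

```python
import os
import tempfile

def read_text(path, encoding="utf-8"):
    """Open, decode, and read the whole file; open() does the 'wrapping' itself."""
    with open(path, encoding=encoding) as handle:
        return handle.read()

# Write a small Unicode file to a temporary folder, then read it back.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("héllo, 世界")
    content = read_text(path)

print(content)
```

The convenience comes from open() assuming you want a decoded, buffered text stream, which is exactly the kind of assumption Java IO refuses to make for you.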

4) Although there may be rules of thumb for writing beautiful code, there is no unique standard.

As I’ve mentioned before, there is no unique standard in coding/design style, as you can always attack the same problem with different approaches. This holds not only for coding style, but also for the style of XML configuration files and other design-related matters. For example, when I was learning Apache Ant, I wanted to call build.xml makefile.xml, or to have a separate XML file for each target, and so on. However, it is hard to tell which design is better. In the IME case, the programmer simply uses Chinese for the tag names in the XML file. As you might know, as long as the XML is encoded in UTF-8, this is an issue for neither understanding nor program migration. However, this design was criticized as “not very i18n”. I would say that this judgment is goofy and absurd.
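A quick way to convince yourself is to feed such a file to a standard XML parser. The Python sketch below parses a UTF-8 document whose tag names are Chinese; the tag names and content are made up by me and are not taken from the Fcitx configuration.

```python
import xml.etree.ElementTree as ET

# A made-up configuration document with Chinese tag names, encoded as UTF-8.
DOCUMENT = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<配置><名称>demo</名称></配置>"
).encode("utf-8")

root = ET.fromstring(DOCUMENT)
name = root.find("名称").text
print(root.tag, name)
```

The parser treats the Chinese names like any other element names, so nothing about them stands in the way of moving the file to another program or platform.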

5) Stay foolish

As it is usually hard to say which style is better, developers in the open source community should indeed stay foolish. I do agree that arguments and discussions about a particular project are quite helpful. However, keeping the discussion polite and elegant would be more productive.

6) Never judge people by their code style.

Because this doesn’t make any sense. In the book “12 habits that hold good people back”, the author mentions a kind of person who “sees the world in black and white”. Someone who judges a person simply by coding style falls into this category. Coding style is not the whole of programming. Moreover, a poor coding style or design does not necessarily mean a lack of ability to develop the system.

7) Refactoring is a procedure, not a purpose.

Just like Feynman’s famous quote that “physics is like sex”, refactoring code is like sex too: sure, it may give some practical results, but that’s not why we do it.

In short, refactoring is a procedure that makes code more readable or easier to maintain. However, there is no reason to refactor code solely for aesthetic purposes. I can imagine that the GoF book kindles a passion, from the bottom of its readers’ hearts, to refactor every piece of code written by others. I was one of them. The GoF book will definitely help you develop a “sensitive nose” that can sniff out code smells. Applying a nice coding style or design pattern is probably good practice. Still, I insist that a nice coding style or a sophisticated design pattern is not the reason why we write code.

8) Efficiency and beauty matter, but making it work is the first priority.

When I interviewed with Google, one of the interviewers gave me a very good suggestion when I got stuck on a problem. He told me that the philosophy at Google is “first make it work, then improve it”. In the open source community, software usually exists to solve a real-world problem. Therefore, making it work is much more important than making it beautiful. I have to concede that some developers can achieve both goals at the same time. Nevertheless, for most developers, the code is usually ugly and awkward at the very beginning. If I have to choose between a piece of working but ugly code and a piece of beautiful but malfunctioning code, I prefer the former. I guess everyone except the coding-style paranoid will choose the same. The truth is (or the 80-20 principle tells us) that an ugly but working prototype costs 20% of the total time, while a pretty and not necessarily working prototype costs about 80% of the total development time. As software changes all the time, one cannot expect to have a “final” version that is both beautiful and working. Therefore, choosing working code over pretty code is wise.

9) Peer review is about finding (potential) bugs, not about coding style.

I’ve heard that in many big-name companies such as Google, peer code review plays quite an important role. I was also once an intern at Siemens. There, before you check code into the repository, a colleague will usually go through it to see if there’s something wrong. (Of course, you have to pass the unit tests before checking in.) In my experience, peer review is more about the good practice of extreme programming than a coding-style exam.

In the open source community, the scenario changes: everyone can read your code and figure out what it does. While the developer should expect feedback from the community, it should come in the form of suggestions or patches instead of fierce criticism, especially about coding or design style. Again, I would emphasize that the open source community should always be polite to contributors while being ruthless toward malicious saboteurs.

10) Instead of saying something, why not do something?

The best way to contribute to the open source community is not to say something, but to do something. For instance, if you feel uncomfortable about a project, then instead of writing a letter to the authors complaining about their poor coding style, why not just refactor the code and republish it? In my humble opinion, barking dogs seldom bite.

What’s Web 2.0 in a youth view?

At first glance, Web 2.0 looks like a buzzword for attracting the attention of both VCs and Internet users. Everybody talks about it, but few can tell what it is. For startups, Web 2.0 strategies promise the next YouTube or Facebook in their business plans. For Internet users, Web 2.0 sounds cool and fashionable. You are left in the Stone Age if you don’t have a MySpace homepage or have never visited YouTube for hilarious video clips. On the other hand, to my knowledge, none of the successful Web 2.0 companies that we might use every day, such as YouTube and Facebook, explicitly declare themselves Web 2.0 companies, although they are considered the flagships of Web 2.0. Web 2.0 also has nothing to do with technological innovation or the next-generation Internet. So is Web 2.0 hype or propaganda?


Web 2.0 is all about communication, sharing and passion for our Y-Generation

It is easy to assert that Web 2.0 is nothing but a buzzword, but that is an illusion. We use the Internet every day. In our own experience, the contemporary Web is something made for us, and we can feel that life has changed since the beginning of Web 2.0. However, even though we create it, use it, and talk about it, the million-dollar questions are still there: what is the definition of Web 2.0 in the dictionary of youth, and what’s the big deal about Web 2.0 for us? Needless to say, young people have their own definition of Web 2.0, as they experience “their” Web 2.0 every day. Although there is a very detailed definition of the term in Tim O’Reilly’s article “What is Web 2.0”, it is still hard to define what it is for youth. The study of Web 2.0 and youth is interesting enough to fill a whole book. To make a long story short, I would say here that Web 2.0 is all about communication, sharing and passion for our Y-Generation. Web 2.0 sites are our creations, our portals, our communities and our web classrooms.


Web 2.0: Created by Youth.

YouTube was founded by Chad Hurley (at age 28), Steve Chen (27), and Jawed Karim (26); MySpace was founded in July 2003 by Tom Anderson (28); Facebook was created by Mark Zuckerberg (20) in 2004 when he was a sophomore. Why is Web 2.0 more likely to be created by the young generation? To understand this, we need to look more deeply at the definition of Web 2.0. Despite the bells and whistles, Web 2.0 is nothing more than a new application platform; it is not a revolutionary technology. Actually, the key technology of Web 2.0, usually referred to as Ajax, was born in the early 2000s. Web 2.0 is an updated version of the World Wide Web. The original purpose of the World Wide Web was to make the Internet meet the increasing communication requirements around the world. Here, the key idea is to fit the communication requirements. Correspondingly, Web 2.0 is not a new technology or a new business model; it is a satisfaction of the long existing requirement on the Web. Technology and business models are secondary to satisfying users’ communication requirements. Therefore, in theory, as long as you have new ideas that cater to the communication requirements of Internet users, it is fairly easy to get started in Web 2.0, as resources have never been so accessible in terms of both investment and technical teams. In nearly every corner of the world, you can find groups or teams, with members ranging from professional businessmen to fresh graduates, talking about Web 2.0 and working toward their dreams. The only differences are the content of their websites and their target users. Frankly speaking, bubbles are everywhere on the contemporary Internet. VCs won’t fund a company just because it’s cool, and you can’t provide users with excellent service for free when there is no business model. Still, thousands of passionate young people dive into the Web 2.0 ocean without caring whether the competition is overwhelming.

Web 2.0 is not a new technology or a new business model; it is a satisfaction of the long existing requirement on the Web.

Needless to say, fitting the requirements is much more challenging than setting up a website. Internet giants like Yahoo and Google tend to cover every aspect of Internet applications, but in the end they fit few of them well, as they are supposed to meet the needs of everybody. As history has told us several times, new requirements are usually discovered by the grassroots instead of the elite or the giants. From this perspective, Web 2.0 is no different from previous industrial booms. However, there is a significant factor that puts youth in the avant-garde of Web 2.0: their passion and the advantage of their age. For example, in 2006, about 76% of the Internet users in China were below 25. Moreover, new Internet users are mainly from the young generation, and they are happy to try new websites and adopt innovations. This phenomenon is observed not only in China but also in other countries like the United States and Korea. Young Internet users push the Internet atmosphere toward the young end. It is well known that Facebook and MySpace were mainly created for youth. Originally the age limit for MySpace was 16 and up, but it is 14 and up now. This more or less reflects that Internet users are younger now. As the Internet is mainly used by youth now, it is no wonder that Web 2.0 is mainly created by the Y-Generation.


Web 2.0, personal site and me-media for youth.

Now that Web 2.0 applications have been created, the next important thing is making them flourish rather than letting them perish. Like Web 1.0 websites, Web 2.0 also weaves networks, and there is nothing special in that. However, this is a participatory web now. As TIME magazine has pointed out, this is all about “You”. People didn’t realize their own value on the Internet before. It is not because people were not brilliant enough to discover their needs. The reason behind this was the lack of web service infrastructure. In the past, it was very hard to have your own website or web gallery, as you needed to know at least HTML, Flash and web programming. Additionally, very few websites provided free services like online picture management or blog systems. All of this made it very hard for users to express themselves, even though they were very willing to. However, advanced technology now brings all of these services to users via a simple registration. Now, as tons of websites are created every year that provide photo uploading, online bookmarking, video sharing or blog services for free, people have started to use the web as their new platform. Today, average Internet users can upload their content without difficulty, whether it is an eye-catching article, a hilarious video or just a personal photo. They now focus on the content instead of the irrelevant technological details. As the Web is now easy and ready to use, users have become the producers and directors of the content on the Internet. People usually use the term UGC (User Generated Content) to describe this contemporary trend in Web 2.0.

The passion of youth makes the Web2.0 so vivid and happening.

You can imagine that Web 2.0 makes the Internet look like a fast-growing organism that doubles itself every 18 months. If Web 2.0 is so vivid, what is its personality from a youth perspective? To answer this, let’s look at the personality of youth. Although there is no standard answer, when talking about youth, these words must be at the top of the list: passion, fashion, thinking differently, openness and willingness to make friends. Microsoft has a very famous slogan: “your potential, our passion”. In Web 2.0, probably the best slogan for youth would be “your passion, our potential”. Why do I say that? As I’ve said, Web 2.0 creates a new and easy-to-use platform, and users are the actors and directors. The reason why they choose Web 2.0 as the platform is partly the passion of youth and partly the wish to show off: with passion, they are willing to express themselves and contribute content to show off. Susan Ng, a Facebook user, said: “I want to tell others what I am doing”. Susan is not the only one who wants to show off on the Internet. Thousands of young people have personal web pages and write blogs. The sidebar of a blog holds podcasts, public photo galleries and video clips. All these media fall into one category: me-media. Web 2.0 has now become the me-media of youth, and we are both the producers and the consumers at the same time. The passion of youth makes the Web2.0 so vivid and happening. Therefore, in my view, the personality of Web 2.0 is passion and showing off.


We not only share, we even meet. Web 2.0 as our communities

Wikipedia describes Web 2.0 (I could not find a definition of Web 2.0 in other encyclopedias at the time of writing) as a “supposed second generation of Internet-based services”. Some typical Web 2.0 applications include social networking sites, wikis, and communication tools. It goes far beyond the simple blog or podcast system for personal use. So one question arises: why are young people on the Internet, besides expressing themselves? Actually, the motive for youth using Web 2.0 is one part showing off and one part meeting friends. As common Internet users, one of our most exciting discoveries in Web 2.0 is that our friends are on the Internet too. As Danah Boyd mentioned: “For most teens, it is simply a part of everyday life - they are there because their friends are there and they are there to hang out with those friends.” Web 2.0 is about connecting people and making it more efficient for people to communicate. A social network is no different from a real community: you should make yourself link-friendly.

But you may argue that Web 1.0 also connects people, so why is it Web 2.0 instead of 1.0 that makes communities possible? In the first place, you have to take the development of the Internet into account. As we know, a community is based on communication infrastructure such as email and instant messaging (IM). In the previous web, companies put considerable effort into building the basic communication tools so that users could get connected. Only after that could users come up with new requirements such as communities and special interest groups. Created in 2003, MySpace mainly serves teens and twenty-somethings. It has blogs, IM, mail, music videos, chat and photo galleries, and covers almost every possible communication approach on the Internet. MySpace provides every aspect of the typical personal website and thus attracts more and more users.

Get connected and stay connected.

In the second place, people were connected on the previous Internet too, but it was difficult to stay connected, as relationships on the Internet were unreal. For example, as I’ve said before, it was hard to set up a personal profile on the previous web. Therefore, no one knew if you were a dog. Now, however, in Web 2.0, since everyone has a publicly accessible web page, the public profile becomes real and connections on the web are more concrete. It is not like meeting a stranger by chance on the road and then losing touch; it is like a real community in which everyone knows each other. A report shows that MySpace took about 11.9% of the total time spent online in the United States in December 2006, miles ahead of other websites. Why? Because compared with other websites like Yahoo and MSN, it is easy to find friends on MySpace. The other mechanism that facilitates real relationships is offline communication. The Internet is a virtual community, but offline activities like parties and dating are real. On Facebook, if you are in a group, then fortunately or unfortunately you will get tons of invitations from Wednesday to Friday for parties or concerts on the weekend. Web 2.0 is a real community not only because it is related to the real world, but also because it is the natural extension of the real community onto the Internet. Web 2.0 was born to build virtual communities and turns out to be the web version of the real world.

Web 2.0, a new classroom

It is very easy to overlook the importance of these Web 2.0 sites in terms of education, as they are usually seen as time-killing websites. For example, one of my friends created a group named “I’m On Facebook When I Should Be Studying”, and it now has lots of members. Keeping youth in the classroom is now an overwhelming, enigmatic challenge. There are, however, lots of educational websites on the Internet that can be our classroom. Education no longer happens only in the classroom or library. It is also in cyberspace. If the world is really flat, as Thomas L. Friedman argues, every student in every corner of the world can benefit from Web 2.0. In fact, education on the Web brings a community of learners together into a virtual classroom. Jingxue Zhang is a student in China. He regularly checks the MIT OCW (OpenCourseWare) webpage, which provides “a free and open educational resource (OER) for educators, students, and self-learners around the world”. He uses OCW to teach himself Electrical Engineering and Computer Science. Moreover, he can discuss the material with his virtual classmates via newsgroups or email.

Play is not the only part of the youth’s life in Web 2.0

Since the birth of the Internet, education has no longer been limited by time, place, medium or instructor. However, even though we are at the center of an information explosion, ninety percent of the information we wade through will be useless, and selecting the other ten percent becomes a challenge. Web 2.0 tackles this in an entirely different way: the power of community. Wikipedia is a hypertext writing system whose content is written collaboratively by volunteers. Compared with blog systems, which emphasize personal knowledge or experience, Wikipedia highlights knowledge collaboration and sharing on the Internet. It is becoming more and more important as an online encyclopedia. Jerry Kim, an undergraduate student in South Korea, says that he usually uses Wikipedia to get the basic idea of unfamiliar terms and then follows the external links and references to teach himself concepts in Artificial Intelligence, a subfield of Computer Science. He has also used Wikipedia to prepare questions for trivia night. He has confidence in the trivia cited from Wikipedia because if it contains a mistake, someone will correct it in the blink of an eye. The basic idea of collective intelligence is that everyone has knowledge that is valuable to someone. Wikipedia and Web 2.0 are the platforms for this intelligence. Education is everywhere in Web 2.0. Even Second Life, a 3-D virtual online world that is entirely built and owned by its users, is already being used by many universities and educational institutions as a supplement to traditional classroom environments. Search engines and other Web 2.0 applications like Ask Yahoo also greatly facilitate knowledge discovery for youth. At this point, the World Wide Web is a worldwide classroom in Web 2.0.

Frankly speaking, Web 2.0 is a wide and hot topic, and so is youth. Both offer considerable food for thought. The four aspects I listed here form just one of many possible perspectives on Web 2.0 and youth culture. There is so much potential, and I really believe that the passion of youth makes the Web so vivid and that youth can benefit directly from riding the Web 2.0 wave. In the end, the bottom line boils down to one sentence: perhaps Web 2.0 is the most important force shaping the Internet and youth culture in the early twenty-first century.

About the author:

Eric You XU is an independent blogger. He writes blog posts in both Chinese and English. He defines himself as a Web 2.0 critic as well as an advocate. He received a Bachelor of Science degree from Nanjing University (China) and is now a doctoral student at Washington University. He loves writing, thinking and exchanging inspirational ideas. You can reach him at youxu@wustl.edu

Notes: This is an invited article for ITU Techcom World Horizon magazine. I personally would like to thank the editor-in-chief, Mr. George Ran Ren, for his invitation and kind support. I also would like to thank the whole Horizon team for their endeavors and contribution in making such a wonderful and fascinating magazine. Thanks, George and all the team members. Keep up the good work.

(As it is an invited article, please do NOT copy-and-paste it elsewhere. Comments of any kind are welcome. I will provide a link to the magazine once the magazine gets “out of beta”. :)

All the materials are cited from Google. I highlighted some important items that might be useful for our Chinese students.

>General information to include

To make it easier for us to determine where you might best fit within our organization we ask that you take a few simple steps to help us understand your qualifications. Following the guidelines below will ensure your resume/CV finds its way to the appropriate group more quickly, giving you a better opportunity to discuss your qualifications in person or via a phone screen.

* All resumes/CVs and supporting materials must be submitted electronically; no paper resumes will be accepted.
* PDF, HTML, or Microsoft Word documents or text formats are acceptable – or you can submit using plain text format.
* All resumes and related materials (transcripts, etc.) should be submitted in English.
* Pictures, images, or other graphics are not necessary – and are discouraged as they can slow evaluations.
* Only send essential personal information – be sure to include your name and how to contact you in the resume, not just your cover letter. Include your email, phone, and residence address. Do NOT include your gender, date of birth, age, family status, or personal identification numbers.
* It isn’t necessary to include military service you may have performed, unless it reflects some special achievements or accomplishments that you feel illustrate your qualifications for the job.
* To increase the accuracy of the information we have about you and the speed with which we’re able to reply to your submission, please keep your resume clean and simple. The use of special formatting, tables, images, multiple columns, etc., can decrease the ability to accurately review resumes. As we’ve found with Google itself, plain text works best!

>Submitting a resume – Educational background

Your resume/CV should reflect your academic achievements and accomplishments in these areas. In the education section of your resume, be sure to include the information outlined below.

* Your resume should show all post-secondary institutions attended, degrees conferred, and a cumulative grade point average (if available) for each degree received.
* Only report your educational history dating back to the university level; do NOT include elementary or secondary schooling. However, if you completed a “year abroad” program as part of your pre-university education, feel free to include this in your resume.
* Provide a brief description of any important projects you completed as part of your coursework, and indicate whether it was all your work or done as part of a team. If part of a team, indicate your own role and contributions to the effort.
* If you graduated from a university within the last five years, include a copy of your transcripts (unofficial is okay), a list showing individual coursework completed and grades received, as well as the overall grade average.

>Submitting a resume – Your Work Experience

You may be fresh out of a university, or have substantial work experience and a history of accomplishments. Either way, we want to know what skills you’ve acquired along the way. We’ll look closely at the work experience section of your resume, so the information you provide here is very important. Please follow the guidelines below carefully.

* List your experience – projects completed, accomplishments, etc. – by your position with each employer.
* Include more information than just the name of your employer and your job title. We also want to see concise, yet important, detail on your specific accomplishments and the impact your efforts had on your company.
* Rather than including all job responsibilities, only focus on those that you feel are relevant to the job for which you are applying at Google.
* If you worked while attending a university, either during the summer or concurrent with your course work, be sure to include a brief mention even if it isn’t specifically related to a potential job at Google.

>Submitting a resume – Additional Information

Here at Google, we value talent and intelligence, group spirit and diversity, creativity and idealism. Googlers range from former neurosurgeons and puzzle champions to alligator wrestlers and Lego maniacs. Tell us what makes you unique!

* Include the names and contact information of 2-3 references. These can include faculty advisors, co-workers, managers, or others who can talk knowledgeably about your skills and abilities
* Be sure to include any awards you’ve received, articles you’ve published, or conference presentations you’ve given.
* We don’t need to see copies of any awards or publications, just a reference to them.
* We don’t need copies of any written references you already have, just a mention of 2-3 individuals that can reflect on your most recent skills and experiences. Be sure to include their contact information. We will not contact your references until after we talk to you.

> What to expect from your interview

* While we’ll certainly do our best to make you feel comfortable during the interview process, we’re very interested in learning more about how you approach problem-solving. The questions you’ll be asked will be in-depth and will be intended to let us get a peek at how you think about complicated things. Many candidates find this challenging, but ultimately exhilarating. It’s your chance to show an appreciative audience exactly how much you’ve learned about your area of expertise.
* Interviews are always conducted in English and you should have a strong command of the language so you’ll be able to describe your ideas clearly. This is essential as all positions interact directly with engineers in the U.S. and other countries.
* Google’s phone screen and in-person interviews are highly technical in nature. You’ll be asked to write code during the interview itself and to speak to the technical details of your past designs and implementations. You should expect that your interviewers will have a great deal of curiosity about the specifics of your work and will ask questions about how you arrived at your conclusions. Our engineers admire and respect the work of others and are truly interested in learning more about what you’ve accomplished and how you did it.