Archive for February, 2007

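A Wholehearted Recommendation: a Friend's Blog

From now on I will try to recommend a friend's blog every so often. It encourages exchange and broadens what my blog's readers get to read. If you know of any good ones, feel free to recommend them to me.

Schuyler is a friend I met recently. His English is very good, and we always chat in English. His favorite book is The Art Of Computer Programming, a book that even some computer science undergraduates cannot get through. Lately he has been busy building a robot, whereas many people in high school did not even know what a microcontroller was. He is also very interested in satellites and spaceflight, and his views are insightful. Go take a look at his blog ( http://www.schuyler.cn/ ). By the way, he set up the blog himself with DasBlog. It does not look perfect yet, mainly because few people in China use DasBlog, and he configured it by working through the documentation on his own, which is no small feat!

Of course, all of that is secondary. The key point is: he is only in his second year of high school! If you are interested, go have a look => http://www.schuyler.cn/ :)

Every generation produces its own talents! Sometimes the difference between people comes down to breadth of vision.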

Comments (4)

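A Few Words on Spam

I have five email addresses: a wustl.edu one for work and Gmail ones for personal use, though in the end I read all of them through Gmail. I used to have little sense of self-protection and posted my addresses everywhere, so now I get at least 90 spam messages a day. Because my username has a "." in the middle and is not made up of English words, some dictionary-attack spam never reaches me, so comparatively speaking I am not among the worst-hit targets.

I have used Gmail for a long time, and its spam filtering really is very good. I recall only one false positive (a legitimate message marked as spam), and that was because the message contained nothing but a "hello" and an image. In my experience, Gmail's false negatives (spam classified as legitimate mail) fall into the following types:

1. Plausible-looking Chinese (non-English) emails (I have seen at least ten):

> To the head of finance at your company (factory):
>
> Greetings!
>
> Our company is a joint-stock enterprise specializing in dedicated business services for domestic and foreign companies. We can issue all kinds of invoices on behalf of any enterprise (mainly including: unified commodity invoices for Guangdong Province and other provinces nationwide, VAT, machinery, construction, advertising, transportation, service, and tax-bureau-issued invoices). Our customers are found all over the country. Guided by our business philosophy of "integrity, efficiency, and pragmatism" and our "reliable and fast" working style, we keep forging ahead and keeping pace with the times, and we are dedicated to providing more comprehensive and attentive service to customers nationwide. We warmly welcome your inquiries!!
>
> Wishing you prosperous business!!!

Gmail's support for Chinese is poor to begin with, let alone its spam detection for Chinese. Besides, these messages really do not look like spam. Of course this kind of spam has a fatal weakness: its keywords. Set up a filter on "发票" (invoice) and "财务负责人" (head of finance) and you will catch essentially all of it. (Early on, searching for "发票" did not even find this message, so no wonder Gmail could not recognize it.)

2. In Gmail's early days, altered keywords, such as writing Software as S0ftware (this has mostly disappeared now):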

Need S0ftware?
OEM software - throw packing case, leave CD, use electronic manuals.
Pay for software only and save 75-90%!

Discounts! Special offers! Software for home and office!
TOP 1O ITEMS.

$79 Microsoft Windows Vista Ultimate
$79 MS Office Enterprise 2007

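3. No text at all, just a few links to external images
The image links in these messages point to dynamic content. That is, if you request an image from one of those addresses, the spammer learns that your mailbox is valid, and then you get even more spam. Once, in the spirit of "let the spam come even harder", I went out of my way to examine each of those image links one by one. As you can imagine, the spam did indeed come harder...

Lately most of the spam that slips through takes this form. Gmail's text-based filtering currently cannot catch it, and roughly 50% of these messages end up in my inbox. When a message consists entirely of external images, Gmail can decline to display the images but still cannot identify the message as spam. As for what to do about it, do any readers have a good approach?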
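The three kinds of false negatives listed above can each be approximated with a very simple check. The following is only a rough sketch under stated assumptions, not how Gmail works: the function names, the look-alike character table, and the 20-character threshold are all made up for illustration, and only the two Chinese keywords and the S0ftware example come from the post. Note that the image-only check would also flag the one legitimate "hello plus a picture" message mentioned earlier, so it is a heuristic to combine with other signals, not a verdict.

```python
from html.parser import HTMLParser

# Keywords the post suggests filtering on ("invoice", "head of finance").
INVOICE_KEYWORDS = ("发票", "财务负责人")

# Hypothetical look-alike table; real spam uses many more substitutions.
LOOKALIKES = str.maketrans(
    {"0": "o", "1": "l", "3": "e", "5": "s", "@": "a", "$": "s"}
)


def has_invoice_keywords(body):
    """Type 1: plain substring match on the fixed keywords."""
    return any(keyword in body for keyword in INVOICE_KEYWORDS)


def normalize(text):
    """Type 2: lower-case the text and undo common character swaps."""
    return text.lower().translate(LOOKALIKES)


def has_keyword_after_normalizing(body, keyword="software"):
    """True if the keyword appears once look-alike characters are undone."""
    return keyword in normalize(body)


class _BodyScan(HTMLParser):
    """Counts remotely hosted <img> tags and visible text characters."""

    def __init__(self):
        super().__init__()
        self.remote_images = 0
        self.text_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src") or ""
            if src.lower().startswith(("http://", "https://")):
                self.remote_images += 1

    def handle_data(self, data):
        self.text_chars += len(data.strip())


def is_image_only(html_body, max_text_chars=20):
    """Type 3: at least one remote image and almost no visible text."""
    scan = _BodyScan()
    scan.feed(html_body)
    scan.close()
    return scan.remote_images > 0 and scan.text_chars <= max_text_chars


if __name__ == "__main__":
    print(has_invoice_keywords("贵公司(厂)财务负责人: 您好!"))
    print(has_keyword_after_normalizing("Need S0ftware?"))
    print(is_image_only('<img src="http://example.invalid/a.gif?id=1">'))
```

None of these checks fetches the images, which matters here: requesting them is exactly what tells the spammer the address is live.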

If you are interested in research on spam filtering, you can read the famous Essays series (the first result when you Google "Essays"):

http://www.paulgraham.com/antispam.html

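Spam filtering is at its core a Turing Test problem (telling humans from machines), so do not expect 100% of spam to be blocked. But the simplest approach is often the best, and a text-based Bayesian classifier is already very powerful. If you are thinking about OCR, I would advise you to put that aside for now. Maybe once Google can place AdSense next to images, you can come back to it :)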
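To make "text-based Bayesian classifier" concrete, here is a minimal sketch of a generic naive Bayes filter with add-one smoothing. It is an illustration only: it is neither Gmail's filter nor the exact scheme from the essays linked above, and the class name, tokenizer, and toy training messages are all invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lower-case word-like tokens."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class NaiveBayesSpamFilter:
    """Toy two-class naive Bayes model over word counts."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Log prior, smoothed so an untrained class never gets log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing; the extra +1 reserves mass for unseen words.
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into P(spam | text); clamp to avoid overflow.
        diff = max(-700.0, min(700.0, log_score["ham"] - log_score["spam"]))
        return 1.0 / (1.0 + math.exp(diff))


if __name__ == "__main__":
    spam_filter = NaiveBayesSpamFilter()
    spam_filter.train("Need software? Special offers and discounts on software", "spam")
    spam_filter.train("Discounts! Cheap software for home and office", "spam")
    spam_filter.train("Meeting about the project tomorrow at the office", "ham")
    spam_filter.train("Please review the project report before the meeting", "ham")
    print(spam_filter.spam_probability("cheap software discounts"))
    print(spam_filter.spam_probability("meeting about the project"))
```

A classifier like this sees only words, which is why a message made of nothing but external images gives it nothing to work with.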

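Spam can sometimes turn into material for hacker humor. For example:

Google Blogoscoped once wrote up ten ways to deal with Gmail spam, including burning it, sending it to Bill Gates, and so on. Call it the ten great tortures for spam.
Of course there are also modernists who string spam subject lines into poems. They read much better than "pear blossom style" (梨花体) poetry. http://www.spam-poetry.com/
GNU also has classic spam jokes: http://www.gnu.org/fun/humor.html#TOCSpam

PS:
I have been thinking: what if I told all my readers that anyone who wants to email me must include a line of characters like this in the message: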
5^&&(&*@29bd8067ab0c822cc@)$*@)!$*)@56f485fbd675544ba69d5ec($*#&$@&#@(!
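Or I could pull a string from /dev/random, run it through sha1sum, and put the result in my Gtalk/MSN/email signature where everyone can see it. Then I would set up a Gmail filter that marks everything lacking it as spam. I refuse to believe spam is that powerful, hmph!

Of course, this proposal is so thoroughly impractical that it is hardly workable~~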
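For the curious, the token idea from the PS can be sketched in a few lines. This is an illustration under assumptions: it uses Python's `os.urandom` in place of reading /dev/random directly, the function names are hypothetical, and the check is a plain substring test standing in for whatever a real mail filter would do.

```python
import hashlib
import os


def make_token(num_random_bytes=32):
    """Hash some OS-provided random bytes with SHA-1; returns 40 hex digits."""
    return hashlib.sha1(os.urandom(num_random_bytes)).hexdigest()


def should_mark_as_spam(body, token):
    """Treat any message that does not contain the token as spam."""
    return token not in body


if __name__ == "__main__":
    token = make_token()
    print("Put this in your signature:", token)
    print(should_mark_as_spam("Hi, long time no see!", token))
    print(should_mark_as_spam("Hi! " + token, token))
```

The weakness is the one the post already admits: the token has to be public for friends to use it, so a spammer can read it too.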

Comments

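Selected Photos from Our Gala

Everyone eats first (note the lovely paper cuttings on the left of the frame)

Not sure whether he is the head of the university, but he is some kind of leader anyway

The hosts

A martial arts master (real kung fu, practiced since the age of four)

The dance troupe performing "Zhuzhi Ci" (《竹枝词》)

A professional soprano, studying music at Webster

Song of Brokeback

Two CSSA presidents in a guitar duel

"Chrysanthemum Terrace" (菊花台), modern meets classical

Me playing a peasant entrepreneur, live on the "Hu Hua Hu Shuo" (胡话胡说) show

Defeated by Zhang Zhe's performance; I nearly fell over laughing

Best Dressed award!

You! I want you to join the CSSA Party! XD

Game time!

I joined in the fun too, guessed the wrong answer, and was forced to act out Titanic (it still sank in the end~~)

You XU cracking up...

Baili's guitar

The crosstalk piece "Eating Yuanxiao" (《吃元宵》)

Poetry recitation; this one is a professional radio host

Xinjiang dance. It turns out the martial arts master is also a master of dance

A rendition of "Tomorrow Will Be Better" (明天会更好)

Why did nobody call me over for the photo? 55555 (sob)

Chief director Xiaobo playing with a kid

I hotlinked these photos, taken by other people, from Flickr, hoho~~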

Comments (2)

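It's the New Year, and Here Are My Greetings

It is the New Year. I wish all my readers good health and a happy family. Sending you New Year greetings from the United States!

===May all your wishes come true==Wishing you prosperity======

Below is my talk show
--The Harmonious Spring Festival Gala of a Harmonious Society

I hear that in this year's Spring Festival Gala the word "pig" was not heard even once. Good! What a touching, harmonious, politically correct gala!

Political correctness is an American term, a product of the American civil rights movement. It means minimizing, in language and action, the emotional harm done to particular groups defined by race, religion, and so on. For example, black people may not be called black; they must be called African Americans (Egyptians and white South Africans, kindly step aside); firefighters and mail carriers must uniformly end in "person"; the third person must be "he/she", never just one of them; even huMAN is off limits, and you have to say people. Or again, you may not say Christmas; you have to say Xmas, Holiday, Season's Greetings, etc.

I often think that our harmonious society must be harmonious in every respect. A program watched by the whole nation must be politically correct and must not hurt the feelings of minorities. And sure enough, at the emotional request of China's ethnic minorities, and under the banner of our harmonious society and religious unity, this year's gala finally succeeded in not saying "pig" a single time.

Let's work it out: China has 12 zodiac animals. Which of them can no longer be mentioned in future galas?

Zi Rat (子鼠): sorry, not allowed. Why? Rats steal grain, chew up furniture and clothes, and harm the people. We often speak of "big rats" (硕鼠, greedy exploiters); say "rat" and the big-rat class will come after you in the Year of the Rat. Switch to "little white lab mouse" and nobody gets hurt.

Chou Ox (丑牛): sorry, not to be mentioned. The cow is sacred in Hinduism, and China has some Hindus whose feelings must be respected, so it cannot be said. You may only say "yellow cattle" or "dairy cow".

Yin Tiger (寅虎): sorry, not allowed. As everyone surely knows, the Bai people revere the tiger. We must respect the feelings of the Bai, so cartoon tigers are strictly forbidden. If you must draw a cartoon, draw a jaguar (in Chinese, literally an "American tiger").

Mao Rabbit (卯兔): sorry, no can do. For details see Zheng Yuanjie's fairy tale "An Open Letter from the Rabbit Nation" (《兔民族的公开信》). We must respect animal protection organizations, so no Year of the Rabbit festivities. Change it to the Jade Rabbit on the moon, which nobody can eat.

Chen Dragon (辰龙): sorry, no. According to a certain expert, the dragon is too fierce and foreigners do not like it, and we must respect our foreign compatriots. Change it to the dinosaur; it is extinct anyway, so it is no longer fierce.

Si Snake (巳蛇): absolutely not to be mentioned. The snake has a special connotation. Mention it and the mistress class will surely take you to court. Change it to the "four-legged snake" (lizard); a four-legged snake is not a snake.

Wu Horse (午马): not to be mentioned. It is what the Mongols conquered the world with, and we must respect their simple feelings. Change it to "white horse"; for reasons everyone knows, a white horse is not a horse.

Wei Sheep (未羊): not to be mentioned. The feelings of the herders of Ordos must be respected. Sheep are their livelihood; how can we go on about sheep all day? Change it to the "goat" of gymnastics (the vaulting buck), which hurts no one.

Shen Monkey (申猴): not allowed. The monkey is the mascot of the Shanghai Leisure Expo; watch out for stirring up regional disputes. If you must, say Sun Wukong.

You Rooster (酉鸡): shh, keep your voice down. "Chicken" (slang for prostitute) must not be said carelessly; say "sex worker". They have human rights too, so it cannot be said!

Xu Dog (戌犬): ever heard of the Tibetan mastiff? We must respect the feelings of those spiritual, sacred creatures of the Qinghai-Tibet Plateau. If you want to say it, say "man's loyal friend".

Hai Pig (亥猪): not a single word. Say it? Who allowed you to say it? Not allowed. Still want to? Guards, slap his mouth! If you must say something, say "Harmony Year"!

Fine, here is how I think the twelve Chinese zodiac animals can be revised:

Lab mouse, yellow cattle, jaguar, Jade Rabbit, dinosaur, lizard, white horse, gym goat, Sun Wukong, worker, man's friend, and Harmony Year.

How harmonious. From now on, when two people at CCTV meet and greet each other:

Hey buddy, why the red underpants?
Oh, this is Harmony Year, my zodiac year. What about you, sir?
I was born in the year of the lab mouse, a year younger than you. My wife was born in the year of the dinosaur.
Hey, no wonder you are so young and accomplished, buddy!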

Comments (1)

WTH I’ve said

Just holy crap, shame on me!
// Well, I am OK.

Comments

Oops, love is <3 and Google needs a spell checker


If you type <3 in Gmail talk (Note: Not the gtalk client or any jabber client), you will get a gif.
Take a look at this and this.

Well, no wonder! Love really is the thing that needs <3 people :)
BTW, this is Google’s LOGO today:

Is it Google or Googe? Well, the romantic version is: g and l are falling in love. Or, you cannot spell “girl” without g and l, so they are together! (Chinese Version)

The XXX version is, they are 69ing. (Oh dude, I am just citing the comments from Digg)

// BTW, the Internet is a series of tubes, not pipes. So enjoy YouTube and don't bother with Yahoo! Pipes. LOL!

Comments

Happy Valentine’s Day Everyone!

It's Valentine's Day.
Well, it's V-Day! :)

victorie! 

Comments (4)

It’s snowing white cats and dogs

Weird weather in St. Louis.

White Dog

White Cat

Comments (2)

Yet another close letter to Google CN

Blogger Isaac Mao posted an open letter to the Google founders, offering three suggestions. I don’t quite agree with him, so here are my comments. I do believe that Dr. Kai-Fu Lee is the right person for Google China and that the current strategy is right. (Come on, I have no relationship with him :) Well, in the details Google China probably needs some improvements or changes, but the fundamental strategy is right. You are welcome to disagree with me; please feel free to leave your comments.

Dear Larry and Sergey,

I’m writing you the short letter on behalf of many Internet users in China to have some suggestions to resolve the current dilemma for Google in China, from both business and social perspectives.

Google China is not exactly in a dilemma right now. A dilemma means you cannot go either way. However, we can see progress in China. A decreasing market share in China is not necessarily a dilemma.

During the National Day holiday week in 2002, when Google.com was blocked in China for the first time, Chinese Google users made an online protest spontaneously. They appealed to free the purer search engine wave by wave. Its seemed its also the first time grassroots power was demonstrated in China on Internet. You can imagine how eager they are to have a complete Internet instead of a shrinked one. At last, people won, Google backed. However, after 4 years, we started to question whether we should continue to support Google. Many users here were disappointed when they found Google.cn filtered many keywords. The compromise remarks by you in Davos made us more frustrated. Seems you are adopting self-censorship which hurts those loyal users a lot which also devalue your motto of “non-evil”.

Here your basic assumption is that the GFW is evil, and that when Google filters the content itself, that is also a kind of evil. Let’s put it this way: if you can access Google but frequently get a connection reset, are you annoyed? Yes, we are professional users; we can avoid the sensitive keywords, we can set up proxies, we can do everything to fight the GFW. But the problem is, what should the common user do? They expect to get results related to their search, sensitive or not. However, sometimes even when their keywords are not sensitive, the returned results unfortunately contain sensitive content. Boo, they get a connection reset. Whom can they blame? They are using Google, right? The GFW will not say: “Sorry, your connection was reset by the GFW, please try again later or dial XXX for more information”. To guarantee the user experience, some compromises are needed here in China. I know nearly every blogger in China considers the GFW evil. However, self-censorship is a down-to-earth strategy to make things work and to protect Google itself in China. I think Google’s philosophy is to first make it work, then improve it. It is hard to say whether this is evil or not. For instance, if you can use Google but get the annoying connection reset every day, what will you do? Will you choose Yahoo! or Baidu? Actually, Google.cn is aimed at small businesses more than at bloggers, as small businesses will bring Google most of its income in China. Thus, making Google search work in China is much more important than other issues such as the content. To assume that every user has the technical knowledge and the patience to use Google behind the powerful GFW is gratuitous.

Google is ever regarded not only a leading Internet business, but a hope for many people around the world to open their thinking. Many bloggers in China still believes that in their everyday writings. We guess you were misled by incomplete information on how censorship is good to Chinese people. The fact is Google in the 130M-Internet-Users country is losing loyal users with loosing your principles. We understand its tough to anyone to make decisions. But it high time to change it back to the right track. Here we would like to propose 3 ideas to Google for its China strategy in a long term run, to survive, and live better:

The question is, who are Google’s loyal users now? Let me put it this way: do you really think that bloggers in China will contribute more to Google China than common users in terms of income or search market share? Do you really think small businesses will refuse to pay Google only because Google self-censors content, regardless of the overall quality of the related AdWords?

1. Set up a 1B US$ corporate venture fund to invest in China’s Internet pioneer sites and cutting edge companies. The venture fund can be managed by experienced fund managers and industry gurus who really understand the value of Google, as well market potential of China. In my estimation, a venture fund with such a size can invest over 100 deals totally cover 60% of Internet traffic in China. With venture fund strategy, Google can play its manageable chaotic game in a capital way.

This idea is really bad. If Google really wants to set up a VC fund, the best place for it is Silicon Valley rather than China, now and in the near future. The main problem for Google China is its share of the search market, not of the whole Internet market. Investing in Internet companies in China actually means investing in the Internet’s access points and in content producers or communities in China, which would leave Google maintaining a very long product line. The thing is: you cannot solve the dilemma in China with capital; Google China does not need money from the capital market. If the fund is for gaining market share or communities in China, the best manager for it is the Google China team itself. We always emphasize that a company should stay focused when it makes decisions, and that is also true in China. If Google wants to play the game with capital, OK, please just suggest that they move the whole China technical team to Seattle or MV to develop other products or do localization, and convert Google China into another Sequoia. A VC can earn money, but is that really what Google China wants? I don’t think so. Market share in search is quite hard to gain simply by investing. Google China now has 200- technical members and probably about 200 marketing/HR employees, which is a rather small team. You can imagine that for them localization is quite a heavy task, not to mention product development for the China market. Of course this small team cannot take charge of managing the fund. However, without their feedback, how can the VC choose companies that fit the overall strategy of Google China? In a word, VC is good, but it does not help solve the dilemma for Google China now, if you call it a dilemma. This strategy is in fact not a strategy.
If this is a strategy for Google, it is also one for Microsoft, for Oracle, for IBM, even for Citigroup, for American Express, for every Top 500 company that wants to gain more money or market share in China.

2. Develop anti-censorship tools and service for global Internet users. In China as well some other coutries, censorship is still a tradition in culture. We are accustomed to control or to be controlled(It’s true!). But it’s too far from modern humanity and universal value. It won’t target China only, instead its a global issue to be solved. So it won’t cause Google’s operation in China into trouble. The budget to complete the mission will be not more than several millions dollars.

Well-intentioned, but not easy to implement. I don’t know if you have heard of Tor. Tor is a tool to protect your content from the GFW. However, officially developing such a product and openly distributing it carries a very high legal risk. For example, if we have this tool, can the gov-er-ment say: it is not legal to use this, just as it is not legal to use a GPS speed camera detector? Of course the gov-er-ment can ban this tool the way it bans speed camera detectors, even though the two things are inherently different (one is dangerous, one is for freedom). Actually, the GFW is not the source of Google’s trouble in China. If the only issue were the GFW, how could the smartest people at Google not have come up with this idea? As you know, it is hard for Google to obtain a permit to gather news in China; that’s why Google News in China is called Google Information. It somehow reflects the key issue: the gov is the key constraint on Google China. However, how can Google openly blame or fight the gov in China if it wants to do business there? The only way to solve the dilemma is a combination of time and public relations, which is in fact not a technical problem.

3. Increase the incentive to Chinese Google Adsense users. This can dramatically encourage more Internet users to participate Google’s business ecosystem. It’s a pure business strategy to increase loyalty as same important to Google’s products in China. Anyway the tactic should be deployed with better localized customer service to respect to individual users and protect their less to hundred-dollar income.

True, but unachievable. It is in fact unfair competition and would essentially impair the whole search market in China, which is also not good for Google.

Google is not alone. There are still several millions Google fans in China, especially those bloggers who are more real time intelligent to outside world. If Google do good as they did in early days. There will be more supporters for sure. Google is not playing a game of itself. You may under-estimate that before with limited information sources. People here are looking forward that you can pick the three suggestions(or partly) as China strategy in the coming years which can keep Google’s “non-evil” motto alive in people’s mind. It will also benefit to Google’s business in China. It will be also benefit to the whole Internet neutrality in China. All the Internet users will appreciate that eventually.

True. I am also a GFan. But support is not everything in China, right? We have supported someone before in history, but the result was not what we expected. If Google’s opponent in the game is Baidu, support is everything. If the opponent is the gov, and you want Google to be the other player, it is a fatal game that usually has no winner, and support is nothing.

All of all, the pure the better; the more compromise, the worse.

You bet, but that is a longer-term goal. Now, let’s face the truth: in China, more people choose Baidu, even though Baidu has self-censored from the very beginning. Why has Baidu succeeded? Because it is pure? Well, I am not going to say: be evil. The key is: know the real situation in China, understand the local policy, and have a good relationship with the gov. It sounds evil, but it is true. And I can see the progress; I think Google will be out of this trap soon.

Comments
