静水铭室 Silent Water and Curved Mountain

Posts from April, 2009

Using Affinity Propagation as a Descriptor-Reducing Tool

April 27th, 2009

The fascinating part of high-dimensional descriptors is that they have two faces: sparsity and density. Taken as a whole, the descriptor data (in my case, image patch descriptors, i.e. local feature descriptors) lie in a space with large variance. But looking at small groups of descriptors, there are some dense areas in that space. Duplicate detections from the corner detector, similar objects, etc. can all cause this density. Reducing each dense descriptor cloud to one exemplar cuts the overall matching time. For CBIR in particular, reducing the descriptors of a single image should have almost no side effect.

That is where affinity propagation comes in as a replacement for k-median. Affinity propagation is an amazingly fast and pretty good approximation to the optimal exemplar result, and because it can work on a sparse similarity matrix, the computational cost can be cut considerably. Using the full pairwise dissimilarity information, with the mean dissimilarity as the preference, it reduced 1147 local patches in an image to 160. However, computing the full dissimilarity matrix is expensive. In my experiment, a best-bin-first tree was used to speed up the k-NN search, and dissimilarities were set only for the top N (N = 5, 10, 20) neighbors of each patch. In that case, the time cost dropped from 30s to less than 1s, and the number of local patches was reduced to 477 (T20), 552 (T10), and 647 (T5).
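To make the reduction step concrete, here is a minimal pure-Python sketch of affinity propagation on a handful of toy "descriptors". It is not the code used in the experiment. The function names, the negative squared Euclidean distance as similarity, the damping factor and the iteration count are all assumptions made for illustration. The top-N step uses a brute-force sort as a stand-in for the best-bin-first tree, and taking the preference as the mean over only the kept similarities is likewise an assumption.

```python
import math

def sq_dist(p, q):
    """Squared Euclidean distance between two descriptors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def similarity_matrix(points, top_n=None):
    """Similarity = negative squared distance; diagonal = preference.

    With top_n set, each row keeps only its top_n most similar neighbors
    (brute force here, a stand-in for a best-bin-first tree); every other
    entry becomes -inf. The preference is the mean of the kept
    off-diagonal similarities (an assumption). Needs top_n >= 1.
    """
    n = len(points)
    s = [[-sq_dist(points[i], points[k]) for k in range(n)] for i in range(n)]
    if top_n is not None:
        for i in range(n):
            order = sorted((k for k in range(n) if k != i),
                           key=lambda k: s[i][k], reverse=True)
            for k in order[top_n:]:
                s[i][k] = -math.inf
    kept = [s[i][k] for i in range(n) for k in range(n)
            if i != k and s[i][k] != -math.inf]
    preference = sum(kept) / len(kept)
    for i in range(n):
        s[i][i] = preference
    return s

def affinity_propagation(s, damping=0.5, iterations=200):
    """Plain message passing; returns the exemplar index chosen by each point."""
    n = len(s)
    r = [[0.0] * n for _ in range(n)]  # responsibilities
    a = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        for i in range(n):
            scores = [a[i][k] + s[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            second = max(scores[k] for k in range(n) if k != best)
            for k in range(n):
                # Strongest competing candidate for point i, excluding k itself.
                rival = second if k == best else scores[best]
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - rival)
        for k in range(n):
            pos = [max(0.0, r[i][k]) for i in range(n)]
            pos[k] = r[k][k]  # the self-responsibility is not clipped
            total = sum(pos)
            for i in range(n):
                new = total - pos[i]
                if i != k:
                    new = min(0.0, new)
                a[i][k] = damping * a[i][k] + (1 - damping) * new
    return [max(range(n), key=lambda k: a[i][k] + r[i][k]) for i in range(n)]

def reduce_descriptors(points, top_n=None):
    """Keep only the descriptors that some point chose as its exemplar."""
    labels = affinity_propagation(similarity_matrix(points, top_n))
    exemplars = sorted(set(labels))
    return [points[k] for k in exemplars], labels

if __name__ == "__main__":
    patches = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
               (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
    kept, labels = reduce_descriptors(patches)
    print(len(patches), "->", len(kept), "descriptors; labels:", labels)
```

Passing `top_n=5`, `10` or `20` mimics the sparse setting from the experiment. On real data the dense n-by-n lists here would be replaced by a sparse structure, since building the full matrix is exactly the cost the k-NN shortcut is meant to avoid.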

A rough observation is that reducing the number of local patches improves the accuracy of search over the database. The reduction leaves the more distinctive patches in the bank. More distinctive points produce fewer false positives, which gives the overall performance gain.

As the affinity propagation method shows many promising aspects, the new keyword “EXAMPLAR” will be introduced in the implementation of the non-structural data query language.

Pushing Grid Computing to Browser through Javascript

April 15th, 2009

Half a year ago, I read an article about how to use simple JavaScript to perform MapReduce in the browser. It is very interesting, but the author obviously ignored that data locality is what makes MapReduce so good. MapReduce is not a good fit for the browser scenario, because it solves data-intensive problems in which bandwidth is critical (that is why the Reduce step was introduced).

However, the idea of making the browser do some extra work suits compute-intensive jobs that need only a little data. People were already on this track years ago with Java applets or Flash. With Google Gears, or even setTimeout, I believe it is now quite realistic to do browser-based grid computing in JavaScript.

More details about it will be revealed in July.

Why Facool failed and what I learned from it

April 5th, 2009

For those who don’t know what Facool is, there is a video about it: http://www.vimeo.com/1925998

It has been 3 years since Facool closed in 2006. After working on several minor startup projects, I still occasionally hear people ask why Facool failed back then. I have spent a lot of spare time thinking about it. Today it still seems a cool idea to put face retrieval techniques online, and there are many startups working on this (such as face.com, riya.com, etc.). And now I think I have a good perspective on why Facool failed.

Facool started out as an academic research result. It took me a while to realize its economic potential, and then I started to run it as an actual product. 2005 was the time when everyone believed that search was the coolest stuff, just as social networks were in 2007 and Twitter is in 2009. The idea was simple: index all the faces on the web and find any of them instantly. What I missed was that the goal was too ambitious and the resources I could use were limited.

The shortage of resources explains many of the problems Facool ran into. First was the shortage of images. In 2005, Facebook had only just launched, and there was not much well-structured personal information on the Internet. From 100,000 scraped images, the detector found about 10,000 faces, and most of them were low-resolution. You had to dig into the deep web to find more useful information, and because of the lack of structured information about people, I even had to develop a new algorithm to determine a person's name!

Lesson 1 learned: start with a small thing, and evolve along the way.

When Facool came out as a web service, I had coded the web server from scratch, which meant spending more time on socket errors, concurrency problems, etc. Writing a web server is a big time sink, and even if it gives an advantage of a few percent, it is not a convenient thing to start with. I actually spent 2 months coding the web server. Compared with putting together a web service in 3 days with Django today, I wasted too much time on unimportant stuff.

On top of that, I was not a huge fan of the open source community at that time. In 2005, I had only heard of OpenCV and never put it to real use. Without trying the power of open source, I trained the face detector on my own, which, no doubt, cost another 3 months to get a satisfactory result.

Lesson 2 learned: save time and avoid reinventing the wheel by using the power of open source.

When I finally finished the beta version of Facool, I had just about run out of money. I spent about $5,000 buying a server and renting bandwidth, which left a few bucks for living. It is hard to recall that just 3 years ago there was no slicehost and no Amazon S3, and you had to start up with a $2,000 server.

By June, I did not have one extra penny to pay for the bandwidth, and that was pretty much it.

Lesson 3 learned: start up with cheap stuff, and save at least half of your money until release day.

Sometimes, I appreciate that I failed so young and have so much time to start over.
