This has been a good year for programmers around the world: several search engines dedicated to source code, especially open source code, have appeared. I haven't had much chance to try this kind of technology, so I will talk about only the top two services: Krugle and Google Code Search.

Krugle

Krugle is not just a plain search engine; it is also a collaborative web application that collects custom annotations from its members. It can be classified as a social network for source code. In Krugle, you can specify four parameters:

  1. Query
  2. Programming language
  3. Area such as comment, source code, function definition, function call and class definition
  4. Project

Krugle's UI relies on AJAX, so you can't use Krugle with JavaScript disabled.

Google Code Search

Google Code Search has just been released. Like Google's other search services, it is a plain search engine: you can't do anything but search. There is no AJAX here either. Google Code Search is very simple and fast. All advanced queries go through operators, as in Google's other searches. So far there are only four things you can specify:

  1. Query
  2. Programming language
  3. Package URL
  4. Filename
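To make the list above concrete, here is a minimal sketch of how such a query string could be put together. The helper `build_query` is hypothetical, and the operator spellings `lang:` and `file:` are my assumption; only `package:` appears in the queries later in this post.

```python
def build_query(terms, lang=None, package=None, filename=None):
    """Compose a Google Code Search style query string.

    Assumption: operators are written as lang:, package: and file:,
    and are simply appended to the free-text part of the query.
    """
    parts = [terms]
    if lang:
        parts.append(f"lang:{lang}")
    if package:
        parts.append(f"package:{package}")
    if filename:
        parts.append(f"file:{filename}")
    return " ".join(parts)


# Restrict a plain search term to one package.
print(build_query("module_exist", package="drupal"))
```

The point is only that everything travels in one query string; there are no separate form fields as in Krugle.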

The most interesting feature is regular expressions. You may enter the query string as a POSIX extended regular expression. How is that useful? Take a look at the query below, which I found in .

mysql://\w+:\w+@[^l][^o].*\..*\..*/\w+ -sample

And then check out . Amazing! You have just discovered MySQL logins embedded in many open source projects.
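To see why this pattern picks out embedded logins, here is a small sketch that applies the same expression with Python's `re` module. The sample lines are made up, and treating `-sample` as "skip anything that mentions sample" is my assumption about how the search engine handles a negative term.

```python
import re

# The pattern from the query above; Python's re accepts the same syntax.
CREDENTIAL_URL = re.compile(r"mysql://\w+:\w+@[^l][^o].*\..*\..*/\w+")


def find_embedded_logins(lines):
    """Return lines holding a MySQL URL with user, password and a dotted host."""
    hits = []
    for line in lines:
        # Assumption: "-sample" drops any result that mentions "sample".
        if "sample" in line.lower():
            continue
        if CREDENTIAL_URL.search(line):
            hits.append(line)
    return hits


# Made-up lines, for illustration only.
candidates = [
    '$dsn = "mysql://admin:secret@db.example.com/shop";',   # user, password, real host
    '$dsn = "mysql://root:pw@localhost/test";',             # host starts with "l"
    '$dsn = "mysql://user@db.example.com/shop";',           # no password
    '$dsn = "mysql://sample:sample@db.example.com/shop";',  # mentions "sample"
]
print(find_embedded_logins(candidates))
```

Only the first line survives. Note that `[^l][^o]` is a rough way to skip localhost: it also rejects any host whose second letter is "o", such as host.example.com.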

Experiments

I tested both search engines with the same query strings to compare the quality of the results.

  1. I have long known that the CVS protocol uses the phrases "I LOVE YOU" and "I HATE YOU" in its authentication module, so I started with these words.

    "i love you" "i hate you"
    

    As a result, Krugle gave better results in terms of meaningful lines. My guess is that Krugle gives higher priority to declaration areas, while Google Code Search doesn't care where the words appear.

  2. Following the MySQL login query above, I tried to find something similar in Krugle. Unfortunately, Krugle doesn't support regular expressions, so I searched for just a short part.

    "mysql://"
    

    In this test, the best Krugle could show me included JDBC connection strings and URLs with no password or with localhost. Google Code Search wins this experiment without a doubt. Thanks, regexp!

  3. Then I tried to look for my code, BTQueue.

    btqueue
    

    Krugle didn't find my project, while Google Code Search found many instances, including the old, the newer and the latest code in Subversion.

  4. The next one is also my project, SCMSWeb.

    scmsweb
    

    Again, Krugle could not find this project. Google Code Search found many packages related to it, including ones on FTP.

  5. Now it's time for Drupal. I wanted to find examples of how to call module_exist() and module_exists(), its new name in Drupal 5.0.

    Google Code Search:

    module_exist package:drupal
    

    Krugle:

    module_exist
    

    And specify the project as drupal. Krugle returned about 434 matching files, and the top ten entries were from Drupal Contributions CVS. In Google Code Search, the results were taken from FTP and there were only about 100 files.

    Next comes module_exists(), which is available only in the latest code. I just changed module_exist to module_exists. In this experiment, Krugle found nothing while Google Code Search found one file. I'm sure there must be more. Still, Google Code Search's repository data is a bit newer than Krugle's.
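Since Google Code Search accepts regular expressions, the two Drupal searches could in principle be folded into one pattern. Here is a minimal sketch in Python. The pattern `module_exists?\(` is my own suggestion, I did not run it against either service, and the sample lines are made up.

```python
import re

# An optional trailing "s" covers both the old and the new function name;
# the escaped "(" restricts matches to calls and definitions.
MODULE_CALL = re.compile(r"module_exists?\(")


def mentions_module_check(line):
    """True if the line calls (or defines) module_exist() or module_exists()."""
    return MODULE_CALL.search(line) is not None


# Made-up lines, for illustration only.
print(mentions_module_check("if (module_exist('taxonomy')) {"))   # old name
print(mentions_module_check("if (module_exists('taxonomy')) {"))  # Drupal 5.0 name
print(mentions_module_check("// check module existence first"))   # plain comment
```

The first two lines match and the comment does not, so a single query would cover code written before and after the rename.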

In conclusion, each search engine has its own unique features. If you like regular expressions, you may find Google Code Search very useful, although it only handles not-too-complex strings. Krugle lets you specify the area in which the string you are looking for should appear, which is very useful if you know what you are looking for.
