I Think Tech: 2010

Monday, December 13, 2010

Backup contacts from a Nokia phone to an Excel spreadsheet -- Convert VCF to CSV with nodejs!

Yesterday my uncle asked me to take a backup of his cellphone contacts. He has a Nokia 3600 Slide phone. Now this phone is a piece of work..! It doesn't provide a way to copy/move contacts to the external memory card. I found a couple of file manager applications in .jar and .jad formats and tried to see if they would show me the phone's memory (not the external memory card, but the memory embedded in the phone), as that is where the contacts are stored. But no luck. Later, when I found one such application whose description said it gives you access to contacts, it was a .sis file and the phone simply said "File format not supported"..!! It's an S40 phone and hence cannot install applications from .sis files, and most of the cool and usable stuff seems to be available in .sis format only. :(

A little searching led me to the Nokia Europe support website page for "Backing up your phone". The preferred methods for backing up contacts were copying them to a computer via Bluetooth (mentioned as the simpler solution) and backup via PC Suite (mentioned as the harder way). Sadly the simpler way was not possible for me as my current laptop (Lenovo G550) is Bluetooth challenged (a.k.a. no Bluetooth module), and I did not have the CD that comes with the phone to install PC Suite (I wouldn't have installed it even if I had the CD..!!).

Some more searching led me to a few applications (free and paid, .sis and .jar/.jad formats) which would "sync" your phone to a "remote location" via GPRS and make the data available to you globally and also (here comes the best part) via "Social Networking Websites"... (Geez... as if we love spam calls and SMS and are short of them..!!).

Having found nothing really useful, I started fiddling with the phone settings to see if I could find something there. Luckily there was a Backup option. The default backup tool available in the phone can apparently dump your phone's current state --- including contacts, calendar, settings, applications, etc. --- all into a backup file (a .NBF file, which I presume stands for Nokia Backup File/Format). Thankfully this file is placed on the external memory card, in a folder named Backups. Then I connected my phone to my laptop and chose to use it as an external storage device. Copied the .NBF to my computer. And I was happy that I could do it without any additional installation.

Sadly the happiness did not last long. I tried to open the NBF file in a few text editors (hoping to see a huge list of contacts) but was instead presented with some gibberish.. Damn it... it's a binary file..! And I could not find any reader/converter for this file format on the internet. But thanks to the Linux "file" utility, I got to know that NBF is just a zip archive. Fantastic..!! I unzipped it and found all my contacts in a folder at a depth of 4 (i.e. a folder in a folder.. 4 times). Again I did not see a huge list of contacts. I saw a huge list of VCF files, one for every entry in my contacts. And this is how each file looked (in case you are wondering why this file per entry is a problem):

BEGIN:VCARD
VERSION:2.1
N;CHARSET=UTF-8;ENCODING=8BIT:
TEL;PREF;WORK;VOICE;ENCODING=8BIT:
END:VCARD

Well clearly this is not really a usable form.. !!

Now, I thought I would just import all these VCF files into my contacts in Outlook and then export them as a CSV. That would be a pain in two ways. First, I would need to set up Outlook if it is not already set up, which was the case with me (a small pain). Secondly, if Outlook is already set up then the two sets of contacts will be merged. Not something that my uncle wants. So the next idea was to convert all these VCF files into one CSV.

Then the hunt started for a VCF to CSV converter. I found a couple of online ones, but was apprehensive about uploading the files to a remote server. I also had the option of downloading those server side scripts and running them in a local webserver or with a PHP engine from the command line. I did not have PHP installed and did not think it was worth the effort as I wasn't sure how well those scripts worked. Finally I found this python script - http://code.google.com/p/vcf-to-csv-converter/ . Happily I fired up my Linux VM and ran the script with the nearly 1000 VCF files I got from the phone. Bang..!!! The script erred out with an exception while processing some 332nd file. :(

At this point it appeared to me that the easiest way out was to write a simple importer myself. I was in no mood to write a C/C++ program and do all that string manipulation with raw character buffers and pointers. And the only scripting language that I know well (well enough to write something useful on my own from scratch) was Javascript... !! So how do I process these VCF files with Javascript?

Enter, Nodejs. I had been fiddling around with node for some time now but had not done anything useful with it. So I fired up vim and started to write the importer script and in about 30 mins (much less than the time spent on the internet searching for an existing tool... !) I had the contacts list in a CSV..!! Wohooo... And here is the script which did the job : https://gist.github.com/738325

The script is very crude, to the extent that I have hardcoded the path of the output file, and it's very likely not efficient either. Also note that this script just extracts names and phone numbers and no other details. It is not a generic VCF parser (I don't even know the VCF spec. I just looked at a couple of VCF files and figured out the format of the lines for name and numbers). If anyone needs it, they are free to use it. No guarantees though..!! :)
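For anyone who just wants the gist of the approach without opening the gist, here is a minimal sketch of the same idea. To be clear, this is not the script linked above; it is a fresh illustration that assumes a current Node.js release and assumes every contact is a vCard 2.1 file with one N line and one or more TEL lines, like the sample shown earlier. The names parseVcard, toCsvRow and convertDir are made up for this example.

```javascript
const fs = require("fs");
const path = require("path");

// Pull the name and phone numbers out of the text of one vCard file.
// Only N and TEL lines are looked at; everything else is ignored.
function parseVcard(text) {
  const contact = { name: "", phones: [] };
  for (const line of text.split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    // "TEL;PREF;WORK;VOICE" -> property "TEL"; the rest are parameters.
    const property = line.slice(0, colon).split(";")[0].toUpperCase();
    const value = line.slice(colon + 1).trim();
    if (property === "N") {
      // N starts with "Last;First"; later components are dropped here.
      const [last = "", first = ""] = value.split(";");
      contact.name = [first, last].filter(Boolean).join(" ");
    } else if (property === "TEL" && value) {
      contact.phones.push(value);
    }
  }
  return contact;
}

// Quote every field so commas or quotes inside a name cannot break the row.
function csvField(value) {
  return '"' + String(value).replace(/"/g, '""') + '"';
}

// One CSV row per contact: name first, then every phone number found.
function toCsvRow(contact) {
  return [contact.name, ...contact.phones].map(csvField).join(",");
}

// Convert every .vcf file in vcfDir into one CSV file at csvPath.
// Returns the number of contacts written.
function convertDir(vcfDir, csvPath) {
  const rows = fs
    .readdirSync(vcfDir)
    .filter((file) => file.toLowerCase().endsWith(".vcf"))
    .sort()
    .map((file) => fs.readFileSync(path.join(vcfDir, file), "utf8"))
    .map((text) => toCsvRow(parseVcard(text)));
  fs.writeFileSync(csvPath, rows.join("\n") + "\n");
  return rows.length;
}
```

Calling something like convertDir("./contacts", "./contacts.csv") (both paths hypothetical) would then produce a file that Excel can open. Taking the paths as arguments avoids the hardcoded output path, but the sketch is still nowhere near a generic VCF parser.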

It was fun playing with JS, especially NodeJS. :)

Friday, October 15, 2010

Where does node.js figure in the typical web application stack?

I have been reading up about node.js for a little while now. With Javascript being one of my favorite languages, I obviously love node.js. It's simply awesome. But it was not until very recently that I started to wonder where node.js would fit in the typical three-tier web architecture, fondly known as "the stack". The three elements of the stack are the web server, the app server and the DB (the DB is definitely out of the question). Within the app server we typically have a language interpreter and a framework (like Rails).

I initially thought of node.js as a JS interpreter (similar to the ruby interpreter) that can be used with FCGI. But then again, node.js in turn uses the V8 JS engine. So it's not a language interpreter per se. Then I saw several node.js examples showing how easily web based applications can be created. So I started comparing it with other frameworks like Rails or Erlyweb. But no, it's not that either. Sure, there is a simple HTTP module in node.js, but it's nowhere close to these frameworks. So is it a web server then? Definitely not that either, considering the rich feature list of existing web servers like Apache or Nginx. So what is this node.js then?

From what I have understood till now, node.js is just a JS library (not like jQuery or Prototype, which are meant to run in the browser context). node.js is more like an ECMAScript library. If we treat JS as a generic programming language, I believe we will see quite a few shortcomings, the most significant one being the lack of any system i/o facility. I guess ECMAScript was designed to be run in a host environment and hence features like console i/o, file i/o or network i/o were not added. This makes it very hard for JS to be used outside such a host environment. These missing pieces are exactly what node.js provides.

node.js extends ECMAScript and provides these missing aspects, which enable JS to be used on the server side for network programming. node.js provides file i/o, socket i/o, process handling, a mechanism for creating modules and specifying dependencies, several network oriented modules and so on. (The complete list is here).
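To make that a bit more concrete, here is a small sketch of things plain ECMAScript has no way of doing on its own but which the node.js modules make possible: writing a file, reading it back and answering HTTP requests with its contents. It assumes a current Node.js release; the function names and the demo file name are made up for illustration.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const http = require("http");

// File i/o: write some text to a file and read it back.
function saveAndLoad(filePath, text) {
  fs.writeFileSync(filePath, text, "utf8");
  return fs.readFileSync(filePath, "utf8");
}

// Network i/o: build an HTTP server that answers every request with the
// current contents of the file. The server is only created here; nothing
// is accepted until .listen(port) is called on the returned object.
function makeFileServer(filePath) {
  return http.createServer((request, response) => {
    response.writeHead(200, { "Content-Type": "text/plain" });
    response.end(fs.readFileSync(filePath, "utf8"));
  });
}

// Hypothetical demo file in the system's temp directory.
const demoFile = path.join(os.tmpdir(), "node-io-demo.txt");
console.log(saveAndLoad(demoFile, "hello from node.js"));
```

Calling makeFileServer(demoFile).listen(8080) (the port is arbitrary) would then serve that text to a browser. None of fs, os, path or http is part of the language itself; they all come from node.js.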

So node.js is really a library which adds capabilities, although these features are at a somewhat more basic level than the ones provided by other libraries. For instance, look at the libxml library in C. C compilers come with their own standard library, which provides mechanisms to do file i/o. But there is no out of the box provision to deal with XML files/documents. This capability is provided by the libxml library. So the libxml library allows programmers to do something more with C than what is provided by default. There are innumerable such libraries which add various types of capabilities.

In a somewhat analogous way, node.js is also a library which adds a lot of new capabilities to the Javascript language, although, as stated earlier, these capabilities are much more basic in nature and in most cases are present in other languages as part of their standard offering.

So node.js is not a new language and hence do not compare it with other languages like Ruby, Python, etc. Javascript is the language here.

node.js is not a new web application framework. So do not compare it with Rails, Django, Sinatra etc. Note, though, that node.js was apparently developed as a means to write high performance client-server programs. Consequently smart folks out there started working on web application frameworks based on node, and there are a couple already. I know about "Express" which, AFAIK, is inspired by Sinatra and is gaining popularity. Now that is an item comparable to Rails and friends. Questions like "will node.js replace Rails?" are, technically speaking, absurd.

node.js is not a server. Absolutely not. There are node.js based servers, just like Nginx and Apache are C based servers.

Tuesday, October 12, 2010

Building Ruby 1.9.2 and installing rails 3.0 on it -- On Ubuntu 10.04 - Lucid Lynx

These are the issues that I faced while building Ruby 1.9.2, then installing Rails 3.0 and finally running the example in the "Getting Started with Rails" guide.

Make sure the following development libraries are installed before you start building ruby:
(The ruby configure, make and make install (i.e. building and installing) will not tell you anything about these missing libraries)

1) zlib-dev (I think the package name is zlib1g-dev) -- Needed when you try to install the rails gem. If this is not available you will get the following error when you try to install rails with the command :

gem install rails

ERROR: Loading command: install (LoadError) no such file to load -- zlib

2) libssl-dev -- Needed when you try to run the inbuilt rails WEBrick server and load the first example app in the getting started guide. You will get an error of the type:

"LoadError: no such file to load -- openssl"

In my case I did not have this library the first time I built ruby. So I followed the instructions given here to build the openssl-ruby module/binding.
After this I ran `make` and `make install` from the top ruby source directory. Maybe that was not necessary, but I did it anyway.

Also, I am guessing that if this package was available when I first built ruby then the openssl-ruby module would be built by default. If not there should be a configure option to enable this `feature`. The configure's help output does not provide any info on this (not even the --help=recursive option).

==== Upgrading from older ruby versions ====

Older ruby versions used the folder /usr/local/lib/ruby/site_ruby//rubygems . Now apparently this directory is replaced by /usr/local/lib/ruby//rubygems .

So you will have to get rid of the site_ruby folder (i.e. delete it) so that the gems are not searched for and used from a stale folder.

Not doing this might result in you not being able to run gem at all.

Saturday, June 26, 2010

What is Cloud? -- Simple terms please

Cloud has been making a lot of noise and almost every tech (or tech related) person knows about it or has at least heard of it. Now for those who have just heard about it but do not know what it means, here is a quick definition from Dave Nielsen, the founder of CloudCamp. He says, "For something to be called cloud, it should have these properties :

Hosted by someone else
On-demand. Do not have to wait or call somebody to get it.
Metered somehow. So you know exactly how much you are using and how much you are paying.
Scalable, both ways - up and down as and when you require."

He goes on to say that Cloud could mean different things for different people. Here are a few examples stating what cloud is for a particular person :

For an IT guy -- Infrastructure as a Service
For a Web Developer -- Platform. Just dump your code and don't worry what runs it.
For a Business guy -- SaaS (Software as a Service)

That was pretty neat. Helps me answer the standard question "What the hell is this cloud thing?" in a sane manner. Earlier I could never figure out what a proper answer should be for this question, because there was so much to tell.

Here is my attempt to elaborate on the above-mentioned examples.

So cloud is basically having the infrastructure to do what you do hosted by someone else and having it totally scalable. For example, in the above list, for a web developer cloud is a platform where he can dump his code and expect it to run as he has designed it. He does not worry about the machines, the network connectivity, the bandwidth. He just pays for those in a metered manner. He scales his platform whenever he wants. He can increase his bandwidth quota, move to a better machine, increase the number of machines and all of this without calling the customer care or the sales guy. He will do it by logging into the cloud services website or he would have a script do this for him automatically, i.e if he is geek enough.

Similarly for a business man, it is software as a service. E-mail service would probably be a good example. The business man does not know what software runs the email system, he does not worry about what version of email server is running, what os it is running on, what DB it is using to store the emails, what protocols it is making use of. If the email contents are not that sensitive he would not even worry about the physical location of the servers storing these emails. He just buys the email software as a service and uses it. All that he probably worries about is how many email accounts are available to him/his company and how reliable/usable they are. At any point he can increase or decrease the number of accounts, once again without making a call.

That's cloud.

Note : I got this definition from one of the IBM developerWorks podcasts which is available here.

Oh, and remember, all this time every reference to Cloud meant "Cloud Computing", not just plain "cloud".

Monday, June 7, 2010

Very high startup time for vim under screen (GNU-Screen) -- SESSION_MANAGER

I have been using GNU-Screen for a while now and it has been very useful. This morning when I started working, I noticed that vim was taking an unusually long time to start up. It was very irritating. I had faced this same issue some time back but I could not recall the solution. I just remembered that it had something to do with GNOME and the display settings. On searching I found a couple of posts which said that this happens when vim tries to connect to an X server which is either on a distant machine (distant in terms of network delay) or non-existent. Another post on the Ubuntu forums suggested that this could be because of multiple entries for 127.0.0.1 in the /etc/hosts file. Various combinations of commenting/un-commenting entries did not help. I checked the DISPLAY env variable. It looked good too.

Finally I resorted to the last option of using strace. strace did reveal interesting stuff. I saw that the wait/delay was because of a connect() call. Here are a few lines from the strace output :

:~$ cat strace.vim.out | grep connect
connect(3, {sa_family=AF_FILE, path="/tmp/.ICE-unix/6386"}, 21) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/tmp/.ICE-unix/6386"}, 21) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/tmp/.ICE-unix/6386"}, 21) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/tmp/.ICE-unix/6386"}, 21) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/tmp/.ICE-unix/6386"}, 21) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/tmp/.ICE-unix/6386"}, 21) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/tmp/.X11-unix/X0"}, 110) = 0
connect(3, {sa_family=AF_FILE, path="/tmp/.X11-unix/X0"}, 110) = 0
connect(3, {sa_family=AF_FILE, path="/tmp/.X11-unix/X0"}, 110) = 0
connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_FILE, path="/var/centrifydc/daemon2"}, 25) = 0

This confirmed the fact that it was a network related thing, i.e vim is trying to connect to something that did not exist. I thought my X (or GDM to be more precise) was screwed up and thought of logging out and logging back in. I thought of doing some more experiments with this setup to find out what caused the problem.

All of this was running under my gnu-screen session. I opened another gnome terminal to read the redirected output of strace. Accidentally I used vim itself to open the file. Before I realized my mistake and I could start cursing myself, vim popped up..! It was there up and running as fast as it could be... !! Then it hit me that it could be my screen session which is causing this. I did not know how to find the differences in the two environments - in and out of screen. To solve this particular problem I ran strace on vim in the new terminal so that I could compare the two and find out what is lacking. Here is what strace told me in the terminal outside screen :

:~$ cat outside.strace.vim.out | grep connect
connect(3, {sa_family=AF_FILE, path="/tmp/.ICE-unix/28919"}, 22) = 0
connect(4, {sa_family=AF_FILE, path="/tmp/.X11-unix/X0"}, 110) = 0
connect(5, {sa_family=AF_FILE, path="/tmp/.X11-unix/X0"}, 110) = 0
connect(6, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(6, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(6, {sa_family=AF_FILE, path="/var/centrifydc/daemon2"}, 25) = 0

The difference in the path was obvious. On searching through the list of environment variables, SESSION_MANAGER came up. It is the variable which tells all GNOME based apps how to contact the session manager (the GNOME session), and its value carries the /tmp/.ICE-unix/... socket path seen in the strace output. Inside screen it pointed at a socket that did not exist. I do not know what caused this disparity, but most likely setting the appropriate value inside the screen session would have worked. Well, it would have worked only in the screen window in which I changed the value. I have several such windows, so I just chose to start a new screen session.
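For the "how do I find the differences between the two environments" part, one option is to save the output of `env` inside and outside screen and compare the two. Here is a minimal sketch of such a comparison, assuming each variable sits on a single line in NAME=value form (multi-line values are not handled). The function names and the sample values are made up; only the socket paths are borrowed from the strace output above.

```javascript
// Parse the text printed by `env` (one NAME=value per line) into an object.
function parseEnv(text) {
  const vars = {};
  for (const line of text.split("\n")) {
    const eq = line.indexOf("=");
    if (eq > 0) vars[line.slice(0, eq)] = line.slice(eq + 1);
  }
  return vars;
}

// List every variable whose value differs, or which exists on only one side.
function diffEnv(insideText, outsideText) {
  const inside = parseEnv(insideText);
  const outside = parseEnv(outsideText);
  const names = new Set([...Object.keys(inside), ...Object.keys(outside)]);
  return [...names]
    .sort()
    .filter((name) => inside[name] !== outside[name])
    .map((name) => ({ name, inside: inside[name], outside: outside[name] }));
}

// Hypothetical sample: the host part "host" is invented for illustration.
const insideScreen = "TERM=screen\nSESSION_MANAGER=local/host:/tmp/.ICE-unix/6386\n";
const outsideScreen = "TERM=xterm\nSESSION_MANAGER=local/host:/tmp/.ICE-unix/28919\n";
for (const entry of diffEnv(insideScreen, outsideScreen)) {
  console.log(entry.name + ": " + entry.inside + " -> " + entry.outside);
}
```

With real dumps the list will also contain harmless differences such as TERM, so the interesting entry still has to be picked out by eye.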

Tuesday, March 30, 2010

Mozilla @ SJCE : Static Analysis projects

It has been a very long time since I posted anything about the Mozilla related activities going on at my college SJCE, Mysore. That in no way means absence of any activity. In a previous post I mentioned that along with the attempt to introduce Mozilla as a standard course I was working to get the current final year students (who are enrolled under the larger VTU University) to start working on Mozilla, using their final semester project as a means. Well I am happy to say that this has materialized. 8 final semester students from CS expressed interest in working with the Mozilla community as part of their final year project and it's been a month nearly since they started their work. Here is a brief write up about that.

As is with most of the Computer Science students in India approaching the Mozilla community, these 8 students also wanted to do something related to compilers. The JS engine and static analysis are two projects in Mozilla which would come under the compiler banner. These 8 students wanted to work on something substantial which can be presented by two teams of 4 students each as their final semester project. So the bugs that they would be working on had to be related. This was possible only with static analysis as there are a lot of related tasks available. Also static analysis would be something new to the students and it would give them an opportunity to understand the internals of the compiler (GCC here) like the AST (Abstract Syntax Tree) and other representations of the code. Moreover the static analysis APIs are exposed in JS and hence the analysis scripts would be written in JS. That way students would learn JS also. Above all these students would be doing something genuinely new.

The students could not be asked to start working on the bugs directly. They were new to open source development and to the tools used there, like Bugzilla, using email as a formal medium of communication, the source control system (to add to the complexity Mozilla now uses a distributed RCS - Mercurial [hg]), using IRC, the Linux development environment etc. These are the things the students have been learning for the past month or so. This learning has been in the form of accomplishing the tasks which form the prerequisites for the actual static analysis work. These are things like downloading the gcc and mozilla sources from the ftp hosts and from the mercurial repository respectively, applying the mozilla specific patches to gcc for plugin support etc. These are all listed here. Note that some things, like installing all the dependency packages for building these applications from sources, learning to use the Linux command line itself and others, are not on that page but were new to these students nonetheless.

All the students have been putting in substantial effort and have picked up the traits of an open source hacker pretty soon. We have had a few IRC meetings and a lot of formal communications over emails. In parallel we were also working towards shortlisting 8 static analysis bugs. Based on the feasibility of the bug being completed by an amateur developer within a span of 2.5 months and based on the students' interest we finally decided on these 8 bugs :

Bug 525063 - Analysis to produce an error on uninitialized class members
Bug 500874 - Static analysis to find heap allocations that could be stack allocations
Bug 500866 - Warn about base classes with non-virtual destructors
Bug 500864 - Warn on passing large objects by value
Bug 528206 - Warn on unnecessary float->double conversion
Bug 526309 - Basic check for memory leaks
Bug 542364 - Create a static analysis script for detecting reentrancy on a function
Bug 500875 - Find literal strings/arrays/data that should be const static

These tasks are good, challenging and provide an opportunity to understand compilers very closely.

Currently the students have downloaded gcc, applied the patches, built it along with the dehydra plugin support and are ready to run static analysis on the mozilla code. They are now trying to run simple analysis scripts like listing all classes in mozilla code and all classes and their corresponding member functions. It is still quite a long way to go, but it has been a real good start. Let's wait and watch what great feats are in the pipeline.

I hope to keep this blog updated at the same pace at which the students are working.

Good luck to the students. :-)

Thursday, January 28, 2010

Script to get nsIWebProgressListener state names from state codes

This is totally Mozilla specific and probably will not make any sense to anyone not involved with Mozilla code.

So in Mozilla there is an interface named nsIWebProgressListener which can be used to get notifications about any web progress -- a page load in simple terms. These notifications are sent to us by calling our onStateChange method. One of the parameters passed is the state of the request. This is a set of bit flags that shows up in logs as a hex code. Memorizing all the hex codes is insane. So to log the states I wrote a small, dumb script.

I wanted to put it somewhere on the internet, instead of a file on my disk, and hence this blog post. Here is the script to convert nsIWebProgressListener state hex codes to state names. A simple lookup function, but handy for logging.


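// Names of the nsIWebProgressListener state flags, in the same order as flagValues.
var flagNames = [
  "STATE_START",
  "STATE_REDIRECTING",
  "STATE_TRANSFERRING",
  "STATE_NEGOTIATING",
  "STATE_STOP",
  "STATE_IS_REQUEST",
  "STATE_IS_DOCUMENT",
  "STATE_IS_NETWORK",
  "STATE_IS_WINDOW",
  "STATE_RESTORING"
];

// Bit value of each flag; flagValues[i] belongs to flagNames[i].
var flagValues = [
  0x00000001,
  0x00000002,
  0x00000004,
  0x00000008,
  0x00000010,
  0x00010000,
  0x00020000,
  0x00040000,
  0x00080000,
  0x01000000
];

// Returns the names of all the flags set in aFlag, one per line.
function splitFlags(aFlag) {
  var states = "";
  // Plain index loop: for...in on an array also visits inherited enumerable properties.
  for (var i = 0; i < flagValues.length; i++) {
    if (aFlag & flagValues[i]) {
      states += flagNames[i] + "\n";
    }
  }
  return states;
}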
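As a usage sketch, here is a variant that keeps each name next to its value and returns an array instead of a newline-joined string, which is a little easier to plug into a log line. The names STATE_FLAGS and describeState are made up for this example; the flag values are copied from the arrays above.

```javascript
// Each entry pairs a flag name with its bit value (same values as above).
const STATE_FLAGS = [
  ["STATE_START", 0x00000001],
  ["STATE_REDIRECTING", 0x00000002],
  ["STATE_TRANSFERRING", 0x00000004],
  ["STATE_NEGOTIATING", 0x00000008],
  ["STATE_STOP", 0x00000010],
  ["STATE_IS_REQUEST", 0x00010000],
  ["STATE_IS_DOCUMENT", 0x00020000],
  ["STATE_IS_NETWORK", 0x00040000],
  ["STATE_IS_WINDOW", 0x00080000],
  ["STATE_RESTORING", 0x01000000],
];

// Returns the names of all flags set in the given state value, as an array.
function describeState(flags) {
  return STATE_FLAGS
    .filter(([, value]) => (flags & value) !== 0)
    .map(([name]) => name);
}

// 0x00010001 has the STATE_START and STATE_IS_REQUEST bits set.
console.log(describeState(0x00010001).join(" | "));
// prints "STATE_START | STATE_IS_REQUEST"
```

Inside an onStateChange handler this could be used as describeState(aStateFlags).join(" | "), assuming the state parameter is named aStateFlags.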

That's it.
