Friday, September 17, 2010

List of countries and regions from FreeBase



When developing a web application, it is typical to ask for the user's mailing address. Country and region are easy to misspell, so it would be nice to have autocompletion for those two fields.

But where can we get the list of all countries in the world and their regions? Preferably in a format which can be consumed by a programming language, for example JSON: you don't want to write an HTML parser to retrieve your data from Wikipedia pages.

Here is FreeBase for you. FreeBase is a free hierarchical database. You can think of it as Wikipedia for computers.

FreeBase allows you to write your query once and not revisit your program each time a country splits in two or time zone information changes. As soon as the new information is entered into FreeBase, your program will use it without any changes. Another concern is that collecting this information manually is quite a lot of work. Just think how much time you would spend collecting all countries, regions and cities, and how quickly that information goes out of date: your application would become outdated in a matter of months.

First, we need to write a query to retrieve the list of all countries in the world.

http://www.freebase.com/api/service/mqlread?queries={"q1":{"query":[{"limit":1000,"name":null,"type":"/location/country"}]}}

This will produce the following result:

{
"code": "/api/status/ok",
"q1": {
"code": "/api/status/ok",
"result": [
{
"name": "United States of America",
"type": "/location/country"
},
{
"name": "Germany",
"type": "/location/country"
},
{
"name": "Australia",
"type": "/location/country"
},
{
"name": "Iran",
"type": "/location/country"
},
{
"name": "United Kingdom",
"type": "/location/country"
},
...
...
...
{
"name": "Rome, Italy 11.15.04",
"type": "/location/country"
},
{
"name": "Diff\u00e9rance",
"type": "/location/country"
},
{
"name": "Kingdom of Croatia-Slavonia",
"type": "/location/country"
},
{
"name": "Tyse",
"type": "/location/country"
},
{
"name": "Rain",
"type": "/location/country"
},
{
"name": "migrated from India in 1968",
"type": "/location/country"
},
{
"name": "Republic of Genoa",
"type": "/location/country"
},
{
"name": "Mygdonian",
"type": "/location/country"
},
{
"name": "Theban",
"type": "/location/country"
}
]
},
"status": "200 OK",
"transaction_id": "cache;cache01.p01.sjc1:8101;2010-09-19T00:03:10Z;0013"
}


Looks good. But wait, I've never heard of a country named "migrated from India in 1968".
Apparently, there is some noise in the database, which can be explained by mistakes of volunteer contributors. They confuse a topic's type with its associations: instead of linking a topic to some country, they mistakenly give the topic itself the country type.
To filter out this noise we can keep only those countries which have a FIPS code. The query is:

[{
"type": "/location/country",
"limit": 600,
"name": null,
"id": null,
"fips10_4": {"value": null, "optional": false}
}]


I won't go into too much detail about MQL; you can read more on the FreeBase site. The basic idea is: if we give a property a value, MQL searches by that value; if we leave a property empty (null), MQL returns its value.
"optional" is a special directive which controls how unassigned properties are treated. "optional": false demands that the property exists.
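To make the "value filters, empty property returns" idea concrete, here is a toy matcher over an in-memory array. It only imitates the behaviour described above; it is not MQL, and the function name and the sample data are made up for illustration.

```javascript
// Toy "query by example" matcher over an in-memory array.
// It only imitates the idea described above; it is NOT the MQL engine.
function queryByExample(topics, pattern) {
  var keys = Object.keys(pattern);
  return topics.filter(function (topic) {
    return keys.every(function (key) {
      var wanted = pattern[key];
      if (wanted === null) {
        return true; // empty property: no filtering, just report the value
      }
      if (typeof wanted === 'object') {
        // {"value": null, "optional": false}: the property must exist
        return wanted.optional !== false || topic[key] !== undefined;
      }
      return topic[key] === wanted; // property with a value: search by it
    });
  }).map(function (topic) {
    var row = {};
    keys.forEach(function (key) {
      row[key] = topic[key] === undefined ? null : topic[key];
    });
    return row;
  });
}

// Made-up sample data, not real FreeBase output.
var sampleTopics = [
  { type: '/location/country', name: 'Germany', fips10_4: 'GM' },
  { type: '/location/country', name: 'Rain' },
  { type: '/location/citytown', name: 'Berlin' }
];

var found = queryByExample(sampleTopics, {
  type: '/location/country',
  name: null,
  fips10_4: { value: null, optional: false }
});
// found contains only the Germany entry: "Rain" has no FIPS code.
```

The noisy "Rain" topic drops out for the same reason the noise disappears from the real result: it has the country type but no FIPS code.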
Here is the URL:

http://api.freebase.com/api/service/mqlread?query={"query":[{"type":"/location/country","limit":600,"name":null,"id":null,"fips10_4":{"value":null,"optional":false}}]}
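The URL above carries raw JSON, braces and quotes included, in the query string. Browsers usually tolerate that, but it is safer to percent-encode it. A minimal sketch, assuming a helper name (buildMqlReadUrl) which is mine and not part of any FreeBase library:

```javascript
// Hypothetical helper: builds an mqlread URL from a plain query object.
// The endpoint and the {"query": [...]} envelope are the ones used above.
function buildMqlReadUrl(query) {
  var envelope = { query: [query] };
  return 'http://api.freebase.com/api/service/mqlread?query=' +
    encodeURIComponent(JSON.stringify(envelope));
}

var countriesQuery = {
  type: '/location/country',
  limit: 600,
  name: null,
  id: null,
  fips10_4: { value: null, optional: false }
};

var url = buildMqlReadUrl(countriesQuery);
```

Keeping the query as an object also makes it easier to change the limit or add a property later without hand-editing a long string.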

Much better: "migrated from India" is gone, and so are ancient Greek states like Theban, because they do not have a FIPS code.

Now we want to do something useful with this info. As we have seen, the information in the DB is not super reliable, so the best approach is to let users select their country but still leave them the option to enter text which is not in the database. jQuery UI's autocomplete widget is what we want.


<!doctype html>
<html>
<head>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.js" type="text/javascript"></script>
<script src="http://ajax.googleapis.com/ajax/libs/jqueryui/1.8.5/jquery-ui.min.js" type="text/javascript"></script>
<link rel="stylesheet" href="http://ajax.googleapis.com/ajax/libs/jqueryui/1.8.5/themes/smoothness/jquery-ui.css" type="text/css"/>
<script type="text/javascript">
$(function() {
  var countriesUrl = 'http://api.freebase.com/api/service/mqlread?callback=?&query={"query":[{"type":"/location/country","limit":600,"name":null,"id":null,"fips10_4":{"value":null,"optional":false}}]}';
  $.getJSON(countriesUrl, function(data, textStatus, xhr) {
    var names = $.map(data.result, function(d) { return d.name; });
    $("#country").autocomplete({source: names});
  });
});
</script>
</head>

<body>
<p>Country: <input id="country" type="text" /></p>
</body>
</html>


Here we go: country autocomplete in 4 lines of code!
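The page above trusts the response blindly. A slightly more defensive sketch follows, assuming the response has a top-level "code" and a "result" array, which is the shape the page relies on. The helper name extractCountryNames is hypothetical.

```javascript
// Hypothetical helper: turns an mqlread response into a sorted list of names.
// Assumes the response shape {code: "/api/status/ok", result: [{name: ...}, ...]}.
function extractCountryNames(data) {
  if (!data || data.code !== '/api/status/ok' || !Array.isArray(data.result)) {
    return []; // error or unexpected shape: offer no suggestions
  }
  return data.result
    .map(function (d) { return d ? d.name : null; })
    .filter(function (name) { return typeof name === 'string' && name !== ''; })
    .sort();
}

// Made-up sample response for illustration.
var sampleNames = extractCountryNames({
  code: '/api/status/ok',
  result: [{ name: 'Germany' }, { name: null }, { name: 'Australia' }]
});
```

In the page, the result would go straight into the widget as the "source" option, in place of the "names" variable.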

Watch out for the same-origin policy: the browser will not allow an XMLHttpRequest because the origin of your HTML page is different from api.freebase.com. To address this, we added "callback=?" to the query parameters, so the result is returned as JSONP.
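If JSONP is new to you, the trick is that the server wraps the JSON in a call to a function whose name the client chose, and the browser loads the response with a script tag, which is not subject to the same restriction. jQuery replaces the "?" in "callback=?" with a generated function name. A conceptual sketch, with function names and sample data that are purely illustrative:

```javascript
// Server side, conceptually: wrap the JSON payload in a call to the named function.
function wrapAsJsonp(callbackName, payload) {
  return callbackName + '(' + JSON.stringify(payload) + ');';
}

// Browser side, conceptually: the response is loaded by a <script> tag and executed,
// which calls the function registered under callbackName.
function runJsonp(scriptText, callbackName, callback) {
  new Function(callbackName, scriptText)(callback);
}

var received = null;
var responseText = wrapAsJsonp('jsonp1', {
  code: '/api/status/ok',
  result: [{ name: 'Germany' }]
});
runJsonp(responseText, 'jsonp1', function (data) { received = data; });
// received now holds the parsed object
```

Keep in mind that JSONP executes whatever the remote server sends, so use it only with services you trust.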

There is a jQuery component called "freebase suggest", but it implements autocomplete in its own way and we want to stick to standard components as much as possible; that's why we used jQuery UI's native "autocomplete".

In the next post I'll add querying the regions of the chosen country.

Sunday, July 25, 2010

NHibernate in Git and Mercurial.

I imported the NHibernate sources into github.com and bitbucket.org.
First, I downloaded a snapshot of the svn repository from http://fabiomaulo.blogspot.com/2010/06/nhibernate-svn-local-mirror.html. You can set up a mirror on your own, but the download is faster.

Then I created 2 folders, one for git and one for mercurial, and ran these 2 scripts:


cd ../NHibernateMirror.git
git svn clone file:///home/vadim/projects/NHibernateMirror -T trunk/nhibernate .



hg clone file:///home/vadim/projects/NHibernateMirror/trunk/nhibernate/ ../NHibernateMirror.hg


I don't like trunk containing a folder "nhibernate"; it is redundant IMHO, so I made the essential content the root of the git and mercurial repositories. Because of this, importing tags and branches is problematic, but I'm interested in trunk only.

The import is good, but how do we keep it up to date?
Update your local svn mirror:

svnsync sync file:///home/vadim/projects/NHibernateMirror


Update Git:

cd ../NHibernateMirror.git
git svn fetch
git svn rebase
git push origin master


Update Mercurial:

cd ../NHibernateMirror.hg
hg pull
hg push ssh://hg@bitbucket.org/vadim/nhibernate

Wednesday, June 09, 2010

Poor man's CSS Framework



If you have a simple site structure, for example the classic header, left panel, content panel and footer, then it may be an option to skip a full-scale CSS framework. jQuery UI introduced the "position" utility. Here is how I stitched my content panel to the left panel:

$('#pages').position({
of: $('#main-menu'),
my: "left top",
at: "right top"
})


No stupid tricks with CSS margins!
Given that CSS3 layout won't be ready any time soon, and given the less-than-ideal architecture of CSS overall, jQuery could become the layout engine of choice in the future.
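For the curious, the geometry behind "my" and "at" is simple. Here is a rough model of it, assuming rectangles given as {left, top, width, height} in pixels. The function names are mine, and the real utility does more (collision handling, for instance), so treat this as an illustration only.

```javascript
// Rough model of the "my"/"at" geometry; NOT the jQuery UI implementation.
// Converts a spec like "right top" into a point on the given rectangle.
function anchorPoint(rect, spec) {
  var parts = spec.split(' ');
  var fx = { left: 0, center: 0.5, right: 1 }[parts[0]];
  var fy = { top: 0, center: 0.5, bottom: 1 }[parts[1]];
  return { x: rect.left + rect.width * fx, y: rect.top + rect.height * fy };
}

// Returns where the element's top-left corner must go so that its "my" point
// lands on the target's "at" point.
function computePosition(elementSize, targetRect, my, at) {
  var atPoint = anchorPoint(targetRect, at);
  var myOffset = anchorPoint(
    { left: 0, top: 0, width: elementSize.width, height: elementSize.height },
    my
  );
  return { left: atPoint.x - myOffset.x, top: atPoint.y - myOffset.y };
}

// Made-up sizes: a 200px wide menu at (10, 50) and a 600x400 content panel.
var menuRect = { left: 10, top: 50, width: 200, height: 300 };
var pagesPosition = computePosition({ width: 600, height: 400 }, menuRect, 'left top', 'right top');
```

With these numbers the content panel starts exactly where the menu ends, at the same height.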

JSON and WCF


Microsoft's JSON implementation just pisses me off.

First, they serialize enums as numbers.
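On the client, the usual workaround is a lookup table. A minimal sketch, assuming a made-up enum whose members are numbered 0, 1, 2 in declaration order; the names below are hypothetical and not from any real service.

```javascript
// Hypothetical enum, in the order it would be declared on the server.
var ORDER_STATUS = ['Pending', 'Shipped', 'Cancelled'];

// Maps the number received in JSON back to a readable name.
function enumName(names, value) {
  var isIndex = typeof value === 'number' && value % 1 === 0 &&
    value >= 0 && value < names.length;
  return isIndex ? names[value] : 'Unknown (' + value + ')';
}

var statusText = enumName(ORDER_STATUS, 1);
```

The obvious downside is that the table has to be kept in sync with the server by hand.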

Second, there is this strange date type serialization. OK, I can understand that JavaScript has no hint that a given string is a date, but when the client sends '2000-01-01' to the server, which does know the type, how stupid is it to demand: "DateTime content '2000-01-01' does not start with '\/Date(' and end with ')\/' as required for JSON"???
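So the client has to produce the format itself. A sketch of one way to do it, assuming the number inside Date(...) is milliseconds since the Unix epoch and that the slashes must appear escaped in the JSON text, as the error message suggests. Both helper names are hypothetical.

```javascript
// Wraps a Date the way the error message demands (before JSON escaping).
function toWcfDate(date) {
  return '/Date(' + date.getTime() + ')/';
}

// JSON.stringify leaves "/" unescaped, so escape the slashes around Date(...) afterwards.
function stringifyForWcf(value) {
  return JSON.stringify(value).replace(/\/Date\((-?\d+)\)\//g, '\\/Date($1)\\/');
}

var body = stringifyForWcf({ when: toWcfDate(new Date(Date.UTC(2000, 0, 1))) });
// body is the JSON text to send in the request
```

Since "\/" is a legal JSON escape for "/", any ordinary JSON parser still reads the value back as a plain string.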

Thursday, April 29, 2010

Automatic build from tags (not trunk)


Requirements


My requirements for Continuous Integration are slightly different. Our project involves semi-manual preparation of SQL scripts for a build, so I cannot reliably build every version from svn. Only the code in tags/Builds is good. But I am so lazy that I don't even want to run my build script. I want a new build to be detected and built automatically. Just email me when you are done, please :)

My first attempt to automate daily builds was to create my own "small" ASP.NET project to do it. But as soon as I realized that I needed a Windows service to perform long-running tasks, such as source code checkout, I abandoned the idea. It is too much effort, and I should be able to find something ready-made.


Almost continuous integration


So I decided to give CruiseControl.NET a whirl. I had heard of it before, but it does something I do not need: it tracks *any* change to the *trunk*, whereas I need to track new folders in "tags/Builds" and trigger an svn checkout of that particular subfolder, not just an "svn update" of the "trunk" folder.

As I suspected, the CCNet Svn plugin can create new labels in tags, but it cannot track changes in the tags folder.
I tried the Hudson build manager too. More plugins, a much better UI, but the same problem: it tracks trunk only.

Such minor problems do not stop me, so I dived into CruiseControl.NET.
At first I tried to do what I wanted by introducing a fake task which would check the last build tag and compare it to the latest one available in svn. But CCNet asks the "source" plugin for changes, and if it detects nothing, no "task" is invoked.
So I started digging into CCNet's "Svn" class. Well, it can be done, but it is a lot of work, and making it flexible rather than tied to my particular setup is even more work.

All of a sudden I took another look at a seemingly unrelated plugin, "external source". The documentation says you use it to integrate with other source control systems, but you can do more with it: you can call your own script, which implements custom svn search logic.

Bummer: the "external source" command line is specified as "executable GETMODS "fromtimestamp" "totimestamp" args". So if I want to execute "ruby.exe /path/to/ruby/script.rb", I can't: the script parameter ends up last in the list of parameters. The same goes for "cmd.exe": I can't pass the parameters in the order I want.

But this is easily fixable: download the sources (make sure you get the same version as the binary package you have installed), add one more parameter, "argsLeading", and it works!
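To see why the order matters, here is a small model of how the command line is put together with and without the new parameter. It is a simplification written for illustration (the function name and the timestamps are made up, and real quoting rules are ignored), not the CCNet code.

```javascript
// Simplified model of the "external source" command line; not the CCNet implementation.
// argsLeading is the new parameter and goes right after the executable.
function externalCommandLine(executable, operation, first, second, args, argsLeading) {
  var parts = [executable];
  if (argsLeading) {
    parts.push(argsLeading);
  }
  parts.push(operation, '"' + first + '"', '"' + second + '"');
  if (args) {
    parts.push(args);
  }
  return parts.join(' ');
}

var script = 'C:\\Projects\\Your\\Project\\TagCheck.rb';

// Without the patch the script name can only go last, so ruby.exe would try to run "GETMODS".
var before = externalCommandLine('ruby.exe', 'GETMODS', '2010-04-29 10:00:00', '2010-04-29 09:00:00', script, '');

// With argsLeading the script name comes first and receives the rest as its arguments.
var after = externalCommandLine('ruby.exe', 'GETMODS', '2010-04-29 10:00:00', '2010-04-29 09:00:00', '', script);
```

In the second form the script sees GETMODS as its first argument, which is exactly what a handling script needs in order to dispatch on the operation name.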

Configuration



<project name="Your Project">
<workingDirectory>C:\tmp\ccnet-working\YourProject</workingDirectory>
<artifactDirectory>C:\tmp\ccnet-working\YourProject.Artifacts</artifactDirectory>
<triggers>
<intervalTrigger name="interval" seconds="3600" initialSeconds="10"/>
</triggers>

<sourcecontrol type="external" autoGetSource="true">
<executable>ruby.exe</executable>
<argsLeading>C:\Projects\Your\Project\TagCheck.rb</argsLeading>
<args></args>
</sourcecontrol>

<tasks>
<!--<nullTask />-->
<msbuild projectFile="src/YourProject.sln">
<executable>C:\WINDOWS\Microsoft.NET\Framework\v3.5\MSBuild.exe</executable>
<logger>C:\Program Files (x86)\CruiseControl.NET\server\ThoughtWorks.CruiseControl.MSBuild.dll</logger>
</msbuild>
</tasks>
<publishers>
<email mailhost="mail" from="build-no-reply@enviance.com">
<users>
<user name="BuildGuru" group="buildmaster" address="you@your.company.com" />
<user name="JoeDeveloper" group="developers" address="you@your.company.com" />
</users>
<groups>
<group name="developers">
<notifications>
<notificationType>Failed</notificationType>
<notificationType>Fixed</notificationType>
</notifications>
</group>
<group name="buildmaster">
<notifications>
<notificationType>Always</notificationType>
</notifications>
</group>
</groups>
</email>
<xmllogger/>
</publishers>
</project>



Handling script

require 'date' # needed for DateTime.parse below

$svn_tags='https://svn.yourcompany.com/svn/your/project/tags/Builds'

def last_tag
`svn.exe ls #{$svn_tags}`.split().last().chomp('/')
end

def last_build
Dir.entries('.').select {|d| d =~ /^\d{8,}/}.sort().last() || '0'
end

def svn_info(tag)
info = {}
`svn.exe info #{$svn_tags}/#{tag}`.
split("\n").each {|line|
pair=line.split(/ *: */)
info[pair[0]]=pair[1]
}
info
end

#
# Get Modifications
#
if ARGV[0] == 'GETMODS'
if not last_tag > last_build
puts ''
exit 0
end

info = svn_info(last_tag)
date=DateTime.parse(info['Last Changed Date']).strftime()

# in fact, we should filter only modifications which are in between
# the ones in command line, but seems CCNet does check the result,
# so let's always return the latest entry
puts "

#{info['Revision']}
New build
#{last_tag}
#{date}
#{info['Last Changed Author']}

"
exit 0
#
# Get Source
#
elsif ARGV[0] == 'GETSOURCE'
workdir = ARGV[1]
timestamp = DateTime.parse(ARGV[2])
last = last_tag
info = svn_info(last)
date = DateTime.parse(info['Last Changed Date'])
if date > timestamp
STDERR.puts "Command line timestamp must not be earlier than the svn date. Svn: '#{date}' command line: '#{timestamp}'"
exit 1
end
puts `svn.exe export #{$svn_tags}/#{last} #{workdir} --force`
exit 0
end


exit 1


Patch



Index: project/core/sourcecontrol/ExternalSourceControl.cs
===================================================================
--- project/core/sourcecontrol/ExternalSourceControl.cs (revision 7225)
+++ project/core/sourcecontrol/ExternalSourceControl.cs (working copy)
@@ -183,6 +183,14 @@
[ReflectorProperty("args", Required = false)]
public string ArgString = string.Empty;

+ /// <summary>
+ /// The same as "arg" but it will be the first parameter in command line.
+ /// Is useful if external program is a script engine and you need 1st parameter
+ /// to be a script name.
+ /// </summary>
+ [ReflectorProperty("argsLeading", Required = false)]
+ public string ArgLeadingString = string.Empty;
+
/// <summary>
/// Should we automatically obtain updated source from the source control system or not?
/// </summary>

@@ -237,7 +245,8 @@
///
public override Modification[] GetModifications(IIntegrationResult from, IIntegrationResult to)
{
- string args = string.Format(@"GETMODS ""{0}"" ""{1}"" {2}",
+ string args = string.Format(@"{0} GETMODS ""{1}"" ""{2}"" {3}",
+ ArgLeadingString,
FormatCommandDate(to.StartTime),
FormatCommandDate(from.StartTime),
ArgString);
@@ -265,7 +274,8 @@

if (AutoGetSource)
{
- string args = string.Format(@"GETSOURCE ""{0}"" ""{1}"" {2}",
+ string args = string.Format(@"{0} GETSOURCE ""{1}"" ""{2}"" {3}",
+ ArgLeadingString,
result.WorkingDirectory,
FormatCommandDate(result.StartTime),
ArgString);
@@ -286,7 +296,8 @@
{
if (LabelOnSuccess && result.Succeeded && (result.Label != string.Empty))
{
- string args = string.Format(@"SETLABEL ""{0}"" ""{1}"" {2}",
+ string args = string.Format(@"{0} SETLABEL ""{1}"" ""{2}"" {3}",
+ ArgLeadingString,
result.Label,
FormatCommandDate(result.StartTime),
ArgString);