Sunday, April 20, 2008

Server-Side Language Detection with Ruby + Google Language API

Have you ever wanted to detect the language of a piece of text? Google's AJAX Language API makes this possible on the client side. A blog post from a Belgian startup shows a PHP example of calling Google's detection service from the server side. Here is a port of that example to Ruby:

(the 'json' gem must be installed prior to running this program)

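require 'rubygems' # only needed on Ruby 1.8; harmless on later versions
require 'net/http'
require 'cgi'
require 'json'

# Endpoint of Google's detection service; the text to detect goes in the q parameter.
base_url = 'http://www.google.com/uds/GlangDetect?v=1.0&q='
# URL-encode the text so spaces and punctuation survive in the query string.
url = base_url + CGI.escape("See if you can guess what language this is!")
response = Net::HTTP.get_response(URI.parse(url))
# The script expects a JSON body with the language code under responseData -> language.
result = JSON.parse(response.body)
lang = result['responseData']['language']
puts "Language code: #{lang}"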
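The script above assumes the request succeeds and that the body always contains a 'responseData' hash. If the service returns an error page or an empty result, result['responseData']['language'] will raise. Here is a minimal sketch of a more defensive parsing step. The helper name extract_language and the sample body are my own illustrations, not part of Google's API; the only thing assumed about the response is the responseData/language shape that the script above already reads.

```ruby
require 'json'

# Hypothetical helper: pulls the language code out of a detection response body.
# Returns nil when the body is not valid JSON or lacks a 'responseData' hash.
def extract_language(body)
  data = JSON.parse(body)
  return nil unless data.is_a?(Hash)

  response_data = data['responseData']
  return nil unless response_data.is_a?(Hash)

  response_data['language']
rescue JSON::ParserError
  nil
end

# Hand-written sample body (an assumption), mirroring only the keys read above.
sample = '{"responseData": {"language": "en"}}'
puts "Language code: #{extract_language(sample) || 'unknown'}"
```

In the original script you would pass response.body to this helper in place of the JSON.parse and hash lookup lines, and treat a nil result as "could not detect".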

3 comments:

Anonymous said...

I have developed an Online Translation Service page that uses the Google AJAX Language API. I also used the Google AJAX Language API to translate the user interface of the Online Translation Service into the following languages: English, Arabic, Bulgarian, Chinese, Croatian, Czech, Danish, Dutch, Finnish, French, German, Greek, Hindi, Italian, Japanese, Korean, Norwegian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, etc.

Anonymous said...

This gem allows you to do the same locally, without the extra HTTP request to Google: http://github.com/peterc/whatlanguage/tree/master

Anonymous said...

Here is an n-gram based Ruby language detector. It supports more languages and performs better:
http://github.com/feedbackmine/language_detector/tree/master