Thursday, May 26, 2005


Google Translator: The Universal Language

Here is some really interesting machine learning. Google took all of the UN's translated documents and fed them to a statistical engine of some sort. Now that engine can translate between all of the UN languages with fairly high accuracy.
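I don't know what Google's engine actually does internally, so take this as a toy sketch of the general idea rather than their method. It uses a textbook technique (IBM Model 1 word alignment, trained with expectation-maximization) to learn word translation probabilities purely from sentence pairs that are known to be translations of each other. The three-sentence corpus and the names `train_model1` and `best_translation` are made up for illustration.

```python
from collections import defaultdict


def train_model1(pairs, iterations=10):
    """Estimate P(target_word | source_word) from aligned sentence pairs.

    pairs: list of (source_tokens, target_tokens) tuples.
    Returns a dict keyed by (target_word, source_word).
    """
    # Start with the same (arbitrary, positive) score for every word pair.
    t = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        count = defaultdict(float)  # expected co-translation counts
        total = defaultdict(float)  # expected counts per source word
        for src, tgt in pairs:
            for w in tgt:
                # Share the credit for w among all source words in the sentence.
                norm = sum(t[(w, s)] for s in src)
                for s in src:
                    frac = t[(w, s)] / norm
                    count[(w, s)] += frac
                    total[s] += frac
        # Re-estimate the probabilities from the expected counts.
        t = defaultdict(float, {(w, s): c / total[s] for (w, s), c in count.items()})
    return t


def best_translation(t, source_word):
    """Return the target word with the highest learned probability."""
    candidates = {w: p for (w, s), p in t.items() if s == source_word}
    return max(candidates, key=candidates.get)


# A tiny hypothetical parallel corpus (English -> German).
corpus = [
    ("the house".split(), "das haus".split()),
    ("the book".split(), "das buch".split()),
    ("a book".split(), "ein buch".split()),
]

table = train_model1(corpus)
for word in ["the", "house", "book", "a"]:
    print(word, "->", best_translation(table, word))
```

Nobody tells the program which word means what. It only sees which sentences go together, and the repeated re-estimation sorts out the rest. A real system needs far more data and also has to handle word order and phrases, which this sketch ignores.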

I am curious about how this would handle colloquial use of language. My guess is not terribly well, unless it has enough colloquial input data to chew on. I can imagine this type of database being grown organically though, with people translating the occasional document or snippet in their spare time, or as needed. How much statistical weight piecemeal (and potentially low-quality) translations would carry is debatable, though. It may be that a rigorous and consistent approach, like the one the UN has to take with all of its documents, would be needed.

where did you go?
Hi Jose,

I'm on holiday right now, but I've been having trouble finding things to post about.

If you've been reading regularly, I apologize, and will try to post more (really, I didn't know that people I don't already know find anything here interesting).

