Almost all such systems are built for a single language pair; until recently there was no sufficiently simple and efficient way to handle multiple language pairs with a single model without making significant changes to the basic NMT architecture.
Google’s engineers working on NMT released a paper last year detailing a multilingual NMT system that avoids significant changes to the underlying architecture, which was originally designed for single-language-pair translation.
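The paper’s central trick is disarmingly simple: instead of modifying the model, prepend an artificial token to the source sentence telling the shared encoder–decoder which target language to produce. A minimal sketch of that idea follows; the token format and the `tag_source` helper are illustrative, not taken verbatim from the paper.

```python
# Sketch of the multilingual-NMT input trick: one model serves every
# language pair because the target language is signaled by an artificial
# token prepended to the source sentence, not by the architecture.
# The "<2xx>" token format and this helper are illustrative assumptions.

def tag_source(sentence: str, target_lang: str) -> str:
    """Prepend an artificial token telling the model which language to emit."""
    return f"<2{target_lang}> {sentence}"

# The same encoder-decoder then handles every pair; only the input changes.
print(tag_source("How are you?", "es"))  # -> <2es> How are you?
print(tag_source("How are you?", "ro"))  # -> <2ro> How are you?
```

Because nothing else about the input distinguishes language pairs, the model can even attempt pairs it was never explicitly trained on (the paper’s “zero-shot” translation).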
This makes me wonder what an NMT architecture optimized for multilinguality from the start would look like, how it would differ structurally from what is currently in place, and whether it would behave any differently.
I also wonder whether the system would look different had it been architected in a language other than English. What do you get if you cross Sapir-Whorf with systems architecture?
And I wonder how the machine would translate ‘soy milk’ to Romanian. Would it assume English is the source language because of ‘milk’, or Spanish because of ‘soy’?