Neural Machine Translation

Machine translation is an application of NLP in which text in one language is translated into another, for example Spanish to English. Machine translation that uses neural networks, especially recurrent models, is called Neural Machine Translation (NMT). A widely used deep learning model for NMT is the seq2seq model, which consists of an encoder and a decoder. At a high level, the encoder takes the input sentence and the decoder outputs the translated target sentence. Both the encoder and the decoder are built from recurrent layers such as simple RNN, LSTM, or GRU.
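The encode-then-decode flow can be sketched in plain Python. This is only an illustration of the idea, not the code used in this project: `encode` and `decode_step` are hypothetical stand-ins for a trained encoder and decoder, and the `<start>` / `<end>` markers are an assumed convention.

```python
from typing import Callable, List, Tuple

START, END = "<start>", "<end>"  # assumed marker tokens, not taken from this project


def greedy_translate(
    source_tokens: List[str],
    encode: Callable[[List[str]], object],
    decode_step: Callable[[str, object], Tuple[str, object]],
    max_len: int = 20,
) -> List[str]:
    """Encode the source once, then emit target tokens one at a time."""
    state = encode(source_tokens)      # the encoder summarises the whole input
    token, output = START, []
    for _ in range(max_len):           # cap the length so decoding always stops
        token, state = decode_step(token, state)
        if token == END:
            break
        output.append(token)
    return output


# Toy stand-ins so the loop runs end to end (hypothetical, nothing is learned here).
def toy_encode(tokens: List[str]) -> List[str]:
    return [t.upper() for t in tokens]


def toy_decode_step(prev_token: str, state: List[str]) -> Tuple[str, List[str]]:
    if not state:
        return END, state
    return state[0], state[1:]


print(greedy_translate(["hola", "mundo"], toy_encode, toy_decode_step))
# ['HOLA', 'MUNDO']
```

In a real model, `encode` would return the encoder's final hidden state and `decode_step` would run one decoder step and pick the most probable next word.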

This is a simple NMT model that translates from English to Marathi. The dataset contains a total of 38696 English sentences and their respective Marathi translations; you can find the data here. The data is cleaned and stored as a pandas DataFrame with two columns, English and Marathi. It is then split 90% to 10%, with the larger part used for training.
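A minimal sketch of the cleaning and splitting step, using only the standard library. The exact cleaning rules and split code used in this project are not described here, so the lowercase and punctuation rules, the `clean_english` and `split_pairs` helpers, the seed, and the example sentences are all assumptions for illustration.

```python
import random
import string
from typing import List, Tuple

Pair = Tuple[str, str]  # (English sentence, Marathi sentence)


def clean_english(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse repeated whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def split_pairs(
    pairs: List[Pair], test_fraction: float = 0.1, seed: int = 42
) -> Tuple[List[Pair], List[Pair]]:
    """Shuffle a copy of the pairs and hold out `test_fraction` of them."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up placeholder pairs standing in for the real dataset.
pairs = [(clean_english(f"Sentence number {i}!"), f"वाक्य {i}") for i in range(10)]
train, test = split_pairs(pairs)
print(len(train), len(test))  # 9 1
```

Fixing the seed keeps the split reproducible, so results from different training runs stay comparable.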

I have trained a seq2seq model without attention. The LSTM (Long Short-Term Memory) encoder and decoder are trained for 30 epochs on 34825 sentence pairs.

Model Performance

Model performance is good considering the amount of data it was trained on. It performs well on short sentences of 4-5 words but less well on longer ones; a seq2seq model without attention is known to struggle with lengthy sentences, which is one of its disadvantages. It also has trouble decoding sequences that contain names, because the only name in the dataset is 'Tom'. It decodes that name easily, but given a sentence with any other name, such as 'My name is sam', it cannot decode 'sam' and produces something like 'माझं नाव आहे', which leaves the name out.
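The name problem is an out-of-vocabulary issue: a word the model never saw during training has no entry it can translate. The sketch below shows how such words can be spotted before translation. The `build_vocab` and `unknown_words` helpers and the training sentences are made up for illustration; how this project's tokenizer actually treats unseen words is not stated.

```python
from typing import Iterable, List, Set


def build_vocab(sentences: Iterable[str]) -> Set[str]:
    """Collect every lowercase word seen in the training sentences."""
    return {word for sentence in sentences for word in sentence.lower().split()}


def unknown_words(sentence: str, vocab: Set[str]) -> List[str]:
    """Return the words of `sentence` that never appeared in training."""
    return [word for word in sentence.lower().split() if word not in vocab]


# Made-up training sentences in which 'Tom' is the only name.
train_sentences = ["My name is Tom", "Tom is my friend"]
vocab = build_vocab(train_sentences)

print(unknown_words("My name is Tom", vocab))  # []
print(unknown_words("My name is Sam", vocab))  # ['sam']
```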

Improvements

The following changes could improve this translator:

  • It can be trained on more varied data.
  • To address the limitation on lengthy sentences, we can add an attention mechanism.
  • We can try replacing the LSTM with a GRU and check the performance of the model.
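To make the attention idea concrete, here is a minimal sketch of dot-product attention in plain Python. Dot-product scoring is only one of several attention variants and is chosen here for brevity; the function names and toy numbers are assumptions, and none of this is part of the current model.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into weights that sum to 1 (shifted by the max for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_attention(
    decoder_state: Vector, encoder_states: List[Vector]
) -> Tuple[List[float], Vector]:
    """Score each encoder state against the decoder state and build a context vector."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc)) for enc in encoder_states]
    weights = softmax(scores)
    # Context vector: weighted sum of the encoder states, one dimension at a time.
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(len(decoder_state))
    ]
    return weights, context


# Toy numbers: three source positions with two-dimensional hidden states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = dot_attention([0.0, 2.0], encoder_states)
print([round(w, 3) for w in weights])
```

Because the decoder looks back at every encoder state at each step instead of relying on one fixed-size summary, long sentences lose less information.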

These are a few of the improvements I can suggest. There is no end to improvement, since things keep improving as time progresses, so the improvement of this NMT model is not limited to the points above.

English to Marathi Translator