Neural Machine Translation
ENGLISH-MARATHI
Machine Translation is an application of NLP
in which text in one language is translated into another language,
for example translating Spanish to English. Machine Translation
using neural networks, especially recurrent models, is called
Neural Machine Translation, or NMT for short.
A widely used deep learning model for NMT is the seq2seq model,
which has an Encoder and a Decoder. At a high level, the Encoder takes the
input sentence and the Decoder outputs the translated target sentence.
Both the Encoder and the Decoder are built using recurrent layers such as RNN, LSTM, or GRU.
This is a simple NMT model that translates from English to Marathi.
The dataset includes a total of 38696 English sentences and their
respective Marathi sentences. You can find the data from here.
The data is then cleaned and stored as a pandas DataFrame with 2 columns, English and Marathi.
After this, it is split 90% to 10%.
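The 90% to 10% split can be sketched as follows. This is a minimal illustration in plain Python, not the code used in this project: the helper name `split_pairs`, the fixed seed, and the toy sentence pairs are all assumptions made for the example.

```python
import random

def split_pairs(pairs, train_fraction=0.9, seed=42):
    """Shuffle sentence pairs and cut them into two parts (hypothetical helper)."""
    shuffled = list(pairs)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # seeded shuffle for a repeatable split
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Toy stand-ins for the English and Marathi columns (placeholders, not real data).
pairs = [(f"english sentence {i}", f"marathi sentence {i}") for i in range(20)]
train_pairs, held_out_pairs = split_pairs(pairs)
print(len(train_pairs), len(held_out_pairs))
```

Shuffling before cutting keeps the held-out part from being just the tail of the file, which matters if the sentences are stored in some order such as by length.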
I have trained a Seq2Seq model without attention.
The LSTM (Long Short-Term Memory) encoder and decoder are
trained for 30 epochs on 34825 sentence pairs.
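A seq2seq decoder is commonly trained with teacher forcing, where the decoder sees the target sentence shifted by one position and learns to predict the next word. The README does not say exactly how the training pairs were prepared here, so the sketch below is only an assumption about a typical setup; the `<start>` and `<end>` markers and the helper name `make_decoder_pair` are hypothetical.

```python
START, END = "<start>", "<end>"   # hypothetical sentence boundary markers

def make_decoder_pair(target_sentence):
    """Build the decoder input and the expected output for one target sentence."""
    tokens = [START] + target_sentence.split() + [END]
    decoder_input = tokens[:-1]    # what the decoder is fed at each step
    decoder_target = tokens[1:]    # what it should predict at that step
    return decoder_input, decoder_target

# A short Marathi phrase split on whitespace, purely for illustration.
decoder_input, decoder_target = make_decoder_pair("माझं नाव आहे")
for seen, expected in zip(decoder_input, decoder_target):
    print(f"{seen} -> {expected}")
```

At translation time the true target is not available, so the decoder is instead fed its own previous prediction, starting from the start marker and stopping when it emits the end marker.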
Model performance is good considering the amount of data it was trained on.
It performs well on short sentences of 4-5 words but not as well on
longer sentences; a seq2seq model without attention is known to perform poorly
on lengthy sentences, which is one of its disadvantages. It also has a problem
decoding sequences that involve names, because the only name used in the dataset
is 'Tom'. It decodes this name easily, but if you give it a sentence containing
a name other than 'Tom', like 'My name is sam', it is not able to decode the name 'sam'
and produces something like 'माझं नाव आहे', with the name missing.
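The name problem is an out-of-vocabulary effect, and a small sketch can make it concrete. A word-level vocabulary built only from the training sentences has no entry for a word it never saw. How this project actually handles unseen words is not stated, so the `<unk>` placeholder, the helper names, and the two toy sentences below are assumptions for illustration.

```python
UNK = "<unk>"   # hypothetical placeholder for words missing from the vocabulary

def build_vocab(sentences):
    """Assign an integer id to every word seen in the training sentences."""
    vocab = {UNK: 0}
    for sentence in sentences:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))   # new words get the next free id
    return vocab

def encode(sentence, vocab):
    """Map words to ids, falling back to the unknown id for unseen words."""
    return [vocab.get(word, vocab[UNK]) for word in sentence.lower().split()]

train_sentences = ["My name is Tom", "Tom is here"]   # toy data, not the real dataset
vocab = build_vocab(train_sentences)
print(encode("My name is Tom", vocab))   # every word has its own id
print(encode("My name is sam", vocab))   # 'sam' collapses to the unknown id
```

Once 'sam' has been mapped to the same id as every other unseen word, the model has nothing specific to translate, which is consistent with the name being dropped from the output.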
Following are some changes that can be made to improve this translator.
The above points are a few of the improvements I can suggest.
But there is no end to improvement; things keep improving as time progresses. Hence the
improvement of this NMT is not limited to the above points.