机器翻译

2018/11/12 NLP

机器翻译种类

传统机器翻译

手动制定规则。

统计机器翻译SMT(statistical machine translation)

从大规模的预料中学习规则。

神经机器翻译NMT(neural machine translation)

CNN、RNN、LSTM、Encoder-Decoder Architecture

Encoder-Decoder的缺陷

源语言和目标语言都用固定大小的词典，如果出现未出现的词，那么无法转换为向量表示或者无法对罕见词预测。需要处理未登录词并且实现open-vocabulary式的NMT。
Encoder得到的固定维度的向量无法包含源语句的所有信息。而且目标语言端的词往往只和源语言端的部分词有关，需要引入attention机制。

神经机器翻译

encoder-decoder架构：seq2seq

convolutional seq2seq

RNN encoder-decoder + Attention

Transformer

You May Not Need Attention

你可能不再需要Attention：这是一个贼简单的神经机器翻译架构

You May Not Need Attention详解

YouMayNotNeedAttention github

评价指标

BLEU(bilingual evaluation understudy)

评估机器翻译和专业人工翻译之间的对应关系。

论文trick

Sequence to Sequence Learning with Neural Networks

Sequence to Sequence Learning with Neural Networks

encoder和decoder的LSTM是两个不同的模型。
deep LSTM表现比shallow好，选用了4层的LSTM。
实践中发现将输入句子reverse后再进行训练效果更好。So for example, instead of mapping the sentence a,b,c to the sentence α,β,γ, the LSTM is asked to map c,b,a to α,β,γ, where α, β, γ is the translation of a, b, c. This way, a is in close proximity to α, b is fairly close to β, and so on, a fact that makes it easy for SGD to “establish communication” between the input and the output.

参考

机器学习（二十三）——Beam Search, NLP机器翻译常用评价度量, 模型驱动 vs 数据驱动

机器翻译质量评测算法-BLEU

神经网络机器翻译Neural Machine Translation(1): Encoder-Decoder Architecture

神经网络机器翻译Neural Machine Translation(3): Achieving Open Vocabulary Neural MT

神经网络机器翻译Neural Machine Translation(4): Modeling Coverage & MRT (提升attention的对其质量，针对漏译与过译)

深度学习方法（九）：自然语言处理中的Attention Model注意力模型（没看）

Seq2seq模型的一个变种网络：Pointer Network的简单介绍（没看）

Pointer-network（没看）

Pointer Networks（没看）

机器阅读理解（没看）

论文阅读：《Neural Machine Translation by Jointly Learning to Align and Translate》（没看）

代码实践：

Pointer-network理论及tensorflow实战（很好）

Seq2Seq-PyTorch github

Pointer Networks in TensorFlow (with sample code)（没看）

2015 Tensorflow implementation of Semi-supervised Sequence Learning github

Search

Table of Contents