Thursday, January 17, 2019

Does Google BERT make Transfer Learning for NMT a reality?

Google Research BERT on GitHub: BERT


BERT is a method of pre-training language representations, meaning that we train a general-purpose "language understanding" model on a large text corpus (like Wikipedia), and then use that model for downstream NLP tasks that we care about (like question answering). BERT outperforms previous methods because it is the first unsupervised, deeply bidirectional system for pre-training NLP.

This is incredible!
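As a minimal sketch of that pre-train-then-fine-tune workflow (using the Hugging Face transformers library as a stand-in for the TensorFlow code in the repo, so the library choice and model name here are my assumption), a downstream classifier can start from a released checkpoint like this:

```python
# Minimal sketch: reuse a pre-trained BERT checkpoint for a downstream task.
# Assumes the Hugging Face `transformers` and `torch` packages, not the repo's
# original TensorFlow scripts.
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased",
                                                      num_labels=2)

# Fine-tuning would now train only on the small task-specific dataset
# (e.g. question answering or sentence classification), not on the original
# Wikipedia corpus used for pre-training.
inputs = tokenizer("BERT makes transfer learning for NLP practical.",
                   return_tensors="pt")
logits = model(**inputs).logits
print(logits.shape)  # torch.Size([1, 2]) -- one score per class
```
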
Pre-training is fairly expensive (four days on 4 to 16 Cloud TPUs), but is a one-time procedure for each language (current models are English-only, but multilingual models will be released in the near future). We are releasing a number of pre-trained models from the paper which were pre-trained at Google. Most NLP researchers will never need to pre-train their own model from scratch.

So you could rent 4 to 16 Cloud TPUs on Google Cloud and build a Japanese model in four days? Even more impressive!

Or so I thought, but it turns out a Multilingual Model already exists, and it includes Japanese!

Japanese!

The Multilingual Model is here: Multilingual Model
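As a quick check that Japanese really is covered (again a sketch assuming the Hugging Face transformers library rather than the repo's own tokenization.py), the multilingual WordPiece tokenizer handles a Japanese sentence directly:

```python
from transformers import BertTokenizer

# The multilingual checkpoint ships one shared WordPiece vocabulary that
# covers 100+ languages, Japanese included.
tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")

tokens = tokenizer.tokenize("機械翻訳は面白いです。")
print(tokens)  # kanji come out as single-character pieces; kana may be merged
print(tokenizer.convert_tokens_to_ids(tokens))  # ready to feed into the model
```
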

That was a pleasantly surprising find, so I'll pick this up again in the afternoon; I plan to spend the morning on OpenAI Gym.

Update:

Dissecting BERT's internal workings


Conclusion:

BERT uses contextual information to map a word to different points in the embedding space depending on its context, whereas Word2Vec maps each word to a single point regardless of context or differences in meaning, e.g. "right" and "left" as directions versus the political right and left.
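A rough way to see this difference (a sketch assuming the Hugging Face transformers and PyTorch libraries, with example sentences of my own) is to compare BERT's vectors for "right" in a directional versus a political context; a static Word2Vec embedding would return the same vector in both:

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def vector_for(sentence, word):
    """Last-layer hidden state for the first occurrence of `word`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]           # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    return hidden[tokens.index(word)]

directional = vector_for("turn right at the next corner", "right")
political = vector_for("the party moved further to the right politically", "right")
same_sense = vector_for("take a right turn at the light", "right")

cos = torch.nn.functional.cosine_similarity
# The two directional uses should sit closer together than the political use,
# whereas Word2Vec would assign "right" one vector in all three sentences.
print(cos(directional, same_sense, dim=0).item())
print(cos(directional, political, dim=0).item())
```
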




My GitHub repo

In case anyone is interested: https://github.com/nyck33