【Venue】Annual meeting of the Society for Computation in Linguistics

Unsupervised Learning of Cross-Lingual Symbol Embeddings Without Parallel Data

【Abstract】We present a new method for unsupervised learning of multilingual symbol (e.g. character) embeddings, without any parallel data or prior knowledge about correspondences between languages. It is able to exploit similarities across languages between the distributions over symbols' contexts of use within their language, even in the absence of any symbols in common to the two languages. In experiments with an artificially corrupted text corpus, we show that the method can retrieve character correspondences obscured by noise. We then present encouraging results of applying the method to real linguistic data, including for low-resourced languages. The learned representations open the possibility of fully unsupervised comparative studies of text or speech corpora in low-resourced languages with no prior knowledge regarding their symbol sets.

【Authors】Mark Granroth-Wilding; Hannu Toivonen

【Affiliations】University of Helsinki; University of Helsinki

【Year (Volume), Issue】2019

【Pages】19-28

【Total pages】10

【Language】English


