【Conference】Annual meeting of the Society for Computation in Linguistics

Unsupervised Learning of Cross-Lingual Symbol Embeddings Without Parallel Data



【Abstract】We present a new method for unsupervised learning of multilingual symbol (e.g. character) embeddings, without any parallel data or prior knowledge about correspondences between languages. It is able to exploit similarities across languages between the distributions over symbols' contexts of use within their language, even in the absence of any symbols in common to the two languages. In experiments with an artificially corrupted text corpus, we show that the method can retrieve character correspondences obscured by noise. We then present encouraging results of applying the method to real linguistic data, including for low-resourced languages. The learned representations open the possibility of fully unsupervised comparative studies of text or speech corpora in low-resourced languages with no prior knowledge regarding their symbol sets.

【Authors】Mark Granroth-Wilding; Hannu Toivonen

【Affiliations】University of Helsinki; University of Helsinki

【Year (volume), issue】2019

【Pages】19-28

【Total pages】10

【Language】English (eng)


