Natural language processing (NLP) is flourishing these days, and there has arguably never been a better time to get to know it.
NLP's rapid growth owes a lot to the concept of transfer learning via pretrained models. In NLP, transfer learning essentially means training a model on one dataset and then adapting that model to perform NLP functions on a different dataset.
This breakthrough has made NLP applications remarkably easy to build, especially for people who don't have the time or resources to construct NLP models from scratch. It is also perfect for newcomers who want to learn NLP or transition into it from another field.
Why use pretrained models?
The model's authors have already designed a baseline model, so we can use that pretrained model on our own NLP dataset instead of building a model from scratch to solve a similar problem.
Some fine-tuning is still required, but it saves us an enormous amount of time and computational resources.
This article showcases the top pretrained models to help you get started on your NLP journey, along with the latest research in the field. For an article on the top pretrained models in computer vision, see:
https://www.analyticsvidhya.com/blog/2018/07/top-10-pretrained-models-get-started-deep-learning-part-1-computer-vision/?utm_source=blog&utm_medium=top-pretrained-models-nlp-article
±¾Îĺ¸ÇµÄNLPԤѵÁ·Ä£ÐÍ
I have grouped the pretrained models into three categories, based on their application:
Multi-purpose NLP models
ULMFiT
Transformer
Google's BERT
Transformer-XL
OpenAI's GPT-2
Word embedding NLP models
ELMo
Flair
Other pretrained models
StanfordNLP
Multi-purpose NLP models
Multi-purpose models have long been the talk of the NLP world. These models power many of the exciting NLP applications we hear about: machine translation, question answering systems, chatbots, sentiment analysis, and so on. At the core of these multi-purpose NLP models is the idea of language modeling.
Simply put, the goal of a language model is to predict the next word or character in a sequence. This will become clear as we go through each model.
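To make that idea concrete, here is a toy next-word predictor (my own illustrative sketch, not taken from any of the models below) that simply counts which word tends to follow which in a tiny corpus:

```python
from collections import Counter, defaultdict

# Toy corpus; a real language model is trained on millions of sentences.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count bigram frequencies: how often does `nxt` follow `w`?
bigrams = defaultdict(Counter)
for w, nxt in zip(corpus, corpus[1:]):
    bigrams[w][nxt] += 1

def predict_next(word):
    """Return the word most often seen right after `word` in the corpus."""
    return bigrams[word].most_common(1)[0][0]

print(predict_next("the"))  # -> "cat"
```

The neural language models below do the same next-word (or next-character) prediction, only with learned vector representations instead of raw counts.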
If you're an NLP enthusiast, you're going to love this section. Let's dive into five state-of-the-art multi-purpose NLP model frameworks. For each one I've provided links to the research paper and to the pretrained models, so go explore!
ULMFiT
ULMFiT was proposed and designed by Jeremy Howard of fast.ai (the deep learning site) and Sebastian Ruder of DeepMind (an AI company). It's fair to say that ULMFiT kicked off the transfer learning craze in NLP.
As described here, ULMFiT achieves impressive results using novel NLP techniques. The method fine-tunes a language model pretrained on the WikiText-103 dataset (a long-range-dependency language modeling dataset built from Wikipedia) to a new dataset in such a way that it does not forget what it previously learned.
ULMFiT outperforms many state-of-the-art text classification models. What I love about ULMFiT is that it needs very little data to produce impressive results, which makes it easier for us to understand and implement on our own machines.
You may not have known that ULMFiT actually stands for Universal Language Model Fine-Tuning. The word "universal" is quite apt here: the framework can be applied to almost any NLP task.
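As a taste of how little code the fine-tuning workflow takes, here is a minimal sketch assuming fastai v1's text API and a hypothetical texts.csv with a label column and a text column. API details changed between fastai releases, so treat this as an outline rather than the canonical recipe:

```python
from fastai.text import *  # fastai v1 text API (TextLMDataBunch, AWD_LSTM, ...)

path = "data/"  # hypothetical folder containing texts.csv

# 1) Fine-tune the WikiText-103-pretrained AWD-LSTM language model on our own corpus
data_lm = TextLMDataBunch.from_csv(path, "texts.csv")
lm = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
lm.fit_one_cycle(1, 1e-2)
lm.save_encoder("ft_enc")  # keep the fine-tuned encoder

# 2) Reuse that encoder in a text classifier (this is the transfer-learning step)
data_clas = TextClasDataBunch.from_csv(path, "texts.csv", vocab=data_lm.vocab)
clf = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
clf.load_encoder("ft_enc")
clf.fit_one_cycle(1, 1e-2)
```

The tutorial linked below walks through this workflow in full, including gradual unfreezing of the layers.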
To learn more about ULMFiT, see the following articles and papers:
Tutorial: Text classification (NLP) using ULMFiT and the fastai library in Python
https://www.analyticsvidhya.com/blog/2018/11/tutorial-text-classification-ulmfit-fastai-library/?utm_source=blog&utm_medium=top-pretrained-models-nlp-article
Pretrained models for ULMFiT (Papers with Code)
https://www.paperswithcode.com/paper/universal-language-model-fine-tuning-for-text
ÆäËûÑо¿ÂÛÎÄ
https://arxiv.org/abs/1801.06146
Transformer
The Transformer architecture is at the core of the most recent major developments in NLP. It was introduced by Google in 2017. Back then, recurrent neural networks (RNNs) were used for language processing tasks such as machine translation and question answering.
The Transformer architecture outperforms both RNNs and CNNs (convolutional neural networks), and training the models also requires fewer computational resources, a win-win for everyone working with NLP. Take a look at the comparison below:
¸÷Ä£Ð͵ÄÓ¢µÂ·ÒëÖÊÁ¿
According to Google, the Transformer "applies a self-attention mechanism which directly models relationships between all words in a sentence, regardless of their respective position". It does this using a fixed-size context (that is, the preceding words). Sounds complicated? Let's take an example to simplify it.
Consider the sentence "She found the shells on the bank of the river." The model needs to understand that "bank" here refers to the shore of the river and not a financial institution. The Transformer understands this in a single step. I encourage you to read the full paper linked below to get a sense of how it works; it will surely blow you away.
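To make the "relationships between all words in one step" idea concrete, here is a stripped-down sketch of scaled dot-product self-attention in plain NumPy. It deliberately omits the learned query/key/value projections, multiple heads, and positional encodings of the real architecture:

```python
import numpy as np

def self_attention(X):
    """X: (num_words, d) matrix of word vectors.
    Every word attends to every other word in a single step,
    so 'bank' can be weighted toward 'river' directly."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)                          # pairwise word-word similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)          # softmax over each row
    return weights @ X                                     # each output mixes all words

X = np.random.rand(10, 16)      # e.g. the 10 words of the example sentence, toy 16-d vectors
print(self_attention(X).shape)  # (10, 16)
```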
Last year, Google released an improved version of the Transformer called the Universal Transformer. There is an even newer and more intuitive version, Transformer-XL, which we will cover later in this article.
To learn and read more about the Transformer, visit:
Google's official blog post
https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html
TransformerԤѵÁ·Ä£ÐÍÂÛÎÄ¡¶Attention Is All You Need¡·
https://www.paperswithcode.com/paper/attention-is-all-you-need
ÆäËûÑо¿ÂÛÎÄ
https://arxiv.org/abs/1706.03762
BERT (Google)
Google's release and open-sourcing of the BERT framework made waves across the industry, with some even asking whether it marks the start of a "new era of NLP". One thing is certain at least: BERT is a very useful framework that generalizes well to a wide variety of NLP tasks.
BERT stands for Bidirectional Encoder Representations from Transformers. The model considers the context on both sides of a word (left and right) at the same time, whereas all previous models looked at only one side of a word's context (left or right) at a time. This bidirectionality helps the model understand the context of a word much better. BERT is also designed for multi-task learning, meaning it can perform different NLP tasks simultaneously.
BERT is the first unsupervised, deeply bidirectional system for pretraining NLP models, trained using only a plain text corpus.
At release time, Google claimed that BERT achieved state-of-the-art results on 11 natural language processing (NLP) tasks, which is quite a feat! You can use BERT to train your own NLP model (a question answering system, for example) in just a few hours on a single GPU.
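As an illustration of what "context from both sides" buys you, the sketch below pulls contextual vectors out of a pretrained BERT. It assumes the third-party Hugging Face transformers package (not Google's original TensorFlow release and not mentioned in this article), and output formats differ slightly across library versions:

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

# "bank" gets a vector informed by both "shells" (left) and "river" (right).
inputs = tokenizer("She found the shells on the bank of the river.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (1, num_subword_tokens, 768)
```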
For more resources on BERT, see:
Google's official blog post
https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html
BERTԤѵÁ·Ä£ÐÍÂÛÎÄ
https://www.paperswithcode.com/paper/bert-pre-training-of-deep-bidirectional#code
ÆäËûÑо¿ÂÛÎÄ
https://arxiv.org/pdf/1810.04805.pdf
Transformer-XL (Google)
In the long run, this release from Google could prove very important for NLP. The concept can feel a bit tricky if you're a beginner, so I encourage you to read through it a few times to grasp it. I've also provided several resources at the end of this section to help you get started with Transformer-XL.
Imagine this: you're halfway through a book when a word or phrase that was mentioned at the start of the book comes up, and you can still recall what it referred to. Machines, understandably, struggle to build models of such long-term dependency.
As mentioned above, one way to achieve this is to use Transformers, but they operate with a fixed-length context. In other words, there isn't much flexibility in that approach.
Transformer-XL bridges this gap nicely. Developed by the Google AI team, it is a novel NLP architecture that helps machines understand context beyond the fixed-length limitation. Transformer-XL is up to 1,800 times faster than a vanilla Transformer at inference.
You can see the difference in the two GIFs Google released, available in their blog post linked below.
As you might have predicted by now, Transformer-XL has achieved state-of-the-art results on various language modeling benchmarks/datasets; a small table on their page illustrates this.
The Transformer-XL GitHub repository, linked previously and mentioned below, contains the code in both PyTorch and TensorFlow.
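If you prefer to experiment without cloning the repository, the later Hugging Face transformers package also ships a re-implementation with the WikiText-103 checkpoint. The sketch below is an assumption-laden example based on that third-party package, not on Google's original code:

```python
import torch
from transformers import TransfoXLTokenizer, TransfoXLLMHeadModel

tokenizer = TransfoXLTokenizer.from_pretrained("transfo-xl-wt103")
model = TransfoXLLMHeadModel.from_pretrained("transfo-xl-wt103")
model.eval()

prompt = "The storm described in the opening chapter returns when"
input_ids = torch.tensor([tokenizer.encode(prompt)])

# During generation the model carries a segment-level memory forward,
# which is what lets it condition on context beyond a fixed-length window.
output = model.generate(input_ids, max_length=input_ids.shape[1] + 20)
print(tokenizer.decode(output[0]))
```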
Resources to learn and read more about Transformer-XL:
Google's official blog post
https://ai.googleblog.com/2019/01/transformer-xl-unleashing-potential-of.html
Pretrained models for Transformer-XL (Papers with Code)
https://www.paperswithcode.com/paper/transformer-xl-attentive-language-models
Research paper
https://arxiv.org/abs/1901.02860
GPT-2 (OpenAI)
This is quite a controversial model; some would call GPT-2's release a marketing stunt by OpenAI. I can understand where they're coming from, but I believe it's worth at least trying out the code OpenAI has released.
First, some context for those who don't know what I'm talking about. OpenAI published a blog post in February in which they claimed to have designed an NLP model, called GPT-2, that was so good they couldn't release the full version for fear of malicious use. That, of course, got the community's attention.
GPT-2 was trained to predict the next word in 40GB of internet text. The framework is also a transformer-based model, trained on a dataset of 8 million web pages. The results published on their site are simply stunning: given a few input sentences, the model can write out an entire coherent story. Take a look at the example in their blog post.
Incredible, right?
The developers have released a much smaller version of GPT-2 for researchers and engineers to test. The original model has 1.5 billion parameters; the open-source sample model has 117 million.
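The official code and weights live in the openai/gpt-2 repository linked below. For a quick feel of the model, here is a sketch that instead uses the third-party Hugging Face transformers package (an assumption on my part; it exposes the small public checkpoint under the name "gpt2"):

```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")   # the released 117M-parameter model
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "In a shocking finding, scientists discovered a herd of unicorns"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Sample a continuation: the model keeps predicting the next word, one step at a time.
output = model.generate(input_ids, max_length=60, do_sample=True, top_k=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```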
Resources to learn and read more about GPT-2:
OpenAI's official blog post
https://openai.com/blog/better-language-models/
Pretrained models for GPT-2 (GitHub)
https://github.com/openai/gpt-2
Research paper
https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf
´ÊǶÈ루word embedding£©Ä£ÐÍ
Most of the machine learning and deep learning algorithms we use cannot work directly with strings and plain text. These techniques require us to convert text data into numbers before they can perform any task (such as regression or classification).
So, simply put, word embeddings are blocks of text converted into numbers so that NLP tasks can be performed on them. A word embedding format generally tries to map a word to a vector using a dictionary.
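In its simplest form that "dictionary" is literally a mapping from words to vectors. The toy sketch below (made-up three-dimensional vectors, purely illustrative) shows the lookup step every embedding-based pipeline performs before a model ever sees the data:

```python
import numpy as np

# Toy embedding dictionary. Real embeddings (word2vec, GloVe, ...) are learned,
# typically 100-300 dimensional, and cover vocabularies of hundreds of thousands of words.
embeddings = {
    "king":  np.array([0.80, 0.65, 0.10]),
    "queen": np.array([0.78, 0.68, 0.90]),
    "river": np.array([0.10, 0.95, 0.40]),
}

def embed(tokens):
    """Turn a list of words into the numeric matrix a classifier or regressor can consume."""
    return np.stack([embeddings[t] for t in tokens])

print(embed(["king", "queen"]))
```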
You can dive deeper into word embeddings, their different types, and how to use them on a dataset in the article below. If you're not familiar with the concept, I consider this guide a must-read:
An intuitive understanding of word embeddings: from count vectors to Word2Vec
https://www.analyticsvidhya.com/blog/2019/03/pretrained-models-get-started-nlp/
In this section, we'll look at two state-of-the-art word embeddings for NLP. I've also provided tutorial links so you can get hands-on experience with each topic.
ELMo
No, this ELMo isn't the character from Sesame Street. But this ELMo (short for Embeddings from Language Models) is very useful in the context of building NLP models.
ELMo is a novel way of representing words as vectors and embeddings. These ELMo word embeddings help us achieve state-of-the-art results on multiple NLP tasks, as the benchmark results in the resources below show.
Let's take a moment to understand how ELMo works. Recall the bidirectional language models we discussed earlier. As the article linked below puts it, ELMo word vectors are computed on top of a two-layer bidirectional language model (biLM). This biLM has two layers stacked together, and each layer makes two passes over the input, a forward pass and a backward pass.
ELMo word representations consider the full input sentence when computing each word embedding. As a result, the word "read" gets different ELMo vectors in different contexts. This is quite different from older word embeddings, where "read" would be assigned the same vector no matter what context it appeared in.
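A minimal sketch of that behaviour, assuming an AllenNLP 0.x installation (the library behind the GitHub tutorial linked below); the first call downloads the default pretrained biLM weights:

```python
from allennlp.commands.elmo import ElmoEmbedder

elmo = ElmoEmbedder()  # downloads the default pretrained biLM on first use

# The same word "read" gets a different vector in each sentence,
# because the whole sentence is fed through the biLM.
vecs_past    = elmo.embed_sentence(["She", "read", "the", "letter", "yesterday"])
vecs_present = elmo.embed_sentence(["I", "love", "to", "read", "books"])

print(vecs_past.shape, vecs_present.shape)  # (3 layers, num_tokens, 1024) each
```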
Resources to learn and read more about ELMo:
Step-by-step NLP guide to learn ELMo for extracting features from text
https://www.analyticsvidhya.com/blog/2019/03/learn-to-use-elmo-to-extract-features-from-text/?utm_source=blog&utm_medium=top-pretrained-models-nlp-article
GitHub repository for the pretrained models
https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md
Research paper
https://arxiv.org/pdf/1802.05365.pdf
Flair
Flair is not exactly a word embedding, but a combination of embeddings. You could call Flair an NLP library that combines embeddings such as GloVe, BERT, and ELMo. The good folks at Zalando Research developed and open-sourced Flair.
The team has released several pretrained models for the following NLP tasks:
Named entity recognition (NER)
Part-of-speech tagging (PoS)
Text classification
Training custom models
Not convinced yet? The comparison table in the Flair repository shows the benchmark numbers that will get you there.
"Flair embeddings" are the signature embeddings packaged within the Flair library. They are powered by contextual string embeddings. To understand the core components that power Flair, read this article:
https://www.analyticsvidhya.com/blog/2019/02/flair-nlp-library-python/?utm_source=blog&utm_medium=top-pretrained-models-nlp-article
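To see the "combination of embeddings" idea in practice, here is a small sketch (assuming a standard Flair installation; the pretrained embeddings are downloaded on first use) that stacks GloVe word embeddings with forward and backward Flair contextual string embeddings:

```python
from flair.data import Sentence
from flair.embeddings import WordEmbeddings, FlairEmbeddings, StackedEmbeddings

# Combine classic GloVe vectors with contextual Flair string embeddings
stacked = StackedEmbeddings([
    WordEmbeddings("glove"),
    FlairEmbeddings("news-forward"),
    FlairEmbeddings("news-backward"),
])

sentence = Sentence("She found the shells on the bank of the river .")
stacked.embed(sentence)

for token in sentence:
    print(token.text, token.embedding.shape)  # one concatenated vector per word
```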
What I especially like about Flair is that it supports multiple languages; so many NLP releases are stuck doing English only. If NLP is to gain traction globally, we need to expand beyond that.
Resources to learn and read more about Flair:
Introduction to Flair for NLP: a simple yet powerful state-of-the-art NLP library
https://www.analyticsvidhya.com/blog/2019/02/flair-nlp-library-python/?utm_source=blog&utm_medium=top-pretrained-models-nlp-article
Pretrained models for Flair (GitHub)
https://github.com/zalandoresearch/flair
Other pretrained models
StanfordNLP (Stanford)
Speaking of expanding NLP beyond English, here is a library that is already doing it: StanfordNLP. The authors claim it supports more than 53 languages, which certainly got our attention.
Our team was among the first to work with the library and publish results on a real-world dataset. We played around with it and found that StanfordNLP really does open up many possibilities for applying NLP techniques to non-English languages such as Hindi, Chinese, and Japanese.
StanfordNLP is a collection of pretrained state-of-the-art NLP models. These models aren't just lab-tested; the authors used them in the CoNLL 2017 and 2018 competitions. All the pretrained NLP models packaged in StanfordNLP are built on PyTorch and can be trained and evaluated on your own annotated data.
The two main reasons we think you should consider StanfordNLP are:
A full neural network pipeline for performing text analytics, including:
Tokenization
Multi-word token (MWT) expansion
Lemmatization
Part-of-speech (POS) and morphological feature tagging
Dependency parsing
A stable, officially maintained Python interface to the Stanford CoreNLP software.
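A minimal sketch of the pipeline described above, based on the stanfordnlp package's documented usage (model files are downloaded once per language; swap "en" for codes such as "hi" or "zh" to try other languages):

```python
import stanfordnlp

stanfordnlp.download("en")             # one-time download of the English models
nlp = stanfordnlp.Pipeline(lang="en")  # tokenize, MWT, lemma, POS, dependency parse

doc = nlp("Barack Obama was born in Hawaii.")
for sent in doc.sentences:
    for word in sent.words:
        # surface form, lemma, universal POS tag, and dependency relation to its head
        print(word.text, word.lemma, word.upos, word.dependency_relation)
```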
Resources to learn and read more about StanfordNLP:
Introduction to StanfordNLP: an incredible state-of-the-art NLP library for 53 languages (with Python code)
https://www.analyticsvidhya.com/blog/2019/02/stanfordnlp-nlp-library-python/?utm_source=blog&utm_medium=top-pretrained-models-nlp-article
Pretrained models for StanfordNLP (GitHub)
https://github.com/stanfordnlp/stanfordnlp
End notes
This is by no means an exhaustive list of pretrained NLP models; many more are available at https://paperswithcode.com.
Here are some useful resources for learning NLP:
Natural Language Processing using Python (course): https://courses.analyticsvidhya.com/courses/natural-language-processing-nlp?utm_source=blog&utm_medium=top-pretrained-models-nlp-article
Certified Program: NLP for Beginners: https://courses.analyticsvidhya.com/bundles/nlp-combo?utm_source=blog&utm_medium=top-pretrained-models-nlp-article
Collection of articles on Natural Language Processing (NLP): https://www.analyticsvidhya.com/blog/category/nlp/?utm_source=blog&utm_medium=top-pretrained-models-nlp-article
I'd love to hear your thoughts on this list. Have you used any of these pretrained models before? Or have you explored others? Let me know in the comments section below; I'll be happy to look into them and add them to this list.
Original title: A roundup of 8 excellent pretrained models that make NLP applications easy
Source: WeChat official account Big Data Digest (WeChat ID: BigDataDigest). Please credit the source when republishing.