Sentence embedding

Sentence embedding is the collective name for a set of techniques in natural language processing (NLP) where sentences are mapped to vectors of real numbers.[1][2][3][4][5][6][7][8]

Application

Sentence embedding is used by the deep learning software libraries PyTorch[9] and TensorFlow.[10]

Popular embeddings are based on the hidden layer outputs of transformer models like BERT, see SBERT. An alternative direction is to aggregate word embeddings, such those returned by Word2vec, into sentence embeddings. The most straightforward approach is to simply compute the average of word vectors, known as continuous bag-of-words (CBOW).[11] However, more elaborate solutions based on word vector quantization have also been proposed. One such approach is the vector of locally aggregated word embeddings (VLAWE),[12] which demonstrated performance improvements in downstream text classification tasks.

Evaluation

A way of testing sentence encodings is to apply them on Sentences Involving Compositional Knowledge (SICK) corpus[13] for both entailment (SICK-E) and relatedness (SICK-R).

In [14] the best results are obtained using a BiLSTM network trained on the Stanford Natural Language Inference (SNLI) Corpus. The Pearson correlation coefficient for SICK-R is 0.885 and the result for SICK-E is 86.3. A slight improvement over previous scores is presented in:[15] SICK-R: 0.888 and SICK-E: 87.8 using a concatenation of bidirectional Gated recurrent unit.

External links

References

Paper Summary: Evaluation of sentence embeddings in downstream and linguistic probing tasks
Barkan, Oren; Razin, Noam; Malkiel, Itzik; Katz, Ori; Caciularu, Avi; Koenigstein, Noam (2019). "Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding". arXiv:1908.05161 [cs.LG].
The Current Best of Universal Word Embeddings and Sentence Embeddings
Cer, Daniel; Yang, Yinfei; Kong, Sheng-yi; Hua, Nan; Limtiaco, Nicole; John, Rhomni St.; Constant, Noah; Guajardo-Cespedes, Mario; Yuan, Steve; Tar, Chris; Sung, Yun-Hsuan; Strope, Brian; Kurzweil, Ray (2018). "Universal Sentence Encoder". arXiv:1803.11175 [cs.CL].
Wu, Ledell; Fisch, Adam; Chopra, Sumit; Adams, Keith; Bordes, Antoine; Weston, Jason (2017). "StarSpace: Embed All the Things!". arXiv:1709.03856 [cs.CL].
Sanjeev Arora, Yingyu Liang, and Tengyu Ma. "A simple but tough-to-beat baseline for sentence embeddings.", 2016; openreview:SyK00v5xx.
Trifan, Mircea; Ionescu, Bogdan; Gadea, Cristian; Ionescu, Dan (2015). "A graph digital signal processing method for semantic analysis". 2015 IEEE 10th Jubilee International Symposium on Applied Computational Intelligence and Informatics. pp. 187–192. doi:10.1109/SACI.2015.7208196. ISBN 978-1-4799-9911-8. S2CID 17099431.
Basile, Pierpaolo; Caputo, Annalina; Semeraro, Giovanni (2012). "A Study on Compositional Semantics of Words in Distributional Spaces". 2012 IEEE Sixth International Conference on Semantic Computing. pp. 154–161. doi:10.1109/ICSC.2012.55. ISBN 978-1-4673-4433-3. S2CID 552921.
Microsoft. "distilled-sentence-embedding". GitHub.
Google. "universal-sentence-encoder". TensorFlow Hub. Retrieved 6 October 2018. {{cite web}}: |last= has generic name (help)
Mikolov, Tomas; Chen, Kai; Corrado, Greg; Dean, Jeffrey (2013-09-06). "Efficient Estimation of Word Representations in Vector Space". arXiv:1301.3781 [cs.CL].
Ionescu, Radu Tudor; Butnaru, Andrei (2019). "Vector of Locally-Aggregated Word Embeddings (VLAWE): A Novel Document-level Representation". Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Computational Linguistics: 363–369. doi:10.18653/v1/N19-1033. S2CID 85500146.
Marco Marelli, Stefano Menini, Marco Baroni, Luisa Bentivogli, Raffaella Bernardi, and Roberto Zamparelli. "A SICK cure for the evaluation of compositional distributional semantic models." In LREC, pp. 216-223. 2014 .
Conneau, Alexis; Kiela, Douwe; Schwenk, Holger; Barrault, Loic; Bordes, Antoine (2017). "Supervised Learning of Universal Sentence Representations from Natural Language Inference Data". arXiv:1705.02364 [cs.CL].
Subramanian, Sandeep; Trischler, Adam; Bengio, Yoshua; Christopher J Pal (2018). "Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning". arXiv:1804.00079 [cs.CL].

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.

[1] Paper Summary: Evaluation of sentence embeddings in downstream and linguistic probing tasks

[2] Barkan, Oren; Razin, Noam; Malkiel, Itzik; Katz, Ori; Caciularu, Avi; Koenigstein, Noam (2019). "Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding". arXiv:1908.05161 [cs.LG].

[3] The Current Best of Universal Word Embeddings and Sentence Embeddings

[4] Cer, Daniel; Yang, Yinfei; Kong, Sheng-yi; Hua, Nan; Limtiaco, Nicole; John, Rhomni St.; Constant, Noah; Guajardo-Cespedes, Mario; Yuan, Steve; Tar, Chris; Sung, Yun-Hsuan; Strope, Brian; Kurzweil, Ray (2018). "Universal Sentence Encoder". arXiv:1803.11175 [cs.CL].

[5] Wu, Ledell; Fisch, Adam; Chopra, Sumit; Adams, Keith; Bordes, Antoine; Weston, Jason (2017). "StarSpace: Embed All the Things!". arXiv:1709.03856 [cs.CL].

[6] Sanjeev Arora, Yingyu Liang, and Tengyu Ma. "A simple but tough-to-beat baseline for sentence embeddings.", 2016; openreview:SyK00v5xx.

[7] Trifan, Mircea; Ionescu, Bogdan; Gadea, Cristian; Ionescu, Dan (2015). "A graph digital signal processing method for semantic analysis". 2015 IEEE 10th Jubilee International Symposium on Applied Computational Intelligence and Informatics. pp. 187–192. doi:10.1109/SACI.2015.7208196. ISBN 978-1-4799-9911-8. S2CID 17099431.

[8] Basile, Pierpaolo; Caputo, Annalina; Semeraro, Giovanni (2012). "A Study on Compositional Semantics of Words in Distributional Spaces". 2012 IEEE Sixth International Conference on Semantic Computing. pp. 154–161. doi:10.1109/ICSC.2012.55. ISBN 978-1-4673-4433-3. S2CID 552921.

[9] Microsoft. "distilled-sentence-embedding". GitHub.

[10] Google. "universal-sentence-encoder". TensorFlow Hub. Retrieved 6 October 2018. {{cite web}}: |last= has generic name (help)

[11] Mikolov, Tomas; Chen, Kai; Corrado, Greg; Dean, Jeffrey (2013-09-06). "Efficient Estimation of Word Representations in Vector Space". arXiv:1301.3781 [cs.CL].

[12] Ionescu, Radu Tudor; Butnaru, Andrei (2019). "Vector of Locally-Aggregated Word Embeddings (VLAWE): A Novel Document-level Representation". Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Computational Linguistics: 363–369. doi:10.18653/v1/N19-1033. S2CID 85500146.

[13] Marco Marelli, Stefano Menini, Marco Baroni, Luisa Bentivogli, Raffaella Bernardi, and Roberto Zamparelli. "A SICK cure for the evaluation of compositional distributional semantic models." In LREC, pp. 216-223. 2014 .

[14] Conneau, Alexis; Kiela, Douwe; Schwenk, Holger; Barrault, Loic; Bordes, Antoine (2017). "Supervised Learning of Universal Sentence Representations from Natural Language Inference Data". arXiv:1705.02364 [cs.CL].

[15] Subramanian, Sandeep; Trischler, Adam; Bengio, Yoshua; Christopher J Pal (2018). "Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning". arXiv:1804.00079 [cs.CL].

Sentence embedding

Application

Evaluation

See also

External links

References