[1] L. Yang, J. Hu, M. Qiu, C. Qu, J. Gao, W. B. Croft, X. Liu, Y. Shen, and J. Liu, "A hybrid retrieval-generation neural conversation model," arXiv preprint arXiv:1904.09068, 2019. [DOI: 10.1145/3357384.3357881]
[2] N. M. Rezk, M. Purnaprajna, T. Nordström, and Z. Ul-Abdin, "Recurrent neural networks: an embedded computing perspective," IEEE Access, vol. 8, pp. 57967-57996, 2020. [DOI: 10.1109/ACCESS.2020.2982416]
[3] B. C. Mateus, M. Mendes, J. T. Farinha, R. Assis, and A. M. Cardoso, "Comparing LSTM and GRU models to predict the condition of a pulp paper press," Energies, vol. 14, issue 21, 2021. [DOI: 10.3390/en14216958]
[4] U. Naseem, I. Razzak, S. Khalid-Khan, and M. Prasad, "A comprehensive survey on word representation models: from classical to state-of-the-art word representation language models," arXiv preprint arXiv:2010.15036, 2020. [DOI: 10.1145/3434237]
[5] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, "Attention is all you need," arXiv preprint arXiv:1706.03762, 2017.
[6] J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," arXiv preprint arXiv:1810.04805, 2018.
[7] M. Farahani, M. Gharachorloo, M. Farahani, and M. Manthouri, "ParsBERT: Transformer-based model for Persian language understanding," Neural Processing Letters, vol. 53, issue 6, pp. 3831-3847, Dec 2021. [DOI: 10.1007/s11063-021-10528-4]
[8] M. Schuster and K. K. Paliwal, "Bidirectional recurrent neural networks," IEEE Transactions on Signal Processing, vol. 45, issue 11, pp. 2673-2681, 1997. [DOI: 10.1109/78.650093]
[9] V. Sanh, L. Debut, J. Chaumond, and T. Wolf, "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter," arXiv preprint arXiv:1910.01108, 2019.
[10] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, "RoBERTa: A robustly optimized BERT pretraining approach," arXiv preprint arXiv:1907.11692, 2019.
[11] T. Wolf, V. Sanh, J. Chaumond, and C. Delangue, "TransferTransfo: a transfer learning approach for neural network based conversational agents," arXiv preprint arXiv:1901.08149, 2019.
[12] A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, "Improving language understanding by generative pre-training," 2018.
[13] S. Roller, E. Dinan, N. Goyal, D. Ju, et al., "Recipes for building an open-domain Chatbot," Proc. of the 16th Conference of the European Chapter of the Association for Computational Linguistics, 2021, pp. 300-325. [DOI: 10.18653/v1/2021.eacl-main.24]
[14] S. Bao, H. He, F. Wang, H. Wu, et al., "PLATO-2: towards building an open-domain Chatbot via curriculum learning," Findings of the Association for Computational Linguistics, 2021, pp. 2513-2525. [DOI: 10.18653/v1/2021.findings-acl.222]
[15] Q. Xie, Q. Zhang, D. Tan, T. Zhu, S. Xiao, B. Li, L. Sun, P. Yi, and J. Wang, "Chatbot application on cryptocurrency," IEEE Conference on Computational Intelligence for Financial Engineering & Economics (CIFEr), 2019. [DOI: 10.1109/CIFEr.2019.8759121]
[16] S. Nagargoje, V. Mamdyal, and R. Tapase, "Chatbot for depressed people," United International Journal for Research & Technology (UIJRT), vol. 2, issue 7, pp. 208-211, 2021.
[17] A. Xu, Z. Liu, Y. Guo, V. Sinha, and R. Akkiraju, "A new Chatbot for customer service on social media," Proc. of the 2017 CHI Conference on Human Factors in Computing Systems, 2017, pp. 3506-3510. [DOI: 10.1145/3025453.3025496]
[18] J. Zhang, T. He, S. Sra, and A. Jadbabaie, "Why gradient clipping accelerates training: a theoretical justification for adaptivity," International Conference on Learning Representations, 2022.