[1]
Sang, H. and Hai, G. 2019. A Framework: Region-Frame-Attention-Compact Bilinear Pooling Layer Based S2VT For Video Description. European Journal of Applied Sciences. 7, 4 (Sep. 2019), 17–30. DOI:https://doi.org/10.14738/aivp.74.6717.