File:A demonstration of the pretraining tasks, including visual grounding, grounded captioning, image-text matching, image captioning, VQA, object detection, image infilling as well as text infilling.png

Original file (1,626 × 696 pixels, file size: 656 KB, MIME type: image/png)

Summary

Description	English: A demonstration of the pretraining tasks, including visual grounding, grounded captioning, image-text matching, image captioning, VQA, object detection, image infilling as well as text infilling.
Date	16 April 2023 (2023-04-16)
File source	OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Author	Peng Wang, An Yang, Rui Men, Junyang Lin, Shuai Bai, Zhikang Li, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang

Licensing

Permission is granted to copy, distribute and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 4.0 License (CC BY-SA 4.0).

File history

	Date/Time	Thumbnail	Dimensions	User	Comment
current	09:09, 19 April 2023		1,626 × 696 (656 KB)	MEHARBHATIA	Uploaded a work by Peng Wang, An Yang, Rui Men, Junyang Lin, Shuai Bai, Zhikang Li, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang from OFA: Unifying Architectures, Tasks and Modalities through a simple sequence to sequence learning framework with UploadWizard

File usage

The following 2 pages use this file: