
Developing Visual Language Models for Artificial Intelligence Systems

Researchers from Sun Yat-sen University and Huawei Noah's Ark Lab, the international AI research arm of Huawei Technologies, compiled a database of 100 million images, covering ordinary scenes such as soccer matches and vaccination check-ups, each paired with a text description.

Developing Visual-Linguistic Model Architectures

Sun Yat-sen University and Huawei Noah's Ark Lab have created a significant dataset designed to train vision-language models in Chinese. However, this 100-million-image dataset does not appear in the usual search results.

The dataset features common scenes such as soccer games and vaccine screenings, each accompanied by text descriptions in both Chinese and English. However, finding a direct link or access instructions can be challenging.

Here are some strategies to locate this specialized dataset:

  1. Explore official publications and repositories: Researchers often release datasets alongside their research papers or on their institutional or lab webpages. These sources may provide download links, access protocols, or contact information for dataset requests.
  2. Scour academic conferences and AI model/data platforms: Large datasets are sometimes announced and published at conferences like CVPR, ICCV, ACL, or CV-related workshops, and the data is hosted on platforms such as GitHub, Zenodo, or AI-dedicated data hubs.
  3. Reach out to the authors or lab directly: If the dataset is not publicly indexed, contacting the corresponding authors or the lab's data management team can yield access options, including data use agreements if required.

While the search results do not list or link to the dataset directly, they do point to similar large public image or multimodal datasets collated for computer vision research. However, none of these explicitly mention the Chinese dataset by Sun Yat-sen University and Huawei Noah's Ark Lab.

If you require the dataset urgently, consider:

  • Visiting the websites of Sun Yat-sen University’s computer vision or AI labs and Huawei Noah’s Ark Lab for possible announcements or data release pages.
  • Looking up recent papers on Chinese vision-language pretraining by these groups, where dataset links or supplementary material might be provided.
  • Monitoring prominent AI dataset repositories or AI model databases such as Papers With Code or Epoch AI for new entries related to Chinese multimodal data.
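For the paper-tracking step above, the search can be scripted. The sketch below builds a query URL for the public arXiv API; the keyword list is illustrative (the article names no specific paper title), and using arXiv as the monitored source is an assumption, not something the article specifies:

```python
from urllib.parse import quote_plus

def arxiv_search_url(keywords, max_results=20):
    """Build a query URL for the public arXiv API (export.arxiv.org).

    Each keyword becomes a quoted phrase matched against all fields,
    and results are sorted newest-first. The returned URL can be
    fetched with any HTTP client and yields an Atom feed of papers.
    """
    query = " AND ".join(f'all:"{kw}"' for kw in keywords)
    return (
        "http://export.arxiv.org/api/query"
        f"?search_query={quote_plus(query)}"
        f"&start=0&max_results={max_results}"
        "&sortBy=submittedDate&sortOrder=descending"
    )

# Example: track new papers on Chinese vision-language pretraining.
# The keywords here are hypothetical placeholders; refine them to the
# groups or dataset name you are actually following.
url = arxiv_search_url(["vision-language", "Chinese", "pretraining"])
print(url)
```

Polling such a URL on a schedule (or subscribing to the equivalent RSS feed) is one lightweight way to notice when a dataset paper and its supplementary links appear.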

Without direct links or official repositories in the public domain, access may require institutional affiliation or a data request process. (The image accompanying this article is sourced from Flickr user Mathias Apitz (München).)

  1. To access the Chinese vision-language training dataset created by Sun Yat-sen University and Huawei Noah's Ark Lab, start with official publications, academic conference proceedings, and dataset repositories associated with the university or the lab.
  2. If the dataset is not readily available or indexed in search results, contacting the corresponding authors or the data management teams at Sun Yat-sen University or Huawei Noah's Ark Lab can yield access options, including data use agreements where required.
