Visual Question Answering

1. Project Introduction

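Visual Question Answering (VQA) is a learning task that combines computer vision and natural language processing. This project uses the BLIP vision-language pre-trained model, fine-tuned on the VQA task: upload an image, enter a question, and the model generates an answer. Give it a try!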

2. Project Structure

# Print the directory tree of a folder
import os


def dfs_showdir(path, depth):
    if depth == 0:
        print("root:[" + path + "]")

    for item in os.listdir(path):
        # Skip hidden entries (names starting with '.')
        if item.startswith('.'):
            continue
        print("|      " * depth + "+--" + item)
        newitem = os.path.join(path, item)
        if os.path.isdir(newitem):
            dfs_showdir(newitem, depth + 1)


if __name__ == '__main__':
    path = os.getcwd()    # folder to list
    dfs_showdir(path, 0)  # print the tree structure

root:[/home/jovyan/work]
+--handler.py
+--app_spec.yml
+--img
|      +--demo.jpg
+--models
|      +--__init__.py
|      +--blip.py
|      +--blip_itm.py
|      +--blip_nlvr.py
|      +--blip_pretrain.py
|      +--blip_retrieval.py
|      +--blip_vqa.py
|      +--med.py
|      +--nlvr_encoder.py
|      +--vit.py
|      +--__pycache__
|      |      +--__init__.cpython-37.pyc
|      |      +--vit.cpython-37.pyc
|      |      +--blip_vqa.cpython-37.pyc
|      |      +--blip.cpython-37.pyc
|      |      +--med.cpython-37.pyc
+--configs
|      +--bert_config.json
|      +--caption_coco.yaml
|      +--med_config.json
|      +--nlvr.yaml
|      +--nocaps.yaml
|      +--pretrain.yaml
|      +--retrieval_coco.yaml
|      +--retrieval_flickr.yaml
|      +--retrieval_msrvtt.yaml
|      +--vqa.yaml
+--ckpt
|      +--model_base_vqa_capfilt_large.pth
+--demo.ipynb
+--requirement.txt
+--__pycache__
|      +--handler.cpython-37.pyc
|      +--app.cpython-37.pyc
+--env.ipynb.invalid
+--env.ipynb
+--project_requirements.txt
+--Introduce.ipynb
+--app.py
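
The listing above includes `__pycache__` directories and their compiled `.pyc` files, which are byproducts of running the code rather than part of the project. As a minimal sketch (not part of the original notebook), the variant below returns the tree as a list of lines and skips hidden entries and `__pycache__`. The function name `tree_lines` and the `skip_dirs` parameter are illustrative assumptions.

```python
import os


def tree_lines(path, depth=0, skip_dirs=("__pycache__",)):
    """Return the directory tree under `path` as a list of text lines."""
    lines = []
    if depth == 0:
        lines.append("root:[" + path + "]")
    # Sort entries so the listing is stable across runs and platforms.
    for item in sorted(os.listdir(path)):
        # Skip hidden entries and any name listed in skip_dirs.
        if item.startswith(".") or item in skip_dirs:
            continue
        lines.append("|      " * depth + "+--" + item)
        child = os.path.join(path, item)
        if os.path.isdir(child):
            lines.extend(tree_lines(child, depth + 1, skip_dirs))
    return lines
```

Calling `print("\n".join(tree_lines(os.getcwd())))` prints the listing in the same format as the cell above. Returning the lines instead of printing them makes the output easy to save or compare.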

3. Project Demo

# Install dependencies
!/home/jovyan/.virtualenvs/basenv/bin/pip install -r requirement.txt -i https://pypi.doubanio.com/simple/

# Import the required modules
from app import *

2022-08-08 17:13:03.608206: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2022-08-08 17:13:03.608246: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

load checkpoint from ./ckpt/model_base_vqa_capfilt_large.pth

print("Input img:")
Image.open('./img/demo.jpg').resize((256, 256))

Input img:

print("Output Answer:")
handle({'Photo': './img/demo.jpg', 'Question': 'What is in this image?'})

Output Answer:
Runned time: 2.17 s

'woman and dog'
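
The call above suggests that `handle` takes a dictionary with a `Photo` path and a `Question` string and returns the answer as a string. Assuming that interface holds, a small helper can ask several questions about one image. This is a sketch only: `ask_many` is a hypothetical name, and the handler is passed in as an argument so the helper does not depend on how `app.py` is implemented.

```python
def ask_many(handler, photo_path, questions):
    """Ask each question about one image and collect the answers.

    `handler` is assumed to behave like `handle` in this notebook: it takes
    {'Photo': <image path>, 'Question': <text>} and returns the answer.
    """
    answers = {}
    for question in questions:
        # Build the same request shape used in the demo cell.
        request = {"Photo": photo_path, "Question": question}
        answers[question] = handler(request)
    return answers
```

In the notebook this could be called as `ask_many(handle, './img/demo.jpg', ['What is in this image?', 'How many dogs are there?'])`. Each question triggers a separate call to the handler, so the total run time grows with the number of questions.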
