MOTIVATION

We aim to build a smart system that can automatically answer visual questions from blind people. The images and questions you will see are from visually impaired people, and the answers are from crowd workers.


TASK

In this task, we ask you to carefully review the question, image, and the answer provided, and then finish step 1, step 2, step 3 (if activated), and step 4. There are examples to follow for each step.


Step 1: Indicate if there is more than one question asked.

Show step 1's examples

    Question: Could you tell me the expiration date on this milk and is this lactose free?
    Answer: Feb 20 2021. Yes.

    ...

    Yes

    More than one question is asked.

    Question: Can you answer this question? Is this lactose free?
    Answer: Yes.

    ...

    No

    Though there are two question marks, only one question needs to be answered.

    Question: Is this a chair what color?
    Answer: Yes. Black.

    ...

    Yes

    More than one question is asked.


Step 2: Indicate if the answer is referring to more than one region/object in the image.

Show step 2's examples

    Question: What is this?
    Answer: Rice Vinegar.

    ...

    No

    It just refers to one object.

    Question: What is this?
    Answer: Mushroom.

    ...

    No

    It refers to the mushroom as a whole.

    Question: What is this?
    Answer: Rice Vinegar, garlic powder, and pepper.

    ...

    Yes

    The answer refers to multiple objects.

    Question: How many mushrooms are there?
    Answer: Around 60

    ...

    Yes

    Usually, the answer to a counting question is referring to more than one object unless the counting answer is one.

    Question: How many bottles are there?
    Answer: 5

    ...

    Yes

    The answer is referring to 5 objects instead of one.


If the answers to both Step 1 and Step 2 are "No", go to Step 3. Otherwise, Step 3 will not be activated; skip it and go to Step 4.
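The rule above can be sketched as a small function. This is only an illustration of the step logic, not the code behind the task page; the function and parameter names are hypothetical.

```python
def next_step(more_than_one_question: bool, more_than_one_region: bool) -> int:
    """Return the step to do after Step 2 (hypothetical helper).

    Step 3 is activated only when Step 1 and Step 2 are both "No".
    Otherwise Step 3 is skipped and the worker goes straight to Step 4.
    """
    if not more_than_one_question and not more_than_one_region:
        return 3
    return 4


if __name__ == "__main__":
    # One question, answer refers to one object: draw (or "cannot draw") in Step 3.
    print(next_step(False, False))
    # A counting answer such as "5" refers to several objects: skip to Step 4.
    print(next_step(False, True))
```

For instance, the milk example in Step 1 (two questions asked) gives "Yes" for Step 1, so Step 3 is skipped regardless of the Step 2 answer.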


Step 3: You have two options for step 3:


Option (a): If the answer is not in the image, select the "cannot draw" option and indicate the reason why you cannot draw it.

Option (b): If the answer exists in the image, draw ONE closed polygon to segment the region/object that most prominently justifies the answer.

If you selected option (b), follow the instructions below to draw the polygon:

• How to draw: click the image to draw points one by one to form a polygon. No drag operation is needed.

• How to finish drawing: move your cursor to the first point (the polygon will turn purple when your cursor is on the first point you drew), and click the first point to finish. Alternatively, press the keyboard shortcut 'Enter' to finish.

• How to undo a point: use the keyboard shortcut 'Ctrl+Z' or click the Undo button.

• After finishing, the cursor will be disabled. If you would like to make a change, either click the Clear button or Undo button or use the keyboard shortcut 'Ctrl+Z' to enable the cursor.
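The drawing behavior described in the bullets above can be pictured with the following minimal sketch. It is not the actual annotation tool: the class name, the method names, the three-point minimum, and the choice that Undo on a finished polygon only re-enables drawing are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class PolygonDraft:
    """Hypothetical model of one polygon being drawn by clicks."""
    points: List[Point] = field(default_factory=list)
    closed: bool = False

    def click(self, p: Point) -> None:
        # After finishing, the cursor is disabled, so clicks are ignored.
        if not self.closed:
            self.points.append(p)

    def finish(self) -> bool:
        # Clicking the first point or pressing Enter closes the polygon.
        # Assumption: at least three points are required to close it.
        if len(self.points) >= 3:
            self.closed = True
        return self.closed

    def undo(self) -> None:
        # Assumption: on a finished polygon, Undo only re-enables drawing;
        # otherwise it removes the most recent point.
        if self.closed:
            self.closed = False
        elif self.points:
            self.points.pop()

    def clear(self) -> None:
        # The Clear button removes everything and re-enables drawing.
        self.points.clear()
        self.closed = False
```

A typical sequence would be three or more calls to `click`, then `finish`; to change a finished polygon, call `undo` or `clear` first.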

Show step 3 option (b)'s examples

    Please view the 5 tabs to see the 5 different kinds of examples.


    If the object (e.g., tyre, donut, ring) has a hole, you only need to draw the outside boundary.

    Question: What is this?
    Answer: laundry detergent

    ...

    Question: What is this?
    Answer: laundry detergent

    ...

    Only one closed polygon is allowed.


    When the answer is related to text, please first identify whether the object is mentioned in the question.

    If not, see the example on the left. Usually, such questions look like “what is this?”. In this case, the text is used to describe the object, and you should draw the outline of the object.

    If so, see the examples in the middle and on the right. You should draw an outline around the related text.

    Question: What is this?
    Answer: CeraVe daily moisturizing lotion

    ...

    The question is asking what the object is. Thus you need to draw the outline of the whole object. Note that the answer to this question is the same as the answer for the middle image, while the annotated areas are different.

    Question: What type of the lotion is this?
    Answer: CeraVe daily moisturizing lotion

    ...

    The question has mentioned what the object is. Thus you only need to draw an outline around the related text area.

    Question: What's the brand of the lotion?
    Answer: CeraVe

    ...

    The question has mentioned what the object is. Thus you only need to draw an outline around the related text area.


    If the answer is referring to the whole image, draw a rectangle to ground the whole picture as the target region. Usually, this happens when the camera is set too close to the object.

    Question: What is this?
    Answer: CeraVe daily moisturizing lotion

    ...

    Draw a rectangle that includes the whole image.



    Please try your best to trace the boundary of the object as tightly as possible. Only when the boundary is too complex may you trace it approximately instead of perfectly.


    Question: What is this?
    Answer: CeraVe daily moisturizing lotion for normal to dry skin.

    ...

    Question: What is this?
    Answer: CeraVe daily moisturizing lotion for normal to dry skin.

    ...

    Please trace the boundary as tightly as you can.

    Question: What plant is this?
    Answer: Dracaena sanderiana ...

    Question: What plant is this?
    Answer: Dracaena sanderiana

    ...

    Question: What plant is this?
    Answer: Dracaena sanderiana

    ...

    Please trace the boundary as tightly as possible.


    If something obscures the target object, please do not include it. Label only the visible part of the target object.

    Question: What is this?
    Answer: CeraVe daily moisturizing lotion for normal to dry skin

    ...

    Avoid occlusion if you can.

    Question: What is this?
    Answer: CeraVe daily moisturizing lotion for normal to dry skin

    ...

    In this case, it's acceptable to include the chopsticks. Otherwise, you cannot label this image with just one closed polygon.

    Question: What is this?
    Answer: CeraVe daily moisturizing lotion for normal to dry skin

    ...

    You should not include the hand in the annotation for this image.


Step 4: Click next and go to the next image.



NOTE

  • You will annotate five image-question pairs in one HIT.

  • You cannot go to the next page until you finish the current one.

  • Please do not refresh the webpage once you have started working, as you will lose all your progress and have to start from the beginning.

  • It is possible that some images could be meaningless, inappropriate, or offensive. We cannot control what pictures are taken. Kindly use your best judgement for this task.

You can see this information anytime by clicking the "Hide / Show Details" button above.

Please read the following question and answer about the image to the left and finish the 3 steps below.

Step 1: Is more than one question asked?

Step 2: Is the answer referring to multiple regions/objects?

Step 3: Draw one closed polygon to localize the region that the answer is referring to by clicking on the image.

OR