Chongyan Chen

Hi, I'm a first-year PhD student at the University of Texas at Austin. My research interest is visual question answering (VQA); I am currently studying visual grounding and external knowledge for VQA. My supervisor is Dr. Danna Gurari. Here is the link to our group: link

I like drawing and reading philosophy/psychology books during my leisure time.


  • PhD, School of Information, University of Texas at Austin, 2020-2025
  • MSc, School of Information, University of Texas at Austin, 2018-2020, GPA: 3.96/4.0
  • B.Eng., Electronic Engineering, South China University of Technology, 2014-2018, GPA: 3.5/4.0


"Evaluation of Mental Stress and Heart Rate Variability Derived from Wrist-Based Photoplethysmography" Chongyan Chen, Chunhung Li, Chih-Wei Tsai, and Xinghua Deng. IEEE Eurasia Conference on Biomedical Engineering, Healthcare and Sustainability 5/31/2019. Paper | Poster | Award
"Activity Recognition with Wristband Based on Histogram and Bayesian Classifiers" Yi-Cong Huang, Wing-Kuen Ling, Chi-Wa Cheng, Chun-Hung Li, Chong-Yan Chen. IEEE 5th International Conference on Signal and Image Processing (ICSIP), 7/19/2019. Paper


  • Kilgarlin Fellowship, 2020-2024
  • William and Margaret Kilgarlin Endowed Scholarship ($54,750), 2020-2021
  • Master Thesis - Dean’s Choice Award finalist, 5/8/2020
  • Best Conference Paper Award, IEEE ECBIOS, 6/2019
  • First Prize Award: 311 Calls and 500 Cities Hackathon, University of Texas at Austin, 10/2018
    Skills

      Programming Languages
    • Python
    • Java
    • Kotlin
    • C
    • C++
      Artificial Intelligence
    • Deep Learning
    • Machine Learning
    • scikit-learn
    • Keras
    • PyTorch
    • TensorFlow
    • Transfer Learning
    • Explainable AI
    • Attention mechanisms
    • Transformer
    • CNN
    • GAN, WGAN
      Web + Mobile development
    • React Native
    • Android development (Java, Kotlin)
    • HTML
    • CSS
    • Bootstrap
    • JavaScript
    • jQuery
    • Ajax
    • PHP
    • Flask+Jinja
    • mod_wsgi+Apache
      Backend + Systems
    • Linux
    • Azure
    • Google Cloud
    • AWS
    • Docker
      Data related
    • SQL
    • NoSQL (MongoDB)
    • Qlik
      Knowledge Graph
    • GCN
    • Node2vec
      Other Tools
    • Amazon Mechanical Turk (AMTurk)
    • Git
    • MATLAB
    • Unit Test/Jenkins
    • LaTeX
    • Unity 3D
    • Sketch
    • InDesign
    • Photoshop
      Human Languages
    • Mandarin
    • English
    • Cantonese

    Work Experience

    Algorithm Engineer (Intern), HUAWEI, Shenzhen, China, 06/2019 – 08/2019

    • Developed the Petal Search application in the HUAWEI Consumer Business Group.
    • Conducted URL pattern extraction using regular expressions and page-update detection using md5 digests. Designed three strategies for dead-link recognition, achieving a 17% accuracy improvement. Improved web content extraction, obtaining an 11% accuracy improvement and faster speed by utilizing hashing and dynamic programming.
    • Reconstructed part of the XML Path Language (XPath) implementation.
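The md5-based update detection and regex URL-pattern extraction above can be sketched roughly as follows (a minimal illustration; the function names and the `{num}` wildcard convention are my own, not the actual production implementation):

```python
import hashlib
import re

def content_fingerprint(html: str) -> str:
    """Digest a fetched page body; a changed md5 signals a page update."""
    return hashlib.md5(html.encode("utf-8")).hexdigest()

def page_updated(old_digest: str, new_html: str) -> bool:
    """Compare the stored digest of a page against a fresh fetch."""
    return content_fingerprint(new_html) != old_digest

def url_pattern(url: str) -> str:
    """Generalize numeric path segments so many URLs group into one pattern."""
    return re.sub(r"\d+", "{num}", url)
```

Storing one short digest per URL keeps the update check cheap: only pages whose digest changed need to be re-extracted.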

    Algorithm Engineer (Intern), Add Care Ltd, Shenzhen, China, 11/2017 – 3/2019

  • Trained a CNN to detect utensils in videos, used Haar-like features and AdaBoost to detect human faces, and tracked them using a Kernelized Correlation Filter. Identified eating gestures by collision checking between the paths of utensils and human faces.
  • Designed a stress-induction experiment. Collected and filtered ECG and wrist-based PPG signals and detected signal quality. Designed peak-finding algorithms for PPG and ECG.
  • Calculated heart rate variability to classify stress states. The overall leave-one-participant-out accuracy of wrist-based PPG with a 3-minute temporal window reached 80%.
  • Add Care official website | Glutrac, one of the best health tech products at CES 2020
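As an illustration of the peak-finding and HRV steps, here is a toy sketch (not the actual company algorithm; the threshold, refractory window, and the RMSSD feature choice are my own assumptions):

```python
def find_peaks(signal, min_height, refractory=20):
    """Simple local-maximum peak finder with a refractory window (in samples),
    so one heartbeat is not counted twice."""
    peaks, last = [], -refractory
    for i in range(1, len(signal) - 1):
        if (signal[i] > min_height and signal[i] >= signal[i - 1]
                and signal[i] > signal[i + 1] and i - last >= refractory):
            peaks.append(i)
            last = i
    return peaks

def rmssd(rr_intervals_ms):
    """RMSSD, a common time-domain HRV feature: root mean square of
    successive differences between RR (beat-to-beat) intervals."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return (sum(d * d for d in diffs) / len(diffs)) ** 0.5
```

Peak indices from the PPG waveform give beat-to-beat intervals, and time-domain features such as RMSSD can then feed a stress classifier.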

    Projects and Research

    VizWiz Research Project

  • VQA app: Developed a demo to capture a photo and a spoken question, applying speech-to-text (DeepSpeech) and image-quality detection algorithms. (3/2019-6/2019)
  • Question Answerability: Extracted image features using OpenCV and the Azure API and text features using NLTK to predict the answerability of a visual question.
  • Master Thesis: Answered visual questions with external knowledge (knowledge base, reverse image search, and image search by text). The results show that including external knowledge can largely improve VQA accuracy and suggest the possibility of answering questions labeled as unanswerable by crowd workers. (6/2019-6/2020)
  • VQA tutorials: Wrote a GitHub page to summarize recent advances in VQA. Wrote articles in Zhihu (Chinese version of Quora) for VQA tutorials.
  • VQA algorithms: Applied state-of-the-art algorithms for VQA, e.g., MCAN + grid features and Pythia 3. Currently studying fusion methods and attention mechanisms for VQA algorithms. (6/2020-present)
  • VQA Crowdsourcing: Building VizWiz-Visual Grounding dataset with Amazon Mechanical Turk. (9/2020-present)
  • VizWiz Website | Our Image & Video Computing Group
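To illustrate the attention-based fusion mentioned in the VQA algorithms bullet, here is a toy, framework-free sketch of a single dot-product attention step over image-region vectors (all names are illustrative; real VQA models such as MCAN use learned, multi-head attention):

```python
import math

def attention_fuse(question_vec, region_vecs):
    """Weight image regions by scaled dot-product similarity to the
    question vector, then return the attention-weighted sum of regions."""
    d = len(question_vec)
    scores = [sum(q * r for q, r in zip(question_vec, rv)) / math.sqrt(d)
              for rv in region_vecs]
    # Numerically stable softmax over the region scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(region_vecs[0])
    return [sum(w * rv[k] for w, rv in zip(weights, region_vecs))
            for k in range(dim)]
```

Regions more similar to the question receive higher weight, so the fused vector emphasizes the image content relevant to the question.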

    Course Project for Natural Language Generation:

  • Chest disease classification and visual grounding: Applied a hard-attention model and a stand-alone self-attention model to extract features from chest X-ray radiology images. Used multi-task learning and contrastive learning to teach the model to learn from radiomics features, predict pneumonia, and ground the disease areas.
  • Radiology report generation: Reproduced the paper "When Radiology Report Generation Meets Knowledge Graph": used DenseNet-121 to extract image features, built a knowledge graph via GCN, and generated reports via a multi-level LSTM.
    Course Proj for AI in Health:

  • Built COVID-19 Knowledge Graph: Used BioBERT and PubTator for biomedical named entity recognition on the PubMed dataset and the COVID-19 44K dataset. Built coronavirus-related knowledge graphs using Gephi. Integrated the coronavirus knowledge graph with the KG from the Data2Discovery company.
  • Built Tutorials: Built a BioBERT tutorial and a SQL tutorial for the PubMed and MIMIC-III datasets, plus tutorials on knowledge-graph mining algorithms.
  • Paper 1 | Paper 2

    Course Proj for Advanced Programming Tools: ByteMe

  • Developed ByteMe application for both Web (frontend: HTML + JS + Ajax; backend: Python + Flask) and Mobile platforms (React Native and Kotlin).
  • Built, deployed, and managed the application using Google App Engine; wrote a Python database API to handle MongoDB; and developed navigation, camera, and user-login functions with Google Firebase.
  • Implemented “NewByte” page with the AutoFill function using Food 101 classification model based on Google Inception V3 model and Azure API.
  • Website | App made with Kotlin | App made with React Native

    Research Proj: Evaluation of Mental Stress and Heart Rate Variability Derived from Wrist-Based Photoplethysmography

  • Designed a stress-induction experiment. Collected and filtered ECG and wrist-based PPG signals and detected signal quality. Designed peak-finding algorithms for PPG and ECG.
  • Calculated heart rate variability to classify stress states. The overall leave-one-participant-out accuracy of wrist-based PPG with a 3-minute temporal window reached 80%.
  • Paper | Poster | Award

    Course Proj for HCI: Understanding Health-related Information Searching Behavior Through Eye Tracking

    Collected eye-tracking data (AOIs, TTFF, etc.) using a Tobii TX300 eye-tracker and iMotions. Analyzed the data using the Kruskal-Wallis test, one-way ANOVA, and the Mann-Whitney U test.

    Paper | Poster
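For small samples, the Mann-Whitney U statistic used in the analysis can be computed directly by pairwise comparison (a minimal sketch without tie correction or p-value computation; in practice a statistics library would be used):

```python
def mann_whitney_u(x, y):
    """U statistic via pairwise comparisons; ties contribute 0.5.
    Returns the smaller of the two U values, as looked up in
    significance tables for the two-sided test."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return min(u, len(x) * len(y) - u)
```

A U of 0 means the two groups' values do not overlap at all, the strongest possible separation for the given sample sizes.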

    Course Proj for Activities Recognition: Activities Recognition in Self-Driving Car

    Collected five activities from ten people to address the take-over problem. Reduced individual differences, built a pose estimator to detect people's skeletons, extracted secondary features to help classify similar activities, and ensembled the models with an LSTM. (Paper)


    Course Proj for Visual Environment

  • Summary: We used Unity 3D to build a virtual presentation demo.
  • Why: Our design can help people with presentation anxiety and improve their presentation skills. It also provides a solution for remote meetings.
  • Details: We designed different human-human interactions/attitudes for the virtual audience. For a positive attitude, some virtual audience members imitate the speaker's actions when the speaker is doing an experiment, and some always pay attention by turning their bodies toward the speaker. For a passive attitude, the audience simply ignores the speaker.
    Besides, we designed different human-object interactions: interacting with slides, popping out details of a display item when the user gets close to it, etc.
    Research Proj: 2017 Mathematical Contest in Modeling - "Cooperate and navigate"

  • Summary: We analyzed the effects of allowing self-driving, cooperating cars on roads in several states in the U.S., suggested the best percentage of self-driving cars, and proposed policy changes such as setting an exclusive lane.
  • Why: Self-driving, cooperating cars have been proposed as a solution to increase highway capacity without increasing the number of lanes or roads. How these cars interact with the existing traffic flow and with each other is not yet well understood.
  • Details: We built a Phantom Traffic Jam Model to simulate traffic jams on highways with few intersections and accidents, and created a Smart Driver Model with versions for human drivers and smart cars.
    We predicted traffic conditions with varied road density and smart-car proportions.
    We built a Global Decision Model to control smart-car proportions and provide optimal route plans for both human drivers and smart cars. Paper
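A single step of a toy ring-road car-following model, in the spirit of the traffic simulation described above (the 5 m safety gap, 30 m/s speed cap, and relaxation rule are my own illustrative choices, not the contest model's parameters):

```python
def step(positions, speeds, road_len, dt=0.1, sensitivity=1.0):
    """Advance a toy car-following model on a ring road by one time step:
    each car relaxes its speed toward a headway-dependent target speed."""
    n = len(positions)
    v_max = 30.0  # speed cap in m/s (illustrative)
    new_speeds = []
    for i in range(n):
        # Gap to the car ahead, wrapping around the ring road.
        gap = (positions[(i + 1) % n] - positions[i]) % road_len
        target = min(v_max, max(0.0, gap - 5.0))  # keep ~5 m safety gap
        v = speeds[i] + sensitivity * (target - speeds[i]) * dt
        new_speeds.append(max(0.0, v))
    new_positions = [(p + v * dt) % road_len
                     for p, v in zip(positions, new_speeds)]
    return new_positions, new_speeds
```

Iterating this step from a slightly perturbed uniform state is one common way such models exhibit phantom traffic jams: stop-and-go waves that emerge without any accident or bottleneck.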
    Contact Me