Xueyang Wu
I am an undergraduate at Shanghai Jiao Tong University, looking forward to continuing my studies. I am applying for a PhD in Computer Science.
Email: xueyangwu@yeah.net
Resume: Click Here to Get the PDF Version
Education
Shanghai Jiao Tong University (2013-Present)
IEEE Honor Class Student, Major in CS
GPA: 87.05 (13/72)
Skills
Programming Languages: C++, Python, Java
Neural Network Tools: Keras, Theano, Torch
Awards
- 2015 Academic Excellence Scholarship (Top 10%)
- 2015 MSRA 2015 Penta-Hackathon: Best UI Design Award (Rank 2/31)
- 2014 CUMCM Second Prize in Shanghai (Top 15%)
- 2014 Merit Student of SJTU (Top 10%)
- 2014 Academic Excellence Scholarship
Publications
- Kai Sun, Su Zhu, Lu Chen, Siqiu Yao, Xueyang Wu, Kai Yu. Hybrid Dialogue State Tracking for Real World Human-to-Human Dialogues. Interspeech. 2016.
- Xueyang Wu, Su Zhu, Yue Wu, Kai Yu. Rich Punctuations Prediction using Large-scale Deep Learning. ISCSLP 2016.
Interests
- Natural Language Processing
- Artificial Intelligence
- Information Retrieval
- Data Mining
- Machine Learning
Experience
Research Intern
- Speech Lab, SJTU | June 2015-Present
Leader of the Department of Academy and Innovation
- Student Union of SEIEE of SJTU | June 2014-June 2015
Volunteer
- Various Activities | Sep 2013-Present
Research and Projects
Rich Punctuations Prediction using Large-scale Deep Learning
July 2015 - Present
- Punctuation prediction is necessary for downstream processing of speech transcripts. My work focuses not only on punctuation prediction but also on semantic segmentation.
- Sequence labeling methods are applied to predict punctuation, treating each punctuation mark (including “no punctuation”) as a label for each word in the text.
- I have tried a CRF, CRFs with various features, and an RNN-LSTM, improving performance from around 40% to 80%.
- Currently, I am working on applying multi-view and multi-task structures to this task.
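The labeling scheme described above can be sketched as follows. This is a minimal illustration, not the actual system: the label names and the `to_labeled_sequence` helper are hypothetical, and only three punctuation marks are covered.

```python
# Hypothetical label set, chosen only for illustration.
PUNCT_LABELS = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}
NO_PUNCT = "O"  # the "no punctuation" label


def to_labeled_sequence(tokens):
    """Turn punctuated tokens into (word, label) training pairs.

    Each word is labeled with the punctuation mark that follows it,
    or NO_PUNCT when the next token is another word.
    """
    pairs = []
    for token in tokens:
        if token in PUNCT_LABELS:
            if pairs:  # attach the mark to the preceding word
                word, _ = pairs[-1]
                pairs[-1] = (word, PUNCT_LABELS[token])
        else:
            pairs.append((token, NO_PUNCT))
    return pairs


pairs = to_labeled_sequence("how are you ? fine , thanks .".split())
```

A sequence model such as a CRF or an LSTM would then be trained to predict the second element of each pair from the unpunctuated words.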
Automobile Comments Analysis System
Mar 2016 - May 2016 | Machine Learning Course Project
- This system consists of three parts: topic categorization, key-phrase extraction and sentiment analysis.
- We crawled comments from automobile forums, some of which include labels, so we could use the labeled data for supervised learning.
- Different CNNs were applied to topic categorization and sentiment analysis, reaching accuracies of 88% and 92%, respectively.
- I also investigated word embeddings that distinguish contrasting meanings from similar ones, but they did not improve sentiment analysis performance.
Computer Go
Oct 2015 - Jan 2016 | Rank 2/13
- The search space of Go is extremely large and, unlike other board games, Go doesn’t have a well-defined evaluation function for heuristic search.
- We used Monte Carlo Tree Search with RAVE-AMAF to find the best next move. Additionally, a statistical method was applied to imitate expert play in the opening moves by building a joseki tree.
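For illustration, one common way RAVE blends the ordinary Monte Carlo estimate of a move with its AMAF (all-moves-as-first) estimate is sketched below. The function name, the parameter names, and the constant `k` are assumptions made for this sketch; the weighting schedule is a widely used one from the literature, not necessarily the one used in this project.

```python
import math


def rave_value(mc_wins, mc_visits, amaf_wins, amaf_visits, k=1000):
    """Blend the Monte Carlo and AMAF win rates of a move.

    beta starts near 1 (trust AMAF while real visits are scarce)
    and decays toward 0 as the move accumulates real simulations.
    k is a hypothetical tuning constant and must be positive here.
    """
    beta = math.sqrt(k / (3 * mc_visits + k))
    mc = mc_wins / mc_visits if mc_visits else 0.0
    amaf = amaf_wins / amaf_visits if amaf_visits else 0.0
    return (1 - beta) * mc + beta * amaf
```

With no real visits the value equals the AMAF win rate, which lets the search rank unexplored moves early; with many visits it converges to the plain Monte Carlo win rate.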
Book Search Engine with Image Recognition and Natural Language Understanding
Dec 2014 - Jan 2015 | Best Course Project
- The search engine was built with Lucene, based on information about 1.8 million books that we crawled from the web.
- We used the SIFT algorithm to detect and describe local features in images.
- Unlike other engines that can only search books by title, we allow fuzzy search by author, type, content and comment. In addition, I designed a parser-based system to understand users’ natural language input.
Chinese Word Segmentation Tool
Dec 2013 - Jan 2014
- I was in charge of building the framework and implementing segmentation algorithms. Our program was one of the fastest and most accurate tools among all course projects.
- I introduced an empirical model combining the forward and reverse maximum matching methods, with several specialized dictionaries for disambiguation.
- We used a trie and a hash table to increase segmentation speed.
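A minimal sketch of combining forward and reverse maximum matching is shown below. It is not the project's actual model: a Python `set` stands in for the trie and hash table, the tie-breaking rule (fewer words, then fewer single-character words, ties going to the reverse result) is a common heuristic assumed here, and the tiny vocabulary is made up for the example.

```python
def forward_max_match(text, vocab, max_len):
    """Greedily take the longest dictionary word from the left."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:  # fall back to one character
                words.append(piece)
                i += size
                break
    return words


def reverse_max_match(text, vocab, max_len):
    """Greedily take the longest dictionary word from the right."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                words.append(piece)
                j -= size
                break
    return words[::-1]


def bidirectional_match(text, vocab):
    """Run both directions and keep the result with the lower cost."""
    max_len = max(map(len, vocab), default=1)
    fwd = forward_max_match(text, vocab, max_len)
    rev = reverse_max_match(text, vocab, max_len)

    def cost(words):
        return (len(words), sum(len(w) == 1 for w in words))

    return fwd if cost(fwd) < cost(rev) else rev


vocab = {"研究", "研究生", "生命", "起源"}  # hypothetical dictionary
result = bidirectional_match("研究生命起源", vocab)
```

Here the forward pass greedily takes 研究生 and leaves a stray single character, while the reverse pass yields 研究 / 生命 / 起源, so the heuristic keeps the reverse result.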
Innovative and Entrepreneurial Project of SJTU: Intelligent Lecture Organizing Application
Jan 2015 - May 2016
- This project aims to automatically generate lecture notes and help users keep their notes organized.
- Lecture notes are outlined by terminology extraction, which I designed to detect domain-specific keywords in text using TF-IDF and syntactic features.
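As a rough illustration of the TF-IDF part of such keyword detection, the sketch below scores each term by its frequency within a document times the log of its inverse document frequency. The `tf_idf` function and the toy documents are hypothetical, and the syntactic features mentioned above are not modeled.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document."""
    n_docs = len(docs)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


docs = [
    ["gradient", "descent", "is", "used"],
    ["the", "lecture", "is", "long"],
]
scores = tf_idf(docs)
```

Terms that occur in every document, such as "is" here, score zero, while terms specific to one document score higher and become keyword candidates.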