Xueyang Wu

I am an undergraduate at Shanghai Jiao Tong University, looking forward to continuing my studies. I am applying for a PhD in Computer Science.
Email: xueyangwu@yeah.net
Resume: Click Here to Get the PDF Version

Education

Shanghai Jiao Tong University (2013-Present)

IEEE Honor Class Student, majoring in CS
GPA: 87.05 (13/72)

Skills

Programming Languages

C++, Python, Java

Neural Network Tools

Keras, Theano, Torch

Awards

2015 Academic Excellence Scholarship (Top 10%)
2015 MSRA Penta-Hackathon: Best UI Design Award (Rank 2/31)
2014 CUMCM Second Prize in Shanghai (Top 15%)
2014 Merit Student of SJTU (Top 10%)
2014 Academic Excellence Scholarship

Publications

Kai Sun, Su Zhu, Lu Chen, Siqiu Yao, Xueyang Wu, Kai Yu. Hybrid Dialogue State Tracking for Real World Human-to-Human Dialogues. Interspeech 2016.
Xueyang Wu, Su Zhu, Yue Wu, Kai Yu. Rich Punctuations Prediction using Large-scale Deep Learning. ISCSLP 2016.

Interests

Natural Language Processing
Artificial Intelligence
Information Retrieval
Data Mining
Machine Learning

Experience

Research Intern

Speech Lab, SJTU | June 2015-Present

Leader of the Department of Academy and Innovation

Student Union of SEIEE of SJTU | June 2014-June 2015

Volunteer

Various Activities | Sep 2013-Present

Research and Projects

Rich Punctuations Prediction using Large-scale Deep Learning

July 2015 - Present

Punctuation prediction is necessary for downstream processing of speech transcripts. My work focuses not only on punctuation prediction but also on semantic segmentation.
Sequence labeling methods are applied to predict punctuation, treating each punctuation mark (including "no punctuation") as a label for each word in the text.
I have tried CRF, CRF with various features, and RNN-LSTM, improving performance from around 40% to 80%.
Currently, I am working on applying multi-view and multi-task structures to this task.
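
As an illustration of the sequence labeling formulation described above (a minimal sketch, not the code used in this project), the snippet below turns a tokenized, punctuated sentence into (word, label) pairs. The label names `COMMA`, `PERIOD`, `QUESTION` and `O` (no punctuation) and the helper `to_labels` are hypothetical choices made for this example.

```python
# Hypothetical label set: each word is tagged with the mark that follows it.
PUNCT = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}


def to_labels(tokens):
    """Pair each word with the punctuation that follows it, or 'O' for none."""
    pairs = []
    for tok in tokens:
        if tok in PUNCT:
            if pairs:  # attach the mark to the preceding word
                word, _ = pairs[-1]
                pairs[-1] = (word, PUNCT[tok])
        else:
            pairs.append((tok, "O"))
    return pairs


tokens = "hello , how are you ? fine .".split()
pairs = to_labels(tokens)
for word, label in pairs:
    print(word, label)
```

A sequence model such as a CRF or an LSTM would then be trained to predict the label column from the word column alone.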

Automobile Comments Analysis System

Mar 2016 – May 2016 | Machine Learning Course Project

This system consists of three parts: topic categorization, key-phrase extraction and sentiment analysis.
We crawled comments from forums about automobiles, some of which include labels, so that we could use the labeled data for supervised learning.
Different CNNs were applied to topic categorization and sentiment analysis, with accuracies of 88% and 92%, respectively.
I also investigated word embeddings that distinguish contrasting meanings from similar meanings, but they did not improve the performance of sentiment analysis.

Computer Go

Oct 2015 - Jan 2016 | Rank 2/13

The search space of Go is extremely large and, unlike other board games, Go does not have a well-defined evaluation function for heuristic search.
We used Monte Carlo Tree Search with RAVE-AMAF to find the best next move. In addition, a statistical method was applied to imitate expert play in the opening moves by building a Joseki tree.
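
To illustrate the RAVE idea mentioned above (a sketch under stated assumptions, not the engine's actual code), the function below blends a move's ordinary Monte Carlo win rate with its all-moves-as-first (AMAF) win rate. The weighting schedule `beta = sqrt(k / (3 * n_mc + k))` is one commonly used choice, and the constant `k` and the function name `rave_value` are assumptions made for this example.

```python
import math


def rave_value(q_mc, n_mc, q_amaf, n_amaf, k=1000.0):
    """Blend the Monte Carlo mean with the AMAF mean for one move.

    q_mc, n_mc     -- mean result and visit count from real simulations
    q_amaf, n_amaf -- mean result and count from AMAF updates
    k              -- assumed equivalence parameter (hypothetical default)
    """
    if n_amaf == 0:
        return q_mc  # no AMAF statistics yet, fall back to the plain mean
    # beta is 1 when the move is unvisited and decays toward 0 with visits.
    beta = math.sqrt(k / (3.0 * n_mc + k))
    return (1.0 - beta) * q_mc + beta * q_amaf


print(rave_value(q_mc=0.0, n_mc=0, q_amaf=0.8, n_amaf=10))
print(rave_value(q_mc=0.6, n_mc=5000, q_amaf=0.8, n_amaf=200))
```

Early in the search the AMAF estimate dominates, which helps when few simulations have passed through a move; as the visit count grows, the estimate converges to the ordinary Monte Carlo mean.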

Book Search Engine with Image Recognition and Natural Language Understanding

Dec 2014 - Jan 2015 | Best Course Project

The search engine was built with Lucene, based on information about 1.8 million books that we crawled from a website.
We used the SIFT algorithm to detect and describe local features in images.
Unlike other engines that can only search by book title, we allow fuzzy search by author, type, content and comment. In addition, I designed a parser-based system to understand users' natural language input.

Chinese Word Segmentation Tool

Dec 2013 – Jan 2014

I was in charge of building the framework and implementing segmentation algorithms. Our program was one of the fastest and most accurate tools among all course projects.
I introduced an empirical model combining the forward maximum matching method and the reverse maximum matching method, with several specialized dictionaries for disambiguation.
We used a trie and a hash table to increase the segmentation speed.
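
The combination of forward and reverse maximum matching can be sketched as follows. This is a simplified illustration, not the project's implementation: a Python `set` stands in for the trie and hash table, the toy vocabulary is made up, and the tie-breaking rule (prefer fewer words, then the reverse result) is an assumed heuristic rather than the empirical model described above.

```python
def forward_mm(text, vocab, max_len):
    """Greedy longest match, scanning left to right."""
    out, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:  # single characters always match
                out.append(piece)
                i += size
                break
    return out


def reverse_mm(text, vocab, max_len):
    """Greedy longest match, scanning right to left."""
    out, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                out.append(piece)
                j -= size
                break
    return out[::-1]


def segment(text, vocab, max_len=4):
    fwd = forward_mm(text, vocab, max_len)
    rev = reverse_mm(text, vocab, max_len)
    # Assumed heuristic: prefer fewer words; on a tie, take the reverse result.
    return fwd if len(fwd) < len(rev) else rev


vocab = {"研究", "研究生", "生命", "起源"}  # toy dictionary for the example
print(forward_mm("研究生命起源", vocab, 4))
print(segment("研究生命起源", vocab))
```

On this input the forward pass greedily takes 研究生 and leaves a stray single character, while the reverse pass recovers 研究 / 生命 / 起源, which is why the two directions are compared.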

Innovative and Entrepreneurial Project of SJTU: Intelligent Lecture Organizing Application

Jan 2015 - May 2016

This project aims to automatically generate notes for lectures and help users keep their notes organized.
Lecture notes are outlined by terminology extraction, which I designed to detect domain-specific keywords from text using TF-IDF and other syntactic features.
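
The TF-IDF part of the keyword scoring can be sketched as below. This is a minimal illustration that assumes the plain `tf * log(N / df)` weighting and whitespace tokenization; the function name `tfidf` and the toy documents are made up for the example, and the syntactic features used in the project are not shown.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: score} dict per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return scores


docs = [
    "the gradient of the loss".split(),
    "the lecture starts at noon".split(),
    "the gradient descent update".split(),
]
scores = tfidf(docs)
for term, score in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(f"{term}\t{score:.3f}")
```

Terms that occur in every document, such as "the" here, receive a score of zero, while terms concentrated in a single document rank highest, which makes them candidates for domain-specific keywords.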