
Master's Thesis Silvia Severini

Last modified Jan 8, 2020

Multi-Task Deep Learning in the Software Development Domain

Recently, industrial practitioners (DeepMind, Microsoft, Google, Facebook, ...) have shown growing interest in integrating machine learning into Software Engineering (SE) solutions.
Deep learning has already achieved performance competitive with earlier algorithms on about 40 SE tasks [1].
Working in the Software Development domain means working with English text combined with programming languages such as Java, C#, SQL, and Python. Tasks of this type require combining Machine Learning, Deep Learning and Natural Language Processing to build a solution. However, some challenges arise: unlike in a classic NLP task, the input vocabulary is effectively unlimited, and the overall solution becomes more complex. Moreover, datasets containing source code are scarce and difficult to acquire.

In this work, we first chose 9 subtasks from different topics in the Software Development domain, selected according to interest and to the availability of datasets. We also added the unsupervised task of language modeling, applied to 4 programming languages and to English.
Subsequently, we will build a Multi-task model to overcome some of the difficulties mentioned above and to exploit advantages of this architecture [2], such as implicit data augmentation and better generalization than Single-task models.
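As an illustration of what a Multi-task model can look like, the sketch below shows hard parameter sharing, one of the common designs described in [2]: a single encoder is shared by all tasks and each task has its own small output head. This is only a sketch under assumptions and not the architecture of the thesis, which is not specified here. The class name, the task names `task_a` and `task_b`, and the layer sizes are hypothetical, and the code covers the forward pass only, without training.

```python
import random


def linear(weights, bias, x):
    """Dense layer: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def init_layer(rng, n_out, n_in):
    """Random weights and zero biases for an n_in -> n_out layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskModel:
    """Hypothetical hard-parameter-sharing model (illustration only)."""

    def __init__(self, n_in, n_hidden, task_outputs, seed=0):
        rng = random.Random(seed)
        # One encoder whose parameters are shared by every task.
        self.shared = init_layer(rng, n_hidden, n_in)
        # One separate output head per task.
        self.heads = {name: init_layer(rng, n_out, n_hidden)
                      for name, n_out in task_outputs.items()}

    def encode(self, x):
        return relu(linear(*self.shared, x))

    def forward(self, task, x):
        # Every task reuses the shared representation; only the head differs.
        return linear(*self.heads[task], self.encode(x))


# Hypothetical tasks with 3 and 2 output values respectively.
model = MultiTaskModel(n_in=4, n_hidden=8,
                       task_outputs={"task_a": 3, "task_b": 2})
x = [0.1, -0.2, 0.3, 0.4]
out_a = model.forward("task_a", x)
out_b = model.forward("task_b", x)
```

Because every task updates the same encoder during training, examples from one task also shape the representation used by the others, which is the intuition behind the implicit data augmentation mentioned above.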

With this project, we want to understand whether Multi-task Deep Learning is beneficial in the Software Development domain. Moreover, we want to train the model with different multi-task combinations (only tasks from the same topic, or all tasks together) to see which performs better.
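The two kinds of combinations named above (tasks from the same topic only, or all tasks together) can be enumerated as in the following sketch. The topic and task names are placeholders, since the 9 subtasks are not listed here, and the function name is hypothetical.

```python
# Placeholder topics and tasks; the actual subtasks are not listed here.
tasks_by_topic = {
    "topic_a": ["task_1", "task_2"],
    "topic_b": ["task_3", "task_4", "task_5"],
}


def training_configurations(tasks_by_topic):
    """Return one task list per topic, plus one with all tasks together."""
    configs = {}
    for topic, tasks in tasks_by_topic.items():
        configs["same_topic:" + topic] = list(tasks)
    configs["all_tasks"] = [task
                            for tasks in tasks_by_topic.values()
                            for task in tasks]
    return configs


configs = training_configurations(tasks_by_topic)
for name, tasks in configs.items():
    print(name, tasks)
```

Each resulting task list would correspond to one multi-task training run, whose results can then be compared with each other and with the single-task baselines.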

 

References:

[1] Zhang, Yu, and Qiang Yang. "A survey on multi-task learning." arXiv preprint arXiv:1707.08114 (2017).

[2] Ruder, Sebastian. "An overview of multi-task learning in deep neural networks." arXiv preprint arXiv:1706.05098 (2017).

 
