Automated Scientific Content Generation Using Semantic Analysis and Deep Learning

Innovation Development
Thesis Code: 18002

Thesis Type: Thesis in Computer Science, Data Engineering, Computer Engineering, Mathematical Engineering, Data Science

Requirements
• Experience with Python and/or Java
• Basic knowledge of modular development
• Beginner-level knowledge of machine learning, or willingness to learn it quickly
• Curiosity-driven mindset

Description
Automated assistants are, now more than ever, becoming part of our daily lives. They are asked to generate content according to users' inputs and contextual objectives. Take the case of a scientist whose daily tasks consist of performing experiments, filling in tables, and reporting findings. Much of that time is spent transcribing findings that have already been worked out and encoded in tables. Advances in artificial intelligence now support scenarios in which an AI-based assistant and a scientist cooperate when writing technical reports. The objective of this thesis is therefore to research and prototype an intelligent system able to write scientific text starting from tables. In this thesis the undergraduate will develop an AI-based system for writing scientific papers using both semantic analysis and deep learning. The system will learn autonomously from pairs of tables and papers created as gold examples, and will generate a report from a new table.

The thesis will be structured as follows:
• critical analysis of the state of the art in document generation using both semantic analysis and deep learning;
• problem formulation: objective function, data structures and resources to be used;
• algorithm design and prototyping;
• in-lab testing and verification with real data, and measurement of the quality of the approach.

The undergraduate will benefit from being immersed in a research environment. It is a unique setting in which to develop a research mindset, with a strong push for innovation. At the end of the thesis the undergraduate will be familiar with semantic analysis and deep learning and will be able to implement an intelligent system. As an additional benefit, she/he will become proficient with version control systems, continuous integration systems, and remote deployment and monitoring techniques.

Contact: Send a resume with the list of exams attached to giuseppe.rizzo@ismb.it, specifying the thesis code and title.