Extractive Summarization with Discourse Graphs

A Kaggle project for the INF554 Machine Learning and Deep Learning course at École Polytechnique

Description

This project, carried out by Samuel Gaudin, Alexandre Ver Hulst and me as part of the INF554 course at École Polytechnique, focuses on identifying key messages in business dialogues using machine learning. Working with 137 labeled business dialogues, we explored various features (message length, speaker type, sentence embeddings, and sentiment scores) to determine the importance of each message.
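As an illustration of the simpler features mentioned above, here is a minimal sketch that turns one message into a numeric vector from its length and speaker type. The `Utterance` class, the `SPEAKER_ROLES` list, and the sample dialogue are hypothetical and not taken from the project; sentence embeddings and sentiment scores would come from separate models and be concatenated to this vector.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical speaker roles; the label set in the actual dataset may differ.
SPEAKER_ROLES = ["manager", "engineer", "designer"]


@dataclass
class Utterance:
    """One message of a dialogue (hypothetical container for this sketch)."""
    speaker: str
    text: str


def extract_features(utt: Utterance) -> List[float]:
    """Return [word count, one-hot speaker role...] for a single message."""
    n_words = len(utt.text.split())  # whitespace tokenization, an assumption
    one_hot = [1.0 if utt.speaker == role else 0.0 for role in SPEAKER_ROLES]
    return [float(n_words)] + one_hot


# Made-up example dialogue, for illustration only.
dialogue = [
    Utterance("manager", "Okay, let's start the meeting."),
    Utterance("engineer", "Um."),
    Utterance("designer", "The remote should have fewer buttons."),
]
features = [extract_features(u) for u in dialogue]
```

Each row of `features` can then be passed to a classifier that predicts whether the message is important.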

Access to the report:

Description of the project.

Approach and Models

We implemented a range of machine learning and deep learning models, including logistic regression, support vector machines (SVMs), XGBoost, and neural networks. Graph neural networks (GNNs) and LSTM-based models were particularly effective, with the LSTM proving best at capturing dialogue nuances.
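To give an idea of how a graph-based model can use the structure of a conversation, the sketch below performs one round of mean neighbor aggregation over a dialogue graph. This is a generic illustration of message passing, not the architecture used in the project: the function `mean_aggregate`, the edge list, and the feature values are all assumptions made for the example.

```python
from typing import List, Tuple


def mean_aggregate(
    node_features: List[List[float]],
    edges: List[Tuple[int, int]],
) -> List[List[float]]:
    """Concatenate each node's features with the mean of its neighbors' features.

    Edges are treated as undirected here, which is a simplifying assumption.
    """
    n = len(node_features)
    dim = len(node_features[0])

    # Build an adjacency list from the edge list.
    neighbors: List[List[int]] = [[] for _ in range(n)]
    for src, dst in edges:
        neighbors[src].append(dst)
        neighbors[dst].append(src)

    enriched = []
    for i in range(n):
        if neighbors[i]:
            mean = [
                sum(node_features[j][k] for j in neighbors[i]) / len(neighbors[i])
                for k in range(dim)
            ]
        else:
            mean = [0.0] * dim  # isolated message: no context to aggregate
        enriched.append(node_features[i] + mean)
    return enriched


# Three messages with 2-dimensional features; message 1 is linked to 0 and 2.
node_features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
edges = [(0, 1), (1, 2)]
enriched = mean_aggregate(node_features, edges)
```

After this step, each message carries information about the messages it is linked to, which a downstream classifier can use. A trained GNN layer would additionally apply learned weights and a nonlinearity to the aggregated features.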

Graph of a conversation and final model

Results

The LSTM neural network performed best among the models we tested on professional dialogue, thanks to its ability to process text sequentially.