Special Sessions

Recent Advances on Text and Document Streams Mining

Scope

A huge amount of information is nowadays available in the form of texts and documents. These data are continuously generated by different sources, such as online social networks, newspaper websites, clinical and industrial report repositories, digital libraries, and massive open online courses.

Such a massive amount of texts and documents represents a mine from which to extract valuable knowledge to be exploited in several contexts, such as decision making, event detection and prediction, user and customer profiling, marketing campaigns, sentiment analysis, opinion mining, and text summarization. Indeed, recent years have seen an increase in the number of scientific contributions on text and document mining techniques, mainly based on data mining, machine learning, artificial intelligence, and statistics. Moreover, several commercial solutions for text and document analysis have been proposed by well-known companies such as Google and Amazon.

A number of challenging issues are related to streams of texts and documents, especially because the massive volume of data must be processed in an online fashion. Moreover, when streams of texts and documents are analyzed continuously over time, the issue of concept drift must be addressed: the phenomenon under observation can change over time, and the performance of the adopted algorithms may deteriorate as a result. Finally, the unstructured nature of text increases the difficulty of designing mining algorithms.

The aim of this session is to offer a forum for both the academic and industrial communities to share and disseminate their innovative research efforts and developments on the scientific and technological challenges of designing and implementing tools for extracting useful knowledge from streams of texts and documents.

Topics

Potential topics of interest include, but are not limited to:

  • Text and Document Categorization
  • Text and Document Summarization
  • Sentiment Analysis
  • Social Sensing
  • Scientific Document Analysis
  • Fake News Detection
  • Community Discovery
  • Event Detection from Textual Sources
  • Opinion Mining
  • Concept Drift Detection in Text and Document Classification and Clustering
  • Incremental Learning of Models for Text and Document Mining
  • Crawling and Scraping Solutions for Streams of Texts and Documents

Organizers

Pietro Ducange, University of Pisa, Italy

Francesco Marcelloni, University of Pisa, Italy

Manuel Jesus Cobo Martin, University of Cadiz, Spain

Enrique Herrera-Viedma, University of Granada, Spain