Publications

Ph.D. Thesis

Sudipta Kar

Dept. of Computer Science, University of Houston

Computerized systems capable of generating high-level story descriptions have many potential real-life applications. However, building such systems requires teaching computers to obtain an abstract understanding of natural language stories algorithmically, which is a non-trivial problem in Artificial Intelligence and Natural Language Processing.
In this dissertation, we tackle the challenge of automatically characterizing stories at a high level by generating a set of tags from narrative texts written in English. We start by presenting a background study on the problem, discussing the resources required for research, and proposing a new corpus to facilitate research on high-level story understanding, with tag prediction for movies as the target application. We then focus on designing methods for high-level story understanding from written narratives and for predicting tags for movies from their written plot synopses. First, we employ a wide range of linguistic features to design a machine learning approach for generating descriptive tags for stories from narrative texts. Next, we design a neural methodology for modeling the flow of emotions throughout stories and enhance a system that uses a high-level representation of narrative texts to predict tags. We further exploit the hierarchical structure of text documents to encode the synopses and strengthen the tag prediction mechanism. In the final part of this dissertation, we demonstrate a technique that uses user reviews to generate tags for characterizing stories at a high level. We have made the new dataset, the source code of the systems, and a live tag prediction system publicly available to the community to encourage further exploration of automatic story characterization.

Patent

1. AutoBook analysis and recommendation

Holly Lynn Payne, Mark Fielding Bregman, Bogart Vargas, Thamar Solorio, Suraj Maharjan

Papers

1. Integrating Summarization and Retrieval for Enhanced Personalization via Large Language Models

Chris Richardson, Yao Zhang, Kellen Gillespie, Sudipta Kar, Arshdeep Singh, Zeynab Raeesy, Omar Zia Khan, Abhinav Sethy

2. MultiCoNER v2: a Large Multilingual dataset for Fine-grained and Noisy Named Entity Recognition

Besnik Fetahu, Zhiyu Chen, Sudipta Kar, Oleg Rokhlenko, Shervin Malmasi