Natural Language Processing
The main research theme brings together researchers working on the automatic analysis of legal texts (judgments and bills), with the goal of building a new intelligent legal system. So far, a corpus of approximately 2 million texts from the legal domain has been created. From its contents, a dictionary of legal terms is constructed automatically, along with language models based on word and sentence embeddings. A set of general-purpose text analysis tools for Polish is also being developed, including a sentence segmenter, a tokenizer, a tagger, a named entity recognizer, and an acronym detector. Other research projects cover information extraction from text and cyberbullying detection, with a particular focus on texts from social networks.
Keywords: natural language processing, information extraction, tagging, automatic summarization, text segmentation, automatic dictionary construction, full text search, dialog systems
Team: A. Smywiński-Pohl, M. Strzała, K. Wróbel, M. Piech, K. Bałazy
Contact: Aleksander Smywiński-Pohl