Natural Language Processing

The main research theme is the automatic analysis of legal texts (judgments and bills), with the goal of building a new intelligent legal system. So far, a corpus of approximately 2 million texts from the legal domain has been created. Based on its contents, a dictionary of legal terms is constructed automatically, along with language models based on word and sentence embeddings. A set of general-purpose text analysis tools for Polish is also being developed. These include a sentence segmenter, a tokenizer, a Polish tagger, a named entity recognizer, and an acronym detector. Other research projects cover information extraction from text and cyberbullying detection, with particular interest in texts from social networks.

Keywords: natural language processing, information extraction, tagging, automatic summarization, text segmentation, automatic dictionary construction, full text search, dialog systems

Projects: https://lemkin.pl

Team: A. Smywiński-Pohl, M. Strzała, K. Wróbel, M. Piech, K. Bałazy

Contact: Aleksander Smywiński-Pohl