Your first NLP project: peaks and pitfalls of unstructured data

If you are looking for short, practical recipes for different natural language processing use cases in Python, this talk is for you!

Tags: NLP

Scheduled on Wednesday 11:30 in room cubus


Anna Widiger

Anna Widiger has a B.A. degree in Computational Linguistics. She’s been doing NLP since her very first programming assignment, specializing in Russian morphology, German syntax, cross-lingual named entity recognition, topic modeling and natural language understanding. She likes Pi, pies and PyPeople.


Natural Language Processing improves the quality of your text data for future analysis and can increase the accuracy of your machine learning model. It’s important to know what goes into the bag of words and what the dos and don’ts of text pre-processing are. Which text normalization steps are necessary and which ones are “nice-to-have”? Why is classic NLP still relevant in the age of Deep Learning? What metrics can be used to compare word frequencies, and what can machine learning algorithms do with those numbers? This NLP talk provides answers to these questions and more! You’ll see three examples of NLP pipelines using spaCy: sentiment analysis and emoji in tweets, named entity recognition in Yelp reviews, and multilingual topic modeling for news articles.
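To make "what goes into the bag of words" concrete, here is a minimal sketch using only the Python standard library rather than spaCy. It is not taken from the talk: the toy corpus, the stop-word list, and the function names are all assumptions made for illustration. TF-IDF is shown as one common metric for comparing word frequencies across documents.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus and stop-word list, chosen only for illustration.
DOCS = [
    "The pizza was great, really GREAT!",
    "The service was slow and the pizza was cold.",
    "Great service, great staff.",
]
STOP_WORDS = {"the", "was", "and", "a", "really"}


def normalize(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(text):
    """Count how often each normalized token occurs in one document."""
    return Counter(normalize(text))


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency),
    one of several common TF-IDF variants.
    """
    bags = [bag_of_words(d) for d in docs]
    n_docs = len(bags)
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())  # each term counted once per document
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in bag.items()
        })
    return weights


bags = [bag_of_words(d) for d in DOCS]
weights = tf_idf(DOCS)
```

In this sketch, lowercasing merges "great" and "GREAT" into one feature, and a term that appears in only one review ("slow") receives a higher weight in that review than a term shared with other reviews ("pizza"). A machine learning model would typically consume these weights as a feature vector per document.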