Digital Methods in Humanities and Social Sciences
Workshops
Lecturer: Pärtel Lippus (partel.lippus@ut.ee), University of Tartu
Co-lecturer: Anton Malmi (antonm@ut.ee), University of Tartu
Date: 26.08.
Room: Lossi 3-406
Description
This workshop takes you through the first steps in the command-line based statistical computing program R. We will go through the basic syntax of R. We will learn how to get your data into R, whether it comes as tables or text, how to examine the data within R, and how to get your observations back out of R. We will use R through the RStudio software. No previous experience with R is needed to participate. This workshop is highly recommended for everyone who has no previous experience with R but is planning to participate in other workshops that use R, such as the ones by Peeter Tinits, Kimmo Elo and Cornelius Puschmann.
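To give a flavour of what these first steps look like, here is a minimal sketch in base R of reading a table in, examining it, and writing results back out (the file name survey.csv and the column group are invented for illustration):

# Read a table into a data frame (hypothetical example file)
responses <- read.csv("survey.csv", stringsAsFactors = FALSE)

# Observe the data
head(responses)          # first rows
str(responses)           # structure: column names and types
summary(responses)       # basic descriptive statistics
table(responses$group)   # counts per category (assumes a column named "group")

# Get your observations back out of R
write.csv(responses, "survey_checked.csv", row.names = FALSE)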
About the instructor
Pärtel Lippus has a background in experimental phonetics. He has studied the acoustics and the perception of Estonian prosody. His PhD (2011) was on the Estonian three-way quantity system. He has used R for quantitative data analysis and plotting, but also for formatting a text database. He has taught basic R courses at the University of Tartu.
Lecturer: Richard Rogers (R.A.Rogers@uva.nl), University of Amsterdam
Date: 26.08.
Room: Jakobi 2-106
Description
The workshop opens with a discussion of how to repurpose digital “methods of the medium” for social and cultural scholarly research, including the limitations, critiques and ethics of doing so. Subsequently the discussion turns to the practicalities of using digital methods hands-on. How to use crawlers for dynamic URL sampling and issue network mapping? How to employ scrapers to create a bias or partisanship diagnostic instrument? We also consider how to deploy online platforms for social research. How to transform Wikipedia from an online encyclopaedia into a device for cross-cultural memory studies? How to make use of social media to profile the preferences and tastes of politicians’ friends, and to locate the most engaged-with content? How to make use of Twitter analytics to debanalize tweets and provide compelling accounts of events on the ground? Finally, the workshop turns to the question of employing web data and metrics as societal indices more generally.
About the instructor
Richard Rogers is Professor of New Media & Digital Culture at the University of Amsterdam. He is Director of the Digital Methods Initiative, dedicated to the study of the ‘natively digital’ and online epistemologies. He is also the Academic Director of the Netherlands National Research School for Media Studies (RMeS). Among other works, Rogers is the author of Information Politics on the Web (MIT Press, 2004), awarded best book of the year by the American Society for Information Science and Technology (ASIS&T), and Digital Methods (MIT Press, 2013), awarded Outstanding Book of the Year by the International Communication Association (ICA). Rogers is a three-time Ford Fellow and has received research grants from the Soros Foundation, Open Society Institute, Mondriaan Foundation, MacArthur Foundation and Gates Foundation. His most recent book, Doing Digital Methods (Sage, 2019), is a teaching resource.
Lecturer: Christian Ritter (christian.ritter@tlu.ee), Tallinn University
Date: 26.08.
Room: Jakobi 2-105
Description
The digitalisation of working life, civic engagement, and leisure activities has changed everyday life on a global scale. Since day-to-day routines are increasingly intertwined with the Internet, digital ethnography can cast further light on the ways in which people live, work and relax. This participatory workshop will bring together researchers who share the aim of understanding the role of media technologies in everyday life. Digital ethnographers assess the various socio-cultural transformations accompanying the omnipresence of internet-based communication. A growing number of desktop computers, laptops, tablets, smartphones and wearables generate natively digital data, which can be analysed to enhance ethnographic investigations. For these reasons, the workshop will also put emphasis on developing mixed-method approaches that combine ethnographic immersion with computational research techniques, such as the software applications IssueCrawler and Gephi.
Workshop participants will become familiar with recent approaches in digital ethnography by discussing strategies for researching experiences, practices, localities, things, and social relationships in an increasingly data-saturated world. The participatory workshop will also initiate discussions on research ethics and explore experimental forms of gathering and presenting ethnographic data. At various stages of the workshop, attendees will be given the opportunity to break into small groups and reflect on their own research projects.
About the instructor
Christian Ritter is a research fellow at the Centre of Excellence in Media Innovation and Digital Culture (MEDIT) at Tallinn University and an affiliate researcher in the Department of Social Anthropology at the Norwegian University of Science and Technology. He received his PhD from Ulster University, UK. Based on long-term ethnographic fieldwork in Estonia, Ireland, Norway and Turkey, his main research interests revolve around cultures of expertise, digital labour, contemporary mobilities and the socio-technical systems of the Internet.
Lecturer: Kimmo Elo (kimmo.elo@utu.fi), University of Turku
Co-lecturer: Peeter Tinits (peeter.tinits@ut.ee), University of Tartu, Tallinn University
Date: 27.08.
Room: Lossi 3-406
Description
Social Network Analysis (SNA) is part of the Anglo-Saxon, quantitatively oriented line of network research, theoretically rooted in mathematical graph theory. A network is defined as a set of dots (nodes, vertices) and connecting lines (edges) between nodes. SNA is based on the assumption that the network structure is significant for understanding and explaining the larger phenomenon the network is connected to. In other words, network structure is assumed to explain observed (social) behaviour. Further, SNA makes heavy use of graphical visualisation: network graphs are used to present the network structure, to help the scholar focus her research, and to analyse network dynamics.
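As a minimal illustration of this node-and-edge representation, the sketch below builds a tiny network with the igraph package for R and computes a few structural measures (the names are invented; the workshop itself works with Visone rather than R):

library(igraph)

# Edge list: who is connected to whom (invented example data)
edges <- data.frame(from = c("Anna", "Anna", "Bert", "Carl"),
                    to   = c("Bert", "Carl", "Carl", "Dora"))

g <- graph_from_data_frame(edges, directed = FALSE)

degree(g)        # number of ties per node
betweenness(g)   # brokerage positions in the network
plot(g)          # the network graph as a visualisation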
The course is an introductory workshop in Social Network Analysis (SNA). The workshop will cover three main topics:
- Foundations of network analysis: theories, concepts, and tools
- From material to data: data collection and preparation
- Network visualisations and analysis
Students interested in participating in the workshop should have elementary knowledge of a spreadsheet programme (LibreOffice, Excel). After the course students should have basic skills in Visone, a software package for network visualisation and analysis.
About the instructor
Adjunct professor Dr Kimmo Elo is a senior researcher at the Centre for Parliamentary Studies at the University of Turku (Finland). He has a decade of experience in applying digital research methods and tools (e.g. network analysis, text/data mining, data visualisation) in the human sciences.
Dr Elo is the Visiting Lecturer in Digital Humanities at the University of Tartu during the spring semester of 2019.
Lecturer: Minna Ruckenstein (minna.ruckenstein@helsinki.fi), University of Helsinki
Date: 27.08.
Room: Jakobi 2-110
Description
This workshop explores metaphors that are used for presenting digital data and for advancing data-related inquiry. We will unpack commonly used metaphors, such as data being the new oil, and formulations, such as data being soil, sweat, or waste. An additional stream of inquiry will focus on data’s quality as lively, stuck or dead.
After laying common ground for thinking about how metaphors work as partial and perspectival framing devices, we discuss how they arrange and provoke ideas and act as a domain within which facts, connections and relationships are presented and imagined. We will dig deeper into the concept-metaphor of ‘broken data’, suggesting that digital data might be broken and fail to perform, or be in need of repair.
By focusing on the broken data metaphor, we examine the implications of breakages in the data and the consequent repair work. In concrete terms, we discuss aspects of broken data in relation to various kinds of data initiatives and data uses. The goal is to demonstrate that a focus on data breakages is an opportunity to stumble into unexpected research questions and to account for how data breakages and related uncertainties challenge linear and overly confident stories about data work. Overall, the broken data metaphor sensitizes us to consider the less secure and ambivalent aspects of data worlds.
References:
Data metaphors – a reading list
https://socialmediacollective.org/reading-lists/metaphors-of-data-a-reading-list/
Pink, S., Ruckenstein, M., Willim, R., & Duque, M. (2018). Broken data: Conceptualising data in an emerging world. Big Data & Society, 5(1), 2053951717753228.
About the instructor
Minna Ruckenstein works as an associate professor at the Consumer Society Research Centre and the Helsinki Center for Digital Humanities, University of Helsinki. Her ongoing research focuses on digitalization/datafication by highlighting emotional, social, political and economic aspects of current and emerging data practices.
Lecturer: Barbara Denicolò (Barbara.Denicolo@uibk.ac.at), University of Innsbruck
Co-lecturer: Artjoms Šela (artjoms.sela@ut.ee), University of Tartu
Date: 27.08.
Room: Jakobi 2-106
Description
The TRANSKRIBUS tool presented during the workshop is not only a program for the manual or automatic transcription of texts, but also a research platform that can connect a wide variety of scientists, institutions and interest groups with each other and support their collaboration.
TRANSKRIBUS offers the possibility to raise the traditional transcription of handwritten texts to a new level, both by linking text and image (at block, line and word level) and by exporting the results to various formats (TEI, PDF, Word, METS, Excel).
At the same time, the transcriptions produced can be used to train the neural networks behind Handwritten Text Recognition (HTR). Automatic transcription allows even large amounts of text to be opened up and made searchable.
The workshop will cover the following content and aspects:
- Presentation of the TRANSKRIBUS Platform
- Introduction to the expert tool
- Introduction to Handwritten Text Recognition
- Generating ground truth, applying existing models, training your own models
- Presentation of further additional tools
- Presentation of some use cases
- Questions, exchange, discussion (feedback from users welcome)
- Guided exercises
About the instructor
Barbara Denicolò studied History, German and Latin at the University of Innsbruck. Since 2016 she has been part of the READ project and uses Transkribus for various purposes, source collections and research projects.
She manages a Crowdsourcing and Citizen Science project entitled “Bozner Ratsprotokolle: Transkribiert!”, where interested volunteers transcribe and annotate the city council protocols of the city of Bolzano (South Tyrol) in Transkribus.
Her research focuses on the German and Latin palaeography of the Middle Ages and early modern times, digital cataloguing and processing of manuscripts and archival material, as well as nutritional history and cookbook research, forestry and mining history of the Alpine region.
Lecturer: Peeter Tinits (peeter.tinits@ut.ee), Tallinn University, University of Tartu
Co-lecturer: Artjoms Šela (artjoms.sela@ut.ee), University of Tartu
Date: 28.08.
Room: Lossi 3-406
Description
The increasing availability of textual data offers new opportunities for the humanities and social sciences that we are only beginning to explore. The nature of the data can vary considerably, ranging from old digitized newspapers to Twitter or forum posts that are born and live digitally. Provided that we can access the data, they allow quite diverse questions to be answered.
In this 5-hour tutorial, we will learn the basics of text mining in R following tidyverse principles. R is a computing environment for statistical analysis and graphics that allows analyses to be easily reproduced later, also by other researchers. Tidyverse is an opinionated set of packages that aims to make R code easy to read and learn.
Reproducible analyses allow the humanities and social sciences to increase transparency in the research process, and make it easier to collaborate and to build on earlier research. Movements among researchers have shown the benefits of Open Science and Open Research Practices for our scientific knowledge about the world.
What exactly: We will use the tidytext and ggplot2 packages to make simple visualizations of texts. We will compare word frequencies, do simple sentiment analysis, and find keywords in texts. We will explore these techniques on novels, dramas, and/or song lyrics in English; exploring your own texts is also a possibility. The tutorial aims to give you the basic techniques that will help you get started on a research project of your own.
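As a taste of the tidytext workflow, the sketch below counts word frequencies in Jane Austen's novels, plots the most frequent words, and attaches a simple sentiment lexicon (the janeaustenr example corpus and the "bing" lexicon are assumptions; the texts used in the tutorial may differ):

library(dplyr)
library(tidytext)
library(ggplot2)
library(janeaustenr)

# One word per row, stop words removed, counted per novel
word_counts <- austen_books() %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  count(book, word, sort = TRUE)

# The 15 most frequent words in "Emma" as a bar chart
word_counts %>%
  filter(book == "Emma") %>%
  top_n(15, n) %>%
  ggplot(aes(reorder(word, n), n)) +
  geom_col() +
  coord_flip() +
  labs(x = NULL, y = "word frequency in Emma")

# Simple sentiment analysis with the "bing" lexicon bundled with tidytext
word_counts %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(book, sentiment, wt = n)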
Requirements: Knowing R helps, but is not obligatory. Starting with tidyverse you may get a biased view of R, but the tutorial ought to be understandable with no prior experience in scripting.
The lessons will take place in a computer class with the required software installed. If using your own computer, install R (https://www.r-project.org) and Rstudio (https://www.rstudio.com) beforehand.
NB! It is highly recommended for everyone who has no previous experience with R to participate in the workshop “First steps in R” on August 26!
Materials:
– Silge, Julia, and Robinson, David (2017) Text Mining with R. A tidy approach. O’Reilly Media.
– Grolemund, Garrett, and Wickham, Hadley (2017) R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media.
About the instructor
Peeter Tinits is a would-be digital humanist and an open science aficionado with an interest in historical texts. In his research he has combined textual and non-textual data to study topics like the standardization of spelling norms, the structure of film production crews and writing techniques in Wikipedia. He is a firm believer that anyone can learn to code, and the humanities have a lot to gain from adopting reproducible research practices.
He is currently finishing his PhD at Tallinn University and has started working at the University of Tartu on text mining historical newspapers to track large societal transitions.
Lecturer: Joshua Wilbur (joshwilbur@gmx.net), University of Freiburg, University of Tartu
Date: 28.08.
Room: Jakobi 2-105
Description
Primary data in humanities research often consists of digital audio/video files that researchers want to annotate. The ELAN application provides an effective, flexible format for creating structured annotations linked to such media files. This workshop will provide an introduction to using ELAN’s features for creating useful annotation documents that can be used for further scientific analyses. We will look at how to design meaningful annotation structures in ELAN, how to link ELAN files to media files, how to transcribe and further annotate those media files, how to link metadata, and how to use ELAN as a corpus search engine for a collection of ELAN annotations. While the software was developed mainly with projects aimed at documenting endangered languages in mind, it can also be used in other disciplines in which primary data consist of digital audio/video files.
ELAN is freeware developed at the Max Planck Institute for Psycholinguistics in Nijmegen (NL) and available for download at https://tla.mpi.nl/tools/tla-tools/elan/. Participants wishing to use their own computers should download and install the most recent version of the program. Feel free to bring your own media files to work on if you have any; otherwise some practice examples will be provided.
About the instructor
Joshua Wilbur is a linguist who has been working on digital documentation and description of Pite Saami, a highly endangered Uralic language of northern Sweden, since 2008. After receiving his Ph.D. in General Linguistics in 2013 from the University of Kiel, he published a grammar of Pite Saami in 2014, and went on to publish lexicographic materials and a preliminary orthographic standard for the language. In his recent post-doctoral project at the Freiburg Research Group in Saami Studies aimed at describing syntactic structures in Pite Saami, he created a digital infrastructure (using FST and Constraint Grammar together) to automatically annotate ELAN transcriptions in his Pite Saami language corpus.
Starting in fall 2019, Dr Wilbur will be Visiting Lecturer in Digital Humanities at the University of Tartu.
Lecturers: Maili Pilt (maili.pilt@ut.ee) & Siim Sorokin (siim.sorokin@ut.ee), University of Tartu
Date: 28.08.
Room: Jakobi 2-110
Description
Social media has changed how we understand narratives and storytelling. In our workshop we deal with the questions of what narratives mean in the social media context and which methods are most appropriate for studying them. How does social media influence the form and content of narratives? What are the possible research topics with regard to narratives on social media? What kinds of approaches and methods have been used to study social media narratives? Every workshop session consists of a short topical lecture, discussions and group work. We will learn how to spot narratives and narrativity in social media texts within various environments (e.g. blogs and asynchronous discussion forums); how to select suitable research methods for concrete research questions and narratives; and how to solve questions of research ethics. The final session of the workshop is mostly practical: concrete qualitative and quantitative (computer software) methods are used for the analysis of social media narratives.
Reading list
Wilson, Michael. 2014. “Another Fine Mess”: The Condition of Storytelling in the Digital Age. – Narrative Culture, Vol. 1, No. 2, pp. 125–144.
Page, Ruth E. 2012. Stories and Social Media: Identities and Interaction. Routledge Studies in Sociolinguistics. Routledge.
About the instructors
Maili Pilt is a folklorist, who is in the final year of her doctorate at the Institute of Cultural Research, University of Tartu. She is a member of Estonian Society for Digital Humanities (dh.org.ee). Pilt researches the storytelling practices in online groups of women who share their personal stories about conception, pregnancy, in vitro fertilization and childbirth. Her research also focuses on the methodological problems involved in social media narrative research. She has published on topics such as research ethics, reflexivity, and methods for collecting and analysing digital narratives.
Siim Sorokin works as a Research Fellow (Culture Studies) and Project Research Staff member at the Institute of Cultural Research, University of Tartu. Sorokin is also a member of the Nordic Network of Narrative Studies (NNNS). In 2018, Sorokin defended his Doctor of Philosophy degree in Folkloristics with a dissertation entitled Character Engagement and Digital Community Practice: A Multidisciplinary Study of “Breaking Bad.” Sorokin’s main research involves the analysis of online discourses of sense-making and the reception of narrative media in social media communities, with special emphasis on the expressive dimension of character and person engagement, theorized, inter alia, through the lens of externalist (and anti-idealist) philosophies of mind. Sorokin has published in high-profile local scientific journals and has a co-authored article in Frontiers of Narrative Studies (with Prof. Dr. Marina Grishakova and Dr. Remo Gramigna), as well as a chapter in an edited volume published by Cambridge Scholars Publishing, forthcoming in 2019. Sorokin’s article on online misogyny discourse in the reception of “Breaking Bad” for a special issue of Narrative Inquiry is presently under double-blind peer review. Sorokin is also working on a monograph elaborating the themes of the dissertation.
Lecturer: Cornelius Puschmann (c.puschmann@leibniz-hbi.de), Leibniz Institute for Media Research
Date: 29.08.
Unfortunately, Cornelius Puschmann is not able to attend the summer school, and thus his workshop and plenary lecture are cancelled. Luckily, UT digital humanities guest lecturer Artjoms Šela is offering a replacement workshop titled “Introduction to stylometry and multivariate text analysis in R”.
Description
Social media platforms play an increasingly important role in politics, business, culture and academia. From services such as Facebook, Twitter, YouTube and Instagram in Western and Middle Eastern countries to platforms like VK (Russia) and WeChat (China), social media are used for a diverse set of purposes by a wide range of actors, from government entities and political activists to celebrities and public intellectuals. They also play a controversial role for public debate, having both been framed as instruments of democratization and openness and as dangerous, polarizing and pervaded by misinformation and extremism. What is largely undisputed however is that social media represent a vital data source for the study of politics, culture and society at large, and are therefore of growing relevance to communication and media researchers.
This class will focus on how one particular instrument from the toolbox of computational methods — sentiment analysis — may be used in communication and media research to study how discourse dynamics change over time, as well as how sentiment differs by actor and source. A particular focus will be placed on validating sentiment analysis results using a variety of computational as well as manual strategies. In technical terms, we will rely on R and the quanteda and tidytext packages, as well as a broad range of sentiment dictionaries and pre-annotated textual resources. We will also experiment with Google’s Cloud Natural Language API for assessing sentiment across languages.
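A minimal sketch of the dictionary-based approach described above, using quanteda and the Lexicoder Sentiment Dictionary (Young & Soroka 2012) that ships with the package; the two example documents are invented:

library(quanteda)

texts <- c(doc1 = "What a wonderful, hopeful day for the negotiations.",
           doc2 = "This deal is a disaster and a betrayal of voters.")

toks <- tokens(corpus(texts), remove_punct = TRUE)

# Count matches against the positive and negative categories of the dictionary
dfm_sent <- dfm_lookup(dfm(toks),
                       dictionary = data_dictionary_LSD2015[c("negative", "positive")])

# A simple net sentiment score per document
scores <- convert(dfm_sent, to = "data.frame")
scores$net_sentiment <- scores$positive - scores$negative
scores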
Aims
Participants will be introduced to the use of R for content analysis with quanteda and various sentiment dictionaries. They will also learn the fundamentals of managing data and visualizing results.
Prerequisites
The course will assume general familiarity with R (https://www.r-project.org/). Ideally, participants should know how to read datasets into R, work with vectors and data frames, and run basic statistical analyses, such as linear regression.
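As a rough indication of the level assumed, participants should be comfortable with lines such as the following (the file and column names are invented):

d <- read.csv("articles.csv")            # a dataset with columns "length" and "shares"
mean(d$length)                           # working with a vector
long_articles <- d[d$length > 500, ]     # subsetting a data frame
summary(lm(shares ~ length, data = d))   # a basic linear regression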
NB! It is highly recommended for everyone who has no previous experience with R to participate in the workshop “First steps in R” on August 26!
Software and data
The course will use the open-source software R and the development environment RStudio, which greatly facilitates coding with R. Both R and RStudio are freely available and each participant should bring a laptop computer on which the current version of R and RStudio are preinstalled, and on which they have the necessary permissions to install packages. See the attached software and data appendix for instructions on how to download the software and necessary data used in the class.
References
Benoit, K. (2017). Getting started with quanteda. Available at https://quanteda.io/articles/quickstart.html
Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1), 24-54.
von Nordheim, G., Boczek, K., Koppers, L., & Erdmann, E. (2018). Reuniting a divided public? Tracing the TTIP debate on Twitter and in traditional media. International Journal of Communication, 12, 548–569.
Wickham, H., & Grolemund, G. (2016). R for Data Science. London; New York: O’Reilly.
Young, L., & Soroka, S. (2012). Affective news: The automated coding of sentiment in political texts. Political Communication, 29(2), 205-231.
About the instructor
Cornelius Puschmann is a senior researcher at the Leibniz Institute for Media Research in Hamburg where he coordinates the international research network Algorithmed Public Spheres, as well as the author of a popular German-language introduction to content analysis with R (inhaltsanalyse-mit-r.de). He has a background in communication and information science and is interested in the study of online hate speech, the role of algorithms for the selection of media content, and methodological aspects of computational social science.
Lecturers: Hembo Pagi (hembo@archaeovision.eu) & Andres Uueni (andres@archaeovision.eu), Archaeovision
Date: 29.08.
Room: Jakobi 2-105 (computer class)
Description
This short workshop gives some insight into computational imaging. We will talk mainly about photogrammetry and Reflectance Transformation Imaging (RTI), and share our experience of applying those methods to objects and materials of different types and sizes. You will learn some basics and get to practice both techniques. If possible, bring your own camera along.
Photogrammetry is a survey technique based on photographs. 3D coordinates can be derived from overlapping images, and therefore 3D space can be reconstructed. It is widely used for object recording and presentation in the cultural heritage sector.
RTI has been used for cultural heritage documentation since its introduction in 2001. The technique allows the recording of 3D surface reflectance properties and their visualisation as 2D interactive images. The method can be used to investigate objects under various lighting conditions to enhance small surface changes and to bring out cracks, tool marks, scratches, pencil impressions and many other features that are not visible to the naked eye. The method is a valuable tool when examining coins, writing tablets and daguerreotypes, as features such as fine polishing lines, retouches and deteriorations can be identified.
About the instructors
Hembo Pagi and Andres Uueni are both founding members of Archaeovision, an organisation that offers a number of services for heritage documentation and management. These include 3D recording, imaging, web and data management, and UAV recording.
Additional information: archaeovision.eu.
Lecturer: Jean Nitzke (nitzke@uni-hildesheim.de), University of Mainz, University of Hildesheim
Co-lecturer: Kaidi Lõo (kloo@ualberta.ca), University of Alberta, University of Tartu
Date: 29.08.
Room: Jakobi 2-106
Description
This workshop is set to give participants a practically oriented overview of conducting eyetracking studies, with a focus on translation process research. After a brief theoretical introduction to eyetracking, we will go through an entire experiment, including
- designing an experiment (research questions, presenting stimuli, choosing participants, etc.),
- recording data (handling participants, calibration, etc.), and
- analysing the recorded data (areas of interests, visualisations and raw data).
A special focus will be on the dos and don’ts in each of these steps. After the workshop, participants will be able to conduct their first simple eyetracking studies. The content learned will also be applicable to research questions outside the field of translation studies.
About the instructor
Jean Nitzke completed her M.A. at the Faculty of Translation Studies, Linguistics, and Cultural Studies at the Johannes Gutenberg University of Mainz in 2011. She has been a teacher and researcher at the same faculty since April 2012. Her main research interests are translation process research, cognitive processes during translation, translation technologies, post-editing, and non-native English, with a methodological focus on keylogging and eyetracking. Her PhD thesis dealt with problem solving in translation, contrasting translation from scratch and post-editing.
Instructor: Artjoms Šela (artjoms.sela@ut.ee), University of Tartu
Date: 29.08.
Room: Lossi 3-406 (computer class)
This workshop replaces the canceled one by Cornelius Puschmann.
Description
Stylometry – a discipline that measures variation of features within a text or a set of texts – appeared long before computers, but the age of computation made it possible to see style as a distributed phenomenon: hundreds of textual features taken simultaneously describe individuality much better than a handful of hand-picked examples. The usual and well-documented application of stylometric techniques has always been authorship attribution and forensics. In this workshop we will use the general principles behind the multivariate analysis of style and authorial identity to follow the workflow of almost any textual analysis: extracting features, dealing with texts as vectors of these features, and navigating the multidimensional space of these vectors.
The workshop starts by introducing the “stylo” package for R (Eder, Rybicki & Kestemont 2016), which is simple to use yet powerful enough to be customizable and open to research needs. After covering the basics we will move on to building our own simple stylometric tool with the “tidyverse” and “tidytext” packages, which will allow us to demystify the process. Finally, we will discuss how to use stylometry beyond authorship attribution and run a small experiment on the classification of newspaper articles according to their political stance. Participants are encouraged to bring their own datasets, text collections and research questions.
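A minimal sketch of what a scripted stylo run can look like; the corpus folder, file naming and parameter values below are examples, and stylo() without arguments opens the same analysis as a point-and-click dialog:

library(stylo)

# Assumes a folder named "corpus" in the working directory containing
# plain-text files such as "author_title.txt"
results <- stylo(gui = FALSE,
                 corpus.dir = "corpus",
                 analysis.type = "CA",   # cluster analysis dendrogram
                 mfw.min = 100,          # use the 100 most frequent words...
                 mfw.max = 100)          # ...as the feature set

# Each text as a vector of feature frequencies, i.e. a point in the
# multidimensional space discussed above
results$table.with.all.freqs[1:5, 1:5]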
NB! It is recommended for everyone who has no previous experience with R to participate in the workshop “First steps in R” on August 26!
Literature
Maciej Eder, Jan Rybicki and Mike Kestemont. 2016. Stylometry with R: A Package for Computational Text Analysis. – The R Journal, 8:1, pages 107-121.
About the instructor
Artjoms Šela is a digital humanities guest lecturer at the University of Tartu. In 2018 Šela received his PhD in Russian literature at the University of Tartu. He teaches courses focusing on digital humanities in general, its methods, and literature.