Digital Methods in Humanities and Social Sciences
Workshops
Instructor: Pärtel Lippus, University of Tartu
Assistant: Anton Malmi, University of Tartu
Date: 21.08.
Description
This workshop takes you through the first steps in R, a command-line-based program for statistical computing. We will go through the basic syntax of R. We will learn how to get your data into R, whether it is tables or text, how to explore the data with R and how to get your results out of R. We will be using R with RStudio software. Previous experience with R is not needed for participating. This workshop is highly recommended for everyone who has no previous experience with R but is planning on participating in other workshops which use R.
About the instructor
Pärtel Lippus has a background in experimental phonetics. He has studied the acoustics and the perception of Estonian prosody. His PhD (2011) was on the Estonian three-way quantity system. He has used R for quantitative data analysis and plotting, but also for formatting a text database. He has taught basic R courses at the University of Tartu.
Instructor: Mike Thelwall, University of Wolverhampton
Date: 21.08.
Description
This workshop will give an overview of sentiment analysis methods and tools and will then describe how to use and customise the multilingual sentiment analysis software SentiStrength. No prior knowledge is needed other than experience of using computer programs. Experience with at least one social web site and knowledge of linguistic issues related to language would be an advantage. Prior programming experience is not needed. The workshop will describe how to use SentiStrength to estimate the strength of sentiment in large sets of short or medium length social web texts. The customisation part will show how to develop new languages for SentiStrength and to optimise it for sets of texts expressing sentiment in non-standard or specialised ways.
About the instructor
Mike Thelwall is Professor of Information Science and leader of the Statistical Cybermetrics Research Group at the University of Wolverhampton, which he joined in 1989. He is also Docent at the Department of Information Studies at Åbo Akademi University, and a research associate at the Oxford Internet Institute. His PhD was in Pure Mathematics from the University of Lancaster. His current research field includes identifying and analysing web phenomena using quantitative-led research methods, including altmetrics and sentiment analysis, and he has pioneered an information science approach to link analysis. Mike has developed a wide range of tools for gathering and analysing web data, including hyperlink analysis, sentiment analysis and content analysis for Twitter, YouTube, MySpace, blogs and the web in general. His 400+ publications include 244 refereed journal articles and two books, including Introduction to Webometrics. He is an associate editor of the Journal of the Association for Information Science and Technology and sits on three other editorial boards. For more information, see: http://www.scit.wlv.ac.uk/~cm1993/mycv.html
Instructor: Rajesh Sharma, University of Tartu
Assistant: Gopichand Gopini, University of Tartu
Date: 21.08.
Description
The workshop aims to familiarise participants with 1) the concepts of social network/media analysis and text analytics, 2) various applications of these techniques, 3) algorithms for solving problems in these domains that can handle massive data, and 4) some tools that can be used for analytics, through a hands-on session. Some case studies and examples will be discussed to introduce network science concepts. This is a hands-on session using R and Gephi.
Background of students: 1) some background in R is preferred (but not necessary), and 2) bring your laptop with R installed (if the workshop does not provide computers/laptops). We can send short installation instructions for R before the start of the summer school.
About the instructor
Rajesh Sharma joined the Institute of Computer Science at the University of Tartu as a Senior Researcher in August 2017. From January 2014 to July 2017, he held Research Fellow and postdoc positions at the University of Bristol, Queen’s University Belfast, UK, and the University of Bologna, Italy. Prior to that, he completed his PhD at Nanyang Technological University, Singapore, in December 2013. He also worked in the IT industry for about 2.5 years after completing his Master’s at the Indian Institute of Technology (IIT), Roorkee, India.
Instructor: Mike Thelwall, University of Wolverhampton
Date: 22.08.
Description
This workshop will describe how to use the Windows software tool Mozdeh to gather and analyse posts from the social web sites Twitter and YouTube. There are no formal prerequisites other than experience of using computer programs. Familiarity with Microsoft Windows would be an advantage, as would experience of using Twitter and YouTube. The workshop will describe how to harvest tweets in real time using keyword searches and how to download the tweets from individual users. It will also show how the comments on sets of videos or video channels can be downloaded from YouTube. The workshop will introduce a range of analytics techniques, including word association mining, sentiment and gender detection. It will also demonstrate how to create time series graphs and networks from the data.
About the instructor
Mike Thelwall is Professor of Information Science and leader of the Statistical Cybermetrics Research Group at the University of Wolverhampton, which he joined in 1989. He is also Docent at the Department of Information Studies at Åbo Akademi University, and a research associate at the Oxford Internet Institute. His PhD was in Pure Mathematics from the University of Lancaster. His current research field includes identifying and analysing web phenomena using quantitative-led research methods, including altmetrics and sentiment analysis, and he has pioneered an information science approach to link analysis. Mike has developed a wide range of tools for gathering and analysing web data, including hyperlink analysis, sentiment analysis and content analysis for Twitter, YouTube, MySpace, blogs and the web in general. His 400+ publications include 244 refereed journal articles and two books, including Introduction to Webometrics. He is an associate editor of the Journal of the Association for Information Science and Technology and sits on three other editorial boards. For more information, see: http://www.scit.wlv.ac.uk/~cm1993/mycv.html
Instructor: Crystal Abidin, Jönköping University
Date: 22.08.
Description
As new platforms and technologies emerge, young people are inventing innovative ways to express ideas and communicate with their peers using mixed media on the internet. Most prominently, internet paralanguages that draw on non-lexical visual cultures are flourishing in mainstream, subcultural, and countercultural internet communities. They have been used to communicate sensitive information across networks under the radar of authoritarian censors during global social movements, and situated to demonstrate different coded meanings for different audiences by prominent internet users such as Influencers. In this session, participants will explore some of these internet paralanguages, and draw from their personal experiences of these communicative symbols. Through brief case studies, the session will demonstrate how we can systematically track and understand the emergence of internet paralanguages through ethnographic methods. Participants will be invited to apply the methodologies to their chosen case study for sharing with the class in short flash lectures, and to work in small groups to produce a short experimental film using only social media apps.
Reading list (please read any two):
-Ask, Kristine, and Crystal Abidin. 2018. “My life is a mess: Self-deprecating relatability and collective identities in the memification of student issues.” Information, Communication and Society. http://www.tandfonline.com/eprint/rf6nmFYkMWwSsnwss5T3/full
-Highfield, Tim. 2016. “Waiving (hash)flags: Some thoughts on Twitter hashtag emoji.” Medium.com. OA: https://medium.com/dmrc-at-large/waiving-hash-flags-some-thoughts-on-twitter-hashtag-emoji-bfdcdc4ab9ad#.vczn6qfgl
-Miltner, Kate M. 2014. “There’s no place for lulz on LOLCats: The role of genre, gender, and group identity in the interpretation and enjoyment of an Internet meme.” First Monday 19(8). OA: http://firstmonday.org/ojs/index.php/fm/article/view/5391/4103
-Stark, Luke, and Kate Crawford. 2015. “The Conservatism of Emoji: Work, Affect, and Communication.” Social Media + Society Journal 1(2). OA: http://sms.sagepub.com/content/1/2/2056305115604853.full
-Willard, Lesley. 2016. “Tumblr’s Gif Economy: The Promotional Function of Industrially Gifted Gifsets.” Flowjournal.org. OA: http://www.flowjournal.org/2016/07/tumblrs-gif-economy/
About the instructor
Crystal Abidin is an anthropologist and ethnographer who researches internet culture and young people’s relationships with internet celebrity, self-curation, and vulnerability. She is presently authoring two monographs on the history of blogshops and the Influencer industry. She obtained her PhD in Social Sciences (Anthropology & Sociology, Media & Communications) in 2016 from the University of Western Australia. Crystal is Postdoctoral Fellow with the Media Management and Transformation Centre (MMTC) at Jönköping University, Researcher with Handelsrådet (Swedish Retail and Wholesale Development Council), and Adjunct Researcher with the Centre for Culture and Technology (CCAT) at Curtin University. Crystal’s forthcoming book, Internet Celebrity: Understanding Fame Online (Emerald Publishing, 2018) critically analyzes the contemporary histories and impacts of internet-native celebrity today. Reach her at wishcrys.com.
Instructor: Yin Yin Lu, Oxford Internet Institute, Balliol College (University of Oxford)
Assistant: Kristiina Vaik, University of Tartu
Dates: 22.-23.08. NB! Remember to register for both days of the workshop!
Description
The goal of this two-day workshop is twofold: 1) to demonstrate how Python’s Natural Language Toolkit (NLTK) library can be used to analyse large volumes of textual data, and 2) to empirically detect themes in this data through topic modelling. We will begin by exploring fundamental corpus linguistics functions using NLTK: tokenisation, frequency distributions of keywords, part-of-speech tagging, n-grams, and collocations. This will allow for a descriptive understanding of the corpus (word categories and counts), which sets the stage for the detection of themes.
A ‘theme’ is essentially a collection of words; topic models assign themes to documents based upon the co-occurrences of words in the documents. They operate under a naïve ‘bag of words’ assumption: a document is defined by the distribution of its vocabulary across various themes; syntax (and thereby context) is not taken into consideration. That being said, this naïve model can generate powerful insights about a corpus of text that instigate further qualitative analyses. In this workshop, the canonical topic model ‘Latent Dirichlet Allocation’ (LDA) will be introduced. Results will be visualised through bar charts and the interactive pyLDAvis library.
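The ideas above (tokenisation, frequency distributions, n-grams, and the ‘bag of words’ assumption) can be illustrated with a minimal sketch. It uses only the Python standard library as a stand-in for NLTK; the function names `tokenise`, `bigrams` and `bag_of_words` and the two example sentences are hypothetical and are not part of the workshop materials.

```python
import re
from collections import Counter


def tokenise(text):
    """Lowercase the text and split it into word tokens (a crude stand-in for a real tokeniser)."""
    return re.findall(r"[a-z']+", text.lower())


def bigrams(tokens):
    """Return adjacent token pairs, i.e. n-grams with n = 2."""
    return list(zip(tokens, tokens[1:]))


def bag_of_words(text):
    """Count word occurrences, discarding word order as the bag-of-words assumption does."""
    return Counter(tokenise(text))


# Two hypothetical documents with the same vocabulary but different word order.
docs = [
    "The cat chased the mouse.",
    "The mouse chased the cat.",
]

bags = [bag_of_words(d) for d in docs]

print(bags[0].most_common(1))       # frequency distribution: the most frequent word
print(bigrams(tokenise(docs[0])))   # bigrams keep local word order
print(bags[0] == bags[1])           # the two bags are identical
```

The two sentences describe opposite events, yet their bags of words are equal, which is exactly the information about syntax and context that a topic model built on this assumption does not see.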
To attend and fully benefit from this workshop, participants should have basic knowledge of the programming language Python and its ecosystem, and should bring laptops equipped with Python 3.6 or higher. Installation through the Anaconda distribution (https://www.continuum.io/) is highly recommended, as it bundles together a range of open-source Python packages and libraries used in data analysis and scientific computing—including Jupyter Notebook, the web application that we will be using in the workshop to run our code. Alternatively, Python can be installed through a binary installer from the Python Software Foundation (https://www.python.org/), or through an operating system’s package manager (e.g., apt on Debian Linux and homebrew on macOS).
As noted above, it is essential that participants download not only Python 3.6 or higher, but also Jupyter Notebook (which is included in the Anaconda distribution). A very useful Quick Start Guide can be found here: https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/install.html. Participants who are unfamiliar with Jupyter should watch this 30-minute YouTube tutorial prior to the workshop: https://www.youtube.com/watch?v=HW29067qVWk
About the instructor
Yin Yin Lu is a final-year DPhil (PhD) Candidate at the Oxford Internet Institute and Balliol College (University of Oxford). She researches persuasion in the context of new media, focusing specifically on the rhetoric and resonance of Brexit tweets. Her multi-strategy design encompasses qualitative text analysis, multivariate regressions, semi-structured trace interviews, and natural language processing algorithms. She convenes the #SocialHumanities network at The Oxford Research Centre in the Humanities (TORCH), blogs from perrinewynkel.blogspot.co.uk, tweets from @Yinneth, and is a media commentator on online propaganda.
Instructor: Crystal Abidin, Jönköping University
Date: 23.08.
Description
The demand of the market today is that we present our online selves as consistent, recognizable, and easy to locate. In the climate of the “attention economy” that demands the public and constant sharing of our own lives and the consuming of others’, internet celebrities and Influencers are perhaps the epitome of living on the internet. But how can we best study such internet-native phenomena? Are traditional ethnographic methods rooted in anthropological tenets still relevant? In this session, we will learn the basics of conducting digital ethnography, such as participant observation in digital spaces, immersion within online communities, connecting discursive and semiotic threads through grounded theory, and applying content analysis to a viable sample of visual media. We will hone our skills in these methods through the vehicle of internet celebrities, in a lecture that introduces a brief history of internet celebrity culture and interrogates the different forms of internet celebrity around the world. You are required to actively engage in our class discussion, and to identify and take notes on the celebrity strategies presented in the lecture. Are there patterns pertaining to different persons, cultures, and societies? Participants will be invited to apply the concepts from the class to formulate their own DIY internet celebrity in small groups using the methodologies of digital ethnography.
Reading list (please read any two):
-Abidin, Crystal. 2016. “Aren’t these just young, rich women doing vain things online?: Influencer selfies as subversive frivolity.” Social Media + Society 2(2): 1-17. http://journals.sagepub.com/doi/full/10.1177/2056305116641342
-Abidin, Crystal. 2017. “#familygoals: Family Influencers, Calibrated Amateurism, and Justifying Young Digital Labour.” Social Media + Society 3(2): 1-15. http://journals.sagepub.com/doi/full/10.1177/2056305117707191
-Abidin, Crystal. 2017. “Vote for my selfie: Politician selfies as charismatic engagement.” Pp. 75-87 in Selfie Citizenship, edited by Adi Kuntsman. London: Palgrave Pivot. https://www.academia.edu/24830965/Abidin_Crystal._2017._Vote_for_my_selfie_Politician_selfies_as_charismatic_engagement._Pp._75-87_in_Selfie_Citizenship_edited_by_Adi_Kuntsman._London_Palgrave
-Abidin, Crystal. 2017. “Sex Bait: Sex talk on commercial blogs as informal sexuality education.” Pp. 493-508 in Palgrave Handbook of Sexuality Education, edited by Louisa Allen and Mary Lou Rasmussen. London: Palgrave Macmillan. DOI: 10.1057/978-1-137-40033-8_24 https://www.academia.edu/16152332/Abidin_Crystal._2017._Sex_Bait_Sex_talk_on_commercial_blogs_as_informal_sexuality_education._Pp._493-508_in_Palgrave_Handbook_of_Sexuality_Education_edited_by_Louisa_Allen_and_Mary_Lou_Rasmussen._London_Palgrave_Macmillan
-Abidin, Crystal. 2017. “Influencer Extravaganza: A decade of commercial ‘lifestyle’ microcelebrities in Singapore.” Pp. 158-168 in Routledge Companion to Digital Ethnography, edited by Larissa Hjorth, Heather Horst, Genevieve Bell, and Anne Galloway. London: Routledge. https://www.academia.edu/24831144/Abidin_Crystal._2017._Influencer_Extravaganza_A_decade_of_commercial_lifestyle_microcelebrities_in_Singapore._Pp._158-168_in_Routledge_Companion_to_Digital_Ethnography_edited_by_Larissa_Hjorth_Heather_Horst_Genevieve_Bell_and_Anne_Galloway._London_Routledge
About the instructor
Crystal Abidin is an anthropologist and ethnographer who researches internet culture and young people’s relationships with internet celebrity, self-curation, and vulnerability. She is presently authoring two monographs on the history of blogshops and the Influencer industry. She obtained her PhD in Social Sciences (Anthropology & Sociology, Media & Communications) in 2016 from the University of Western Australia. Crystal is Postdoctoral Fellow with the Media Management and Transformation Centre (MMTC) at Jönköping University, Researcher with Handelsrådet (Swedish Retail and Wholesale Development Council), and Adjunct Researcher with the Centre for Culture and Technology (CCAT) at Curtin University. Crystal’s forthcoming book, Internet Celebrity: Understanding Fame Online (Emerald Publishing, 2018) critically analyzes the contemporary histories and impacts of internet-native celebrity today. Reach her at wishcrys.com.
Instructor: Anto Aasa, University of Tartu
Assistant: Pilleriine Kamenjuk, University of Tartu
Date: 23.08.
Description
It is characteristic of the modern network and information society that we encounter data everywhere, and very often the data is big. Big data can be a by-product of some other ICT service: every use of ICT produces a large amount of data. This data describes the usage of certain services, but can also tell us about the user (frequency, location, etc.). In this workshop, we will look at how mobile data collection was used in the past, how it is used today and what the near future might bring. Today, mobile data collection is a very useful toolbox for mobility analysis. We will try to “make friends” with several different datasets (mobile positioning, GPS positioning, crimes, etc.). The main aim is to introduce participants to the nature of mobile data, its applications and prospects, as well as its disadvantages and shortcomings.
About the instructor
Anto Aasa is a Research Fellow in Human Geography at the University of Tartu. Profile on Estonian Research Information System.
Instructor: Anto Aasa, University of Tartu
Assistant: Pilleriine Kamenjuk, University of Tartu
Date: 24.08.
Description
Visualization is a powerful tool for presenting information quickly and clearly. This workshop gives an overview of the principles of scientific visualization. We look back in time to explore some important milestones in the history of visualization (good and bad examples).
Then we start with the practical work. For this, some software has to be installed on participants’ computers:
-R is a free software environment for statistical computing and graphics (link).
-RStudio is the premier integrated development environment for R (link).
During the practical work we try to find answers to several questions:
-Why and when should we visualize our data?
-Should we follow the traditional visualization language?
-How to visualize different data types?
-How to visualize volume, dynamics, trends, relationships?
-What about time?
-How to put data on a map?
-When is it good to use interactive visualizations or animations?
About the instructor
Anto Aasa is a Research Fellow in Human Geography at the University of Tartu. Profile on Estonian Research Information System.
Instructor: Néhémie Strupler, Walter Benjamin Kolleg (University of Bern)
Date: 24.08.
Description
Spatial data are everywhere: on television, in newspapers, in books, on computer screens, on mobile devices, and on plain paper maps. Researchers nowadays have the chance to fetch, mix and scrape spatial data from archives and online repositories in order to address and represent global problems. However, making a map that is suited to its purpose and does not distort the underlying data unnecessarily is not easy. By teaching how to use GIS (Geographic Information Systems) for Spatial Humanities, this workshop aims to provide the basics needed to independently analyse and represent spatial data.
This workshop is an introduction to analysing spatial data in R, specifically through making maps with R and various dedicated R packages. It will teach the basics of using R as a fast and powerful command-line Geographic Information System that can be integrated with other software (like QGIS).
The workshop is practical: you will learn how to use the software and how to load, manipulate, and visualise spatial data, and you will also gain a better overview of where and how to find spatial data for your research. No prior knowledge of R or spatial data analysis is required, but some experience with R will help. The workshop is a mixture of lectures and hands-on sessions. A personal laptop is needed.
Software:
-R (a detailed list of specific packages to install will be provided to participants beforehand)
-QGIS
Reading list:
-Jo Guldi, “What is the Spatial Turn?”, Spatial Humanities, http://spatial.scholarslab.org/spatial-turn/what-is-the-spatial-turn/
-It may be worth following an introductory tutorial for R, such as: Robert Hijmans, “Introduction to R” (http://rspatial.org/intr/index.html); Chapters 1 and 2 of “R for Data Science” by Garrett Grolemund & Hadley Wickham (http://r4ds.had.co.nz/); or the official “Introduction to R” (https://cran.r-project.org/manuals.html)
About the instructor
Néhémie Strupler has a PhD in Near Eastern Archaeology awarded jointly by the University of Strasbourg and the University of Münster (2016). Before coming to Tartu, Néhémie was a Postdoctoral Fellow at the Walter Benjamin Kolleg (University of Bern) and at ANAMED (Koç University). From 2014 to 2016 he served as IT Officer for the Istanbul Branch of the German Archaeological Institute and managed geospatial data from archaeological excavations. He was trained as an archaeologist, but at this point his research is as much about Data Science and Digital Humanities as about Archaeology. Néhémie is an Open Science and Free Software advocate, and he is enthusiastic about developing theory and methods capable of exploring data through open and reproducible standards.
Instructor: Natalia Levshina, Leipzig University
Assistant: Maarja-Liisa Pilvik, University of Tartu
Dates: 24.-25.08. NB! Remember to register for both days of the workshop!
Description
In this two-day hands-on workshop, participants will learn how to fit statistical models with R, perform diagnostics and interpret the results. It will cover the following topics:
-dichotomous/binomial logistic models with fixed and mixed effects,
-polytomous/multinomial logistic models,
-generalized additive models,
-introduction to Bayesian inference and logistic regression.
About the instructor
Natalia Levshina is an experienced teacher of statistics and quantitative linguistics, who has been teaching courses on different statistical topics at European universities and international summer and winter schools for advanced students. She obtained her PhD at the University of Leuven (Belgium) and is working as a corpus linguist in the ERC-funded project “Grammatical Universals” at Leipzig University at the moment. She has published a popular manual on statistics for linguists “How to Do Linguistics with R: Data exploration and statistical analysis” (2015).
Instructor: Jan Rybicki, Jagiellonian University
Date: 25.08.
Description
This workshop is aimed at introducing participants to the field of stylometry. An introductory lecture in the morning will show the main tenets and methods of the field, together with examples of research in authorship attribution and distant reading. In the following hands-on workshop, the participants will be acquainted with stylo, a package for the statistical programming environment R. This package is a way to avoid R’s steep learning curve so that humanists can easily perform advanced quantitative analyses of texts. While stylo has its own built-in visualization tools, the second part of the workshop will also introduce Gephi, a piece of network analysis software. In the afternoon session, the participants will be able to perform their own first analyses on their own collections of texts or on those provided for them, beginning with inputting electronic texts and moving through tokenization, distance measure calculation and cluster analysis, all the way to various modes of visualization. No programming skills are required!
If the participants wish to work on their own computers, they are strongly recommended to download and install R and Gephi (and check that they are functioning correctly on their computers).
-Gephi: https://gephi.org/
-Download link for sample text collection for first analysis: https://1drv.ms/u/s!AjWxtkrEXCa7hPVFz9Aw1AKGsNrvkA
If the participants plan to try out the new methods on their own texts, these should be in plain text (.txt) format, UTF-8 encoded. Preferably, the file names should follow the pattern: author_title_date.txt (keep the underscores). It makes sense to bring texts by at least five authors, at least two texts each (from short story to novel or full piece of drama).
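Participants who want to check their own text collection against these recommendations before the workshop could use something like the following sketch. It is written in Python rather than R purely for illustration; the function names, the regular expression and the example file names are assumptions made here and are not part of stylo or of the workshop materials.

```python
import re
from pathlib import Path

# Assumed reading of the pattern author_title_date.txt:
# three non-empty parts, none of which contains an underscore.
NAME_PATTERN = re.compile(r"^(?P<author>[^_]+)_(?P<title>[^_]+)_(?P<date>[^_]+)\.txt$")


def parse_name(filename):
    """Return (author, title, date) if the name follows the pattern, else None."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    return match.group("author"), match.group("title"), match.group("date")


def is_utf8(path):
    """Return True if the file at `path` decodes cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def check_corpus(folder):
    """List the names of files in `folder` that break either recommendation."""
    problems = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        if parse_name(path.name) is None or not is_utf8(path):
            problems.append(path.name)
    return problems


# Hypothetical file names, used only to show the behaviour of parse_name.
print(parse_name("Golding_LordOfTheFlies_1954.txt"))
print(parse_name("notes.txt"))
```

Calling `check_corpus("my_texts")` on a folder of your own (the folder name is again hypothetical) returns the files that need renaming or re-encoding; an empty list means every file passed both checks.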
Reading list:
Rybicki, J., Eder, M., Hoover, D. “Computational Stylistics and Text Analysis.” In Doing Digital Humanities. Practice, Training, Research. Eds Crompton, C., Lane, R.J., Siemens, R. Oxford: Routledge, 2016, 123-144. A preprint version can be sent to participants on request.
A number of preprint versions of stylometric papers are available at https://sites.google.com/site/computationalstylistics/. The following might be of particular interest:
About the instructor
Jan Rybicki is Assistant Professor at the Jagiellonian University in Kraków, Poland. He has written extensively on the application of quantitative methods in the study of literature, tracing the stylometric signals of authors, translators, genres and genders in literary texts in several languages. Together with Maciej Eder and Mike Kestemont, he is a co-author of the “stylo” package for R, which has become a well-known tool of stylometric analysis. He is also an active literary translator; he has translated some 30 novels from English to Polish by such authors as John le Carré, Kazuo Ishiguro and William Golding.
Instructor: Katrin Tiidenberg, Aarhus University / Tallinn University
Date: 25.08.
How to study moving targets? In a flow of other moving targets? These are increasingly pertinent questions for those of us studying social media. In this workshop we’ll discuss and experiment with one possible answer – the intertextual analysis of social media.
What does that even mean? We can, without a doubt, make fascinating observations by scraping or manually extracting social media images, texts, locations or hashtags and analyzing them as (self-) representations, focusing on what is on the image, what is said in the text, and what that may signify. However, an argument can be made for analyzing social media content in a way that reflects how it is produced, edited, viewed, shared, deleted and fought over – in streams of content, where what’s ours coexists with what is others’ and where visual, textual and hypertextual data are co-present in an assemblage called the post.
About the instructor
Katrin Tiidenberg, PhD, is an Associate Professor of Social Media and Visual Culture at the Baltic Film, Media, Arts and Communication School of Tallinn University, Estonia, and a post-doctoral researcher at Aarhus University, Denmark. She is the author of the forthcoming “Selfies, why we love (and hate) them”, as well as “Body and Soul on the Internet – making sense of social media” (in Estonian). Tiidenberg is a long-time member of the Association of Internet Researchers’ Ethics Committee, a founding member of the Estonian Young Academy of Sciences, and a second-time board member of the Estonian Sociology Association. She is currently writing and publishing on selfie culture, digital research ethics and visual research methods. Her research interests include visual self-presentation, sexuality, and normative ideologies as mediated through social media practices. More info at: kkatot.tumblr.com
She has been using and developing a method of analyzing social media content contextually: she treats visual material (images, videos), textual material (captions, comments, profile descriptions), and hypertextual material (hashtags) as intertextually relational.