CS 112: Introduction to Big Data

Week 1

These are slides from the first week, introducing some general (business-oriented) concepts of big data, and ending with a (non-business-oriented) case study in using techniques from data analytics to make sense of a complicated idea in literature.

Introductory Slides

Week 4

This week, we're working with literary metadata from Project Gutenberg: litgenres.csv. I've added another set of slides, below, on getting started with graphics in R. Please let me know if you have any questions!

Slides on Special Data Types in R

Slides Introducing Visualization

These visualizations use the built-in graphics commands in R, a good way to begin to get comfortable seeing your results.
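As a minimal sketch of what the built-in graphics commands look like, the example below uses R's bundled mtcars dataset as a stand-in, since the columns of litgenres.csv are not described on this page. The choice of dataset and columns is an assumption for illustration only, not the example used in the slides.

```r
# mtcars ships with R; it stands in for the course data (an assumption).
data(mtcars)

# Count how many cars have each number of cylinders.
cyl_counts <- table(mtcars$cyl)

# Bar plot of those counts, using only base graphics.
barplot(cyl_counts,
        main = "Cars by number of cylinders",
        xlab = "Cylinders",
        ylab = "Number of cars")

# Histogram of a single numeric column.
hist(mtcars$mpg,
     main = "Distribution of fuel efficiency",
     xlab = "Miles per gallon")

# Scatter plot of two numeric columns against each other.
plot(mtcars$wt, mtcars$mpg,
     main = "Weight vs. fuel efficiency",
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon")
```

Each command draws a complete plot in one call, which is part of why base graphics are a comfortable place to start.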

Slides on Intermediate Visualization

If you're feeling comfortable and ready for a challenge, you might try visualizations using ggplot2. It is slightly harder to get started with, but once you are past that first hurdle, some of its commands may be easier to learn.
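For a sense of how ggplot2 differs, here is a minimal sketch of a bar chart built in layers. It assumes the ggplot2 package is installed and again uses R's bundled mtcars dataset as a stand-in for the course data; both are assumptions made for illustration.

```r
# Assumes ggplot2 is already installed, e.g. via install.packages("ggplot2").
library(ggplot2)

# mtcars ships with R; it stands in for the course data (an assumption).
# Start with the data and the column to plot, then add layers with "+".
p <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +
  labs(title = "Cars by number of cylinders",
       x = "Cylinders",
       y = "Number of cars")

# Printing the plot object draws it.
print(p)
```

The plot is stored as an object and built up piece by piece, so changing the chart type or labels usually means swapping one layer rather than rewriting the whole command.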

Weeks 6 and 7

For these two weeks, we're working with data from politicians' deleted tweets, archived by Politwoops and ProPublica: politwoops.csv. Please let me know if you have any questions!

Slides on Lists, Loops, and Functions

Slides on Sentiment Analysis with syuzhet in R

Week 10

These resources are for week 10, as you work through your project applying some element of sentiment analysis to social networking data from search terms you choose. They show an example of the thought process that goes into such a project, and they offer a sample report.

Slides on the cycle of planning and implementing analysis

Further detail of process

These web pages are supplemental to the slides, in case a student missed class.

Project report

To make sure everything is easy to follow, these files show the project report in three different formats. Please keep in mind that this isn't an example of the best possible report as much as it is an example of one of many ways to present a good report.

Week 12

This week, we're beginning to incorporate geospatial data using ArcGIS. These slides offer a broad overview of concepts of GIS ("Geographic Information Systems"), with some case studies as examples, including a YouTube video in which a Bucknell University student explains his research with GIS.

Introductory Slides

Week 13

Poetry Contest and SWAC Football

Major Crops Grown in Each County