These are slides from the first week, introducing some general (business-oriented) concepts of big data, and ending with a (non-business-oriented) case study in using techniques from data analytics to make sense of a complicated idea in literature.
This week, we're working with literary metadata from Project Gutenberg: litgenres.csv. I've added an additional set of slides below that introduces graphics in R for beginners. Please let me know if you have any questions!
These visualizations use the built-in graphics commands in R, a good way to begin to get comfortable seeing your results.
If you're feeling comfortable and ready for a challenge, you might try visualizations using ggplot2. It is slightly harder to get started with, but once you're past that first hurdle, some of its commands may be easier to learn.
This week, we're working with data from politicians' deleted tweets, archived by Politwoops and ProPublica: politwoops.csv. Please let me know if you have any questions!
These resources are for week 10, as you work through your project applying some element of sentiment analysis to social networking data from search terms you choose. They show an example of the thought process that goes into such a project, and they include a sample report.
These web pages are supplemental to the slides, in case a student missed class.
To make sure everything is easy to follow, these files show the project report in three different formats. Please keep in mind that this isn't an example of the best possible report as much as it is an example of one of many ways to present a good report.
This week, we're beginning to incorporate geospatial data using ArcGIS. These slides offer a broad overview of the concepts of GIS ("Geographic Information Systems"), with some case studies as examples, including a YouTube video in which a Bucknell University student explains his research with GIS.