Thinking Big presentation

Slides and text of this presentation
Slide 1
Slide description:
Thinking Big: An Introduction to Big Data


Slide 2
Slide description:
About Me
 Shawn Hermans
 Data Engineer/Scientist
 Technology consultant
 Physics, math, data geek

Slide 3
Slide description:
About this Talk
 Non-technical introduction to Big Data
 Not focused on any technology or platform
 Focus on concepts

Slide 4
Slide description:
Should you believe the hype?

Slide 5
Slide description:
Big Data Promises
 No need for scientific method
 Predict disease outbreaks before the CDC
 Cure cancer
 Innovating healthcare
 Solve world hunger
 Bring about world peace

Slide 6
Slide description:

Slide 7
Slide description:
Big Data Criticism
 Garbage in, garbage out
 Ignores the role of the scientific method
 Lots of questions don’t require large amounts of data to get good stats
 Privacy issues
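
The point that many questions don't need large amounts of data can be illustrated with the standard margin-of-error formula for a sampled proportion. The sketch below is not from the slides: the function name `margin_of_error`, the worst-case proportion `p=0.5`, the 95% confidence value `z=1.96`, and the normal approximation are all assumptions chosen for illustration.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion estimated from a
    simple random sample of size n (normal approximation, assumed here)."""
    return z * math.sqrt(p * (1 - p) / n)


# Precision improves only with the square root of the sample size,
# so going from a thousand rows to a million buys relatively little.
for n in (100, 1_000, 10_000, 1_000_000):
    print(f"n={n:>9,}: +/- {margin_of_error(n):.4f}")
```

Under these assumptions, a sample of about 1,000 already gives roughly a 3 percentage point margin of error, which is often enough to answer the question.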

Slide 8
Slide description:
Big Data is just another way to think about data

Slide 9
Slide description:
Mental Models
 “A mental model is simply a representation of an external reality inside your head. Mental models are concerned with understanding knowledge about the world.” - Farnam Street Blog

Slide 10
Slide description:
Examples
 Occam's razor
 Mind maps
 Law of supply and demand
 Never get in a land war in Asia

Slide 11
Slide description:
All models are wrong, but some are useful

Slide 12
Slide description:
Relational Resistance
 Resistance to big data concepts, technologies, and techniques because of the belief that the relational model is the only way to think about data.
 See also: Theory-induced blindness

Slide 13
Slide description:

Slide 14
Slide description:
Data Mental Models
 Relational
 Linked
 Object Oriented
 Geospatial
 Temporal

Slide 15
Slide description:
What is Big Data?

Slide 16
Slide description:
According to Gartner
 “Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.”

Slide 17
Slide description:
According to Me
 Big data is the Bazaar to traditional data’s Cathedral

Slide 18
Slide description:
Cathedral and Bazaar
 Traditional Data
 Clean
 Top down
 Carefully collected
 Scales vertically
 One true way

Slide 19
Slide description:
Big Data Differences
 Relational
 Normalization
 ACID
 SQL/Query
 Structured/Schema

Slide 20
Slide description:
Integrating all available data is the promise of Big Data

Slide 21
Slide description:
Why should you care?

Slide 22
Slide description:

Slide 23
Slide description:
Information as an Asset
 Target specific customers' needs rather than broad segments
 Just-in-time inventory management
 Evaluating demand for product
 Predict and track traffic patterns

Slide 24
Slide description:
Big Data and You
 What information do you have that no one else has?
 Can you easily integrate your data or is it locked in silos?
 What data don’t you collect?
 What data don’t you archive?

Slide 25
Slide description:
Big Data Technology

Slide 26
Slide description:
Big Data Platforms
 Cloud
 AWS
 Google
 Microsoft

Slide 27
Slide description:
Big Data Stack
 Batch Processing
 Data Collection
 SQL/Query
 Search
 Machine Learning
 Serialization
 Security

Slide 28
Slide description:

Slide 29
Slide description:
What about data science?

Slide 30
Slide description:
What IS Data Science?
 Data science is statistics on a Mac
 A data scientist is a statistician who lives in San Francisco
 Person who is better at statistics than any software engineer and better at software engineering than any statistician.

Slide 31
Slide description:

Slide 32
Slide description:
The need for Data Science
 There is a LOT of data
 Too much data for people to look at it all
 Probabilistic models help extract signal from the noise
 Need to automate the analysis and exploitation of data

Slide 33
Slide description:
Big Data has its limits

Slide 34
Slide description:
Black Swans and Big Data
 There are fundamental limits to prediction
 Hard to predict rare events where no prior data exists (i.e., Black Swans)
 Complex systems often have feedback loops (e.g., stock market)

Slide 35
Slide description:
What’s next?

Slide 36
Slide description:
Getting Started
 Business
 Identify some unresolved questions
 Figure out what data could answer those questions
 Pick the easiest and test out your hypothesis

Slide 37
Slide description:
My Info
 Twitter: @shawnhermans
 GitHub: github.com/shawnhermans
 Blog: http://shawnhermans.github.io/ (In Progress)
 SlideShare: www.slideshare.net/shawnhermans/
 Quora: http://www.quora.com/Shawn-Hermans

Slide 38
Slide description:
Backup Slides

Slide 39
Slide description:

Slide 40
Slide description:
The Fourth Quadrant and the Failure of Statistics

Slide 41
Slide description:
Soothsayer
 Simple HTTP/JSON API for training/classifying data
 Lots of built-in classifier statistics

Slide 42
Slide description:

