Comment

Everybody Lies

Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are
Sep 09, 2017 nrizkalla rated this title 5 out of 5 stars
This book explains in a very interesting way how BIG DATA from the internet can be analyzed and put to use in many domains of our lives (health, politics, education, sexuality, history... you name it!). It starts with an important distinction in truthfulness between data collected from social media sites (e.g. Facebook) and data from search engines (e.g. Google search). If, according to the author, "Facebook is digital brag-to-my-friends-about-how-good-my-life-is", it cannot be a reliable source of who we really are. What people search for on Google, on the other hand, is more truthful. A funny illustrative example: the top phrases wives associate with their husbands on FB are 'the best' and 'my best friend', while in Google searches they are 'gay' and 'a jerk'! So, what makes BIG DATA from the internet so useful and unique for understanding human social interactions (versus classical research such as surveys)? First, it is more truthful (e.g. does racism really not matter in political choices in America?). Second, because of its great volume, it allows us to delve into very specific subsets of geography or population segments. For example, by analyzing searches for some key symptoms we can tell that an epidemic is occurring somewhere, or we can learn the sexual preferences of middle-aged women living in rural areas. Third, it allows for experimentation in a very fast and cheap way (the A/B experiments obviously run on us every day by Google and FB). Fourth, it lets us look at new data we would never have thought to seek with regular research (for example, when the US became a truly united country, judged by when people started writing "the United States is" rather than "the United States are"!). The book is loaded with interesting examples to illustrate these points and it is fun to read. It also touches on the ethical implications of this revolution in data science.
It concludes with a BIG DATA analysis of the percentage of people who finish a book (only 3% for a serious book like Capital in the 21st Century and 7% for Thinking, Fast and Slow, but more than 90% for a novel like The Goldfinch!). This is a very important book for understanding the world we are living in now, and how data scientists are using the information we post and type every day on the internet!