big data drama saga: facebook (mis)steps in deep datamining doo-doo

a. bigdata
b. facebook study
c. flak/ fallout/ backlash
d. stackexchange
e. casestudy/ areas/ niches
f. nuance/ analysis/ questions/ critiques/ reaction/ skeptical/ pushback
g. books/ papers/ edu
h. kaggle
i. data scientist
j. big picture/ trends
k. crypto
l. nsa
m. psyop

hi all. datamining was an early topic on this blog mainly with Nate Silver and politics. have been collecting datamining links for ~1½ yr now and holy cow, its a massive, copious/bulging haul this time. the topic has exploded into the mainstream culture. you know that when it hits USA Today, New Yorker, and Financial Times, nearly everyone has now heard of it.[f] its great to see it finally really pop into highly deserved exposure after years of being rather low-key and low-profile. ❗ 😀 😎

datamining is one of those geek topics that started to click with me in the mid 2000s, and it also hit a sort of escape velocity with the $1M netflix prize contest.

a sign of its maturity is that the media is not merely going into breathless accounts of its wonders but also looking at it in a more than 1-dimensional way. its occasionally going 2-dimensional with feverish fearmongering accounts of its downsides. (just kidding) … 🙄

we are seeing more nuance/ analysis/ questions/ critiques/ reaction/ skeptical/ pushback. a free press in action. and note the mass agglomeration of the media content is yet another dimension of Big Data.

have always wanted & dreamed of a datamining job, and they exist now, but it also seems like a bit of a tricky occupation. what if you are hired to do datamining on data that is very noisy? there is so much hype, dataminers/data scientists are seen as near-magicians right now, and its considered a very hot emerging job.[i] it will take years for the field to stabilize. and most dataminers probably have a statistics background anyway, right? but universities are racing to come up with curriculums and degree programs. some of the first comprehensive books on the subject are being written.[g]

⭐ ⭐ ⭐

the big news that merits this new datamining blog launch/ unleashing after so much time is the Facebook imbroglio.[b] basically they did a scientific study on whether users emotions could be influenced by the feed selection algorithm. when the paper was published it was met with huge flak/fallout/backlash including screechings over the terms of service (which were already very broad), the lack/breach of ethics of the experiment, a possible lawsuit, stinging editorials, possible government investigations by British (ICO) and/or US (FTC) agencies, a response by facebook executive Sandberg, etc! [c] 😡

of course there is a lot of precedent for this based on Facebooks history, where years ago even Zuckerberg had to face the klieg-light glare over his missteps in this area. 👿

theres already a ton written on this subj & likely far more to come. cant really add a whole lot but do want to highlight an angle that is not widely reported in the media.

[m] one of the researchers involved in the facebook study is Cornell University’s Jeffrey T. Hancock who has also been funded by and participated in the Pentagon’s Minerva project which “seeks to provide authoritative knowledge on social movement mobilisation and contagions”.

now of course these studies have nice vanilla whitewashed headings, but those who understand the military will understand its basically research into what is known as “psychological operations”. this involves areas such as disinformation warfare waged on an enemy, but increasingly in the US, we see that the lines blur between foreign warfare and the domestic “police state lite”. o_O

that was hinted at in my last post on datamining and big data, which was posted before the eyepopping Snowden revelations from last summer. so it looks like in a brief time the world has (so to speak) gotten a really big hard look at the big dark side of big data. 😈

these are topics that somewhat fall in that taboo trifecta of “religion, politics, sex” that are generally not discussed in polite company, but here on this blog, in case you hadnt noticed, its always been “no holds barred”. and it sincerely believes in and practices that 1960s bumper sticker philosophy of “subverting the dominant paradigm”.

the truth is clearly that the NSA has been working in “Big Data” for decades.[l] there is not a very clearcut stream of “technology transfer”, in contrast with other govt agencies like NASA. NSAs technology is not necessarily making its way into civilian use directly due to the high security associated with it.

the NSA [which was not all that long ago reorganized/transferred into the political jurisdiction of the Pentagon] does not necessarily have better technology either, but in the post-911 testosterone-soaked swelling/“tumescence” they just have mountains of cash to spend on it, and note based on whistleblower accounts that quite a bit of that is wasted in bureaucratic black holes.

there is some hope that cryptography technology could be an answer to NSAs overreach but suggest not holding your breath on that one.[k] it does indeed have nearly miraculous/nearly-revolutionary properties at times eg with the rise of Bitcoin, but it also seems to have serious limitations and it has long seemed to me maybe our biggest problems with NSA are political and not technological.

⭐ ⭐ ⭐

other emerging topics. Stackexchange has an awesome open data policy and it has already led to major use and study of its data even in officially sponsored conferences such as MSR, International Working Conference on Mining Software Repositories (2013).[d]

another great success story is the birth and rapid rise of Kaggle, creating and harnessing an entire community.[h] (covered that recently in the Higgs ML competition.) there is a large crowd of data scientist geniuses spread out all over the world and this site is a hub for some tremendous focused energy and brainpower. am expecting major success stories to continue to be associated with Kaggle for both public and private challenges. 💡 ❗

so yeah esp recently the MSM hyperventilates at times over the subject of big data and datamining but there are many other great/ inspiring stories of casestudies/ areas/ niches reverberating all over the place nowadays for anyone who wants real evidence of its gamechanging, technology-disrupting, paradigm-shifting nature.[e] it has the potential to make huge crosscutting impacts on multiple fields eg biology, physics, computer science, government/ politics, sociology/ psychology, health care, etcetera. the future is so bright it has to wear shades. [j] 😎 ❤ ⭐

a. bigdata

b. facebook study

c. flak/fallout/backlash

d. stackexchange

e. casestudy/areas/niches

f. nuance/analysis/questions/critiques/reaction/skeptical/pushback

g. books/papers/edu

h. kaggle

i. data scientist

j. big picture/trends

k. crypto

l. nsa

m. psyop
