Originally written in 2016, edited in Jan 2017.
I cannot turn a page in a newspaper or browse for news on the internet without reading (and rolling my eyes) about the emergence, reemergence, takeover, new era, new age, and deluge of Big Data Analytics, or about how machine learning algorithms like deep learning, or some other "cognitive" or "neuro" learning technique, are going to save the world. We have all heard some banalities being bandied about quite a bit…
- Big data is the next oil, the next soil, etc.
- Data matures like wine and applications like fish
- IoT is going to disrupt the way we live; it is a lot bigger than Big Data
- Artificial intelligence is going to take over the planet and all our jobs (Terminator style… well, not quite. I made that one up; the other three are real, by the way.)
My personal favorite rebuttal to all of these is “Not everything that counts can be counted, and not everything that can be counted counts.” This quote is said to have hung in Einstein’s office in Princeton (I am not sure whether that is true, but the saying makes sense). With all due respect to data scientists and other analytics professionals (I am one of them), can we all please go easy on the hype and not make it all sound so cheesy? It’s like the dotcom bubble, déjà vu all over again. Every startup I hear about is using some fancy ‘new’ algorithm, and every company is talking about how analytics will change the world. Sure, some will, and should, for the better.
Let’s get a few things in order. Analytics and data science have been around for decades. They were just known by different, less appealing names: statistics, optimization, computer science, algorithms, etc. Clearly, none of them sounds as appealing as “Data Science” or “Analytics.” We should all be thankful that industry as well as academia woke up and took notice of “smart decision-making,” and I guess some amount of branding was necessary for it to be taken seriously. Duly noted.
Now, can we get back to doing good work and stop selling snake oil? Otherwise, all of us end up sounding ridiculous, naïve, and quite frankly a little annoying. The field runs the risk of being turned into a sham by some used-car salesmen (no offense to them). Let me give you a personal anecdote. I approached a conference organizer (in India) about submitting a proposal to speak at a conference. He unabashedly sent me a brochure with a detailed price list showing how I could buy slots to talk about my ideas. Never once did he ask about my proposal, what the idea was, or even what the model / algorithm / application was. All he cared about was $$$.
The brochure even said I could pay extra to talk more (that is, buy an entire session). This is what knowledge in our world has come to. Who can sell the snake oil better… who can market things and make them sound better… who can come up with more cool-sounding jargon… who can create entire fake conferences where people pay to talk and ideas go to die. Sure, conferences cost money, but they need a rigorous review process, as KDD and most of the IEEE conferences have.
So how do we stop this madness? I have a few pointers that some of you may agree with. I have already spoken to a few serious data scientists and they share my views.
- Refrain from saying or posting anything unless it makes scientific sense (do not do it just to get more likes and shares on your social media feeds)
- Reputed websites should have strict editorial and review processes and not publish garbage
- Serious data scientists should refrain from giving talks where they simply pay for a slot without any formal review process
- When recruiting for your teams, consider hiring analytics professionals who are certified or who have demonstrable skills
If we do not value our own profession, trust me, no one else will. We will all end up looking like used-car salespeople.