A good book always begins well. We don’t have to wait for the good parts to come by. The Age of Cryptocurrency by Paul Vigna (who kindly acknowledged my tweet) and Michael Casey begins with a great quote from Mandela, who said: “Money won’t create success, the freedom to make it will.” This is especially true for more than half of the world’s population, in many parts of Asia, Africa, the Middle East, South America, Eastern Europe, and even several areas of the developed world. For all these people, the greatest excitement that cryptocurrency holds is the elimination of middlemen from the value exchange equation. All they need is an internet connection (even a spotty one at that) and off they go, free at last. Vigna and Casey demonstrate this with great examples from various regions of the world. Afghan women, who previously could not even open bank accounts, are now able to confidently earn money for their work in the form of bitcoins. Enterprising artists from Barbados are starting to use bitcoin as a medium to compete with artists from around the globe. The examples are endless and inspiring. I can imagine the euphoric feeling of these young entrepreneurs, finally unshackled from the fiat currency regimes, and the authors capture it well.
The authors use the example of buying a cup of coffee from a local vendor to illustrate the number of entities involved in making this seemingly innocuous transaction a success: 1) front-end processors, 2) acquiring banks, 3) card associations, 4) card-issuing banks, 5) payment processors, and so on. I think of myself as a relatively smart guy, but I never once thought about what goes on as I pay for my coffee each day. Around the world, such transaction fees are pegged at ~$250 billion a year. Almost all of these fees to the middlemen can be avoided by using bitcoin and the underlying technology. Isn’t that reason enough for us all to get behind some form of cryptocurrency, be it Bitcoin, an altcoin, or whatever else is perfected in a few years?
The authors ease readers into some of the more technical material, such as the mining process (including some cool flowcharts for the visual learners), 51% attacks, ASIC mining, hash rates, and distributed networks. [For those really interested in getting into the nitty-gritty, I recommend another good book, “Mastering Bitcoin” by Andreas Antonopoulos.] Vigna and Casey, however, keep the conversation focused on getting readers excited enough to read other books on the subject, without turning them off with the gory details of the underlying code and mathematics. The authors rightly note the transformational and truly disruptive nature of this technology. Ignorance about the technology may not be bliss. A great read; I highly recommend it. 4.5/5.0
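For readers who want a feel for what "mining" means before picking up a more technical book, here is a toy proof-of-work sketch. It is a simplification and not how Bitcoin itself is implemented: real Bitcoin applies SHA-256 twice to a binary block header and compares the result against a numeric target, whereas this sketch hashes a plain string once and looks for leading zero hex digits. The function names `mine` and `verify`, the `difficulty` parameter, and the sample block text are all made up for illustration.

```python
import hashlib


def mine(block_data, difficulty=4, max_nonce=10_000_000):
    """Try nonces until the SHA-256 hex digest starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    for nonce in range(max_nonce):
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
    raise RuntimeError("no nonce found within max_nonce attempts")


def verify(block_data, nonce, difficulty):
    """Checking a claimed nonce takes one hash; finding it took many."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


# Hypothetical block contents, used only to exercise the functions.
nonce, digest = mine("alice pays bob 1 coin", difficulty=3)
print(nonce, digest)
```

Each extra zero in the required prefix multiplies the expected number of attempts by 16, which is the intuition behind hash rates and the arms race toward ASIC hardware.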
Coined by Kabir Sehgal takes us on a journey through the history and future of the thing that makes the world go around: “Money.” Sehgal describes some of the interesting findings from the domain of neuroeconomics about how the acts of winning, losing, making, and donating money make us feel. The history of money is described quite well, spanning from Egypt to Greece and some civilizations in between. The story of the Yap and their heavy limestone discs has become pretty standard in any book on money, and it is to be found here as well. Some of the equations comparing money to a token of energy that is exchanged in a symbiotic relationship are quite fascinating.
Sehgal also takes us through the “Metallist” and “Chartalist” camps on money. Interesting philosophies such as the Faustian bargain (in economic terms) are explained, where soft money is introduced backed by metal that may be mined later. As history has shown time and again, it solves an immediate need and stabilizes economies, but has a downside later as society’s confidence in such money erodes. Quoting the author, “…soft money has demonstrated both great promise and peril.”
One of the very interesting aspects of the book is the journey we are taken on, from money being something physical, to something with value, and ultimately to how it is connected with “Karma” and the soul. A sizable chapter is dedicated to religion and money and what money signifies in different religions. Is it something to pursue? Is it something that one should relinquish to reach a higher spiritual state? Such questions are discussed, and answers are to be found in most world religions.
Now, on to some of the clear misses and some criticisms about the book:
The Bitcoin. A measly 4 pages are dedicated to this potential future form of money that the entire world could be using in a decade. If Sehgal were to release a second edition of this book, he would be wise to add a tad more than 4 pages.
Not much ink is devoted to Hinduism, which has treatises on money and economics. It is one of the few religions that actually says that earning money is an important aspect of human life, to be followed by renouncing it all at a later stage for spiritual attainment. Even the Vedas, which were written around 1200 BC, had discussions about money and how it should be earned. Obviously, focusing more on Hinduism and civilizations from the Indian subcontinent isn’t going to sell books across the globe, given that many in the rest of the world do not even know Hinduism exists (we are all still imagined to be riding elephants and charming snakes), so I see why that was glossed over.
A good read before you deep dive into all other aspects of money: 3.5/5.0
For all of us who have hit the proverbial “R” wall due to memory size limitations, H2O is a welcome relief. H2O (www.h2o.ai) is an open-source, in-memory, distributed machine learning platform.
H2O’s core code is written in Java. Inside H2O, a Distributed Key/Value store is used to access and reference data, models, objects, etc., across all nodes and machines. The algorithms are implemented on top of H2O’s distributed Map/Reduce framework and utilize the Java Fork/Join framework for multi-threading. [see: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/welcome.html]
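To make the Map/Reduce idea concrete, here is a minimal sketch of the pattern. This is not H2O’s code: H2O does this in Java across machines, while the sketch uses Python threads on one machine as a stand-in for nodes. The helper names (`chunked`, `map_partial`, `reduce_partials`, `distributed_mean`) are invented for illustration. Each "node" summarizes its own chunk of the data, and the small summaries are then combined, so the full dataset never has to sit in one place.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def chunked(values, n_chunks):
    """Split a list into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def map_partial(chunk):
    """Map step: each worker reduces its chunk to a (sum, count) pair."""
    return (sum(chunk), len(chunk))


def reduce_partials(a, b):
    """Reduce step: combine two (sum, count) pairs into one."""
    return (a[0] + b[0], a[1] + b[1])


def distributed_mean(values, n_chunks=4):
    """Compute a mean by mapping over chunks and reducing the partial results."""
    if not values:
        raise ValueError("values must not be empty")
    chunks = chunked(values, n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(map_partial, chunks))
    total, count = reduce(reduce_partials, partials, (0, 0))
    return total / count


print(distributed_mean(list(range(1, 101))))
```

The mean works here because (sum, count) pairs can be merged in any order; statistics that cannot be merged this way need a different decomposition.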
H2O does many things that R does: transformations, aggregations, etc. It also claims to have a rapidly expanding library for machine learning. The documentation is easy to follow, which is a big plus. Some of the world’s largest firms are quoted on H2O’s website as users of the product. H2O also includes an interesting suite of tools with cool-sounding names:
Sparkling Water (combining Spark and H2O…nice wordplay)
Steam (end-to-end AI engine to streamline deployment of apps)
Deep Water (state-of-the-art deep learning models in H2O)
I ran a random forest model with 500 trees on 1.8 million records, and it ran pretty quickly on my laptop. Obviously, the real computational power can be harnessed and experienced only when it is run on a large cluster with several nodes. The H2O billion-row machine learning benchmark for solving a logistic regression problem is said to take ~35 seconds on 16 EC2 nodes, and the performance supposedly gets better as more nodes are added (see: http://www.stat.berkeley.edu/~ledell/docs/h2o_hpccon_oct2015.pdf for a detailed performance assessment).
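The code behind that run is not shown here, so the following is only a sketch of what a 500-tree random forest could look like with H2O’s Python package. It assumes the `h2o` package and a Java runtime are installed, that the data lives in a CSV file, and that the target is a categorical column. The function name `train_h2o_forest`, its arguments, and the 80/20 split are illustrative choices, not a record of the actual experiment.

```python
def train_h2o_forest(csv_path, target, ntrees=500, seed=42):
    """Train an H2O random forest classifier on a CSV file.

    Imports are done lazily so this module loads even without h2o installed.
    """
    import h2o
    from h2o.estimators import H2ORandomForestEstimator

    h2o.init()  # start or connect to a local H2O instance
    frame = h2o.import_file(csv_path)

    # Assumption: the target is categorical, so treat this as classification.
    frame[target] = frame[target].asfactor()
    train, valid = frame.split_frame(ratios=[0.8], seed=seed)

    predictors = [col for col in frame.columns if col != target]
    model = H2ORandomForestEstimator(ntrees=ntrees, seed=seed)
    model.train(x=predictors, y=target,
                training_frame=train, validation_frame=valid)
    return model
```

A call such as `train_h2o_forest("records.csv", "label")` would use a hypothetical file and column name; pointing the same code at a multi-node cluster is a matter of how H2O is started rather than how the model is specified.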
All in all, H2O is a great alternative to try out as you crunch those extremely large datasets, where R cannot help.
How should organizations define an analytics strategy? There are three usual models that I have come across, with differing penetration, reach, and success. What works for one organization may or may not work for others; that is axiomatic.
The Consulting Model: In this model, a tight group of data science and analytics professionals works closely with business units to understand challenges and then design, develop, and deliver the analytics. As the name suggests, the approach is more consultative in nature (shorter-term projects). This approach delivers the most bang for the buck, since the team can achieve quick wins for business leaders and demonstrate the value of analytics, potentially leading to sustained consultative engagements with the business units. This model works well in organizations relatively new to analytics.
The Hub and Spoke Model: This model relies on a central hub or center of excellence (COE), which builds an entire team of data scientists, data engineers, and analytics professionals. Such hubs/COEs are given the mandate to serve as a clearing house for analytics in the organization. Many examples of this model exist in very large, data-mature organizations (IBM and Microsoft, among others). The spoke refers to small teams dispatched from the hub to design, develop, and deliver analytics. This provides a more sustainable approach for organizations dealing with external accounts, clients, and partners when deploying analytics. The COE will continue to serve as the delivery arm for the analytics since it has all the data, infrastructure, and personnel in one place.
The Embedded Model: This model has embedded analytics teams within business units. Usually, this is an approach taken by financial companies where specialized teams work with the business in continuously delivering insights. Obviously, this is not a scalable approach for organizations, albeit successful “locally.” This provides business units with analysts who do not have to be coached on the ins and outs of, say, quantitative trading strategies. However, it does have the limitation of ‘tunnel vision’ with respect to solving analytics challenges.
Obviously, there are organizations that use a mix of all of the above approaches or a subset of them. Personally, I believe the hub and spoke model works best, since it is in the interest of any organization to have a long-term vision of what analytics can do. If an organization as a whole wishes to be data-driven in everything it does, the hub-and-spoke / COE model is the way to go. It also allows expert generalists to be developed over time and to gain experience and expertise across multiple business functions. This may take time to set up, but I believe the investment is worth it. The age-old adage “If you build it, they will come” works!
Here’s the video from our talk at the Spark Summit in San Francisco, 2014, given while I was a Senior Data Scientist at Impetus (collaborative work with Dr. Vijay Agneeswaran, currently Director of Big Data Analytics at Sapient). [I am the guy in the white shirt, starting at the 1:45 mark.]
In the world of data science and analytics these days, we are all faced with the key question of which technique or methodology solves which problem. During many of the interviews I have conducted over the last several years, I have heard all sorts of fancy algorithms being paraded around, without candidates really understanding why they should be used. The most common ones I hear from candidates are: support vector machines without understanding what support vectors are, deep learning without understanding what neural networks are, random forests without understanding what really makes them random, and the naive Bayes classifier without understanding why it is called ‘naive.’
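As a quick illustration of the random forest question, the randomness comes from two places: each tree is trained on a bootstrap sample of the rows, and each split considers only a random subset of the features. The sketch below shows just those two sampling steps, not a full forest; the function names and the square-root default for the subset size are illustrative choices (the square root is a common default for classification).

```python
import math
import random


def bootstrap_indices(n_rows, rng):
    """Randomness #1: sample row indices with replacement, once per tree."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def random_feature_subset(n_features, rng, k=None):
    """Randomness #2: pick k distinct candidate features, once per split."""
    if k is None:
        k = max(1, math.isqrt(n_features))
    return sorted(rng.sample(range(n_features), k))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows = bootstrap_indices(10, rng)
features = random_feature_subset(16, rng)
print(rows)      # some row indices repeat, others never appear
print(features)  # 4 of the 16 feature indices
```

Because every tree sees different rows and every split sees different features, the trees make different mistakes, and averaging them is what reduces variance.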
The package-driven languages of data science, like R, Python, SAS, etc., have made it extraordinarily easy for people to use all these complex algorithms without actually understanding the underlying statistics, mathematics, and optimization principles. This is the famous “hammer and nail” analogy: when you have a hammer, everything looks like a nail. It has become commonplace for people to use R or Python to try every algorithm and simply pick the one with the highest accuracy measures, without really understanding what the business and the problem need. Not all problems require deep learning. No really, they don’t! Some of the common challenges in the insurance industry, for example, may simply need association rule mining or decision trees. Some may need more complex modeling and simulation for risk-based analyses.
One of the first exercises I used to give to my doctoral assistants in research or my team in industry was to code an entire algorithm without using any packages in R or Python. This gives candidates a deep understanding of the internal workings of algorithms. Data Science and Analytics are part art and part science. Use them wisely. Your goal is to solve a business challenge and drive business value, not to show off what technique is the latest and greatest trend on social media. Do not use a chainsaw where a scalpel will do.
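As one possible version of that exercise, here is a sketch of a categorical naive Bayes classifier written with only the Python standard library. It is a teaching sketch under stated assumptions, not a production implementation: features are categorical, Laplace smoothing with `alpha` greater than zero handles unseen values, and the tiny weather-style dataset is made up. The "naive" part is visible in `predict`, which simply adds up per-feature log-probabilities as if the features were independent given the class.

```python
import math
from collections import Counter, defaultdict


def fit(rows, labels, alpha=1.0):
    """Count class frequencies and per-feature value frequencies per class."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            vocab[j].add(value)
    return {"class_counts": class_counts, "value_counts": dict(value_counts),
            "vocab": dict(vocab), "alpha": alpha, "n": len(labels)}


def predict(model, row):
    """Return the class with the highest log prior plus summed log likelihoods."""
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, count in model["class_counts"].items():
        score = math.log(count / model["n"])  # log prior
        for j, value in enumerate(row):
            seen = model["value_counts"].get((j, label), Counter())[value]
            n_values = len(model["vocab"].get(j, ()))
            # The "naive" step: each feature contributes independently.
            score += math.log((seen + alpha) / (count + alpha * n_values))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up toy data: (outlook, temperature) -> play outside?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("overcast", "hot"), ("overcast", "cool")]
labels = ["no", "no", "yes", "yes", "yes", "yes"]
model = fit(rows, labels)
print(predict(model, ("sunny", "hot")))
```

Writing even this much by hand forces the questions a package hides: what happens with a value never seen in training, and why logs are summed instead of probabilities multiplied.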