18 Jul Data Coffee Talk – Interview with Adam Votava, Founder and Chief Data Scientist at aLook Analytics
In today’s Data Coffee Talk, Pavel from Keboola Singapore talks to Adam Votava, founder of the data science company aLook Analytics, about analytics, data science and data visualisation, living as a digital nomad, and data as the most valuable asset for companies.
“The data is more important than the algorithms.”
K: Adam, give us the elevator pitch of aLook Analytics.
A: We are a young data science consulting team with a business-driven attitude. We work with clients across industries and specialize in smaller companies (read: smaller than Fortune 1000). We don’t restrict ourselves to any vertical, because we believe experience transfers between different environments. What we do for a client in the e-commerce space might be used in a different way for a client in manufacturing, with surprising benefits.
K: As the chief data scientist, what was your journey and what’s your background?
A: I studied law and, simultaneously, statistics at a technical university. I originally joined a law firm and very quickly realised that was a no-no. I moved to a large bank, starting as an intern in what was ultimately the data mining team, and over my six-year tenure I grew into the role of manager of the data science team. Then my wife got an opportunity to move to Tokyo, and I started playing with the idea of starting my own business.
K: You live and work in Japan. Where are your clients based?
A: It’s all driven by our lifestyle. We want to be agile in our life, and I want to have time dedicated to my biggest and very time-consuming hobby, Ironman triathlon. We are in Japan for a short time, so we don’t have much of a network there. Today we have clients in Europe and in Africa. Our team is very distributed: we have people in the Czech Republic as well as in Ecuador.
K: Keboola functions very similarly in the APAC region. The benefits are quite obvious, but I am curious: are there any negative aspects of this way of working?
A: It’s hard, but worth it. All our work is done online and we work across several time zones, so we have to be there when our clients need us, which requires a degree of flexibility on our side. The second aspect is that, being 100% online and a very young company, we have to make sure we do an absolutely stellar job to win the trust of our clients. We are small, so we can rely mainly on word-of-mouth marketing and personal recommendations from our clients.
K: I would like to stay on the topic of Japan for a while. Tell me, how do you find the data science market there? Japan is a very closed market. How is it different from Europe?
A: The most striking advantage Japanese data scientists have is that they are very technically skilled. They know their tools very well and they are fantastic programmers. On the other hand, they are not business driven and lack a certain agility, creativity, fast thinking and improvisation. They also seem to be quite far from the business itself. In summary: the craftsmanship is great and the quality of the delivered products is superb, but you will rarely find an innovative approach.
K: What sort of qualifications should someone interested in data science have? Are statistics and mathematics more important than programming?
A: When we look for data scientists, the technical requirements are, of course, a working knowledge of data analytics tools and statistical software, as well as coding ability and expertise in machine learning algorithms. What’s more important, however, is an appetite for learning new things, because we really do work with clients from different industries. Technical education is a plus, since we most often work with open-source tools like Python or R, which people can learn early at school. This tool set is great for lots of smaller companies that can’t afford to buy IBM Modeler, for example. Other really important skills are data preparation and data visualisation.
K: We will get back to data preparation. One trend we see is a surge in demand for data science services. Where do you think this will go?
A: In my opinion, only large companies will be able to retain data scientists full time in the long run, because only the large ones will be able to create an environment where data scientists will not get bored. For small companies, the on-demand space is where I see lots of potential for companies like aLook.
One interesting pattern is management’s distrust in many organisations, rooted in a lack of understanding and in the complexity of data science. It gets easier when you talk to businesses facing extremely competitive margins or turbulent market changes.
I would draw an analogy. If you are rushed into the ER, you will not care about understanding the procedures or the names and qualifications of all the doctors; you just want your life saved. The same applies to the application of data science in extreme market situations.
K: What about the use of data science in smaller companies?
A: I have to say it’s easier to use data science to its full potential in smaller companies, as the processes, decision making and political climate are typically easier to deal with. They also tend to be readier to take risks, as a big corporation presumably always has more to lose. Let me clarify: I don’t classify companies as small or large by number of employees so much as by the size, volume and diversity of their data. Small companies worry less about the cloud and can leverage new and up-and-coming technologies, which gives them the advantage of a much faster time to value. I will go as far as to say that for small and starting e-commerce businesses, their data, in the cloud, is their only valuable asset.
K: Measuring quality in data science is a tough one for most business people or domain experts outside of data science. How can we do that? Is there anything other than the mythical ROI?
A: I understand data science as statistics applied to business. You can measure the quality of the model, the quality of a prediction or the probability reached, and with self-learning models you need to evaluate quality over time so you can see that the model doesn’t learn ‘bad things’. Business people don’t tend to understand this, so you must simply demonstrate that, based on the information you or your model provides, they are able to make better decisions than yesterday. The ideal business stakeholder understands that there is more information in the data than they have available and that it can help them make better decisions at scale.
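As a minimal sketch of the "quality evaluation in time" idea mentioned above, the Python snippet below scores a model's predictions period by period and flags any period whose accuracy falls noticeably below a baseline. The function names, the baseline, the 0.1 tolerance and the toy outcomes are all illustrative assumptions, not aLook's actual method.

```python
from typing import Dict, List, Tuple

# Hypothetical helper names and thresholds, chosen for illustration only.


def accuracy(pairs: List[Tuple[int, int]]) -> float:
    """Share of (predicted, actual) pairs that match."""
    if not pairs:
        raise ValueError("need at least one prediction to score")
    hits = sum(1 for predicted, actual in pairs if predicted == actual)
    return hits / len(pairs)


def flag_degraded_periods(
    results_by_period: Dict[str, List[Tuple[int, int]]],
    baseline: float,
    tolerance: float = 0.1,
) -> List[str]:
    """Return periods whose accuracy is more than `tolerance` below `baseline`."""
    flagged = []
    for period, pairs in results_by_period.items():
        if accuracy(pairs) < baseline - tolerance:
            flagged.append(period)
    return flagged


# Toy (predicted, actual) outcomes per period; made up, not real data.
history = {
    "month-1": [(1, 1), (0, 0), (1, 1), (0, 0)],
    "month-2": [(1, 1), (0, 0), (1, 0), (0, 0)],
    "month-3": [(1, 0), (0, 1), (1, 1), (0, 1)],
}

degraded = flag_degraded_periods(history, baseline=0.8)
print(degraded)
```

In practice the metric would likely be something richer than plain accuracy, but the shape of the check stays the same: score each period separately and compare it against an agreed baseline.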
K: What are the prerequisites for making this work?
A: To make data science work, you need three key roles in the process. I call it the “holy data trinity”, and without it, it will not work. First, you need someone who has domain knowledge and understands what the business needs; second, someone who is able to create data science models; and finally, someone who ‘owns’ and understands the data itself and is able to prepare it in the desired shape and form. We typically come in as the data science expert and work with the business problem owner and with someone who manages and consolidates the data itself in the Keboola Connection platform or another environment.
K: I want to poke the ‘BIG DATA’ hype. Is it all about petabytes, or are the size of the data and its potential usefulness to the organisation perhaps inversely related?
A: From a data science perspective, it is quite easy. The quality and performance of our models are directly linked to the quality of the underlying data, and quality has nothing to do with size. If we are able to bring more diverse data sets into the solution, we will by definition enable better decision making and have a more precise and better-performing model.
K: Can you share an example?
A: Typically, when we created propensity models in the bank to figure out which group of our customers had a higher probability of being interested in a new product, we would work with the data available to us: basic demographics, but also past purchasing history. These models would get better if we were able to model in other features, for example online behaviour: understanding what ads customers click on, what type of content they are interested in reading, and so on.
All of this would then be taken into consideration and contribute to better-performing models. It’s essentially about breaking down the functional silos and being able to bring in more relevant data from different sources.
K: Data preparation and data hygiene are something we at Keboola talk about on a daily basis. What’s a data scientist’s take on this?
A: Oh, data preparation is fundamental. Recalling the ‘holy data trinity’ I mentioned earlier, it is really helpful when new clients already have their data in good shape and prepared for analysis. This is crucial because data science needs a different data structure than reporting does. I don’t mean just raw and aggregated data; I mean knowing you don’t need to go back to the data source and verify the data is correct. Ideally the client has someone who understands the data, a data specialist, or has a transparent data integration platform like Keboola Connection in place. This greatly influences how fast we can get our job done. And time is money.
K: Is this the norm?
A: Not at all. The worst example, and actually a very common one, is a company that has data scattered across different data warehouses and storage systems, or even sitting directly in the data sources, without any structured integration.
K: Do you see data integration as an important function in an organisation’s IT structure?
A: I see every day that ‘data paralysis’ applies not only to the size of a data set but also to the variety of data sources and IT’s inability to integrate these new sources of data on the fly. The number of data sources is always growing, with new ad platforms and measurement tools being introduced every year. So yes, data integration as part of the IT scope is increasingly important to enable business people to do their jobs.
K: aLook Analytics is a Keboola partner. I usually describe your offering as data science as a service. Do you identify with that?
A: Yeah, that’s a good way to put it. What we are really trying to do is address the opportunity for data science in mid-size companies and startups. We often see companies bringing a serious data scientist on board and burying them in mundane reporting. This typically soon results in the specialist feeling that their skills are unappreciated and leaving. We address this issue with an on-demand offering.
K: We are right now working on a data integration project for an e-commerce customer where, in the first stage, we are helping them get all their relevant data integrated in one place and automate their reporting. What would be the first couple of ways data science could help here?
A: Assuming that some level of personalised marketing (like a recommendation engine or direct campaigns) is in place, I would definitely look first to evaluate the performance of marketing channels and their impact using algorithmic multi-channel attribution, and then use its extension, predictive budget allocation built on those attribution results while taking sales targets into consideration. Marketing tends to be the single biggest expense in e-commerce, so making sure the money is spent well is a very good place to start.
K: Anything else?
A: I am currently researching and developing a ‘dynamic pricing’ module, which would be deployed in a similar scenario, assuming the store is working with a large number of SKUs. Once this number is in the hundreds, such a technique can help logistics and set the pricing right for how much inventory needs to be moved, again intelligently, based on data and automated decision making.
K: We have also recently seen an uptake in subscription-based business models, whether app-based businesses or software as a service. What immediate benefit can data science bring here?
A: The most obvious example is that once a company spends a significant amount of money on customer acquisition, it needs to retain its clients as long as possible so it gets the money back and creates profit. Churn prediction algorithms can be deployed here to give the customer engagement teams red flags for customers who show signs of being likely to leave the service, so the teams can act to try to reverse that.
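To make the "red flags" idea above a little more concrete, here is a minimal Python sketch of a churn score. It assumes a simple logistic scoring function over two made-up signals (days since last login and relative change in usage); the `Customer` fields, the weights and the 0.5 threshold are hypothetical and hand-picked for illustration, whereas a real churn model would fit them on historical data.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    customer_id: str
    days_since_last_login: int
    usage_change: float  # relative change vs. previous period, e.g. -0.5 = halved


# Hypothetical weights; in practice these would be learned from past churn data.
INTERCEPT = -3.0
W_INACTIVITY = 0.1  # per day without a login
W_USAGE_DROP = 4.0  # per unit of relative usage decline


def churn_probability(c: Customer) -> float:
    """Logistic score: more inactivity and a bigger usage drop mean higher risk."""
    usage_drop = max(0.0, -c.usage_change)  # only declines add risk
    z = INTERCEPT + W_INACTIVITY * c.days_since_last_login + W_USAGE_DROP * usage_drop
    return 1.0 / (1.0 + math.exp(-z))


def red_flags(customers: List[Customer], threshold: float = 0.5) -> List[str]:
    """IDs of customers whose score exceeds `threshold`, riskiest first."""
    scored = [(churn_probability(c), c.customer_id) for c in customers]
    return [cid for p, cid in sorted(scored, reverse=True) if p > threshold]


# Toy customers; made up for the example.
customers = [
    Customer("c-001", days_since_last_login=2, usage_change=0.10),
    Customer("c-002", days_since_last_login=30, usage_change=-0.60),
    Customer("c-003", days_since_last_login=14, usage_change=-0.20),
]

print(red_flags(customers))
```

The resulting list is what an engagement team would work through, starting with the customers the score considers most at risk.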
K: If anyone wants to know more about data science or how you can help their business, what is the best way to get in touch?
A: A good place to start is our website, or contact me via email at firstname.lastname@example.org
K: Thank you for the chat, Adam. We are looking forward to our joint projects in Asia.
If you enjoyed the read, please do share it on your social networks and if you have feedback, we would love to hear it!