Vin Vashishta is a big name in data science.
He believes a huge percentage of the advice you’re given is wrong, that there’s a huge difference between interesting and useful in data science and that cupcakes can be the best thing for your analytics.
Listen in!
I’ve been called an idiot by a lot of PhDs
Cindy Tonkin: Ladies and gentlemen with me today I have Vin Vashishta. He is phoning in from Reno. I am in Sydney, in Australia. We’re going to talk about what makes for smarter data people.
Vin, I looked at your LinkedIn profile. You’re a big name in the data science world. Tell us about what you do.
Vin Vashishta: Well I’ve been in technology for, what’s it been now, 24 years and I spent the last almost 10, wow, it’s been a while, spent the last going on 10 years in data science and machine learning, a bit of deep learning over the last couple of years, as well. Not entirely sure how I got to be a big name. I am often more opinionated than most people like and most of my opinions are not things mainstream data scientists like. Not entirely sure why somebody doesn’t like me, because I’ve been called an idiot by a lot of different PhDs.
Cindy Tonkin: I think that’s the badge of honour in the data science world, though, to be called an idiot means someone’s actually thinking about your stuff isn’t it?
Vin Vashishta: That’s true. You do have to make a PhD pretty angry for them to acknowledge your existence.
Cindy Tonkin: What are your habits and routines, in terms of working smarter? Are there things that you do to keep yourself together and functioning?
Work smarter: know the business objective
Vin Vashishta: There are. It’s a data scientist’s hole. It’s like an entire rabbit hole every day that you could decide to go down or you could remain focused on, not so much the task at hand, because a lot of the time the task at hand requires rabbit holes. But really understanding what the business objective is or, in some cases, what the personal objective of the project is. That one goal, keeping that in mind, and being able to understand it in the first place is really how you can work smarter, because there are a million different things you can do. Everything from what datasets you use, to your choice in algorithms, to how you design an architecture. I mean, I’m stuttering because there’s just so much you can do in any one of those.
Cindy Tonkin: There is so much, absolutely.
Vin Vashishta: Any one of those could lead you down a rabbit hole that takes up, not just one day, but one or two weeks, and doesn’t really lead to anything fruitful.
Cindy Tonkin: Yes, you’ve got to, basically, make decisions early on about knowing what the outcome is and sticking to the things that are going to get you towards that outcome. Is that the basic concept?
Vin Vashishta: Yeah, it really is keep the goal in mind, because a lot of the activities you can do have a cool factor. It would lead to something interesting, but it wouldn’t lead to something productive. It wouldn’t lead to something that you could actually use from a business sense. There’s a huge difference between interesting and useful in data science. Almost everything’s interesting; very, very, very, very few things are useful.
Routines to keep smart: the single person stand up
Cindy Tonkin: Interesting. What about your personal routines? How do you keep well? How do you keep smart? How do you keep nice?
Vin Vashishta: I start my day at about 5:00, 5:30 in the morning. It’s a wonderful time, because the world hasn’t woken up yet.
I’m not getting emails.
My phone’s not ringing.
I can focus on basically planning out my day.
I’ll do that every morning. I keep a pretty detailed log of what I was supposed to do yesterday. It’s almost like my own personal scrum, where I have a single person stand-up, and I look at what I accomplished yesterday, write up a good summary of what I did, what I would have liked to have done better. Then, write up the deliverables for the day, both personal deliverables as well as work deliverables. I think it’s important-
Cindy Tonkin: Wow, you did it every day. That’s-
The joy of ticking things off a list
Vin Vashishta: Every day. Yeah, it’s the first thing in the morning, because it keeps me focused. I like crossing things off of lists. There’s a strange satisfaction-
Cindy Tonkin: Yeah, me too. Sometimes I write things on the list, purely to tick them off.
Vin Vashishta: Yes. It is a strange satisfaction for me there, but I enjoy crossing things off of a list, or adding things that I’ve done to the list that I didn’t plan on doing that day. That I had some spare time to do. It also helps me understand.
That beginning of the day is really a time for me to focus on, not only learning about what I’m going to be doing today, but also learning about what I didn’t do right the day before. I’m spending so much time on this piece of it, because I really think, for a data scientist, this is the most important thing that you can do is, every morning, get up and get to what did I learn from yesterday, and what do I want to do today?
You will find that the little incremental progress that you make from day-to-day, whether that’s reading, and I do a ton of reading. I’ll read in that early morning time.
Cindy Tonkin: I’ll ask you in a second what you read.
Vin Vashishta: I’ll read 30 to 40 articles. I’ll go through arXiv and read, basically, any relevant papers to machine learning, and deep learning, that were published the day before.
I skip a lot of them, because as soon as I read the title, or read the abstract, I’m thinking, “I’m not interested or not relevant.” But I’ll read through a lot of different papers and try at least to get the gist of each one of them to understand if there’s something in there I might be able to use now or in the future.
Breakfast is a must, don’t skip meals. Breakfast, lunch and dinner, always keep time for those. Then I start my day. It starts, basically, at whatever time I can get through those first few pieces. I’ll work for about seven hours. That’s the limit. At the end of seven hours, it’s time to stop.
Cindy Tonkin: Right.
Vin Vashishta: That’s as long as you can be functional as a data scientist. There’s people that work 12 hours, 14 hours and you can just look at their productivity and it precipitously declines after about the first six, or seven, hours. It just falls off. So, I’ll typically work about seven hours. I’ll take a good long break, typically, three to four hours. Relax, do some fun things for myself. Check off the personal to-dos. Then I’ll work one more hour.
Cindy Tonkin: Right.
Vin Vashishta: It’s amazing how much I get done in that one hour. That’s, pretty much, my day. Then I spend evening time just relaxing, catching up on my favourite shows. I try to get outside a little bit. I’ve got a gym at home, so I’ll get a workout in, at some point.
Cindy Tonkin: Nice. That’s quite a structured day.
Lessons: don’t skip breakfast
Now, there were a couple of lessons in there.
Lessons learned is one of the questions I always ask, and I notice there’s a few within what you just said, like don’t skip breakfast, and only work seven hours. Are there any other big lessons you’ve learned in your career, maybe, from a manager, or a leader, or a mentor, at some point, that stood you in good stead?
Vin Vashishta: Lessons learned, I’ve actually had … I’ve been fortunate to be a leader and work with some excellent mentors. I’ve gotten the advice from both sides. I’ve seen countless pieces of advice over the years. Then, on the flip side, I’ve been a leader, so I’ve actually gotten to look and see what pieces of advice are actually good ones and which ones stink.
Vin Vashishta: Yeah, what’s been amazing to me is that, most advice that you are given, as a data scientist, is bad advice. Yes, it is very strange. It’s not a typical field. It’s not one of those fields that the generic-type work advice really works in, because you need to be creative. I spend a lot of my workday in front of a whiteboard or in front of nothing at all. People will watch my workday sometimes and look at me like, “Are you actually working? What are you doing?”
Vin Vashishta: So much of our job is thinking.
Cindy Tonkin: Yeah.
Vin Vashishta: If you spend two, or three, hours thinking before you start solving a problem, before you come at something, even though you feel like, “I’ve got the perfect solution, I’m just going to go run that.” Think about it, I find a lot of my solutions are best thought over, best simmered for a little while, before I actually implement them.
Cindy Tonkin: Interesting.
Vin Vashishta: And, in most cases, I will find a better solution, or I will find a way to simplify what I had initially created as some complex model in my head. I’m able to simplify it or I’m able to think through different data sources. There’s always a different angle that you can come at it from. So a lot of the typical engineering, where it’s grind, code, get your head down, get to know this, get to know that. A lot of that advice is horrible. Some of the best things that you can do, as a data scientist, are get outside, take a walk, and think through what it is that you’re doing, or go to lunch, go grab some cupcakes, go do something.
Get Cupcakes
Cindy Tonkin: How do you have cupcakes?
Vin Vashishta: Let your mind … Yeah, exactly. Go do something you enjoy. Let your mind think and wander around to the solution that you’ve come up with. Then, when you get back, start building and see where it goes, see where it takes you. Never, really, never take your eye off of there being a better solution. I think that’s really the best advice that I’ve ever gotten. The best lesson learned, that I’ve really ever gotten, is to spend the time thinking, rather than slamming fingers into keyboard or fingers onto whiteboard. There’s a lot to be said for a thoughtful implementation, rather than a quick implementation.
Cindy Tonkin: Yeah, because, ultimately, what you want is an algorithm that’s going to work and stay. So, don’t play with a dodgy one, when you could have the best one, yeah.
Vin Vashishta: Well, we also tend to overcomplicate our problems to begin with. We come at our problems and think, “This is going to require this. This will require that,” and we start slamming pieces together. If you spend an hour, you find yourself simplifying. That’s really what a data scientist’s mind naturally does, especially with machine learning, is we naturally simplify things. A lot of times people will get three, four, five days, sometimes weeks, into a solution before they realize that they’ve overcomplicated their life. Then they go back and they simplify it or, in some cases, they just never go back. They say, “Well, I’m halfway through this path, why not?”
Cindy Tonkin: Yeah.
Vin Vashishta: If you spend those few hours up front simplifying, it’s faster to code, faster to train, faster to implement.
Cindy Tonkin: Because, ultimately, there’s been some research, probably 15 years old now, about actual insight. They put people in the fMRI and get them to work towards potential solutions, and then they go, “Now I know the answer.”
That’s like that shower moment, when you’re in the shower, and you go, “Now I know. Why did I not start thinking about it that way?”
They talk about the need to let go of analysing it, and let go of trying to make sense of it, and just letting your brain not think about it, and that brings you to the answer that’s so obvious. It’s like, “Why didn’t I start this way?”
Yeah, I can see that giving it some time to think through the solution is probably a very useful way of being more effective and efficient with the solutions you come up with.
Yesterday has a lesson for you
Vin Vashishta: The other lesson learned, really, from what I was going over with my day, is: look back at yesterday.
Yesterday has a lesson for you. Whether it’s big, whether it’s small, you can learn something from yesterday. Spend the time to look at it, because the more stress you can remove from your life that comes from a repetitive error, the happier you’ll be, the more satisfied you are, the more productive you are.
Vin Vashishta: There’s really … The easiest way to forgive yourself for making a mistake yesterday is not to make that same mistake today.
Cindy Tonkin: Nice.
Vin Vashishta: That is really a great way to just get past a lot of … I guess, a lot of the stress of being a data scientist, because, I don’t know, maybe it’s me, I shouldn’t speak for everyone, but I make a lot of mistakes. The reason why I’ve always gotten better is because I remember them all.
Cindy Tonkin: And you’ve learned from them, you’re nice. It’s nice. Talk to me about data people, data scientists. What makes a better or worse one? You’ve already said some things, but are there other insights you have about what makes a better, or worse, data person?
Vin Vashishta: You know, there’s the traditional skills, and I think those are over focused on. I think what really makes a great data scientist is the ability to learn and assimilate new concepts, complex ones, mathematical concepts, scientific foundational concepts and concepts in, and around, machine learning methodologies. The ability to learn programming languages. The ability to pick up new data structures and databases. So, it’s really that ability to learn, because I can teach somebody how to code. I can teach someone a new database. They need to learn bond building. If they need to learn MySQL, they can pick up MySQL pretty quickly. These are easy skills.
Vin Vashishta: If someone has a foundation in software engineering and software design, it’s very, very easy to teach them the fundamentals, same thing with mathematics, same thing with science, same thing with machine learning. If you have that foundational knowledge, everything that comes after it, is just how fast can you learn?
Being able to learn new concepts is critical
So, being able to learn new concepts is critical. That’s what I want to know. I don’t want to know, how well you memorize something. I want to give you something brand new and see how well, and how fast, you learn it. How did that new concept stick? Is that something that you can use now, now that you’ve learned it?
Vin Vashishta: I also want somebody who’s curious and creative. There’s a whole lot of these coding exams that we do in data science, or we’ll do these technical interviews, we call them, technical interviews. They simply don’t find good data scientists, because it’s hard to assess creativity. It’s hard to assess the ability to want to learn, that desire. There’s a lot better ways. I look at a lot of our interviews, and it seems backwards, because what we should really be doing is giving somebody a new concept. Learn this, use this to do something, then, let’s talk about it, and you assess creativity. You assess curiosity. You assess the ability to learn and assimilate new information. And it’s no longer a, “Hey, can I trip you up on something.”
Vin Vashishta: I’ve been coding in Java for … I was having a conversation at a conference a few weeks back, where we were talking about this. I mean, I’ve been coding in Java forever, and you can still ask me a question that trips me up. You just can’t memorize everything. A lot of interviews, and candidate assessments, are really this process of trying to trip somebody up and see what happens. We really need to be assessing candidates in a completely different way. We need to assess our colleagues in a completely different way. What you know today will be meaningless in five years, so why do we care about what you know today?
Recruiting
Cindy Tonkin: Interesting. When you recruit, do you actually get people to do what you just said, in terms of, is a new concept taken on and talk to me about it?
Vin Vashishta: Yeah.
Cindy Tonkin: Wow.
Vin Vashishta: That’s one of my best interviewing techniques. It’s one that I’ve come up with just over the last couple of years. I’ve used a lot of other different interview techniques and tactics. I’ve tried to refine it over the years. Like I said, over the last couple years, that’s the one that works best. I ask the candidate about five, or six, different topics and say, “Have you ever heard of this? Have you ever heard of this?” When I find one they’ve never heard of, I say, “Why don’t you do this, go out, study it, learn it.”
Cindy Tonkin: Come back and talk to me.
Vin Vashishta: “Do something with it.” Yeah.
“Build something with it, and then come back and talk to me about what you built. We can go over it from there.” Now, it’s not me asking them questions about something they may, or may not, know. It’s them teaching me something.
Cindy Tonkin: Interesting. And that’s been successful in finding people you think are the correct people for the job?
Vin Vashishta: Obviously, to my bias, but, yes, I think it’s great. I think it’s worked very well.
Cindy Tonkin: So far it hasn’t been any less effective, because, essentially, interview techniques have been proven to be reasonably ineffective, but people think they’re effective, because they’re different, and they have a good gut feel, but everybody else gets used to the role, according to the research.
Vin Vashishta: Well, what’s incredible is that, as data scientists, very few of us have read the research.
Cindy Tonkin: Yes.
Vin Vashishta: I mean, you would think that would be the first thing that you did. When I started interviewing, that’s what I did, and this is back before I was a data scientist, when I was interviewing engineers. I actually read the research, and I said, “Okay, your college degree is meaningless. Your experience is nearly meaningless.” And you check off all of the things that most interviews, and interviewers, focus on. When you peel away that onion, you get to the bottom, and it’s nothing that anyone’s really doing, as far as interviews go, outside of companies like Google. Google has some very structured interview processes. I think Amazon and Facebook do too.
Vin Vashishta: There’s a few companies around, in the corporate world, that actually make use of the research, but it’s so rare.
Professional Development
Cindy Tonkin: Yes, it is. Which, of course, as you say, as data scientists we should be paying attention to that stuff. Yeah, totally. What about you? We’ve spoken about your professional development. Do you go to conferences? Do you listen to … What do you do for professional development, in more detail than you’ve already given us?
Vin Vashishta: It’s really reading the articles, the papers. From time-to-time I’ll pick up a book about data science. I relearn programming languages all the time. I joke that I’ve forgotten more programming languages than I know right now. It’s just the reality of a long career. For me, skills-wise, professional development, in a lot of cases, is either, one, keeping up on what’s coming out, and I like to look at new approaches. I like to look at … You know, the approach, itself, may not be a great approach, but, in a lot of cases, it triggers thought on my part.
Cindy Tonkin: Yeah.
Vin Vashishta: It makes me think about how I could apply this and, maybe a variation of it, to something that I’m working on. I think those are important things to do from a professional development standpoint. I do go to conferences, but, normally, I’m a speaker. I’ll be honest, a lot of the presentations at conferences, when they’re presenting a paper, it’s very interesting.
But, for the most part, a lot of conferences that aren’t doing that hard data science, or aren’t doing the strategy side of data science and machine learning, a lot of those conferences I don’t get a lot out of, except for, really, those two types: the ones that are highly content specific, or research specific, and the ones that are highly strategy specific. I think those two are the most interesting. Everything else seems to devolve into a sales pitch, at some point. I, generally, stay away from those.
Cindy Tonkin: Yes, that’s annoying isn’t it?
Vin Vashishta: Yeah.
Cindy Tonkin: Do you do online conferences? Are there any that you like?
Vin Vashishta: From time-to-time I’ve been a speaker at them, but, I’ll be honest, most of my speaking engagements, now, are private engagements for companies. For the most part, I’m off the conference circuit. I’m going to be going to Unleash, which is an HR tech conference here in about three weeks down in Las Vegas.
It’s tangential to what I do, but it’s an interesting area of AI research, the HR personnel field, recruiting, that sort of thing. That’s something I’m pretty closely involved in.
I go to a lot of, I guess I would call them, industry conferences, as an attendee to get the domain knowledge about a place that I may be working in. That’s another area I get some professional development, but, when it comes to conferences just, in general, I don’t get out to as many as I’d like to.
Cindy Tonkin: I found I used to go to a lot in my 20s, and then it became, as you say, it’s a bit like, “This is repetition. I’ve heard this before. This is just that book you’ve taken and written a topic paper on it.” It’s like there’s nothing new under the sun anymore. Yeah, I imagine it depends where you are in your career and how much you’ve already read, whether you actually get anything novel from a conference anyway.
Vin Vashishta: That’s true. But hearing other people think, powerful thinkers. It doesn’t matter what setting it’s in, whether it’s a podcast, or them writing a book, or catching one of their lectures online.
Powerful thinkers stimulate thought.
I’ve never stopped learning from powerful thinkers.
There are even courses that I’ll retake, like Andrew Ng’s got his course online, Machine Learning. I’ll go back to that. Because the way that he explains things is so clear that just going back to it and picking up fundamentals again, and making sure my fundamentals are still strong.
I find myself doing that from time-to-time too, just going back over my education and making sure it’s still in there. That I’m not just going through the motions. I still understand the fundamentals behind everything, because it’s easy to leave them behind.
It’s easy to forget what’s happening under the covers.
Vin Vashishta: When you’re running at 300 miles an hour and chasing bleeding edge stuff, it’s easy to forget what’s really going on under the covers and how you can take some of the complexity of new approaches, and simplify them, and make them effective, and also a whole lot easier to implement and maintain.
Explaining complexity
Cindy Tonkin: Wow. You mentioned that you like going back to powerful thinkers, because of the way they explain. Is there anything you do, when you’re trying to explain something complex to a data naïve stakeholder, for example? Are there any particular habits or practices you use to make the complex more simple?
Vin Vashishta: In a lot of cases, I will grab somebody else’s explanation and use it. I do that because, along the way, I’ve found a lot of great explanations scattered out there. I wish I could throw one out. There’s nothing off the top of my head, but there’s this list of explanations of complex concepts.
Vin Vashishta: I’ve got a whole folder of them, where I will send somebody a link to something. Because there are so many people out there who have done this so well, that, in a lot of cases, I’ll say, “Let me send you a link. Let me send you a video. Let me send you an article that I read about this that explains it very, very well.”
I think, in a lot of cases, we try to take too much onto ourselves.
We try to spend the time explaining something to someone, when, really, we’ve got somebody else who’s done it so much better than we could. I believe, in a lot of cases, in using other people’s work, and using other people’s brilliant simplified explanations, and using those as my crutches.
Vin Vashishta: But there are a lot of times when I’m speaking to senior executives, C-suiters, who are … They don’t care about the details, but, in a lot of cases, they need to understand what’s going on under the covers and, if they don’t, they’re missing a significant part of the project. Why the project went in one direction or another? Why the project isn’t possible, in some cases? Why the project is going to cost as much as it is?
There’s a whole lot of why underneath and, depending upon what level the executive’s at, it’s useful for them to understand some of those things that are happening under the covers. What I’ll try to do, in a lot of cases, is just stick to the business meaning, when I’m talking to senior executives, because that’s what they understand. That’s what connects with them.
Vin Vashishta: When I’m talking with users, I’ll just stick to the thing they care about. What is it that you need to work?
I don’t talk about machine learning. I don’t talk about the algorithms or anything that’s complicated or ugly and messy underneath. I’ll just say, “This is what it’s going to do.” When they say, “How?” I want to explain to them how it’s going to meet their need.
Cindy Tonkin: Yeah.
Vin Vashishta: Less of how does it work and more of how is it going to meet your needs?
Cindy Tonkin: Right. So, you reinterpret their questions by the filter you believe they’re coming from.
How do I get what I need to have? Rather than, how did you do this? Which is, really, the filter that the data scientist would usually come from.
How do you do this? Well, I analyse this, and I took that, and I multiplied this. When, in fact, the answer is, “Well, you’re going to put the answer in the box. Then you’re going to put this question here and push this button, and things will work”.
Vin Vashishta: It’s interesting that a lot of the time, as data scientists and machine learning engineers, we answer the question we hear, not the question that was asked.
Cindy Tonkin: Interesting. Do you want to say more about that?
Vin Vashishta: Well, we hear all of our questions with the slant of our field, with the slant of what we do. We have this bias. We are listening for questions that sound like what we normally get. Those are, typically, domain specific questions, or how are we going to approach this methodology, so on, and so forth.
But when you’re talking to a different audience, talking to an audience outside of the data science team, or the machine learning team, it’s easy to, once again, hear the questions that we want to hear, hear the questions as we are, rather than as the person who’s asking is.
It’s really important, and I gave a talk about this at Metis a while back, to stop and think about the person, not just the question, but who just asked me that question? What do they do?
There’s a question behind the question
Vin Vashishta: And, in a lot of cases, just that one piece, “What does this person do?”, will help me answer the question better than any other piece of information I can get. In a lot of cases they’re asking a question, but they really want to know something else. There’s a question behind the question.
Cindy Tonkin: Totally.
Vin Vashishta: In a lot of cases, with data science, most people don’t know how to ask us questions. We almost have to interpret in some places and say, “This is what you asked me, but this is what you’re really interested in.” We have to rephrase the question and say, “Is this what you really want to know about?” In a lot of cases, that makes the person more comfortable, because there’s a connection there. You understand where they’re coming from.
Favourite Charity
Cindy Tonkin: Nice, excellent. My final question, today, is essentially about your favourite charity. What’s your favourite charity and why?
Vin Vashishta: Habitat for Humanity is a charity that builds. They just build houses. It’s an amazing charity. I absolutely love it.
The Food Bank of Northern Nevada is the other one. I like basics. Give people food, access to shelter, and you can really turn people’s lives around, with the most basic pieces. If they have food security, if they have a stable place to live, they, in a lot of cases, can do the rest.
Most people that have fallen on hard times are completely capable and charities, like Habitat for Humanity, and the Food Bank of Northern Nevada, really just focus on giving people the basics and allowing them to just use their talents now. Now they don’t have to focus on the difficulties of their hardship. They can focus on just living.
Cindy Tonkin: Yeah, nice. Is there anything you want to say to conclude?
Vin Vashishta: No, this has been a great conversation. I love your questions.
Cindy Tonkin: Thank you. Well, look, thank you so much for taking the time, Vin. I know that you have a busy life. We just heard about lots of those things. I’d love to do it again in a couple months, when I come up with some more questions. Let me know if there’s anything I can do to help you. We’ll talk again.
Cindy Tonkin: This is Cindy Tonkin. I’m the Consultants’ Consultant and you’ve been listening to Smarter Data People.
This is part of what I do to understand how it is that data scientists can be more effective in the workplace, smarter, faster, and nicer.
If you have a team, and you find them harder to manage than they could be, if you’re constantly trying to squeeze more out of your budget and out of their time, and if you’ve got stakeholders or they’ve got stakeholders who are less than happy sometimes, maybe, a lot more than sometimes, it can be really annoying and it can make you feel incompetent.
I can help you, help them, get to the important problems faster, target the waste in time, and save you time and money, and ultimately delight stakeholders, so that you can feel competent again. It’s such a good feeling.