When you think of the perfect data science team, are you imagining 10 copies of the same professor of computer science and statistics, hands delicately stained with whiteboard marker? I hope not!
Google's Geoff Hinton is a hero of mine and an amazing researcher in deep learning, but I hope you're not planning to staff your applied data science team with 10 of him and no one else!
Applied data science is a team sport that's highly interdisciplinary. Diversity of perspective matters! In fact, perspective and attitude matter at least as much as education and experience.
If you're keen to make your data useful with a decision intelligence engineering approach, here's my take on the order in which to grow your team.
#0 Data Engineer
We start counting at zero, of course, since you need to have the ability to get data before it makes sense to talk about data analysis. If you're dealing with small datasets, data engineering is essentially entering some numbers into a spreadsheet. When you operate at a more impressive scale, data engineering becomes a sophisticated discipline in its own right. Someone on your team will need to take responsibility for dealing with the tricky engineering aspects of delivering data that the rest of your staff can work with.
#1 Decision-Maker
Before hiring that PhD-trained data scientist, make sure you have a decision-maker who understands the art and science of data-driven decision-making. This individual is responsible for identifying decisions worth making with data, framing them (everything from designing metrics to calling the shots on statistical assumptions), and determining the required level of analytical rigor based on potential impact on the business. Look for a deep thinker who doesn't keep saying, "Oh, whoops, that didn't even occur to me as I was thinking through this decision." They've already thought of it. And that. And that too.
#2 Analyst
Then the next hire is... everyone already working with you. Everyone is qualified to look at data; the only thing that might be missing is a bit of familiarity with software that's well-suited for the job. If you've ever looked at a digital photograph, you've done data visualization and analytics.
It's the same thing: when you look at a photograph in, say, MS Paint, you're taking that vast matrix of numbers about how red-green-blue each pixel is and you're visualizing it with software that paints a picture your eyes and brain can make sense of. Turns out R and Python (and others!) can also print out the digital photo for you and they can show you other kinds of data too. They're simply more versatile tools for looking at a wider variety of datasets. Luckily it doesn't take long to learn them: I've taught high school kids with no programming background how to get started with them in under a week. There are plenty of free tutorials online, so dive right in and have fun. You're just learning to use MS Paint all over again, essentially, but this one lets you look at more stuff.
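To make the "matrix of numbers" idea concrete, here is a minimal Python sketch using only the standard library. The tiny 4x4 image, its pixel values, and the helper names `brightness` and `render` are all invented for illustration; real photos are simply much bigger grids of the same kind of numbers.

```python
# A tiny, made-up 4x4 "photo": each pixel is a (red, green, blue) triple in 0..255.
IMAGE = [
    [(0, 0, 0), (150, 100, 50), (150, 100, 50), (0, 0, 0)],
    [(150, 100, 50), (255, 255, 255), (255, 255, 255), (150, 100, 50)],
    [(150, 100, 50), (255, 255, 255), (255, 255, 255), (150, 100, 50)],
    [(0, 0, 0), (160, 160, 160), (160, 160, 160), (0, 0, 0)],
]

SHADES = " .:#"  # characters ordered from darkest to brightest


def brightness(pixel):
    """Average the three color channels into one number in 0..255."""
    red, green, blue = pixel
    return (red + green + blue) / 3


def render(image):
    """Turn the grid of numbers into text that eyes can make sense of."""
    lines = []
    for row in image:
        chars = []
        for pixel in row:
            # Map brightness 0..255 onto an index into SHADES.
            index = min(int(brightness(pixel) / 256 * len(SHADES)), len(SHADES) - 1)
            chars.append(SHADES[index])
        lines.append("".join(chars))
    return "\n".join(lines)


print(render(IMAGE))
```

Nothing was added to the data along the way: the picture is just another way of showing the same numbers.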
And hey, if all you have the stomach for is looking at the first five rows of data in a spreadsheet, well, that's still better than nothing. If the entire workforce is empowered to do that, you'll have a much better finger on the pulse of your business than if no one is looking at any data at all.
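Looking at the first five rows takes very little code. The sketch below uses Python's standard `csv` module; the spreadsheet contents, the column names, and the `head` helper are hypothetical stand-ins for whatever export you actually have.

```python
import csv
import io
from itertools import islice

# Hypothetical spreadsheet export; the columns and values are made up.
CSV_TEXT = """date,region,orders
2024-01-01,north,12
2024-01-02,south,7
2024-01-03,north,15
2024-01-04,east,9
2024-01-05,south,11
2024-01-06,west,4
2024-01-07,north,13
"""


def head(csv_file, n=5):
    """Return the header and the first n data rows of an open CSV file."""
    reader = csv.reader(csv_file)
    header = next(reader)
    rows = list(islice(reader, n))
    return header, rows


# io.StringIO stands in for open("your_file.csv", newline="").
header, rows = head(io.StringIO(CSV_TEXT))
print(header)
for row in rows:
    print(row)
```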
Nessie 1934: This is data. Make conclusions about it wisely.
The important thing to remember is that you shouldn't come to conclusions beyond your data. That takes specialist training. Just as with the photo above, here's all you can say about it: "This is what is in my dataset." Please don't use it to conclude that the Loch Ness Monster is real.
#3 Expert Analyst
Enter the lightning-fast version! This person can look at more data faster. The game here is speed, exploration, discovery... fun! This is not the role concerned with rigor and careful conclusions. Instead, this is the person who helps your team get eyes on as much of your data as possible so that your decision-maker can get a sense of what's worth pursuing with more care.
This may be counterintuitive, but don't staff this role with your most reliable engineers who write gorgeous, robust code. The job here is speed, encountering potential insights as quickly as possible, and unfortunately those who obsess over code quality may find it too difficult to zoom through the data fast enough to be useful in this role.
I've seen analysts on engineering-oriented teams bullied because their peers don't realize what "great code" means for descriptive analytics. Great is "fast and humble" here. If fast-but-sloppy coders don't get much love, they'll leave your company and you'll wonder why you don't have a finger on the pulse of your business.
#4 Statistician
Now that we've got all these folks cheerfully exploring data, we'd better have someone around to put a damper on the feeding frenzy. It's safe to look at that "photo" of Nessie as long as you have the discipline to keep yourself from learning more than what's actually there... but do you? While people are pretty good at thinking reasonably about photos, other data types seem to send common sense out the window. It might be a good idea to have someone around who can prevent the team from making unwarranted conclusions. (Lifehack: don't make conclusions and you won't need to worry. I'm only half joking. Inspiration is cheap, but rigor is expensive. Pay up or content yourself with mere inspiration.)
For example, if your machine learning system worked in one dataset, all you can conclude is that it worked in that dataset. Will it work when it's running in production? Should you launch it? You need some extra skills to deal with those questions.
If we want to make serious decisions where we don't have perfect facts, let's slow down and take a careful approach. Statisticians help decision-makers come to conclusions safely beyond the data.
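A toy illustration of why "it worked in one dataset" says nothing about new data: the Python sketch below builds a deliberately silly model that just memorizes what it has seen. The data, the `MemorizingModel` class, and the helper names are all invented for this example, and a holdout check like this is only a first sanity check, not a substitute for the statistician's work.

```python
import random


def make_examples(start, n, rng):
    """Made-up data: each label is a coin flip, unrelated to the input."""
    return [(x, rng.choice([0, 1])) for x in range(start, start + n)]


class MemorizingModel:
    """Remembers every training example and guesses 0 for anything unseen."""

    def fit(self, examples):
        self.lookup = dict(examples)
        return self

    def predict(self, x):
        return self.lookup.get(x, 0)


def accuracy(model, examples):
    """Fraction of examples where the prediction matches the label."""
    correct = sum(model.predict(x) == y for x, y in examples)
    return correct / len(examples)


rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_examples(0, 1000, rng)       # data the model gets to see
holdout = make_examples(1000, 1000, rng)  # data it has never seen

model = MemorizingModel().fit(train)
print("accuracy on the data it has seen:", accuracy(model, train))
print("accuracy on new data:", accuracy(model, holdout))
```

The model is perfect on the data it memorized and no better than a coin flip on the rest, because there was never anything to learn. Looking only at the first number would have told you to launch.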
#5 Applied Machine Learning Engineer
An applied AI / machine learning engineer's best attribute is not an understanding of how algorithms work. Their job is to use them, not build them. (That's what researchers do.) Expertise at wrangling code that gets existing algorithms to accept and churn through your datasets is what you're looking for.
Besides quick coding fingers, look for a personality that can cope with failure. You almost never know what you're doing, even if you think you do. You run the data through a bunch of algorithms as quickly as possible and see if it seems to be working... with the reasonable expectation that you'll fail a lot before you succeed. A huge part of the job is dabbling blindly, and it takes a certain kind of personality to enjoy that.
Perfectionists tend to struggle as ML engineers.
Because your business problem's not in a textbook, you can't know in advance what will work, so you can't expect to get a perfect result on the first go. That's okay, just try lots of approaches as quickly as possible and iterate towards a solution.
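In code, "try lots of approaches as quickly as possible" can be as unglamorous as a loop over candidates. The Python sketch below is a toy under stated assumptions: the numbers are made up, the four candidates are trivial stand-ins for real algorithms, and the error score is a quick look rather than a rigorous assessment.

```python
from statistics import mean, median

# Made-up daily values: the history is used to fit, the rest to score.
history = [12, 15, 11, 14, 13, 40, 12, 14]
upcoming = [13, 12, 15, 13]

# Each candidate turns the history into a single constant prediction.
candidates = {
    "mean": mean,
    "median": median,
    "last value": lambda values: values[-1],
    "max": max,
}


def mean_absolute_error(prediction, actuals):
    """Average distance between one prediction and each actual value."""
    return mean(abs(prediction - actual) for actual in actuals)


results = []
for name, fit in candidates.items():
    prediction = fit(history)
    results.append((mean_absolute_error(prediction, upcoming), name))

# Lowest error first.
for error, name in sorted(results):
    print(f"{name}: {error:.2f}")
```

Most candidates lose, and that is the expected outcome: the loop makes failure cheap so that the next idea can be tried within minutes.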
Speaking of "running the data through algorithms"... what data? The inputs your analysts identified as potentially interesting, of course. That's why analysts make sense as an earlier hire.
Although there's a lot of tinkering, it's important for the machine learning engineer to have a deep respect for the part of the process where rigor is vital: assessment. Does the solution actually work on new data? Luckily, you made a wise choice with your previous hire, so all you have to do is pass the baton to the statistician.
The strongest applied ML engineers have a very good sense of how long it takes to apply various approaches. When a potential ML hire can rank options by the time it takes to try them on various kinds of datasets, be impressed.
#6 Data Scientist
The way I use the word, a data scientist is someone who is a full expert in all of the three preceding roles. Not everyone uses my definition: you'll see job applications out there with people calling themselves "data scientist" when they have only really mastered one of the three, so it's worth checking.
This role is in position #6 because hiring the true three-in-one is an expensive option. If you can hire one within budget, it's a great idea, but if you're on a tight budget, consider upskilling and growing your existing single-role specialists.
#7 Analytics Manager / Data Science Leader
The analytics manager is the goose that lays the golden egg: they're a hybrid between the data scientist and the decision-maker. Their presence on the team acts as a force-multiplier, ensuring that your data science team isn't off in the weeds instead of adding value to your business. This person is kept awake at night by questions like, "How do we design the right questions? How do we make decisions? How do we best allocate our experts? What's worth doing? Will the skills and data match the requirements? How do we ensure good input data?" If you're lucky enough to hire one of these, hold on to them and never let them go.
#8 Qualitative Expert / Social Scientist
Sometimes your decision-maker is a brilliant leader, manager, motivator, influencer, or navigator of organizational politics... but unskilled in the art and science of decision-making. Decision-making is so much more than a talent. If your decision-maker hasn't honed their craft, they might do more damage than good.
Don't fire them, augment them. You can hire them an upgrade in the form of a helper. The qualitative expert is here to supplement their skills.
This person typically has a social science and data background: behavioral economists, neuroeconomists, and JDM psychologists receive the most specialized training, but self-taught folk can also be good at it. The job is to help the decision-maker clarify ideas, examine all the angles, and turn ambiguous intuitions into well-thought-through instructions in language that makes it easy for the rest of the team to execute on.
The qualitative expert doesn't call any of the shots. Instead, they ensure that the decision-maker has fully grasped the shots available for calling. They're also a trusted advisor, a brainstorming companion, and a sounding board for a decision-maker. Having them on board is a great way to ensure that the project starts out in the right direction.
#9 Researcher
Many hiring managers think their first team member needs to be the ex-professor, but actually you don't need those PhD folk unless you already know that the industry is not going to supply the algorithms that you need. Most teams won't know that in advance, so it makes more sense to do things in the right order: before building yourself that space pen, first check whether a pencil will get the job done. Get started first and if you find that the available off-the-shelf solutions aren't giving you much love, then you should consider hiring researchers. If they're your first hire, you probably won't have the right environment to make good use of them in any case. Don't bring them in right off the bat. It's better to wait until your team is developed enough to have figured out what they need a researcher for. Wait till you've exhausted all the available tools before hiring someone to build you expensive new ones.
Before you invent pens that work in space, check that existing solutions don't meet your needs already.
#10+ Additional personnel
Besides the roles we looked at, here are some of my favorite people to welcome to a decision intelligence project:
- Domain expert
- Ethicist
- Software engineer
- Reliability engineer
- UX designer
- Interactive visualizer / graphic designer
- Data collection specialist
- Project / program manager
Many projects can't do without them. The only reason they aren't listed in my top 10 is that decision intelligence is not their primary business. Instead, they are geniuses at their own discipline and have learned enough about data and decision-making to be remarkably useful to your project. Think of them as having their own major or specialization, but enough love for decision intelligence that they chose to minor in it.
Huge team or small team?
After reading all that, you might feel overwhelmed. So many roles! Take a deep breath. Depending on your needs, you may get enough value from the first few roles.
Revisiting my analogy of applied machine learning as innovating in the kitchen, if you personally want to open an industrial-scale pizzeria that makes innovative pizzas, you need the big team or you need to partner with providers/consultants. If you want to make a unique pizza or two this weekend (caramelized anchovy surprise, anyone?) then you still need to think about all the components we mentioned. You're going to decide what to make (role 1), which ingredients to use (roles 2 and 3), where to get ingredients (role 0), how to customize the recipe (role 5), and how to give it a taste test (role 4) before serving someone you want to impress, but for the casual version with less at stake, you can do it all on your own. And if your goal is just to make standard traditional pizza, you don't even need all that: get hold of someone else's tried and tested recipe (no need to reinvent your own) along with ingredients and start cooking!
Top 10 roles in AI and data science was originally published in Hacker Noon on Medium.