{"id":4344,"date":"2020-12-24T03:48:46","date_gmt":"2020-12-24T03:48:46","guid":{"rendered":"https:\/\/techclot.com\/index.php\/2020\/12\/24\/2021-predictions-data-science\/"},"modified":"2020-12-24T03:48:46","modified_gmt":"2020-12-24T03:48:46","slug":"2021-predictions-data-science","status":"publish","type":"post","link":"https:\/\/techclot.com\/index.php\/2020\/12\/24\/2021-predictions-data-science\/","title":{"rendered":"2021 Predictions: Data Science"},"content":{"rendered":"<p><a href=\"https:\/\/www.google.com\/url?rct=j&#038;sa=t&#038;url=https:\/\/www.datanami.com\/2020\/12\/23\/2021-predictions-data-science\/&#038;ct=ga&#038;cd=CAIyHDkyYmU1MGQ5NjY1NjYxZTA6Y28udWs6ZW46R0I&#038;usg=AFQjCNGEvFK8ageedXzGxeUyybwNrPQqJQ\">2021 Predictions: Data Science<\/a><\/p>\n<p><div class=\"post-thumbnail\">\n<img data-recalc-dims=\"1\" decoding=\"async\" width=\"300\" height=\"200\" data-src=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/v6aG8U.jpg?resize=300%2C200&#038;ssl=1\" class=\"attachment-medium size-medium wp-post-image lazyload\" alt data-srcset=\"https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/v6aG8U.jpg 300w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/12\/2021_ball_shutterstock_winyuu-768x512.jpg 768w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/12\/2021_ball_shutterstock_winyuu-200x133.jpg 200w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/12\/2021_ball_shutterstock_winyuu-100x67.jpg 100w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/12\/2021_ball_shutterstock_winyuu-120x80.jpg 120w, https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/v6aG8U.jpg 1000w\" data-sizes=\"auto, (max-width: 300px) 100vw, 300px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" 
style=\"--smush-placeholder-width: 300px; --smush-placeholder-aspect-ratio: 300\/200;\"> <\/p>\n<p class=\"caption\">(winyuu\/Shutterstock)<\/p>\n<\/div>\n<p>It\u2019s that time of year again \u2013 time for predictions! Thank you for patiently waiting while <em>Datanami<\/em> compiled 2021 predictions from the assorted predictors. We\u2019ll kick things off with predictions about a most pertinent topic: data science.<\/p>\n<p>If there\u2019s one thing that the <strong>COVID-19 pandemic<\/strong> in 2020 made clear, it\u2019s that organizations are relying on data more than ever before. To get the most out of that data, shops are going to need to increase their spending on data science, argues <a href=\"http:\/\/www.dominodatalab.com\">Domino Data Lab<\/a> CEO Nick Elprin.<\/p>\n<p>\u201cOrganizations are making dramatic budget cuts in many areas in an effort to overcome the effects of COVID-19 and keep their business viable,\u201d Elprin says. \u201cYet, in 2021 we predict that many will sustain or actually increase their investment in data science to help drive the critical business decisions that may literally make the difference between survival and liquidation.\u201d<\/p>\n<p>You will see more people with the title of <strong>chief data scientist<\/strong> (CDS), says Ira Cohen, who is the co-founder and (naturally) CDS at <a href=\"http:\/\/www.anodot.com\">Anodot<\/a>. In fact, Cohen says that, by 2022, 90% of large global companies will have a CDS in place. CDSs will also allocate their time differently in 2021. 
\u201cFifty percent will be more focused on value creation and revenue generation while 28% will focus on cost savings and 22% on risk mitigation,\u201d he says.<\/p>\n<p>Josh Patterson, the senior director of RAPIDS engineering at <a href=\"https:\/\/www.nvidia.com\">Nvidia<\/a>, says 2021 will bring <strong>empowerment<\/strong> to data scientists.<\/p>\n<p>\u201cFor too long, enterprise data scientists have been relegated to sampling data or only pre-production development. People with titles such as data engineer and machine learning engineer are the ones who scale workflows into production, often translating code from Python to Java,\u201d Patterson says. In 2021, \u201cdata scientists will be able to process massive amounts of data quickly, drastically reducing the need to have code translators.\u201d<\/p>\n<p>Alan Jacobson, chief data and analytics officer at <a href=\"http:\/\/www.alteryx.com\">Alteryx<\/a>, is bullish on the potential to <strong>upskill data analysts<\/strong> into full-blown data scientists.<\/p>\n<div id=\"attachment_37729\" class=\"wp-caption alignright\" readability=\"32\"><a href=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/R2floZ.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" aria-describedby=\"caption-attachment-37729\" class=\"wp-image-37729 size-medium lazyload\" data-src=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/R2floZ.jpg?resize=300%2C158&#038;ssl=1\" alt width=\"300\" height=\"158\" data-srcset=\"https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/R2floZ.jpg 300w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/11\/shutterstock_covid_ab12-768x405.jpg 768w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/11\/shutterstock_covid_ab12-200x105.jpg 200w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/11\/shutterstock_covid_ab12-100x53.jpg 100w, 
https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/11\/shutterstock_covid_ab12-120x63.jpg 120w, https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/R2floZ.jpg 1000w\" data-sizes=\"auto, (max-width: 300px) 100vw, 300px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 300px; --smush-placeholder-aspect-ratio: 300\/158;\"><\/a><\/p>\n<p id=\"caption-attachment-37729\" class=\"wp-caption-text\"><em>COVID-19 put data science to the test in 2020<\/em><\/p>\n<\/div>\n<p>\u201cWhile it is always important for companies to offer training to employees, the fields of data science and digital transformation are challenging companies to break the mold and deliver new and constantly evolving ways to upskill and deliver ROI,\u201d Jacobson says. \u201cData science has evolved to the point where people don\u2019t need to go back to college to learn. They\u2019ll learn on the job or while at home by encountering new tools and technologies. And with a huge shortage of those with analytic skills, many will start new jobs and careers based on the new skills.\u201d<\/p>\n<p><strong>Tracking changes<\/strong> in data generated by SaaS-based business applications will be the feedstock for more intelligent AI and machine learning, says Joe Gaska, CEO of <a href=\"http:\/\/www.grax.com\">GRAX<\/a>.<\/p>\n<p>\u201cOrganizations with a focus on artificial intelligence and machine learning will continue to hunger for meaningful training datasets that can be fed into their ML algorithms to spot cause-and-effect change patterns over time,\u201d Gaska says. \u201cTo do this, they will turn to their ever-changing datasets in 3rd party cloud\/SaaS applications as inputs into these algorithms. 
This will create pressure for them to capture and ingest every single change in that data over time into their DataOps ecosystem.\u201d<\/p>\n<p>Rachel Roumeliotis, vice president of AI and data content at <a href=\"https:\/\/www.oreilly.com\/\">O\u2019Reilly Media<\/a>, says machine learning operations, or MLOps, will be important in 2021, as organizations look to <strong>connect the last mile<\/strong> in data science.<\/p>\n<p>\u201cML presents a problem for CI\/CD for several reasons,\u201d she writes. \u201cThe data that powers ML applications is as important as code, making version control difficult; outputs are probabilistic rather than deterministic, making testing difficult; training a model is processor intensive and time consuming, making rapid build\/deploy cycles difficult. None of these problems are unsolvable, but developing solutions will require substantial effort over the coming years.\u201d<\/p>\n<div id=\"attachment_16871\" class=\"wp-caption alignleft\" readability=\"34\"><a href=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/HpPDN4.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" aria-describedby=\"caption-attachment-16871\" class=\"wp-image-16871 lazyload\" data-src=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/HpPDN4.jpg?resize=322%2C188&#038;ssl=1\" alt width=\"322\" height=\"188\" data-srcset=\"https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/HpPDN4.jpg 300w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2017\/07\/data-scientist_shutterstock_Sergey-Nivens-768x448.jpg 768w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2017\/07\/data-scientist_shutterstock_Sergey-Nivens-200x117.jpg 200w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2017\/07\/data-scientist_shutterstock_Sergey-Nivens-100x58.jpg 100w, 
https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2017\/07\/data-scientist_shutterstock_Sergey-Nivens-120x70.jpg 120w, https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/HpPDN4.jpg 1000w\" data-sizes=\"auto, (max-width: 322px) 100vw, 322px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 322px; --smush-placeholder-aspect-ratio: 322\/188;\"><\/a><\/p>\n<p id=\"caption-attachment-16871\" class=\"wp-caption-text\"><em>Data scientists will be in high demand in 2021, if prognostications are correct (Sergey Nivens\/Shutterstock)<\/em><\/p>\n<\/div>\n<p>Instead of spending time and money building machine learning systems, in 2021, organizations will take a big step forward in terms of <strong>using ML systems<\/strong>, says Clemens Mewald, director of product management for machine learning and data science at <a href=\"https:\/\/www.databricks.com\">Databricks<\/a>.<\/p>\n<p>\u201cIn the future, we\u2019ll see enterprise customers moving away from building their own machine learning platforms, recognizing that it\u2019s not their core competency,\u201d Mewald says. \u201cThey\u2019ll realize that more value comes from applying ML to business problems versus spending the time to build and maintain the tools themselves.\u201d<\/p>\n<p>We\u2019re still a ways from having intelligent AI robots walking among us. But the conjunction of neuroscience and data science is a <strong>rich playground<\/strong> for new ideas, says Biju Dominic, the chief evangelist at <a href=\"http:\/\/www.fractal.ai\">Fractal Analytics<\/a> and chairman at FinalMile Consulting.<\/p>\n<p>\u201cAs AI makes rapid strides into unsupervised learning, one-shot learning, and artificial general intelligence, the field will seek inspiration and validation from systems neuroscience and computational neuroscience,\u201d Dominic says. 
\u201cThe interaction between the fields of AI and neuroscience will help the rapid growth of both these fields of knowledge.\u201d<\/p>\n<p>Here is a very specific data science prediction from James Bednar, senior manager of technical consulting at <a href=\"http:\/\/www.anaconda.com\/\">Anaconda<\/a>: Python data visualization <strong>libraries will synch up<\/strong>.<\/p>\n<div id=\"attachment_19849\" class=\"wp-caption alignright\" readability=\"32\"><a href=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/PiBd8a.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" aria-describedby=\"caption-attachment-19849\" class=\"size-medium wp-image-19849 lazyload\" data-src=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/PiBd8a.jpg?resize=300%2C171&#038;ssl=1\" alt width=\"300\" height=\"171\" data-srcset=\"https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/PiBd8a.jpg 300w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/05\/data_scientist_brain_shutterstock_kmlmtz66-768x437.jpg 768w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/05\/data_scientist_brain_shutterstock_kmlmtz66-1024x583.jpg 1024w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/05\/data_scientist_brain_shutterstock_kmlmtz66-200x114.jpg 200w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/05\/data_scientist_brain_shutterstock_kmlmtz66-100x57.jpg 100w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/05\/data_scientist_brain_shutterstock_kmlmtz66-120x68.jpg 120w, https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/PiBd8a.jpg 1823w\" data-sizes=\"auto, (max-width: 300px) 100vw, 300px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 300px; 
--smush-placeholder-aspect-ratio: 300\/171;\"><\/a><\/p>\n<p id=\"caption-attachment-19849\" class=\"wp-caption-text\"><em>Does neuroscience hold answers for data scientists? (kmlmtz66\/Shutterstock)<\/em><\/p>\n<\/div>\n<p>\u201cWe\u2019re finally starting to see Python data visualization libraries work together, and this work will continue in 2021,\u201d Bednar says. \u201cPython has had some really great visualization libraries for years, but there has been a lot of variety and confusion that make it difficult for users to choose appropriate tools. Developers at many different organizations have been working to integrate Anaconda-developed capabilities like Datashader\u2019s server-side big data rendering and HoloViews\u2019 linked brushing into a wide variety of plotting libraries, making more power available to a wider user base and reducing duplication of efforts. Ongoing work will further aid this synchronization in 2021 and beyond.\u201d<\/p>\n<p>Which is more valuable: <strong>metadata or the data<\/strong> itself? The answer from Petteri Vainikka, the president of product marketing for <a href=\"http:\/\/www.cognite.com\">Cognite<\/a>, may surprise you.<\/p>\n<p>\u201cAs the cost and value of data storage continues to gravitate towards zero, and data science teams simultaneously scrambling to convert their existing data warehouse and data lakes into business value, the mountain of evidence pointing to \u2018no correlation\u2019 between volume and value of data keeps growing,\u201d Vainikka says. \u201cWhether through manual tagging of images, AI-driven data set matching to uncover data relationships, or OCR\/NLP methods to convert unstructured data into structured data, the focus and value of metadata will exceed that of the data itself. 
Data contextualization will be at the centre of metadata curation.\u201d<\/p>\n<p><strong>Big problems<\/strong> require big tools to solve, and that will be how data science differentiates itself in 2021, says Alicia Frame, lead data science product manager at <a href=\"http:\/\/www.neo4j.com\/\">Neo4j<\/a>.<\/p>\n<div id=\"attachment_19528\" class=\"wp-caption alignleft\" readability=\"32\"><a href=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/R9xToi.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" aria-describedby=\"caption-attachment-19528\" class=\"size-medium wp-image-19528 lazyload\" data-src=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/R9xToi.jpg?resize=300%2C180&#038;ssl=1\" alt width=\"300\" height=\"180\" data-srcset=\"https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/R9xToi.jpg 300w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/04\/big-data_shutterstock_606840716_700x420-200x120.jpg 200w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/04\/big-data_shutterstock_606840716_700x420-100x60.jpg 100w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/04\/big-data_shutterstock_606840716_700x420-120x72.jpg 120w, https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/R9xToi.jpg 700w\" data-sizes=\"auto, (max-width: 300px) 100vw, 300px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 300px; --smush-placeholder-aspect-ratio: 300\/180;\"><\/a><\/p>\n<p id=\"caption-attachment-19528\" class=\"wp-caption-text\"><em>Data or metadata: Which is more important to data science? 
(whiteMocca\/Shutterstock)<\/em><\/p>\n<\/div>\n<p>\u201cWith the computation power to crunch data getting cheaper and easier to access, and serverless technology making it easier to develop and deploy code, we\u2019ll see data scientists getting back to focusing on the basics: solving big problems more effectively than anyone else,\u201d she says.<\/p>\n<p>Among data science practitioners, there will be a strong emphasis on the possibilities of <strong>feature engineering<\/strong> in 2021, predicts Ryohei Fujimaki, Ph.D., founder and CEO of <a href=\"http:\/\/www.dotdata.com\/\">dotData<\/a>.<\/p>\n<p>\u201cWhile predictions are one of the most valuable outcomes, AI and ML must produce actionable insights beyond predictions that businesses can consume,\u201d Fujimaki says. \u201cAutoML 2.0 automates hypothesis generation (a.k.a. feature engineering) and explores thousands or even millions of hypothesis patterns that were never possible with the traditional manual process. AutoML 2.0 platforms that provide for automated discovery and engineering of data \u2018features\u2019 will be used to provide more clarity, transparency and insights as businesses realize that data features are not just suited for predictive analytics, but can also provide invaluable insights into past trends, events and information that adds value to the business by allowing businesses to discover the \u2018unknown unknowns,\u2019 trends and data patterns that are important, but that no one had suspected would be true.\u201d<\/p>\n<p>Stay tuned for our next batch of 2021 predictions, on <strong>advanced analytics<\/strong>.<\/p>\n<p><strong>Related Items:<\/strong><\/p>\n<p><a href=\"https:\/\/www.datanami.com\/2020\/12\/18\/2020-a-big-data-year-in-review\/\">2020: A Big Data Year in Review<\/a><\/p>\n<p><a href=\"https:\/\/www.datanami.com\/2019\/12\/30\/20-ai-predictions-for-2020\/\">20 AI Predictions for 2020<\/a><\/p>\n<p><a 
href=\"https:\/\/www.datanami.com\/2019\/12\/23\/big-data-predictions-what-2020-will-bring\/\">Big Data Predictions: What 2020 Will Bring<\/a><\/p>\n<p><span class=\"et_social_bottom_trigger\"><\/span> <\/p>\n<p>Published at Wed, 23 Dec 2020 22:41:15 +0000<\/p>\n<p><a href=\"https:\/\/www.google.com\/url?rct=j&#038;sa=t&#038;url=https:\/\/www.datanami.com\/2020\/12\/23\/2021-predictions-data-science\/&#038;ct=ga&#038;cd=CAIyHDkyYmU1MGQ5NjY1NjYxZTA6Y28udWs6ZW46R0I&#038;usg=AFQjCNGEvFK8ageedXzGxeUyybwNrPQqJQ\">2021 Predictions: Data Science<\/a><\/p>\n<p><div class=\"post-thumbnail\">\n<img data-recalc-dims=\"1\" decoding=\"async\" width=\"300\" height=\"200\" data-src=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/v6aG8U.jpg?resize=300%2C200&#038;ssl=1\" class=\"attachment-medium size-medium wp-post-image lazyload\" alt data-srcset=\"https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/v6aG8U.jpg 300w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/12\/2021_ball_shutterstock_winyuu-768x512.jpg 768w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/12\/2021_ball_shutterstock_winyuu-200x133.jpg 200w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/12\/2021_ball_shutterstock_winyuu-100x67.jpg 100w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/12\/2021_ball_shutterstock_winyuu-120x80.jpg 120w, https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/v6aG8U.jpg 1000w\" data-sizes=\"auto, (max-width: 300px) 100vw, 300px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 300px; --smush-placeholder-aspect-ratio: 300\/200;\"> <\/p>\n<p class=\"caption\">(winyuu\/Shutterstock)<\/p>\n<\/div>\n<p>It\u2019s that time of year again \u2013 time for predictions! 
Thank you for patiently waiting while <em>Datanami<\/em> compiled 2021 predictions from the assorted predicters. We\u2019ll kick things off with predictions about a most pertinent topic: data science.<\/p>\n<p>If there\u2019s one thing that the <strong>COVID-19 pandemic<\/strong> in 2020 made clear, it\u2019s that organizations are relying on data more than ever before. To get the most out of that data, shops are going to need to increase their spending on data science, argues <a href=\"http:\/\/www.dominodatalab.com\">Domino Data Lab<\/a> CEO Nick Elprin.<\/p>\n<p>\u201cOrganizations are making dramatic budget cuts in many areas in an effort to overcome the effects of COVID-19 and keep their business viable,\u201d Elprin says. \u201cYet, in 2021 we predict that many will sustain or actually increase their investment in data science to help drive the critical business decisions that may literally make the difference between survival and liquidation.\u201d<\/p>\n<p>You will see more people with the title of <strong>chief data scientist<\/strong> (CDS), says Ira Cohen, who is the co-founder and (naturally) CDS at <a href=\"http:\/\/www.anodot.com\">Anodot<\/a>. In fact, Cohen says that, by 2022, 90% of large global companies will have a CDS in place. CDSs will also allocate their time differently in 2021. \u201cFifty percent will be more focused on value creation and revenue generation while 28% will focus on cost savings and 22% on risk mitigation,\u201d he says.<\/p>\n<p>Josh Patterson, the senior director of RAPIDS engineering at <a href=\"https:\/\/www.nvidia.com\">Nvidia<\/a>, says 2021 will bring <strong>empowerment<\/strong> to data scientists.<\/p>\n<p>\u201cFor too long, enterprise data scientists have been relegated to sampling data or only pre-production development. People with titles such as data engineer and machine learning engineer are the ones who scale workflows into production, often translating code from Python to Java,\u201d Patterson says. 
In 2021, \u201cdata scientists will be able to process massive amounts of data quickly, drastically reducing the need to have code translators.\u201d<\/p>\n<p>Alan Jacobson, chief data and analytics officer at <a href=\"http:\/\/www.alteryx.com\">Alteryx<\/a>, is bullish on the potential to <strong>upskill data analysts<\/strong> into full-blown data scientists.<\/p>\n<div id=\"attachment_37729\" class=\"wp-caption alignright\" readability=\"32\"><a href=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/R2floZ.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" aria-describedby=\"caption-attachment-37729\" class=\"wp-image-37729 size-medium lazyload\" data-src=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/R2floZ.jpg?resize=300%2C158&#038;ssl=1\" alt width=\"300\" height=\"158\" data-srcset=\"https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/R2floZ.jpg 300w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/11\/shutterstock_covid_ab12-768x405.jpg 768w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/11\/shutterstock_covid_ab12-200x105.jpg 200w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/11\/shutterstock_covid_ab12-100x53.jpg 100w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2020\/11\/shutterstock_covid_ab12-120x63.jpg 120w, https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/R2floZ.jpg 1000w\" data-sizes=\"auto, (max-width: 300px) 100vw, 300px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 300px; --smush-placeholder-aspect-ratio: 300\/158;\"><\/a><\/p>\n<p id=\"caption-attachment-37729\" class=\"wp-caption-text\"><em>COVID-19 put data science to the test in 2020<\/em><\/p>\n<\/div>\n<p>\u201cWhile it is always important for companies 
to offer training to employees, the fields of data science and digital transformation are challenging companies to break the mold and deliver new and constantly evolving&nbsp; ways to upskill and deliver ROI\u201d Jacobson says. \u201cData science has evolved to the point where people don\u2019t need to go back to college to learn. They\u2019ll learn on the job or while at home by encountering new tools and technologies. And with a huge shortage of those with analytic skills, many will start new jobs and careers based on the new skills.\u201d<\/p>\n<p><strong>Tracking changes<\/strong> in data generated by SaaS-based business applications will be the feedstock for more intelligent AI and machine learning, says Joe Gaska, CEO of <a href=\"http:\/\/www.grax.com\">GRAX<\/a>.<\/p>\n<p>\u201cOrganizations with a focus on artificial intelligence and machine learning will continue to hunger for meaningful training datasets that can be fed into their ML algorithms to spot cause-and-effect change patterns over time,\u201d Gaska says. \u201cTo do this, they will turn to their ever-changing datasets in 3rd party cloud\/SaaS applications as inputs into these algorithms. This will create pressure for them to capture and ingest every single change in that data over time into their DataOps ecosystem.\u201d<\/p>\n<p>Rachel Roumeliotis, vice president of AI and data content at <a href=\"https:\/\/www.oreilly.com\/\">O\u2019Reilly Media<\/a>, says machine learning operations, or MLOps, will be important in 2021, as organizations look to <strong>connect the last mile<\/strong> in data science.<\/p>\n<p>\u201cML presents a problem for CI\/CD for several reasons,\u201d she writes. \u201cThe data that powers ML applications is as important as code, making version control difficult; outputs are probabilistic rather than deterministic, making testing difficult; training a model is processor intensive and time consuming, making rapid build\/deploy cycles difficult. 
None of these problems are unsolvable, but developing solutions will require substantial effort over the coming years.\u201d<\/p>\n<div id=\"attachment_16871\" class=\"wp-caption alignleft\" readability=\"34\"><a href=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/HpPDN4.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" aria-describedby=\"caption-attachment-16871\" class=\"wp-image-16871 lazyload\" data-src=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/HpPDN4.jpg?resize=322%2C188&#038;ssl=1\" alt width=\"322\" height=\"188\" data-srcset=\"https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/HpPDN4.jpg 300w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2017\/07\/data-scientist_shutterstock_Sergey-Nivens-768x448.jpg 768w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2017\/07\/data-scientist_shutterstock_Sergey-Nivens-200x117.jpg 200w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2017\/07\/data-scientist_shutterstock_Sergey-Nivens-100x58.jpg 100w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2017\/07\/data-scientist_shutterstock_Sergey-Nivens-120x70.jpg 120w, https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/HpPDN4.jpg 1000w\" data-sizes=\"auto, (max-width: 322px) 100vw, 322px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 322px; --smush-placeholder-aspect-ratio: 322\/188;\"><\/a><\/p>\n<p id=\"caption-attachment-16871\" class=\"wp-caption-text\"><em>Data scientists will be in high demand in 2021, if prognostications are correct (Sergey Nivens\/Shutterstock)<\/em><\/p>\n<\/div>\n<p>Instead of spending time and money building machine learning systems, in 2021, organizations will take a big step forward in terms of <strong>using ML 
systems<\/strong>, says Clemens Mewald, director of product management for machine learning and data science at <a href=\"https:\/\/www.databricks.com\">Databricks<\/a>.<\/p>\n<p>\u201cIn the future, we\u2019ll see enterprise customers moving away from building their own machine learning platforms, recognizing that it\u2019s not their core competency,\u201d Mewald says. \u201cThey\u2019ll realize that more value comes from applying ML to business problems versus spending the time to build and maintain the tools themselves.\u201d<\/p>\n<p>We\u2019re still a ways from having intelligent AI robots walking among us. But the conjunction of neuroscience and data science is a<strong> rich playground<\/strong> for new ideas, says Biju Dominic, the chief evangelist at <a href=\"http:\/\/www.fractal.ai\">Fractal Analytics<\/a> and chairman at FinalMile Consulting.<\/p>\n<p>\u201cAs AI makes rapid strides into unsupervised learning, one-shot learning, and artificial general intelligence, the field will seek inspiration and validation from system neuroscience and computational neuroscience,\u201d Dominic says. 
\u201cThe interaction between the fields of AI and neuroscience will help the rapid growth of both these fields of knowledge.\u201d<\/p>\n<p>Here is a very specific data science prediction from James Bednar, senior manager of technical consulting at <a href=\"http:\/\/www.anaconda.com\/\">Anaconda<\/a>: Python data visualization <strong>libraries will synch up<\/strong>.<\/p>\n<div id=\"attachment_19849\" class=\"wp-caption alignright\" readability=\"32\"><a href=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/PiBd8a.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" aria-describedby=\"caption-attachment-19849\" class=\"size-medium wp-image-19849 lazyload\" data-src=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/PiBd8a.jpg?resize=300%2C171&#038;ssl=1\" alt width=\"300\" height=\"171\" data-srcset=\"https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/PiBd8a.jpg 300w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/05\/data_scientist_brain_shutterstock_kmlmtz66-768x437.jpg 768w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/05\/data_scientist_brain_shutterstock_kmlmtz66-1024x583.jpg 1024w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/05\/data_scientist_brain_shutterstock_kmlmtz66-200x114.jpg 200w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/05\/data_scientist_brain_shutterstock_kmlmtz66-100x57.jpg 100w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/05\/data_scientist_brain_shutterstock_kmlmtz66-120x68.jpg 120w, https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/PiBd8a.jpg 1823w\" data-sizes=\"auto, (max-width: 300px) 100vw, 300px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 300px; 
--smush-placeholder-aspect-ratio: 300\/171;\"><\/a><\/p>\n<p id=\"caption-attachment-19849\" class=\"wp-caption-text\"><em>Does neuroscience hold answers for data scientists? (kmlmtz66\/Shutterstock)<\/em><\/p>\n<\/div>\n<p>\u201cWe\u2019re finally starting to see Python data visualization libraries work together, and this work will continue in 2021,\u201d Bednar says. \u201cPython has had some really great visualization libraries for years, but there has been a lot of variety and confusion that make it difficult for users to choose appropriate tools. Developers at many different organizations have been working to integrate Anaconda-developed capabilities like Datashader\u2019s server-side big data rendering and HoloViews\u2019 linked brushing into a wide variety of plotting libraries, making more power available to a wider user base and reducing duplication of efforts. Ongoing work will further aid this synchronization in 2021 and beyond.\u201d<\/p>\n<p>Which is more valuable: <strong>metadata or the data<\/strong> itself? The answer from Petteri Vainikka, the president of product marketing for <a href=\"http:\/\/www.cognite.com\">Cognite<\/a>, may surprise you.<\/p>\n<p>\u201cAs the cost and value of data storage continues to gravitate towards zero, and data science teams simultaneously scrambling to convert their existing data warehouse and data lakes into business value, the mountain of evidence pointing to \u2018no correlation\u2019 between volume and value of data keeps growing,\u201d Vainikka says. \u201cWhether through manual tagging of images, AI-driven data set matching to uncover data relationships, or OCR\/NLP methods to convert unstructured data into structured data, the focus and value of metadata will exceed that of the data itself. 
Data contextualization will be at the centre of metadata curation.\u201d<\/p>\n<p><strong>Big problems<\/strong> require big tools to solve, and that will be how data science differentiates itself in 2021, says Alicia Frame, lead data science product manager at <a href=\"http:\/\/www.neo4j.com\/\">Neo4j<\/a>.<\/p>\n<div id=\"attachment_19528\" class=\"wp-caption alignleft\" readability=\"32\"><a href=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/R9xToi.jpg?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" aria-describedby=\"caption-attachment-19528\" class=\"size-medium wp-image-19528 lazyload\" data-src=\"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/R9xToi.jpg?resize=300%2C180&#038;ssl=1\" alt width=\"300\" height=\"180\" data-srcset=\"https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/R9xToi.jpg 300w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/04\/big-data_shutterstock_606840716_700x420-200x120.jpg 200w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/04\/big-data_shutterstock_606840716_700x420-100x60.jpg 100w, https:\/\/2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com\/wp-content\/uploads\/2018\/04\/big-data_shutterstock_606840716_700x420-120x72.jpg 120w, https:\/\/techclot.com\/wp-content\/uploads\/2020\/12\/R9xToi.jpg 700w\" data-sizes=\"auto, (max-width: 300px) 100vw, 300px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 300px; --smush-placeholder-aspect-ratio: 300\/180;\"><\/a><\/p>\n<p id=\"caption-attachment-19528\" class=\"wp-caption-text\"><em>Data or metadata: Which is more important to data science? 
(whiteMocca\/Shutterstock)<\/em><\/p>\n<\/div>\n<p>\u201cWith the computation power to crunch data getting cheaper and easier to access, and serverless technology making it easier to develop and deploy code, we\u2019ll see data scientists getting back to focusing on the basics: solving big problems more effectively than anyone else,\u201d she says.<\/p>\n<p>Among data science practitioners, there will be a strong emphasis on the possibilities of <strong>feature engineering<\/strong> in 2021, predicts Ryohei Fujimaki, Ph.D., founder and CEO of <a href=\"http:\/\/www.dotdata.com\/\">dotData<\/a>.<\/p>\n<p>\u201cWhile predictions are one of the most valuable outcomes, AI and ML must produce actionable insights beyond predictions that businesses can consume,\u201d Fujimaki says. \u201cAutoML 2.0 automates hypothesis generation (a.k.a. feature engineering) and explores thousands or even millions of hypothesis patterns that were never possible with the traditional manual process. AutoML 2.0 platforms that provide for automated discovery and engineering of data \u2018features\u2019 will be used to provide more clarity, transparency and insights as businesses realize that data features are not just suited for predictive analytics, but can also provide invaluable insights into past trends, events and information that adds value to the business by allowing businesses to discover the \u2018unknown unknowns,\u2019 trends and data patterns that are important, but that no one had suspected would be true.\u201d<\/p>\n<p>Stay tuned for our next batch of 2021 predictions, on <strong>advanced analytics<\/strong>.<\/p>\n<p><strong>Related Items:<\/strong><\/p>\n<p><a href=\"https:\/\/www.datanami.com\/2020\/12\/18\/2020-a-big-data-year-in-review\/\">2020: A Big Data Year in Review<\/a><\/p>\n<p><a href=\"https:\/\/www.datanami.com\/2019\/12\/30\/20-ai-predictions-for-2020\/\">20 AI Predictions for 2020<\/a><\/p>\n<p><a 
href=\"https:\/\/www.datanami.com\/2019\/12\/23\/big-data-predictions-what-2020-will-bring\/\">Big Data Predictions: What 2020 Will Bring<\/a><\/p>\n<p>Published at Wed, 23 Dec 2020 22:41:15 +0000<\/p>\n","protected":false},"excerpt":{"rendered":"<p>2021 Predictions: Data Science (winyuu\/Shutterstock) It\u2019s that time of year again \u2013 time for predictions!&#8230;<\/p>\n","protected":false},"author":3,"featured_media":4339,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[3],"tags":[],"class_list":["post-4344","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence"],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/techclot.com\/wp-content\/uploads\/2020\/12\/v6aG8U.jpg?fit=300%2C200&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p3orZX-184","jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/techclot.com\/index.php\/wp-json\/wp\/v2\/posts\/4344","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techclot.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techclot.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techclot.com\/index.php\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/techclot.com\/index.php\/wp-json\/wp\/v2\/comments?post=4344"}],"version-history":[{"count":0,"href":"https:\/\/techclot.com\/index.php\/wp-json\/wp\/v2\/posts\/4344\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techclot.com\/index.php\/wp-json\/wp\/v2\/media\/4339"}],"wp:attachment":[{"href":"h
ttps:\/\/techclot.com\/index.php\/wp-json\/wp\/v2\/media?parent=4344"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techclot.com\/index.php\/wp-json\/wp\/v2\/categories?post=4344"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techclot.com\/index.php\/wp-json\/wp\/v2\/tags?post=4344"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}