
We Don’t Need Data Engineers, We Need Better Tools for Data Scientists

ItemDate=2021-07-14 00:08:25 Status=publish


#Discussion(IoTStack) [ via IoTGroup ]

These articles focus on the number of available job positions for the title of “Data Engineer” vs. “Data Scientist”. Let’s put aside the fact that the hiring managers who post these positions often don’t know the difference between the two jobs and use the titles interchangeably (or use whatever is in style at the moment). The question then becomes: Is the surplus of available Data Engineer positions solely a personnel problem?

Data Science is messy because it reflects the real world

Data Scientists are domain experts (on top of knowing statistics), and they often don’t have a strong background in programming. I’ve seen this expertise discounted in multiple Twitter and forum threads, with software engineers and other “technical people” asking questions like “Why don’t they just learn Spark?”. This mentality completely misses the fact that Data Scientists can already do what they want to do at smaller scales with their existing tools. Data Scientists want to gain insights, not worry about building elegant pipelines.

Popular Data Science tools are also criticized by more technical people and academics: “Why would anyone use pandas?”. pandas must be the most popular tool to hate among people who have no use for it. It is loved (or at least appreciated), however, by the Data Scientists who use it daily. pandas, among other tools, was built to handle the messiness of the real world. If pandas is so bad, why has nothing unseated it as the standard dataframe for Python Data Science?

Data Engineers have to handle the messiness that scalable tools can’t

The scalable systems (e.g. Apache Spark) that are robust enough for production use can’t handle the messiness of the real world as-is. It’s difficult to scale without clean and simple assumptions, and the messier the problem, the harder it is to scale. Data Engineers handle the messiness because scalable tools can’t. Scaling with messiness is extremely difficult. Data Engineers handle the
