Principal Data Engineer
The Principal Data Engineer will own all aspects of engineering, including technical design, architecture, implementation, quality assurance, deployment, and operations. You will be responsible for scaling our machine learning pipeline, from requirements through architecture, design, and development. You will establish how we build a highly available, scalable, distributed, and robust system that uses modern cloud computing paradigms, techniques, and tools.
Responsibilities:
Help scale large-scale machine learning models
Own and drive processing of tens of petabytes of unstructured and structured data
Provide leadership to the data science and engineering teams on big data processing
Enable machine learning systems to become more real-time, both in decision-making and in large-scale data ingestion
Work with the latest open source technology on highly distributed, scalable products
Requirements:
Master's degree in Computer Science or a related field
Experience working with petabyte-scale, real-time datasets
Must have built applications in the past (at the early or mid-career stage)
Experience with multiple large-scale distributed systems or data platforms, including Spark, Flink, Kafka, Dataflow, BigQuery, BigTable, Dataproc, etc.
Experience with Scala and/or Java+Python
Experience building large-scale crawlers using Nutch, Gora, MapReduce, HBase, Elasticsearch, etc.
Strong algorithm and data structure knowledge
Excellent communication skills and the ability to work well in a team
Email Akshar: adave (at) demandbase (dot) com
Apply here: https://grnh.se/6mv5ixon1