CMP.jobs
The Lead Data Engineer must have a broad and deep data skillset as well as people management experience.
In addition to being a hands-on contributor, this role is responsible for setting the technical direction of the team.
Along with being a strong technical expert, this role requires a natural leader adept at earning the trust and respect of the team.
Responsibilities include:
Develop people through coaching, mentoring, and management support
Ensure your team members are happy by providing them with career development guidance on both hard skills and soft ones. Help them exploit their strengths, handle their weaknesses, and make sure they are properly equipped to perform their jobs as well as they can
Hold regular one-on-ones, conduct performance evaluations, and manage the hiring process as well as other administrative tasks for your team
Create a culture on your team where anyone can speak up and share his or her concerns when it's in the interest of the mission
You and your direct reports hold each other accountable for the commitments you both make
Grow the technical expertise of your team on performance, scalability, and maintainable architecture
Participate in technology architecture discussions for product development
Translate the business requirements into strategy and make sure it is clearly communicated to all team members
Advocate for software best practices within your team as well as across engineering
Work closely with dev teams, product owners, and stakeholders to ensure we deliver innovative products to our users
Make technical decisions based on business needs without sacrificing product quality, integrity, and security
Work on unique and interesting data challenges around architecting, building, and managing pipelines that securely process hundreds of terabytes of data
Work closely with analysts and statisticians to ensure the validity of our processes

Required Skills
Our engineers are expected to wear a number of hats and have the opportunity to touch all parts of the stack.
Our stack includes Apache Spark, Scala, Redshift and an ever-growing list of many other cool technologies.
Who you are
You care about developing people as a team leader
Experience wrangling terabytes of big, complicated, imperfect data
Practical experience with Apache Spark
Experience with AWS products (Redshift, EMR, S3, IAM, RDS, etc.)
Basic understanding of statistics and experience with statistical packages such as R, MATLAB, SPSS, etc.
You have a deep understanding of scalable systems and you have large-scale engineering experience in an Agile development environment
Bachelor's degree in Computer Science or a related field (or 4 additional years of relevant work experience)
A strong understanding of data structures, algorithms, and effective software design
Significant development experience with a major modern language (e.g. Java, Scala, Python, Ruby, C/C++)
Significant experience working with structured and unstructured data at scale and comfort with a variety of different stores (key-value, document, columnar, etc.) as well as traditional RDBMSes and data warehouses
Experience with or interest in AWS Glue, Redshift Spectrum, and any other tools that enable data querying at scale
Experience writing unit and functional tests
Comfort with version control systems (e.g. Git, SVN)
Excellent verbal and written communication skills; must work well in an agile, collaborative team environment

Valuable Skills
Master's in Computer Science or a related field
Practical experience with supervised machine learning techniques
Strong background with test-driven development