Data Engineer

Job Description

Diesel Labs is looking for a data engineer to build and manage the pipelines that power our analytics products. This is a unique opportunity to join a rapidly growing company that guides key decisions in today’s ever-changing media landscape.

The ideal candidate is excited about applying modern techniques to manage complex data systems. This person will be involved in every aspect of the data pipeline, from collection and categorization to ensuring data quality, scalability, reliability and reusability. The role is responsible for developing, constructing, testing and maintaining data architectures for our data platform, databases and analytical/reporting systems.

About Diesel Labs

Diesel Labs is a Content Intelligence company that measures audience attention across the entire entertainment landscape to help answer the toughest questions facing media companies today: what to make, where to market and how to measure success. We analyze millions of audience members’ engagements with content on all the major social and video platforms, including (but not limited to) Twitter, YouTube, TikTok, Facebook and Instagram. Our data solutions provide a comprehensive layer of insight that media companies depend on when making key content and marketing decisions to build audience engagement and grow subscriptions.

Responsibilities

– Partner with the business, product and data science teams to automate processes to create data sets for analytical and reporting needs

– Build data pipelines to assemble large, complex sets of data that meet functional and non-functional business requirements

– Work with big data technologies, including Hadoop, Hive and Spark

– Design, build and improve scalable and resilient ETL pipelines and integrate with cloud-native data warehouses (Redshift) and relational or NoSQL databases

– Manage multiple layers of SQL processing to move data from raw and staging tables to production BI views, including optimizations that provide failsafes and fast response times for users

– Manage best practices and standards for data quality, scalability, reliability and reusability

– Perform data integrity checks, deployments and releases

– Debug production issues across data platform services
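As a rough illustration of the layered SQL processing described above, the sketch below walks data from a raw landing table through a deduplicated staging table to an aggregated BI view, with a basic integrity check. The schema and table names are hypothetical, and SQLite stands in for the Redshift warehouse the role actually targets:

```python
import sqlite3

# Minimal sketch of raw -> staging -> production SQL layering.
# Hypothetical schema; the real pipeline would run against Redshift.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Raw layer: events land as-is, possibly with duplicates and nulls.
cur.execute("CREATE TABLE raw_engagements (event_id TEXT, platform TEXT, views INTEGER)")
cur.executemany(
    "INSERT INTO raw_engagements VALUES (?, ?, ?)",
    [("e1", "youtube", 100), ("e1", "youtube", 100), ("e2", "tiktok", None)],
)

# Staging layer: deduplicate and enforce basic data-quality rules.
cur.execute("""
    CREATE TABLE staging_engagements AS
    SELECT DISTINCT event_id, platform, views
    FROM raw_engagements
    WHERE views IS NOT NULL
""")

# Production layer: an aggregated BI view in a query-efficient shape.
cur.execute("""
    CREATE VIEW bi_platform_views AS
    SELECT platform, SUM(views) AS total_views
    FROM staging_engagements
    GROUP BY platform
""")

# Integrity check: staging must never hold more rows than raw.
raw_count = cur.execute("SELECT COUNT(*) FROM raw_engagements").fetchone()[0]
staging_count = cur.execute("SELECT COUNT(*) FROM staging_engagements").fetchone()[0]
assert staging_count <= raw_count
```

In practice each layer would be a scheduled, idempotent step (e.g. an Airflow task), with the integrity checks gating promotion to the production views.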

Qualifications

– 3-5+ years of experience with data modeling and Python development

– Proactive and strategic, able to understand abstract requirements, analyze data, discover opportunities, address gaps and communicate to multiple individuals and stakeholders

– Experience with SQL, scripting languages, data transformations, statistical analysis and troubleshooting across multiple database platforms (PostgreSQL, Redshift, etc.)

– Software development experience with Apache Airflow, Spark and DynamoDB

– Deep knowledge of at least one compiled language (e.g., Scala, C++, Java, Go)

– Experience with cloud platforms (Amazon Web Services (AWS) preferred), including designing and building solutions with services such as EC2, S3, EMR, Kinesis, RDS, Redshift/Spectrum, Lambda, Glue, Athena and API Gateway

– Experience building high-performance and scalable applications

– Applicants in the US must be 18 years or older with unrestricted work authorization (this role does not offer visa sponsorship)

The Diesel Engineering Team Environment

– Opportunity to be among the first engineers at an early stage startup

– Scala on the server side and JavaScript (React) on the frontend; open to the best technology for the problem

– Deploy to AWS using Docker containers and CloudFormation

– Data and machine learning pipelines are built on Spark

– Remote team

– As we grow, we’re looking to build a culture of autonomy where engineers are encouraged to own problems end to end, develop specializations and share ownership

Benefits

– Compensation package includes competitive base salary and meaningful equity

– Full healthcare benefits

– Company 401k

– Unlimited PTO
