The Data Engineer is responsible for building and maintaining data stores and data processing pipeline components in physical, virtual, and cloud-based environments. The Data Engineer will coordinate closely with developers, operations engineers, team leads, the VP of Architecture, and the CTO to keep data flowing, to ensure the proper tools are used for new development, to report on metrics related to uptime and throughput of the various stores and pipelines, and to plan secure and reliable future data migrations to cloud-based infrastructure.
Work with a variety of database technologies on a day-to-day basis, including PostgreSQL, MongoDB, Riak, Cassandra, DynamoDB, InfluxDB, Elasticsearch, Hadoop/Hive/Pig, and Redis
Perform routine maintenance on production database systems
Provision new and manage existing Amazon RDS database instances, keeping a keen eye on resource utilization and optimizing instance sizing, storage capacity, and performance
Create and manage ETL tasks from production OLTP databases for data warehousing and the population of new standalone microservices
Enhance our existing database analysis and monitoring infrastructure
Create and modify PostgreSQL stored procedures written in PL/pgSQL and PL/Python
Help identify and implement optimizations to improve performance, increase structural consistency, and reduce on-disk size of existing databases
Work with engineering team members to review DDL, perform production database modifications, write stored procedures, and help optimize queries
Participate in the on-call rotation for all database-related systems
Work with existing database-related Python, Perl, and PHP code to improve performance or fix incorrect behavior
Routinely evaluate logs, looking for problematic and abnormal queries
Run ad hoc queries and fulfill data analysis requests as needed
The ideal candidate...
Has 2-5 years of experience as a database administrator
Has worked with PostgreSQL in a production environment
Is very familiar with multiple dialects of SQL
Has worked with a testing framework like pgTAP for testing DDL, including stored procedures
Is familiar with open source and has contributed to or created multiple open source projects
Works well with others
Has a strong desire to learn and be mentored, yet is opinionated about systems, tools, and how data should be structured
Has a working knowledge of writing maintainable Python code
Can program in more than one programming language
Has operational familiarity with GNU/Linux in a production environment
Is open to new technology choices but tempers such choices responsibly
Understands data normalization and common data structures (bitmaps, graphs, trees, etc.), including how their use can affect performance positively or negatively
Is passionate about database technologies
100% Company Paid premiums for medical, dental, and vision insurance (including domestic partner benefits).
Company Paid Short Term Disability Insurance.
Company Paid Life Insurance.
Fully Reimbursed Gym Memberships.
Paid time off.
Shuttle to/from local SEPTA station.
401(k) retirement benefits with company match.
Free lunches every day.
Break rooms stocked with soda, juices, coffee and teas.
MacBook Pro laptops and 27" monitors.
Multiple high definition theater rooms fully equipped with Xbox, Wii & Blu-Ray players.
A game room with competition billiard, foosball, and ping-pong tables.