The Data Engineer is responsible for building and maintaining data stores and data processing pipeline components in physical, virtual, and cloud-based environments. The Data Engineer will work closely with developers, operations engineers, team leads, the VP of Architecture, and the CTO to keep data flowing, ensure the proper tools are used for new development, report on uptime and throughput metrics for the various stores and pipelines, and plan secure, reliable data migrations to cloud-based infrastructure.
- Work with a variety of database technologies on a day-to-day basis, including PostgreSQL, MongoDB, Riak, Cassandra, DynamoDB, InfluxDB, Elasticsearch, Hadoop/Hive/Pig, and Redis
- Perform routine maintenance on production database systems
- Provision new and manage existing Amazon RDS database instances, keeping a close eye on resource utilization and optimizing instance sizing, storage capacity, and performance
- Create and manage ETL tasks from production OLTP databases for data warehousing and the population of new standalone microservices
- Enhance our existing database analysis and monitoring infrastructure
- Create and modify PostgreSQL stored procedures written in PL/pgSQL and PL/Python
- Help identify and implement optimizations to improve performance, increase structural consistency, and reduce on-disk size of existing databases
- Work with engineering team members to review DDL, perform production database modifications, write stored procedures, and help optimize queries
- Participate in the on-call rotation for all database-related systems
- Work with existing database-related Python, Perl, and PHP code to improve performance or fix behaviors
- Routinely review logs for problematic and abnormal queries
- Run ad hoc queries and fulfill data analysis requests as needed
The ideal candidate...
- Has 2-5 years of experience as a database administrator
- Has worked with PostgreSQL in a production environment
- Is very familiar with multiple dialects of SQL
- Has worked with a testing framework like pgTAP for testing DDL, including stored procedures
- Is familiar with open source and has contributed to or created multiple open source projects
- Works well with others
- Has a strong desire to learn and be mentored, yet is opinionated about systems, tools, and how data should be structured
- Has a working knowledge of writing maintainable Python code
- Can program in more than one programming language
- Has operational familiarity with GNU/Linux in a production environment
- Is open to new technology choices but tempers such choices responsibly
- Understands data normalization and common data structures (bitmaps, graphs, trees, etc.), including how their use can affect performance positively and negatively
- Is passionate about database technologies
Benefits:
- 100% company-paid premiums for medical, dental, and vision insurance (including domestic partner benefits).
- Company-paid short-term disability insurance.
- Company-paid life insurance.
- Tuition reimbursement.
- Fully reimbursed gym memberships.
- Paid time off.
- Paid holidays.
- Shuttle to/from local SEPTA station.
- 401(k) retirement benefits with company match.
- Profit sharing.
- Free lunches every day.
- Break rooms stocked with soda, juices, coffee, and teas.
- MacBook Pro laptops and 27" monitors.
- Multiple high-definition theater rooms fully equipped with Xbox, Wii, and Blu-ray players.
- A game room with competition billiard, foosball, and ping-pong tables.
**Relocation Assistance Available**
**Visa Sponsorship Available**