Overview

MLB Advanced Media (MLBAM), the interactive and mobile company of Major League Baseball, is the developer of several award-winning products, apps and services. Some of these include: MLB Gameday, MLB.TV, Fantasy Baseball, Beat the Streak, and the #1 sports app and a Hall of Fame inductee for iPhone, iPad and Macworld – MLB At Bat! We are also the industry leader in the development and distribution of live streaming video events over multiple platforms.

Description

Core Engineering is a distributed team that owns internal tools used to deploy the services that make up MLBAM’s products. Built for AWS with a variety of open source software, our tools are used by dozens of engineering teams across the company. We strive to act as a productivity multiplier by offering our customers rich primitives for delivering their services, allowing them to focus more on product.

System Reliability Engineers fulfill a cross-functional role by driving the delivery of services through to production. Within Core Engineering, you will help design and operate services to support exponential growth in MLBAM’s product and partner portfolios. You’ll also collaborate with other engineers to pave the way for the future of infrastructure in AWS, moving beyond traditional practices. You should have a passion for systems engineering, monitoring & observability, and automation.

This position can be worked remotely, or from our locations in NYC, SF, and Raleigh.

Responsibilities

– Maintain and improve the reliability and operability of services
– Design systems to enable rapid development, high availability, and clear observability
– Write tools, and leverage open source, to automate tasks with an emphasis on safety and repeatability
– Troubleshoot and resolve performance and reliability issues across the stack, including cloud resources
– Collaborate with engineers to ensure services are designed to be cloud-native, scalable, and easily operated

Requirements

– BS or MS degree in Computer Science, or equivalent experience
– 3+ years of experience writing software on, or operating, *nix platforms
– You’re a self-learner, independent, and have excellent problem-solving skills
– You care deeply about code craftsmanship and operational excellence
– You have strong written and verbal communication skills

Nice to have, but not required

– Experience with software containers (e.g. Docker, rkt, runC) and schedulers (e.g. ECS, Kubernetes, Nomad)
– You’ve directly impacted the reliability and availability of large-scale distributed systems
– Deep understanding of networking, especially routing and the IP stack
– You’ve deployed and operated geographically distributed, redundant services
– Engagement with open source communities

Technologies we love

– Languages: Go, Ruby, Bash
– Tools: Docker, Git, Graphite, GraphQL, Jenkins, Logstash, Packer, Puppet, Sensu
– Data stores: DynamoDB, Elasticsearch, PostgreSQL, Redis