MLB Advanced Media (MLBAM), the interactive and mobile company of Major League Baseball, is the developer of several award-winning products, apps and services. These include MLB Gameday, MLB.TV, Fantasy Baseball, Beat the Streak, and MLB At Bat, the #1 sports app and a Hall of Fame inductee for iPhone, iPad and Macworld. We are also the industry leader in the development and distribution of live streaming video events across multiple platforms.
Core Engineering is a distributed team that owns internal tools used to deploy the services that make up MLBAM’s products. Built for AWS with a variety of open source software, our tools are used by dozens of engineering teams across the company. We strive to act as a productivity multiplier by offering our customers rich primitives for delivering their services, allowing them to focus more on product.
System Reliability Engineers fulfill a cross-functional role by driving the delivery of services through to production. Within Core Engineering, you will help design and operate services to support exponential growth in MLBAM’s product and partner portfolios. You’ll also collaborate with other engineers to pave the way for the future of infrastructure in AWS, moving beyond traditional practices. You should have a passion for systems engineering, monitoring and observability, and automation.
This position can be worked remotely, or from our locations in NYC, SF, and Raleigh.
Responsibilities
– Maintain and improve the reliability and operability of services
– Design systems to enable rapid development, high availability, and clear observability
– Write tools and leverage open source to automate tasks, with an emphasis on safety and repeatability
– Troubleshoot and resolve performance and reliability issues across the stack, including cloud resources
– Collaborate with engineers to ensure services are designed to be cloud-native, scalable, and easily operated
Requirements
– BS or MS degree in Computer Science, or equivalent experience
– 3+ years of experience writing software on, or operating, *nix platforms
– You’re an independent self-learner with excellent problem-solving skills
– You care deeply about code craftsmanship and operational excellence
– You have strong written and verbal communication skills
Nice to have, but not required
– Experience with software containers (e.g. Docker, rkt, runC) and schedulers (e.g. ECS, Kubernetes, Nomad)
– You’ve directly impacted the reliability and availability of large-scale distributed systems
– Deep understanding of networking, especially routing and the IP stack
– You’ve deployed and operated geographically distributed, redundant services
– Engagement with open source communities
Technologies we love
– Languages: Go, Ruby, Bash
– Tools: Docker, Git, Graphite, GraphQL, Jenkins, Logstash, Packer, Puppet, Sensu
– Data stores: DynamoDB, Elasticsearch, PostgreSQL, Redis