Engineering Decisions

Documentation

Link | Notes
Design Docs at Google | Anatomy of a good design doc
Architecture decision record (ADR) | An architectural decision record (ADR) is a document that captures an important architectural decision, along with its context and consequences
What is the best way to write a PRD? |
  • Anatomy of a good PRD (product requirements document), a document that tells you exactly what you are building
  • Plus, PRD examples from top companies
Scaling Engineering Teams via RFCs: Writing Things Down | The power of writing things down and spreading knowledge across the organization
Technical Writing Courses for Engineers from Google |
S.P.A.D.E. Toolkit: How to implement Square's famous decision-making framework | A decision-making framework built on accountability and clarity as an alternative to consensus, in which the person responsible for executing the decision is the one who decides

Cloud

Link | Notes
[AWS] App-Layer Encryption in AWS |
[AWS] Network access for private clusters | Very interesting article on the problem of providing network connectivity between Kubernetes clusters and other internal tools (such as deployment pipelines)
[AWS] [Square] Adopting AWS VPC Endpoints at Square | Secure communication between data centers and the cloud
[AWS] [Square] Providing mTLS Identities to Lambdas | Write-up on how Square added support for mutual TLS calls from AWS Lambda into their data center
[AWS] Cloud Encryption is worthless! Click here to see why... | When evaluating your cloud security posture priorities, encryption should be at the bottom of your list. First, get your IAM house in order
[AWS] Building the Next Evolution of Cloud Networks at Slack | How Slack's AWS infrastructure evolved from a few hand-built EC2 instances to thousands of instances provisioned across multiple AWS regions

Infrastructure

Link | Notes
Automating Our Infrastructure to Empower Engineers |
  • Syncing Dev Environments
  • Mirroring Dev and Prod Environments
  • Developing Locally
  • Deploying to Production
Why We Leverage Multi-tenancy in Uber's Microservice Architecture |
  • Testing in production: make the current production stack multi-tenant and allow both test and production traffic to flow through it
  • Canary deployments: a canary can be treated as yet another tenant in a multi-tenant architecture
  • Capture/replay and shadow traffic: replaying previously captured live traffic, or a shadow copy of live production traffic, in a hermetically safe environment is another use case for multi-tenancy
Container technologies at Coinbase: Why Kubernetes is not part of our stack | Container technologies also create a large set of challenges that must be overcome to prevent failures
How we use HashiCorp Nomad | Reliability model of services running in our more than 200 edge cities worldwide
[Uber] Introducing Domain-Oriented Microservice Architecture |
Design Considerations at the Edge of the ServiceMesh | Set of design patterns for inbound and outbound traffic to and from a service mesh