Must-read books and papers for developers
These are the essentials; any first-year junior developer should have read them:
- Thomas & Hunt, The Pragmatic Programmer: Probably the first book anyone should read. Lots of solid advice on how to become a professional software developer who gets stuff done.
- Martin, Clean Code: Writing code so that not only the compiler understands it. Essential practices that will shape your own style and required reading in many development organizations.
- Bloch, Effective Java: If you're programming Java, you have to read this book. Best practices on how to use the language by Josh Bloch, who designed many features in modern Java.
Pick and choose depending on what you need and/or are interested in:
- Gamma et al., Design Patterns: Elements of Reusable Object-Oriented Software: An old book, but still the definitive reference on design patterns (when first reading it, pick the top 10 patterns; don't read it cover to cover). Alternatively, you can read Head First Design Patterns, which is easier, twice as long, and doesn't cover as many patterns.
- Goetz, Java Concurrency in Practice: Anyone who builds Java services needs a basic understanding of concurrency. If you haven't read the book, you will most likely have already introduced concurrency bugs.
- Nygard, Release It! Design and Deploy Production-Ready Software: Best practices for delivering robust software that doesn't fail in the production environment.
- Fowler, UML Distilled: A Brief Guide to the Standard Object Modeling Language: Learn how to use UML in a pragmatic way. This very short book covers everything you ever need to know on the subject.
- Fowler, Patterns of Enterprise Application Architecture
- Grigorik, High Performance Browser Networking: This online book covers networking in general (not just browser-based) and gives you all the basics you'd typically learn in a two-semester networking course.
- Evans, Domain-Driven Design: Tackling Complexity in the Heart of Software: How to model your business domain in alignment with your business stakeholders. One of the most influential books in software design to date.
- Hohpe & Woolf, Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions: Enterprise Integration Patterns provides an invaluable catalog of sixty-five patterns, with real-world solutions that demonstrate the formidable power of messaging and help you to design effective messaging solutions for your enterprise.
- Serverless Handbook: a resource teaching frontend engineers everything they need to know to dive into backend
- Hands-on Scala Programming: Hands-on Scala teaches you how to use the Scala programming language in a practical, project-based fashion
- Git Book: the entire Pro Git book
- The Outstanding Developer: boost your soft skills to become a better developer
- Scala from Scratch
- OAuth 2.0 Simplified: a guide to building an OAuth 2.0 server. Through high-level overviews, step-by-step instructions, and real-world examples, you will learn how to take advantage of the OAuth 2.0 framework while building a secure API.
- Dev Concepts: a 12-volume e-book collection explaining every concept of Software Development
- Getting Real: A must-read for anyone building a web app.
- Versioning in an Event Sourced System
Many of the following papers can also be found at Papers We Love.
- Carl Hewitt: Actor Model of Computation: The Actor model is a mathematical theory that treats “Actors” as the universal primitives of concurrent digital computation.
- Carl Hewitt, Peter Bishop, Richard Steiger: A Universal Modular Actor Formalism for Artificial Intelligence: This paper proposes a modular ACTOR architecture and definitional method for artificial intelligence that is conceptually based on a single kind of object: actors.
- Ling, Ma et al.: Explaining AlphaGo: Interpreting Contextual Effects in Neural Networks: Google's AlphaGo
- Superhuman AI for multiplayer poker: Poker AI
- Dynamo: Amazon's highly available key-value store: This paper presents the design and implementation of Dynamo, a highly available key-value storage system that some of Amazon's core services use to provide an "always-on" experience.
- Eugene W. Myers: An O(ND) Difference Algorithm and Its Variations: The problems of finding a longest common subsequence of two sequences A and B and a shortest edit script for transforming A into B have long been known to be dual problems.
- Keil: Efficient Bounded Jaro-Winkler Similarity Based Search
- Large-scale cluster management at Google with Borg: Google's Borg system is a cluster manager that runs hundreds of thousands of jobs, from many thousands of different applications, across a number of clusters each with up to tens of thousands of machines.
- The Google File System: a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients.
- Bigtable: A Distributed Storage System for Structured Data: Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers.
- Dapper, a Large-Scale Distributed Systems Tracing Infrastructure: Modern Internet services are often implemented as complex, large-scale distributed systems. These applications are constructed from collections of software modules that may be developed by different teams, perhaps in different programming languages, and could span many thousands of machines across multiple physical facilities. Tools that aid in understanding system behavior and reasoning about performance issues are invaluable in such an environment.
- Spanner: Spanner is Google’s scalable, multiversion, globally distributed, and synchronously replicated database.
- Why Google Stores Billions of Lines of Code in a Single Repository: Google's monolithic repository provides a common source of truth for tens of thousands of developers around the world.
- Spanner, TrueTime and the CAP Theorem: Spanner is Google's highly available global-scale distributed database. It provides strong consistency for all transactions. This combination of availability and consistency over the wide area is generally considered impossible due to the CAP Theorem. We show how Spanner achieves this combination and why it is consistent with CAP. We also explore the role that TrueTime, Google's globally synchronized clock, plays in consistency for reads and especially for snapshots that enable consistent and repeatable analytics.
- MapReduce: MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key.
- Borg, Omega, and Kubernetes: Lessons learned from three container-management systems over a decade
- Paxos Made Simple: Paxos algorithm for implementing a fault-tolerant distributed system
- The Part-Time Parliament: Paxos algorithm for implementing a fault-tolerant distributed system
- Fallacies of distributed computing: a set of assertions made by L. Peter Deutsch and others at Sun Microsystems describing false assumptions that programmers new to distributed applications invariably make.
- The Eight Fallacies of Distributed Computing
- CQRS
- Sagas Pattern
- The Limit of Sagas
- Scrum Guide
- There is no Now
- CUPID—for joyful coding
- arXiv.org: arXiv is a free distribution service and an open-access archive for scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. Materials on this site are not peer-reviewed by arXiv.
- Google Research
- Resources on Software Architecture
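
As a small illustration of the MapReduce entry above, here is a minimal single-process sketch in Python of the programming model it describes: a user-supplied map function emits intermediate key/value pairs, and a reduce function merges all values that share a key. The names `map_reduce`, `word_count_map`, and `word_count_reduce` are hypothetical and chosen for this example only; the real system distributes both phases across many machines, which this sketch does not attempt.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Run a map phase and a reduce phase in a single process (illustrative only)."""
    # Map phase: each input (key, value) pair may emit any number of
    # intermediate (key, value) pairs, which are grouped by intermediate key.
    intermediate = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            intermediate[out_key].append(out_value)

    # Reduce phase: merge all intermediate values that share the same key.
    return {key: reduce_fn(key, values) for key, values in intermediate.items()}


def word_count_map(doc_name, text):
    # Emit (word, 1) for every word in the document.
    for word in text.lower().split():
        yield word, 1


def word_count_reduce(word, counts):
    # Sum the partial counts collected for one word.
    return sum(counts)


# Hypothetical input: (document name, document text) pairs.
documents = [("a.txt", "the quick fox"), ("b.txt", "the lazy dog the end")]
counts = map_reduce(documents, word_count_map, word_count_reduce)
```

Word counting is the classic introductory use of this model; swapping in different map and reduce functions changes the computation without touching `map_reduce` itself.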