DevOps Industry Updates #25
25 issues just like that! What started as an update segment at my Meetup has turned into something bigger. Thanks to all my loyal readers out there, I’ve heard you loud and clear: you love the rich technical content and lack of vendor spam. True to that brand, issue #25 is loaded with only the most impactful stories and has a theme of debugging systems at scale, covering gRPC, the JVM and PostgreSQL. Ready to open a bunch of new tabs? Here we go:
🔥 Top Cream
This issue’s top 4 stories:
- Why load balancing gRPC is tricky?
- Tricks of the Trade: Tuning JVM Memory for Large-scale Services
- Debugging random slow writes in PostgreSQL
- The Full Story of the Stunning RSA Hack Can Finally Be Told
🌎 Society
- Long working hours are a killer, WHO study shows: working long hours is killing hundreds of thousands of people a year in a worsening trend that may accelerate further due to the COVID-19 pandemic, the World Health Organization said.
- Vamp is joining CircleCI!: “together we will combine continuous integration, continuous deployment, release orchestration and continuous validation, and bring it to the next level. Our unified platform will enable software developers, engineers, DevOps teams, and business stakeholders alike to ‘shift-right’, and deliver better software even faster and more frequently.”
- The Next Step after DevOps and GitOps Is Cloud Engineering, Pulumi Says: if we are going to treat infrastructure as code, shouldn’t infrastructure engineers have access to the same tools that make software engineers productive and even the same languages? That’s the theory behind Pulumi, which has just released version 3 of its open source platform.
📟 DevOps
- Tricks of the Trade: Tuning JVM Memory for Large-scale Services: Uber’s growth over the last few years exponentially increased both the volume of data and the associated access loads required to process it, resulting in much more memory consumption from services. Increased memory consumption exposed a variety of issues, including long garbage collection (GC) pauses, memory corruption, out-of-memory (OOM) exceptions, and memory leaks.
- Debugging random slow writes in PostgreSQL: takes you through the journey and shows some tools & processes that can help you dig into SQL performance issues.
- Extreme HTTP Performance Tuning: “this post will walk you through the performance tuning steps that I took to serve 1.2 million JSON ‘API’ requests per second from a 4 vCPU AWS EC2 instance.”
- Ansible 4.0.0 final has been released!: this version of Ansible is based on Ansible Core 2.11 which is a new major update of the ansible-core package. It may contain backwards incompatible changes to the playbook language and command line programs - see the porting guide for more information.
- Thundering herds, noisy neighbours, and retry storms: a list of operational patterns that every DevOps engineer should know.
- GitOps Con 2021 recordings: learn about GitOps until you’re blue in the face.
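One common mitigation for the retry storms mentioned in the list above is exponential backoff with jitter, so that clients that failed together do not all retry together. The sketch below is illustrative only and is not taken from the linked article: the function names `backoff_delays` and `call_with_retries`, and the default `base`, `cap` and `attempts` values, are assumptions made for this example.

```python
import random
import time


def backoff_delays(base=0.1, cap=5.0, attempts=4, rng=random.random):
    """Yield one "full jitter" delay per retry.

    Each delay is drawn uniformly from [0, min(cap, base * 2**n)),
    so the upper bound doubles per retry until it reaches the cap.
    """
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng() * ceiling


def call_with_retries(operation, attempts=5, sleep=time.sleep, base=0.1, cap=5.0):
    """Call operation() up to `attempts` times, sleeping a jittered delay between tries."""
    delays = list(backoff_delays(base, cap, attempts - 1))
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:  # a real client would retry only errors known to be transient
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(delays[attempt])
```

The randomness is the important part: with a fixed delay, every client that failed at the same moment would come back at the same moment and hit the recovering service as one herd.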
🛠️ DevOps Tools
- Sublime Text 4: new features include a redesigned UI, Apple Silicon support, tab multi-select, context-aware auto complete, GPU rendering and much more!
- terraform-docs: generate documentation from Terraform modules in various output formats.
☸️ Kubernetes
- Implementing Kubernetes: The Hidden Part of the Iceberg: a story about personal and team challenges when implementing a production-grade fleet of Kubernetes clusters at GumGum.
- Kubernetes capacity planning: How to rightsize your cluster: don’t be greedy! Learn how to identify unused resources and how to rightsize the capacity of your Kubernetes clusters.
- Service Mesh Wars, Goodbye Istio: after using Istio in production for almost 2 years, we’re saying goodbye to it. Learn why, as well as the current state of the Service Mesh Wars.
- Introducing PodTopologySpread: managing Pods distribution across a cluster is hard. There is a common need to distribute Pods evenly across topologies, so as to achieve better cluster utilization and high availability of applications. The PodTopologySpread scheduling plugin (originally proposed as EvenPodsSpread) was designed to fill that need.
- Database migrations on Kubernetes using Helm hooks: leverage the pre-install and pre-upgrade Helm hooks to run database migrations before your application is installed or updated. Keep reading to understand why simpler solutions might not be the best idea and a couple of gotchas when using Helm hooks.
- Using Finalizers to Control Deletion: deleting objects in Kubernetes can be challenging. You may think you’ve deleted something, only to find it still persists. While issuing a kubectl delete command and hoping for the best might work for day-to-day operations, understanding how Kubernetes delete commands operate will help you understand why some objects linger after deletion.
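To make the finalizer item above a little more concrete, here is a toy model of why an object can linger after a delete request. It is a simplified sketch and not Kubernetes code: the `Resource` and `Store` classes are invented for illustration, and the `deletion_requested` flag stands in for the `metadata.deletionTimestamp` field that the API server sets.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    finalizers: list = field(default_factory=list)
    deletion_requested: bool = False  # stands in for metadata.deletionTimestamp


class Store:
    """Toy stand-in for the API server's object storage."""

    def __init__(self):
        self.objects = {}

    def create(self, resource):
        self.objects[resource.name] = resource

    def delete(self, name):
        """Mark the object for deletion; it disappears only if no finalizers remain."""
        self.objects[name].deletion_requested = True
        self._collect(name)

    def remove_finalizer(self, name, finalizer):
        """What a controller does once its cleanup work has finished."""
        self.objects[name].finalizers.remove(finalizer)
        self._collect(name)

    def _collect(self, name):
        resource = self.objects[name]
        if resource.deletion_requested and not resource.finalizers:
            del self.objects[name]


store = Store()
store.create(Resource("demo", finalizers=["example.com/cleanup"]))
store.delete("demo")
print("demo" in store.objects)   # still present: a finalizer is pending
store.remove_finalizer("demo", "example.com/cleanup")
print("demo" in store.objects)   # gone once the finalizer list is empty
```

In this model, as in a real cluster, the delete request succeeds immediately but the object stays visible until every finalizer has been removed by whichever controller owns it.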
🔐 Security
- The Full Story of the Stunning RSA Hack Can Finally Be Told: in 2011, Chinese spies stole the crown jewels of cybersecurity—stripping protections from firms and government agencies worldwide. Here’s how it happened.
- Vulnerabilities in billions of Wi-Fi devices let hackers bypass firewalls: dubbed the FragAttacks, they allow people within radio range to inject frames of their choice into networks protected by WPA-based encryption.
- Over 40 Apps With More Than 100 Million Installs Found Leaking AWS Keys: most mobile app users tend to blindly trust that the apps they download from app stores are safe and secure. But that isn’t always the case.
💻 Programming
- An early look at Postgres 14: Performance and Monitoring Improvements
- encode/httpx: next generation HTTP client for Python.
📖 Machine Learning
- MUM: A new AI milestone for understanding information: Google’s Multitask Unified Model is trained across 75 different languages and can understand many different tasks at once, allowing it to develop a more comprehensive understanding of information and world knowledge than previous models. MUM is multimodal, so it understands information across text and images and, in the future, can expand to more modalities like video and audio.
🐧 Linux
- Error handling in Bash scripts: let your Bash script help you find its errors with error handling.
🚢 Leadership
- Naming names in incident writeups: an interesting challenge to blameless postmortems.
☁️ Cloud
- Why load balancing gRPC is tricky?: using a binary protocol with structured data as the communication medium among services is indeed attractive, but there are some considerations when using gRPC, most important of all is how to handle load balancing.
- Service-Oriented vs. Monolith: most teams do choose the microservices path since that’s the “industry standard” these days. However, monolithic designs still have their use and space, especially at an early stage of an idea or a product.
- The Architecture of Uber’s API gateway: Uber developed a feature-rich API gateway that is capable of complex operations on the incoming and outgoing data payload across multiple protocols. This article takes a deeper dive into the technical components of Uber’s custom API gateway system.
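A rough way to see why gRPC load balancing is tricky: gRPC runs over HTTP/2, which multiplexes many requests over one long-lived connection, so a balancer that chooses a backend per connection can leave some backends idle. The simulation below is a simplified sketch rather than anything from the linked article; the function names, the backend labels and the request counts are all made up for illustration.

```python
from collections import Counter
from itertools import cycle


def connection_level(backends, clients, requests_per_client):
    """Pick a backend once per connection; every request on it follows."""
    picker = cycle(backends)
    load = Counter({backend: 0 for backend in backends})
    for _ in range(clients):
        backend = next(picker)  # chosen when the connection is opened
        load[backend] += requests_per_client
    return load


def request_level(backends, clients, requests_per_client):
    """Pick a backend for every request (client-side or L7 balancing)."""
    picker = cycle(backends)
    load = Counter({backend: 0 for backend in backends})
    for _ in range(clients * requests_per_client):
        load[next(picker)] += 1
    return load


backends = ["a", "b", "c"]
# Two long-lived client connections, 300 requests each.
print(dict(connection_level(backends, 2, 300)))  # {'a': 300, 'b': 300, 'c': 0}
print(dict(request_level(backends, 2, 300)))     # {'a': 200, 'b': 200, 'c': 200}
```

With only two connections, the per-connection strategy never touches the third backend, while the per-request strategy spreads the same traffic evenly. This is why gRPC deployments tend to rely on request-aware proxies or client-side balancing.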
AWS
- Introducing CloudFront Functions: built for lightweight HTTP(S) transformations and manipulations, allowing you to deliver richer, more personalized content with low latency to your customers.
- Amazon VPC Announces Pricing Change for VPC Peering: starting May 1st 2021, all data transfer over a VPC Peering connection that stays within an Availability Zone (AZ) is now free. All data transfer over a VPC Peering connection that crosses Availability Zones will continue to be charged at the standard in-region data transfer rates.
- Amazon EKS managed node groups adds support for Kubernetes node taints
- Introducing Incident Manager from AWS Systems Manager: a new capability of AWS Systems Manager that enables faster resolution of critical application availability and performance issues. Incident Manager helps you prepare for incidents with automated response plans that bring the right people and information together.
- Amazon EC2 Auto Scaling Introduces Predictive Scaling as a Native Scaling Policy: proactively scale out your ASG to be ready for upcoming demand, avoiding the need to over-provision capacity, resulting in lower EC2 cost while ensuring your application’s responsiveness.
- Amazon EKS and EKS Distro now support Kubernetes version 1.20: the 1.20 release includes RuntimeClass and Process ID Limits reaching stable status, API Priority and Fairness being enabled by default, and kubectl debug reaching beta status.
Azure
- Azure Static Web Apps goes GA: a serverless web app hosting service for static web apps.
Article version: 1.0.1