99.99% Uptime Goal for 2024
During late Q3, or perhaps early Q4, my company introduced us to its new uptime goal of “Five Nines,” which written as a number is 99.999%. This implies that the company would tolerate only 5.26 minutes of downtime all year. Considering that we have over 300 locations across the country to manage, it’s what I would call a stretch goal. As it stands, I am not even sure we have a way to measure that metric, although it is refreshing to have something to work towards as a team.
Since I love what I do for work, I’d like to practice my engineering craft and implement an uptime/availability goal here on my own web server. As I am only a one-man operation with limited availability, I do not think Five Nines is feasible. Instead, I will opt for 99.99%.
Availability is generally calculated based on how long a service was unavailable over some period. Assuming no planned downtime, Table 1-1 indicates how much downtime is permitted to reach a given availability level.
Percent | Year | Quarter | Month | Week | Day | Hour |
99.95 | 4.38 hours | 1.08 hours | 21.6 minutes | 5.04 minutes | 43.2 seconds | 1.8 seconds |
99.99 | 52.6 minutes | 12.96 minutes | 4.32 minutes | 60.5 seconds | 8.64 seconds | 0.36 seconds |
99.999 | 5.26 minutes | 1.3 minutes | 25.9 seconds | 6.05 seconds | 0.86 seconds | 0.04 seconds |
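The figures in the table can be reproduced with a few lines of Python. This is only a rough sketch: the period lengths (a 365-day year, a 90-day quarter, a 30-day month) are my assumption, chosen because they match the table, and the function names are made up for illustration.

```python
# Downtime budget for an availability target, assuming no planned downtime.
# Period lengths are assumptions that match the table above:
# 365-day year, 90-day quarter, 30-day month.
PERIOD_SECONDS = {
    "year": 365 * 24 * 3600,
    "quarter": 90 * 24 * 3600,
    "month": 30 * 24 * 3600,
    "week": 7 * 24 * 3600,
    "day": 24 * 3600,
    "hour": 3600,
}


def allowed_downtime_seconds(availability_percent: float, period: str) -> float:
    """Seconds of downtime permitted in `period` at the given target."""
    unavailable_fraction = (100 - availability_percent) / 100
    return PERIOD_SECONDS[period] * unavailable_fraction


def measured_availability_percent(downtime_seconds: float, period: str) -> float:
    """Availability actually achieved, given observed downtime in `period`."""
    total = PERIOD_SECONDS[period]
    return 100 * (total - downtime_seconds) / total


if __name__ == "__main__":
    for target in (99.95, 99.99, 99.999):
        budget = {p: allowed_downtime_seconds(target, p) for p in PERIOD_SECONDS}
        row = " | ".join(f"{p}: {s:.2f} s" for p, s in budget.items())
        print(f"{target}% -> {row}")
```

The second function runs the same arithmetic in reverse: feed it the downtime actually observed over a period and it reports the availability achieved, which is the number to compare against the goal.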
Architecture and Design
On December 25th, 2023, I rebuilt this web server to meet one of my favorite principles: keep it simple. Moving from WordPress to serving plain HTML5/CSS3 greatly reduces the complexity of the system. This removes the backend database requirement and also reduces resource overhead. I had learned that WordPress (being PHP-based) builds each page dynamically every time a visitor requests it. This means that the server does more CPU-intensive work to serve the same content. It also implies that a large spike in normal traffic would have a higher potential to DoS the backend server.
To further reduce the likelihood of a DoS event, I’ll need a Content Delivery Network. During my not-so-long-ago WordPress days, I would have used QUIC.Cloud. They have a plugin that integrates fantastically with WordPress and the backend OpenLiteSpeed webserver. However, I’ve learned that QUIC.Cloud struggles to effectively cache plain HTML. I could go with Cloudflare and their free tier, but it does not have the SLO/SLA that I am looking to achieve. I’m also not willing to spend $30/month for their CDN. Instead, I have opted to use Bunny.net. They have about 40 more PoPs (Points of Presence) than QUIC.Cloud, which should reduce latency to my website in some parts of the world. The big selling points were their documentation and the ability to integrate painlessly with basic HTML5.
Another important note is the location of my webserver. While it is generally okay to host a website from home, I will not. Instead, this webserver lives in the Akamai (formerly Linode) data center in Fremont, California. This removes my need to worry about redundant cooling, electricity, and networks. It also allows me to scale my server both vertically and horizontally as my needs change. Inside this Fremont data center, I am also performing rolling backups. This way, even if the server is broken beyond repair for whatever reason, I can roll back and restore to a known good state in about 10 minutes.
Observability
Now that we are comfortable with our hardware and operational software stack, we need to properly monitor the underlying services. Thinking it through, I’ll need decent visibility into:
- Resource utilization.
- Error rates.
- SSL certificate status.
- Webserver latency.
- CDN latency.
New Relic will be my primary monitoring solution. Using their locally installed agent, I can monitor resource utilization, error rates, and backend latency. New Relic can also alert me by email and through a personal PagerDuty account in the event of a full-scale outage. The Linode Cloud Manager will act as my secondary alerting method, covering unusually high resource utilization sustained over an extended period of time.
Uptime Kuma will be used to monitor SSL certificate status, webserver latency, and CDN latency. It will also alert me to outages via my personal PagerDuty account. This Uptime Kuma instance runs within Oracle Cloud Infrastructure, in a region separate from both myself and the webserver. By leveraging this third geographical point of monitoring, I can collect and analyze data to find latency issues and areas of improvement that would otherwise have been hidden from me.
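To make those checks a bit more concrete, here is a minimal standalone sketch of a certificate-expiry check and a latency probe using only the Python standard library. It is not how Uptime Kuma works internally; the function names, the `example.com` hostname, and the timeout values are placeholders I picked for illustration.

```python
import socket
import ssl
import time
import urllib.request
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days between `now` and a certificate's notAfter timestamp.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 5.0) -> str:
    """Open a TLS connection and return the certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def measure_latency(url: str, timeout: float = 5.0) -> float:
    """Seconds taken to fetch `url` and read the full response body."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


if __name__ == "__main__":
    host = "example.com"  # placeholder hostname, not my real site
    print(f"certificate days left: {days_until_expiry(fetch_not_after(host)):.1f}")
    print(f"latency: {measure_latency(f'https://{host}/') * 1000:.0f} ms")
```

Pointing `measure_latency` at the origin server and at the CDN hostname separately would give the two latency numbers listed above, and running it from a third location is what makes the comparison meaningful.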
Conclusion
2024 is here, and for the first time in its existence this website/project finally has a practical goal. Hope to see you later in the year!