Johnnei’s world

How Agoda Handles Load Shedding in Private Cloud

2024-06-05T00:00:00+00:00

Please read the article over at https://medium.com/agoda-engineering/load-shedding-private-cloud-first-81ddd5ab53ac

How Agoda Transitioned to Private Cloud

2022-11-17T00:00:00+00:00

Please read the article over at https://medium.com/agoda-engineering/private-cloud-and-you-736d8d99a51e

Nexus on ARM with Java 11

2022-04-08T15:00:00+00:00

Nexus Repository Manager is one of the popular options to handle artifact storage in the JVM world. As my condo didn’t have the most modern connection, (Who the hell built decided telephone jacks were sufficient?!) I aimed to use Nexus as a local Maven Central mirror. Thus, as I run everything on a bunch of Raspberry PIs now I grabbed a new PI, thought of a new Sword Art Online character name, and deployed Nexus 3.

This is where the story becomes less fun. So I start Nexus, grabbed some water, and it was still starting. Looking at the logs it didn’t seem to do much, but the CPU was pegged. So surely it’s still progressing. Now let’s be clear, I run my PIs pretty funky. Most of my data storage is on an NFS server, and I haven’t seen the best performance with it. Not sure whether I’m just missing some configuration tweaks or if I just expected too much from RAID 10 with spinning rust. Nonetheless, I was quite stunned that it took almost an hour to start Nexus. Clearly, something was up here, but whatever. If it’s just the start-up, I can live with it.

Sadly, this is not where the sad tales stopped. My connection was only 60 mbit/s, so the bar to beat wasn’t all that high. I took my local Maven settings, configured Nexus as the Maven Central mirror, cleared my cache, and tried to compile one of projects. CPU went up, downloads were very slow, but it worked. Most importantly, on the second try it was faster than pulling it from the internet again. Still, why did Nexus spike to 100% CPU on user time. I expected my NFS setup to be the bottleneck here, not the PI CPU.

One issue with Nexus is that it’s quite an old project using a modular runtime, which tends to mean upgrading beyond JRE 8 is a major hassle. Java 9’s major change was the (in)famous Project Jigsaw. Strong isolation of Java Modules, disallowing access to internal modules by default, sprinkled with the moving of a few Java EE specs out of the JDK and into their own artifacts. This is not why I care about running Nexus on a newer JRE though. Both Java 9 and Java 11 had a few improvements for AARCH64/ARMv8, so potentially this better platform integration could improve performance.

Of to google it is! Let’s see if Nexus supports deploying on Java 11. No. The answer is No: NEXUS-19183 (It finally seems to gain some traction though). But I’m a guy who likes suffering and doing stupid things. There were comments in the JVM configuration telling me to enable a few flags to adjust some Java 9 Module settings. So, why wouldn’t it work?

A word of warning here, this is all me tinkering with the Nexus deployment. This is in no way a supported or endorsed way of running Nexus. I’m a very light user and don’t really keep important data in there. If you want to mimic this, be sure to not store data you care about in the deployment. With that out of the way, let’s start tinkering.

Boy did they trick me good. So I think these comments are just some defaults from whatever packaging plugins they use. As their start-up script specifically checks that you’re running JRE 8, and nothing newer. I can change a bash script though, so off to the races it was. Note: I wrote this post a “while” after I actually did this, at the time Nexus 3.37.0-01 was current. Let’s modify the start-up script to allow Java 11+ as well.

--- nexus       2021-11-20 00:40:48.000000000 +0700
+++ nexus       2022-04-07 22:05:53.324782442 +0700
@@ -158,7 +158,9 @@
     return;
   fi
   if [ "$ver_major" -gt "1" ]; then
-    return;
+    if [ "$ver_major" -lt "11" ]; then
+      return;
+    fi
   elif [ "$ver_major" -eq "1" ]; then
     if [ "$ver_minor" -gt "8" ]; then
       return;

I had a bunch of issues with the script’s very clever JRE lookup. So I ended up uninstalling JRE 8 while trying to get Nexus to pick up JRE 11.

Let’s also uncomment all those --add-reads, --add-opens, --add-exports and --patch-module (expect for org.apache.karaf.specs.locator-4.3.2.jar, I’ll come back to that one). This is where they wasted another good amount of my time. The sharp-eyed may have noticed that some flags have an = and some don’t. Even for the same flags. Which doesn’t work. Funnily the error Unrecognized option: --patch-module sounds like I’m running on JRE 8 still. But I removed it. There’s only a JRE 11 available. After pulling my hair out and reading the JEPs around Jigsaw, I realized that all of them need to have an = to work… Thanks. With JVM creation errors out of the way, I could see what errors cropped up during startup. After a few iterations of starting and looking at the module path errors, I ended up adding the following lines:

--add-exports=java.base/org.apache.karaf.specs.locator=java.xml,ALL-UNNAMED
--add-exports=java.base/javax.activation=java.datatransfer,ALL-UNNAMED
--add-reads=java.base=java.datatransfer
--add-opens=java.base/java.security=ALL-UNNAMED

As instructed we’ll also need to comment out the endorsed dirs. The endorsed lib mechanism is no longer through a folder with random JARs that override the JDK. With the module path we’re expected to use --patch-module to specifically alter a module. After a bit of toying around, I found that the default patch for java.base seems incomplete. The classes from locator seem to need activation classes. Now, this is a bit of a pickle as we can only apply a single patch per module.

Luckily the jar format is a fairly simple format: It’s just a zip file with “stored” files (ie, no compression). The only issue might be if we have duplicate files in the META-INF folders between the 2 jars that we want to merge. So, let’s try it and see what happens.

pushd lib/endorsed/
mkdir ultra-patch
cp org.apache.karaf.specs.locator-4.3.2.jar ultra-patch/
cp ../boot/activation-1.1.jar ultra-patch/
pushd ultra-patch/
unzip -n org.apache.karaf.specs.locator-4.3.2.jar
unzip -n activation-1.1.jar
zip -0 -r base-patch.jar * -x org.apache.karaf.specs.locator-4.3.2.jar -x activation-1.1.jar
popd
popd

Now that we got a jar that could maybe patch our java.base, let’s configure it:

--patch-module=java.base=lib/endorsed/ultra-patch/base-patch.jar

And with that, I got a functioning Nexus running on JRE 11. The results are quite stunning though. Startup time? 2 minutes… A “minor” improvement here. Downloading artifacts? Much faster too. Let’s compare.

Overall a very good result. But as this is all quite hacky it’s kinda a barrier to upgrade to newer versions. Will it still work? Probably but downgrading Nexus isn’t supported, so it’s a nice way to get into a broken state. If you’re running Nexus on a Raspberry PI, I can highly recommend trying this as well though. Night and day difference in performance. Even the web UI is a lot more responsive.

Btw I run Arch. On ARM.

2021-08-08T05:00:00+00:00

Ever since I rebuilt my system, I decided it was time to go all-in on Linux. I don’t game as much as I used to while programming more, so what is Windows still doing for me? So I went ahead and changed my OS drive into a dual boot drive with ArchLinux as the default OS.

As a downside, I expected to regularly switch back to Windows whenever I wanted to play a game. However, I’m pleasantly surprised with Steam’s Proton compatibility layer making most games I like, run near-seamless on Linux.

As I mentioned in the previous post, I’m now running my services on a PI 4 cluster on the 2GB and 4GB ram models. Boy did I underestimate how much memory server applications like nowadays. Some applications like GitLab (4GB), ElasticSearch (4GB) and SonarQube (2GB) are some of the applications that are severely constrained in performance because I had to limit them in memory usage. Initially, I even ran GitLab and ElasticSearch on the same 4GB PI. GitLab’s nightly backup would push the memory demand too far and cause the OOM-killer to shoot the ElasticSearch cluster.

To alleviate the memory pressure on Yui (The node running GitLab and ElasticSearch), I configured a second 4GB PI as secondary ElasticSearch node (On 64-bit ARM this time!). But as ElasticSearch on Yui was heavily constrained in memory to share its place with GitLab, it actually ran out of memory while trying to replicate the data to the new ElasticSearch node on Agil. To get it done I had to shut down some of GitLab’s components in order to free up extra memory for ElasticSearch to replicate the data.

But let’s point out another thing. Using a package manager is super convenient on Linux. That is if the packages you need are in there. As I used ArchLinux on ARM, I could only use whatever they packaged for me. Many of the things I run, were not part of these. So my dabbling into Arch’s build system began. For some packages, I could create packages using the published ARM binaries. Yet for other packages, I had to compile from source as they did not support ARM out of the box or only provided Raspbian packages. To make it even more fun, the ArchLinux image for ARM was still 32-bit (ARMv7) even though the PI 4 is 64-bit (ARMv8) So once I started with the second ElasticSearch node, I had to learn about cross-compiling to different architectures. This impacted packages built from the source the most. As depending on the language’s compiler cross-compiling is set up slightly different.

Sawatdee Khrap!

2020-04-11T01:00:00+00:00

Let’s start on a grim note as the world is currently not the happiest of places. I hope everyone stays healthy and that’ll be able to return to our usual lifestyles as soon as possible. I certainly can’t wait to go back to the office and interact with people in person again.

By now, I’ve settled down into my condominium unit and passed my probation period. So I started working on restoring my usual services. In The Netherlands, I used to have quite a capable server. An Intel Xeon E3 something v3 with 32GB ram and an HBA to extend the storage capabilities of the motherboard. When I moved to Thailand, I only packed the HBA as it’s quite a costly part, and it was relatively rare in the usual consumer part shops.

So at this point, I had to decide. How would I rebuild my server? With the number of services I ran on it, I could do with some more cores and ram. Because I like pain, I decided that having a server park is more fun than having a single beefy machine doing everything. That is how I came to my current set up. One ‘small’ machine that would re-use the HBA and provide data storage as NFS. With the compute power coming from a bunch of Raspberry PI 4s. With the newly released PI 4 series, it became feasible. The biggest concern I had with the PI 3B+ was it’s mear 1GB of memory. With the four series allowing up to 4GB all of my services should be fine running on that, except for maybe the Kubernetes cluster. This decision did not come without its drawbacks, but that is a story for another time.

My current calculations said I’d need about 8 PIs to run all the services again. So far, I’ve only miscalculated on the PI which was supposed to Jira and SonarQube. Jira is surprisingly more memory-hungry than I anticipated. It caused me to shift SonarQube to a dedicated PI. When I rolled out SonarQube, that decision was the right one. To allow for full-text search SonarQube employs ElasticSearch, which takes quite a chunk of memory. And SonarQube eats quite some as well across the various modules it starts. By now, I only need to configure the second PI, which will act as a GitLab CI runner.

Farewell Netherlands

2019-10-18T18:55:00+00:00

As some might have read in the about page, I have the intention to move to Thailand. I’m happy to tell you that I’ve accepted a job in Thailand and will be flying out this year. So with that news, I’ve switched gears into preparing to be ready to leave once my visa comes around. So over the last few weeks, I’ve spent very little time on my pet projects as I have to spend as much time as I can on preparing to leave. Which I can tell you now, you’re never as ready as you thought. It’s been a few tiring weeks, and I still have quite some to come until it’s time to fly out.

I’m very excited to move out. But at the same time, the reality of leaving my home country behind is starting to hit. Also, the time to see everyone for the last time is quickly reducing. I’m still considering in which area of Bangkok I want to be living. But with the current busyness, I have little time to think about it. One choice I had was if I’m going to ship a container there or pack some bags. I decided to pack a few bags. So I won’t be able to carry over my server hardware. So now you might be wondering, then where is this hosted?

My flight sim tooling is hosted in the Google Cloud. As my credit card will be revoked in November (as it is provided by my current job), I have to shut down that project down until I have my thai bank account. Meaning I can’t use those resources to host my small blog thingy. I needed a cloud provider who accepts credit-based deployment. After a little searching, I found that Digital Ocean accepts prepayments through PayPal. So I’ve asked a friend for a referral link and charged some credits on the account.

I’ve created a simple droplet which will host as a web server through nginx, a quick Lets Encrypt setup and the flight sim data collector. As these few services will be the surviving services, I’ve named the droplet “Noah’s Ark”. Once I’m settled in Thailand, I’ll be restoring the usual services. Likely with worse availability (although my current provider is taking a piss lately too) and lower bandwidth. But that will likely be either near the end of the year or in the first two months of 2020.

Farewell Comodo, Welcome Let’s Encrypt

2019-08-25T12:00:00+00:00

Recently, I received a notification telling me my SSL certificate is expiring near the end of the month. Thus I have a choice: shell out 90 euros again (wildcard certificate for *johnnei.org) or switch to Let’s Encrypt’s free ones. If you check your browser, you’ll see I’ve chosen the latter.

Now my setup is quite… funky, let’s call it that. At the core, my setup sounds simple. I use Nginx as a reverse proxy for my domains and subdomains. Some are proxied to the PHP FPM daemon, a few to a dedicated application and the rest to Kubernetes. Now you might wonder, why is it difficult? You may have noticed I said domains. I also host johnnei.io on the same host. So even though I have a wildcard certificate, I don’t know which one to serve out until I know which host is requested. Thankfully there is Server Name Indication (SNI) to support this use case. SNI isn’t available on all devices, but anything made after the Windows XP age should be fine. That is more than enough for a personal bloggy thing.

So now you might be thinking: This sounds fairly standard, what’s the problem then? For my ‘development domain’ I didn’t purchase a Comodo certificate. For one subdomain, I did want an SSL connection to an application deployed in Kubernetes. So along with cert-manager, I configured it to retrieve a certificate. As I already handle the SSL-termination in Nginx, I had to extract that certificate from the cluster. As those certificates change quite regularly, I wrote a small cronjob to extract the certificate nightly. So that works fine again, but that’s not all. As the final cherry on top, I have set up GitLab Pages as a fallback in Nginx for when none of the server names match.

So off I went onto the journey to configure Let’s Encrypt certificates. As a bonus, I also looked into splitting up my Nginx configuration file into a file per domain. I remembered a site that helps to configure Nginx with some default settings based on the used technologies, simply called Nginx Config. They also include small config files to trigger https upgrades while integrating with ACME challenges for Let’s Encrypt, and also the commands for Certbot to request the certificates.

With that information, I went to install Certbot on my server and split the first domain into a separate file. To verify the configuration, I invoked a few requests with cURL to see if the HTTPS upgrade is working, if the ACME challenge path doesn’t upgrade to HTTPS and if the final response is still the proxied application. It all seemed fine, so I told Certbot to request my certificate and that all went fine. After that, I swapped out the certificates paths in the subdomain configuration file, and there it was: Let’s Encrypt was live. I repeated this process for the rest of my domains (why do I have so many?). They all went fine. All that is left is to see if the first certificate roll-over works fine.

Farewell AngularJS, Welcome Jekyll

2019-08-17T17:00:00+00:00

So a long time ago, I wanted to learn about AngularJS. For that reason, I build my site in AngularJS to showcase my silly projects which I did over time. Now, about two years later, I felt it was long overdue to phase out the severely dated AngularJS.

Which raises the question: What is going to replace it? Am I going to simply ~~upgrade~~ rebuild it in Angular 8, or will I use a different stack at all? As you can see, I didn’t use a SPA-framework again. Looking at what I needed, it was quite weird to use a system which shines at building dynamic web applications. All I need is some simple static HTML pages and some blogging capabilities.

So here we are. I used the same tactic as with my BitTorrent library: static site generation. As jekyll has features to create a simple blog, it’s more than enough to suit my needs. And having code-highlighting features is great as well :)

This move also meant that I can finally turn off my Wildfly instance, which was serving the AngularJS app plus the backend for the old services page. As this is all just static HTML, there is no need for an app server which can make the page dynamic. So how is this deployed, you ask? It’s even more overkill than before. The site is hosted in my Kubernetes playground through an Nginx container along with cert-manager to update Let’s Encrypt certificates automagically.