What I did in April

The world is going through a tumultuous phase, but that doesn't stop the progress of humanity. Things go on, people move, prices fluctuate, we eat fruit and live a healthy life. April is hot, so hot, that even the AC fails to cool her.
April was a month of deployments, CI/CD pipelines and a headache of a feature.
Never run Celery with concurrency above 1 on a Kubernetes pod. If you want more workers, add replicas of that pod; it helps with the OOM errors. I learned this the hard way: Celery + high concurrency + a single pod = OOM and chaos.
The fix wasn’t clever. It was boring. Concurrency = 1. Scale with replicas.
K8s has always intrigued me. It seems too complex, and I don't see a need to deploy things there unless you are facing a lot of load and have to scale dynamically. If you have a good idea of how much load you are going to experience, then a simple server will do. I am not downplaying k8s; it is a powerful platform. But tech problems are also infra-centric: why would you deploy your internal tool using k8s? If a cluster already exists, then it makes sense; otherwise it doesn't.
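The boring fix fits in a tiny config fragment. These are real Celery setting names; the values are just what worked for me, so treat this as a sketch, not gospel:

```python
# celeryconfig.py: a minimal "one process per pod" setup.
worker_concurrency = 1            # one worker process; k8s replicas provide parallelism
worker_prefetch_multiplier = 1    # don't hoard queued tasks in a single pod's memory
worker_max_tasks_per_child = 100  # recycle the process to bound slow memory growth
```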
So I had gotten a task to build a feature. We didn't do much tech-side planning for it: no flows, no examples, nothing. I just dived in head first. But then it turned out data can be a bitch. The feature was centred around manipulating data, but the data here was processed by an ML model, so we didn't know how many edge conditions would hit us in the face, and they sure did. It was a gruelling journey of handling edge cases: missing spaces, missing punctuation, incorrect sentence joining, and what not. But we persevered, thanks to Opus 4.7 and GPT 5.5. They both did generate slop. The easiest fix would have been some manipulation on the FE side, but we didn't do it. Maybe in the future we will change that.
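To give a flavour of the edge-case handling, here is a hedged sketch of the kind of defensive cleanup this turns into. The rules and the function name are illustrative, not our exact code:

```python
import re

def normalize_sentence(text: str) -> str:
    """Defensive cleanup for model-produced text (rules are illustrative)."""
    text = text.strip()
    # Collapse runs of whitespace the model sometimes emits.
    text = re.sub(r"\s+", " ", text)
    # Restore the missing space in joins like "end.Next" -> "end. Next".
    text = re.sub(r"([.!?])([A-Z])", r"\1 \2", text)
    # Add terminal punctuation when it is missing entirely.
    if text and text[-1] not in ".!?":
        text += "."
    return text
```

Every one of those lines exists because some real input broke without it.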
I have been using Opus 4.7 and GPT 5.5 and pitting them against each other. GPT 5.5 is a bit more agreeable, but both are good at getting things done. I appreciate their help. There is code that I don't fully understand right now, but I do review things, and I have loads of tests that need to PASS before any model commits anything. Guardrails are important; otherwise I'd get fired for pushing bugs to prod.
I use Cursor and Codex (CLI) as of now. Codex provides generous limits, and I love it. It also uses a lot less RAM. I have multiple .md files that tell the AI what to do and how to run things, test things, what to commit, what not to commit, etc. It takes a bit of time to get the hang of working with AI and to reach a point where the flow is customised to your liking and comfort.
I am not going too deep into tech here, but I also helped with tech upgrades, like bumping Python from 3.8 to 3.12 along with Django upgrades. AI helped a lot there; migrations have become pretty fast and easy to do with its help. I also worked on timezone stuff. Man, timezones are so difficult sometimes; you have to pay attention or things will go haywire.
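On the timezone front, the one pattern that keeps things sane is storing UTC and converting only at the edges. A sketch (the timestamp and zone are just examples), using the stdlib `zoneinfo` module, which landed in Python 3.9 and is a nice perk of the upgrade:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # stdlib since Python 3.9

# Store in UTC; convert only for display (zone choice is illustrative).
created_at = datetime(2026, 4, 15, 9, 30, tzinfo=timezone.utc)
local = created_at.astimezone(ZoneInfo("Asia/Kolkata"))  # UTC+05:30
print(local.isoformat())  # 2026-04-15T15:00:00+05:30
```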
I also wrote implementation docs and some architecture, and did other chores like cleaning the server and reducing memory usage. Nobody really likes doing chores, but AI really speeds up the whole process. Just make sure you aren't willy-nilly running delete commands on your server. Look, then delete. A lot of irrelevant cache accumulates over time. I think this is one of the wins for k8s, maybe, since containers get thrown away and the cruft goes with them.
A lot of package supply-chain attacks this year, huh? Thankfully, we pin our versions. But all these attacks are pretty sophisticated, and obviously you need some level of social engineering to get the malware into trusted systems. I am not sure an AI can pull off social engineering this sophisticated on its own. Maybe it can, who knows. We are entering an era where security becomes paramount: every feature will have to pass brutal critique and security testing. There are so many notorious actors on the internet, and you have to be careful, because they just got more empowered.
You need to keep your data pipelines clean. If your feature depends on manipulating data, do test a few out-of-the-way cases; you might think those edge cases will never happen, but trust me, at some point they will. Also keep an eye on third-party data. You can't just trust it willy-nilly because they have a few badges on their website.
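In practice that means validating at the boundary instead of trusting upstream. A minimal sketch, where the schema and field names are made up for illustration:

```python
def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in a third-party record.

    Empty list means the record looks sane. Schema is illustrative.
    """
    problems = []
    # Upstream promises an integer id; verify instead of assuming.
    if not isinstance(record.get("id"), int):
        problems.append("id missing or not an int")
    # A present-but-blank text field is as bad as a missing one.
    text = record.get("text")
    if not isinstance(text, str) or not text.strip():
        problems.append("text missing or blank")
    return problems
```

Returning problems instead of raising lets you quarantine bad rows and keep the pipeline moving.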
Pretty cynical, right? But when you are jolted into reality by mistakes your feature made, you understand that happy paths are happy paths for a reason, and that your sleep will be interrupted by sad paths. Which will make you sad. A developer/engineer does get sad when something they built doesn't work. It happens; you learn from the experience and move forward.
Resilient systems will come into fashion for all kinds of use cases.
We got audited, and now we have to work with the auditors to implement things. I am looking forward to seeing how it goes.
I have bought a book on inference; I am interested in inference for LLMs. I think it will be a big topic of discussion, so I will read the book, implement some things, and then participate in the discussion.
I know this is pretty vanilla for most people who work on cutting-edge tech, but work is mostly doing redundant things again and again, which AI will now do for you. Sometimes, though, you encounter problems that are way too difficult, and they teach you a lot. Log rotation really helps. A mere typo can ruin your life.
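Log rotation, for the record, is a few lines with the stdlib. The filename and sizes here are illustrative:

```python
import logging
from logging.handlers import RotatingFileHandler

# Cap each log file at ~5 MB and keep three backups,
# so logs can never quietly fill the disk.
handler = RotatingFileHandler("app.log", maxBytes=5_000_000, backupCount=3)
logger = logging.getLogger("app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
```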
You have to make your code easily runnable on any local device without much fuss, because that contributes to dev experience. We don't want people shying away from running things locally just because the setup seems complicated. If you have time on your hands, there should be a script or a command that, when run, sets up everything locally so the repo just works.
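A sketch of what that one command could look like, assuming a Python/Django-style repo. The paths and steps are illustrative, not a prescription:

```python
import subprocess
import sys

def setup_commands(repo: str = ".") -> list[list[str]]:
    """The command sequence a one-shot setup script would run (steps illustrative)."""
    pip = f"{repo}/.venv/bin/pip"
    py = f"{repo}/.venv/bin/python"
    return [
        [sys.executable, "-m", "venv", f"{repo}/.venv"],     # isolated environment
        [pip, "install", "-r", f"{repo}/requirements.txt"],  # pinned dependencies
        [py, "manage.py", "migrate"],                        # schema up to date
    ]

def bootstrap(repo: str = ".") -> None:
    for cmd in setup_commands(repo):
        subprocess.run(cmd, check=True)  # fail loudly on the first broken step
```

The point isn't this exact script; it's that a newcomer types one thing and gets a working repo.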
If your repo needs a README, a doc, and a prayer to run locally, it’s already broken.
What else, hmm. Every day I become a better engineer. My systems get better, my thinking gets better. Also, when you go to sleep with a bug you need to resolve in your head, you magically wake up the next day with a solution. Boredom is underrated. I have to build another feature that involves pulling a lot of data from the DB, transforming it, and making it downloadable. Seems pretty easy, no? Yeah. I have written before about how RAM is the bottleneck, so you have to handle things in batches. But testing is a headache here, because one successful test takes 2 hours. :P I have made my frustrations known in previous blogs about how a 2-hour test gets to 99%, then fails, and then I cry.
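The batching itself is simple; the discipline is in never materialising the whole result set. A sketch (with Django you would feed this `queryset.iterator()` rather than a list):

```python
from typing import Iterable, Iterator

def in_batches(rows: Iterable, batch_size: int = 1000) -> Iterator[list]:
    """Yield fixed-size batches so memory stays flat regardless of total rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:  # don't drop the trailing partial batch
        yield batch
```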
So what will we do with a lot of slop code? I dunno, but maybe we will have to build strong code foundations that can handle any kind of slop extension: build the platform to be resilient and easily extensible, so slop code is just an extension of your system. If your system is good, slop code can be replaced. Abstractions, abstractions, abstractions.
You are the context, boyo, you, always remember. You fall, the AI falls, the system falls, and then the world falls.
Read my previous blog: https://tech.peculiarvivek.com/what-did-i-do-in-march
If you are a codex lover like me, do use this, made by yours truly, gpt 5.4 + 5.5 + me: https://pypi.org/project/codex-stats/



