<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Vivek's Tryst with Tech]]></title><description><![CDATA[builder of APIs, reducer of latencies, optimiser of code, fixer of issues

Backend Engineer | On my way to becoming a 10x engineer]]></description><link>https://tech.peculiarvivek.com</link><generator>RSS for Node</generator><lastBuildDate>Tue, 14 Apr 2026 15:42:45 GMT</lastBuildDate><atom:link href="https://tech.peculiarvivek.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[What did I do in March?]]></title><description><![CDATA[Well, I celebrated my birthday. :P
I worked on a functionality end to end using AI. It was a simple feature and AI wrote everything correctly in a single pass. I used Codex. After that, I had to ask i]]></description><link>https://tech.peculiarvivek.com/what-did-i-do-in-march</link><guid isPermaLink="true">https://tech.peculiarvivek.com/what-did-i-do-in-march</guid><category><![CDATA[tech ]]></category><category><![CDATA[codex]]></category><category><![CDATA[march]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Fri, 03 Apr 2026 11:06:01 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/621ca065c07e2ceec785e7b8/5e2646d6-3935-4c6c-a87d-6391fe29cd49.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Well, I celebrated my birthday. :P</p>
<p>I worked on a functionality end to end using AI. It was a simple feature, and AI wrote everything correctly in a single pass. I used Codex. After that, I had to ask it to adjust some edge cases, add more validations, and improve error messages. I also asked it to write an end-to-end test script, and now I have a script to test the whole thing.</p>
<p>After raising the PR, my boss suggested some changes. So I went back to Codex, asked it to make those changes, and it did. I again had to adjust some edge cases, but the tests were already there, so it was easy to verify everything. One thing I noticed was that the code had access guardrails but did not reason about determinism: for example, it added handling for a third case even though the variable in question could only ever take two values. I think context is still a bottleneck, or at least a difficult problem.</p>
<p>I also did some systems thinking :P and helped a few colleagues with their issues.</p>
<p>I completed a service migration end to end using AI. Did it make mistakes? Sure. Did it correct those mistakes? Yes. I had to provide some logs and data, and it fixed the code. It was honestly a lifesaver. It reduced a lot of stress and cognitive load for me.</p>
<p>As the data in the database grows, you start facing different kinds of problems. I think a heavily used service is a good problem to have, but you also need multiple guardrails and self-healing mechanisms just to maintain consistency.</p>
<p>AI has taken a lot of load off my back. Otherwise I would have spent multiple hours on code migrations. Now I can spend that time looking at the system and thinking about what can be improved. If you are a software engineer and want to sleep well at night, you need self-healing mechanisms and ways to guarantee consistency. But nothing is perfect, so there is always some uncertainty.</p>
<p>I also designed the full flow for the functionalities I built and made some changes to offload rate limit healing to Celery. I also fixed some very small bugs.</p>
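<p>As a flavour of what that offloading looks like, here is a minimal sketch of a Celery task that backs off and retries when the third party rate-limits us. The client and the exception here are hypothetical stand-ins, not the actual code:</p>
<pre><code class="lang-python">from celery import shared_task


class RateLimitError(Exception):
    """Hypothetical: raised when the third party responds with HTTP 429."""


@shared_task(bind=True, max_retries=5)
def push_record(self, record_id):
    try:
        third_party_client.push(record_id)  # hypothetical client
    except RateLimitError as exc:
        # Back off exponentially and let the worker retry later,
        # instead of blocking the request/response cycle.
        raise self.retry(exc=exc, countdown=60 * (2 ** self.request.retries))
</code></pre>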
<p>What else... hmm.</p>
<p>I set up my projects on a completely new server because we were doing tech stack migrations. It was fun. I had to take care of many things, but then I realized the project is almost fully self-contained. Just add the environment variables and it works.</p>
<p>I think keeping things simple will become even more important now that AI is going to do a lot of the coding. You have to explicitly ask it to keep things simple. Otherwise when things fall apart, you will not be able to fix them easily. And at some point, things will fall apart.</p>
<p>I submitted all my self review documents. Let's see if I get a pay bump and a good bonus. I hope so.</p>
<p>I am trying to become a research engineer now. I am interested in inference infrastructure, and I think moving toward systems engineering would be a worthwhile pivot. With AI, we will have more time to experiment, and I plan to do exactly that. I am not very interested in training LLMs because I do not have that level of mathematical background, but I think I can engineer better systems. Making inference cheaper can add a lot of value to a company. If a company could build its own nuclear power plant, its electricity costs would drop significantly and margins would improve. More profit is always better, right?</p>
<p>Am I anxious about my career? Yes. Do I want to work on cool things? Yes. Am I qualified to work on cool things? Who cares. I will learn. I think it is time to take risks and maybe even cold email Sam Altman :P</p>
<p>I do not have much more to write. March was a bit slow. There were many production issues and we were focused on streamlining processes. This blog probably does not make me stand out as engineering talent, but I was mostly working on making things more stable and slightly faster. I did make one API faster though, and that felt good.</p>
]]></content:encoded></item><item><title><![CDATA[What did I do in February]]></title><description><![CDATA[Let me think. So I worked on a new functionality and raised a PR in just 2 days for the whole end-to-end correct flow. I did not focus much on performance because I wanted to first get it right and th]]></description><link>https://tech.peculiarvivek.com/what-i-did-in-feb</link><guid isPermaLink="true">https://tech.peculiarvivek.com/what-i-did-in-feb</guid><category><![CDATA[progress]]></category><category><![CDATA[Software Engineering]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sun, 01 Mar 2026 14:38:45 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/621ca065c07e2ceec785e7b8/e3768f41-bf60-4455-b6e4-ebe5cafc7254.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Let me think. So I worked on a new functionality and raised a PR in just 2 days for the whole end-to-end correct flow. I did not focus much on performance because I wanted to first get it right and then get it fast.</p>
<p>AI has been helpful, but one thing I noticed was that the requirements were also written by AI. Now that AI suggests things to the client, they start asking us to implement those same things, which does not always make much sense when you look at the actual product or service.</p>
<p>Most of the work involved creating an implementation doc, getting it reviewed, building it, getting that reviewed, and then deploying it. I had to build a matching algorithm where a person gives a few details and the system has to return objects with a confidence score above 50 percent based on how similar the texts are. I think it is for duplicate text matching or similar text tracking.</p>
<p>They wanted it in real time. But when you start comparing text one by one, it takes time. So before comparison, you have to exclude a lot of things and carefully choose candidates to compare with. Otherwise, you end up spending too much compute on it, the latency gets hammered, and then the client asks, “Why is this happening?”</p>
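<p>As a minimal sketch of that "filter first, compare later" shape (not the production algorithm; the token-overlap pre-filter and the 0.5 threshold are stand-ins mirroring the 50 percent score above):</p>
<pre><code class="lang-python">from difflib import SequenceMatcher


def candidates(query, corpus):
    """Cheap pre-filter: skip texts that share no words with the query."""
    query_tokens = set(query.lower().split())
    for text in corpus:
        if query_tokens &amp; set(text.lower().split()):
            yield text


def matches(query, corpus, threshold=0.5):
    """Run the expensive comparison only on surviving candidates."""
    for text in candidates(query, corpus):
        score = SequenceMatcher(None, query.lower(), text.lower()).ratio()
        if score &gt; threshold:
            yield text, score


print(list(matches("blue leather suitcase", ["leather suitcase, blue", "red car"])))
</code></pre>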
<p>The client does not want to get into the technical details. They just want things to work. So you have to build a monitoring system that alerts you before the client notices something is not working correctly. A better way is to run sanity checks on the most common functionalities in production. Test things preemptively and alert the responsible person if something goes wrong. Monitoring is important.</p>
<p>With a ton of data comes a ton of new problems. If your service deals with a lot of data being sent by a third party service, you need to build guardrails so that a simple error does not cascade into a full system failure. I have started thinking about edge cases more. It is not like me, though. You have to think long-term about how all that data is going to hammer your API latency, or you will be asked difficult questions. :P</p>
<p>With AI, a lot of my work has become easier. I am not complaining, but I do see how some people would view it as, “Oh, AI can do your job, why do we need you?” Valid question. I think I have become more of a programmer now. I code and fix small issues that would take AI more time and context to understand, but I do not code entire functionalities by hand anymore. I make a spec doc, give context, explain how it should be done, and then monitor things.</p>
<p>Doing two different things at once does become difficult because our brain is linear. Multiple context switches hurt productivity. But as AI gets better, I think a lot of things will be done by AI without supervision. Still, you cannot hold a machine accountable, right?</p>
<p>Coming back to what else I did in February. It was a short month, pun intended. Building a tool for 3 to 5 people can seem like it is not worthwhile, but you do it because you get paid for it. And the tool makes a difference in those 3 to 5 people’s lives.</p>
<p>I also worked on a library migration, but Codex did most of it for me. There were a lot of deployment issues, but it got things to a semi-working state. Companies still need to come up with documentation that is LLM friendly. I do not want my LLM hopping between URLs to connect the dots. LLMs are sensitive. I would like to make their lives easier.</p>
<p>I also filled out my self assessment form. You know, new financial year, bonus and increment on the way. I am a bit excited. Everyone loves more money. I do not know how it will turn out, though.</p>
<p>Back to technical work. I did some POCs, investigated issues, took interviews, etc.</p>
<p>I think I have started becoming more interested in self-healing systems. When we create systems, we usually have a happy path in mind. But sometimes that path breaks down, and that is where self-healing comes into the picture. I wrote several crons that heal the system if they detect something broken. For example, if file generation for a data export has been in progress for more than 3 hours, the cron checks whether anything is actually generating. If it is not, it redoes the file generation. Things like that.</p>
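<p>A minimal sketch of that healing cron, assuming a hypothetical <code>Export</code> model with <code>status</code> and <code>started_at</code> fields and a <code>generate_export</code> task (the real models differ):</p>
<pre><code class="lang-python">from datetime import timedelta

from django.utils import timezone

STUCK_AFTER = timedelta(hours=3)


def heal_stuck_exports():
    cutoff = timezone.now() - STUCK_AFTER
    stuck = Export.objects.filter(status="IN_PROGRESS", started_at__lt=cutoff)
    for export in stuck:
        # Nothing has actually been generating this file for 3+ hours:
        # reset the row and queue the generation again.
        export.status = "PENDING"
        export.save(update_fields=["status"])
        generate_export.delay(export.id)
</code></pre>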
<p>If you are dealing with a third party service, keep all logs with you. Everything should be logged. Whenever something breaks down, you should have all the data points before sending them an email. It is the most you can do to make their lives easier and clearly convey your issues. Be a good person :P</p>
<p>What else, what else. Let me think. Cross-system communication is always difficult to get right because there are so many chances for things to break. And if they do break, you need a resilient retry mechanism. Have a good retry mechanism. It will always help you.</p>
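<p>One generic shape such a retry mechanism can take, as a sketch rather than our actual code: exponential backoff with jitter, so a flaky cross-system call gets a few more chances without hammering the other side.</p>
<pre><code class="lang-python">import random
import time


def with_retries(call, attempts=5, base_delay=1.0, retry_on=(Exception,)):
    """Invoke call() until it succeeds or we run out of attempts."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Exponential backoff plus jitter to avoid thundering herds.
            time.sleep(base_delay * (2 ** attempt) + random.random())
</code></pre>
<p>In practice you narrow <code>retry_on</code> to transient failures like timeouts and 5xx responses; retrying a validation error just fails five times instead of once.</p>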
<p>Build resilient systems. You might get it right the first time, but you still need to make the system resilient, self healing, and robust. The customer depends on you.</p>
<p>I know there is all this talk about AI and AI agents, and as a software engineer you might be feeling dread. But it is part and parcel of life. Evolve. Become a fire breathing dragon and burn all the data centers. I am just kidding.</p>
<p>You need to evolve. You need to learn. You need to find something you like. The big AI labs are going to eat you alive, but at least you will have a tale to tell. I think optimizing life for stories is much more fun.</p>
<p>“Once upon a time…”</p>
<p>I do not want to get more technical. At some point, you become one with the system.</p>
]]></content:encoded></item><item><title><![CDATA[What did I do in December (2025) and January (2026)]]></title><description><![CDATA[Well, I did not build a new feature. Most of my tasks were client-facing, focused on solving client issues. And the client had many issues. Things were escalated, and escalated again. Now the trust that I had built with higher management is kind of g...]]></description><link>https://tech.peculiarvivek.com/what-did-i-do-in-december-2025-and-january-2026</link><guid isPermaLink="true">https://tech.peculiarvivek.com/what-did-i-do-in-december-2025-and-january-2026</guid><category><![CDATA[tech ]]></category><category><![CDATA[journey]]></category><category><![CDATA[Software Engineering]]></category><category><![CDATA[december]]></category><category><![CDATA[software development]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sun, 25 Jan 2026 12:39:40 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1769344709169/42176e40-678c-47ab-a1cb-c252c3ebd0a4.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Well, I did not build a new feature. Most of my tasks were client-facing, focused on solving client issues. And the client had many issues. Things were escalated, and escalated again. Now the trust that I had built with higher management is kind of gone. The autonomy that I was offered is also kind of being rolled back. I am not getting that tech lead promotion. The biggest worry is getting fired, lmao.</p>
<p>I feel you can only put the blame on process to a certain extent. At the end of the process, there is always a human. I have learned a lot and gotten good at using AI for coding. I feel Anxiety Driven Development is the root cause. The thing is, if you are a good developer, you can also be good at Anxiety Driven Development. Alas, the whole experience did hurt my self-confidence. I don’t have answers right now. The astrologer said this year would be good for me professionally, but as it seems, it has been a not-so-good patch.</p>
<p>Okay, this is turning into a feelings-based blog. Ahahah. But I try to weave in some career-building lessons too. You have to preserve trust. You will be given multiple chances, but you have to make sure that expectations are being met. At some point, the stress became too much to handle and I kind of unravelled. I still need to learn how to detach life from work. Work affects my life too much.</p>
<p>So what did I fix? The report format. You remember I was working on report generation? Yeah, the format of the report was incorrect and a few sheets were missed. Product gap. I missed a whole thing :(. So I worked at warp speed to fix it. That might have introduced some bugs, read: two bugs. Fixed them too. Modularisation is your best friend. We added a lot of logs too. A lot.</p>
<p>Over-logging can sometimes be a problem, but when your client is asking questions and you don’t have answers, you log everything. Never trust a third-party vendor. Keep a parallel process to verify their data, like with webhooks. You receive a webhook and say, very good, nice. But verify. Use their API at some point to cross-check your system’s data integrity. Get good rate limits. 100 requests per minute is too little when you are dealing with a lot of data. Trust, but verify. Nobody knows what it means. Eh.</p>
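<p>A sketch of what that cross-check can look like, with a hypothetical <code>vendor_get</code> client and <code>Order</code> model standing in for the real ones:</p>
<pre><code class="lang-python">import logging

logger = logging.getLogger(__name__)


def verify_order(order_id):
    remote = vendor_get(f"/orders/{order_id}")       # vendor's authoritative copy
    local = Order.objects.get(external_id=order_id)  # what the webhooks built
    if remote["status"] != local.status:
        # Log the full data point: this is exactly what you attach
        # to the email when you contact the vendor.
        logger.warning(
            "drift on %s: vendor=%s local=%s",
            order_id, remote["status"], local.status,
        )
</code></pre>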
<p>To handle heaps and heaps of data, which will only increase in the future, you have to be careful. Things can spiral out of control at any time. Your database gets full, your database crashes, etc. You need to keep a script ready to synchronise your system with third-party data. I also helped some colleagues with their tasks. Not a big deal, you are supposed to do that. I also took some interviews.</p>
<p>When you know the data is going to keep increasing, you have to plan accordingly. You have to design your system so that the amount of data never becomes a bottleneck. Will your system slow down when data becomes 10x or 20x? Yes, there are chances of that. Then you go and optimise things. Premature optimisation never helps anyone. You design your system around the data. Never plan too much for the future, because there is a good chance that the future you are planning for no longer exists.</p>
<p>If you are an engineer, your best bet to survive the AIpocalypse is to build product sense. You should talk to customers and understand their point of view. Sometimes you build what you think is correct. Most times, you build what is correct for the client. You have to keep the client aware of the system, but also not show them the entire system, including errors and internal details. They shouldn’t get confused. The features you build should speak for themselves.</p>
<p>Now that coding is automated, deciding what to code has become more important than ever. Every time a layer abstracts out coding, more code is written. My opinion is that a lot more code is about to enter servers.</p>
<p>Personally, I can only work on a maximum of two tasks in parallel with AI. Otherwise, I lose context and miss things. You also have to be specific with the AI about what to build. But I also feel you have to give some leeway to the AI to make its own decisions, like a coder. A software engineer should never be overly constrained. AI should be constrained, but not too much. Otherwise, it will follow your orders word for word and still introduce bugs just to please you :P</p>
<p>Testing is the key to success, especially testing that is deterministic in nature. The AI will not have full context about your business, so you have to be kind towards it and also make sure the context is sufficient to arrive at good solutions. Context, but not too much. AI doesn’t need to know your name. I love how it generates commit messages though, and how it helps me with sequential commands to resolve Git branching issues.</p>
<p>Solving client or customer issues is not just technical. It also depends on how well you can communicate and how easily you can make things understandable for the client. If they don’t understand, you have to make them feel like they understood. Most clients don’t care about technical details. They care about what went wrong, how you are going to fix it, and how you will make sure it never goes wrong again.</p>
<p>They are paying you to abstract the system for them and show only what drives ROI, not the mess behind the curtain. Customers love it when you fix issues quickly. It drives trust and happiness. You also have to set SLAs. Otherwise, every problem becomes high priority, and that is how things get messy and flow breaks. If everything is a priority, then nothing is a priority.</p>
<p>Automate everything. Your data integrity checks should be automated. Your deployments should be automated.</p>
<p>I added a Celery worker to the infra repo, which uses Argo CD to deploy things, and now most of the infrastructure is just YAML. You write YAML and bring infrastructure into existence. I am a simple man. I like SSH-ing into a server and deploying things. But automated deployments are cool. Automatic rollbacks are cool. Autoscaling is cool too. But all of this depends on your scale. To serve 10 users, you don’t need Kubernetes. Just SSH into your 8 GB RAM server and do what you want.</p>
<p>Simplicity is your best friend. Never overcomplicate solutions. If something can be solved easily by introducing a process change, do that instead of writing code. Some problems are better solved by not writing code at all.</p>
<p>Well, this blog was more of an update on my professional life than something technically cool. Maybe if I build a cool feature next, I will write about it. As of now, I am just resolving bugs and making sure the client is satisfied.</p>
<p>Okay, bye.</p>
]]></content:encoded></item><item><title><![CDATA[What did I do in November]]></title><description><![CDATA[Well, I am going to be honest. I do not remember much right now, so I will try to recall things as I write.
I did a lot of pair programming to fix production issues, and I changed a webhook implementation that receives a lot of data periodically from...]]></description><link>https://tech.peculiarvivek.com/what-did-i-do-in-november</link><guid isPermaLink="true">https://tech.peculiarvivek.com/what-did-i-do-in-november</guid><category><![CDATA[Developer]]></category><category><![CDATA[progress]]></category><category><![CDATA[november]]></category><category><![CDATA[Outage]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Fri, 05 Dec 2025 09:58:39 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1764928555252/098a77b4-e98c-4a18-a99c-81ac7b5ea311.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Well, I am going to be honest. I do not remember much right now, so I will try to recall things as I write.</p>
<p>I did a lot of pair programming to fix production issues, and I changed a webhook implementation that receives a lot of data periodically from a third party. It is a lot of data. I was also reminded of how important retries are, and how you have to adjust for deployment gaps. Some events might get missed, so they need to be replayed.</p>
<p>I handled some DevOps work around deploying a new service on our Kubernetes cluster. We use ArgoCD. Infrastructure as YAML still scares me a little, but it is manageable. The world is changing. I spent most of my time building scripts to automate sending data exports to clients and fixing production bugs.</p>
<p>We had made many changes directly on production, so we merged the production branch back into the develop branch. There were a lot of conflicts and I resolved them. We made many logical changes across the codebase to make it more stable. I also changed the logic behind one of our sync mechanisms. We sync data from a source of truth table, aggregate it, and display it to the user.</p>
<p>I need to improve modularity and code reuse. When two or three people work on similar tasks, the duplication becomes very difficult to manage. When you are creating a new project, you should first add some basic functions that handle the core work of the project. I also want to solve more difficult technical problems. But I am afraid of issues that happen in distributed systems because they are so difficult to trace, and the fixes are usually trade-offs. Maybe I should try working on developer tooling or even start my own company (ehehe, just kidding). I do not have the technical expertise to get employed at companies that build databases or compilers or similar systems. I would love to write code that powers a satellite someday.</p>
<p>I have become the old developer who writes specification documents for everything. I list out the classes, the functions, and the logic flow. I think it also helps the LLM understand the problem. I have always been too eager to start writing code. I think I need to slow down and think more. Writing a spec document helps with that.</p>
<p>What else did I do? I also updated my website. Well, not the resume, but the website. It looks cool now.</p>
<p>I am not sure if any of this directly generates revenue. Building revenue generating features is fun because the impact is measurable. I think I will be working on more production issues now. Dealing with customer issues, finding root causes, fixing them, and doing hot patches.</p>
<p>The absence of technical metrics hurts. We should have a dashboard where we can see which API was hit, how many times, and the failure rate. We are working on metrics, but they are not on production yet. Since we are moving everything to Kubernetes, we do not want to spend time adding metrics to our current production servers. The production server I SSH into is huge, with thirty-two cores and sixty-four gigabytes of RAM. Everything runs so fast on it. I love it.</p>
<p>I get bored easily if I do not have interesting things to do, but I also know that the most redundant tasks sometimes generate the most value. I would also like to add the Black formatter to our codebase because the code is very unstructured. Some cleanliness would help.</p>
<p>I was in a lot of meetings, which I usually do not prefer. They drain my social energy. Most calls can easily be emails.</p>
<p>A big trade-off I have observed is deciding whether information is needed urgently or whether a delay is acceptable. Adding redundancy never hurts. Losing events can be a headache. Also, not all events are important. We are creating a new environment (a staging-like environment) and that is fun in a way. Maybe not entirely fun, but still fun. It is all YAML. I do not like unstructured DevOps. Most use cases do not need Kubernetes. Basic automation is more than enough for ninety-nine percent of scenarios.</p>
<p>I also audited the system for data discrepancies. We found some, which were fixed by the user. Then we found a few more. It is okay. What else? Let me think. I did a lot of small things that I do not fully remember. I helped my teammates remove blockers and handled deployments. November was mostly production issues.</p>
<p>I’d have written this earlier, but damn viral cold. As I was writing this, Cloudflare had an outage lmao. I was also thinking about developing a solution to minimise lost baggage claims, but I have to deep-dive into why it happens and everything. I dunno if there can be a software fix for this. Anyway, that’s it. Bye. Tata.</p>
]]></content:encoded></item><item><title><![CDATA[What did I do (and think about) in October]]></title><description><![CDATA[Hmm, the world is a different place now. I used to write a lot of code to build features, but now I’m doing something else, something that’s more technical in nature and design oriented. I unblock people, resolve production issues, take on customer r...]]></description><link>https://tech.peculiarvivek.com/what-did-i-do-and-think-about-in-october</link><guid isPermaLink="true">https://tech.peculiarvivek.com/what-did-i-do-and-think-about-in-october</guid><category><![CDATA[october]]></category><category><![CDATA[engineering]]></category><category><![CDATA[progress]]></category><category><![CDATA[software development]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sun, 09 Nov 2025 06:14:42 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1762668721789/6688b565-c3cc-4860-884a-959781bf10fa.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Hmm, the world is a different place now.</strong> I used to write a lot of code to build features, but now I’m doing something else, something that’s more technical in nature and design oriented. I unblock people, resolve production issues, take on customer requests, and focus on the system’s durability. But I still like building shiny new things. It makes me feel like I’m doing something, and that something is somehow generating revenue.</p>
<p>So yeah, what did I do in October? I took a workshop on <strong>“AI/Prompting for Non-Tech People.”</strong> I added a lot of memes to my slides, and I think it went well. By “non-tech”, I mean people who don’t write code for a living. I think it was the highlight of the month. Why? Because it gave me visibility in the org.</p>
<p>The rest of the month went into fixing things - deployments, CI/CD pipelines, retry mechanisms, writing documents for POCs, and a few other tasks.</p>
<p>I also got two courses that I need to complete before Feb 2026. It’s a company requirement. The courses are about ArgoCD and building AI agents. What are AI Agents? Why are AI Agents? Will the AI Agents charge money to get things done? Agents. I’ve always thought I’m more interested in base system engineering like distributed systems, kernels, compilers, and all that. I always wanted to work on long-term problems, and now that I finally got one, well, it’s long. There’s no clear solution, just a bunch of trade-offs I’m still thinking about. It’s about decoupling a join dependency across different databases. Sounds pretty simple, right? But decoupling systems also adds multiple failure points, so you have to add restart mechanisms to make sure events don’t get missed. And even if they do, the system should be able to correct itself. <strong>Self-healing systems. OMG.</strong></p>
<p>Writing and committing code feels like the best path to job satisfaction, but how much code can you really write? To move ahead in your career as a techie, you have to eventually step back from writing code and focus on bigger problems. I see many people in Staff Engineer roles, and I don’t think they’re writing much code every day. Maybe reviewing a lot of it, but not necessarily writing it.</p>
<p>Anyway, I also fixed a production issue. The issue was pretty silly: instead of running a <em>create-or-update</em> query, only the <em>update</em> query was being executed, breaking the webhook ingestion flow. Lack of testing leads to such problems. As a dev, one needs to make testing easier. There were no unit tests, integration tests, or anything. The gap was wide enough for bugs to slip through. And when you depend on third-party services, you’ve got to cover all your bases. Things can break in weird ways anytime.</p>
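<p>In Django terms the bug had roughly this shape (model and payload keys are made up): an update-only call silently does nothing when the row does not exist yet, while <code>update_or_create</code> covers both paths.</p>
<pre><code class="lang-python"># Buggy: for a first-time sender there is no row yet, so nothing is
# written and the webhook event is effectively dropped.
Message.objects.filter(external_id=event["mid"]).update(body=event["text"])

# Fixed: create the row if it is missing, update it otherwise.
Message.objects.update_or_create(
    external_id=event["mid"],
    defaults={"body": event["text"]},
)
</code></pre>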
<p>Oh, and I also optimized a legacy API. It was running about 170 queries and taking 15 seconds to complete. There were no tests or assertions of any kind, so I created some, then optimized it down to 13 queries. Now it takes just one or two seconds. <em>Ehehe.</em></p>
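<p>A common way an API ends up running 170 queries is the N+1 pattern. I am not claiming that is exactly what this legacy API needed, but the usual Django fix looks like this sketch (hypothetical models):</p>
<pre><code class="lang-python"># Before: one query for the orders, then one extra query per order.
for order in Order.objects.all():
    print(order.customer.name)  # each attribute access hits the DB again

# After: the related rows arrive via a join in the same query.
for order in Order.objects.select_related("customer"):
    print(order.customer.name)  # no additional queries
</code></pre>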
<p>I also audited production data and created reports explaining what the gaps were, why they existed, and how they could be fixed. I spent a fair bit of time on calls too. As a dev, I don’t enjoy long meetings, but brainstorming can be fun sometimes. I also pointed out a few cases where our servers’ disks could fill up because of some video processing tasks. And I moved the report creation and download feature I built to S3. Once the reports are created, they’re uploaded there and that’s it :P Not much, but it was honest work.</p>
<p>I worked on bugs and a bunch of other things. I think my scope as an engineer is changing. I don’t know if that’s good or bad. I am becoming a better engineer though. Still, I want to create an algorithm that makes distributed systems more robust. Or build a kernel. Or make some OSS contribution to Linux. All of this feels like a pipe dream right now, but maybe someday.</p>
<p>I also need to update my website. I don’t know what to write there. I want to make it look nice, maybe give it a retro vibe. About OSS contributions - I already look at code from Monday to Friday, so looking at more code on weekends feels like a drag. I love coding, but sometimes you need to clear your mind. A lot of solutions come to you when you sleep and wake up with a ground-breaking idea for your problems.</p>
<p>This blog is slowly turning into a feelings blog. <em>Ugh.</em> I also want to try sales, just for fun, to see how it all works. I mean, we already sell ourselves in interviews, but selling an enterprise B2B SaaS must be a whole different experience.</p>
<p>I’ve also lost all my interviewing skills, and I was never really good at DSA-style interviews anyway. Man, sometimes the future looks bleak. Damn, this was supposed to be a technical post, but now we’re walking the thin line between real and delusional.</p>
]]></content:encoded></item><item><title><![CDATA[What did I do in August, September]]></title><description><![CDATA[Yeah, I forgot to write in August, my bad. Sometimes I forget. But I’m back now.
I’ll also add my thoughts on using AI. Be patient, my friends this will be a bit longer than usual.

August
I optimised an API. In short, it was about pre-calculation an...]]></description><link>https://tech.peculiarvivek.com/what-did-i-do-in-august-september</link><guid isPermaLink="true">https://tech.peculiarvivek.com/what-did-i-do-in-august-september</guid><category><![CDATA[August 2025]]></category><category><![CDATA[september ]]></category><category><![CDATA[tech ]]></category><category><![CDATA[progress]]></category><category><![CDATA[Django]]></category><category><![CDATA[Reports]]></category><category><![CDATA[work]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sun, 28 Sep 2025 06:20:22 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1759040371711/30dae601-d064-438a-bd92-41bf0c980f82.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Yeah, I forgot to write in August, my bad. Sometimes I forget. But I’m back now.</p>
<p>I’ll also add my thoughts on using AI. Be patient, my friends, this will be a bit longer than usual.</p>
<hr />
<h3 id="heading-august">August</h3>
<p>I optimised an API. In short, it was about pre-calculation and storage instead of running a huge join each time. This reduced the API response time from 23 seconds to 7 seconds. It’s still a join, but a much smaller one. :P</p>
<p>I spent quite a bit of time on this because it included backfilling data. The backfilling command took a lot of time. I didn’t work on optimising it because it was a one-time thing.</p>
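<p>The shape of the change, sketched with invented models: run the expensive aggregation once, off the request path, and let the API read a small summary table instead of joining everything per request.</p>
<pre><code class="lang-python">from django.db.models import Sum


def refresh_daily_summary(day):
    # The huge aggregation happens here, once...
    totals = (
        Order.objects.filter(created_at__date=day)
        .values("customer_id")
        .annotate(total=Sum("amount"))
    )
    for row in totals:
        # ...and the result is stored so the API does a cheap lookup later.
        DailySummary.objects.update_or_create(
            day=day,
            customer_id=row["customer_id"],
            defaults={"total": row["total"]},
        )
</code></pre>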
<p>Then I resolved a race condition issue caused by a webhook call. <code>select_for_update</code> worked well. Essentially, I just locked the row I was updating/creating. The problem was that the webhook calls were arriving milliseconds apart, creating two rows with the same data.</p>
<p>We could have added a <code>unique_together</code> constraint at the DB level, but I didn’t want to do that. I’m not that familiar with that part of the system and didn’t want to risk changes at the database level.</p>
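<p>One way to apply <code>select_for_update</code> here, sketched with a hypothetical parent <code>Conversation</code> row. You can only lock a row that already exists, so locking the parent the duplicates hang off is a common variant:</p>
<pre><code class="lang-python">from django.db import transaction


def ingest(payload):
    with transaction.atomic():
        # The second of two near-simultaneous webhook calls blocks here
        # until the first commits, so its exists() check sees the new row.
        convo = (
            Conversation.objects
            .select_for_update()
            .get(external_id=payload["conversation_id"])
        )
        if not convo.messages.filter(event_id=payload["id"]).exists():
            convo.messages.create(event_id=payload["id"], data=payload)
</code></pre>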
<p>What else did I do… hmm.</p>
<p>Yeah, some CI/CD integrations and a bunch of other things. Refactored code, made it modular, deleted a lot of unused code, investigated production issues.</p>
<p>Most of my time went into testing the optimization. We had no unit tests, still don’t. I know that needs to be fixed, but migrations fail every time Django tests run on a fresh DB. I tried fixing it but couldn’t. There’s a migration file that doesn’t align with the DB schema history. :(</p>
<hr />
<h3 id="heading-september">September</h3>
<p>Remember how I said I’d been working on a reporting feature with data export? Well, it went OOM when selecting a large date range, and then it went OOM again when writing to Excel files.</p>
<p>Testing was brutal. Each large-scale test took 30–40 minutes. Every iteration felt like pulling teeth.</p>
<p>So I rolled out chunking:</p>
<ul>
<li><p>Fetch data in chunks.</p>
</li>
<li><p>Write each chunk to CSV.</p>
</li>
<li><p>Run in-memory aggregations.</p>
</li>
<li><p>Write final aggregations to file.</p>
</li>
<li><p>Re-process the CSV chunk by chunk and export to XLSX.</p>
</li>
</ul>
<p>But I still get OOM errors at different stages. Chunking solved DB-fetching OOMs, but file-writing OOMs persist. I think I’ll have to split the XLSX files into chunks too. And while doing this, I must ensure no unnecessary data stays in memory.</p>
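<p>The DB-fetching step looks roughly like this sketch, using Django's <code>iterator()</code> so only one chunk of rows lives in memory at a time (field names invented):</p>
<pre><code class="lang-python">import csv


def export_to_csv(queryset, path, chunk_size=2000):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        rows = queryset.values_list("id", "created_at", "amount")
        for row in rows.iterator(chunk_size=chunk_size):
            writer.writerow(row)  # written out and freed, chunk by chunk
</code></pre>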
<p>I also built some new APIs and made changes to existing features, like ingesting new data from a webhook.</p>
<p>Right now, I’m working on caching for another project. It currently has no cache, and a lot of queries hit the DB unnecessarily. Cache invalidation requirements led us to build a custom cache layer because we want invalidation triggered by changes in DB tables.</p>
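<p>One way to wire invalidation to table changes is Django signals; here is a minimal sketch with a hypothetical <code>Product</code> model:</p>
<pre><code class="lang-python">from django.core.cache import cache
from django.db.models.signals import post_delete, post_save
from django.dispatch import receiver


@receiver([post_save, post_delete], sender=Product)
def invalidate_product(sender, instance, **kwargs):
    # Any write to the table evicts the cached copy; the next
    # read repopulates it from the DB.
    cache.delete(f"product:{instance.pk}")
</code></pre>
<p>Note that <code>QuerySet.update()</code> and raw SQL bypass signals, which is one reason a real layer ends up more involved than this.</p>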
<p>We also observed a lot of APIs making N+1 queries to the DB. We’re working on fixing them. We’re using <code>django-silk</code> to monitor all this behavior. Once your codebase grows large, observability becomes essential. Otherwise, things go haywire and you only notice at the last moment. You have to be proactive.</p>
<hr />
<h3 id="heading-thoughts-on-ai">Thoughts on AI</h3>
<p>I use multiple LLMs to write and fix code. One thing I’ve learned: you need tests. Without them, things become very difficult. The time from dev to “dev complete” has decreased, but the bottleneck has now moved to code reviews. After deploying for testing, multiple edge-case bugs show up, creating open loops and taking time to resolve.</p>
<p>Do I believe the cycle has become faster overall? Yes. Do I have data to prove it? Well, if I did, I’d be writing a paper on it. :P</p>
<p>AI does need supervision to drive it toward the correct answers or goals. Sometimes it starts “chasing its tail”, not great for us. There’s no single process for getting the best answers. Tech problems are usually open-ended and full of trade-offs, but AI can’t decide which trade-offs are acceptable.</p>
<p>I believe in developer liberty. Devs should be allowed to use whatever tools make them more productive and make their lives easier.</p>
<p>Context engineering is important, but sometimes magic happens, so you need to be open to that too. Make sure you provide the right amount of context, not too much and not too little. Otherwise, the LLM may consider factors that don’t really affect functionality. Be clear about your goal, but not too rigid about the approach.</p>
<p>There’s no “magic prompt.” Most of it is you talking to the LLM and working toward a solution. Treat it as your co-programmer, but don’t surrender your decision-making. Maintain your critical thinking skills. Don’t be fooled by the LLM, cross-check things whenever you’re suspicious.</p>
<p>If you have a very specific task, then no context is needed. For example: “Please extract the highlighted sentence from this screenshot.”</p>
<p>Some prompts that sometimes help: “please help,” “please fix this or my boss will fire me,” “my career depends on solving this issue,” “the deadline is closing in on me,” etc.</p>
<p>I don’t know what else to say. There’s no secret sauce. I don’t use autocomplete extensions like Cursor. Does that put me at a disadvantage? I don’t think so. I’m fine with the chat interface because I need to converse while working on a problem. Also, somehow talking to LLMs doesn’t count as social interaction, it doesn’t tire my mind.</p>
<p>I use all the LLMs I can, gather their code and approaches, and then choose whichever feels best to me.</p>
<p>I will close it now. I think this is all.</p>
<p>We will meet later.</p>
<p>Cheers.</p>
]]></content:encoded></item><item><title><![CDATA[What did I do in July]]></title><description><![CDATA[Well, long time no see. I got pretty busy or maybe I just didn’t have the will to write this.
Hmm… existential crisis.
Anyway, I built a full end-to-end functionality using AI heavily, and honestly, it didn’t suck. Context is always a bottleneck. For...]]></description><link>https://tech.peculiarvivek.com/what-did-i-do-in-july</link><guid isPermaLink="true">https://tech.peculiarvivek.com/what-did-i-do-in-july</guid><category><![CDATA[July]]></category><category><![CDATA[progress]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sat, 09 Aug 2025 06:43:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1754721711669/48dd0c75-47e4-4f36-90b3-7754500adaef.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Well, long time no see.</strong> I got pretty busy or maybe I just didn’t have the will to write this.</p>
<p><em>Hmm… existential crisis.</em></p>
<p>Anyway, I built a full end-to-end functionality using AI heavily, and honestly, it didn’t suck. Context is always a bottleneck. For similar tasks, there was a lot of repetition, but that’s likely because you have to provide so much context.</p>
<p>I think a better approach is to have AI first write an overall architecture, and then use a summary of that architecture along with your DB models and functions as context. Providing the functions and their return values works better than dumping a ton of raw code. Abstracting certain parts can really help.</p>
<hr />
<h3 id="heading-download-reports">Download Reports</h3>
<p>The functionality I built allows users to download reports in both JSON and XLSX formats - summarised reports and raw data. This needed to be an on-demand download so that the dependency on developers was removed.</p>
<p>It’s useful, but I’ve built similar reporting systems everywhere I’ve worked. There’s nothing novel in the core approach:</p>
<ul>
<li>Query the DB → Get data → Generate user-friendly reports.</li>
</ul>
<p>The real challenges are <strong>delivery and storage</strong>. Large datasets need cleanup after 3–4 months (unless you’re using buckets). But storage is cheap, so… who cares? ¯\_(ツ)_/¯</p>
<p>The client also wanted the files zipped, so we did that. Everything runs asynchronously.</p>
<p>AI wrote some code and I had to spend 4-6 hours finding and fixing a bug that AI introduced. Suxx.</p>
<hr />
<h3 id="heading-devops-work">DevOps Work</h3>
<p>I also did some DevOps. You’d think dockerising things is easy. It isn’t.</p>
<ul>
<li><p><strong>CI/CD:</strong> Setting up the pipeline was straightforward and saved us from manual deployments.</p>
</li>
<li><p><strong>Dockerising:</strong> Writing the Dockerfile is fine.</p>
</li>
<li><p><strong>Kubernetes:</strong> That’s where the pain starts. So many configurations, so many things that can go wrong. But once it’s set up, life becomes much easier. :P</p>
</li>
</ul>
<hr />
<h3 id="heading-mentoring-amp-bug-fixing">Mentoring &amp; Bug Fixing</h3>
<p>I resolved user-raised bugs and handled some organisational stuff.</p>
<p>I’m also mentoring two interns and helping with their projects. Let’s see how that turns out.</p>
<p>This was my first time mentoring someone, showing them the ropes.</p>
<p>I told them: <em>Don’t use AI to write code directly.</em> Your ability to write and understand code will diminish. Use AI for brainstorming and idea generation, but write code with your own hands. Speed isn’t the point here.</p>
<hr />
<h3 id="heading-the-permission-bug">The Permission Bug</h3>
<p>We had an issue where files created by Celery tasks weren’t accessible to the Django server. The root cause? The <code>www-data</code> user (which Apache2 runs as) was different from the user running the Celery jobs.</p>
<p>I also added shell aliases so that my life on the server becomes easier.</p>
<p>Solution:</p>
<pre><code class="lang-bash"># One plausible shape of the fix: make www-data the Celery user's primary
# group so the files it creates are group-accessible to Apache.
usermod -g www-data &lt;username&gt;
</code></pre>
<p>One command. Done.</p>
<hr />
<h3 id="heading-learning-amp-side-projects">Learning &amp; Side Projects</h3>
<p>I feel I’m getting better at LLD and HLD, always pragmatic, no over-engineering.</p>
<p>I also built a CLI-based side project for timezones: <a target="_blank" href="https://pypi.org/project/chrononav/">Chrononav</a>, nice, right?</p>
<p>I want to dive deeper into lower-level programming, maybe even kernel-level stuff, but I’m not sure where to start. I’ll probably look for open-source repos to contribute to.</p>
<p>I feel basic CRUD work can already be done with AI, with just a few tweaks. It’s not fully there yet, but almost.</p>
<p>Maybe I’ll become a Rust or Haskell dev. I’ve tried learning Rust three times and even contributed small things to OSS Rust projects, but I don’t vibe with it. Or maybe my neuroplasticity is gone.</p>
<p>Still, I think jumping straight into a Rust project might work for me, I’m more of a hands-on learner.</p>
]]></content:encoded></item><item><title><![CDATA[Getting User Names from Facebook Messages: A Graph API Workaround]]></title><description><![CDATA[When building applications that handle Facebook messages, you'll quickly run into a pretty frustrating limitation: the messages webhook doesn't include the sender's name. This seems like a weird oversight since if a user has messaged your application...]]></description><link>https://tech.peculiarvivek.com/getting-user-names-from-facebook-messages-a-graph-api-workaround</link><guid isPermaLink="true">https://tech.peculiarvivek.com/getting-user-names-from-facebook-messages-a-graph-api-workaround</guid><category><![CDATA[Facebook]]></category><category><![CDATA[graph api]]></category><category><![CDATA[username]]></category><category><![CDATA[APIs]]></category><category><![CDATA[webhooks]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sat, 31 May 2025 06:24:20 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1748672641680/6938c10c-b715-4e41-827d-f571f42ef8c0.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When building applications that handle Facebook messages, you'll quickly run into a pretty frustrating limitation: the messages webhook doesn't include the sender's name. This seems like a weird oversight since if a user has messaged your application, you should reasonably have access to their basic information like their name, right? Here's how we solved this problem without going through the lengthy app review process.</p>
<h2 id="heading-the-problem">The Problem</h2>
<p>Facebook's Messages Webhook gives you plenty of useful information when someone sends a message to your page or app:</p>
<ul>
<li><p>Message content</p>
</li>
<li><p>Timestamp</p>
</li>
<li><p>Message ID</p>
</li>
<li><p>Sender ID (PSID, a page-scoped ID)</p>
</li>
</ul>
<p>But what's notably missing? The sender's name! This creates a pretty poor user experience since you can't personalise responses or properly identify who you're talking to in your system.</p>
<h2 id="heading-the-obvious-solution-that-we-wanted-to-avoid">The Obvious Solution (That We Wanted to Avoid)</h2>
<p>The most straightforward approach would be to use the User Profile API:</p>
<pre><code class="lang-python">GET /{user-id}?fields=first_name,last_name
</code></pre>
<p>However, this API requires the <code>user_profile</code> permission, which is not available by default. To get this permission, Facebook requires you to go through their App Review process. This means:</p>
<ul>
<li><p>Submitting detailed documentation explaining why you need user profile data</p>
</li>
<li><p>Waiting weeks (sometimes months) for approval</p>
</li>
<li><p>Providing screencasts and detailed use case justifications</p>
</li>
<li><p>Risk of rejection if Facebook doesn't deem your use case valid</p>
</li>
</ul>
<p>Since users are actively messaging us, requiring an extensive app review process just to get their names felt excessive and time-consuming. We needed a solution that worked immediately without bureaucratic delays.</p>
<h2 id="heading-our-workaround-the-conversations-api">Our Workaround: The Conversations API</h2>
<p>After some digging around, we discovered a neat workaround using the Conversations API. Here's how it works:</p>
<h3 id="heading-step-1-use-the-conversations-api-with-filters">Step 1: Use the Conversations API with Filters</h3>
<p>Instead of trying to get user details directly, we queried the conversations endpoint with specific parameters:</p>
<pre><code class="lang-python">GET /{page-id}/conversations?platform=messenger&amp;user_id={sender_id}
</code></pre>
<p>This approach filters conversations by:</p>
<ul>
<li><p><code>platform=messenger</code> - Only messenger conversations (or you can use <code>platform=instagram</code> for Instagram messages!)</p>
</li>
<li><p><code>user_id={sender_id}</code> - Only conversations with the specific user who sent the message</p>
</li>
</ul>
<p>This successfully returned conversation data that included the user's name! And here's a bonus: this same approach works for Instagram messages too - just change the platform parameter to <code>instagram</code> and you'll get Instagram usernames.</p>
<h3 id="heading-step-2-alternative-approach-using-message-id">Step 2: Alternative Approach Using Message ID</h3>
<p>We also found another path that worked pretty well. Using the message ID from the webhook, we could get detailed information:</p>
<pre><code class="lang-python">GET /{message-id}?fields=to,from
</code></pre>
<p>The <code>fields=to,from</code> parameter returns sender and recipient information, including names. This gave us another solid way to extract user names from the conversation data.</p>
<h2 id="heading-implementation-strategy">Implementation Strategy</h2>
<p>Once you have the user's name through either method, the implementation is pretty straightforward:</p>
<ol>
<li><p><strong>Extract the name</strong> from the API response</p>
</li>
<li><p><strong>Map the sender_id to the name</strong> in your database</p>
</li>
<li><p><strong>Cache this mapping</strong> for future messages from the same user</p>
</li>
</ol>
<p>This way, you only need to make the API call once per user, and subsequent messages can use the cached name data.</p>
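<p>Putting the strategy together, here is a sketch of the once-per-user lookup. The <code>participants</code> field shape matches what we observed; the page id, access token, API version, and the in-memory cache are placeholders for your own config and storage:</p>
<pre><code class="lang-python">import requests

_names = {}  # sender_id -&gt; name; use your DB in production


def get_sender_name(sender_id, page_id, access_token):
    if sender_id in _names:
        return _names[sender_id]
    resp = requests.get(
        f"https://graph.facebook.com/v19.0/{page_id}/conversations",
        params={
            "platform": "messenger",  # or "instagram"
            "user_id": sender_id,
            "fields": "participants",
            "access_token": access_token,
        },
        timeout=10,
    )
    resp.raise_for_status()
    for conversation in resp.json().get("data", []):
        for person in conversation["participants"]["data"]:
            if person["id"] == sender_id:
                _names[sender_id] = person["name"]
                return person["name"]
    return None
</code></pre>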
<h2 id="heading-why-this-works">Why This Works</h2>
<p>This approach works because:</p>
<ol>
<li><p><strong>No special permissions required</strong> - The conversations API is available with basic page access tokens</p>
</li>
<li><p><strong>Logical data access</strong> - If someone messages you, you should be able to see who they are</p>
</li>
<li><p><strong>Efficient</strong> - You only need to fetch the name once per user</p>
</li>
<li><p><strong>User-initiated</strong> - The user has already engaged with your application</p>
</li>
<li><p><strong>Cross-platform</strong> - Works for both Facebook Messenger and Instagram messages</p>
</li>
</ol>
<h2 id="heading-conclusion">Conclusion</h2>
<p>While Facebook's webhook design choice to exclude sender names is pretty puzzling, the Conversations API provides a legit workaround. This solution lets you provide a personalised experience without the overhead of app review, while still respecting user privacy since you're only accessing names of users who have actively messaged your application.</p>
<p>The key insight here is that Facebook's Graph API often has multiple paths to the same data - sometimes the direct route requires more permissions, but alternative endpoints can give you the same information through the relationships between different objects in the social graph.</p>
<p>Just remember to always respect user privacy and only collect the data you actually need for your application's functionality. In this case, knowing who is messaging you is totally reasonable for providing good customer service and user experience.</p>
]]></content:encoded></item><item><title><![CDATA[Yield and we will spare your lives!]]></title><description><![CDATA[If you have programmed in python then you must have come across the keyword `yield` so what does it do? It YIELDS!

Have you been through the above pain? So what do we do?
We do not Yield to the machine overlord. We try to become efficient, and don’t...]]></description><link>https://tech.peculiarvivek.com/yield-and-we-will-spare-your-lives</link><guid isPermaLink="true">https://tech.peculiarvivek.com/yield-and-we-will-spare-your-lives</guid><category><![CDATA[Python]]></category><category><![CDATA[yield]]></category><category><![CDATA[lazy loading]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sat, 15 Mar 2025 14:33:50 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1742049220431/bc664fe0-9f1d-4ab1-904e-f1d949f4e74c.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you have programmed in Python, you must have come across the keyword <code>yield</code>. So what does it do? It YIELDS!</p>
<p><img src="https://i.imgflip.com/9nhavb.jpg" alt /></p>
<p>Have you been through the above pain? So what do we do?</p>
<p>We do not Yield to the machine overlord. We try to become efficient, and don’t let RAM limit our ambitions.</p>
<p>So yield will yield the data line by line.</p>
<p>Let us do a simple test.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> time
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> psutil
<span class="hljs-keyword">import</span> random


<span class="hljs-comment"># Without generators (reading entire file into memory)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">read_file_without_generator</span>(<span class="hljs-params">filename</span>):</span>
    <span class="hljs-keyword">with</span> open(filename, <span class="hljs-string">"r"</span>) <span class="hljs-keyword">as</span> file:
        lines = file.readlines()  <span class="hljs-comment"># Loads entire file into memory</span>
    <span class="hljs-keyword">return</span> lines


<span class="hljs-comment"># With generators (using yield)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">read_file_with_generator</span>(<span class="hljs-params">filename</span>):</span>
    <span class="hljs-keyword">with</span> open(filename, <span class="hljs-string">"r"</span>) <span class="hljs-keyword">as</span> file:
        <span class="hljs-keyword">for</span> line <span class="hljs-keyword">in</span> file:  <span class="hljs-comment"># Reads one line at a time</span>
            <span class="hljs-keyword">yield</span> line.strip()


<span class="hljs-comment"># Create a large test file (100MB)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_large_file</span>(<span class="hljs-params">filename, size_mb=<span class="hljs-number">100</span></span>):</span>
    <span class="hljs-keyword">with</span> open(filename, <span class="hljs-string">"w"</span>) <span class="hljs-keyword">as</span> f:
        <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> range(size_mb * <span class="hljs-number">10000</span>):  <span class="hljs-comment"># ~10KB per 100 lines</span>
            f.write(
                <span class="hljs-string">""</span>.join(random.choice(<span class="hljs-string">"abcdefghijklmnopqrstuvwxyz"</span>) <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> range(<span class="hljs-number">100</span>))
                + <span class="hljs-string">"\n"</span>
            )


<span class="hljs-comment"># Memory usage function</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_memory_usage</span>():</span>
    process = psutil.Process(os.getpid())
    <span class="hljs-keyword">return</span> process.memory_info().rss / <span class="hljs-number">1024</span> / <span class="hljs-number">1024</span>  <span class="hljs-comment"># MB</span>


<span class="hljs-comment"># Test without generator</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_without_generator</span>(<span class="hljs-params">filename</span>):</span>
    start_time = time.time()
    start_mem = get_memory_usage()

    lines = read_file_without_generator(filename)
    count = sum(<span class="hljs-number">1</span> <span class="hljs-keyword">for</span> line <span class="hljs-keyword">in</span> lines <span class="hljs-keyword">if</span> <span class="hljs-string">"a"</span> <span class="hljs-keyword">in</span> line)  <span class="hljs-comment"># Count lines containing 'a'</span>

    end_time = time.time()
    end_mem = get_memory_usage()

    <span class="hljs-keyword">return</span> {
        <span class="hljs-string">"time"</span>: end_time - start_time,
        <span class="hljs-string">"memory"</span>: end_mem - start_mem,
        <span class="hljs-string">"count"</span>: count,
    }


<span class="hljs-comment"># Test with generator</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_with_generator</span>(<span class="hljs-params">filename</span>):</span>
    start_time = time.time()
    start_mem = get_memory_usage()

    lines = read_file_with_generator(filename)
    count = sum(<span class="hljs-number">1</span> <span class="hljs-keyword">for</span> line <span class="hljs-keyword">in</span> lines <span class="hljs-keyword">if</span> <span class="hljs-string">"a"</span> <span class="hljs-keyword">in</span> line)  <span class="hljs-comment"># Count lines containing 'a'</span>

    end_time = time.time()
    end_mem = get_memory_usage()

    <span class="hljs-keyword">return</span> {
        <span class="hljs-string">"time"</span>: end_time - start_time,
        <span class="hljs-string">"memory"</span>: end_mem - start_mem,
        <span class="hljs-string">"count"</span>: count,
    }


<span class="hljs-comment"># Run the comparison</span>
temp_file = <span class="hljs-string">"large_test_file.txt"</span>
create_large_file(temp_file)

print(<span class="hljs-string">"Testing without generator..."</span>)
result1 = test_without_generator(temp_file)

print(<span class="hljs-string">"Testing with generator..."</span>)
result2 = test_with_generator(temp_file)

print(<span class="hljs-string">"\nResults:"</span>)
print(<span class="hljs-string">f"Without generator: <span class="hljs-subst">{result1[<span class="hljs-string">'time'</span>]:<span class="hljs-number">.2</span>f}</span> seconds, <span class="hljs-subst">{result1[<span class="hljs-string">'memory'</span>]:<span class="hljs-number">.2</span>f}</span> MB"</span>)
print(<span class="hljs-string">f"With generator: <span class="hljs-subst">{result2[<span class="hljs-string">'time'</span>]:<span class="hljs-number">.2</span>f}</span> seconds, <span class="hljs-subst">{result2[<span class="hljs-string">'memory'</span>]:<span class="hljs-number">.2</span>f}</span> MB"</span>)
print(<span class="hljs-string">f"Memory saved: <span class="hljs-subst">{result1[<span class="hljs-string">'memory'</span>] - result2[<span class="hljs-string">'memory'</span>]:<span class="hljs-number">.2</span>f}</span> MB"</span>)
print(<span class="hljs-string">f"Time difference: <span class="hljs-subst">{result1[<span class="hljs-string">'time'</span>] - result2[<span class="hljs-string">'time'</span>]:<span class="hljs-number">.2</span>f}</span> seconds"</span>)
</code></pre>
<p>It is more like lazy loading.</p>
<pre><code class="lang-python">Testing without generator...
Testing <span class="hljs-keyword">with</span> generator...

Results:
Without generator: <span class="hljs-number">0.11</span> seconds, <span class="hljs-number">153.16</span> MB
With generator: <span class="hljs-number">0.13</span> seconds, <span class="hljs-number">0.19</span> MB
Memory saved: <span class="hljs-number">152.97</span> MB
Time difference: <span class="hljs-number">-0.01</span> seconds
</code></pre>
<p>So there was barely any difference in execution time, but look at how much memory the non-generator version hogged.</p>
<p>The general idea is to process the file in chunks, and in Python you can use <code>yield</code> for that.</p>
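<p>To make that concrete, here is a minimal sketch of the pattern (the names here are illustrative, not the benchmark code above):</p>
<pre><code class="lang-python"># A minimal sketch of lazy, chunked file reading with yield.
def read_in_chunks(filename, chunk_size=64 * 1024):
    """Yield the file one chunk at a time instead of loading it all."""
    with open(filename) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk  # only one chunk lives in memory at a time


# Usage: memory stays flat no matter how large the file is.
total_chars = sum(len(chunk) for chunk in read_in_chunks("large_test_file.txt"))
</code></pre>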
]]></content:encoded></item><item><title><![CDATA[What did I do in February?]]></title><description><![CDATA[February was a short month - pun intended. Ahaha, I’m so funny.
We deployed the app backend to production without Docker, using Apache2 with mod_wsgi to serve the Django app. However, the Authorization header wasn’t reaching the Django app. After som...]]></description><link>https://tech.peculiarvivek.com/what-did-i-do-in-february</link><guid isPermaLink="true">https://tech.peculiarvivek.com/what-did-i-do-in-february</guid><category><![CDATA[Django]]></category><category><![CDATA[apache]]></category><category><![CDATA[mod_wsgi]]></category><category><![CDATA[celery]]></category><category><![CDATA[progress]]></category><category><![CDATA[february]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sat, 01 Mar 2025 05:42:35 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1740807708682/3e55fe86-cd90-4033-9e19-d5a2a253b267.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>February was a short month - pun intended. Ahaha, I’m so funny.</p>
<p>We deployed the app backend to production without Docker, using Apache2 with <code>mod_wsgi</code> to serve the Django app. However, the Authorization header wasn’t reaching the Django app. After some debugging, I discovered that when using <code>mod_wsgi</code>, you need to add the <code>WSGIPassAuthorization On</code> directive in your configuration file to ensure the Authorization header is included in the request.</p>
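<p>For reference, the relevant part of the vhost config looks roughly like this (a sketch - the paths and process names below are placeholders, not our actual setup):</p>
<pre><code class="lang-apache">&lt;VirtualHost *:80&gt;
    ServerName example.com

    # Without this, mod_wsgi strips the Authorization header
    # before the request reaches Django.
    WSGIPassAuthorization On

    WSGIDaemonProcess myapp python-home=/opt/venv python-path=/srv/myapp
    WSGIProcessGroup myapp
    WSGIScriptAlias / /srv/myapp/myapp/wsgi.py
&lt;/VirtualHost&gt;
</code></pre>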
<p>Why do you have to add the directive? They say it is for security. What security, man?</p>
<p><img src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/419a8f9f-63f0-480a-99b4-a48714df8376/deploy.jpg" alt="💃 Appease the Deployment Gods with Gearset 🕺" /></p>
<p>I need to create a deployment template - there should be a common one for all deployments. Over the past two months, I’ve deployed two Django apps to production. For one, I used Docker to run it and Apache2 to serve it to the internet. For the other, I used <code>mod_wsgi</code> with Apache2. Both are running smoothly. One thing I feel I need to add to my stack is observability.</p>
<p>I also worked on writing raw SQL queries for generating reports. Initially, I used Django ORM, but my lead pointed out that it was inefficient since I needed to perform joins. The challenge was that the two tables were in separate databases, which made things tricky. But after some trial and error, I figured it out.</p>
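<p>I can’t share the real queries, but one way to deal with tables living in two databases looks roughly like this (a hedged sketch, not my exact approach - the database aliases, tables, and columns are invented, since the ORM can’t join across databases):</p>
<pre><code class="lang-python">from django.db import connections

since = "2025-02-01"  # hypothetical filter value

# Query each database separately with raw SQL...
with connections["default"].cursor() as cur:
    cur.execute(
        "SELECT id, user_id, amount FROM orders WHERE created_at &gt;= %s", [since]
    )
    orders = cur.fetchall()

with connections["analytics"].cursor() as cur:
    cur.execute("SELECT user_id, segment FROM user_segments")
    segments = dict(cur.fetchall())

# ...then stitch the rows together in Python instead of a SQL join.
report_rows = [
    (order_id, user_id, amount, segments.get(user_id, "unknown"))
    for order_id, user_id, amount in orders
]
</code></pre>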
<p>I also had to add custom filtering for the report - like when a user wants only rows that meet some specific condition, etc. etc.</p>
<p>Here’s how report generation works: The API gets called, which triggers a Celery task, and once the report is ready, Celery sends a notification. Cool, right? I mean, this is just standard process - ehehehe.</p>
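<p>In code, that flow is roughly this (a minimal sketch; the task and helper names are invented for illustration):</p>
<pre><code class="lang-python">from celery import shared_task
from django.http import JsonResponse


def build_report(report_id):  # hypothetical helper doing the heavy SQL
    return {"report": report_id}


def notify_user(report_id, data):  # hypothetical notification step
    print(f"report {report_id} ready")


@shared_task
def generate_report(report_id):
    """Runs in a Celery worker, outside the request/response cycle."""
    notify_user(report_id, build_report(report_id))


def report_view(request, report_id):
    # The API view just enqueues the task and returns immediately.
    generate_report.delay(report_id)
    return JsonResponse({"status": "queued"})
</code></pre>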
<p><img src="https://www.blazesql.com/images/memes/skills.jpeg" alt="SQL Meme, you say? Check out the best SQL Memes ever." /></p>
<p>I debugged a streak maintenance issue where our logic allowed yesterday as a grace day to continue the streak. This caused some problems, and I had to tweak the logic multiple times - lmao. But it finally worked out.</p>
<p>I also wrote a README (API documentation) for the backend I worked on. It was time-consuming, but necessary. Maybe someone will need it in the future.</p>
<p><img src="https://codinginfinite.com/wp-content/uploads/2019/05/documentation-where-were-going-we-dont-need-documentation.jpg" alt="10 Funny Jokes Of 2019 Only Programmers Will Get | Meme | Funny" /></p>
<p>I worked on some smaller issues, but that’s about it.</p>
<p>I filled in my self-assessment form too; it was fun.</p>
<p>AI wrote a lot of code for me, and I’m kind of enjoying it. Occasionally, I have to fix minor issues, but that’s no big deal.</p>
<p>I also started learning Haskell - turns out most of functional programming is just list comprehension. :P</p>
<p><img src="https://i.redd.it/s0l1q498p8521.jpg" alt="Haskell is an easy to learn language : r/ProgrammerHumor" /></p>
<p>I’m not sure how to become a better developer now. With AI, everything is at my fingertips—tasks that used to take days now take hours, sometimes even minutes. I’m definitely faster, but I’m not sure if that means I’m better.</p>
<p>Since tech hiring is so focused on interviews, I wonder how things will evolve in the future.</p>
]]></content:encoded></item><item><title><![CDATA[What did I work on in January?]]></title><description><![CDATA[I think it's time to document everything work-related. I am not sure if I can add the links to my work here, I’ll ask my boss.
What happened in January?
I primarily worked on two major tasks:

Deploying a React-Django app (which I built end to end) t...]]></description><link>https://tech.peculiarvivek.com/what-did-i-work-on-in-january</link><guid isPermaLink="true">https://tech.peculiarvivek.com/what-did-i-work-on-in-january</guid><category><![CDATA[Django]]></category><category><![CDATA[summary]]></category><category><![CDATA[january]]></category><category><![CDATA[React]]></category><category><![CDATA[deployment]]></category><category><![CDATA[apache2 ]]></category><category><![CDATA[PostgreSQL]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sat, 01 Feb 2025 05:45:15 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1738388485485/6d12d60d-26b5-4da0-997b-2005dd72352c.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I think it's time to document everything work-related. I am not sure if I can add the links to my work here, I’ll ask my boss.</p>
<h4 id="heading-what-happened-in-january">What happened in January?</h4>
<p>I primarily worked on two major tasks:</p>
<ol>
<li><p>Deploying a <strong>React-Django app</strong> (which I built end to end) to production.</p>
</li>
<li><p>Working on the backend of an application that uses <strong>LLM as a reasoning layer</strong>.</p>
</li>
</ol>
<p>Additionally, I worked on fetching reviews from third-party APIs and storing them in the database. Sounds simple, right? It actually is—but only if you handle edge cases properly, avoid rate limits, and ensure that unnecessary costs aren’t incurred.</p>
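<p>For the rate limits specifically, the standard trick is retrying with exponential backoff; here is a minimal sketch (the endpoint and limits are made up):</p>
<pre><code class="lang-python">import time

import requests


def fetch_reviews(url, max_retries=5):
    """Retry with exponential backoff when the API rate-limits us (HTTP 429)."""
    for attempt in range(max_retries):
        resp = requests.get(url)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        # Honour Retry-After if the API sends it, else back off exponentially.
        wait = float(resp.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
    raise RuntimeError("rate limited too many times")
</code></pre>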
<h4 id="heading-search-functionality-in-the-react-django-app">Search Functionality in the React-Django App</h4>
<p>I implemented a search feature using <strong>PostgreSQL’s trigram search</strong>, which comes as a built-in extension. Why trigram? Because the dataset was limited to just <strong>90 records</strong> - there was no need to overcomplicate things. Maybe in the future, I'll consider <strong>Elasticsearch</strong>.</p>
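<p>With Django, the whole search is only a few lines (a sketch - the model and field are placeholders, and <code>pg_trgm</code> has to be enabled on the database first):</p>
<pre><code class="lang-python">from django.contrib.postgres.search import TrigramSimilarity

query = "coffee"  # hypothetical search term

# Item is a hypothetical model with a CharField "name"; requires the pg_trgm
# extension (e.g. via django.contrib.postgres.operations.TrigramExtension).
results = (
    Item.objects.annotate(similarity=TrigramSimilarity("name", query))
    .filter(similarity__gt=0.3)
    .order_by("-similarity")
)
</code></pre>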
<h4 id="heading-what-did-i-learn">What Did I Learn?</h4>
<ul>
<li><p><strong>Dealing with clients</strong></p>
</li>
<li><p><strong>Writing better documentation</strong></p>
</li>
<li><p><strong>Deployments (those pesky config issues)</strong></p>
</li>
<li><p><strong>Frontend adjustments</strong>: minor changes, centering elements, aligning components—patience is key!</p>
</li>
</ul>
<p>For serving static assets, I used <strong>Apache</strong>, while the Django app ran with <code>runserver</code> since we weren’t expecting much traffic.</p>
<h4 id="heading-anything-novel-this-month">Anything Novel This Month?</h4>
<p>Honestly, nothing groundbreaking. I didn’t explore any new technologies - just relied on good old <strong>reliable tech</strong>. Maybe I should get back to my <strong>Rust learning journey</strong>. I also want to write systems-level code - kernel functions, some CUDA, that sort of thing. Given the AI boom, I think I can learn these things easily.</p>
<h4 id="heading-ai-in-development">AI in Development</h4>
<p>I used a lot of AI-generated code:</p>
<ul>
<li><p><strong>Claude</strong> was useful for frontend work.</p>
</li>
<li><p><strong>GPT-4o &amp; 4o-mini</strong> helped with Python code.</p>
</li>
<li><p><strong>O1</strong> assisted with deployment issues.</p>
</li>
</ul>
<p>Research was mostly done via <strong>Google</strong>. Documentation is underrated—it really helps.</p>
<p>I enjoy working on <strong>early-stage products</strong>.</p>
<p>AI has definitely made coding <strong>faster</strong>, but debugging AI-generated errors can be time-consuming. Speaking of which, <strong>o3-mini</strong> just launched - I haven’t tried it yet.</p>
<p><strong>Adding some links I found useful</strong></p>
<ul>
<li><p><a target="_blank" href="https://www.postgresql.org/docs/current/pgtrgm.html">https://www.postgresql.org/docs/current/pgtrgm.html</a></p>
</li>
<li><p><a target="_blank" href="https://docs.djangoproject.com/en/4.1/ref/contrib/postgres/lookups/#trigram-similarity">https://docs.djangoproject.com/en/4.1/ref/contrib/postgres/lookups/#trigram-similarity</a></p>
</li>
<li><p><a target="_blank" href="https://claude.ai">https://claude.ai</a></p>
</li>
<li><p><a target="_blank" href="https://www.digitalocean.com/community/tutorials/how-to-configure-the-apache-web-server-on-an-ubuntu-or-debian-vps">https://www.digitalocean.com/community/tutorials/how-to-configure-the-apache-web-server-on-an-ubuntu-or-debian-vps</a></p>
</li>
<li><p><a target="_blank" href="https://whitenoise.readthedocs.io/en/stable/django.html">https://whitenoise.readthedocs.io/en/stable/django.html</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[LLMs are going to take away our jobs]]></title><description><![CDATA[I came across a question on twitter/X. I had some ideas on how to solve it. But then I thought why not ask our LLMs if they can solve it. So let’s take a look at the question.
I didn’t wanna type the question but Multi-modal LLMs for the win!

Claude...]]></description><link>https://tech.peculiarvivek.com/llms-are-going-to-take-away-our-jobs</link><guid isPermaLink="true">https://tech.peculiarvivek.com/llms-are-going-to-take-away-our-jobs</guid><category><![CDATA[llm]]></category><category><![CDATA[#PromptEngineering]]></category><category><![CDATA[DFS]]></category><category><![CDATA[BFS]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Wed, 01 Jan 2025 14:24:39 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1735741403917/c6235202-8185-4b24-afff-d79f597faffb.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I came across a question on twitter/X. I had some ideas on how to solve it. But then I thought why not ask our LLMs if they can solve it. So let’s take a look at the question.</p>
<p>I didn’t wanna type the question, but multi-modal LLMs for the win!</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1735733744649/767c3ec3-f0b0-48c0-a55b-2783ff0d331e.webp" alt class="image--center mx-auto" /></p>
<p><strong><em>Claude Sonnet 3.5</em></strong></p>
<p>Python Solution</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> collections <span class="hljs-keyword">import</span> deque


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">parse_maze</span>(<span class="hljs-params">grid</span>):</span>
    <span class="hljs-string">"""Convert list of strings maze representation to 2D list and find target points."""</span>
    maze = [list(line) <span class="hljs-keyword">for</span> line <span class="hljs-keyword">in</span> grid]
    target_points = []
    <span class="hljs-keyword">for</span> i, row <span class="hljs-keyword">in</span> enumerate(maze):
        <span class="hljs-keyword">for</span> j, cell <span class="hljs-keyword">in</span> enumerate(row):
            <span class="hljs-keyword">if</span> cell == <span class="hljs-string">"*"</span>:
                target_points.append((i, j))
    <span class="hljs-keyword">return</span> maze, target_points


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_neighbors</span>(<span class="hljs-params">point, shape</span>):</span>
    <span class="hljs-string">"""Get valid neighboring points (up, down, left, right)."""</span>
    y, x = point
    neighbors = []
    <span class="hljs-keyword">for</span> dy, dx <span class="hljs-keyword">in</span> [(<span class="hljs-number">0</span>, <span class="hljs-number">1</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">0</span>), (<span class="hljs-number">0</span>, <span class="hljs-number">-1</span>), (<span class="hljs-number">-1</span>, <span class="hljs-number">0</span>)]:
        ny, nx = y + dy, x + dx
        <span class="hljs-keyword">if</span> <span class="hljs-number">0</span> &lt;= ny &lt; shape[<span class="hljs-number">0</span>] <span class="hljs-keyword">and</span> <span class="hljs-number">0</span> &lt;= nx &lt; shape[<span class="hljs-number">1</span>]:
            neighbors.append((ny, nx))
    <span class="hljs-keyword">return</span> neighbors


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">shortest_path_length</span>(<span class="hljs-params">maze, start, end</span>):</span>
    <span class="hljs-string">"""Calculate shortest path length between two points avoiding walls."""</span>
    <span class="hljs-keyword">if</span> start == end:
        <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>

    visited = set()
    queue = deque([(start, <span class="hljs-number">0</span>)])
    visited.add(start)

    <span class="hljs-keyword">while</span> queue:
        current, dist = queue.popleft()

        <span class="hljs-keyword">for</span> neighbor <span class="hljs-keyword">in</span> get_neighbors(current, (len(maze), len(maze[<span class="hljs-number">0</span>]))):
            <span class="hljs-keyword">if</span> neighbor <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> visited <span class="hljs-keyword">and</span> maze[neighbor[<span class="hljs-number">0</span>]][neighbor[<span class="hljs-number">1</span>]] != <span class="hljs-string">"#"</span>:
                <span class="hljs-keyword">if</span> neighbor == end:
                    <span class="hljs-keyword">return</span> dist + <span class="hljs-number">1</span>
                visited.add(neighbor)
                queue.append((neighbor, dist + <span class="hljs-number">1</span>))

    <span class="hljs-keyword">return</span> float(<span class="hljs-string">"inf"</span>)


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">total_distance</span>(<span class="hljs-params">maze, point, target_points</span>):</span>
    <span class="hljs-string">"""Calculate sum of shortest path distances from point to all target points."""</span>
    <span class="hljs-keyword">return</span> sum(shortest_path_length(maze, point, target) <span class="hljs-keyword">for</span> target <span class="hljs-keyword">in</span> target_points)


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">solve_geometric_median_maze</span>(<span class="hljs-params">grid</span>):</span>
    <span class="hljs-string">"""Find the point that minimizes sum of distances to all target points."""</span>
    maze, target_points = parse_maze(grid)
    min_total_dist = float(<span class="hljs-string">"inf"</span>)
    best_point = <span class="hljs-literal">None</span>

    <span class="hljs-comment"># Try all empty spaces as potential solution points</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(len(maze)):
        <span class="hljs-keyword">for</span> j <span class="hljs-keyword">in</span> range(len(maze[<span class="hljs-number">0</span>])):
            <span class="hljs-keyword">if</span> maze[i][j] != <span class="hljs-string">"#"</span>:  <span class="hljs-comment"># If not a wall</span>
                dist = total_distance(maze, (i, j), target_points)
                <span class="hljs-keyword">if</span> dist &lt; min_total_dist:
                    min_total_dist = dist
                    best_point = (i, j)

    <span class="hljs-keyword">return</span> best_point, min_total_dist


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">visualize_solution</span>(<span class="hljs-params">maze, solution_point</span>):</span>
    <span class="hljs-string">"""Create a string visualization of the maze with the solution point."""</span>
    maze_with_solution = [row[:] <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> maze]
    <span class="hljs-keyword">if</span> solution_point:
        maze_with_solution[solution_point[<span class="hljs-number">0</span>]][solution_point[<span class="hljs-number">1</span>]] = <span class="hljs-string">"O"</span>
    <span class="hljs-keyword">return</span> <span class="hljs-string">"\n"</span>.join(<span class="hljs-string">""</span>.join(row) <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> maze_with_solution)


<span class="hljs-comment"># Example grid</span>
grid = [
    <span class="hljs-string">"###################"</span>,
    <span class="hljs-string">"#    #     #      #"</span>,
    <span class="hljs-string">"#  *  #  *  #     #"</span>,
    <span class="hljs-string">"#    #     #      #"</span>,
    <span class="hljs-string">"#  #  #  #        #"</span>,
    <span class="hljs-string">"#  #           #  #"</span>,
    <span class="hljs-string">"#  #        *     #"</span>,
    <span class="hljs-string">"#  #  #           #"</span>,
    <span class="hljs-string">"#     #           #"</span>,
    <span class="hljs-string">"#  *  #           #"</span>,
    <span class="hljs-string">"###################"</span>,
]

<span class="hljs-comment"># Solve the maze</span>
solution_point, total_dist = solve_geometric_median_maze(grid)
print(<span class="hljs-string">f"Best solution point (row, col): <span class="hljs-subst">{solution_point}</span>"</span>)
print(<span class="hljs-string">f"Total distance to all targets: <span class="hljs-subst">{total_dist}</span>"</span>)

<span class="hljs-comment"># Visualize solution</span>
maze, _ = parse_maze(grid)
print(<span class="hljs-string">"\nSolution visualization:"</span>)
print(visualize_solution(maze, solution_point))
</code></pre>
<p><strong><em>GPT4o</em></strong></p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> queue <span class="hljs-keyword">import</span> Queue


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">solve_min_distance</span>(<span class="hljs-params">grid</span>):</span>
    <span class="hljs-comment"># Check if all rows are of the same length</span>
    row_lengths = [len(row) <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> grid]
    <span class="hljs-keyword">if</span> len(set(row_lengths)) &gt; <span class="hljs-number">1</span>:
        print(<span class="hljs-string">"Warning: Grid rows have inconsistent lengths!"</span>)
        print(<span class="hljs-string">f"Row lengths: <span class="hljs-subst">{row_lengths}</span>"</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>, <span class="hljs-literal">None</span>

    <span class="hljs-comment"># Convert the grid to a numerical format: 0 = empty space, 1 = wall, 2 = target</span>
    grid_numeric = []
    targets = []

    <span class="hljs-keyword">for</span> r, row <span class="hljs-keyword">in</span> enumerate(grid):
        grid_row = []
        <span class="hljs-keyword">for</span> c, char <span class="hljs-keyword">in</span> enumerate(row):
            <span class="hljs-keyword">if</span> char == <span class="hljs-string">"#"</span>:
                grid_row.append(<span class="hljs-number">1</span>)  <span class="hljs-comment"># Wall</span>
            <span class="hljs-keyword">elif</span> char == <span class="hljs-string">"*"</span>:
                grid_row.append(<span class="hljs-number">2</span>)  <span class="hljs-comment"># Target</span>
                targets.append((r, c))  <span class="hljs-comment"># Store target positions</span>
            <span class="hljs-keyword">else</span>:
                grid_row.append(<span class="hljs-number">0</span>)  <span class="hljs-comment"># Empty space</span>
        grid_numeric.append(grid_row)

    <span class="hljs-comment"># Debug: Print grid numeric representation</span>
    print(<span class="hljs-string">"Grid numeric representation:"</span>)
    <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> grid_numeric:
        print(row)

    <span class="hljs-comment"># Validate targets</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> targets:
        print(<span class="hljs-string">"No targets (*) found in the grid."</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>, <span class="hljs-literal">None</span>

    <span class="hljs-comment"># Helper function to compute shortest path distance using BFS</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">bfs_distance</span>(<span class="hljs-params">grid, start</span>):</span>
        rows, cols = len(grid), len(grid[<span class="hljs-number">0</span>])
        visited = [
            [<span class="hljs-number">-1</span> <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> range(cols)] <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> range(rows)
        ]  <span class="hljs-comment"># Distance initialized to -1 (unvisited)</span>
        q = Queue()
        q.put(start)
        visited[start[<span class="hljs-number">0</span>]][start[<span class="hljs-number">1</span>]] = <span class="hljs-number">0</span>

        directions = [(<span class="hljs-number">-1</span>, <span class="hljs-number">0</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">0</span>), (<span class="hljs-number">0</span>, <span class="hljs-number">-1</span>), (<span class="hljs-number">0</span>, <span class="hljs-number">1</span>)]  <span class="hljs-comment"># Up, down, left, right</span>

        <span class="hljs-keyword">while</span> <span class="hljs-keyword">not</span> q.empty():
            x, y = q.get()

            <span class="hljs-keyword">for</span> dx, dy <span class="hljs-keyword">in</span> directions:
                nx, ny = x + dx, y + dy

                <span class="hljs-comment"># Debug: Check next position</span>
                <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> (<span class="hljs-number">0</span> &lt;= nx &lt; rows <span class="hljs-keyword">and</span> <span class="hljs-number">0</span> &lt;= ny &lt; cols):
                    print(<span class="hljs-string">f"Skipping out-of-bounds position: (<span class="hljs-subst">{nx}</span>, <span class="hljs-subst">{ny}</span>)"</span>)
                    <span class="hljs-keyword">continue</span>

                <span class="hljs-keyword">if</span> (
                    visited[nx][ny] == <span class="hljs-number">-1</span> <span class="hljs-keyword">and</span> grid[nx][ny] != <span class="hljs-number">1</span>
                ):  <span class="hljs-comment"># Not visited and not a wall</span>
                    visited[nx][ny] = visited[x][y] + <span class="hljs-number">1</span>
                    q.put((nx, ny))

        <span class="hljs-keyword">return</span> visited

    <span class="hljs-comment"># Compute distance maps for each target</span>
    distance_maps = [bfs_distance(grid_numeric, target) <span class="hljs-keyword">for</span> target <span class="hljs-keyword">in</span> targets]

    <span class="hljs-comment"># Replace all occurrences of -1 in the distance maps with a large value (infinity)</span>
    INF = float(<span class="hljs-string">"inf"</span>)
    refined_distance_maps = []
    <span class="hljs-keyword">for</span> i, distance_map <span class="hljs-keyword">in</span> enumerate(distance_maps):
        refined_map = []
        <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> distance_map:
            refined_map.append([cell <span class="hljs-keyword">if</span> cell != <span class="hljs-number">-1</span> <span class="hljs-keyword">else</span> INF <span class="hljs-keyword">for</span> cell <span class="hljs-keyword">in</span> row])
        refined_distance_maps.append(refined_map)

        <span class="hljs-comment"># Debug: Output the distance map</span>
        print(<span class="hljs-string">f"Distance map for target <span class="hljs-subst">{targets[i]}</span>:"</span>)
        <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> refined_map:
            print(row)

    <span class="hljs-comment"># Calculate the combined distance map</span>
    rows, cols = len(grid_numeric), len(grid_numeric[<span class="hljs-number">0</span>])
    combined_distances = [[<span class="hljs-number">0</span>] * cols <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> range(rows)]

    <span class="hljs-comment"># Debug: Check grid size</span>
    print(<span class="hljs-string">f"Grid size: <span class="hljs-subst">{rows}</span> rows, <span class="hljs-subst">{cols}</span> columns"</span>)

    <span class="hljs-keyword">for</span> r <span class="hljs-keyword">in</span> range(rows):
        <span class="hljs-keyword">for</span> c <span class="hljs-keyword">in</span> range(cols):
            <span class="hljs-keyword">if</span> all(refined_map[r][c] != INF <span class="hljs-keyword">for</span> refined_map <span class="hljs-keyword">in</span> refined_distance_maps):
                combined_distances[r][c] = sum(
                    refined_map[r][c] <span class="hljs-keyword">for</span> refined_map <span class="hljs-keyword">in</span> refined_distance_maps
                )
            <span class="hljs-keyword">else</span>:
                combined_distances[r][c] = INF

    <span class="hljs-comment"># Debug: Output combined distances map</span>
    print(<span class="hljs-string">"Combined distances map:"</span>)
    <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> combined_distances:
        print(row)

    <span class="hljs-comment"># Find the optimal point minimizing the sum of distances</span>
    min_distance = INF
    optimal_position = <span class="hljs-literal">None</span>
    <span class="hljs-keyword">for</span> r <span class="hljs-keyword">in</span> range(rows):
        <span class="hljs-keyword">for</span> c <span class="hljs-keyword">in</span> range(cols):
            <span class="hljs-comment"># Check that the coordinates do not go out of bounds and that we're not at a wall</span>
            <span class="hljs-keyword">if</span> (
                <span class="hljs-number">0</span> &lt;= r &lt; rows
                <span class="hljs-keyword">and</span> <span class="hljs-number">0</span> &lt;= c &lt; cols
                <span class="hljs-keyword">and</span> grid_numeric[r][c] == <span class="hljs-number">0</span>
                <span class="hljs-keyword">and</span> combined_distances[r][c] &lt; min_distance
            ):  <span class="hljs-comment"># Empty space</span>
                min_distance = combined_distances[r][c]
                optimal_position = (r, c)

    <span class="hljs-keyword">if</span> optimal_position <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
        print(<span class="hljs-string">"No valid position found."</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>, INF

    <span class="hljs-keyword">return</span> optimal_position, min_distance


<span class="hljs-comment"># Test the function with your grid</span>
grid = [
    <span class="hljs-string">"###################"</span>,
    <span class="hljs-string">"#    #     #      #"</span>,
    <span class="hljs-string">"#  *  #  *  #     #"</span>,
    <span class="hljs-string">"#    #     #      #"</span>,
    <span class="hljs-string">"#  #  #  #        #"</span>,
    <span class="hljs-string">"#  #           #  #"</span>,
    <span class="hljs-string">"#  #        *     #"</span>,
    <span class="hljs-string">"#  #  #           #"</span>,
    <span class="hljs-string">"#     #           #"</span>,
    <span class="hljs-string">"#  *  #           #"</span>,
    <span class="hljs-string">"###################"</span>,
]

optimal_position, min_distance = solve_min_distance(grid)
print(<span class="hljs-string">f"Optimal Position: <span class="hljs-subst">{optimal_position}</span>, Minimum Distance: <span class="hljs-subst">{min_distance}</span>"</span>)
</code></pre>
<p><strong><em>Gemini-1206-Experimental</em></strong></p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> collections <span class="hljs-keyword">import</span> deque


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">solve_grid_puzzle</span>(<span class="hljs-params">grid</span>):</span>
    <span class="hljs-string">"""
    Finds the point in a grid that minimizes the sum of distances to target points,
    respecting walls.

    Args:
      grid: A list of strings representing the grid. '#' for walls, '*' for targets, ' ' for empty spaces.

    Returns:
      A tuple: (row, col) of the optimal solution point, or None if no solution is found.
    """</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_neighbors</span>(<span class="hljs-params">row, col</span>):</span>
        <span class="hljs-string">"""Returns valid neighbors of a cell (no walls)."""</span>
        neighbors = []
        <span class="hljs-keyword">for</span> dr, dc <span class="hljs-keyword">in</span> [(<span class="hljs-number">0</span>, <span class="hljs-number">1</span>), (<span class="hljs-number">0</span>, <span class="hljs-number">-1</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">0</span>), (<span class="hljs-number">-1</span>, <span class="hljs-number">0</span>)]:
            new_row, new_col = row + dr, col + dc
            <span class="hljs-comment"># Correctly check if the new position is within bounds for the specific row</span>
            <span class="hljs-keyword">if</span> (
                <span class="hljs-number">0</span> &lt;= new_row &lt; len(grid)
                <span class="hljs-keyword">and</span> <span class="hljs-number">0</span> &lt;= new_col &lt; len(grid[new_row])
                <span class="hljs-keyword">and</span> grid[new_row][new_col] != <span class="hljs-string">"#"</span>
            ):
                neighbors.append((new_row, new_col))
        <span class="hljs-keyword">return</span> neighbors

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">bfs</span>(<span class="hljs-params">start_row, start_col</span>):</span>
        <span class="hljs-string">"""Performs Breadth-First Search to find shortest distances from a target."""</span>
        distances = {}
        queue = deque([(start_row, start_col, <span class="hljs-number">0</span>)])  <span class="hljs-comment"># (row, col, distance)</span>
        visited = set()

        <span class="hljs-keyword">while</span> queue:
            row, col, dist = queue.popleft()
            <span class="hljs-keyword">if</span> (row, col) <span class="hljs-keyword">in</span> visited:
                <span class="hljs-keyword">continue</span>
            visited.add((row, col))

            <span class="hljs-keyword">if</span> (
                grid[row][col] == <span class="hljs-string">" "</span>
            ):  <span class="hljs-comment"># Consider only empty spaces as potential solutions</span>
                distances[(row, col)] = dist

            <span class="hljs-keyword">for</span> neighbor_row, neighbor_col <span class="hljs-keyword">in</span> get_neighbors(row, col):
                queue.append((neighbor_row, neighbor_col, dist + <span class="hljs-number">1</span>))

        <span class="hljs-keyword">return</span> distances

    <span class="hljs-comment"># Find target points and empty spaces</span>
    targets = []
    empty_spaces = []
    <span class="hljs-keyword">for</span> row_index, row <span class="hljs-keyword">in</span> enumerate(grid):
        <span class="hljs-keyword">for</span> col_index, cell <span class="hljs-keyword">in</span> enumerate(row):
            <span class="hljs-keyword">if</span> cell == <span class="hljs-string">"*"</span>:
                targets.append((row_index, col_index))
            <span class="hljs-keyword">elif</span> cell == <span class="hljs-string">" "</span>:
                empty_spaces.append((row_index, col_index))

    <span class="hljs-comment"># Calculate shortest distances from each target</span>
    all_distances = {}
    <span class="hljs-keyword">for</span> target_row, target_col <span class="hljs-keyword">in</span> targets:
        all_distances[(target_row, target_col)] = bfs(target_row, target_col)

    <span class="hljs-comment"># Find the best solution point</span>
    best_solution = <span class="hljs-literal">None</span>
    min_total_distance = float(<span class="hljs-string">"inf"</span>)

    <span class="hljs-keyword">for</span> solution_row, solution_col <span class="hljs-keyword">in</span> empty_spaces:
        total_distance = <span class="hljs-number">0</span>
        valid_solution = <span class="hljs-literal">True</span>
        <span class="hljs-keyword">for</span> target_row, target_col <span class="hljs-keyword">in</span> targets:
            <span class="hljs-keyword">if</span> (solution_row, solution_col) <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> all_distances[
                (target_row, target_col)
            ]:
                valid_solution = <span class="hljs-literal">False</span>  <span class="hljs-comment"># No path from target to this solution</span>
                <span class="hljs-keyword">break</span>
            total_distance += all_distances[(target_row, target_col)][
                (solution_row, solution_col)
            ]

        <span class="hljs-keyword">if</span> valid_solution <span class="hljs-keyword">and</span> total_distance &lt; min_total_distance:
            min_total_distance = total_distance
            best_solution = (solution_row, solution_col)

    <span class="hljs-keyword">return</span> best_solution


<span class="hljs-comment"># Example usage with the provided grid:</span>
grid = [
    <span class="hljs-string">"###################"</span>,
    <span class="hljs-string">"#    #     #      #"</span>,
    <span class="hljs-string">"#  *  #  *  #     #"</span>,
    <span class="hljs-string">"#    #     #      #"</span>,
    <span class="hljs-string">"#  #  #  #        #"</span>,
    <span class="hljs-string">"#  #           #  #"</span>,
    <span class="hljs-string">"#  #        *     #"</span>,
    <span class="hljs-string">"#  #  #           #"</span>,
    <span class="hljs-string">"#     #           #"</span>,
    <span class="hljs-string">"#  *  #           #"</span>,
    <span class="hljs-string">"###################"</span>,
]

solution = solve_grid_puzzle(grid)

<span class="hljs-keyword">if</span> solution:
    print(<span class="hljs-string">"Best solution point:"</span>, solution)
<span class="hljs-keyword">else</span>:
    print(<span class="hljs-string">"No solution found."</span>)
</code></pre>
<p>Claude solved it in one shot; GPT-4o took two shots, and Gemini also took two.</p>
]]></content:encoded></item><item><title><![CDATA[DDIA - Chapter 9 - Consistency and Consensus - Thoughts and notes]]></title><description><![CDATA[Disclaimer: this blog has been written with the help of LLM, please do not hate me
I never imagined that consensus would be such a difficult problem, in a democracy one has consensus right, it is so easy. Even if there are multiple failure points the...]]></description><link>https://tech.peculiarvivek.com/ddia-chapter-9-consistency-and-consensus-thoughts-and-notes</link><guid isPermaLink="true">https://tech.peculiarvivek.com/ddia-chapter-9-consistency-and-consensus-thoughts-and-notes</guid><category><![CDATA[Consensus]]></category><category><![CDATA[distributed systems]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sun, 15 Dec 2024 05:26:38 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1734240344989/94592083-815a-42fd-9d58-b8e5339fd23e.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Disclaimer: this blog has been written with the help of LLM, please do not hate me</em></p>
<p>I never imagined that consensus would be such a difficult problem. In a democracy, consensus looks easy - even with multiple points of failure, a leader still gets elected.</p>
<p>Anyway, this chapter covers linearizability, consistency, consensus, and how to stay fault tolerant while reaching consensus. The catch is that the tradeoff for making a system consistent is performance. There wasn’t much to discover in this chapter; I found it a bit boring. The basic idea when building a distributed system should be to treat time as linear - I’d always use epochs.</p>
<h1 id="heading-linearizability-core-concepts">Linearizability: Core Concepts</h1>
<h2 id="heading-definition">Definition</h2>
<ul>
<li><p>Makes a distributed system appear as a single replica without replication lag</p>
</li>
<li><p>Ensures all clients see the most recent value once it's written</p>
</li>
<li><p>Requires consistent data across all replicas at the same time</p>
</li>
</ul>
<h2 id="heading-key-characteristics">Key Characteristics</h2>
<ul>
<li><p>Strict consistency model</p>
</li>
<li><p>Once a client sees a new value, all other clients must see that same new value</p>
</li>
<li><p>Critical for scenarios like:</p>
<ul>
<li><p>Leader election in single-leader systems</p>
</li>
<li><p>Choosing unique usernames</p>
</li>
<li><p>Seat booking systems</p>
</li>
</ul>
</li>
</ul>
<h2 id="heading-trade-offs">Trade-Offs</h2>
<ul>
<li><p>Performance-intensive</p>
</li>
<li><p>Often sacrificed for availability</p>
</li>
<li><p>Costly in multi-data center environments</p>
</li>
<li><p>Not many systems provide true linearizability</p>
</li>
</ul>
<h2 id="heading-limitations">Limitations</h2>
<ul>
<li><p>High performance overhead</p>
</li>
<li><p>Difficult to implement across distributed systems</p>
</li>
<li><p>Network delays can significantly impact implementation</p>
</li>
</ul>
<h1 id="heading-distributed-transactions">Distributed Transactions</h1>
<h2 id="heading-types">Types</h2>
<ol>
<li><p>Database-internal transactions</p>
<ul>
<li><p>Used within same database system</p>
</li>
<li><p>Easier to implement</p>
</li>
</ul>
</li>
<li><p>Heterogeneous distributed transactions</p>
<ul>
<li><p>Involve multiple, different systems</p>
</li>
<li><p>Focus on ensuring atomic commits</p>
</li>
</ul>
</li>
</ol>
<h2 id="heading-key-mechanism-two-phase-commit-2pc">Key Mechanism: Two-Phase Commit (2PC)</h2>
<ul>
<li><p>Two phases: Prepare and Commit</p>
</li>
<li><p>Transaction coordinator manages process</p>
</li>
<li><p>Ensures all nodes commit or abort together (see the toy sketch after this list)</p>
</li>
</ul>
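<p>A toy coordinator makes the two phases easier to see (a pure-Python sketch of the idea, not how real 2PC implementations look on the wire):</p>
<pre><code class="lang-python">class Participant:
    """Toy participant: votes yes in prepare unless it is 'unhealthy'."""

    def __init__(self, name, healthy=True):
        self.name, self.healthy, self.state = name, healthy, "idle"

    def prepare(self, txn):
        # Phase 1: promise to commit if asked (a durable promise in real systems).
        self.state = "prepared" if self.healthy else "aborted"
        return self.healthy

    def commit(self, txn):
        self.state = "committed"

    def abort(self, txn):
        self.state = "aborted"


def two_phase_commit(participants, txn):
    # Phase 1 (prepare): every node must vote yes.
    votes = [p.prepare(txn) for p in participants]
    # The coordinator decides exactly once, based on all the votes.
    decision = "commit" if all(votes) else "abort"
    # Phase 2 (commit/abort): everyone follows the single decision.
    for p in participants:
        p.commit(txn) if decision == "commit" else p.abort(txn)
    return decision


nodes = [Participant("a"), Participant("b", healthy=False)]
print(two_phase_commit(nodes, txn=1))  # -&gt; "abort": one no-vote aborts all
</code></pre>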
<h2 id="heading-challenges">Challenges</h2>
<ul>
<li><p>Performance bottlenecks</p>
</li>
<li><p>Complexity of coordinating across systems</p>
</li>
<li><p>Potential for system-wide blocking if coordinator fails</p>
</li>
</ul>
<h1 id="heading-consensus">Consensus</h1>
<h2 id="heading-definition-1">Definition</h2>
<p>Getting multiple nodes to agree on a value with specific properties:</p>
<ul>
<li><p>Uniform agreement</p>
</li>
<li><p>Integrity</p>
</li>
<li><p>Validity</p>
</li>
<li><p>Termination</p>
</li>
</ul>
<h2 id="heading-key-characteristics-1">Key Characteristics</h2>
<ul>
<li><p>Requires majority of nodes to be functioning</p>
</li>
<li><p>Uses epoch numbering for leader selection</p>
</li>
<li><p>Implements total order broadcast</p>
</li>
</ul>
<h2 id="heading-limitations-1">Limitations</h2>
<ul>
<li><p>Requires strict majority of nodes</p>
</li>
<li><p>Difficult to add/remove nodes dynamically</p>
</li>
<li><p>Relies on timeout mechanisms</p>
</li>
<li><p>Performance can be impacted by frequent leader elections</p>
</li>
</ul>
<h1 id="heading-coordination-services-eg-zookeeper-etcd">Coordination Services (e.g., Zookeeper, etcd)</h1>
<h2 id="heading-features">Features</h2>
<ul>
<li><p>Linearizable atomic operations</p>
</li>
<li><p>Total ordering of operations</p>
</li>
<li><p>Failure detection</p>
</li>
<li><p>Change notifications</p>
</li>
<li><p>Service discovery</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[DDIA - Chapter 8 - The trouble with Distributed Systems - thoughts and notes]]></title><description><![CDATA[Disclaimer: This blog has been written with the help of LLM.
This chapter was pessimistic in its approach, but that’s alright, one has to be paranoid when they are building systems. When you are a small company and shipping fast you don’t think about...]]></description><link>https://tech.peculiarvivek.com/ddia-chapter-8-the-trouble-with-distributed-systems-thoughts-and-notes</link><guid isPermaLink="true">https://tech.peculiarvivek.com/ddia-chapter-8-the-trouble-with-distributed-systems-thoughts-and-notes</guid><category><![CDATA[Databases]]></category><category><![CDATA[System Design]]></category><category><![CDATA[distributed systems]]></category><category><![CDATA[networking]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sun, 13 Oct 2024 11:13:09 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1728817935314/9be1819f-e278-4bb9-890f-ad8600bcc2e0.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Disclaimer: This blog has been written with the</em> <strong><em>help</em></strong> <em>of LLM.</em></p>
<p>This chapter was pessimistic in its approach, but that’s alright - one has to be paranoid when building systems. When you are a small company shipping fast, you don’t think about negative flows - like what will happen if things fail. You only focus on making the positive flows work.</p>
<p>When the company becomes a behemoth data guzzler, you have to be mindful and deliver on the promises made in the contract to your customers.</p>
<p>There are so many ways in which a distributed system can fail. So many. One cannot prevent some things from happening, but we can have fail-safes in place.</p>
<p>The fencing token idea was good, though - I learnt a new approach.</p>
<h3 id="heading-fails-and-partial-fails">Fails and Partial Fails</h3>
<ol>
<li><p>Distributed systems have more potential failures compared to single computers.</p>
</li>
<li><p>Single computers are deterministic; retrying an operation yields the same result, and failures are total rather than partial.</p>
</li>
<li><p>In single systems, failures often result from software issues, and the system prefers to crash over returning incorrect results.</p>
</li>
<li><p>Distributed systems consist of multiple connected computers, making them fundamentally different from single systems.</p>
</li>
<li><p>The focus is on commodity computers connected via a network, not supercomputers.</p>
</li>
<li><p>Distributed systems must handle both partial and complete failures, necessitating fault-tolerance mechanisms.</p>
</li>
<li><p>Fault handling is essential in the design of distributed systems, requiring a proactive approach.</p>
</li>
</ol>
<h3 id="heading-unreliable-networks">Unreliable Networks</h3>
<ol>
<li><p>Distributed systems are "shared-nothing" computers communicating over a network.</p>
</li>
<li><p>Relying on networks introduces various risks, such as message loss, damage, rejection, or queuing delays.</p>
</li>
<li><p>Responses can also be lost, delayed, or affected by network misconfigurations or overloads.</p>
</li>
<li><p>The reason for a lack of response is often unknown, making failure detection difficult.</p>
</li>
<li><p>Timeouts are a common strategy to handle unresponsive requests, though they don't confirm message delivery.</p>
</li>
<li><p>Network failures are inevitable at all levels, from switches to data centers.</p>
</li>
<li><p>Systems must either handle or tolerate network problems; handling involves precautions, while tolerating may involve user error messages.</p>
</li>
<li><p>Regular testing is needed to assess how systems manage network issues and when to notify users.</p>
</li>
</ol>
<p><strong>Detecting Faults</strong></p>
<ol>
<li><p>Fault detection is common in many systems, but the exact cause is unknown without a successful response.</p>
</li>
<li><p>In distributed systems, faults can be detected at various levels, like load balancers or leader nodes.</p>
</li>
<li><p>Failures can occur anywhere in the network, including switches, routers, applications, and machines.</p>
</li>
<li><p>Some faults are undetectable, leading to the use of timeouts to declare a node as down.</p>
</li>
<li><p>Retries can be attempted, but eventually, a node may need to be declared dead.</p>
</li>
</ol>
<p><strong>Timeouts and Unbounded Delay</strong></p>
<ol>
<li><p>Setting effective timeout values is challenging:</p>
<ul>
<li><p>Too short: May falsely declare a node as dead during temporary slowdowns.</p>
</li>
<li><p>Too long: Prolongs waiting for genuinely dead nodes, worsening user experience.</p>
</li>
</ul>
</li>
<li><p>Using retries with short timeouts can cause problems, like:</p>
<ul>
<li><p>Executing the same operation twice if redirected while the original request is still queued.</p>
</li>
<li><p>Triggering cascading failures if the new node is already overloaded.</p>
</li>
</ul>
</li>
<li><p>Network delays are unpredictable; asynchronous networks have no upper limit on delay times.</p>
</li>
<li><p>Network switches can cause congestion by queuing messages on the same link, increasing wait times.</p>
</li>
<li><p>Protocols like TCP add delays due to retransmissions after packet loss, with inconsistent timeout values.</p>
</li>
<li><p>Systems like TCP and Cassandra use dynamic timeouts based on past network performance.</p>
</li>
<li><p>Defining fixed timeouts is nearly impossible; understanding network behavior and adjusting timeouts dynamically is crucial.</p>
</li>
<li><p>It’s important to know the differences between synchronous and asynchronous networks for better timeout estimation.</p>
</li>
</ol>
<p><strong>Synchronous versus Asynchronous Networks</strong></p>
<ol>
<li><p><strong>Phone calls are synchronous</strong>: They allocate fixed bandwidth (circuit) end-to-end, ensuring no queuing and a fixed network latency (bounded delay).</p>
</li>
<li><p><strong>Synchronous bandwidth allocation</strong>: For phone calls, if 10,000 connections are available, exactly 10,000 concurrent calls can be made without waiting, regardless of network routing.</p>
</li>
<li><p><strong>TCP connections differ</strong>: Unlike phone calls, bandwidth needs are unknown when establishing a connection, making fixed allocation impractical.</p>
</li>
<li><p><strong>Bandwidth challenges in TCP</strong>: Fixed bandwidth per connection (e.g., 100kbps) would lead to long waits for large data transfers and wasted resources if other slots remain empty.</p>
</li>
<li><p><strong>Queuing in TCP</strong>: To optimize for speed, TCP dynamically adjusts data transfer rates based on available network capacity, allowing connections to queue for bandwidth.</p>
</li>
<li><p><strong>Unbounded delays</strong>: Network setups with queuing have unpredictable delays, making it impossible to guarantee specific delay times or reliability.</p>
</li>
<li><p><strong>Timeout setting challenges</strong>: Due to variable network conditions, correct timeout values can only be determined through experimentation.</p>
</li>
<li><p><strong>Experimentation requires clock knowledge</strong>: Understanding network timing and delays involves learning how clocks work.</p>
</li>
</ol>
<h3 id="heading-unreliable-clocks">Unreliable Clocks</h3>
<ol>
<li><p><strong>Two types of clocks</strong>: Monotonic and time-of-day clocks serve different purposes (a small demo follows this list).</p>
</li>
<li><p><strong>Time-of-day clocks</strong>:</p>
<ul>
<li><p>Based on an epoch (e.g., UTC Jan 1, 1970) without accounting for leap seconds.</p>
</li>
<li><p>Synchronized using Network Time Protocol (NTP) to maintain consistency across machines.</p>
</li>
<li><p>Can jump forward or backward during synchronization, making them unsuitable for measuring elapsed time (e.g., timeouts).</p>
</li>
</ul>
</li>
<li><p><strong>Monotonic clocks</strong>:</p>
<ul>
<li><p>Continuously increase, like a counter tracking nanoseconds.</p>
</li>
<li><p>Do not track the actual time of day; they measure elapsed time accurately without jumps.</p>
</li>
<li><p>Each computer or CPU has its own monotonic clock, typically not synced between machines.</p>
</li>
<li><p>NTP can still adjust the speed of monotonic clocks (faster or slower), but cannot cause jumps, ensuring consistency in elapsed time measurement.</p>
</li>
</ul>
</li>
</ol>
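<p>Python exposes both kinds of clock directly, which makes the difference easy to demo (a small sketch):</p>
<pre><code class="lang-python">import time

# Time-of-day clock: NTP can jump it forwards or backwards, so a delta
# between two calls is NOT safe for timeouts or elapsed-time measurement.
wall_start = time.time()

# Monotonic clock: never jumps, only moves forward - the right tool
# for measuring elapsed time.
mono_start = time.monotonic()

sum(range(1_000_000))  # some work

print(f"wall-clock delta: {time.time() - wall_start:.4f}s (can be wrong)")
print(f"monotonic delta:  {time.monotonic() - mono_start:.4f}s (trustworthy)")
</code></pre>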
<p><strong>Relying on Synchronous Clocks</strong></p>
<ol>
<li><p><strong>Clocks have reliability issues</strong>:</p>
<ul>
<li><p>Quartz clocks experience delays and skew.</p>
</li>
<li><p>NTP synchronization helps but doesn't fully resolve timing inaccuracies due to network delays.</p>
</li>
</ul>
</li>
<li><p><strong>Time-of-day clock challenges</strong>:</p>
<ul>
<li><p>Systems must account for clock faults and monitor offsets between machines to prevent large discrepancies.</p>
</li>
<li><p>Using timestamps for decision-making can be problematic due to inevitable offsets.</p>
</li>
</ul>
</li>
<li><p><strong>Decision-making issues with timestamps</strong>:</p>
<ul>
<li><p>Clocks may not be accurate enough to determine the order of operations, especially for rapid or concurrent writes.</p>
</li>
<li><p>Identical timestamps may occur if the clock resolution is limited (e.g., milliseconds).</p>
</li>
<li><p>Conflict resolution strategies like Last Write Wins (LWW) can fail due to clock skew, leading to data loss.</p>
</li>
</ul>
</li>
<li><p><strong>Improving conflict resolution</strong>:</p>
<ul>
<li><p>Additional tiebreakers or version vectors can help resolve situations where timestamps are insufficient.</p>
</li>
<li><p>Google’s True Time API provides a confidence interval (earliest and latest time), increasing accuracy in determining operation order.</p>
</li>
</ul>
</li>
<li><p><strong>Snapshot Isolation and Repeatable Read issues</strong>:</p>
<ul>
<li><p>Determining whether a read occurred after a write is challenging in distributed systems.</p>
</li>
<li><p>Google’s True Time-based transaction ID generation doesn’t fully solve these issues with ordering transactions.</p>
</li>
</ul>
</li>
</ol>
<p><strong>Process Pauses</strong></p>
<ol>
<li><p><strong>Leader Node Lease Mechanism</strong>:</p>
<ul>
<li><p>A leader node obtains a lease from other nodes, similar to a lock with a timeout.</p>
</li>
<li><p>It must periodically renew the lease to maintain its leadership; if it fails, another node can take over.</p>
</li>
</ul>
</li>
<li><p><strong>Challenges with Timing Assumptions</strong>:</p>
<ul>
<li><p>Various reasons can cause process pauses, including:</p>
<ul>
<li><p>Garbage collection</p>
</li>
<li><p>Virtual machine suspensions</p>
</li>
<li><p>System sleep</p>
</li>
<li><p>Context-switching</p>
</li>
<li><p>Synchronous disk access</p>
</li>
<li><p>Paging</p>
</li>
<li><p>Signals like SIGSTOP</p>
</li>
</ul>
</li>
<li><p>Timing cannot be reliably assumed in distributed systems.</p>
</li>
</ul>
</li>
<li><p><strong>Real-Time Response Requirements</strong>:</p>
<ul>
<li><p>Some systems require guaranteed responses before specific deadlines (e.g., real-time operating systems).</p>
</li>
<li><p>These systems need documentation of worst-case execution times and may restrict dynamic memory allocation.</p>
</li>
</ul>
</li>
<li><p><strong>Handling Garbage Collection</strong>:</p>
<ul>
<li><p>Garbage collection can be treated as planned outages; runtime warnings allow applications to stop sending new requests during GC.</p>
</li>
<li><p>Alternatively, GC can be limited to short-lived objects, with periodic process restarts to manage memory.</p>
</li>
</ul>
</li>
</ol>
<h3 id="heading-knowledge-truth-and-lies">Knowledge, Truth and Lies</h3>
<ol>
<li><p><strong>Challenges in Building Reliable Systems</strong>:</p>
<ul>
<li><p>Numerous potential failures in distributed systems make it difficult to identify specific problems.</p>
</li>
<li><p>Assumptions must be made to create systems based on certain truths and knowledge.</p>
</li>
</ul>
</li>
<li><p><strong>Truth Defined by Majority (Quorum)</strong>:</p>
<ul>
<li><p>Relying on a single node for decisions creates a single point of failure.</p>
</li>
<li><p>A quorum, or majority decision, must be used to prevent incorrect assumptions by isolated nodes.</p>
</li>
</ul>
</li>
<li><p><strong>Lock Management Example</strong>:</p>
<ul>
<li><p>A node holding a lock may expire while garbage collection is paused, leading to data corruption.</p>
</li>
<li><p>Fencing tokens (monotonically increasing numbers) can be used to ensure only the most recent lock holder can write to storage - see the toy sketch after these notes.</p>
</li>
</ul>
</li>
<li><p><strong>Byzantine Faults</strong>:</p>
<ul>
<li><p>Byzantine faults occur when nodes provide false information or fail to act correctly.</p>
</li>
<li><p>These are particularly relevant in environments where nodes may be compromised, but most systems operate in closed networks.</p>
</li>
<li><p>Input validation can mitigate risks, such as SQL injection or denial of service attacks.</p>
</li>
</ul>
</li>
<li><p><strong>System Models and Expectations</strong>:</p>
<ul>
<li><p>Acknowledgment of various problems leads to the creation of expectations and assumptions about system behavior.</p>
</li>
<li><p>System models (synchronous, partially synchronous, asynchronous) help define these expectations regarding timing and fault types (crash-stop, crash-recovery, Byzantine).</p>
</li>
<li><p>Algorithm correctness is defined through safety (e.g., uniqueness of tokens) and liveness properties (e.g., eventual response).</p>
</li>
<li><p>Safety properties are expected to always hold, while liveness properties may have exceptions.</p>
</li>
</ul>
</li>
</ol>
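<p>Since the fencing token idea was the highlight for me, here is a toy version of it (an in-memory sketch; real storage would persist the highest token it has seen):</p>
<pre><code class="lang-python">class FencedStorage:
    """Rejects writes whose fencing token is older than one already seen."""

    def __init__(self):
        self.highest_token = -1
        self.data = {}

    def write(self, token, key, value):
        if token &lt; self.highest_token:
            # A newer lock holder has already written: this writer's lease
            # must have expired (e.g. during a GC pause), so reject it.
            raise PermissionError(f"stale fencing token {token}")
        self.highest_token = token
        self.data[key] = value


storage = FencedStorage()
storage.write(33, "file", "v1")  # old lock holder writes first
storage.write(34, "file", "v2")  # new holder (higher token) succeeds

try:
    storage.write(33, "file", "v3")  # paused old holder wakes up and retries
except PermissionError as exc:
    print(exc)  # stale fencing token 33
</code></pre>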
<p><em>Fin.</em></p>
]]></content:encoded></item><item><title><![CDATA[DDIA - Chapter 7 - Transactions - thoughts and notes]]></title><description><![CDATA[Disclaimer: This blog has been partially written by the help of LLM.
The more I read DDIA the more my trust on databases is vanishing. :’)
This chapter on transactions was not that interesting, honestly. Most of it was just how the keywords in ACID a...]]></description><link>https://tech.peculiarvivek.com/ddia-chapter-7-transactions-thoughts-and-notes</link><guid isPermaLink="true">https://tech.peculiarvivek.com/ddia-chapter-7-transactions-thoughts-and-notes</guid><category><![CDATA[Databases]]></category><category><![CDATA[transactions]]></category><category><![CDATA[Accessibility]]></category><category><![CDATA[ACID Transactions]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sun, 29 Sep 2024 07:40:27 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1727595573781/cf158936-6b59-4baa-9b5e-4c1faeb539d2.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Disclaimer: This blog has been partially written by the help of LLM.</strong></p>
<p>The more I read DDIA, the more my trust in databases vanishes. :’)</p>
<p>This chapter on transactions was not that interesting, honestly. Most of it was about how the keywords in ACID are mostly for show and mean different things in different implementations.</p>
<p>A lot of the chapter was just talking about how things need to be atomic. Trade-offs, trade-offs, trade-offs. The next chapter will be mind-boggling, because this chapter focussed on transactions on a single node. Imagine if things are distributed; god save us then.</p>
<h3 id="heading-transactions">Transactions</h3>
<p>Put multiple commands in a single block, and if any command fails, the whole block gets aborted. Easy, right? Right. We will see that things are not what they seem.</p>
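<p>To see that block-of-commands behaviour in action, here is a minimal runnable sketch of my own using Python's built-in <code>sqlite3</code> module: two writes that must succeed or fail together.</p>
<pre><code class="lang-python">import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

try:
    with conn:  # opens a transaction: commits on success, rolls back on any exception
        conn.execute("UPDATE accounts SET balance = balance - 40 WHERE name = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 40 WHERE name = 'bob'")
        raise RuntimeError("simulated crash before commit")
except RuntimeError:
    pass

# Both updates were rolled back together; alice still has her 100.
print(conn.execute("SELECT balance FROM accounts WHERE name = 'alice'").fetchone())  # (100,)
</code></pre>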
<p><strong>ACID</strong></p>
<ul>
<li><p><strong>Atomicity</strong>: A transaction must either complete fully or not at all. If any part fails, the entire transaction is rolled back, leaving the database unchanged.</p>
</li>
<li><p><strong>Consistency</strong>: Ensures that the database moves from one valid state to another, but this is primarily the application's responsibility, not the database's. The application uses atomicity and isolation to maintain consistency.</p>
</li>
<li><p><strong>Isolation</strong>: Transactions are executed independently without interference, appearing as if they were run sequentially, even if executed in parallel. Full isolation is rare due to performance costs.</p>
</li>
<li><p><strong>Durability</strong>: Once a transaction is committed, the changes are permanent, even in the event of system failures. This is often ensured through mechanisms like write-ahead logging.</p>
</li>
</ul>
<p><strong>Single-Object Writes</strong>: Atomicity and isolation for single-object writes are simple, using locks for isolation and logs for crash recovery. Operations like <code>INCREMENT</code> are not full transactions but offer ACID-like guarantees.</p>
<p><strong>Multi-Object Transactions</strong>: Multi-object updates (e.g., foreign keys, denormalized data, secondary indexes) often require transactions to ensure consistency. Without them, error handling becomes complex.</p>
<p><strong>Error Handling</strong>: Transactions allow abort and retry to prevent incomplete operations. However, systems like leaderless replication use a "best-effort" model, leaving error recovery to the application.</p>
<p><strong>Challenges</strong>:</p>
<ul>
<li><p>Network failures may mislead users about transaction status.</p>
</li>
<li><p>Retrying during overload can worsen issues.</p>
</li>
<li><p>Retries are useless for permanent errors.</p>
</li>
<li><p>Side effects may still occur after an aborted transaction.</p>
</li>
</ul>
<h3 id="heading-weak-isolation-levels">Weak Isolation Levels</h3>
<p>Concurrency bugs are hard to detect and reproduce. Databases use transaction isolation to address them, but full serializability is costly. To balance performance, weaker isolation levels are often used, though they come with trade-offs and potential subtle bugs. Understanding these trade-offs is essential.</p>
<p><strong>Read Committed Isolation</strong>:</p>
<ul>
<li><p><strong>No dirty reads</strong>: A transaction only sees data committed by other transactions, preventing confusion from partially updated or aborted transactions.</p>
</li>
<li><p><strong>No dirty writes</strong>: Prevents a transaction from overwriting uncommitted data by delaying the second write until the first transaction is complete.</p>
</li>
</ul>
<p><strong>Implementation</strong>:</p>
<ul>
<li><p><strong>No dirty writes</strong>: Achieved using row-level locks that hold until the transaction is committed or aborted.</p>
</li>
<li><p><strong>No dirty reads</strong>: The database provides the old value for reads while a transaction is in progress and switches to the new value after the transaction commits.</p>
</li>
</ul>
<p><strong>Snapshot Isolation and Repeatable Read</strong>: Even with these isolation levels, concurrency bugs like nonrepeatable reads (or read skew) can occur, where reading data during concurrent commits results in inconsistent data.</p>
<p><strong>Situations where inconsistencies are critical</strong>:</p>
<ul>
<li><p><strong>Backups</strong>: If writes happen during a backup, restoring from it may cause permanent inconsistencies.</p>
</li>
<li><p><strong>Analytics and integrity checks</strong>: Queries could return nonsensical results if they access data at different points in time.</p>
</li>
</ul>
<p><strong>Snapshot Isolation</strong>: Provides a consistent view of the database for each transaction by reading from a snapshot, preventing nonrepeatable reads. It uses <strong>write locks</strong> to prevent dirty writes and relies on <strong>multi-version concurrency control (MVCC)</strong> to maintain multiple versions of data.</p>
<ul>
<li><p><strong>Read Committed</strong>: Uses a separate snapshot for each query.</p>
</li>
<li><p><strong>Snapshot Isolation</strong>: Uses the same snapshot for the entire transaction.</p>
</li>
</ul>
<p>In some systems, <strong>snapshot isolation</strong> is called <strong>serialisable</strong> (Oracle) or <strong>repeatable read</strong> (PostgreSQL, MySQL).</p>
<p><strong>Preventing Lost Updates</strong>: Lost updates occur when two concurrent transactions modify the same data, leading to one update being overwritten. This can happen during operations like incrementing values. Strategies to prevent lost updates include:</p>
<ul>
<li><p><strong>Atomic write operations</strong>: Some databases support atomic operations (like incrementing a counter) that happen without a read-modify-write cycle, using exclusive locks.</p>
</li>
<li><p><strong>Explicit locking</strong>: Transactions explicitly lock the data, preventing access until the first transaction completes its update.</p>
</li>
<li><p><strong>Automatic detection</strong>: Allows transactions to run in parallel, but the system detects lost updates and forces a retry.</p>
</li>
<li><p><strong>Compare-and-set</strong>: Ensures data is modified only if it hasn’t changed since it was last read, restarting the cycle if changes are detected (sketched after this list).</p>
</li>
<li><p><strong>Conflict resolution in replication</strong>: In replicated databases, conflicts are resolved using application logic or special data structures. Commutative atomic operations are helpful in these cases.</p>
</li>
<li><p><strong>Last-Write-Wins (LWW)</strong>: Common in many databases, but it does not prevent lost updates, as the last update overwrites previous ones.</p>
</li>
</ul>
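<p>Compare-and-set is easy to hand-roll with a conditional <code>UPDATE</code>. A minimal sketch of my own using <code>sqlite3</code> (the table and values are made up for illustration):</p>
<pre><code class="lang-python">import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (id INTEGER PRIMARY KEY, content TEXT)")
conn.execute("INSERT INTO page VALUES (1, 'draft v1')")
conn.commit()

def update_if_unchanged(conn, expected: str, new: str) -&gt; bool:
    # Only write if the value is still what we last read.
    cur = conn.execute(
        "UPDATE page SET content = ? WHERE id = 1 AND content = ?",
        (new, expected),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows touched means someone got there first

print(update_if_unchanged(conn, "draft v1", "draft v2"))  # True
print(update_if_unchanged(conn, "draft v1", "draft v3"))  # False: stale read, retry the cycle
</code></pre>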
<p><strong>Write Skews and Phantoms</strong>:</p>
<p>Write skew occurs when two concurrent transactions read the same data and then each updates different objects based on what they read, leaving the data in an inconsistent state. Unlike a lost update, the two writes touch different objects, which makes the anomaly harder to detect. For example, two on-call doctors could each check the on-call list, see that the other is still on call, and both go off call, leaving no one on call.</p>
<p>Phantoms occur when a transaction's write affects the result of a search query in another transaction (e.g., a doctor going off-call changes the count of available doctors). Solving this requires locking rows returned by the query, but this doesn't work for rows that don’t exist (e.g., when no doctors are on-call).</p>
<p><strong>Solution</strong>:</p>
<ul>
<li><strong>Materialising conflicts</strong>: Create rows for potential scenarios (e.g., pre-create empty calendar slots) and lock them. However, this approach is complex and impacts the data model, so it's typically a last resort.</li>
</ul>
<h3 id="heading-serializability"><strong>Serializability</strong></h3>
<p>The serializable isolation level ensures transactions produce results as if they were executed one after another, preventing race conditions even when run in parallel.</p>
<p><strong>Actual Serial Execution</strong>:</p>
<ul>
<li><p>Executing transactions serially is re-emerging as an option because cheap, high-capacity RAM lets the whole active dataset live in memory, removing the disk I/O that used to make serial execution slow.</p>
</li>
<li><p>Online Transaction Processing (OLTP) transactions are short and touch only a small amount of data, making serial execution viable.</p>
</li>
</ul>
<p><strong>Stored Procedures</strong>:</p>
<ul>
<li><p>Applications often require multi-step transactions that involve back-and-forth communication with the database.</p>
</li>
<li><p>For example, in a doctor's on-call scenario, conditional logic can be moved into a stored procedure to reduce network overhead, allowing faster execution without blocking other transactions.</p>
</li>
<li><p><strong>Cons</strong>:</p>
<ul>
<li><p>Each database vendor has its own procedural language, often outdated.</p>
</li>
<li><p>Limited libraries and tools lead to a poor developer experience.</p>
</li>
<li><p>Poorly written procedures can degrade overall system performance.</p>
</li>
</ul>
</li>
</ul>
<p><strong>Partitioning</strong>:</p>
<ul>
<li><p>Single-threaded execution can still scale: partition the data and give each partition its own CPU core, so transactions in different partitions run in parallel.</p>
</li>
<li><p>While transactions across multiple partitions require locks, effective partitioning strategies can minimise performance issues.</p>
</li>
</ul>
<p><strong>Two-Phase Locking (2PL)</strong>: In 2PL, writers block both readers and other writers, and readers block writers, whereas snapshot isolation lets readers and writers operate without blocking each other.</p>
<p><strong>Implementation of Two-Phase Locking</strong>:</p>
<ul>
<li><p>Readers acquire locks in shared mode, while writers obtain locks in exclusive mode (a toy in-process sketch follows this list).</p>
</li>
<li><p>A shared lock prevents writers from acquiring an exclusive lock.</p>
</li>
<li><p>Locks are held until the transaction commits or aborts; hence "two-phase": acquire locks while executing, release them all at the end.</p>
</li>
<li><p>The database detects deadlocks and the application retries any aborted transactions.</p>
</li>
</ul>
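<p>As a toy, in-process analogue of shared and exclusive modes, here is a minimal reader-writer lock of my own (real databases lock individual rows and ranges, detect deadlocks, and do much more):</p>
<pre><code class="lang-python">import threading

class SharedExclusiveLock:
    """Toy reader-writer lock: shared mode for readers, exclusive for writers."""
    def __init__(self):
        self._mutex = threading.Lock()
        self._no_readers = threading.Condition(self._mutex)
        self._readers = 0

    def acquire_shared(self):
        with self._mutex:            # blocks while a writer holds exclusive mode
            self._readers += 1

    def release_shared(self):
        with self._mutex:
            self._readers -= 1
            if self._readers == 0:
                self._no_readers.notify_all()

    def acquire_exclusive(self):
        self._mutex.acquire()        # keeps out other writers and new readers
        while self._readers:
            self._no_readers.wait()  # releases the mutex while waiting for readers to drain

    def release_exclusive(self):
        self._mutex.release()
</code></pre>
<p>Note the asymmetry 2PL cares about: a single reader holding shared mode is enough to make <code>acquire_exclusive</code> wait, which is part of why 2PL latencies get unstable under contention.</p>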
<p><strong>Performance of Two-Phase Locking</strong>:</p>
<ul>
<li><p>There is overhead associated with acquiring and releasing locks.</p>
</li>
<li><p>Concurrency is reduced, leading to unstable latencies, especially at high percentiles.</p>
</li>
<li><p>Aborted transactions waste resources.</p>
</li>
<li><p>Deadlocks occur more frequently in 2PL compared to lock-based read committed methods.</p>
</li>
</ul>
<h3 id="heading-serialisable-snapshot-isolation-ssi"><strong>Serialisable Snapshot Isolation (SSI)</strong></h3>
<p>SSI aims to achieve full serialisability with minimal performance overhead compared to snapshot isolation. It is used in single-node systems (like PostgreSQL 9.1) and distributed environments, potentially becoming the new default.</p>
<p><strong>Pessimistic vs. Optimistic Concurrency Control</strong>:</p>
<ul>
<li><p><strong>2PL</strong>: Pessimistic, as it locks resources upfront.</p>
</li>
<li><p><strong>SSI</strong>: Optimistic, where the database checks for isolation violations only upon transaction commit.</p>
</li>
</ul>
<p><strong>Performance Considerations</strong>:</p>
<ul>
<li><p>Optimistic techniques perform better when there's spare capacity and low contention between transactions.</p>
</li>
<li><p>SSI builds on snapshot isolation by adding an algorithm to detect serialization conflicts among writes and determine which transactions to abort.</p>
</li>
</ul>
<p><strong>Detecting Stale MVCC Reads</strong>:</p>
<ul>
<li><p>The database tracks when transactions ignore writes based on MVCC visibility rules.</p>
</li>
<li><p>It checks if any ignored writes have been committed and aborts if they have.</p>
</li>
<li><p>The abort decision waits until commit time: by then the conflicting write may itself have aborted, and a transaction that turned out to be read-only cannot cause write skew at all.</p>
</li>
</ul>
<p><strong>Detecting Writes Affecting Prior Reads</strong>:</p>
<ul>
<li><p>The database maintains ephemeral information about transactions reading data at the index or table level.</p>
</li>
<li><p>When a transaction writes, it notifies existing readers of the data.</p>
</li>
<li><p>When the reader commits, it checks whether any of the conflicting writes it was notified about have committed; if so, it aborts.</p>
</li>
</ul>
<p><strong>Performance of Serialisable Snapshot Isolation</strong>:</p>
<ul>
<li><p>The rate of aborts significantly impacts SSI's overall performance.</p>
</li>
<li><p>SSI is best suited for short read-write transactions and is less sensitive to slow transactions compared to two-phase locking or serial execution.</p>
</li>
</ul>
<p>If you reached here then thanks!</p>
]]></content:encoded></item><item><title><![CDATA[DDIA - Chapter 6 - Partitioning - thoughts and notes]]></title><description><![CDATA[Cool, so how do you replicate huuuuuge databases? Well, you don’t. You partition them and then replicate. So we are just adding additional complexity to an already complex system? Oh, Yes. It will be fun.
Partitioning is important for efficiently sca...]]></description><link>https://tech.peculiarvivek.com/ddia-chapter-6-partitioning-thoughts-and-notes</link><guid isPermaLink="true">https://tech.peculiarvivek.com/ddia-chapter-6-partitioning-thoughts-and-notes</guid><category><![CDATA[Databases]]></category><category><![CDATA[Database Partitioning]]></category><category><![CDATA[strategies]]></category><category><![CDATA[Indexes]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sun, 15 Sep 2024 07:21:51 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1726384844388/d4e6a6a6-f47d-42fd-ac29-efe841d614d4.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Cool, so how do you replicate huuuuuge databases? Well, you don’t. You partition them and then replicate. So we are just adding additional complexity to an already complex system? Oh, Yes. It will be fun.</p>
<p>Partitioning is important for scaling systems efficiently: it distributes load across multiple nodes, and replicating each partition keeps the system fault tolerant.</p>
<h3 id="heading-partitioning-of-key-value-databases"><strong>Partitioning of Key-Value databases</strong></h3>
<p>So you partition your database across 25 nodes; it should be able to handle 25 times the load, right? Nah, the world isn’t fair, and neither is partitioning.</p>
<p>When data is distributed unevenly across partitions, the partitioning is called <em>skewed</em>. A partition that receives disproportionately high load compared to the others is called a <em>hot spot</em>.</p>
<p>The goal of partitioning must be to distribute data and query load equally onto multiple nodes.</p>
<p>Let’s say you are partitioning a simple kv datastore. You can do it in two ways:</p>
<ol>
<li><p><strong>Key range partitioning</strong>: We use key ranges to define boundaries for partitions, similar to how encyclopaedias are organised. The keys are sorted and assigned to specific partitions, making range queries efficient since not all partitions need to be queried. However, this approach can cause hot spots, where certain access points are overused, leading to an imbalanced database. To handle this, partition ranges must be adjusted based on the data, and key selection needs to be strategic. Predicting which keys will create hot spots, though, can be challenging.</p>
</li>
<li><p><strong>Key Hash partitioning</strong>: To prevent hot spots in key-range partitioning, hash functions can be used to distribute keys. Even similar keys will have different hashes, provided a good hash function is used. Hash-based partitioning ensures even distribution of keys, but sacrifices the efficiency of range queries, as all partitions must be queried.</p>
<p> A combined approach uses a compound primary key: the first part is hashed to determine the partition, and the second part is kept sorted within it, enabling range queries on that second part (e.g., <code>[user_id, timestamp]</code>); a minimal sketch follows this list.</p>
<p> However, hash keys don't fully eliminate skew. In cases like social media accounts with millions of followers, hot spots may still occur, requiring custom solutions from developers.</p>
</li>
</ol>
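<p>Here is a minimal sketch of the hashed-compound-key idea (the hash function and partition count are arbitrary choices of mine, not what any specific database does):</p>
<pre><code class="lang-python">import hashlib

NUM_PARTITIONS = 8

def partition_for(user_id: str) -&gt; int:
    # Hash only the first part of the compound key (user_id, timestamp):
    # all of a user's rows land in one partition, where they stay sorted
    # by timestamp, so per-user range queries never fan out.
    digest = hashlib.md5(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

print(partition_for("alice"), partition_for("alicf"))  # similar keys, likely different partitions
</code></pre>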
<h3 id="heading-partitioning-and-secondary-indexes">Partitioning and Secondary Indexes</h3>
<p>Secondary indexes don’t map neatly onto partitions; since they don’t uniquely identify rows, handling them takes some extra steps. There are two ways to handle secondary indexes in partitions:</p>
<ol>
<li><p><strong>By Document (Local Index):</strong><br /> Secondary indexes are stored within the same partition as the primary index, so each partition only covers its own documents. For example, if searching for red Fender guitars, you must query all partitions.</p>
<p> While this method (scatter/gather) slows down searches by secondary index, it makes write operations fast since both primary and secondary indexes are updated within the same partition.</p>
</li>
<li><p><strong>By Term (Global Index)</strong></p>
<p> In a global index strategy, secondary indexes are stored in separate partitions based on the indexed terms, not alongside the primary data. For example, if you're searching for <strong>red Fender guitars</strong>, the partition containing the index for <strong>color</strong> will include references (like primary keys) to all matching items, regardless of the partition the actual data resides in. This allows the search to be directed to specific partitions rather than querying all of them.</p>
<p> The trade-off: global indexes make writes slower and more complex, since a single write may need to update index partitions on several different nodes.</p>
</li>
</ol>
<h3 id="heading-rebalancing-the-partitions">Rebalancing the Partitions</h3>
<p>Now you have multiple partitions, but as data and query load grow, you will need to move partitions between nodes to keep the load balanced. How do you approach that?</p>
<ol>
<li><p><strong>Hash Mod N</strong>:<br /> Data is assigned to partitions using a mod function. However, this method is inefficient when adding new nodes, as most of the data gets reshuffled, leading to unnecessary data movement (the snippet after this list shows how bad it is). This strategy is generally discouraged.</p>
</li>
<li><p><strong>Fixed Number of Partitions</strong>:<br /> Multiple partitions are assigned to each node, with a fixed total number of partitions across the cluster. When a new node is added, it takes over some partitions from existing nodes. This limits data movement to just partition reassignments. However, it's challenging to predict the right number of partitions initially, as future data growth can vary.</p>
</li>
<li><p><strong>Dynamic Partitioning</strong>:<br /> As data grows, partitions can be split dynamically. Nodes handle multiple partitions, and new partitions are created or deleted as needed. This approach adapts to data changes but requires starting with a reasonable number of partitions. Pre-splitting can help if key distribution is predictable.</p>
</li>
<li><p><strong>Partitioning Proportional to Nodes</strong>:<br /> A fixed number of partitions is assigned per node. When a new node is added, it randomly splits and takes over half of existing partitions. Hash-based partitioning is used for randomness, though this can lead to imbalanced splits. Advanced algorithms can mitigate unfair partition distribution.</p>
</li>
</ol>
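<p>You can see why hash mod N is discouraged with a few lines of Python (a toy measurement of my own, assuming MD5 as the hash):</p>
<pre><code class="lang-python">import hashlib

def node_for(key: str, n: int) -&gt; int:
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    return h % n

keys = [f"key-{i}" for i in range(10_000)]
moved = sum(node_for(k, 4) != node_for(k, 5) for k in keys)
print(f"{moved / len(keys):.0%} of keys move when going from 4 to 5 nodes")  # ~80%
</code></pre>
<p>A key stays put only when <code>h % 4 == h % 5</code>, which for a uniform hash happens only about 20% of the time, so roughly 80% of the data reshuffles for a single added node. Fixed or dynamic partitioning moves whole partitions instead.</p>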
<p>Teams often prefer manual rebalancing over fully automatic rebalancing to avoid operational surprises: automatic rebalancing can kick in at a bad time, create inefficient data placements, and overload nodes that are already struggling. A human with context of the system and business logic can prevent such mishaps, hence manual rebalancing is preferred.</p>
<h3 id="heading-request-routing">Request Routing</h3>
<p>When rebalancing data, nodes, and partitions, it's important that clients can still reach the correct nodes. A client needs to know which node's IP and port to send a request to. There are three ways to manage this:</p>
<ol>
<li><p><strong>Node Forwarding</strong>: If a node receives a request for data it doesn't have, it forwards the request to the correct node, retrieves the data, and sends it back to the client.</p>
</li>
<li><p><strong>Routing Tier</strong>: A separate routing layer determines which node should handle the request.</p>
</li>
<li><p><strong>Direct Client Knowledge</strong>: Clients are aware of which node and partition to contact.</p>
</li>
</ol>
<p>The challenge in all cases is ensuring that clients or routing tiers know about changes in partition-to-node assignments. This can be managed using an external service (e.g., Zookeeper) that tracks node changes and updates either the routing tier or clients accordingly.</p>
<p>Okay, that’s it for this chapter. This one was easy and short, just how I like it :P</p>
]]></content:encoded></item><item><title><![CDATA[DDIA Chapter 5 - Replication - Part 2 - Thoughts and notes]]></title><description><![CDATA[Welcome back folks.
We are going to continue from where we left off in the last blog. We discussed Replication Logs, so now we are going to dive into the Problems with Replication Logs.
Problems with Replication Logs
Let's say you have a read heavy s...]]></description><link>https://tech.peculiarvivek.com/ddia-chapter-5-replication-part-2-thoughts-and-notes</link><guid isPermaLink="true">https://tech.peculiarvivek.com/ddia-chapter-5-replication-part-2-thoughts-and-notes</guid><category><![CDATA[Databases]]></category><category><![CDATA[replication]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Fri, 06 Sep 2024 13:22:44 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1725628870150/8109a507-f16d-4878-857a-7601cbfe0864.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Welcome back folks.</p>
<p>We are going to continue from where we left off in the last blog. We discussed Replication Logs, so now we are going to dive into the Problems with Replication Logs.</p>
<h3 id="heading-problems-with-replication-logs">Problems with Replication Logs</h3>
<p>Let's say you have a read-heavy system; to scale, you just add more read replicas, right? Yes. But in any system, problems arise. You cannot replicate synchronously here, because latency would shoot up, and if one follower went down, writes would block indefinitely.</p>
<p>Only async replication works in this scenario, but the replication lag may vary from a fraction of a second to minutes. A follower falls behind the leader, which can surface as inconsistent reads; eventually the follower catches up, hence this effect is called "eventual consistency". So let's see how to deal with the anomalies it causes:</p>
<p><strong>Read-After-Write Consistency</strong></p>
<ul>
<li><p>Ensures a client reads the latest data version immediately after writing it</p>
</li>
<li><p>Useful when there is replication lag between the leader (where writes occur) and followers (where reads occur)</p>
</li>
</ul>
<p><strong>Strategies for Implementing Read-After-Write Consistency</strong></p>
<ol>
<li><p><strong>Read from the Leader</strong>: When a user reads data they may have recently modified, read directly from the leader. Example: In social networks, users read their own profiles from the leader and others' profiles from followers</p>
</li>
<li><p><strong>Use Last Update Time</strong>: If a user updated data within a recent timeframe (e.g., the last two seconds), read from the leader (sketched right after this list)</p>
</li>
<li><p><strong>Track Last Write Timestamp</strong>: Save the timestamp of the last write and read from the leader until the data reflects updates up to that timestamp</p>
</li>
</ol>
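<p>Strategy 2 from the list takes only a few lines of application code. A hypothetical sketch (the names and the two-second lag budget are made up):</p>
<pre><code class="lang-python">import time

REPLICATION_LAG_BUDGET = 2.0           # assumed upper bound on follower lag, in seconds
last_write_at: dict[str, float] = {}   # user_id -&gt; time of that user's last write

def record_write(user_id: str) -&gt; None:
    last_write_at[user_id] = time.monotonic()

def pick_replica(user_id: str) -&gt; str:
    # Route reads to the leader for a short window after the user writes,
    # then fall back to (possibly lagging) followers.
    wrote_recently = time.monotonic() - last_write_at.get(user_id, float("-inf")) &lt; REPLICATION_LAG_BUDGET
    return "leader" if wrote_recently else "follower"

record_write("vivek")
print(pick_replica("vivek"))   # "leader" right after a write
time.sleep(2.1)
print(pick_replica("vivek"))   # "follower" once the lag budget has passed
</code></pre>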
<p><strong>Monotonic Reads</strong></p>
<ul>
<li><p>Ensures a user does not see older data after having seen newer data, avoiding "time travel" effects due to replication lag</p>
</li>
<li><p>Provides a weaker guarantee than strong consistency but is stronger than eventual consistency</p>
</li>
<li><p><strong>How do you make sure the above two guarantees hold?</strong></p>
<p>  Ensure the user always reads from the same replica (for example, chosen by a hash of the user ID)</p>
</li>
</ul>
<p><strong>Consistent Prefix Reads</strong></p>
<ul>
<li><p>Ensures the correct order of write operations when there is causality in data (e.g., messages in a chat app)</p>
</li>
<li><p>Prevents anomalies like broken message sequences caused by reading from different replicas</p>
</li>
<li><p><strong>Common Issue</strong>: Occurs more frequently in sharded databases</p>
</li>
<li><p><strong>Solutions</strong>:</p>
<ul>
<li><p>Write causally related data to the same shard.</p>
</li>
<li><p>Use specialised algorithms to maintain the correct order</p>
</li>
</ul>
</li>
</ul>
<p><strong>Solving Replication Lag</strong></p>
<ul>
<li><p>Replication lag is challenging to manage in applications; do not assume replication is synchronous when it is actually asynchronous</p>
</li>
<li><p><strong>Use Transactions</strong>: Transactions help provide stronger consistency and reduce complexity in application code</p>
</li>
</ul>
<h3 id="heading-multi-leader-replication">Multi-Leader Replication</h3>
<ol>
<li><p><strong>Concept:</strong></p>
<ul>
<li>Unlike leader-based replication (single leader for all writes), multi-leader replication allows multiple nodes to accept writes simultaneously. Each leader also acts as a follower to other leaders.</li>
</ul>
</li>
<li><p><strong>Use Cases:</strong></p>
<ul>
<li><p>Rarely useful within a single datacenter.</p>
</li>
<li><p><strong>Multi-datacenter operation:</strong> Each datacenter has its own leader; leaders replicate changes asynchronously to other data centers.</p>
<ul>
<li><p><strong>Benefits:</strong></p>
<ul>
<li><p><strong>Performance:</strong> Local writes reduce latency; network delays are hidden from users.</p>
</li>
<li><p><strong>Datacenter outage tolerance:</strong> Data centers can operate independently.</p>
</li>
<li><p><strong>Network problem tolerance:</strong> More resilient to network issues with asynchronous replication.</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li><p><strong>Implementation:</strong></p>
<ul>
<li><p>Tools: Tungsten Replicator (MySQL), BDR (PostgreSQL), GoldenGate (Oracle).</p>
</li>
<li><p>Challenges: Configuration issues (e.g., auto incrementing keys, triggers), potential data conflicts.</p>
</li>
</ul>
</li>
<li><p><strong>Specific Use Cases:</strong></p>
<ul>
<li><p><strong>Offline Clients:</strong> Applications (like a calendar) that work offline use local databases and synchronise later.</p>
</li>
<li><p><strong>Collaborative Editing:</strong> Real-time editing (e.g., Google Docs) involves local changes replicated asynchronously to the server and other users.</p>
</li>
</ul>
</li>
</ol>
<h3 id="heading-handling-write-conflicts">Handling Write Conflicts</h3>
<ol>
<li><p><strong>Key Challenge:</strong></p>
<ul>
<li>Conflict resolution is more complex in multi-leader replication compared to single-leader replication.</li>
</ul>
</li>
<li><p><strong>Conflict Detection:</strong></p>
<ul>
<li><p><strong>Synchronous:</strong> Conflicts detected immediately (single-leader).</p>
</li>
<li><p><strong>Asynchronous:</strong> Conflicts detected later (multi-leader).</p>
</li>
</ul>
</li>
<li><p><strong>Strategies for Conflict Handling:</strong></p>
<ul>
<li><p><strong>Conflict Avoidance:</strong> Route all writes for a particular record to the same leader.</p>
</li>
<li><p><strong>Convergent State:</strong> Ensure all replicas eventually reach the same final value.</p>
</li>
</ul>
</li>
<li><p><strong>Conflict Resolution Methods:</strong></p>
<ul>
<li><p><strong>Last Write Wins (LWW):</strong> Keep the write with the highest ID.</p>
</li>
<li><p><strong>Replica Precedence:</strong> Writes from a higher-numbered replica take precedence.</p>
</li>
<li><p><strong>Merge Values:</strong> Combine values in some way.</p>
</li>
<li><p><strong>Custom Logic:</strong> Use application code to resolve conflicts on write or read.</p>
</li>
</ul>
</li>
</ol>
<h3 id="heading-multi-leader-replication-topologies">Multi-Leader Replication Topologies</h3>
<ol>
<li><p><strong>Types of Topologies:</strong></p>
<ul>
<li><p><strong>All-to-All:</strong> Every leader sends writes to every other leader.</p>
</li>
<li><p><strong>Circular:</strong> Nodes forward writes in a circular manner.</p>
</li>
<li><p><strong>Star:</strong> One node forwards writes to all other nodes.</p>
</li>
</ul>
</li>
<li><p><strong>Challenges:</strong></p>
<ul>
<li><p><strong>Fault Tolerance:</strong> All-to-all is more fault-tolerant but may have ordering issues.</p>
</li>
<li><p><strong>Replication Loops:</strong> Use unique identifiers and version vectors to prevent loops.</p>
</li>
</ul>
</li>
</ol>
<h3 id="heading-leaderless-replication">Leaderless Replication</h3>
<ol>
<li><p><strong>Concept:</strong></p>
<ul>
<li>Any replica can accept writes; no leader. Writes are sent to all replicas in parallel.</li>
</ul>
</li>
<li><p><strong>Read and Write Mechanisms:</strong></p>
<ul>
<li>Clients send write/read requests to multiple nodes. Version numbers determine the most recent data.</li>
</ul>
</li>
<li><p><strong>Mechanisms for Data Synchronisation:</strong></p>
<ul>
<li><p><strong>Read Repair:</strong> Stale replicas are updated during reads.</p>
</li>
<li><p><strong>Anti-Entropy Process:</strong> Background process reconciles differences between replicas.</p>
</li>
</ul>
</li>
<li><p><strong>Quorums:</strong></p>
<ul>
<li><p>Writes must be confirmed by a minimum number (w) of nodes, and reads must query a minimum (r) number of nodes.</p>
</li>
<li><p>Ensures up-to-date values as long as <code>w + r &gt; n</code>, where n is the total number of replicas; see the check sketched after this list.</p>
</li>
</ul>
</li>
</ol>
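<p>The quorum condition is simple enough to sanity-check in a few lines (a toy check of the arithmetic, not a client implementation):</p>
<pre><code class="lang-python">def overlap_guaranteed(n: int, w: int, r: int) -&gt; bool:
    # The w nodes that acknowledged a write and the r nodes consulted by a
    # read cannot be disjoint sets as long as w + r &gt; n.
    return w + r &gt; n

n = 5
w = r = n // 2 + 1                      # the common majority-quorum choice
print(overlap_guaranteed(n, w, r))      # True: 3 + 3 &gt; 5
print(overlap_guaranteed(5, 2, 2))      # False: a read can miss every up-to-date node
</code></pre>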
<h3 id="heading-limitations-and-challenges">Limitations and Challenges</h3>
<ol>
<li><p><strong>Sloppy Quorum:</strong> Increases write availability by allowing writes to any reachable node, even outside the designated home nodes; the writes are later handed back to the proper nodes (hinted handoff).</p>
</li>
<li><p><strong>Concurrent Write Detection:</strong> Requires knowledge of the database's conflict handling mechanism.</p>
</li>
<li><p><strong>Version Vectors:</strong> Helps determine concurrent writes and ensures data consistency.</p>
</li>
</ol>
<h3 id="heading-key-takeaways">Key Takeaways</h3>
<ul>
<li><p>Multi-leader and leaderless replications offer flexibility and performance benefits but introduce complexities in conflict resolution.</p>
</li>
<li><p>Appropriate for specific use cases like multi-datacenter operations, offline clients, and collaborative applications.</p>
</li>
<li><p>Requires careful consideration of replication topology, conflict handling, and consistency mechanisms.</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[DDIA - Chapter 5 - Replication - thoughts and notes - Part 1]]></title><description><![CDATA[Why replicate db data?

keep data close to customer (improve latency)

high availability

increased read throughput


Leaders and followers - who are they?

Leaders - well, the name suggests the meaning, all writes to the leader node

Followers - wel...]]></description><link>https://tech.peculiarvivek.com/ddia-chapter-5-replication-thoughts-and-notes-part-1</link><guid isPermaLink="true">https://tech.peculiarvivek.com/ddia-chapter-5-replication-thoughts-and-notes-part-1</guid><category><![CDATA[replication]]></category><category><![CDATA[Databases]]></category><category><![CDATA[Outages]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sun, 01 Sep 2024 07:09:45 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1725174536338/95db6b54-dc05-4266-b1b2-98f5b3dcdac6.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-why-replicate-db-data"><strong>Why replicate db data?</strong></h3>
<ul>
<li><p>keep data close to customer (improve latency)</p>
</li>
<li><p>high availability</p>
</li>
<li><p>increased read throughput</p>
</li>
</ul>
<h3 id="heading-leaders-and-followers-who-are-they"><strong>Leaders and followers - who are they?</strong></h3>
<ul>
<li><p>Leaders - well, the name suggests the meaning: all writes go to the leader node</p>
</li>
<li><p>Followers - well, the name suggests the meaning: all reads go to the follower nodes</p>
</li>
</ul>
<p>It is the leader node's job to keep the followers up to date with the current data changes</p>
<h3 id="heading-sync-vs-async-replication"><strong>Sync vs Async replication</strong></h3>
<p>The leader can update the followers synchronously: when a row is updated, the leader tells the followers to update that row, and the user waits for the success response until every follower has applied the change.</p>
<p>In an async update, the leader updates the row but doesn't wait for the followers to do the same before sending the success response to the user.</p>
<p>A semi-sync approach works better here: you replicate synchronously to 1-2 followers and the rest update asynchronously. This is how you maintain data durability while keeping latency in check.</p>
<h3 id="heading-setting-up-new-followers"><strong>Setting up new followers</strong></h3>
<p>Setting up a follower can usually be done without downtime. The process looks like this:</p>
<ol>
<li><p>Take a snapshot of the leader’s database</p>
</li>
<li><p>Copy the snapshot to the follower node</p>
</li>
<li><p>Follower requests data changes that have happened since the snapshot was taken</p>
</li>
<li><p>Once the follower has processed the backlog of data changes since the snapshot, it has <em>caught up</em>.</p>
</li>
</ol>
<h3 id="heading-handling-node-outages"><strong>Handling Node Outages</strong></h3>
<ul>
<li><p>Follower Failure: Catchup Recovery</p>
<ul>
<li><p>Follower reconnects to the leader</p>
</li>
<li><p>Requests all data changes missed while it was disconnected</p>
</li>
</ul>
</li>
<li><p>Leader Failure: Failover</p>
<ul>
<li><p>A follower is promoted to a new leader</p>
</li>
<li><p>Clients are reconfigured to direct writes to the new leader</p>
</li>
<li><p>Other followers start consuming data changes from the new leader</p>
</li>
</ul>
</li>
<li><p>Automatic Failover Steps</p>
<ul>
<li><p><strong>Detect Leader Failure</strong>: A node is considered dead if it does not respond within a specific period</p>
</li>
<li><p><strong>Select a New Leader</strong>: The best candidate is the replica with the most up-to-date changes from the old leader</p>
</li>
<li><p><strong>Reconfigure the System</strong>: Ensure the old leader becomes a follower and recognises the new leader</p>
</li>
</ul>
</li>
<li><p>Potential Issues that might occur during automatic failover</p>
<ul>
<li><p><strong>Conflicting Writes</strong>: With asynchronous replication, the new leader may lack some writes from the old leader; if the old leader rejoins, those writes conflict.</p>
</li>
<li><p><strong>Data Coordination</strong>: Discarding writes can be risky if other systems need to be synchronised with the database</p>
</li>
<li><p><strong>Split-Brain Scenario</strong>: Two nodes might mistakenly believe they are the leader, causing data loss or corruption</p>
</li>
<li><p><strong>Timeout Determination</strong>: Deciding the right time before declaring a leader dead can be challenging</p>
</li>
</ul>
</li>
</ul>
<h3 id="heading-implementation-of-replication-logs"><strong>Implementation of Replication Logs</strong></h3>
<p><strong>Statement-Based Replication</strong></p>
<ul>
<li><p>The leader logs every SQL statement (e.g., INSERT, UPDATE, DELETE) and sends it to its followers.</p>
</li>
<li><p>Problems with Statement-Based Replication</p>
<ul>
<li><p>Non-deterministic Functions: Functions like <code>NOW()</code> or <code>RAND()</code> produce different values on replicas.</p>
</li>
<li><p>Order Dependency: Statements relying on existing data (e.g., auto-increments) must execute in the same order on all replicas.</p>
</li>
<li><p>Side Effects: Statements with side effects may result in different outcomes on each replica.</p>
</li>
</ul>
</li>
<li><p>Solution</p>
<ul>
<li>Replace nondeterministic functions with fixed values on the leader (see the sketch after this list).</li>
</ul>
</li>
</ul>
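<p>A hypothetical sketch of that fix (the function name and the naive string replacement are mine, purely for illustration): the leader pins nondeterministic calls to literal values before the statement ever reaches the replication log.</p>
<pre><code class="lang-python">import random
import time

def make_deterministic(stmt: str) -&gt; str:
    # Rewrite nondeterministic SQL function calls into literals on the leader,
    # so every follower replays exactly the same values.
    stmt = stmt.replace("NOW()", f"'{time.strftime('%Y-%m-%d %H:%M:%S')}'")
    stmt = stmt.replace("RAND()", str(random.random()))
    return stmt

log_entry = make_deterministic("UPDATE events SET seen_at = NOW() WHERE id = 7")
print(log_entry)  # followers replay a literal timestamp, not their own clock's NOW()
</code></pre>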
<p><strong>Write-Ahead Log (WAL) Shipping</strong></p>
<ul>
<li><p>An append-only sequence of bytes records all writes to the database. The leader sends this log to its followers.</p>
</li>
<li><p>Used in PostgreSQL and Oracle</p>
</li>
<li><p>Disadvantages of WAL Shipping</p>
<ul>
<li><p>Low-Level Data Description: The log describes data changes at a low level (specific bytes and disk blocks), tying it closely to the storage engine.</p>
</li>
<li><p>Version Incompatibility: Running different versions of the database on leaders and followers is usually not possible, impacting operations and complicating zero-downtime upgrades.</p>
</li>
</ul>
</li>
</ul>
<p><strong>Logical (Row-Based) Log Replication</strong></p>
<ul>
<li><p>A sequence of records describing writes to database tables at the row level:</p>
<ul>
<li><p>Inserted Row: Contains new values for all columns.</p>
</li>
<li><p>Deleted Row: Identifies the unique information for the deleted row.</p>
</li>
<li><p>Updated Row: Contains unique row identifier and new values for columns.</p>
</li>
</ul>
</li>
<li><p>Used by MySQL binlog.</p>
</li>
<li><p>Advantages of Logical Log Replication</p>
<ul>
<li><p>Decoupled from Storage Engine: Easier to maintain backward compatibility.</p>
</li>
<li><p>Ease of Parsing: Beneficial for external applications like data warehouses, custom indexes, and caches (change data capture).</p>
</li>
</ul>
</li>
</ul>
<p><strong>Trigger-Based Replication</strong></p>
<ul>
<li><p>Moves replication to the application layer.</p>
</li>
<li><p>A trigger executes custom application code automatically upon data changes, logging changes to a separate table for external processing.</p>
</li>
<li><p>Trade-offs of Trigger-Based Replication</p>
<ul>
<li><p>Overhead and Complexity: Higher overheads and more prone to bugs.</p>
</li>
<li><p>Flexibility: Can be useful for specific use cases requiring custom logic.</p>
</li>
</ul>
</li>
</ul>
<p>I'll have to break this chapter into 3 parts otherwise the blog will become too long and that will be problematic for the goldfish to read as they have a small attention span.</p>
]]></content:encoded></item><item><title><![CDATA[DDIA - Chapter 4 - Encoding and Evolution - Thoughts and notes]]></title><description><![CDATA[The fourth chapter, the last chapter of part 1 of the DDIA book. Lessgo!
What does software do? It propagates the transfer of data, well, it is too simple of an explanation, there are logical layers in between to make the data go to the right place i...]]></description><link>https://tech.peculiarvivek.com/ddia-chapter-4-encoding-and-evolution-thoughts-and-notes</link><guid isPermaLink="true">https://tech.peculiarvivek.com/ddia-chapter-4-encoding-and-evolution-thoughts-and-notes</guid><category><![CDATA[Databases]]></category><category><![CDATA[encoding]]></category><category><![CDATA[evolution]]></category><category><![CDATA[json]]></category><category><![CDATA[Compatibility]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sat, 24 Aug 2024 05:16:43 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1724476539320/0c3bfa50-2d3f-40b0-93d6-ff96e04ab253.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The fourth chapter, the last chapter of part 1 of the DDIA book. Lessgo!</p>
<p>What does software do? It moves data around. Well, that is too simple an explanation; there are logical layers in between that make the data reach the right place in the right form. What that right form is, is on you to decide.</p>
<p>There have been many formats for transferring data; my favourite is JSON. Why? Because it is human readable. :')</p>
<p>But data requirements keep changing due to new feature addition or refactor. How do you maintain the consistency of data then? Old code cannot read newer data. Newer code cannot read older data.</p>
<p>There are two keywords for this:</p>
<ul>
<li><p>Backward Compatibility - newer code can read older data written by older code</p>
</li>
<li><p>Forward Compatibility - older code can read newer data written by newer code</p>
</li>
</ul>
<p>Back-populating data helps. But a programmer needs to plan ahead before releasing changes to prod that might break backward/forward compatibility.</p>
<p>How to encode data? (You gotta interchange it right? So how do you encode it?)</p>
<ol>
<li><p>in-memory objects - your garden variety data structures (arrays, structs, trees etc)</p>
</li>
<li><p>In-memory data that is converted to bytes so that it can be sent over the network.</p>
<p> Here, the conversion of in-memory data to bytes is called <em>encoding, serialisation, or marshalling</em>, and the other way around is called <em>decoding, deserialisation, or unmarshalling</em> (a quick sketch follows this list).</p>
</li>
</ol>
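<p>In Python terms (a trivial example of my own), the round trip looks like this:</p>
<pre><code class="lang-python">import json

order = {"id": 42, "items": ["strings", "picks"], "total": 99.5}

wire_bytes = json.dumps(order).encode("utf-8")      # encoding / serialisation / marshalling
restored = json.loads(wire_bytes.decode("utf-8"))   # decoding / deserialisation / unmarshalling
assert restored == order
</code></pre>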
<p>Avoid language-specific encoders like Java's built-in serialisation or Python's pickle: they tie you to a single language, they handle versioning poorly, and they carry a security vulnerability of remote arbitrary code execution.</p>
<p>The most popular encoding formats are <strong>JSON, XML and their binary forms</strong>.</p>
<p>JSON and XML don't support sending binary data. People base64-encode the bytes and pass the data as a string, which increases the data size by about 33%.</p>
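<p>The 33% figure falls straight out of how base64 works: every 3 raw bytes become 4 ASCII characters. A quick check:</p>
<pre><code class="lang-python">import base64
import json

payload = bytes(range(256)) * 4                  # 1024 bytes of arbitrary binary data
encoded = base64.b64encode(payload).decode("ascii")
print(len(payload), len(encoded))                # 1024 1368 -- about 33% bigger
doc = json.dumps({"blob": encoded})              # binary smuggled through JSON as text
</code></pre>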
<p>The goal is to keep the byte size small. There are many methods for doing that, but no common one that everyone agrees on. Thrift and Protobuf decrease the size of the data while maintaining a schema. The author then talks about <strong>Avro</strong>; I won't be explaining it here, info about Avro is plentiful on the internet.</p>
<p>The book talks about modes of dataflow next:</p>
<ul>
<li><p>data flow through dbs (Data outlives code)</p>
</li>
<li><p>dataflow through REST, SOAP, RPC</p>
</li>
<li><p>message passing (kafka, rabbitmq)</p>
</li>
</ul>
<p>And I'm going to stop here.</p>
]]></content:encoded></item><item><title><![CDATA[DDIA - Chapter 3 - Storage and Retrieval - (Part 2)]]></title><description><![CDATA[Henlo frens
We discussed till LSM-tree in the last blog, had to break it down into 2 cause this chapter has too much information.
Many databases use B-trees, it has become the standard. LSM-tree and B-tree both keep the key value pair sorted, but bot...]]></description><link>https://tech.peculiarvivek.com/ddia-chapter-3-storage-and-retrieval-part-2</link><guid isPermaLink="true">https://tech.peculiarvivek.com/ddia-chapter-3-storage-and-retrieval-part-2</guid><category><![CDATA[Databases]]></category><category><![CDATA[columnar dbs]]></category><category><![CDATA[btree]]></category><category><![CDATA[OLAP]]></category><category><![CDATA[OLTP]]></category><dc:creator><![CDATA[Vivek Khatri]]></dc:creator><pubDate>Sat, 17 Aug 2024 11:51:25 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1723895426170/12b484b5-e5d1-460a-bbf6-df534bf69680.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Henlo frens</p>
<p>We discussed till LSM-tree in the last blog, had to break it down into 2 cause this chapter has too much information.</p>
<p>Many databases use B-trees; it has become the standard. LSM-trees and B-trees both keep key-value pairs sorted, but the two have different design philosophies.</p>
<p>Log-structured indexes -&gt; break the db into variable-size segments, each written sequentially</p>
<p>B-tree indexes -&gt; break the db into fixed-size blocks or pages, traditionally 4 KB in size, read/write one page at a time, O(log n) depth</p>
<p>B-trees use a write-ahead log to handle crashes: any change destined for the B-tree is appended to this file first and only then written into the tree itself.</p>
<p>Comparison between LSM-tree and B-tree. (You can find this anywhere, not gonna explain here sorry :()</p>
<p>Optimisations in B-tree:</p>
<ul>
<li><p>Copy-on-Write Scheme</p>
</li>
<li><p>Key Abbreviation</p>
</li>
<li><p>Disk Layout Optimisation</p>
</li>
<li><p>Additional pointers</p>
</li>
</ul>
<p>Apart from primary indexes there are also secondary indexes. They complement primary indexes and enable efficient joins and queries. The difference is that in a secondary index the key values can be non-unique.</p>
<p>Coming to full-text search/fuzzy search - plain key-value indexing cannot handle full-text search, as even a simple mistake in the keyword means it won't show you any results. Full-text search engines expand queries to include synonyms, ignore grammatical variations, and support proximity searches. Fuzzy search deals with misspelled words.</p>
<p>This is a blog about correcting spellings: <a target="_blank" href="https://norvig.com/spell-correct.html">https://norvig.com/spell-correct.html</a></p>
<p>Okay, now let's talk about OLTP (Online Transaction Processing) and OLAP (Online Analytics Processing).</p>
<p>OLTP databases form the logical unit for the business use case, but running analytical queries on the same db can slow it down, so someone came up with OLAP dbs - databases specifically built for analytics.</p>
<p>Data warehouses are for analysts to run their inefficient queries and fetch useless stats to impress the upper management. Kidding. Data is piped from OLTP systems through an ETL (Extract-Transform-Load) pipeline into data warehouses, where analysts can run any number of queries. Data warehouses usually follow a star schema. Look it up on Google; I cannot spoon-feed everything, duh.</p>
<p>There is also something called columnar DBs; as the name suggests, they store the values of each column together instead of storing rows together, and they are efficient for analytical queries. See, tables in warehouses can have hundreds of columns and millions of rows; reading every row and holding whole rows in memory makes no sense when the user needs only specific column values, so column-oriented storage comes in handy here.</p>
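<p>The layout difference in miniature (plain Python lists of my own standing in for disk pages):</p>
<pre><code class="lang-python"># Row-oriented: every attribute of a record is stored together.
rows = [
    {"date": "2024-08-01", "product": "guitar", "qty": 3,  "price": 499},
    {"date": "2024-08-01", "product": "picks",  "qty": 90, "price": 1},
]

# Column-oriented: all values of one column are stored together.
columns = {
    "date":    ["2024-08-01", "2024-08-01"],
    "product": ["guitar", "picks"],
    "qty":     [3, 90],
    "price":   [499, 1],
}

# An analytical query over one column scans a single compact (and highly
# compressible) array instead of dragging whole rows through memory.
total_qty = sum(columns["qty"])
print(total_qty)  # 93
</code></pre>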
<p>Column-oriented storage, compression, and sorting help make read queries faster, which makes sense in data warehouses, where the load consists of large read-only queries run by business teams.</p>
<p>That's it, the chapter also has a lot of nitty gritties but I won't document them here, I have broadly covered the important things.</p>
]]></content:encoded></item></channel></rss>