Managing Technical Debt: Tradeoff between Speed and Long-Term Sustainability

Technical Debt
Photo by DeepMind on Unsplash

Technical debt is a concept that has become increasingly important in software development over the past few decades. In essence, it refers to the trade-off between delivering software quickly and maintaining its long-term sustainability. Just as financial debt can accumulate over time and become a burden, so can technical debt if it is not managed effectively.

In this blog post, we will explore the concept of technical debt and provide some best practices for managing it. We will begin by defining technical debt and explaining why it is important to manage it effectively. We will then explore some common causes of technical debt and provide some strategies for reducing and avoiding it. Finally, we will discuss some best practices for balancing the need for speed with long-term sustainability in software development.

What is Technical Debt?

Technical debt refers to the cost of maintaining and supporting software that was built quickly and without regard for its long-term sustainability. Just as financial debt accumulates over time and accrues interest, technical debt accumulates as developers take shortcuts or make compromises that result in suboptimal design and code quality. The result is code that is difficult to maintain, modify, or extend. This can lead to a situation where the cost of maintaining and supporting the software is much higher than it would have been if the software had been built with sustainability in mind from the beginning.

What causes Technical Debt?

There are many factors that can contribute to the accumulation of technical debt. Here are a few common causes:

  1. Rush to Meet Deadlines

One of the most common causes of technical debt is the pressure to meet deadlines. If a development team is under pressure to deliver software quickly, they may take shortcuts or make compromises that result in suboptimal code quality. While this may allow them to meet the deadline, it can lead to a situation where the cost of maintaining and supporting the software is much higher than it would have been if the software had been built with sustainability in mind from the beginning.

  2. Lack of Planning

Another common cause of technical debt is a lack of planning. If a development team does not take the time to carefully plan out the architecture and design of their software, they may end up with design/code that is difficult to maintain or extend.

  3. Working in Silos

Poor communication between development team members can also contribute to the accumulation of technical debt. If developers are not communicating effectively with each other, with cross-functional team members, or with stakeholders, they may end up making design decisions that are not optimal for the long-term sustainability of the product.

  4. Lack of DevSecOps Adoption and the Right SDLC Tooling

Finally, a lack of DevSecOps adoption, and a failure to incorporate the right tooling to automate testing, code scanning, code reviews, etc., can also contribute to the accumulation of technical debt. If developers do not have a robust code review process or an automated testing suite in place, they may introduce defects that are not caught until much later in the development cycle. This can lead to a situation where the cost of fixing defects is much higher than it would have been if the defects had been caught earlier in the development process. It also degrades developer morale, as developers spend more hours managing and fixing defects.

Why is Technical Debt Important to Manage?

Managing technical debt is important for a number of reasons. First and foremost, technical debt can be a major drain on productivity and resources. If developers are spending all their time fixing defects and maintaining legacy code, they will have less time to work on new features and improvements.

In addition, technical debt can lead to higher costs and longer development times. If code is poorly designed or difficult to modify, it may take much longer to add new features or make changes to the software. This can result in missed deadlines, higher costs, and a negative impact on the development team's morale and job satisfaction. If developers are constantly working with suboptimal code or struggling to fix defects that could have been avoided with better design decisions, they may become frustrated and demotivated. This can lead to higher turnover rates and a less productive team.

How to reduce/avoid Technical Debt?

Now that we have explored some common causes of technical debt, let’s discuss some strategies for reducing and avoiding it:

  1. Prioritize Design and Code Quality

One of the most effective ways to reduce and avoid technical debt is to prioritize product design and code quality from the beginning. This means taking the time to carefully plan out the architecture and design of your software, and ensuring that all code is thoroughly reviewed and tested before it is merged into the main source code branch. By prioritizing code quality, you can avoid many of the shortcuts and compromises that lead to technical debt.

  2. Use Agile Methodologies

Agile methodologies, such as Scrum and Kanban, can also be effective at reducing and avoiding technical debt. By breaking development down into smaller, manageable sprints and focusing on delivering value in each sprint, you can ensure that your software is being developed with sustainability in mind from the beginning. Agile methodologies also prioritize communication and collaboration within the team and across different teams, which helps ensure that everyone is on the same page and working towards the same OKRs (Objectives and Key Results).

  3. Code Reviews and Automated Testing

Code reviews help establish coding standards and best practices that can prevent technical debt from accumulating in the first place. Automated testing is another effective way to reduce and avoid technical debt. By implementing a robust suite of automated tests, you can catch defects much earlier in the development process, before they have a chance to accumulate and become a burden. Automated testing also helps ensure that all code is thoroughly tested before it is merged into the main source code branch; a minimal example of such an automated check is sketched after this list.

  4. Manage Technical Debt as Part of Sprint Planning

Managing technical debt should be a part of your sprint planning process. As you prioritize features and improvements, you should also consider the impact they will have on your codebase and the potential for accumulating technical debt. This means weighing the benefits of delivering features quickly against the long-term sustainability of your software.

  5. Prioritize Technical Debt Reduction

While it may be tempting to prioritize new features and improvements over technical debt reduction, it is important to keep technical debt reduction a priority. By taking the time to reduce and avoid technical debt, you can ensure that your software is being developed with long-term sustainability in mind.

  6. Make Technical Debt Reduction a Team Effort

Reducing and avoiding technical debt should not be the sole responsibility of one team member. Instead, it should be a team effort, with everyone working together to ensure that technical debt is kept to a minimum. This means encouraging everyone on the team to prioritize code quality, communicate effectively, and stay on top of technical debt reduction.

  7. Continuously Monitor Technical Debt

Finally, it is important to continuously monitor the technical debt associated with your software. This means keeping track of technical debt metrics, such as code complexity and defect counts, and regularly assessing the impact of technical debt on your software's sustainability. One way to accomplish this is by integrating tools like SonarQube and Kiuwan into your CI/CD pipeline, allowing you to continuously assess technical debt and communicate it to stakeholders. By staying on top of technical debt, you can ensure that your software remains sustainable over the long term.
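
To make the automated-testing point above concrete, here is a minimal sketch of the kind of check that catches defects before they reach the main branch. It uses JUnit 5 and an entirely hypothetical PriceCalculator class with an assumed 10% discount rule; it is an illustration, not code from any specific project.

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Test;

class PriceCalculatorTest {

    // Hypothetical production class under test, kept inline so the example is self-contained.
    static class PriceCalculator {
        double totalWithDiscount(double amount) {
            if (amount < 0) {
                throw new IllegalArgumentException("amount must not be negative");
            }
            // Assumed business rule: 10% discount on orders of 100.00 or more.
            return amount >= 100.00 ? amount * 0.9 : amount;
        }
    }

    @Test
    void appliesTenPercentDiscountAboveThreshold() {
        assertEquals(90.00, new PriceCalculator().totalWithDiscount(100.00), 0.001);
    }

    @Test
    void rejectsNegativeAmounts() {
        // Missing input validation is the kind of defect this catches long before release.
        assertThrows(IllegalArgumentException.class,
                () -> new PriceCalculator().totalWithDiscount(-1.00));
    }
}

Run as part of every pull request, tests like these keep small defects from compounding into long-lived debt.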

Ultimately, managing technical debt is about taking a proactive approach to software development. By prioritizing quality of design and development, communication, and technical debt reduction, you can ensure that your software is developed with both speed and sustainability in mind, and avoid the costs and risks associated with technical debt.


The Rise of Low-Code/No-Code Application Development

low-code/no-code
Photo by Ilya Pavlov on Unsplash

What is Low-code?

Low-code is a software development approach that allows developers to create and maintain applications with minimal hand-coding and minimal effort. It enables users to visually design, build, and deploy software applications using a graphical user interface rather than traditional coding. Low-code platforms typically provide pre-built, reusable components and drag-and-drop functionality, making it easy for non-technical users to create and customize applications without the need for programming skills. This approach can help organizations to build and deploy software faster, with less cost, and with more flexibility.

Low-code platforms typically provide pre-built, reusable components that users can drag and drop onto a visual canvas to create their application. These components may include things like forms, buttons, and data tables. Users can then customize the components by adjusting properties, such as layout and style, or by creating simple scripts or formulas to handle data and logic.

What is No-code?

No-code is a type of software development approach that allows users to create, customize, and maintain software applications without the need for writing any code. It is an even higher abstraction level than low-code, which allows non-technical users to create and deploy software applications by using a drag-and-drop interface or pre-built templates, similar to low-code. The idea behind no-code is to allow anyone, regardless of their technical background, to create software applications without the need for programming skills. The end goal is to enable faster and more efficient application development, while also reducing the need for specialized development resources.

No-code platforms, on the other hand, take this abstraction a step further, providing pre-built templates and workflows that can be used to create an application by configuring the pre-built components, without any coding required. These templates can be used to create a wide range of applications, from simple forms to more complex business process automation.

Low-code and no-code tools typically work by providing a visual, drag-and-drop interface for users to design, build, and deploy software applications. This interface allows users to create and customize software applications without the need for traditional coding. Both low-code and no-code tools also typically provide a way to connect to external data sources and services, allowing users to easily integrate their applications with other systems and tools.

What are the benefits of Low-code/No-code?

Low-code and no-code platforms offer a number of benefits for organizations looking to create and maintain software applications. Some of the main benefits include:

  1. Faster development: Low-code and no-code platforms allow developers to create and deploy software applications faster than traditional coding, by providing pre-built, reusable components and drag-and-drop functionality.
  2. Reduced costs: Low-code and no-code platforms can help organizations to reduce costs by allowing them to build and deploy software faster and with less need for specialized development resources.
  3. Increased productivity: Low-code and no-code platforms allow non-technical users, such as business analysts and domain experts, to create and customize software applications without the need for programming skills, increasing productivity and aligning IT with business needs.
  4. Greater flexibility: Low-code and no-code platforms can be used to create a wide range of software applications, from simple forms to more complex business process automation, allowing organizations to respond to changing business needs and stay competitive.
  5. Better data and business process management: Low-code and no-code platforms can be used to create and maintain databases and analytics platforms, allowing organizations to better manage and analyze their data, as well as automate business processes, such as data entry and workflow management.
  6. Cloud and IoT integration: Low-code and no-code platforms can be used to create cloud and IoT applications, allowing organizations to leverage the power of cloud computing and the Internet of Things to improve efficiency and drive innovation.
  7. Cross-platform compatibility: Many low-code and no-code platforms have built-in support for multi-platform deployment, allowing organizations to develop once and deploy on multiple platforms, such as web, mobile, and desktop.

What are the use cases of Low-code/No-code platforms?

Low-code and no-code platforms have seen growing adoption across various industries in recent years, as organizations look to improve efficiency and reduce costs while developing software applications.

  1. Healthcare: Low-code and no-code platforms are increasingly being used in healthcare to improve patient care and automate workflows. For example, hospitals and clinics can use these platforms to create and maintain electronic health records, appointment scheduling systems, and patient portals.
  2. Financial Services: Low-code and no-code platforms are being used in financial services to automate workflows and improve the customer experience. Banks and insurance companies can use these platforms to create mobile apps, customer portals, and automated loan origination systems.
  3. Retail: Low-code and no-code platforms are being used in retail to improve the customer experience and automate workflows. Retailers can use these platforms to create mobile apps, e-commerce platforms, and inventory management systems.
  4. Manufacturing: Low-code and no-code platforms are being used in manufacturing to automate workflows, improve efficiency, and reduce costs. Manufacturers can use these platforms to create and maintain manufacturing execution systems, quality control systems, and supply chain management systems.
  5. Government: Low-code and no-code platforms are being used in government to automate workflows, improve efficiency, and reduce costs. Governments can use these platforms to create and maintain systems for managing public services, such as property tax systems, driver’s license systems, and voter registration systems.
  6. IT Services: Low-code and no-code platforms are being used in IT services to improve the efficiency of IT operations. IT services companies can use these platforms to create automation scripts, custom portals, and service management systems.
  7. Supply chain: Low-code and no-code platforms can be used in the supply chain to improve efficiency, automate workflows, and gain real-time visibility into the supply chain operations. These platforms can be used to create and maintain inventory management systems, supply chain visibility systems, order management systems, supply chain automation systems, supplier management systems, and logistics management systems.

Low-code and no-code platforms have seen a significant increase in popularity in recent years, and there are a number of statistics that demonstrate this trend.

  1. According to a survey by Forrester Research, low-code development platforms are projected to be used for 65% of all application development by 2024.
  2. A report by Gartner suggests that by 2024, low-code application development will be responsible for more than 65% of application development activity.
  3. A study by IDC found that the worldwide low-code development platform market is expected to grow from $10.3 billion in 2019 to $45.8 billion by 2024, at a CAGR of 34.5% during the forecast period.
  4. According to a report by MarketsandMarkets, the no-code development platform market size is expected to grow from $3.8 billion in 2020 to $10.7 billion by 2025, at a CAGR of 22.9% during the forecast period.
  5. Another study by IDC suggests that the no-code development platform market is expected to reach $13.9 billion by 2023.

There are many low-code platforms available in the market; some examples include:

  1. Pega: A low-code platform that enables collaboration between business and IT teams, allowing them to work together to create and deploy applications that meet the needs of the business. The platform is designed to be flexible and scalable, allowing users to easily modify and update applications as business needs change. Pega provides features such as a visual drag-and-drop interface for building applications, pre-built reusable components, a library of pre-built connectors to popular systems and services, a built-in rules engine for making decisions and automating processes, and a built-in case management system for handling customer inquiries and complaints.
  2. Salesforce Lightning: A low-code platform from Salesforce that allows developers to build custom applications and automate business processes on the Salesforce platform.
  3. OutSystems: A low-code platform that allows developers to create web and mobile applications, and automate business processes using drag-and-drop visual development.
  4. Mendix: A low-code platform that allows developers to create web and mobile applications using a visual development environment, and automate business processes using pre-built connectors to external systems.
  5. Appian: A low-code platform that allows developers to create web and mobile applications, automate business processes, and manage data using a visual development environment.
  6. Microsoft PowerApps: A low-code platform that allows developers to create web and mobile applications, automate business processes, and manage data using a visual development environment, and it’s integrated with Microsoft Power Platform.
  7. Zoho Creator: A low-code platform that allows developers to create custom applications, automate business processes, and manage data using a drag-and-drop interface.

There are many no-code platforms available in the market; some examples include:

  1. Bubble.io: A no-code platform that allows users to create web applications, automate workflows and manage data using a visual drag-and-drop interface.
  2. Webflow: A no-code platform that allows users to create and design responsive websites and web applications using a visual drag-and-drop interface.
  3. Adalo: A no-code platform that allows users to create and design mobile applications using a visual drag-and-drop interface.
  4. Wix: A no-code platform that allows users to create and design websites, web applications and e-commerce sites using a visual drag-and-drop interface.
  5. Airtable: A no-code platform that allows users to create custom databases and automate workflows using a visual drag-and-drop interface.
  6. Unqork: A no-code platform that allows users to create and design web applications, automate workflows and manage data using a visual drag-and-drop interface, specifically designed for enterprise use cases.

In summary, low-code and no-code platforms offer faster development, reduced costs, increased productivity, greater flexibility, better data and business process management, cloud and IoT integration, and cross-platform compatibility. These benefits make them a popular choice for organizations looking to create and maintain software applications quickly, easily, and with minimal coding.


Will ChatGPT replace Google Search?

ChatGPT
Photo by DeepMind on Unsplash

In this article, let's explore ChatGPT and its use cases, and see whether ChatGPT can replace Google Search.

ChatGPT (short for "Chat Generative Pre-trained Transformer") is a large language model developed by OpenAI. It is a variant of the GPT (Generative Pre-trained Transformer) model, which is trained on a massive amount of text data to generate human-like text.

Unsupervised learning is a technique in which the model is not provided with any labeled or annotated data. Instead, the model is trained on a large dataset of text and learns patterns and relationships in the data on its own. This approach is used to pre-train ChatGPT, allowing it to generate human-like text. You can try ChatGPT at https://chat.openai.com/chat by entering some questions; interacting with it feels like having a conversation with a friend.

Chatbots and virtual assistants have come a long way in recent years. One of the most advanced of these technologies is ChatGPT, a language model developed by OpenAI. But, can ChatGPT replace Google search? In this blog post, we’ll explore the capabilities of ChatGPT and compare them to those of Google search.

First, let’s take a look at ChatGPT. This language model is trained on a massive amount of text data and can generate human-like text. It can answer questions, write essays, and even generate code. ChatGPT has been used to create chatbots and virtual assistants that can answer questions and perform various tasks.

Now, let’s compare ChatGPT to Google search. Google search is a powerful tool that can find information on almost any topic. It can search the web, images, videos, news, and more. Google search also uses machine learning to provide personalized results and has a wide range of advanced features like voice search, autocomplete, and the Knowledge Graph.

While ChatGPT can answer questions and generate text, it’s not designed to search the web. Google search, on the other hand, is specifically designed to search the web and has a vast array of features to help users find the information they need.

Additionally, ChatGPT is best suited for generating human-like text, but it can’t match the speed, efficiency and accuracy of a search engine that has been fine-tuned over the years and uses advanced algorithms to find and rank the most relevant information.

While Google Search is a powerful tool for finding information on the web, there are certain situations where ChatGPT may be superior. Here are a few examples:

  1. Natural Language Processing: ChatGPT is trained on a massive amount of text data and can understand and respond to natural language queries. This makes it well-suited for answering questions and having conversations in a more human-like manner. In contrast, Google Search may not always provide the most accurate or helpful results when dealing with more complex or nuanced queries.
  2. Writing and Text Generation: ChatGPT can generate human-like text. It can be used to write essays, articles, emails, and even generate code. Google Search is not designed for this purpose and may not provide the same level of text generation capabilities.
  3. Privacy and Security: ChatGPT can be used to answer questions and perform tasks without sending data to a third-party server. This can be beneficial in situations where privacy and security are a concern, such as when dealing with sensitive information.
  4. Personalization: ChatGPT can be trained on specific data and fine-tuned for a particular use case, which allows it to provide highly personalized results and responses. While Google Search can personalize results, it may not be as specific or accurate as ChatGPT in certain scenarios.
  5. Complex and specific use cases: ChatGPT can be used to answer highly specific and complex queries that might not be easily searchable through a search engine. For example, it could be used to answer technical questions in a specific industry or generate reports on specific topics, something that Google search may not be able to do as easily.

While ChatGPT is a powerful language model, there are certain situations where Google Search may be superior. Here are a few examples:

  1. Searching the web: Google Search is specifically designed for searching the web and has a vast array of features to help users find the information they need. It can search for text, images, videos, news, and more, while ChatGPT is not designed for this purpose.
  2. Speed and Efficiency: Google Search has been fine-tuned over the years and uses advanced algorithms to find and rank the most relevant information. It can provide results in a matter of seconds, while ChatGPT may take longer to process a query, especially if it requires more complex natural language understanding.
  3. Relevancy and accuracy: Google Search can provide highly relevant and accurate results by using algorithms that take into account factors such as page relevance, user location and search history, among others.
  4. Multilingual support: Google Search supports a wide range of languages and can provide results in multiple languages, while ChatGPT is primarily designed to work with English.
  5. Advanced features: Google Search provides a wide range of advanced features such as voice search, autocomplete, and the Knowledge Graph, which can be useful in certain situations. ChatGPT does not have these features and is primarily focused on generating text.
  6. Large-scale search: Google Search can handle large-scale search queries, whereas ChatGPT might struggle to handle and return results for queries at that scale.

It’s worth noting that Google Search and ChatGPT are different tools with different strengths and weaknesses, and both can be useful depending on the task at hand.

In conclusion, ChatGPT is a powerful language model that can generate human-like text and answer questions. However, it is not intended to replace Google search, which is specifically designed for searching the web and has a wide range of advanced features. While ChatGPT can be a useful tool for certain tasks, it can’t match the speed and efficiency of a search engine like Google.


Red Hat Summit 2021


Reactive Summit 2021


Audit Database Changes with Debezium

Debezium

In this article, we will explore Debezium to capture data changes. Debezium is a distributed open-source platform for change data capture. Point a Debezium connector at your database and start listening to the change events (inserts, updates, and deletes) that applications commit, read directly from the database transaction logs.

Debezium is a collection of source connectors for Apache Kafka Connect. Debezium's log-based Change Data Capture (CDC) ingests changes directly from the database's transaction logs. Unlike other approaches, such as polling or dual writes, the log-based approach offers the features below.

  • Ensures that all data changes are captured. The data changes may come from multiple applications, SQL editors, etc. Debezium captures every change event.
  • Produces change events with very low delay while avoiding the increased CPU usage required for frequent polling.
  • As the changes are captured at the database transaction log level, no changes are required to your data model, such as a "Last Updated" column.
  • It captures deletes.

Let us discuss a use case: auditing database table changes for compliance purposes. There are different approaches to auditing databases.

  1. Using database triggers to monitor the DDL/DML changes. However, database triggers become painful if you don't use them wisely, and hence many enterprise applications avoid them.
  2. Hibernate Envers. The Envers module aims to provide an easy auditing/versioning solution for entity classes (a typical Envers mapping is sketched after this list). It does a good job, but it has the issues below.
    1. The audit logging is synchronous.
    2. The audit logging and the actual database changes for business logic need to be wrapped with the same transaction. If the audit logging fails, the whole transaction needs to be rolled back.
    3. If we decide to push the changes to another database instance, we might end up using distributed transactions. This will add performance overhead to the application.
    4. Pushing the changes to other systems, such as analytics or search, will be problematic.
    5. Mixing audit logging with the actual business logic creates a codebase maintenance issue.
    6. It cannot capture changes coming from other applications or a SQL shell.

  3. Writing our own audit framework to capture the data changes. This works, but it has the same issues highlighted in #2 above.
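
For context, the Envers approach mentioned in #2 usually looks like the sketch below: a hypothetical Customer entity annotated with @Audited (the hibernate-envers dependency must be on the classpath). Envers then records every change to a Customer_AUD table within the same transaction, which is exactly where the synchronous-logging and transaction-coupling issues listed above come from.

import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.envers.Audited;

// Hypothetical entity; @Audited tells Envers to version every insert/update/delete
// of this table into a Customer_AUD audit table, in the same transaction as the change.
@Entity
@Audited
public class Customer {

    @Id
    private Long id;
    private String firstName;
    private String lastName;
    private String email;

    // getters and setters omitted for brevity
}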

Now, let us see how Debezium solves the database audit use case. The design below depicts the components involved in auditing the DB with Debezium.

Follow the steps below to set up the Debezium connector.

Step 1: Download the connector from https://debezium.io/releases/1.4/#installation. In this example I am using MySQL, so I downloaded the Debezium MySQL connector. Debezium has connectors for a variety of databases.

Step 2: Install a Kafka cluster. I used a simple cluster with one ZooKeeper node and one broker. The same Kafka installation contains the Kafka Connect properties. Add the Debezium jar files to the Kafka Connect classpath by updating plugin.path in the connect-distributed.properties file.

Step 3: Enable the binlog for the MySQL database.

Step 4: Launch the Kafka cluster and Kafka Connect with the commands below.

# To start ZooKeeper
bin/zookeeper-server-start.sh config/zookeeper.properties
# To start the Kafka broker
bin/kafka-server-start.sh config/server.properties
# To start Kafka Connect
bin/connect-distributed.sh config/connect-distributed.properties

Step 5: Register the MySQL source connector configuration with Kafka Connect.

curl -k -X POST -H "Accept:application/json" -H "Content-Type:application/json" http://localhost:8083/connectors/ -d '{
"name": "mysql-connector-demo",
"config": {
"connector.class": "io.debezium.connector.mysql.MySqlConnector",
"database.hostname": "localhost",
"database.port": "3306",
"database.user": "debezium",
"database.password": "dbz",
"database.server.id": "1",
"database.server.name": "dbserver1",
"database.history.kafka.bootstrap.servers": "localhost:9092",
"database.history.kafka.topic": "customers_audit",
"table.include.list": "inventory.customers",
"transforms": "Reroute",
"transforms.Reroute.type": "io.debezium.transforms.ByLogicalTableRouter",
"transforms.Reroute.topic.regex": "([^.]+)\\.([^.]+)\\.([^.]+)",
"transforms.Reroute.topic.replacement": "$3"
}
}'

A few notes on this configuration: connector.class selects the Debezium MySQL connector; the database.* properties point to the MySQL instance and give the connector a unique server id and a logical server name; database.history.kafka.* tells the connector where to record the schema history; table.include.list restricts capture to the inventory.customers table; and the Reroute transform (ByLogicalTableRouter) rewrites the default <server>.<database>.<table> topic name so that events land on a topic named after the table itself.

Step 6: Now, run some inserts/updates/deletes on the table we configured for auditing to see the events on the topic.

Below are some of the events we received on the topic for insert, update, and delete DML. The actual JSON has other properties, but I am showing a trimmed version for simplicity.

"payload": {
"before": null,
"after": {
"id": 1016,
"first_name": "Smart",
"last_name": "Techie",
"email": "smarttechie@gmail.com"
},
"source": {
"version": "1.4.2.Final",
"connector": "mysql",
"name": "dbserver1",
"ts_ms": 1615928467000,
"snapshot": "false",
"db": "inventory",
"table": "customers",
"server_id": 223344,
"gtid": null,
"file": "mysql-bin.000003",
"pos": 4015,
"row": 0,
"thread": 36,
"query": null
},
"op": "c",
"ts_ms": 1615928467236,
"transaction": null
}
"payload": {
"before": {
"id": 1016,
"first_name": "Smart",
"last_name": "Techie",
"email": "smarttechie@gmail.com"
},
"after": {
"id": 1016,
"first_name": "Smart",
"last_name": "Techie",
"email": "smarttechie_updated@gmail.com"
},
"source": {
"version": "1.4.2.Final",
"connector": "mysql",
"name": "dbserver1",
"ts_ms": 1615928667000,
"snapshot": "false",
"db": "inventory",
"table": "customers",
"server_id": 223344,
"gtid": null,
"file": "mysql-bin.000003",
"pos": 4331,
"row": 0,
"thread": 36,
"query": null
},
"op": "u",
"ts_ms": 1615928667845,
"transaction": null
}
"payload": {
"before": {
"id": 1016,
"first_name": "Smart",
"last_name": "Techie",
"email": "smarttechie_updated@gmail.com"
},
"after": null,
"source": {
"version": "1.4.2.Final",
"connector": "mysql",
"name": "dbserver1",
"ts_ms": 1615928994000,
"snapshot": "false",
"db": "inventory",
"table": "customers",
"server_id": 223344,
"gtid": null,
"file": "mysql-bin.000003",
"pos": 4696,
"row": 0,
"thread": 36,
"query": null
},
"op": "d",
"ts_ms": 1615928994823,
"transaction": null
}
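
Any downstream application can consume these audit events with a plain Kafka consumer. Below is a minimal sketch, assuming the rerouted topic is named customers (the $3 replacement in the Reroute transform above resolves to the table name) and the broker is reachable at localhost:9092; in a real audit pipeline you would parse the JSON and persist it rather than print it.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class CustomerAuditConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "customer-audit-consumer");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("customers"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Each value is a Debezium change event; "op" is c/u/d and
                    // "before"/"after" carry the row state, as in the payloads above.
                    System.out.println(record.value());
                }
            }
        }
    }
}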

You can find the list of organizations using Debezium here. I hope you enjoyed this article. We will meet in another blog post. Till then, Happy Learning!!


Let’s think Kafka cluster without Zookeeper with KIP-500

Right now, Apache Kafka® uses Apache ZooKeeper™ to store its metadata. Information such as partitions, topic configurations, and access control lists is stored in a ZooKeeper cluster. Managing a ZooKeeper cluster creates an additional burden on the infrastructure and the admins. With KIP-500, we are going to see a Kafka cluster without a ZooKeeper cluster, where metadata management is done within Kafka itself.

Before KIP-500, our Kafka setup looked like the one depicted below. Here we have a 3-node ZooKeeper cluster and a 4-node Kafka cluster. This setup is the minimum for sustaining one Kafka broker failure. The orange Kafka node is the controller node.

Let us see what issues the above setup has because of the involvement of ZooKeeper.

  • Making the ZooKeeper cluster highly available is an issue, as the Kafka cluster is dead without it.
  • Availability of the Kafka cluster suffers if the controller dies. Electing another Kafka broker as the controller requires pulling the metadata from ZooKeeper, which makes the Kafka cluster unavailable. The more topics and partitions per topic there are, the longer the controller failover takes.
  • Kafka supports intra-cluster replication for higher availability and durability. There are multiple replicas of a partition, each stored on a different broker. One of the replicas is designated as the leader and the rest are followers. If a broker fails, the partitions whose leader was on that broker temporarily become inaccessible. To continue serving client requests, Kafka automatically transfers the leadership of those inaccessible partitions to other replicas. This is done by the Kafka broker acting as the controller, which must fetch metadata from ZooKeeper for each affected partition. The communication between the controller broker and ZooKeeper happens serially, which prolongs the unavailability of those partitions when the leader broker dies.
  • When we delete or create a topic, the Kafka cluster needs to talk to ZooKeeper to get the updated list of topics, so it takes time for topic creation or deletion to be reflected across the cluster.
  • Overall, the major issue we see is SCALABILITY.

Let's see what the Kafka cluster looks like post KIP-500. Below is the Kafka cluster setup.

Post KIP-500, the metadata is stored in a Kafka cluster itself; consider that cluster a controller cluster. The controller marked in orange is the active controller and the other nodes are standby controllers. All the nodes in the controller cluster stay in sync, so when the active controller fails, electing a standby node as the controller is very quick because it does not require syncing the metadata. The brokers in the Kafka cluster periodically pull the metadata from the controller. This design means that when a new controller is elected, we never need to go through a lengthy metadata loading process.

KIP-500 also speeds up topic creation and deletion. Currently, creating or deleting a topic requires fetching the full list of topics in the cluster from the ZooKeeper metadata. Post KIP-500, only a new entry needs to be added to the metadata partition, which speeds up topic creation and deletion. Metadata scalability also increases, which ultimately improves the SCALABILITY of Kafka.

In the future, I want to see the separate controller cluster eliminated as well, so that the metadata is managed within the actual Kafka cluster. That would reduce the burden on the infrastructure and on the administrators to the next level. We will meet with another topic. Until then, Happy Messaging!!


Building a 12-factor principle application with AWS and Microsoft Azure

Photo by Krisztian Tabori on Unsplash

In this article, I want to list the services available on AWS and Microsoft Azure for building applications that follow the 12-factor principles.

12-Factor Principle: AWS Service(s) | Azure Service(s)

  • Codebase (one codebase tracked in revision control, many deploys): AWS CodeCommit | Azure Repos
  • Dependencies (explicitly declare and isolate dependencies): AWS S3 | Azure Artifacts
  • Config (store config in the environment): AWS AppConfig | App Configuration
  • Backing services (treat backing services as attached resources): Amazon RDS, DynamoDB, S3, EFS and Redshift, messaging/queueing (SNS/SQS, Kinesis), SMTP (SES), and caching (ElastiCache) | Azure Cosmos DB, SQL databases, Storage accounts, messaging/queueing (Service Bus/Event Hubs), SMTP services, and caching (Azure Cache for Redis)
  • Build, release, run (strictly separate build and run stages): AWS CodeBuild, AWS CodePipeline | Azure Pipelines
  • Processes (execute the app as one or more stateless processes): Amazon ECS services, Amazon Elastic Kubernetes Service | Container services, Azure Kubernetes Service (AKS)
  • Port binding (export services via port binding): Amazon ECS services, Amazon Elastic Kubernetes Service | Container services, Azure Kubernetes Service (AKS)
  • Concurrency (scale out via the process model): Amazon ECS services, Amazon Elastic Kubernetes Service, Application Auto Scaling | Container services, Azure Kubernetes Service (AKS)
  • Disposability (maximize robustness with fast startup and graceful shutdown): Amazon ECS services, Amazon Elastic Kubernetes Service, Application Auto Scaling | Container services, Azure Kubernetes Service (AKS)
  • Dev/prod parity (keep development, staging, and production as similar as possible): AWS CloudFormation | Azure Resource Manager
  • Logs (treat logs as event streams): Amazon CloudWatch, AWS CloudTrail | Azure Monitor
  • Admin processes (run admin/management tasks as one-off processes): Amazon Simple Workflow Service (SWF) | Logic Apps

Containerizing Spring Boot Applications with Buildpacks

Spring Boot Buildpacks

In this article, we will see how to containerize Spring Boot applications with Buildpacks. In one of the previous articles, I discussed Jib, which lets us build a Docker image for any Java application without a Dockerfile. Starting with Spring Boot 2.3, Buildpacks support is natively added to Spring Boot, so any Spring Boot 2.3+ application can be containerized as a Docker image without a Dockerfile. I will show you how to do that with a sample Spring Boot application by following the steps below.

Step 1: Make sure that you have installed Docker.

Step 2: Create a Spring Boot application using Spring Boot 2.3 or above. Below is the Maven configuration of the application.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.3.0.RELEASE</version>
<relativePath/> <!-- lookup parent from repository -->
</parent>
<groupId>org.smarttechie</groupId>
<artifactId>spingboot-demo-buildpacks</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>spingboot-demo-buildpacks</name>
<description>Demo project for Spring Boot</description>
<properties>
<java.version>1.8</java.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
<exclusions>
<exclusion>
<groupId>org.junit.vintage</groupId>
<artifactId>junit-vintage-engine</artifactId>
</exclusion>
</exclusions>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
<!-- Configuration to push the image to our own Dockerhub repository -->
<configuration>
<image>
<name>docker.io/2013techsmarts/${project.artifactId}:latest</name>
</image>
</configuration>
</plugin>
</plugins>
</build>
</project>

If you want to use Gradle, here is the Spring Boot Gradle plugin.

Step 3: I have added a simple controller to test the application once we run the docker container of our Spring Boot app. Below is the controller code.

package org.smarttechie.controller;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class DemoController {

    @GetMapping
    public String hello() {
        return "Welcome to the Springboot Buildpacks!!. Get rid of Dockerfile hassels.";
    }
}

Step 4: Go to the root folder of the application and run the command below to generate the Docker image. By default, Buildpacks uses the artifact id and version from the pom.xml to name the Docker image; here we override the name in the plugin configuration above so the image can be pushed to our own Docker Hub repository.

./mvnw spring-boot:build-image

Step 5: Let's run the created Docker image and test our REST endpoint.

docker run -d -p 8080:8080 --name springbootcontainer docker.io/2013techsmarts/spingboot-demo-buildpacks:latest

Below is the output of the REST endpoint.

Step 6: Now you can publish the Docker image to Docker Hub using the command below.

docker push docker.io/2013techsmarts/spingboot-demo-buildpacks

Here are some of the references if you want to deep dive into this topic.

  1. Cloud Native Buildpacks Platform Specification.
  2. Buildpacks.io
  3. Spring Boot 2.3.0.RELEASE Maven plugin documentation
  4. Spring Boot 2.3.0.RELEASE Gradle plugin documentation

That’s it. We have created a Spring Boot application as a Docker image with Maven/Gradle configuration. The source code of this article is available on GitHub. We will connect with another topic. Till then, Happy Learning!!


Build Reactive REST APIs with Spring WebFlux – Part3

Photo by Chris Ried on Unsplash

In continuation of the last article, we will see an application that exposes reactive REST APIs. In this application, we used:

  • Spring Boot with WebFlux
  • Spring Data for Cassandra with Reactive Support
  • Cassandra Database

Below is the high-level architecture of the application.


Let us look at the build.gradle file to see what dependencies are included to work with Spring WebFlux.


plugins {
id 'org.springframework.boot' version '2.2.6.RELEASE'
id 'io.spring.dependency-management' version '1.0.9.RELEASE'
id 'java'
}
group = 'org.smarttechie'
version = '0.0.1-SNAPSHOT'
sourceCompatibility = '1.8'
repositories {
mavenCentral()
}
dependencies {
implementation 'org.springframework.boot:spring-boot-starter-data-cassandra-reactive'
implementation 'org.springframework.boot:spring-boot-starter-webflux'
testImplementation('org.springframework.boot:spring-boot-starter-test') {
exclude group: 'org.junit.vintage', module: 'junit-vintage-engine'
}
testImplementation 'io.projectreactor:reactor-test'
}
test {
useJUnitPlatform()
}


In this application, I have exposed the APIs mentioned below. You can download the source code from GitHub.

Endpoint | URI | Response
  • Create a product | POST /product | the created product as Mono
  • All products | GET /products | all products as Flux
  • Delete a product | DELETE /product/{id} | Mono<Void>
  • Update a product | PUT /product/{id} | the updated product as Mono

The product controller code with all the above endpoints is given below.


package org.smarttechie.controller;
import org.smarttechie.model.Product;
import org.smarttechie.repository.ProductRepository;
import org.smarttechie.service.ProductService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
@RestController
public class ProductController {
@Autowired
private ProductService productService;
/**
* This endpoint allows to create a product.
* @param product – to create
* @return – the created product
*/
@PostMapping("/product")
@ResponseStatus(HttpStatus.CREATED)
public Mono<Product> createProduct(@RequestBody Product product){
return productService.save(product);
}
/**
* This endpoint gives all the products
* @return – the list of products available
*/
@GetMapping("/products")
public Flux<Product> getAllProducts(){
return productService.getAllProducts();
}
/**
* This endpoint allows to delete a product
* @param id – to delete
* @return
*/
@DeleteMapping("/product/{id}")
public Mono<Void> deleteProduct(@PathVariable int id){
return productService.deleteProduct(id);
}
/**
* This endpoint allows to update a product
* @param product – to update
* @return – the updated product
*/
@PutMapping("product/{id}")
public Mono<ResponseEntity<Product>> updateProduct(@RequestBody Product product){
return productService.update(product);
}
}
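
The controller above autowires a ProductService, which in turn delegates to a Spring Data reactive repository. The sketch below is a rough outline of what those classes look like, assuming a Product entity keyed by an int id; the actual classes (including the update variants used by the controller and handler, and the real package layout) are in the GitHub repository linked above.

package org.smarttechie.service;

import org.smarttechie.model.Product;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.cassandra.repository.ReactiveCassandraRepository;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

// Reactive repository; Spring Data generates the implementation at runtime.
interface ProductRepository extends ReactiveCassandraRepository<Product, Integer> {
}

@Service
public class ProductService {

    @Autowired
    private ProductRepository productRepository;

    public Mono<Product> save(Product product) {
        return productRepository.save(product);
    }

    public Flux<Product> getAllProducts() {
        return productRepository.findAll();
    }

    public Mono<Void> deleteProduct(int id) {
        return productRepository.deleteById(id);
    }

    // update(...) overloads used by ProductController and ProductHandler are omitted here.
}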

As we are building reactive APIs, we can also build them with a functional-style programming model, without using a RestController. In this case, we need a router and a handler component, as shown below.


package org.smarttechie.router;
import org.smarttechie.handler.ProductHandler;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.MediaType;
import org.springframework.web.reactive.function.server.RouterFunction;
import org.springframework.web.reactive.function.server.RouterFunctions;
import org.springframework.web.reactive.function.server.ServerResponse;
import static org.springframework.web.reactive.function.server.RequestPredicates.*;
@Configuration
public class ProductRouter {
/**
* The router configuration for the product handler.
* @param productHandler
* @return
*/
@Bean
public RouterFunction<ServerResponse> productsRoute(ProductHandler productHandler){
return RouterFunctions
.route(GET("/products").and(accept(MediaType.APPLICATION_JSON))
,productHandler::getAllProducts)
.andRoute(POST("/product").and(accept(MediaType.APPLICATION_JSON))
,productHandler::createProduct)
.andRoute(DELETE("/product/{id}").and(accept(MediaType.APPLICATION_JSON))
,productHandler::deleteProduct)
.andRoute(PUT("/product/{id}").and(accept(MediaType.APPLICATION_JSON))
,productHandler::updateProduct);
}
}



package org.smarttechie.handler;
import org.smarttechie.model.Product;
import org.smarttechie.service.ProductService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.MediaType;
import org.springframework.stereotype.Component;
import org.springframework.web.reactive.function.server.ServerRequest;
import org.springframework.web.reactive.function.server.ServerResponse;
import reactor.core.publisher.Mono;
import static org.springframework.web.reactive.function.BodyInserters.fromObject;
@Component
public class ProductHandler {
@Autowired
private ProductService productService;
static Mono<ServerResponse> notFound = ServerResponse.notFound().build();
/**
* The handler to get all the available products.
* @param serverRequest
* @return – all the products info as part of ServerResponse
*/
public Mono<ServerResponse> getAllProducts(ServerRequest serverRequest) {
return ServerResponse.ok()
.contentType(MediaType.APPLICATION_JSON)
.body(productService.getAllProducts(), Product.class);
}
/**
* The handler to create a product
* @param serverRequest
* @return – return the created product as part of ServerResponse
*/
public Mono<ServerResponse> createProduct(ServerRequest serverRequest) {
Mono<Product> productToSave = serverRequest.bodyToMono(Product.class);
return productToSave.flatMap(product ->
ServerResponse.ok()
.contentType(MediaType.APPLICATION_JSON)
.body(productService.save(product), Product.class));
}
/**
* The handler to delete a product based on the product id.
* @param serverRequest
* @return – return the deleted product as part of ServerResponse
*/
public Mono<ServerResponse> deleteProduct(ServerRequest serverRequest) {
String id = serverRequest.pathVariable("id");
Mono<Void> deleteItem = productService.deleteProduct(Integer.parseInt(id));
return ServerResponse.ok()
.contentType(MediaType.APPLICATION_JSON)
.body(deleteItem, Void.class);
}
/**
* The handler to update a product.
* @param serverRequest
* @return – The updated product as part of ServerResponse
*/
public Mono<ServerResponse> updateProduct(ServerRequest serverRequest) {
return productService.update(serverRequest.bodyToMono(Product.class)).flatMap(product ->
ServerResponse.ok()
.contentType(MediaType.APPLICATION_JSON)
.body(fromObject(product)))
.switchIfEmpty(notFound);
}
}


So far, we have seen how to expose reactive REST APIs. With this implementation, I did a simple benchmark of the reactive APIs versus non-reactive APIs (the non-reactive APIs were built using Spring RestController) with Gatling. Below are the comparison metrics between the reactive and non-reactive APIs. This is not an extensive benchmark, so before adopting this approach, make sure to do extensive benchmarking for your use case.

Load Test Results Comparison (Non-Reactive vs Reactive APIs)

The Gatling load test scripts are also available on GitHub for your reference. With this, I conclude the "Build Reactive REST APIs with Spring WebFlux" series. We will meet on another topic. Till then, Happy Learning!!
