<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Lakshay Kumar]]></title><description><![CDATA[An aspiring data scientist, having worked on languages like Java, C++, Python, now exploring this fascinating world of AI/ML.]]></description><link>https://blogs.lakshaykumar.tech</link><generator>RSS for Node</generator><lastBuildDate>Sat, 18 Apr 2026 11:04:09 GMT</lastBuildDate><atom:link href="https://blogs.lakshaykumar.tech/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Contributing to Open Source in Python]]></title><description><![CDATA[Open-source software development is a collaborative effort where developers from around the world work together to build and improve software that is freely available for anyone to use. Python, a versatile and widely used programming language, has a ...]]></description><link>https://blogs.lakshaykumar.tech/contributing-to-open-source-in-python</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/contributing-to-open-source-in-python</guid><category><![CDATA[Python]]></category><category><![CDATA[Open Source]]></category><category><![CDATA[mediapipe]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Sun, 01 Oct 2023 05:55:57 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/BfrQnKBulYQ/upload/e3c796c76e675bd7a965905a932376f1.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Open-source software development is a collaborative effort where developers from around the world work together to build and improve software that is freely available for anyone to use. 
Python, a versatile and widely used programming language, has a vibrant open-source community that offers numerous opportunities for individuals to contribute. In this blog post, we will explore the world of open-source contributions in Python, including forking repositories on GitHub, setting up a virtual environment, editing code, and creating pull requests.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ol>
<li><p><a class="post-section-overview" href="#introduction">Introduction</a></p>
</li>
<li><p><a class="post-section-overview" href="#understanding-open-source-contributions">Understanding Open Source Contributions</a></p>
</li>
<li><p><a class="post-section-overview" href="#getting-started">Getting Started</a></p>
<ul>
<li><p><a class="post-section-overview" href="#creating-a-github-account">Creating a GitHub Account</a></p>
</li>
<li><p><a class="post-section-overview" href="#forking-a-github-repository">Forking a GitHub Repository</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#setting-up-your-development-environment">Setting Up Your Development Environment</a></p>
<ul>
<li><p><a class="post-section-overview" href="#installing-python">Installing Python</a></p>
</li>
<li><p><a class="post-section-overview" href="#using-virtual-environments">Using Virtual Environments</a></p>
</li>
<li><p><a class="post-section-overview" href="#installing-dependencies">Installing Dependencies</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#making-your-first-contribution">Making Your First Contribution</a></p>
<ul>
<li><p><a class="post-section-overview" href="#editing-code">Editing Code</a></p>
</li>
<li><p><a class="post-section-overview" href="#testing-your-changes">Testing Your Changes</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#sharing-your-contribution">Sharing Your Contribution</a></p>
<ul>
<li><p><a class="post-section-overview" href="#pushing-changes-to-github">Pushing Changes to GitHub</a></p>
</li>
<li><p><a class="post-section-overview" href="#creating-a-pull-request">Creating a Pull Request</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#conclusion">Conclusion</a></p>
</li>
</ol>
<h2 id="heading-1-introduction">1. Introduction</h2>
<p>Contributing to open-source projects is not only a way to give back to the community but also an excellent opportunity to learn, collaborate, and build your skills as a developer. Python, with its extensive library of open-source projects, provides an ideal platform for aspiring contributors.</p>
<h2 id="heading-2-understanding-open-source-contributions">2. Understanding Open Source Contributions</h2>
<h3 id="heading-can-i-contribute-to-open-source-using-python">Can I contribute to open source using Python?</h3>
<p>Absolutely! Python is one of the most popular programming languages for open-source development. Whether you are interested in web development, data science, machine learning, or any other domain, there are numerous Python-based open-source projects awaiting contributions.</p>
<h3 id="heading-what-are-open-source-contributions">What are open-source contributions?</h3>
<p>Open-source contributions refer to actively participating in the development of open-source software. This can include writing code, fixing bugs, creating documentation, providing support to users, or even helping with design and testing.</p>
<h3 id="heading-what-is-open-source-in-python">What is open source in Python?</h3>
<p>Open source in Python refers to Python-based projects or libraries whose source code is publicly available, allowing anyone to view, modify, and contribute to it. These projects often have repositories hosted on platforms like GitHub, where collaboration takes place.</p>
<h3 id="heading-how-do-you-write-an-open-source-contribution">How do you write an open-source contribution?</h3>
<p>Writing an open-source contribution involves several steps, including forking a repository, setting up a development environment, making code changes, testing those changes, pushing them to the repository, and finally creating a pull request (PR) to submit your contribution.</p>
<h2 id="heading-3-getting-started">3. Getting Started</h2>
<h3 id="heading-creating-a-github-account">Creating a GitHub Account</h3>
<p>To contribute to open-source projects, you'll need a GitHub account. GitHub is a popular platform for hosting and collaborating on code repositories. Go to <a target="_blank" href="https://github.com/">GitHub</a> and sign up for an account if you don't already have one.</p>
<h3 id="heading-forking-a-github-repository">Forking a GitHub Repository</h3>
<p>Once you have a GitHub account, you can start contributing by forking a repository. Forking creates a copy of the original repository under your account, allowing you to work on it independently.</p>
<ol>
<li><p>Visit the repository you want to contribute to. In this example, we'll use the <a target="_blank" href="https://github.com/laksh-2193/lkcomputervision">lkcomputervision</a> repository by Laksh Agarwal.</p>
</li>
<li><p>Click the "Fork" button in the top right corner of the repository's page. This action will create a copy of the repository under your GitHub account.</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696139646300/e85dd3a4-4e35-421c-bfaf-ba5574453c00.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-4-setting-up-your-development-environment">4. Setting Up Your Development Environment</h2>
<h3 id="heading-installing-python">Installing Python</h3>
<p>Before you can start contributing to Python open source projects, you need to have Python installed on your computer. You can download the latest version of Python from the <a target="_blank" href="https://www.python.org/downloads/">official Python website</a> and follow the installation instructions for your operating system.</p>
<h3 id="heading-using-virtual-environments">Using Virtual Environments</h3>
<p>It's a best practice to work in a virtual environment when contributing to open source projects. Virtual environments isolate project-specific dependencies and prevent conflicts with other Python projects.</p>
<p>To create a virtual environment, open your terminal or command prompt and run the following commands:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Create a virtual environment (replace 'myenv' with your preferred name)</span>
python -m venv myenv

<span class="hljs-comment"># Activate the virtual environment</span>
<span class="hljs-comment"># On Windows:</span>
myenv\Scripts\activate
<span class="hljs-comment"># On macOS and Linux:</span>
<span class="hljs-built_in">source</span> myenv/bin/activate
</code></pre>
<h3 id="heading-installing-dependencies">Installing Dependencies</h3>
<p>Projects often have specific dependencies required for development. To install these dependencies, navigate to your project's directory and use the following command:</p>
<pre><code class="lang-bash">pip install -r requirements.txt
</code></pre>
<h2 id="heading-5-making-your-first-contribution">5. Making Your First Contribution</h2>
<h3 id="heading-editing-code">Editing Code</h3>
<p>Now that your environment is set up, you can start making code contributions. Open the code files you want to work on using your preferred code editor.</p>
<p>Make sure to follow the project's contribution guidelines and coding style. These guidelines can usually be found in the project's README or <code>CONTRIBUTING.md</code> file.</p>
<p>Edit the code to implement your contribution, fix a bug, or add a new feature. Be sure to document your changes within the code and write clear commit messages to describe what you've done.</p>
<h3 id="heading-testing-your-changes">Testing Your Changes</h3>
<p>Before sharing your contribution, it's crucial to test your changes to ensure they work as intended. Run any tests provided by the project and conduct additional testing if necessary. Make sure your code doesn't introduce new issues or break existing functionality.</p>
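<p>As an illustration of what a project's tests might look like, here is a minimal pytest-style test. The <code>add</code> function is purely hypothetical, standing in for the code you changed; real projects ship their own suites, typically run with <code>pytest</code> or <code>unittest</code>:</p>
<pre><code class="lang-python"># A minimal illustrative test in pytest style. `add` is a hypothetical
# function standing in for the code under test.
def add(a, b):
    """Toy function representing the code you changed."""
    return a + b

def test_add():
    # pytest discovers functions named test_* and runs their assertions
    assert add(2, 3) == 5
    assert add(-1, 1) == 0
</code></pre>
<p>Running the project's existing suite before and after your change is the quickest way to confirm you haven't broken existing functionality.</p>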
<h2 id="heading-6-sharing-your-contribution">6. Sharing Your Contribution</h2>
<h3 id="heading-pushing-changes-to-github">Pushing Changes to GitHub</h3>
<p>Once you are satisfied with your code changes and tests have passed, it's time to push your changes to your forked repository on GitHub. Use the following commands:</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Add your changes to the staging area</span>
git add .

<span class="hljs-comment"># Commit your changes with a descriptive message</span>
git commit -m <span class="hljs-string">"Your commit message here"</span>

<span class="hljs-comment"># Push your changes to your GitHub repository</span>
git push origin master
</code></pre>
<p>Replace <code>master</code> with the branch name you are working on if it's not the default branch.</p>
<h3 id="heading-creating-a-pull-request">Creating a Pull Request</h3>
<p>After pushing your changes, you can create a pull request (PR) to submit your contribution to the original repository. Follow these steps:</p>
<ol>
<li><p>Visit your forked repository on GitHub.</p>
</li>
<li><p>Click the "Pull Request" tab.</p>
</li>
<li><p>Click the "New Pull Request" button.</p>
</li>
<li><p>Select the base repository (the original repository you forked from) and the base branch (usually "master" or "main").</p>
</li>
<li><p>Select your fork as the head repository and your branch with the changes.</p>
</li>
<li><p>Add a title and description for your pull request, explaining the purpose and details of your contribution.</p>
</li>
<li><p>Click "Create Pull Request."</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1696139687646/6b4dc1d7-d4b4-4072-b62f-dcd09486e44d.png" alt class="image--center mx-auto" /></p>
<p>Your pull request is now submitted for review by the project maintainers. Be patient, as it may take some time for them to review your code and provide feedback.</p>
<h2 id="heading-7-conclusion">7. Conclusion</h2>
<p>Contributing to open source in Python is a rewarding journey that allows you to collaborate with the global developer community, improve your coding skills, and give back to the projects you rely on. In this blog post, we've covered the essential steps to get started, from forking repositories on GitHub to creating pull requests. Remember to always follow the project's guidelines, communicate with maintainers and other contributors, and enjoy the process of making meaningful contributions to open-source software.</p>
<p>Start your open-source journey today, and let your Python skills shine in the world of collaborative development!</p>
]]></content:encoded></item><item><title><![CDATA[Activation Functions in Deep Learning]]></title><description><![CDATA[In the vast realm of deep learning, where intricate neural networks simulate the complexities of human cognition, lies a pivotal element that empowers these models to learn and make predictions with astonishing accuracy: activation functions. Often o...]]></description><link>https://blogs.lakshaykumar.tech/activation-functions-in-deep-learning</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/activation-functions-in-deep-learning</guid><category><![CDATA[Deep Learning]]></category><category><![CDATA[Python]]></category><category><![CDATA[Machine Learning]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Thu, 11 May 2023 07:29:22 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/BQLw0OrA6F4/upload/835932f71565ef9895eb944537e3bb7d.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the vast realm of deep learning, where intricate neural networks simulate the complexities of human cognition, lies a pivotal element that empowers these models to learn and make predictions with astonishing accuracy: activation functions. Often overshadowed by the glitz and glamour of advanced architectures and massive datasets, these unassuming mathematical formulas play a critical role in shaping the behavior and expressiveness of neural networks.</p>
<p>Imagine a world where neural networks are merely passive observers, incapable of distinguishing signal from noise, unable to unlock the hidden patterns buried within vast amounts of data. Activation functions hold the key to breathing life into these networks, empowering them to transform inputs into meaningful outputs, and facilitating the non-linearity required to tackle real-world challenges.</p>
<p>In this blog, we embark on a journey to explore the inner workings of activation functions in deep learning. We'll dive into their significance, understand the diverse range of functions available, and uncover the impact they have on model performance. By the end of this exploration, you'll gain a deeper understanding of how these seemingly inconspicuous formulas shape the very essence of deep learning and unleash its true potential.</p>
<h1 id="heading-activation-functions">Activation Functions</h1>
<p>Activation functions are mathematical functions applied to the outputs of individual neurons or nodes within a neural network. They introduce non-linearity into the network, enabling it to learn and model complex relationships between inputs and outputs.</p>
<p>Activation functions play a vital role in determining the activation levels or "firing" of neurons. They define the output values based on the weighted sum of inputs received by a neuron and introduce non-linear transformations that allow neural networks to approximate highly nonlinear functions.</p>
<p>Activation functions serve two primary purposes:</p>
<ol>
<li><p>Introducing non-linearity: Linear operations, such as simple weighted sums, are limited in their ability to model complex relationships. Activation functions provide the non-linear element required for neural networks to learn and represent intricate patterns and mappings in the data.</p>
</li>
<li><p>Enabling decision-making: Activation functions determine the level of activation or firing of a neuron, thereby deciding whether the information it carries is relevant or not. This enables the network to make decisions and classify inputs into different categories.</p>
</li>
</ol>
<p>There are various types of activation functions commonly used in deep learning, each with its own characteristics and suitability for different tasks. Examples include the sigmoid function, hyperbolic tangent (tanh), rectified linear unit (ReLU), and variants like leaky ReLU and exponential linear unit (ELU).</p>
<p>Choosing an appropriate activation function is crucial as it impacts the network's ability to learn, convergence speed, and generalization capabilities. Deep learning researchers and practitioners often experiment with different activation functions to find the most suitable one for a specific task or architecture.</p>
<h2 id="heading-sigmoid-function">Sigmoid function:</h2>
<p>The sigmoid function is a mathematical function commonly used in machine learning and neural networks. It maps any real-valued number to a value between 0 and 1, providing a smooth and continuous "S"-shaped curve. The function is defined as:</p>
<p>$$\sigma(x) = \frac{1}{1 + e^{-x}}$$</p><p>In this equation, "e" represents the base of the natural logarithm, and "x" is the input value. The sigmoid function has a range between 0 and 1, with an output of approximately 0.5 when the input is zero. As the input becomes more positive, the output approaches 1, and as the input becomes more negative, the output approaches 0.</p>
<p>The sigmoid function is particularly useful in machine learning because it allows us to convert an input into a probability. The output of the sigmoid function can be interpreted as the probability of a binary event occurring, where values close to 1 indicate a high probability, and values close to 0 indicate a low probability. Additionally, the smooth and differentiable nature of the sigmoid function facilitates efficient training of neural networks using gradient-based optimization algorithms.</p>
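<p>As a quick sketch (assuming NumPy is installed), the sigmoid and its limiting behaviour can be computed directly:</p>
<pre><code class="lang-python">import numpy as np

def sigmoid(x):
    """Map any real input smoothly into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))                    # exactly 0.5 at the midpoint
print(sigmoid(np.array([-5.0, 5.0])))  # tails approach 0 and 1
</code></pre>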
<h2 id="heading-hyperbolic-tangent-tanh-function">Hyperbolic Tangent (tanh) Function:</h2>
<p>The hyperbolic tangent function, denoted as tanh, is a mathematical function commonly used in various fields, including mathematics, physics, and machine learning. As an activation function, it maps real numbers to a range between -1 and 1. It is the hyperbolic counterpart of the ordinary tangent function and is defined directly in terms of exponentials.</p>
<p>Mathematically, the tanh function can be expressed as:</p>
<p>$$\tanh(x) = \frac{{e^x - e^{-x}}}{{e^x + e^{-x}}}$$</p><p>Here, 'e' represents Euler's number, a mathematical constant approximately equal to 2.71828, and 'x' is the input value.</p>
<p>The tanh function has several key properties. Firstly, it is an odd function, meaning tanh(-x) = -tanh(x), which results in symmetry around the origin. Additionally, the tanh function is bounded, with its range limited between -1 and 1. As 'x' approaches positive or negative infinity, tanh(x) approaches 1 and -1, respectively.</p>
<p>As an activation function, tanh introduces non-linearity into the network, allowing it to model complex relationships between input and output. It is advantageous over the sigmoid function in that it produces values centred around zero, which can aid in faster convergence during training.</p>
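<p>A minimal NumPy sketch of these properties:</p>
<pre><code class="lang-python">import numpy as np

x = np.array([-2.0, 0.0, 2.0])
y = np.tanh(x)   # np.tanh computes (e^x - e^-x) / (e^x + e^-x)
print(y)         # odd symmetry: tanh(-2) == -tanh(2), values stay in (-1, 1)
</code></pre>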
<h2 id="heading-relu-function">ReLU Function:</h2>
<p>The Rectified Linear Unit (ReLU) is a widely used activation function in artificial neural networks. It introduces non-linearity by outputting the input value if it is positive, and zero otherwise. The ReLU function is defined as:</p>
<p>$$f(x) = \max(0, x)$$</p><p>Mathematically, this means that for any input value 'x', the ReLU function returns 'x' if it is positive, and zero otherwise. This simple thresholding behavior allows the ReLU function to model complex relationships between the inputs and outputs of a neural network.</p>
<p>The ReLU activation function offers several advantages. Firstly, it is computationally efficient to evaluate compared to other activation functions like the sigmoid or tanh. The ReLU function also helps to mitigate the vanishing gradient problem, which can occur during backpropagation in deep neural networks. Additionally, ReLU activations produce sparse representations, as many neurons can be activated while others remain inactive.</p>
<p>However, one limitation of ReLU is that it can cause dead neurons when the input is negative, leading to a zero gradient and no learning. To address this, various modifications have been proposed, such as Leaky ReLU, Parametric ReLU, and the Exponential Linear Unit (ELU), which introduce small non-zero responses for negative inputs to alleviate the dead neuron issue.</p>
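<p>The thresholding behaviour described above can be sketched in a few lines of NumPy:</p>
<pre><code class="lang-python">import numpy as np

def relu(x):
    """Element-wise max(0, x): pass positives through, zero out negatives."""
    return np.maximum(0.0, x)

x = np.array([-3.0, -1.0, 0.0, 2.0])
print(relu(x))   # negative inputs are clipped to zero, giving sparse activations
</code></pre>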
<h2 id="heading-leaky-relu">Leaky ReLU:</h2>
<p>Leaky ReLU (Rectified Linear Unit) is an activation function commonly used in artificial neural networks. It is an improvement over the standard ReLU function, addressing one of its limitations. In the Leaky ReLU function, for inputs less than zero, instead of simply outputting zero as in ReLU, a small linear component is introduced. This linear component helps overcome the "dying ReLU" problem where neurons that have negative inputs during training become inactive, resulting in dead neurons that do not contribute to learning.</p>
<p>The mathematical representation of the Leaky ReLU function is as follows:</p>
<p>$$f(x) = \max(ax, x)$$</p><p>where x is the input to the function, and a is a small positive constant known as the leakage coefficient, typically set to a small value such as 0.01. If x is positive, the output is equal to x. If x is negative, the output is equal to <code>ax</code>, a small fraction of x.</p>
<p>By introducing this small linear component, the Leaky ReLU function ensures that even neurons with negative inputs can contribute to the network's learning process. This helps prevent the dying ReLU problem and improves the overall performance and robustness of the neural network.</p>
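<p>A small NumPy sketch of Leaky ReLU with the leakage coefficient a = 0.01:</p>
<pre><code class="lang-python">import numpy as np

def leaky_relu(x, a=0.01):
    """max(a*x, x): identity for positive x, a small slope a for negative x."""
    return np.maximum(a * x, x)

x = np.array([-100.0, -1.0, 0.0, 5.0])
print(leaky_relu(x))   # negative inputs keep a small, non-zero response
</code></pre>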
<h2 id="heading-exponential-linear-unit-elu">Exponential Linear Unit (ELU):</h2>
<p>The Exponential Linear Unit (ELU) is an activation function commonly used in deep learning models. It addresses the limitations of other activation functions, such as the vanishing gradient problem, and helps improve the performance of neural networks.</p>
<p>ELU is defined by a piecewise function that consists of two parts: the identity (linear) region for non-negative inputs and the exponential region for negative inputs. Mathematically, ELU is expressed as follows:</p>
<p>$$f(x) = \begin{cases} x, &amp; \text{if } x \geq 0 \\ \alpha \cdot (\exp(x) - 1), &amp; \text{if } x &lt; 0 \end{cases}$$</p><h2 id="heading-softmax-function">Softmax Function:</h2>
<p>The softmax function is commonly used as an activation function in machine learning and neural networks. It is particularly useful in multi-class classification problems, where the goal is to assign an input to one of several possible categories. The softmax function takes a vector of real-valued numbers as input and transforms them into a probability distribution over the classes.</p>
<p>Mathematically, given an input vector z = [z₁, z₂, ..., zn], the softmax function computes the exponential of each element, yielding exp(z₁), exp(z₂), ..., exp(zn). These values are then normalized by dividing each element by the sum of all exponentiated values, exp(z₁) + exp(z₂) + ... + exp(zn). The result is a vector of values between 0 and 1 that add up to 1, representing probabilities.</p>
<p>The softmax function's mathematical formula can be expressed as follows:</p>
<p>$$\operatorname{softmax}(z_i) = \frac{\exp(z_i)}{\sum_{j=1}^n \exp(z_j)}, \qquad i=1,2,\dots,n$$</p><p>In vector form, the same operation can be written as:</p>
<p>$$\operatorname{softmax}(\mathbf{z}) = \frac{\exp(\mathbf{z})}{\sum_{j=1}^n \exp(z_j)}, \qquad \mathbf{z}=[z_1, z_2, \dots, z_n]^\top$$</p><p>By applying the softmax function, the largest value in the input vector is amplified, while smaller values are suppressed. This makes the output vector suitable for interpreting as class probabilities, allowing us to choose the class with the highest probability as the predicted class during classification tasks.</p>
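<p>The normalization described above can be sketched in NumPy. Subtracting the maximum before exponentiating is a standard numerical-stability trick, not part of the formula itself; it leaves the result unchanged while preventing overflow:</p>
<pre><code class="lang-python">import numpy as np

def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

p = softmax(np.array([2.0, 1.0, 0.1]))
print(p)             # a probability distribution: non-negative, sums to 1
print(p.argmax())    # index of the most likely class
</code></pre>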
<h2 id="heading-conclusion">Conclusion</h2>
<p>In conclusion, we have explored several popular activation functions used in machine learning and neural networks. Each activation function offers unique properties and advantages that can impact the performance and behavior of the model.</p>
<p>The sigmoid function, with its S-shaped curve, is commonly used in binary classification problems. It squashes the input into a range between 0 and 1, representing probabilities. However, it suffers from the vanishing gradient problem, which can hinder training in deep networks.</p>
<p>The hyperbolic tangent (tanh) function also squashes the input between -1 and 1, but it is symmetric around the origin. It addresses the vanishing gradient problem to some extent and is often used in recurrent neural networks (RNNs) and certain types of architectures.</p>
<p>The rectified linear unit (ReLU) function is widely popular due to its simplicity and effectiveness. It sets negative inputs to zero, providing faster convergence during training. However, ReLU suffers from the dying ReLU problem, where neurons can get stuck in a state of inactivity.</p>
<p>Leaky ReLU and Parametric ReLU (PReLU) are variants of ReLU that address the dying ReLU problem by introducing a small slope for negative inputs. This helps prevent the complete "death" of neurons and improves the performance of deep networks.</p>
<p>The softmax function is commonly used in multi-class classification problems. It transforms a vector of real values into a probability distribution over classes, allowing us to select the class with the highest probability as the predicted class.</p>
<p>Choosing the appropriate activation function depends on the specific problem, network architecture, and desired behaviour. Experimentation and understanding the characteristics of each activation function are crucial for achieving optimal performance in machine learning tasks.</p>
<p>That's all for this blog, For any queries, feel free to write in the comments or reach out to me over different social media platforms. Know more at <a target="_blank" href="http://lakshaykumar.tech"><strong>lakshaykumar.tech</strong></a></p>
<p>Happy Learning!</p>
]]></content:encoded></item><item><title><![CDATA[Generative Artificial Intelligence]]></title><description><![CDATA[Artificial Intelligence (AI) has been one of the most revolutionary advancements in the field of technology in recent years. The ability of machines to perform complex tasks previously only possible for humans has transformed the way we live, work, a...]]></description><link>https://blogs.lakshaykumar.tech/generative-artificial-intelligence</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/generative-artificial-intelligence</guid><category><![CDATA[generative ai]]></category><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[Machine Learning]]></category><category><![CDATA[Deep Learning]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Mon, 08 May 2023 07:08:49 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/zbLW0FG8XU8/upload/848a1428a5e43b556adf39d59a2acbfd.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Artificial Intelligence (AI) has been one of the most revolutionary advancements in the field of technology in recent years. The ability of machines to perform complex tasks previously only possible for humans has transformed the way we live, work, and interact with each other. One of the latest and most exciting advancements in the field of AI is Generative AI. This technology allows machines to create new and unique content such as images, videos, and even entire stories without human intervention. In this blog post, we will explore the maths behind Generative AI. The images in this blog are taken from 'youtube'.</p>
<h3 id="heading-what-is-this">What is this?</h3>
<p>Generative AI is a rapidly evolving field of Artificial Intelligence that is focused on developing algorithms capable of generating new content autonomously. This is achieved through the use of generative models that are trained on a dataset of existing content. The models then use this data to generate new examples that are similar in style, structure, and content to the original dataset.</p>
<p>There are various types of generative models used in AI, such as the Naive Bayes Classifier, Gaussian Mixture Model, and Generative Adversarial Networks (GANs). These models are designed to create new examples that mimic the features of the training data by learning patterns, relationships, and distributions within the dataset.</p>
<h3 id="heading-gans">GANs</h3>
<p>GAN stands for Generative Adversarial Network. GANs are deep-learning-based generative models used for unsupervised learning, in which two neural networks compete with each other to generate variations in the data.</p>
<p>GANs were first introduced by Ian Goodfellow and colleagues in 2014. A widely used convolutional variant, DCGAN (Deep Convolutional Generative Adversarial Networks), was published by Alec Radford and co-authors in 2016.</p>
<p>It uses two sub-models</p>
<ul>
<li><p><strong>Generator</strong>: The generator network takes random noise as input and generates a sample of data; this fake data serves as the negative examples for training.</p>
</li>
<li><p><strong>Discriminator</strong>: The discriminator network decides whether a sample was generated or drawn from the real data, performing binary classification with a sigmoid output between 0 (fake) and 1 (real).</p>
</li>
</ul>
<p><strong>How does GAN work</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1683528266486/de16a7fb-c607-4e6c-a4c4-2a9a9bb3e68e.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>First, the GAN model is trained on a dataset of real samples. This dataset serves as a reference for the generator network to generate new samples that resemble the real data.</p>
</li>
<li><p>During the training process, the generator network takes random noise as input and generates a sample. This generated sample is then passed on to the discriminator network.</p>
</li>
<li><p>The discriminator network is designed to differentiate between real and fake samples. It takes both the real and generated samples as input and provides an output that indicates whether the sample is real or fake.</p>
</li>
<li><p>The generator network then adjusts its parameters based on the feedback from the discriminator network. If the generated sample is classified as fake, the generator network modifies its parameters to generate a sample that is more similar to the real data. This process continues until the generator network can generate samples that are indistinguishable from the real data.</p>
</li>
<li><p>The discriminator network is also simultaneously updated during the training process. Its goal is to accurately distinguish between real and fake samples. The updates to the discriminator network are based on the accuracy of its classification of the generated samples.</p>
</li>
<li><p>This process of training and updating both the generator and discriminator networks continues until the generator network can generate samples that are virtually indistinguishable from the real data.</p>
</li>
</ul>
<h3 id="heading-training-a-gan-model">Training a GAN Model</h3>
<ol>
<li><p>Train the discriminator and freeze the generator: the generator's trainable flag is set to False, so the generator network only performs the forward pass and no back-propagation is applied to it.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1683528587041/2ef45293-afd4-4f28-a476-22792b8257d4.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Train the generator and freeze the discriminator. In this phase, the generator uses the feedback from the first phase to update its weights so that its samples fool the discriminator more effectively than before.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1683528651173/a1896b3f-642c-4921-96db-173d023fb45f.png" alt class="image--center mx-auto" /></p>
</li>
</ol>
<h3 id="heading-mathematical-formulation-of-gan">Mathematical Formulation of GAN</h3>
<p>$$\min_G \max_D V(D,G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$</p><p>The function is the objective function of a generative adversarial network (GAN) and is used to train the generator and discriminator networks in the GAN.</p>
<p>The objective of the GAN is to generate synthetic data that closely resembles the real data. The function represents a minimax game between the generator network (G) and the discriminator network (D), where G tries to minimize the function and D tries to maximize it.</p>
<p>The function takes as input two probability distributions: the distribution of the real data samples (p_data(x)) and the distribution of the noise vector (p_z(z)). It has two terms, each representing an expected value.</p>
<p>The first term represents the expected value of the logarithm of the discriminator's output when fed real data samples. The discriminator network tries to maximize this term by correctly classifying real data samples as real.</p>
<p>The second term represents the expected value of the logarithm of 1 minus the discriminator's output when fed generated samples from the generator network. The generator network tries to minimize this term by generating samples that the discriminator network classifies as real.</p>
<p>By minimizing this function, the generator network learns to generate synthetic data that is similar to the real data, while the discriminator network learns to distinguish between real and fake samples. The training process continues until the generator network can generate synthetic data that is indistinguishable from the real data.</p>
<p><strong>For Fake Data</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1683529323855/67f0b00c-28ce-4790-a407-7f85e9d68cb5.png" alt class="image--center mx-auto" /></p>
<p><strong>For Real Data</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1683529345572/3851bed2-e422-40e2-85dc-c3638df9a5b5.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1683529368878/27504060-36a8-4314-873d-798558694311.png" alt class="image--center mx-auto" /></p>
<p>Here an unsupervised problem (unlabelled data) is converted into a supervised one, with "real" and "fake" acting as the labels, through the adversarial framework.</p>
<ul>
<li><p>E(x~p_data(x))[log D(x)] - the discriminator's prediction on real data: the expected log-output of the discriminator when the input is drawn from the real data distribution. The discriminator wants D(x) to be high here.</p>
</li>
<li><p>E(z~p_z(z))[log(1 - D(G(z)))] - the discriminator's prediction on fake data: the expected log-output when the input is noise z passed through the generator. The discriminator wants D(G(z)) to be low, while the generator wants it to be high.</p>
</li>
</ul>
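<p>As a rough illustrative sketch (not from the original formulation), the two expectation terms above can be estimated for a batch of discriminator outputs, assuming <code>d_real</code> and <code>d_fake</code> hold the discriminator's probabilities for real and generated samples:</p>
<pre><code class="lang-python">import math

def gan_value(d_real, d_fake):
    """V(D, G): mean log D(x) over real samples plus mean log(1 - D(G(z))) over fakes."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A confident discriminator (high D(x) on real data, low D(G(z)) on fakes)
# scores higher than one the generator has managed to fool.
good_d = gan_value([0.9, 0.95], [0.05, 0.1])
fooled_d = gan_value([0.6, 0.55], [0.5, 0.45])
print(good_d > fooled_d)  # True
</code></pre>
<p>Updates to the discriminator push this value up, while updates to the generator push it down - exactly the minimax game in the formula.</p>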
<h3 id="heading-types-of-gan">Types of GAN</h3>
<p>There are several types of generative adversarial networks (GANs) that have been developed, each with its own unique architecture and training procedure. Here are some of the most common types of GANs:</p>
<ol>
<li><p>Vanilla GANs: These are the standard GANs that were introduced by Ian Goodfellow in 2014. They consist of a generator network that generates fake samples and a discriminator network that distinguishes between real and fake samples.</p>
</li>
<li><p>Conditional GANs: These are GANs that are conditioned on additional information, such as class labels or images. The additional information is typically fed into both the generator and discriminator networks to help them generate and distinguish between samples that belong to different categories.</p>
</li>
<li><p>Deep Convolutional GANs (DCGANs): These are GANs that use deep convolutional neural networks (CNNs) as both the generator and discriminator networks. They are particularly effective for generating high-resolution images.</p>
</li>
<li><p>Wasserstein GANs (WGANs): These are GANs that use a different loss function based on the Wasserstein distance between the real and fake data distributions. This loss function is more stable than the one used in vanilla GANs and can lead to better results.</p>
</li>
<li><p>CycleGANs: These are GANs that learn to translate between two different domains, such as turning a photo into a painting. They consist of two GANs, one for each domain, that are trained in an adversarial manner to generate samples that can be translated between the two domains.</p>
</li>
<li><p>Progressive GANs: These are GANs that generate high-resolution images by gradually increasing the resolution of the generated images during training. They are able to generate images that are much higher in resolution than other types of GANs.</p>
</li>
</ol>
<p>These are just some of the many types of GANs that have been developed. Each type has its own advantages and disadvantages, and the choice of which type to use depends on the specific application and the desired outcome.</p>
<h3 id="heading-conclusion">Conclusion</h3>
<p>To sum up, generative AI has the potential to revolutionize many fields by enabling machines to create new data and generate novel solutions to complex problems. The development of generative models such as GANs, VAEs, and autoregressive models has opened up new avenues for creativity, innovation, and problem-solving. From generating art, music, and literature to synthesizing new materials, drugs, and molecules, generative AI is unlocking new possibilities in many domains. While there are still many challenges to be overcome, such as improving the stability and diversity of generated samples, the future of generative AI looks bright, and we can expect to see many exciting advances in this field in the years to come.</p>
<p>That's all for this blog, For any queries, feel free to write in the comments or reach out to me over different social media platforms. Know more at <a target="_blank" href="https://www.lakshaykumar.tech/"><strong>https://www.lakshaykumar.tech/</strong></a></p>
]]></content:encoded></item><item><title><![CDATA[Edge computing and Tensorflow Lite]]></title><description><![CDATA[We all know that ML requires a lot of computing speed and even cloud services are required on large scale. This blog covers details about how to integrate ML and run TensorFlow on a microcontroller that will help in saving a lot of costs. This blog e...]]></description><link>https://blogs.lakshaykumar.tech/edge-computing-and-tensorflow-lite</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/edge-computing-and-tensorflow-lite</guid><category><![CDATA[Machine Learning]]></category><category><![CDATA[TensorFlow]]></category><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[Python]]></category><category><![CDATA[arduino]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Wed, 01 Feb 2023 15:25:49 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1675264837855/0450d4cf-e5fa-473d-9a71-71e3a7d32460.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We all know that ML requires a lot of computing power, and at a large scale even cloud services are needed. This blog covers how to run TensorFlow models on a microcontroller, which can save a lot of costs. It emphasises that ML doesn't need high computing power if we learn to use TinyML effectively.</p>
<p>So, all this started with a problem: I wanted to control the environmental conditions of a plant using Machine Learning, but surprisingly the microcontroller, an ESP32, has just 512 KB of space, and my model itself was 1.23 MB. So the question is, how can I use that model on my ESP32 board?</p>
<p>After some research, I got to know about <strong>Edge Computing which</strong> <strong>refers to the process of performing data processing and storage at the edge of a network, rather than in a centralized location such as a data centre. This allows for faster and more efficient data processing, as well as reduced latency and bandwidth usage.</strong></p>
<p>For example, Self-driving cars rely on edge computing to process sensor data in real-time, allowing them to make quick decisions and navigate safely.</p>
<p>Talking about the advantages of Edge computing, here are a few -</p>
<ol>
<li><p>Reduced Latency: Edge computing minimizes the distance data must travel, reducing latency and enabling real-time decision-making.</p>
</li>
<li><p>Improved Performance: Edge computing can handle large amounts of data generated by IoT devices and other sources, improving the overall performance of the system.</p>
</li>
<li><p>Increased Security: Edge computing can improve security by reducing the amount of sensitive data that needs to be transmitted over the network.</p>
</li>
<li><p>Cost Savings: Edge computing has the potential to reduce costs and increase efficiency by reducing the amount of data that needs to be transmitted and stored in the cloud.</p>
</li>
<li><p>Offline Operation: Edge computing enables devices to continue to operate even when disconnected from the network, providing continuity of operations in remote or offline locations.</p>
</li>
<li><p>Increased Reliability: Edge computing can provide a more reliable solution by reducing the risk of data loss due to network outages or other disruptions.</p>
</li>
<li><p>Scalability: Edge computing can be scaled easily as the number of devices and data sources grows, without requiring significant changes to the central infrastructure.</p>
</li>
<li><p>Increased Flexibility: Edge computing allows for the deployment of machine learning models and other advanced technologies at the edge, enabling more flexible and innovative solutions.</p>
</li>
</ol>
<p>Delving deeper into the problem, another concept that can help me reach the solution is TinyML. It is a <strong>subset of the broader field of machine learning (ML)</strong> that focuses on implementing ML algorithms on small, low-power devices such as microcontrollers, sensors, and embedded systems.</p>
<p>So, the options I considered for a solution:</p>
<ul>
<li><p><strong>Deploy the model to the cloud:</strong> This requires an internet connection that might not always be available to the ESP32.</p>
</li>
<li><p><strong>Use the model directly:</strong> Again, the 1.23 MB model won't fit in 512 KB of space.</p>
</li>
</ul>
<h3 id="heading-tflite">Tflite</h3>
<p>TensorFlow Lite (TFLite) is a <strong>lightweight version of TensorFlow</strong>, an open-source machine learning framework developed by Google. It is specifically designed to run on resource-constrained devices such as mobile phones, embedded systems, and microcontrollers.</p>
<p>Refer to this colab notebook- <a target="_blank" href="https://colab.research.google.com/drive/1P3KrhuXFPjKb8Ub9lLcnSek2KZH1xQ4k?usp=sharing">https://colab.research.google.com/drive/1P3KrhuXFPjKb8Ub9lLcnSek2KZH1xQ4k?usp=sharing</a></p>
<p>This colab shows how to reduce the size of our model.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1675263082306/f8966bd8-a62f-4af0-b377-efa034801e5f.png" alt class="image--center mx-auto" /></p>
<p>The repository uses the process of <strong>Quantization</strong></p>
<p>Quantization is the process of converting a continuous, high-precision representation of a signal or data into a lower-precision, discrete form. The goal of quantization is to reduce the memory and computational requirements of a machine-learning model, without significantly affecting its accuracy. This is achieved by reducing the number of bits used to represent the data, thus reducing the number of possible values that can be represented. Quantization can be applied to weights, activations, and gradients in a neural network, and is a common technique used in hardware acceleration and deployment of deep learning models on embedded devices with limited resources.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1675263334632/2a1b1638-3fae-47e6-bad9-3d97a4c851a6.png" alt /></p>
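<p>To see what quantization does to the numbers themselves, here is a toy sketch of affine int8 quantization in pure Python - an illustration of the idea, not TFLite's actual implementation:</p>
<pre><code class="lang-python">def quantize_int8(weights):
    """Map float weights onto the 256 integer levels of int8 via a scale and zero point."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255.0
    zero_point = int(round(-128 - lo / scale))
    q = [max(-128, min(127, int(round(w / scale)) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the int8 representation."""
    return [(v - zero_point) * scale for v in q]

w = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, zp = quantize_int8(w)
w_hat = dequantize(q, scale, zp)
# Each recovered weight is within one quantization step of the original,
# while each value now needs 8 bits of storage instead of 32.
print(all(abs(a - b) <= scale for a, b in zip(w, w_hat)))  # True
</code></pre>
<p>This 4x size reduction (float32 to int8) is the main reason the quantized model fits in the ESP32's limited storage, at the cost of a small rounding error per weight.</p>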
<h3 id="heading-model-accuracy"><strong>Model Accuracy</strong></h3>
<p>There was a slight change in the accuracy of the model.</p>
<p>The goal here is to show how to compress the model, so we won't focus much on the accuracy part.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1675263389830/10172034-8c02-44bf-ae96-86bfac86e918.png" alt /></p>
<p>Now that we have a model, how can we use it in Arduino code? `model.tflite` is the file generated from the Python code.</p>
<pre><code class="lang-cpp"><span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;TensorFlowLite.h&gt;</span></span>

<span class="hljs-comment">// Load the model into a TensorFlow Lite interpreter</span>
auto model = tflite::FlatBufferModel::BuildFromFile(<span class="hljs-string">"model.tflite"</span>);
tflite::ops::builtin::BuiltinOpResolver resolver;
<span class="hljs-function">tflite::InterpreterBuilder <span class="hljs-title">builder</span><span class="hljs-params">(*model, resolver)</span></span>;
<span class="hljs-built_in">std</span>::<span class="hljs-built_in">unique_ptr</span>&lt;tflite::Interpreter&gt; interpreter;
builder(&amp;interpreter);
interpreter-&gt;AllocateTensors();

<span class="hljs-comment">// Get pointers to the input and output tensors</span>
TfLiteTensor* input = interpreter-&gt;input_tensor(<span class="hljs-number">0</span>);
TfLiteTensor* output = interpreter-&gt;output_tensor(<span class="hljs-number">0</span>);

<span class="hljs-comment">// Create a buffer to hold the input data</span>
<span class="hljs-keyword">float</span> input_data[<span class="hljs-number">3</span>] = { <span class="hljs-number">45</span>,<span class="hljs-number">40</span>,<span class="hljs-number">8</span> };

<span class="hljs-comment">// Fill the input tensor with data</span>
<span class="hljs-built_in">memcpy</span>(input-&gt;data.f, input_data, <span class="hljs-keyword">sizeof</span>(input_data));

<span class="hljs-comment">// Run the model</span>
interpreter-&gt;Invoke();

<span class="hljs-comment">// Get the results from the output tensor</span>
<span class="hljs-keyword">float</span>* result = output-&gt;data.f;

<span class="hljs-keyword">for</span> (<span class="hljs-keyword">int</span> i = <span class="hljs-number">0</span>; i &lt; <span class="hljs-number">3</span>; i++) {
    <span class="hljs-keyword">int</span> prediction = round(result[i]);
    Serial.print(<span class="hljs-string">"Prediction for input "</span>);
    Serial.print(i);
    Serial.print(<span class="hljs-string">" is "</span>);
    Serial.println(prediction);
}
</code></pre>
<p>To process the image, use the following piece of code. Make sure you have a camera connected to the ESP32. This code converts the image to a matrix form that can be used as an input for your model.</p>
<pre><code class="lang-cpp"><span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;Adafruit_GFX.h&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;Adafruit_SSD1306.h&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;Wire.h&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;SPI.h&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-meta-keyword">include</span> <span class="hljs-meta-string">&lt;Adafruit_VC0706.h&gt;</span></span>

Adafruit_VC0706 cam = Adafruit_VC0706(&amp;Serial1);

<span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">setup</span><span class="hljs-params">()</span> </span>{
  Serial.begin(<span class="hljs-number">115200</span>);

  <span class="hljs-keyword">if</span> (!cam.begin()) {
    Serial.println(<span class="hljs-string">"Couldn't find camera"</span>);
    <span class="hljs-keyword">while</span> (<span class="hljs-number">1</span>);
  }

  <span class="hljs-comment">// Take a photo</span>
  <span class="hljs-keyword">if</span> (!cam.takePicture()) {
    Serial.println(<span class="hljs-string">"Failed to take picture"</span>);
    <span class="hljs-keyword">while</span> (<span class="hljs-number">1</span>);
  }

  <span class="hljs-comment">// Read the image data</span>
  <span class="hljs-keyword">uint8_t</span> *image;
  <span class="hljs-keyword">uint16_t</span> jpglen = cam.frameLength();
  image = <span class="hljs-keyword">new</span> <span class="hljs-keyword">uint8_t</span>[jpglen];
  <span class="hljs-keyword">if</span> (!cam.readPicture(image, jpglen)) {
    Serial.println(<span class="hljs-string">"Failed to read picture"</span>);
    <span class="hljs-keyword">while</span> (<span class="hljs-number">1</span>);
  }

  <span class="hljs-comment">// Convert the image buffer to a matrix. Note: the camera returns</span>
  <span class="hljs-comment">// JPEG-compressed bytes, so decode them to raw pixels first if your</span>
  <span class="hljs-comment">// model expects pixel intensities.</span>
  <span class="hljs-keyword">int</span> rows = cam.getSize().height;
  <span class="hljs-keyword">int</span> cols = cam.getSize().width;
  <span class="hljs-keyword">uint8_t</span> matrix[rows][cols];
  <span class="hljs-keyword">int</span> index = <span class="hljs-number">0</span>;
  <span class="hljs-keyword">for</span> (<span class="hljs-keyword">int</span> i = <span class="hljs-number">0</span>; i &lt; rows; i++) {
    <span class="hljs-keyword">for</span> (<span class="hljs-keyword">int</span> j = <span class="hljs-number">0</span>; j &lt; cols; j++) {
      matrix[i][j] = image[index];
      index++;
    }
  }

  <span class="hljs-comment">// Use the matrix as desired</span>
  <span class="hljs-comment">// ...</span>

  <span class="hljs-keyword">delete</span>[] image;
}

<span class="hljs-function"><span class="hljs-keyword">void</span> <span class="hljs-title">loop</span><span class="hljs-params">()</span> </span>{
  <span class="hljs-comment">// nothing to do here</span>
}
</code></pre>
<h3 id="heading-challenges">Challenges</h3>
<ul>
<li><p><strong>Finding working example code:</strong> A lot of trial and error is needed to find code that actually compiles and runs</p>
</li>
<li><p><strong>Model compatibility with the TFLite library version in Arduino:</strong> Since TensorFlow Lite for microcontrollers is still under active development, version compatibility is a major problem</p>
</li>
<li><p><strong>Handling tensors in C++ is complex:</strong> Working with tensors and matrices in C++ is more challenging than in Python, since C++ is a lower-level language without Python's array abstractions.</p>
</li>
</ul>
<p>In conclusion, TFLite and Edge computing are revolutionizing the way we approach machine learning, making it possible to deploy complex models on low-power devices. This opens up exciting new possibilities for creating intelligent applications that can run on a wide range of devices, from smartphones and wearables to industrial equipment and IoT devices. The future of TFLite and Edge computing is bright, and it will be exciting to see how developers leverage this technology to build the next generation of smart and connected devices. Whether you're a seasoned machine learning practitioner or just starting to explore this field, the benefits of TFLite and Edge computing are undeniable. Get ready to witness a world of limitless possibilities, where devices are no longer just smart, but truly intelligent.</p>
<p>For any queries, feel free to write in the comments or reach out to me over different social media platforms. Know more at <a target="_blank" href="https://www.lakshaykumar.tech/">https://www.lakshaykumar.tech/</a></p>
]]></content:encoded></item><item><title><![CDATA[Machine Learning Simplified!]]></title><description><![CDATA[We have been hearing about machine learning for a long time. IBM defines Machine Learning as a "Branch of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, graduall...]]></description><link>https://blogs.lakshaykumar.tech/machine-learning-simplified</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/machine-learning-simplified</guid><category><![CDATA[Machine Learning]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Tue, 17 Jan 2023 04:49:45 GMT</pubDate><content:encoded><![CDATA[<p><img src="https://i.ytimg.com/vi/Rt6beTKDtqY/maxresdefault.jpg" alt="The Mathematics of Machine Learning - YouTube" /></p>
<p>We have been hearing about machine learning for a long time. IBM defines Machine Learning as a "Branch of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy"</p>
<p>We categorize ML as Supervised, Unsupervised and Reinforcement Learning. All these types have many Algorithms under them. Every algorithm has its own maths behind its work.</p>
<p>I have been exploring this world of applied mathematics for the last two years, starting as an absolute beginner. From the concepts of regression to algorithms like SVM and decision trees, and on to advanced algorithms such as random forests, there is rigorous mathematics involved. But there is one very basic equation underlying most of these algorithms, i.e.</p>
<p>$$y = mx + c$$</p>
<p>You might have studied this in high school under the topic <em>Equation of a Straight Line</em>. It is the most basic form of the equation of a line. In this blog, we'll see how this equation is modified in different algorithms and how they work to provide accurate output.</p>
<p>x - Independent Variable</p>
<p>y - Dependent Variable</p>
<p><strong>Simple Linear Regression -</strong></p>
<p>This is the type of regression that models how two variables (x and y) change together. We have points on the XY plane, and we fit a line through them such that the total distance between the points and the line is minimised. Here is an example.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1673928943579/0aa44db1-a721-4257-9f10-560df49b4aec.png" alt class="image--center mx-auto" /></p>
<p>We see that y increases linearly with x, so we can fit a line like this and find the values of <strong>m</strong> and <strong>c</strong>, the slope and intercept respectively.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1673929077324/2b70300e-c2f4-4417-9050-845624fabed5.png" alt class="image--center mx-auto" /></p>
<p>This is a very simple example that can predict the value of y from the value of x.</p>
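<p>Finding <strong>m</strong> and <strong>c</strong> is a least-squares problem, and the closed-form solution is short enough to sketch in plain Python (an illustrative sketch with made-up points):</p>
<pre><code class="lang-python">def fit_line(xs, ys):
    """Least-squares estimates of slope m and intercept c for y = mx + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    c = mean_y - m * mean_x
    return m, c

# Points that lie exactly on y = 2x + 1
m, c = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(m, c)  # 2.0 1.0
</code></pre>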
<p><strong>Multiple Linear Regression -</strong></p>
<p>The simple example above showed how to find a relationship between two variables. But what if, instead of a single independent variable x, the output depends on multiple variables? In that case, we get a messier graph like this</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1673929727905/4762bf68-ccf7-4055-9212-69aa72350afb.png" alt class="image--center mx-auto" /></p>
<p>We see three variables here, with three different lines fitted. Now we have multiple x variables and an intercept, and the equation looks like this.</p>
<p>$$y = x_{1}m_{1} + x_{2}m_{2} + x_{3}m_{3} + c$$</p>
<p>Note here that for every independent variable, we have a different slope. We can generalize the equation here as-</p>
<p>$$y = \sum_{i=1}^{n}x_{i}m_{i} + c$$</p>
<p><strong>Neural Networks</strong></p>
<p>You might be surprised, but a neural network also works on this slope-intercept equation. The twist is that we have to work with many such equations at different stages of the network.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1673930701178/e278a535-38ce-4b26-8fbe-e8aa7217127d.png" alt class="image--center mx-auto" /></p>
<p>We see that, at the end, each neuron has an equation very similar to the one above in Multiple Linear Regression. In a complex neural network we have multiple weights and biases, and these weighted sums continue layer by layer until we reach the output layer.</p>
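<p>The computation of a single neuron can be sketched in a few lines of Python (an illustration with made-up weights):</p>
<pre><code class="lang-python">def neuron(inputs, weights, bias):
    """One artificial neuron: the multiple-regression equation
    y = sum(x_i * m_i) + c, with the weights as slopes and the bias as c."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

print(neuron([1.0, 2.0, 3.0], [0.5, -1.0, 2.0], 0.25))  # 4.75
</code></pre>
<p>A full layer simply evaluates many such equations in parallel, and stacking layers feeds one set of outputs into the next set of weighted sums.</p>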
<p>Here I have discussed three major use cases of this equation. But there are numerous use cases. For any queries, feel free to comment or contact me over email. Don't forget to subscribe on my website - <a target="_blank" href="http://lakshaykumar.tech"><strong>lakshaykumar.tech</strong></a></p>
]]></content:encoded></item><item><title><![CDATA[Let numbers speak!]]></title><description><![CDATA[Machine Learning and Data Science are all about playing with numbers. In this blog, I'll be covering some basic mathematical functions/terms that can help you in data analysis.
Logarithm
Imagine you have to visualize the following table
CompanyRevenu...]]></description><link>https://blogs.lakshaykumar.tech/let-numbers-speak</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/let-numbers-speak</guid><category><![CDATA[Data Science]]></category><category><![CDATA[Mathematics]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Thu, 12 Jan 2023 14:58:40 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/Wpnoqo2plFA/upload/563f9755dbb90887344828da305e3d42.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Machine Learning and Data Science are all about playing with numbers. In this blog, I'll be covering some basic mathematical functions/terms that can help you in data analysis.</p>
<h2 id="heading-logarithm">Logarithm</h2>
<p>Imagine you have to visualize the following table</p>
<table><tbody><tr><td><p><strong>Company</strong></p></td><td><p><strong>Revenue</strong></p></td></tr><tr><td><p>Tesla</p></td><td><p>31</p></td></tr><tr><td><p>Uber</p></td><td><p>11</p></td></tr><tr><td><p>Amazon</p></td><td><p>386</p></td></tr><tr><td><p>Jindal Steel</p></td><td><p>4.7</p></td></tr><tr><td><p>Axis Bank</p></td><td><p>5.6</p></td></tr><tr><td><p>Vedanta</p></td><td><p>11.3</p></td></tr></tbody></table>

<p>Plotting this through a bar plot, this is what we'll get</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1673078863835/e3334100-59aa-4b70-8b71-742828e31fad.png" alt class="image--center mx-auto" /></p>
<p>We can see that Amazon has such high revenue that it makes it difficult to read the bars for Jindal Steel and Axis Bank. So what do we do now? Here comes the concept of the <em>log</em>. The log is the inverse of an exponent. In the given equation -</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1673079143486/85c92d27-cb49-4812-9147-4ee7cadb50f2.png" alt class="image--center mx-auto" /></p>
<p>this means that <strong><em>b raised to the power c gives us a</em></strong>, where b is the base. The most common base used is 10. So if I add a column for log base 10 to the above data, this is what we'll get</p>
<table><tbody><tr><td><p><strong>Company</strong></p></td><td><p><strong>Revenue</strong></p></td><td><p><strong>Log (base 10)</strong></p></td></tr><tr><td><p>Tesla</p></td><td><p>31</p></td><td><p>1.491361694</p></td></tr><tr><td><p>Uber</p></td><td><p>11</p></td><td><p>1.041392685</p></td></tr><tr><td><p>Amazon</p></td><td><p>386</p></td><td><p>2.586587305</p></td></tr><tr><td><p>Jindal Steel</p></td><td><p>4.7</p></td><td><p>0.6720978579</p></td></tr><tr><td><p>Axis Bank</p></td><td><p>5.6</p></td><td><p>0.748188027</p></td></tr><tr><td><p>Vedanta</p></td><td><p>11.3</p></td><td><p>1.053078443</p></td></tr></tbody></table>

<p>We can see that the numbers in the log column are much more comparable, and in its plot we can distinguish the companies' revenues far more clearly than before.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1673079612999/f4b4ed6d-0345-4cc4-a886-7d2db9e356fb.png" alt class="image--center mx-auto" /></p>
<p>On a log scale, each step of one corresponds to multiplying by the base: for example, with base 5, log<sub>5</sub>125 = 3 is one more than log<sub>5</sub>25 = 2, because 125 is 5 times 25.</p>
<p>A practical example is earthquakes: the Richter scale is logarithmic with base 10, so a magnitude-5 earthquake has ten times the amplitude of a magnitude-4 one.</p>
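<p>The same compression of large numbers can be done in a few lines of Python, using the revenue figures from the table above:</p>
<pre><code class="lang-python">import math

revenues = {"Tesla": 31, "Uber": 11, "Amazon": 386,
            "Jindal Steel": 4.7, "Axis Bank": 5.6, "Vedanta": 11.3}
logs = {name: math.log10(rev) for name, rev in revenues.items()}
# The raw values span two orders of magnitude, but the logs all fall
# between 0 and 3, which is what makes the second bar plot readable.
print(round(logs["Amazon"], 2))        # 2.59
print(round(logs["Jindal Steel"], 2))  # 0.67
</code></pre>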
<h2 id="heading-mean-and-deviations">Mean and Deviations</h2>
<p>Consider the following data on salaries for employees -</p>
<table><tbody><tr><td><p><strong>Employee Name</strong></p></td><td><p><strong>Salary (in USD)</strong></p></td></tr><tr><td><p>Jack</p></td><td><p>1678.3</p></td></tr><tr><td><p>Alan</p></td><td><p>2242.9</p></td></tr><tr><td><p>Kelvin</p></td><td><p>998.8</p></td></tr><tr><td><p>Jake</p></td><td><p>1938.6</p></td></tr><tr><td><p>Elly</p></td><td><p>1200.4</p></td></tr></tbody></table>

<p>We can say that the average salary at the company is 1611.8 USD, though there is already a noticeable gap between Kelvin's and Alan's salaries relative to that average. Now imagine a new employee, Addie, with a salary of 25000 USD: the new mean salary becomes 5509.83 USD, but that figure is misleading, since everyone except one person earns less than 2500 USD. So we calculate each deviation from the mean -</p>
<table><tbody><tr><td><p><strong>Employee Name</strong></p></td><td><p><strong>Salary (in USD)</strong></p></td><td><p><strong>Deviation from Mean</strong></p></td></tr><tr><td><p>Jack</p></td><td><p>1678.3</p></td><td><p>3831.533333</p></td></tr><tr><td><p>Alan</p></td><td><p>2242.9</p></td><td><p>3266.933333</p></td></tr><tr><td><p>Kelvin</p></td><td><p>998.8</p></td><td><p>4511.033333</p></td></tr><tr><td><p>Jake</p></td><td><p>1938.6</p></td><td><p>3571.233333</p></td></tr><tr><td><p>Elly</p></td><td><p>1200.4</p></td><td><p>4309.433333</p></td></tr><tr><td><p>Addie</p></td><td><p>25000</p></td><td><p>-19490.16667</p></td></tr></tbody></table>

<p>We see there is a lot of deviation from the mean salary. Now, how do we sum this up in a single number?</p>
<p>A plain average of the deviations won't work, because the positive and negative values cancel each other out. Instead, we square the deviations, add them up, divide by the number of samples, and then take the square root. This is called the <strong>Standard Deviation</strong>.</p>
<table><tbody><tr><td><p><strong>Employee Name</strong></p></td><td><p><strong>Salary (in USD)</strong></p></td><td><p><strong>Deviation from Mean</strong></p></td><td><p><strong>Deviation Sq.</strong></p></td></tr><tr><td><p>Jack</p></td><td><p>1678.3</p></td><td><p>3831.533333</p></td><td><p>14680647.68</p></td></tr><tr><td><p>Alan</p></td><td><p>2242.9</p></td><td><p>3266.933333</p></td><td><p>10672853.4</p></td></tr><tr><td><p>Kelvin</p></td><td><p>998.8</p></td><td><p>4511.033333</p></td><td><p>20349421.73</p></td></tr><tr><td><p>Jake</p></td><td><p>1938.6</p></td><td><p>3571.233333</p></td><td><p>12753707.52</p></td></tr><tr><td><p>Elly</p></td><td><p>1200.4</p></td><td><p>4309.433333</p></td><td><p>18571215.65</p></td></tr><tr><td><p>Addie</p></td><td><p>25000</p></td><td><p>-19490.16667</p></td><td><p>379866596.7</p></td></tr><tr><td><p>Standard Deviation</p></td><td><p></p></td><td><p></p></td><td><p><strong>8726.343666</strong></p></td></tr></tbody></table>

<p>The standard deviation of about 8726 USD captures, in a single number, just how widely the salaries are spread around the mean.</p>
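<p>The whole calculation fits in a few lines of Python:</p>
<pre><code class="lang-python">def std_dev(values):
    """Population standard deviation: square the deviations from the mean,
    average them, then take the square root."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return variance ** 0.5

salaries = [1678.3, 2242.9, 998.8, 1938.6, 1200.4, 25000]
print(round(std_dev(salaries), 2))  # 8726.34
</code></pre>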
]]></content:encoded></item><item><title><![CDATA[Dev Retro 2022]]></title><description><![CDATA[The year 2015, when I got my independent computer. It was an HP desktop with 2GB ram, 128GB storage and a Pentium processor PC. At that time I wasn't aware much of these configurations, so I kept on downloading as many software, IDEs, languages etc. ...]]></description><link>https://blogs.lakshaykumar.tech/dev-retro-2022</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/dev-retro-2022</guid><category><![CDATA[#DevRetro2022]]></category><category><![CDATA[Web Development]]></category><category><![CDATA[JavaScript]]></category><category><![CDATA[React]]></category><category><![CDATA[Programming Blogs]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Tue, 13 Dec 2022 09:21:52 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1670512470408/CX9IDzXrr.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The year 2015 was when I got my own computer: an HP desktop with 2 GB of RAM, 128 GB of storage and a Pentium processor. At that time I wasn't very aware of these configurations, so I kept downloading all sorts of software, IDEs and languages onto that system. Not only that, I installed many games too. Due to a lack of proper guidance on how to build a career in computing, I kept doing whatever came to my mind.</p>
<p>Then one day, during my computer class, we were given a demo on how to build a basic webpage using the HTML programming language in the class. Initially, I was planning to bunk the class, but thankfully I didn't. That day made me curious about how a very basic tool like notepad can be so helpful in generating results like a basic webpage. I was so fascinated by this simple output that made me explore more on my own. From a basic website, I completed all the chapters related to HTML on my own from the school textbook (lack of internet access). I'll say, that the demo class was the major reason why I liked coding. Slowly during my vacations, I completed the major coding chapters on my own like BASIC (Visual Basic) and C++.</p>
<p>I developed a strong interest in coding till then. I was able to solve questions using C++ and explore many tech stuff on my own. After grade 8, I continued with Java for the next four years along with Python simultaneously. Towards the end of my class 12, I created several apps and brought laurels from various reputed organisations.</p>
<p><strong>Neorky :</strong> App for giving short-term employment to people. (IIT Roorkee App Innovation Challenge, 2nd Runner up).</p>
<p><strong>AntiBully :</strong> Anti-bullying app (IIT Guwhati Alcheringa Code for Change, 2nd Runner Up).</p>
<p><strong>covid19helps:</strong> A system with ML Symptom-based predictor, important medications for covid-19 infected patients and emergency services. Helped a lot during the covid-19 crisis. (Times NIE, App Innovation Challenge).</p>
<p>More of my projects are here - <a target="_blank" href="https://lakshaykumar.tech/#sec5">https://lakshaykumar.tech/#sec5</a></p>
<p>After completing High School, I started my Undergraduate at India's First Liberal Science University - Atria University in Bangalore.</p>
<p>Bangalore as a Tech City (Silicon Valley of India) gave me numerous opportunities to grow. During the initial days of my college, I was more focussed on building solutions and industry-level projects. After working on several projects individually, I realised the importance of teamwork with the community.</p>
<p>On April 5 2022, I attended my first networking tech event - Github Tech Tab. I made many useful connections here that helped me later in several projects. Following this, on April 28, at Microsoft Reactor, it was the first tech conference I attended. Later I kept on attending tech conferences like Global Azure day, Dev Nation, etc.</p>
<p>During my college hours, I was working on the integration of Computer Vision with Drones, this gave me many real insights and a rough idea of the solutions to new-age problems. This was one of my most successful projects, I built autonomous surveillance drones.</p>
<p>Later on August 6, 2022. I was invited for the first time to give a talk at Microsoft Reactor and here my journey as a Community speaker started. Then I was more involved in the community space. I engaged myself in giving talks, helping members of the discord community, writing blogs, building packages for the developer community and hosting competitions/hackathons.</p>
<p>This year, I also got the opportunity to demonstrate some of my projects in front of the Finance Minister of India ~ Nirmala Sitaraman and the Chief Minister of Karnataka State ~ Basvaraj Bommai.</p>
<p>Overall, I'll say that this year 2022 was full of opportunities for me in terms of personal development and growth.</p>
<p>Here are some of my learnings throughout my journey as a developer :</p>
<ul>
<li><p>Initial guidance is very important for a developer. It gives you the kick to start learning new things.</p>
</li>
<li><p>Mentors are important to guide you, but initiation to learn new things has to grow from the inner side.</p>
</li>
<li><p>Never work because of opportunities, they will come at the right time.</p>
</li>
<li><p>The best way to learn something is to build on your own instead of solely following a series of theoretical videos or documentation.</p>
</li>
</ul>
<p>I am always looking for new people in my team to work on any project, if you wish to collaborate, feel free to contact me. Details are available on my <a target="_blank" href="https://www.lakshaykumar.tech/">website</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Publish your own Python Package]]></title><description><![CDATA[Hello folks, 
You might have seen some of my packages in the previous blog
While working on some project / solving any programming problem, have you developed a new algorithm or utility that can help developers in a more efficient way?
So this blog i...]]></description><link>https://blogs.lakshaykumar.tech/publish-your-own-python-package</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/publish-your-own-python-package</guid><category><![CDATA[Python]]></category><category><![CDATA[python beginner]]></category><category><![CDATA[python libraries]]></category><category><![CDATA[Python 3]]></category><category><![CDATA[python projects]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Thu, 03 Nov 2022 17:36:52 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1667496956214/7h3X1D_YG.jpg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello folks, 
You might have seen some of my packages in the previous <a target="_blank" href="https://blogs.lakshaykumar.tech/computer-vision-made-easy">blog</a>
While working on some project / solving any programming problem, have you developed a new algorithm or utility that can help developers in a more efficient way?
So this blog is for you! This blog will discuss <strong>How can you publish your own python package?</strong>
So Let's get started.</p>
<p>There are few terms you should be familiar with -</p>
<ul>
<li><a target="_blank" href="https://www.geeksforgeeks.org/python-classes-and-objects/"><code>class</code></a> : Blueprint of object. Object is simply a collection of variables and functions.</li>
<li><a target="_blank" href="https://www.geeksforgeeks.org/self-in-python-class/"><code>self</code></a> : Python keyword, a parameter refers to the current instance of the class.</li>
<li><a target="_blank" href="https://www.geeksforgeeks.org/__init__-in-python/"><code>__init__()</code></a> : A function that runs by default whenever a class is executed.</li>
<li><a target="_blank" href="https://docs.python.org/3/reference/import.html"><code>import</code></a> : Keyword to include package in our program</li>
</ul>
<p>The links mentioned along with the keywords will help you know more, I have added a summary for your reference above.</p>
<p>In this blog, we will <strong>create a simple package that can add numbers from any data structure</strong>. I'll be discussing the following ways of doing it.</p>
<ul>
<li>Direct Function Implementation</li>
<li>Class Usage</li>
<li>Through Function of Class</li>
<li>Through the external class of package</li>
</ul>
<p>We are going to publish our package on <a target="_blank" href="https://pypi.org/">pypi.org</a> - A popular collection of all the libraries. First, create an account here and save the login credentials.</p>
<p>Now let's set up a project. Make sure your project name matches the name of the package you intend to publish. I'll be using the different package name for each way I mentioned above.</p>
<h3 id="heading-direct-function-implementation">Direct Function Implementation</h3>
<p>Project Name: <strong>addition_by_direct_function_implementation</strong>
Inside this project, create one more subfolder with the same name and one python file <code>main.py</code>. Inside the subfolder create one python file <code>__init__.py</code></p>
<p><strong>Explanation</strong>: We will test our code in <code>main.py</code>, <code>__init__.py</code> is the python script that will run by default when we call our package from the project folder.</p>
<p>Your file structure should look something like this.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1667487588133/oJSIJ8HR7.png" alt="image.png" /></p>
<p><strong><code>__init__.py</code></strong>
Here we have simply defined a function <code>addnumbers</code> with the numbers as parameter. The function returns the sum. Here's the code</p>
<pre><code>def addnumbers(numbersdatastructure):
    add=<span class="hljs-number">0</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> numbersdatastructure:
        add+=i
    <span class="hljs-keyword">return</span> add
</code></pre><p>Now our function-based package is ready, let's test on our local system.
<code>main.py</code></p>
<pre><code><span class="hljs-keyword">from</span> addition_by_direct_function_implementation <span class="hljs-keyword">import</span> addnumbers

sum = addnumbers([<span class="hljs-number">1</span>,<span class="hljs-number">2</span>,<span class="hljs-number">3</span>,<span class="hljs-number">4</span>,<span class="hljs-number">5</span>,<span class="hljs-number">6</span>,<span class="hljs-number">7</span>,<span class="hljs-number">8</span>,<span class="hljs-number">9</span>])

print(sum)
</code></pre><p>Since we have a subfolder with the name <strong>addition_by_direct_function_implementation</strong> its calling that folder and by default <code>__init__.py</code> gets called, its referring that code. Hence its calling function <code>addnumbers()</code> from <code>__init__.py</code> of <code>addition_by_direct_function_implementation</code> sub folder.</p>
<h3 id="heading-class-usage">Class Usage</h3>
<p>Project Name: <strong>addition_by_class_usage</strong>
Inside this project, create one more subfolder with the same name and one python file <code>main.py</code>. Inside the subfolder create one python file <code>__init__.py</code></p>
<p><strong>Explanation</strong>: We will test our code in <code>main.py</code>, <code>__init__.py</code> is the python script that will run by default when we call our package from the project folder.</p>
<p>Your file structure should look something like this.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1667488558059/mQ6x9s1mI.png" alt="image.png" /></p>
<p><strong><code>__init__.py</code></strong>: Writing a class and within that class, we are creating 2 functions for multiplying and adding the numbers in the data structure.</p>
<pre><code><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">addition</span>():

    <span class="hljs-title">def</span> <span class="hljs-title">addthenumbers</span>(<span class="hljs-title">self</span>,<span class="hljs-title">numbersdatastructure</span>):
        <span class="hljs-title">self</span>.<span class="hljs-title">numbersdatastructure</span> </span>= numbersdatastructure
        add=<span class="hljs-number">0</span>
        <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> self.numbersdatastructure:
            add+=i
        <span class="hljs-keyword">return</span> add

    def multiplythenumbers(self,numbersdatastructure):
        self.numbersdatastructure = numbersdatastructure
        mul=<span class="hljs-number">1</span>
        <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> self.numbersdatastructure:
            mul*=i
        <span class="hljs-keyword">return</span> mul
</code></pre><p>Here the self keyword is referring to the variable of current function.</p>
<p>Now our function-based package is ready, let's test on our local system.
<code>main.py</code></p>
<pre><code><span class="hljs-keyword">import</span> addition_by_class_usage <span class="hljs-keyword">as</span> acu
objectOfClass = acu.addition()
addition = objectOfClass.addthenumbers([<span class="hljs-number">1</span>,<span class="hljs-number">2</span>,<span class="hljs-number">3</span>,<span class="hljs-number">4</span>,<span class="hljs-number">5</span>])
multiply = objectOfClass.multiplythenumbers([<span class="hljs-number">1</span>,<span class="hljs-number">2</span>,<span class="hljs-number">3</span>,<span class="hljs-number">4</span>,<span class="hljs-number">5</span>])
print(addition,<span class="hljs-string">"\n"</span>,multiply)
</code></pre><p>First I am importing the packing (here package name is project folder name with <code>__init__.py</code> and to make its usage easy in the code, I'll call it as <code>acu</code>.
Then I am creating object of class <code>acu</code> and then calling then simply functions</p>
<h3 id="heading-through-function-of-class">Through Function of Class</h3>
<p>Project Name: <strong>functions_in_class</strong>
Inside this project, create one more subfolder with the same name and one python file <code>main.py</code>. Inside the subfolder create one python file <code>__init__.py</code></p>
<p><strong>Explanation</strong>: We will test our code in <code>main.py</code>, <code>__init__.py</code> is the python script that will run by default when we call our package from the project folder.</p>
<p>Your file structure should look something like this.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1667493012116/e-Y9E1WQh.png" alt="image.png" /></p>
<p><code>__init__.py</code></p>
<pre><code>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">addition</span>():

    <span class="hljs-title">def</span> <span class="hljs-title">addthenumbers</span>(<span class="hljs-title">self</span>,<span class="hljs-title">numbersdatastructure</span>):
        <span class="hljs-title">self</span>.<span class="hljs-title">numbersdatastructure</span> </span>= numbersdatastructure
        add=<span class="hljs-number">0</span>
        <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> self.numbersdatastructure:
            add+=i
        <span class="hljs-keyword">return</span> add

    def multiplythenumbers(self,numbersdatastructure):
        self.numbersdatastructure = numbersdatastructure
        mul=<span class="hljs-number">1</span>
        <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> self.numbersdatastructure:
            mul*=i
        <span class="hljs-keyword">return</span> mul


def main():
    print(<span class="hljs-string">"Main function called"</span>)


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    main()
</code></pre><p>Here we have defined additional function <code>main()</code> and in that function I am just printing a statement.</p>
<p><code>main.py</code></p>
<pre><code><span class="hljs-keyword">import</span> functions_in_class <span class="hljs-keyword">as</span> acu
<span class="hljs-keyword">from</span> functions_in_class <span class="hljs-keyword">import</span> main

objectOfClass = acu.addition()

addition = objectOfClass.addthenumbers([<span class="hljs-number">1</span>,<span class="hljs-number">2</span>,<span class="hljs-number">3</span>,<span class="hljs-number">4</span>,<span class="hljs-number">5</span>])
multiply = objectOfClass.multiplythenumbers([<span class="hljs-number">1</span>,<span class="hljs-number">2</span>,<span class="hljs-number">3</span>,<span class="hljs-number">4</span>,<span class="hljs-number">5</span>])
print(addition,<span class="hljs-string">"\n"</span>,multiply)
main()
</code></pre><p>Now we are calling the function <code>main()</code> from the class <code>functions_in_class</code>. Here it will ignore the class and directly call the function.</p>
<h3 id="heading-through-the-external-class-of-package">Through the external class of package</h3>
<p>Here the import syntax is just like direct function implementation, but instead of function, we are calling a python file that have the code.
For example </p>
<pre><code><span class="hljs-keyword">from</span> packagename <span class="hljs-keyword">import</span> classname
</code></pre><p>Here packagename is the name of folder and classname is another python file in the folder other than <code>__init__.py</code></p>
<p>For any queries, feel free to ask in comments or contact me over <a target="_blank" href="mailto:contact@lakshaykumar.tech">email</a></p>
<h3 id="heading-publishing-python-package">Publishing Python Package</h3>
<p>So now, we are ready with our package, lets publish. We'll be publishing on test pypi.org because this is only for testing purposes.</p>
<p><strong>Create</strong> Liscence.txt file : Refrence : <a target="_blank" href="https://choosealicense.com/">https://choosealicense.com/</a><br />
<strong>Create</strong> README.md : Reference : <a target="_blank" href="https://dillinger.io/">https://dillinger.io/</a>
<br />
<strong>Create setup.py</strong> and add the following code. Replace with your own details
Reference : geekforgeeks.org</p>
<pre><code><span class="hljs-keyword">import</span> setuptools

<span class="hljs-keyword">with</span> open(<span class="hljs-string">"README.md"</span>, <span class="hljs-string">"r"</span>) <span class="hljs-keyword">as</span> fh:
    long_description = fh.read()

setuptools.setup(
    # Here is the <span class="hljs-built_in">module</span> name.
    name=<span class="hljs-string">"addition_by_class_usage"</span>,

    # version <span class="hljs-keyword">of</span> the <span class="hljs-built_in">module</span>
    version=<span class="hljs-string">"0.0.1"</span>,

    # Name <span class="hljs-keyword">of</span> Author
    author=<span class="hljs-string">"Lakshay Kumar"</span>,

    # your Email address
    author_email=<span class="hljs-string">"contact@lakshaykumar.tech"</span>,

    # #Small Description about <span class="hljs-built_in">module</span>
    # description=<span class="hljs-string">"adding number"</span>,

    # long_description=long_description,

    # Specifying that we are using markdown file <span class="hljs-keyword">for</span> description
    long_description=long_description,
    long_description_content_type=<span class="hljs-string">"text/markdown"</span>,




    # <span class="hljs-keyword">if</span> <span class="hljs-built_in">module</span> has dependencies i.e. if your package rely on other package at pypi.org
    # then you must add there, <span class="hljs-keyword">in</span> order to download every requirement <span class="hljs-keyword">of</span> package
     install_requires=[ ],
#Write packages that needs to be preinstalled <span class="hljs-keyword">for</span> using <span class="hljs-built_in">this</span> package.


    license=<span class="hljs-string">"MIT"</span>,

    # classifiers like program is suitable <span class="hljs-keyword">for</span> python3, just leave <span class="hljs-keyword">as</span> it is.
    classifiers=[
        <span class="hljs-string">"Programming Language :: Python :: 3"</span>,
        <span class="hljs-string">"License :: OSI Approved :: MIT License"</span>,
        <span class="hljs-string">"Operating System :: OS Independent"</span>,
    ],
)
</code></pre><p>Your folder structure should look like this.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1667495962959/SMPzQLij8.png" alt="image.png" /></p>
<p>Now to install a package called twine using this line -</p>
<pre><code>pip install twine
</code></pre><p>Then</p>
<pre><code>python setup.py bdist_wheel
</code></pre><p>You'll see more folders like this</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1667496056540/etZViihMb.png" alt="image.png" /></p>
<p>Upload to test env.</p>
<pre><code>twine upload -r testpypi dist<span class="hljs-comment">/*</span>
</code></pre><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1667496153718/TtM-GjccW.png" alt="image.png" /></p>
<p>Woohoo!
You finally uploaded it to the test environment. To upload in production environment, use this command</p>
<pre><code>twine upload dist<span class="hljs-comment">/*</span>
</code></pre><p>You'll see your project link at the bottom</p>
<h3 id="heading-test">Test</h3>
<p>To test, go to project link and copy the pip command. In my case its - <br />
<code>pip install -i https://test.pypi.org/simple/ addition-by-class-usage</code></p>
<p>And install in your project.</p>
<h3 id="heading-update">Update</h3>
<p>To update make changes in your code and <strong>It is important to make changes in the version number</strong>. Then you can use the build and upload again to make changes.</p>
<p>That's it. For any queries, feel free to comment or contact me over email.
Don't forget to subscribe on my website - https://www.lakshaykumar.tech/</p>
]]></content:encoded></item><item><title><![CDATA[Computer Vision On Web]]></title><description><![CDATA[Have you ever wondered about running your computer vision program on your website? Maybe you are working with Django or flask to create a web app and now you want to deploy it on your website.
Well, it's straightforward to do it. You just need to hav...]]></description><link>https://blogs.lakshaykumar.tech/computer-vision-on-web</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/computer-vision-on-web</guid><category><![CDATA[Machine Learning]]></category><category><![CDATA[Computer Science]]></category><category><![CDATA[Data Science]]></category><category><![CDATA[Python]]></category><category><![CDATA[Flask Framework]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Mon, 17 Oct 2022 13:07:22 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1666011918946/XHux0aTPy.jpg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Have you ever wondered about running your computer vision program on your website? Maybe you are working with Django or flask to create a web app and now you want to deploy it on your website.
Well, it's straightforward to do it. You just need to have a basic understanding of flask or Django. Don't worry, even if you don't have one, we will build a simple <strong>hand-tracking application using flask</strong>. So let's get started.</p>
<h3 id="heading-project-setup">Project setup</h3>
<p>I am using <a target="_blank" href="https://www.jetbrains.com/pycharm/download/">Pycharm Community Edition</a> here, personally my favourite for all my Python Projects. You can use any IDE for the same.</p>
<ol>
<li>Open Pycharm and create a project <code>flaskWebApp</code> (You can use any name here).</li>
<li>After the project indexing is done, you will see one folder <code>venv</code> and <code>main.py</code> with some code written in it.</li>
<li>Delete all the code in <code>main.py</code> and rename it to <code>app.py</code> (This is the basic flask project naming convention, let's follow this only).</li>
<li>Create two more folders <code>templates</code> and <code>static</code> in the same project <code>flaskWebApp</code>.</li>
<li>Finally we have set up a basic flask web app structure here. Your project should look something like this.</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1666008724584/jg3gjyqrc.png" alt="image.png" /></p>
<p><strong>Explanation</strong><br />
<code>static</code>: This folder will have external static files like your CSS, images, JavaScript files  <br />
<code>templates</code>: This folder contains your main HTML files where they have to be changed.<br />
<code>app.py</code>: Python file to generate templates (dynamic html files in templates folder).</p>
<h3 id="heading-install-packages">Install packages</h3>
<p>We need some of the packages to be installed in order to get the desired output. Here are the packages you need <a target="_blank" href="https://pypi.org/project/Flask/"><code>flask</code></a> (web development framework with python), <a target="_blank" href="https://pypi.org/project/opencv-python/"><code>opencv-python</code></a> (Computer vision library for python), <a target="_blank" href="https://pypi.org/project/lkhandmapping/"><code>lkhandmapping</code></a> (For tracking hand without writing tones of code). Use the syntax given below in the Pycharm terminal.</p>
<pre><code>pip install flask
pip install opencv-python
pip install lkhandmapping
</code></pre><p>###Let's Code</p>
<ol>
<li>Create an <code>index.html</code> file in <code>templates</code> folder and write the following code.</li>
</ol>
<pre><code>&lt;!DOCTYPE html&gt;
&lt;html lang="en"&gt;
&lt;head&gt;
    &lt;meta charset="UTF-8"&gt;
&lt;/head&gt;
&lt;body&gt;
&lt;div class="container"&gt;
    &lt;div class="row"&gt;
        &lt;div class="col-md-12"&gt;
            &lt;img src="{{ url_for('video_feed') }}" width="70%" height="600px"&gt;
        &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;
&lt;/body&gt;
&lt;/html&gt;
</code></pre><ol>
<li>Open <code>app.py</code> and add the following code</li>
</ol>
<pre><code><span class="hljs-keyword">from</span> flask <span class="hljs-keyword">import</span> Flask, render_template, Response

app = Flask(__name__)


@app.route(<span class="hljs-string">'/'</span>)
def index():
    <span class="hljs-keyword">return</span> render_template(<span class="hljs-string">'index.html'</span>)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    app.run(debug=True)
</code></pre><ol>
<li>When you run <code>app.py</code> you will see a local IP address in the terminal like this.</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1666009881738/1bGfoWOkC.png" alt="image.png" />
Click on the IP address.
You will see a blank webpage. </p>
<p><strong>What's happening here....</strong>
The function <code>@app.route('/')</code> is defining the home URL of your project. So whenever you are at the home of your project, the function <code>index()</code> will run. In this function, we are just rendering the <code>index.html</code> file that has nothing as of now.<br />
By default running the <code>index()</code> function.</p>
<ol>
<li>Now import the necessary packages and add some additional functions to make our code work.</li>
</ol>
<pre><code><span class="hljs-keyword">from</span> flask <span class="hljs-keyword">import</span> Flask, render_template, Response
<span class="hljs-keyword">import</span> cv2
<span class="hljs-keyword">from</span> lkhandmapping <span class="hljs-keyword">import</span> handTracker

app = Flask(__name__)

camera = cv2.VideoCapture(<span class="hljs-number">0</span>)

@app.route(<span class="hljs-string">'/'</span>)
def index():
    <span class="hljs-keyword">return</span> render_template(<span class="hljs-string">'index.html'</span>)

@app.route(<span class="hljs-string">'/video_feed'</span>)
def video_feed():
    <span class="hljs-keyword">return</span> Response(gen_frames(), mimetype=<span class="hljs-string">'multipart/x-mixed-replace; boundary=frame'</span>)

def gen_frames():
    <span class="hljs-keyword">while</span> True:
        success, frame = camera.read()  # read the camera frame
        <span class="hljs-keyword">if</span> not success:
            <span class="hljs-keyword">break</span>
        <span class="hljs-attr">else</span>:
            frame = handTracker(frame)
            ret, buffer = cv2.imencode(<span class="hljs-string">'.jpg'</span>, frame[<span class="hljs-number">0</span>])
            frame = buffer.tobytes()
            <span class="hljs-keyword">yield</span> (b<span class="hljs-string">'--frame\r\n'</span>
                   b<span class="hljs-string">'Content-Type: image/jpeg\r\n\r\n'</span> + frame + b<span class="hljs-string">'\r\n'</span>)


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    app.run(debug=True)
</code></pre><p><strong>Explanation</strong><br />
Here we are getting video by changing the image continuously in image tag in our HTML file.The line <code>&lt;img src="{{ url_for('video_feed') }}" width="70%" height="600px"&gt;</code> in <code>index.html</code>. 
<br />
Here the line <code>{{ url_for('video_feed') }}</code> is specifying the URL of the video feed that we are sending as the response from our python file by triggering the function <code>gen_frames()</code> in our video_feed. Let's see what this function is doing. Here is the code with comments as the explanation.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1666011249663/Qb5kaDdMu.png" alt="image.png" /></p>
<p>I hope the syntax is clear. If you have any queries, feel free to ask in the comments, or you can contact me <a target="_blank" href="https://lakshaykumar.tech/#contact">here</a></p>
<p>Don't forget to subscribe on my website - <a target="_blank" href="https://www.lakshaykumar.tech/">https://www.lakshaykumar.tech/</a></p>
]]></content:encoded></item><item><title><![CDATA[Flights Delay Prediction]]></title><description><![CDATA[You must have faced flight delays ever in your life if you are a frequent flight traveller. This might have caused you a lot of trouble, especially when you are running on a tight or busy schedule.
To address the same problem my team - Aryan Bakle, A...]]></description><link>https://blogs.lakshaykumar.tech/flights-delay-prediction</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/flights-delay-prediction</guid><category><![CDATA[Data Science]]></category><category><![CDATA[Machine Learning]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Sat, 01 Oct 2022 07:08:11 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/unsplash/YbtHsy0g_To/upload/v1664607999179/MDb9b8qjZ.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You must have faced flight delays ever in your life if you are a frequent flight traveller. This might have caused you a lot of trouble, especially when you are running on a tight or busy schedule.
To address the same problem my team - <a href="https://www.calmcanfly.com/">Aryan Bakle</a>, <a href="mailto:saxena.a@icloud.com">Aryan Saxena</a> and <a href="https://www.lakshaykumar.tech/">me</a> have prepared this project for prediction of flight delays on <strong>India's Busiest air Route: DEL to BOM</strong></p>
<p>We started off with some research. We observed that on this route 5 major airlines - SpiceJet, Vistara, Indigo, Go First and AirIndia are operating approx. 36 flights daily.
Data Source - <a href="https://www.skyscanner.co.in/">SkyScanner</a></p>
<p>Using Selenium and Python we scraped the data for all the flights and their delay history for the past 100 days. Source : <a href="https://www.flightradar24.com/22.73,75.8/17">FlightRadar24</a>
The above method gave us very <a href="https://docs.google.com/spreadsheets/d/1UmHkMcir-to_cZgy_ne49deK2KojwjytXWwTcSGYs7Q/edit?usp=sharing">raw data</a> with tones of missing values and non-segregated data.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1664605639670/aNjuyOjmj.png" alt="image.png" /></p>
<p>After cleaning manually, here is what we came up with - </p>
<p><a href="https://docs.google.com/spreadsheets/d/1DjyhyKEqXRcgd_XmmgDM4ZFaCDbLDF1BzZwq-YsTW6o/edit?usp=sharing">Cleaned data -</a> 
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1664605788421/LmFDCeiTX.png" alt="image.png" /></p>
<p>After doing some basic measures of central tendency, here's what we came up with -</p>
<ul>
<li>On average, 20% of total flights were delayed with an average delay time of 27 minutes. (not much)</li>
<li>The maximum delay was 327 minutes ~ 5 hours for Flight number SG8169 (SpiceJet)</li>
<li>From the past data, we saw that the maximum flights were delayed on Thursdays and Fridays. Maybe because people travel more on weekends than weekdays, flights were less booked.</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1664606568440/rK6HK3nKY.png" alt="image.png" /></p>
<p>Probability Distribution for flights delay</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1664607329772/zXSQWcjdY.png" alt="image.png" /></p>
<p>We segregated the cleaned sheet into multiple sheets, specific for each airline and flight number. We got a total of <strong>53 individual excel sheets for analysis</strong>. [Cheers to our team]</p>
<p>Later we tried logistic regression on our dataset to predict whether a flight (based on its number and past history) would be delayed or not. Here's the confusion matrix of our model</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1664606658349/TKzrwTIcj.png" alt="image.png" /></p>
<p>We see that the model predicted no true positives. This might be because we did the analysis on past data, i.e. identifying patterns, rather than on the actual factors causing flight delays: weather conditions, late passenger arrivals, technical issues, airline carelessness, etc. Our <strong>logistic regression model score was 0.7747</strong>, which I think is reasonable. Here's our Google Colab sheet - 
%[https://colab.research.google.com/drive/17eTmUZG65MuoLQkz8Jq3B5W-DaQCRGd4?usp=sharing]</p>
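<p>As a side note, a confusion matrix and accuracy score are easy to compute by hand, and doing so shows why accuracy alone can mislead when one class dominates. The labels below are made up for illustration (the real ones are in the Colab notebook):</p>

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, where 1 = delayed."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return tn, fp, fn, tp

# Illustrative data: the model predicts "not delayed" for every flight,
# mirroring the all-negative predictions we observed.
y_true = [0, 0, 0, 1, 0, 1, 0, 0, 0, 0]
y_pred = [0] * 10

tn, fp, fn, tp = confusion_counts(y_true, y_pred)
accuracy = (tn + tp) / len(y_true)
```

<p>Even with zero true positives the accuracy here is 0.8, simply because most flights are on time, which mirrors what happened with our model's 0.7747 score.</p>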
<p>In the end, we created a dashboard in Google Sheets where, based on flight and airline history, the system displays the probability of a particular flight being delayed by more than a threshold time input by the user.
For example - Analysis for <strong>Go-First Flight No. G8336</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1664607502474/lgywFBwD5.png" alt="image.png" /></p>
<p>We get the full analysis for any airline and flight number. In this picture, the probability of this flight being delayed by more than 30 minutes on a Friday is 58.33%.</p>
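<p>Under the hood, the dashboard's number is just a conditional frequency: out of this flight's past departures on the chosen weekday, the fraction delayed beyond the user's threshold. A plain-Python sketch with invented history records (not our real sheet):</p>

```python
def delay_probability(history, weekday, threshold_min):
    """Fraction of past departures on `weekday` delayed more than `threshold_min`.

    `history` is a list of (weekday, delay_minutes) tuples for one flight.
    """
    day = [d for w, d in history if w == weekday]
    if not day:
        return 0.0
    return sum(1 for d in day if d > threshold_min) / len(day)

# Invented history for one flight: (weekday, delay in minutes)
history = [("Fri", 45), ("Fri", 10), ("Fri", 60), ("Thu", 5),
           ("Fri", 0), ("Fri", 90), ("Fri", 35)]
p = delay_probability(history, "Fri", 30)  # 4 of the 6 Friday departures exceed 30 min
```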
<p>Thanks to our mentors <a href="https://www.linkedin.com/in/saurabh-mahajan-b6583315/">Saurabh Mahajan</a>, <a href="https://www.linkedin.com/in/mathew-george-826a8186">Mathew George</a> and <a href="https://www.linkedin.com/in/vishrut-y-patel/">Vishrut Patel</a> from <a href="https://www.atriauniversity.edu.in">Atria University</a> for their guidance throughout the project.</p>
<p>Do let me know in the comments if you see scope for further improvements.</p>
]]></content:encoded></item><item><title><![CDATA[TripGo DALV]]></title><description><![CDATA[Did you ever get confused while choosing a destination for a crazy outing with your squad?
There might be a situation when someone wants to go shopping, some like natural places and what about the spiritual peeps. Finding it challenging to select a l...]]></description><link>https://blogs.lakshaykumar.tech/tripgo-dalv</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/tripgo-dalv</guid><category><![CDATA[Python]]></category><category><![CDATA[Machine Learning]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Mon, 19 Sep 2022 15:32:10 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1663601297828/fMl3dy4_L.jpg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Did you ever get confused while choosing a destination for a crazy outing with your squad?
There might be a situation where someone wants to go shopping, some like natural places, and what about the spiritual peeps? It is challenging to select a location that fulfils everyone's preferences. I was facing the same problem, so I thought of solving it using Python.</p>
<h3 id="heading-problem">Problem</h3>
<p>How to find a location for a casual outing with your group, in Bangalore City.</p>
<h3 id="heading-approach">Approach</h3>
<p>This is a simple recommendation system: based on preferences, it recommends places. We use five preferences: <strong>Nature, Adventure, Fun &amp; Thrill, Shopping and Cultural</strong>.
We started by collecting data from the web and manually rating each place on these preferences out of 5. Here's an example below - </p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1663599708949/MPhitIjhA.png" alt="image.png" /></p>
<p>We collected the data of around 60 top places within Bangalore city.</p>
<p>On the frontend, we collect the values of these 5 parameters through a form developed in basic HTML/CSS.</p>
<h3 id="heading-tech-solution">Tech Solution</h3>
<p>The frontend form was connected to Python using the Flask framework, and the data received from the form was converted into a list.</p>
<pre><code>nature <span class="hljs-operator">=</span> <span class="hljs-keyword">int</span>(request.form[<span class="hljs-string">'nature'</span>])
adventure <span class="hljs-operator">=</span> <span class="hljs-keyword">int</span>(request.form[<span class="hljs-string">'adventure'</span>])
fun <span class="hljs-operator">=</span> <span class="hljs-keyword">int</span>(request.form[<span class="hljs-string">'fun'</span>])
shopping <span class="hljs-operator">=</span> <span class="hljs-keyword">int</span>(request.form[<span class="hljs-string">'shopping'</span>])
cultural <span class="hljs-operator">=</span> <span class="hljs-keyword">int</span>(request.form[<span class="hljs-string">'cultural'</span>])

userPref <span class="hljs-operator">=</span> [nature, adventure, fun, shopping, cultural]
</code></pre><p>Now we are creating a nested list of travel places</p>
<pre><code>import pandas as pd

df = pd.read_csv('URL_OF_OUR SHEET')  # published CSV link of our ratings sheet
travelplaces = []
for i in df.index:
    rows = [df['place'][i], df['nature'][i], df['adventure'][i], df['funAndThrill'][i],
            df['shopping'][i], df['cultural'][i], df['budget'][i], df['Distance'][i], df['ratings'][i]]
    travelplaces.append(rows)
</code></pre><p>Now we compute the <a target="_blank" href="https://itsmycode.com/calculate-euclidean-distance-in-python/">vector distance</a> between the user's preferences and each place using <code>np.linalg.norm</code>, storing the result in a dictionary.</p>
<pre><code>import numpy as np

placedict = {}
for i in travelplaces:
    user = np.array(userPref)
    dbData = np.array(i[1:6])  # only the five preference ratings for this place

    dis = np.linalg.norm(user - dbData)
    placedict[i[0]] = dis

sorted_keys = sorted(placedict, key=placedict.get)
</code></pre><p>Finally, we sort the dictionary in ascending order of vector distance, take the top three recommendations, and display them to the user.
The system also emails the results, in case the user wants to save them for future reference.
Here's the complete code :
%[https://github.com/laksh-2193/TripgoDalv]</p>
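<p>Putting the pieces together, the whole recommender is essentially a nearest-neighbour lookup in preference space. Here is a self-contained sketch with invented places and ratings (not our real sheet):</p>

```python
import numpy as np

# Hypothetical places rated out of 5 on (nature, adventure, fun, shopping, cultural)
places = {
    "Cubbon Park":       [5, 2, 3, 1, 2],
    "Commercial Street": [1, 1, 4, 5, 2],
    "ISKCON Temple":     [2, 1, 2, 1, 5],
    "Wonderla":          [1, 5, 5, 1, 1],
}

def recommend(user_pref, k=3):
    """Return the k places whose rating vectors are closest to the user's."""
    user = np.array(user_pref)
    dist = {name: np.linalg.norm(user - np.array(v)) for name, v in places.items()}
    return sorted(dist, key=dist.get)[:k]
```

<p>For example, a nature lover with preferences <code>[5, 2, 3, 1, 2]</code> gets the park-like place ranked first.</p>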
<p>Cheers to my team <a target="_blank" href="https://www.calmcanfly.com">Aryan Bakle</a>, <a target="_blank" href="mailto:diyahafiz098@gmail.com">Diya hafiz</a> and <a target="_blank" href="https://www.linkedin.com/in/vishwa-shah-ab2a2620b/">Vishwa Shah</a> for their effort.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1663601441469/KCAcPwdxr.png" alt="image.png" /></p>
<p>Thanks to our mentor <a target="_blank" href="https://www.linkedin.com/in/saurabh-mahajan-b6583315/">Saurabh Mahajan</a> from <a target="_blank" href="https://www.atriauniversity.edu.in/">Atria University</a> for his guidance throughout the project.</p>
<p>Do let me know in the comments if you see scope for further improvements.</p>
]]></content:encoded></item><item><title><![CDATA[Sentiment Analysis using Python]]></title><description><![CDATA[Deep learning is the first choice when you want to train neural networks for high-end Machine Level projects. It is the subset of machine learning where the neural networks learn by observing intricate structures in the data that they experience, thi...]]></description><link>https://blogs.lakshaykumar.tech/sentiment-analysis-using-python</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/sentiment-analysis-using-python</guid><category><![CDATA[Python]]></category><category><![CDATA[neural networks]]></category><category><![CDATA[Deep Learning]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Sat, 17 Sep 2022 17:05:37 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1663430703768/q8L5sV9X9.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Deep learning is the first choice when you want to train neural networks for high-end Machine Level projects. It is the subset of machine learning where the neural networks learn by observing intricate structures in the data that they experience, this includes statistics and predictive modelling. Deep learning model comprises computational models consisting of multiple neural layers that build up the networks at multiple layers of abstraction to represent the data.
In this blog, we are going to learn how we can build a model for predicting human sentiments. We will train it on hundreds of images available online. So let's get started.</p>
<h2 id="heading-file-structure">File Structure</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1663431769340/KXw80NroG.png" alt="image.png" /></p>
<p>Download this <a target="_blank" href="https://github.com/laksh-2193/SentimentsAnalysis/tree/master/data">data folder</a>, which has the images for training, and put it in the main project directory.</p>
<p>Make sure you download this <a target="_blank" href="https://github.com/laksh-2193/SentimentsAnalysis/blob/master/haarcascade_frontalface_default.xml">haarcascade_frontalface_default.xml</a> file, which will help in detecting faces.</p>
<p>These files will be generated as we run the code, so don't worry about them</p>
<ol>
<li>Code.ipynb (Not required)</li>
<li>model.h5</li>
<li>sentimentAnalyser.h5</li>
</ol>
<h2 id="heading-lets-code">Lets Code</h2>
<p>Create a Python file <a target="_blank" href="https://github.com/laksh-2193/SentimentsAnalysis/blob/master/main.py">main.py</a> and install the following libraries:</p>
<ol>
<li>OpenCV</li>
<li>Numpy</li>
<li>Keras</li>
</ol>
<p>Import all the required libraries </p>
<pre><code>from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPool2D
import os
</code></pre><p>Declare the location for test and training data folders</p>
<pre><code><span class="hljs-attr">train_data_dir</span> = <span class="hljs-string">'data/train'</span>
<span class="hljs-attr">validation_data_dir</span> = <span class="hljs-string">'data/test'</span>
</code></pre><p>Since pixel values range from 0 to 255, we rescale them to the [0, 1] range by dividing by 255. In addition, we add augmentation parameters such as rotation, shear, zoom and horizontal flips for a more robust training dataset</p>
<pre><code>train_datagen <span class="hljs-operator">=</span> ImageDataGenerator(rescale<span class="hljs-operator">=</span><span class="hljs-number">1.</span><span class="hljs-operator">/</span><span class="hljs-number">255</span>,rotation_range<span class="hljs-operator">=</span><span class="hljs-number">30</span>,shear_range<span class="hljs-operator">=</span><span class="hljs-number">0</span><span class="hljs-number">.3</span>,zoom_range<span class="hljs-operator">=</span><span class="hljs-number">0</span><span class="hljs-number">.3</span>, horizontal_flip<span class="hljs-operator">=</span>True,fill_mode<span class="hljs-operator">=</span><span class="hljs-string">'nearest'</span>)
validation_datagen <span class="hljs-operator">=</span> ImageDataGenerator(rescale<span class="hljs-operator">=</span><span class="hljs-number">1.</span><span class="hljs-operator">/</span><span class="hljs-number">255</span>)
</code></pre><p>Now we create two separate generator objects for the training and testing datasets. Each takes an image directory (e.g. <code>train_data_dir</code>) as input, converts images to <code>grayscale</code> and fixes the dimensions at <code>(48,48)</code>. Since we are categorising images into angry, disgust, fear, etc., the class mode is <code>categorical</code>. We <code>shuffle</code> the images for better training</p>
<pre><code>train_genrator <span class="hljs-operator">=</span> train_datagen.flow_from_directory(
    train_data_dir,
    color_mode<span class="hljs-operator">=</span><span class="hljs-string">'grayscale'</span>,
    target_size<span class="hljs-operator">=</span>(<span class="hljs-number">48</span>,<span class="hljs-number">48</span>),
    batch_size<span class="hljs-operator">=</span><span class="hljs-number">32</span>,
    class_mode<span class="hljs-operator">=</span><span class="hljs-string">'categorical'</span>,
    shuffle<span class="hljs-operator">=</span>True
)

validation_genrator <span class="hljs-operator">=</span> validation_datagen.flow_from_directory(
    validation_data_dir,
    color_mode<span class="hljs-operator">=</span><span class="hljs-string">'grayscale'</span>,
    target_size<span class="hljs-operator">=</span>(<span class="hljs-number">48</span>, <span class="hljs-number">48</span>),
    batch_size<span class="hljs-operator">=</span><span class="hljs-number">32</span>,
    class_mode<span class="hljs-operator">=</span><span class="hljs-string">'categorical'</span>,
    shuffle<span class="hljs-operator">=</span>True
)
</code></pre><p>Now we will declare the list of labels into which we will classify images</p>
<pre><code><span class="hljs-attr">class_labels</span> = [<span class="hljs-string">'Angry'</span>,<span class="hljs-string">'Disgust'</span>,<span class="hljs-string">'Fear'</span>,<span class="hljs-string">'Happy'</span>,<span class="hljs-string">'Neutral'</span>,<span class="hljs-string">'Sad'</span>,<span class="hljs-string">'Surprise'</span>]
</code></pre><p>Now we build the neural network as a stack of convolutional and dense layers. The last layer outputs a probability for each of the 7 classes, and the class with the maximum probability is the final sentiment. Since we want a linear stack of layers, we use the <code>Sequential()</code> model. <a target="_blank" href="https://keras.io/api/layers/convolution_layers/convolution2d/"><code>Conv2D</code></a> applies convolution filters of size <code>kernel_size</code> followed by the <a target="_blank" href="https://keras.io/api/layers/activations/#relu-function"><code>relu</code></a> <code>activation</code> function. <a target="_blank" href="https://www.tensorflow.org/api_docs/python/tf/keras/layers/MaxPool2D"><code>MaxPool2D</code></a> downsamples by keeping the maximum value in each <code>pool_size</code> window. Finally, <code>Dropout</code> randomly zeroes the given fraction of a layer's outputs during training, which helps prevent overfitting.</p>
<pre><code>model <span class="hljs-operator">=</span> Sequential()
model.add(Conv2D(<span class="hljs-number">32</span>,kernel_size<span class="hljs-operator">=</span>(<span class="hljs-number">3</span>,<span class="hljs-number">3</span>),activation<span class="hljs-operator">=</span><span class="hljs-string">'relu'</span>,input_shape<span class="hljs-operator">=</span>(<span class="hljs-number">48</span>,<span class="hljs-number">48</span>,<span class="hljs-number">1</span>)))

model.add(Conv2D(<span class="hljs-number">64</span>,kernel_size<span class="hljs-operator">=</span>(<span class="hljs-number">3</span>,<span class="hljs-number">3</span>),activation<span class="hljs-operator">=</span><span class="hljs-string">'relu'</span>))
model.add(MaxPool2D(pool_size<span class="hljs-operator">=</span>(<span class="hljs-number">2</span>,<span class="hljs-number">2</span>)))
model.add(Dropout(<span class="hljs-number">0</span><span class="hljs-number">.1</span>))

model.add(Conv2D(<span class="hljs-number">128</span>,kernel_size<span class="hljs-operator">=</span>(<span class="hljs-number">3</span>,<span class="hljs-number">3</span>),activation<span class="hljs-operator">=</span><span class="hljs-string">'relu'</span>))
model.add(MaxPool2D(pool_size<span class="hljs-operator">=</span>(<span class="hljs-number">2</span>,<span class="hljs-number">2</span>)))
model.add(Dropout(<span class="hljs-number">0</span><span class="hljs-number">.1</span>))

model.add(Conv2D(<span class="hljs-number">256</span>,kernel_size<span class="hljs-operator">=</span>(<span class="hljs-number">3</span>,<span class="hljs-number">3</span>),activation<span class="hljs-operator">=</span><span class="hljs-string">'relu'</span>))
model.add(MaxPool2D(pool_size<span class="hljs-operator">=</span>(<span class="hljs-number">2</span>,<span class="hljs-number">2</span>)))
model.add(Dropout(<span class="hljs-number">0</span><span class="hljs-number">.1</span>))

model.add(Flatten())
model.add(Dense(<span class="hljs-number">512</span>,activation<span class="hljs-operator">=</span><span class="hljs-string">'relu'</span>))
model.add(Dropout(<span class="hljs-number">0</span><span class="hljs-number">.2</span>))

model.add(Dense(<span class="hljs-number">7</span>,activation<span class="hljs-operator">=</span><span class="hljs-string">'softmax'</span>))
</code></pre><p>Here the <code>Dense</code> layers classify images based on the output of the convolutional layers, and <code>Flatten</code> reshapes the feature maps into a single vector.</p>
<p>Finally, we compile the model and print a summary of its architecture</p>
<pre><code>model.compile(optimizer<span class="hljs-operator">=</span><span class="hljs-string">'adam'</span>,loss<span class="hljs-operator">=</span><span class="hljs-string">'categorical_crossentropy'</span>,metrics<span class="hljs-operator">=</span>[<span class="hljs-string">'accuracy'</span>])
print(model.summary())
</code></pre><p>Now we are going to train the network on our images and save the model as <code>sentimentanalyser.h5</code></p>
<pre><code>train_path = train_data_dir        # 'data/train'
test_path = validation_data_dir    # 'data/test'

num_train_images = 0
for root, dirs, files in os.walk(train_path):
    num_train_images += len(files)

num_test_images = 0
for root, dirs, files in os.walk(test_path):
    num_test_images += len(files)

epochs = 100
history = model.fit(train_genrator,
                    steps_per_epoch=num_train_images//32,
                    epochs=epochs,
                    validation_data=validation_genrator,
                    validation_steps=num_test_images//32)

model.save('sentimentanalyser.h5')
</code></pre><p>This step will take time depending on the number of <code>epochs</code>, i.e. the number of full passes over the training images.</p>
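<p>Once the model is trained, classifying a face crop reduces to running <code>model.predict</code> on a 48x48 grayscale image and taking the argmax over the 7-way softmax output. The decoding step can be sketched without the model itself; the probability vector below is made up for illustration:</p>

```python
import numpy as np

class_labels = ['Angry', 'Disgust', 'Fear', 'Happy', 'Neutral', 'Sad', 'Surprise']

def decode_prediction(probs):
    """Map a 7-way softmax output to its most likely sentiment label."""
    probs = np.asarray(probs)
    return class_labels[int(np.argmax(probs))], float(np.max(probs))

# Illustrative output vector, shaped like one row of model.predict's output
probs = [0.05, 0.02, 0.08, 0.62, 0.13, 0.06, 0.04]
label, confidence = decode_prediction(probs)
```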
<p>So yay! you have finally created a CNN for detection of sentiments. Find the full code below.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://github.com/laksh-2193/SentimentsAnalysis">https://github.com/laksh-2193/SentimentsAnalysis</a></div>
]]></content:encoded></item><item><title><![CDATA[Python + Drones]]></title><description><![CDATA[You might have flown a Drone or a toy helicopter, it's super easy, isn't it? Use a controller, move joysticks and done. But as a coder, do you want to integrate this with your python code? Imagine you are making autonomous drones, How exciting, isn't...]]></description><link>https://blogs.lakshaykumar.tech/python-drones</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/python-drones</guid><category><![CDATA[Python]]></category><category><![CDATA[mediapipe]]></category><category><![CDATA[Computer Vision]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Fri, 16 Sep 2022 07:03:39 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1663311936831/6LAfhnnaG.jpg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You might have flown a Drone or a toy helicopter, it's super easy, isn't it? Use a controller, move joysticks and done. But as a coder, do you want to integrate this with your python code? Imagine you are making autonomous drones, How exciting, isn't it?
Let's learn how can you integrate python and drones.<br /><br />
<strong>Pre-requisites</strong></p>
<ol>
<li>DJI Tello (A mini pocket-friendly drone perfect for beginners)  <div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.amazon.in/Renewed-DJI-Camera-Quadcopter-Professional/dp/B08VJ9PLFP/">https://www.amazon.in/Renewed-DJI-Camera-Quadcopter-Professional/dp/B08VJ9PLFP/</a></div>
</li>
<li>Prior Experience with Python programming (Basic)</li>
<li>Basics of motion (Newton's laws of motion &amp; Momentum)</li>
</ol>
<p>You can browse online for these pre-reqs if you want.
So let's get started...</p>
<p>Firstly you need to install the <code>djitellopy</code> and <code>opencv-python</code> libraries, which have support for this type of drone. Use the following commands:</p>
<blockquote>
<p>pip install djitellopy<br />
pip install opencv-python</p>
</blockquote>
<p>NOTE: Every drone has its own library, but coding is more or less the same for every drone.</p>
<p>Let's start by importing the library</p>
<pre><code>from djitellopy <span class="hljs-keyword">import</span> tello
<span class="hljs-keyword">import</span> cv2
<span class="hljs-keyword">import</span> time
</code></pre><p>Create a Tello object and connect to the drone</p>
<pre><code>me <span class="hljs-operator">=</span> tello.Tello()
me.connect()
print(me.get_battery())
</code></pre><p>The above code will print the drone's battery percentage; <code>me</code> is the drone object.</p>
<p>Turn the video stream on using the following code</p>
<pre><code>me.streamon()
</code></pre><p>Controlling the movement of the drone.
Function : <code>send_rc_control(left_right_velocity: int, forward_backward_velocity: int, up_down_velocity: int,
                        yaw_velocity: int)</code></p>
<p>Code Snippet for controlling motion of drone</p>
<pre><code>speed = 30  # velocity in cm/s
me.takeoff()                         # take off
me.send_rc_control(0, 0, speed, 0)   # move up at 30 cm/s
me.send_rc_control(0, 0, -speed, 0)  # move down at 30 cm/s
me.send_rc_control(0, speed, 0, 0)   # move forward at 30 cm/s
me.send_rc_control(0, -speed, 0, 0)  # move backward at 30 cm/s
me.send_rc_control(-speed, 0, 0, 0)  # move left at 30 cm/s
me.send_rc_control(speed, 0, 0, 0)   # move right at 30 cm/s
me.send_rc_control(0, 0, 0, speed)   # rotate clockwise
me.send_rc_control(0, 0, 0, -speed)  # rotate counterclockwise
me.send_rc_control(0, 0, 0, 0)       # stop in place
me.land()                            # land the drone
</code></pre><p>Visualise the drone's camera stream on your PC</p>
<pre><code><span class="hljs-keyword">while</span> <span class="hljs-literal">True</span>:
        img = me.get_frame_read().frame <span class="hljs-comment">#Capture each frame</span>
        img = cv2.resize(img, (<span class="hljs-number">360</span>, <span class="hljs-number">240</span>)) <span class="hljs-comment">#Optional but recommended</span>
        cv2.imshow(<span class="hljs-string">"Image"</span>, img) <span class="hljs-comment">#Display video</span>
        cv2.waitKey(<span class="hljs-number">1</span>)
</code></pre><p>That's super duper easy, try it out and let me know in the comments...</p>
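<p>One subtlety worth noting: <code>send_rc_control</code> sets velocities in cm/s, not distances, so the drone keeps moving until you send a new command. To cover an approximate distance, hold a velocity for distance divided by speed seconds. A tiny helper (pure arithmetic, no hardware needed, and the function name is my own) makes this explicit:</p>

```python
def hold_time(distance_cm, speed_cm_s):
    """Seconds to hold a constant velocity to cover roughly distance_cm.

    Ignores acceleration and drift, so treat the result as an approximation.
    """
    if speed_cm_s <= 0:
        raise ValueError("speed must be positive")
    return distance_cm / speed_cm_s
```

<p>For example, to climb about 60 cm at 30 cm/s you would send <code>me.send_rc_control(0, 0, 30, 0)</code>, sleep for <code>hold_time(60, 30)</code> seconds, then send <code>me.send_rc_control(0, 0, 0, 0)</code> to stop.</p>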
<p>Any queries, feel free to type in comments or connect me through <a target="_blank" href="mailto:contact@lakshaykumar.tech">mail</a></p>
]]></content:encoded></item><item><title><![CDATA[Control drone using finger gesture]]></title><description><![CDATA[Mediapipe has a lot of amazing computer vision modules. Among all, my personal favourite is Hand Tracking Module. You can do a lot of customization using this module and play around, it's literally fun.
So in this blog, let's see how we can control t...]]></description><link>https://blogs.lakshaykumar.tech/control-drone-using-finger-gesture</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/control-drone-using-finger-gesture</guid><category><![CDATA[Computer Vision]]></category><category><![CDATA[drone]]></category><category><![CDATA[media queries]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Fri, 16 Sep 2022 06:25:05 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/unsplash/72AYEEBJpz4/upload/v1663306808183/9SB0gZLYv.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Mediapipe has a lot of amazing computer vision modules. Among all, my personal favourite is <strong>Hand Tracking Module</strong>. You can do a lot of customization using this module and play around, it's literally fun.
So in this blog, let's see how we can control the drone using finger count. Here are the gestures and commands I am going to add to the drone - </p>
<blockquote>
<p>Zero Fingers (Fist) : <strong>Land the drone</strong><br />
One Finger up : <strong>Move Forward</strong><br />
Two Fingers up : <strong>Move Backward</strong><br />
Three Fingers up : <strong>Move Left</strong><br />
Four Fingers up : <strong>Move Right</strong><br />
Five Fingers up : <strong>Take off</strong></p>
</blockquote>
<p>So let's start</p>
<p>Firstly we'll import all the required libraries - <strong>opencv, mediapipe, djitellopy</strong></p>
<pre><code>import cv2
import mediapipe as mp
from djitellopy import tello
</code></pre><p>Creating the constructors for MediaPipe and its hand-tracking module. Here we track only one hand for the gestures, since multiple hands can affect the accuracy of the output</p>
<pre><code>mp_drawing <span class="hljs-operator">=</span> mp.solutions.drawing_utils
mp_drawing_styles <span class="hljs-operator">=</span> mp.solutions.drawing_styles
mp_hands <span class="hljs-operator">=</span> mp.solutions.hands
hands <span class="hljs-operator">=</span> mp_hands.Hands(model_complexity<span class="hljs-operator">=</span><span class="hljs-number">0</span>,min_detection_confidence<span class="hljs-operator">=</span><span class="hljs-number">0</span><span class="hljs-number">.5</span>,min_tracking_confidence<span class="hljs-operator">=</span><span class="hljs-number">0</span><span class="hljs-number">.5</span>,max_num_hands<span class="hljs-operator">=</span><span class="hljs-number">1</span>)
</code></pre><p>Then we open the webcam in the <code>cap</code> variable and set a suitable width and height for the frame</p>
<pre><code>cap <span class="hljs-operator">=</span> cv2.VideoCapture(<span class="hljs-number">0</span>)
width <span class="hljs-operator">=</span> <span class="hljs-number">720</span>
height <span class="hljs-operator">=</span> <span class="hljs-number">280</span>
cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
</code></pre><p>Now it is time to connect to our DJI Tello drone and use its camera. Make sure the drone is switched on and your system is connected to the drone's Wi-Fi network. Before this, kindly install the <code>djitellopy</code> library:</p>
<blockquote>
<p>pip install <code>djitellopy</code></p>
</blockquote>
<pre><code>me = tello.Tello()
me.connect()
me.streamoff()
me.streamon()
isDroneFlying = False  # initialising the variable to check if the drone is flying or not
</code></pre><p>Defining a function that takes a frame as input and counts the number of fingers up in that frame if a hand is detected. Here is an image showing the hand landmarks</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1663308308791/CLuahIUqw.png" alt="2410344[1].png" /></p>
<p>For each point, we have x and y values on the hand. Now here's a question: <code>how can you tell that all the fingers are closed in a fist? What's the condition or criteria?</code>
The tip must be lower than the middle joint of the finger. For example, for the index finger, the 8th point must be lower than the 6th point. Since image y coordinates increase downward, if the y coordinate of the 8th point is greater than the y coordinate of the 6th point, we can say the index finger is closed. Applying the same logic to the other fingers gives the finger count, and we can then define our actions accordingly.
Here is the function, kindly drop your doubts in comments if you think something isn't working</p>
<pre><code>def droneGestureController(image):
    image.flags.writeable = False
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = hands.process(image)
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    if results.multi_hand_landmarks:

        for hand_landmarks in results.multi_hand_landmarks:
            mp_drawing.draw_landmarks(image, hand_landmarks, mp_hands.HAND_CONNECTIONS,
                                      mp_drawing_styles.get_default_hand_landmarks_style(),
                                      mp_drawing_styles.get_default_hand_connections_style())
            handlms = []

            c = 0
            for i in hand_landmarks.landmark:
                height, width, fc = image.shape
                x = (i.x) * width
                y = (i.y) * height
                handlms.append([c, int(x), int(y)])
                c = c + 1
            totalFingers = 0

            if (len(handlms) != 0):
                fingerTips = [8, 12, 16, 20]
                if (handlms[4][1] &gt; handlms[3][1]):
                    totalFingers += 1

                for i in fingerTips:
                    if (handlms[i][2] &lt; handlms[i - 2][2]):
                        totalFingers += 1

            droneAction = ""

            if (totalFingers == 0):
                droneAction = "Land"
                me.land()

            elif (totalFingers == 1):
                droneAction = "Move forward"
                me.send_rc_control(0, 30, 0, 0)

            elif (totalFingers == 2):
                droneAction = "Move backward"
                me.send_rc_control(0, -30, 0, 0)

            elif (totalFingers == 3):
                droneAction = "Left"
                me.send_rc_control(-30, 0, 0, 0)

            elif (totalFingers == 4):
                droneAction = "Right"
                me.send_rc_control(30, 0, 0, 0)

            elif (totalFingers == 5):
                droneAction = "Takeoff"
                me.takeoff()
                me.send_rc_control(0, 0, 50, 0)

            else:
                droneAction = "No Action"

            cv2.putText(image, droneAction + " " + str(totalFingers), (10, 25), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0), 2, cv2.LINE_AA)
            return [image, handlms]
        return [image, [0]]
    return [image, [0]]
</code></pre><p>For the syntax <code>me.send_rc_control(0,0,0,0)</code>: it is a built-in function from the <em>djitellopy</em> library that controls the drone's movement after takeoff. Here is how it accepts input: <code>send_rc_control(self, left_right_velocity: int, forward_backward_velocity: int, up_down_velocity: int, yaw_velocity: int)</code>.
The term <code>left_right_velocity</code> defines the velocity at which the drone should move to the right, say 10, so to move <strong>to the left, we will input the value -10</strong>. Accepted values range from -100 to 100.</p>
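<p>As a quick sanity check of the finger-counting rule described earlier, here is a minimal standalone sketch that applies the same comparisons to hypothetical landmark lists (no camera, MediaPipe, or drone required; the coordinate values are made up for illustration):</p>

```python
# Landmarks use the same [index, x, y] format as handlms in the controller.
# Image y grows downward, so a fingertip whose y is SMALLER than its middle
# joint's y is raised; the thumb is checked on the x axis instead.
def count_fingers(handlms):
    total = 0
    if handlms[4][1] > handlms[3][1]:      # thumb: tip (4) vs. joint (3) on x
        total += 1
    for tip in [8, 12, 16, 20]:            # index, middle, ring, pinky tips
        if handlms[tip][2] < handlms[tip - 2][2]:
            total += 1
    return total

# Hypothetical open hand: every fingertip above its middle joint.
open_hand = [[i, 0, 50] for i in range(21)]
for tip in [8, 12, 16, 20]:
    open_hand[tip][2] = 10
open_hand[4][1], open_hand[3][1] = 30, 20  # thumb tip to the right of its joint

# Hypothetical fist: every fingertip below its middle joint.
fist = [[i, 0, 50] for i in range(21)]
for tip in [8, 12, 16, 20]:
    fist[tip][2] = 90

print(count_fingers(open_hand), count_fingers(fist))  # 5 0
```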
<p>At the end, we print the action &amp; finger count on the frame and return the values. Here's the complete code.</p>
<pre><code># YOUTUBE LINK : https://www.youtube.com/shorts/SuQzK4p_Mnw


import cv2
import mediapipe as mp
from djitellopy import tello


mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_hands = mp.solutions.hands
hands = mp_hands.Hands(model_complexity=0, min_detection_confidence=0.5, min_tracking_confidence=0.5, max_num_hands=1)
cap = cv2.VideoCapture(0)
width = 720
height = 280
cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
me = tello.Tello()
me.connect()
me.streamoff()
me.streamon()
isDroneFlying = False  # Initialising the variable to check if the drone is flying or not

def droneGestureController(image):
    image.flags.writeable = False
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = hands.process(image)
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    if results.multi_hand_landmarks:

        for hand_landmarks in results.multi_hand_landmarks:
            mp_drawing.draw_landmarks(image, hand_landmarks, mp_hands.HAND_CONNECTIONS,
                                      mp_drawing_styles.get_default_hand_landmarks_style(),
                                      mp_drawing_styles.get_default_hand_connections_style())
            handlms = []

            c = 0
            for i in hand_landmarks.landmark:
                height, width, fc = image.shape
                x = (i.x) * width
                y = (i.y) * height
                handlms.append([c, int(x), int(y)])
                c = c + 1
            totalFingers = 0

            if (len(handlms) != 0):
                fingerTips = [8, 12, 16, 20]
                if (handlms[4][1] &gt; handlms[3][1]):
                    totalFingers += 1

                for i in fingerTips:
                    if (handlms[i][2] &lt; handlms[i - 2][2]):
                        totalFingers += 1

            droneAction = ""

            if (totalFingers == 0):
                droneAction = "Land"
                me.land()

            elif (totalFingers == 1):
                droneAction = "Move forward"
                me.send_rc_control(0, 30, 0, 0)

            elif (totalFingers == 2):
                droneAction = "Move backward"
                me.send_rc_control(0, -30, 0, 0)

            elif (totalFingers == 3):
                droneAction = "Left"
                me.send_rc_control(-30, 0, 0, 0)

            elif (totalFingers == 4):
                droneAction = "Right"
                me.send_rc_control(30, 0, 0, 0)

            elif (totalFingers == 5):
                droneAction = "Takeoff"
                me.takeoff()
                me.send_rc_control(0, 0, 50, 0)

            else:
                droneAction = "No Action"

            cv2.putText(image, droneAction + " " + str(totalFingers), (10, 25), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0), 2, cv2.LINE_AA)
            return [image, handlms]
        return [image, [0]]
    return [image, [0]]

while True:
    try:
        success, image = cap.read()
        droneImage = me.get_frame_read().frame
        droneImage = cv2.resize(droneImage, (360, 240))
        image = droneGestureController(image)[0]
        isDroneFlying = True
        cv2.imshow('YourPC', image)
        cv2.imshow('Drone', droneImage)
        k = cv2.waitKey(1) &amp; 0xFF
        if k == 27:  # Esc key exits the loop
            cv2.destroyAllWindows()
            break
    except:
        continue
cap.release()
</code></pre><p>For any queries, feel free to leave a comment or reach me through <a target="_blank" href="mailto:contact@lakshaykumar.tech">mail</a>.</p>
<p>Here's the video tutorial for the same. Video credits : <a target="_blank" href="https://www.calmcanfly.com/">Aryan Bakle</a></p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/shorts/SuQzK4p_Mnw">https://www.youtube.com/shorts/SuQzK4p_Mnw</a></div>
]]></content:encoded></item><item><title><![CDATA[Computer Vision Made Easy!]]></title><description><![CDATA[Computer vision fascinates me a lot, these days I am exploring the world of Computer Vision. While working I realised that we have to write hundreds of lines of code just for basic computer vision techniques like face detection, hand detection, pose ...]]></description><link>https://blogs.lakshaykumar.tech/computer-vision-made-easy</link><guid isPermaLink="true">https://blogs.lakshaykumar.tech/computer-vision-made-easy</guid><category><![CDATA[Computer Vision]]></category><category><![CDATA[opencv]]></category><category><![CDATA[mediapipe]]></category><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[Machine Learning]]></category><dc:creator><![CDATA[Lakshay Kumar]]></dc:creator><pubDate>Thu, 15 Sep 2022 17:47:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1663264353483/s8VkFUAvZ.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Computer vision fascinates me a lot, these days I am exploring the world of Computer Vision. While working I realised that we have to write hundreds of lines of code just for basic computer vision techniques like face detection, hand detection, pose classification etc.
I explored the OpenCV and MediaPipe libraries, which gave me a lot of insight into computer vision. I have developed some packages that might help the developer community focus more on implementation rather than coding everything from scratch. Here are three packages I developed recently.</p>
<h3 id="heading-1-face-detection"><strong>1. Face Detection</strong></h3>
<p><a target="_blank" href="https://pypi.org/project/lkfacedetection/">Try it out</a></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1663263753498/v7FVuqhFc.png" alt="Screenshot (994).png" /></p>
<h1 id="heading-function">Function</h1>
<p><code>faceDetector(image, draw=False)</code></p>
<ul>
<li><strong>Function Parameter</strong> : This function takes <code>image</code> (a single frame) as input and a variable <code>draw</code> with default value <em>False</em>. You can change the value of parameter <code>draw</code> to <em>True</em> if you want to draw the rectangular box over the face on the <code>image</code> frame.</li>
<li><strong>Output</strong> : This function returns a nested list of length 2. The element at index 0 is the <code>frame</code> and the element at index 1 is a list <code>[x,y,w,h]</code>, where <code>x</code> and <code>y</code> are the minimum x and y co-ordinates of the face, <code>w</code> is the width and <code>h</code> is the height of the face. NOTE that the <code>frame</code> will have a <em>rectangular box over the face if the value of <code>draw</code> is set to True in the function</em>.</li>
</ul>
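<p>Once you have the <code>[x,y,w,h]</code> values, you can do more than draw a rectangle; for instance, cropping the face region is plain NumPy slicing. A small sketch with a dummy frame and a made-up box (no webcam or lkfacedetection needed):</p>

```python
import numpy as np

# Dummy 480x640 BGR frame standing in for a webcam capture.
frame = np.zeros((480, 640, 3), dtype=np.uint8)

# Hypothetical detection in the [x, y, w, h] format described above.
x, y, w, h = 100, 50, 120, 160

# NumPy images are indexed [row, col] = [y, x], so the height slice comes first.
face = frame[y:y + h, x:x + w]
print(face.shape)  # (160, 120, 3)
```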
<h1 id="heading-usage">Usage</h1>
<blockquote>
<p>faceDetector(image, draw=False)</p>
</blockquote>
<p>With detection over the face directly through function</p>
<hr />
<pre><code>from lkfacedetection import faceDetector
import cv2
cap = cv2.VideoCapture(0)
while True:
    success, image = cap.read()
    functionValues = faceDetector(image, draw=True)  # draw over the frame from the function
    frame = functionValues[0]
    cv2.imshow('Face', frame)
    cv2.waitKey(1)
cap.release()
</code></pre><p>With detection externally using the values from function</p>
<hr />
<pre><code><span class="hljs-keyword">from</span> lkfacedetection <span class="hljs-keyword">import</span> faceDetector
<span class="hljs-keyword">import</span> cv2
cap = cv2.VideoCapture(<span class="hljs-number">0</span>)
<span class="hljs-keyword">while</span> <span class="hljs-literal">True</span>:
    success, image = cap.read()
    functionValues = faceDetector(image) <span class="hljs-comment">#doesn't draw over the frame</span>
    frame = functionValues[<span class="hljs-number">0</span>]
    x,y,w,h = functionValues[<span class="hljs-number">1</span>]
    cv2.rectangle(frame, (x, y), (x + w, y + h), (<span class="hljs-number">255</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>), <span class="hljs-number">2</span>) <span class="hljs-comment">#Draws a rectangle over the face</span>
    cv2.imshow(<span class="hljs-string">'Face'</span>, frame)
    cv2.waitKey(<span class="hljs-number">1</span>)
cap.release()
</code></pre><h3 id="heading-2-hand-tracking-package"><strong>2. Hand Tracking Package</strong></h3>
<p><a target="_blank" href="https://pypi.org/project/lkhandmapping/">Try it out</a></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1663263862149/wkdarJJIr.png" alt="Screenshot (995).png" /></p>
<h1 id="heading-function">Function</h1>
<p><code>handTracker(image,draw=True)</code></p>
<ul>
<li><strong>Function Parameter</strong> : This function takes <code>image</code> (a single frame) as input and a variable <code>draw</code> with default value <em>True</em>. Set <code>draw</code> to <em>False</em> if you do not want the hand mapping drawn over the <code>image</code> frame.</li>
<li><strong>Output</strong> : This function returns a nested list of length 2. The element at index 0 is the <code>frame</code> and the element at index 1 is a list of <code>handLandmarks</code>. Know more about these <code>handLandmarks</code> on this <a target="_blank" href="https://google.github.io/mediapipe/solutions/hands.html#hand-landmark-model">link</a>. NOTE that the <em>function will return [0] at index 1 in the list if no hands are detected</em>.</li>
</ul>
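<p>MediaPipe's hand landmarks are normalized to the range [0, 1] relative to the frame size, so if the returned <code>handLandmarks</code> are still in that normalized form (an assumption here; see the link above), converting them to pixel coordinates is a simple scale-and-truncate:</p>

```python
# Convert normalized (x, y) landmark pairs to integer pixel coordinates.
def to_pixels(landmarks, frame_width, frame_height):
    return [(int(x * frame_width), int(y * frame_height)) for x, y in landmarks]

# Made-up normalized landmarks on a 640x480 frame.
print(to_pixels([(0.5, 0.5), (0.25, 0.1)], 640, 480))  # [(320, 240), (160, 48)]
```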
<h1 id="heading-usage">Usage</h1>
<blockquote>
<p>handTracker(image,draw=False)</p>
</blockquote>
<p>Mapping over the hands directly from function.</p>
<hr />
<pre><code>from lkhandtracking import handTracker
import cv2

cap = cv2.VideoCapture(0)
while True:
    success, image = cap.read()
    functionValues = handTracker(image, draw=True)
    image = functionValues[0]
    handLms = functionValues[1]
    print(handLms)
    cv2.imshow('Hands', image)
    cv2.waitKey(1)
cap.release()
</code></pre><p>Track hands without mapping.</p>
<hr />
<pre><code>from lkhandtracking import handTracker
import cv2

cap = cv2.VideoCapture(0)
while True:
    success, image = cap.read()
    functionValues = handTracker(image, draw=False)
    image = functionValues[0]
    handLms = functionValues[1]
    print(handLms)
    cv2.imshow('Hands', image)
    cv2.waitKey(1)
cap.release()
</code></pre><h3 id="heading-3-body-segmentation"><strong>3. Body Segmentation</strong></h3>
<p><a target="_blank" href="https://pypi.org/project/lkbodysegmentation/">Try it out</a></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1663263919655/wAzKYrXX2.png" alt="Screenshot (996).png" /></p>
<h1 id="heading-function">Function</h1>
<p><code>bodySegmentation(orignalImg,backgroundImg=(255,255,255),threshold=0.3)</code></p>
<ul>
<li><strong>Function Parameter</strong> : This function takes <code>orignalImg</code>, i.e. the image on which the segmentation has to happen. The parameter <code>backgroundImg</code> sets the background colour (white by default, but it can be changed as per the user's need), and <code>threshold</code> defines the level at which the background is removed.</li>
<li><strong>Output</strong> : The function returns a frame as output in which the background is segmented out.</li>
</ul>
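<p>Under the hood, selfie-segmentation models typically produce a per-pixel probability mask, and the threshold decides which pixels count as "person". How <code>lkbodysegmentation</code> applies it internally is an assumption here, but the idea can be sketched with NumPy on a tiny made-up mask:</p>

```python
import numpy as np

# Made-up 2x2 mask: probability that each pixel belongs to the person.
mask = np.array([[0.9, 0.1],
                 [0.5, 0.2]])
threshold = 0.3

frame = np.full((2, 2, 3), 200, dtype=np.uint8)   # stand-in camera frame
background = np.zeros((2, 2, 3), dtype=np.uint8)  # stand-in background (black)

# Keep frame pixels where mask > threshold, background pixels elsewhere.
condition = (mask > threshold)[..., None]         # broadcast over the 3 channels
out = np.where(condition, frame, background)
print(out[:, :, 0])  # left column stays (200), right column is erased (0)
```

<p>Raising the threshold erases more of the scene (only confident "person" pixels survive), which is why the customised example below passes 0.45 instead of the 0.3 default.</p>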
<h1 id="heading-usage">Usage</h1>
<blockquote>
<p>bodySegmentation(img)</p>
</blockquote>
<p>Default Segmentation (White Background)</p>
<hr />
<pre><code>from lkbodysegmentation import bodySegmentation
import cv2

cap = cv2.VideoCapture(0)


while True:
    success, img = cap.read()
    img = bodySegmentation(img)
    cv2.imshow('Image', img)
    cv2.waitKey(1)
</code></pre><blockquote>
<p>bodySegmentation(img, backgroundColor, threshold)</p>
</blockquote>
<p>Customized Segmentation</p>
<hr />
<pre><code>from lkbodysegmentation import bodySegmentation
import cv2

cap = cv2.VideoCapture(0)


while True:
    success, img = cap.read()
    backgroundColor = (255, 0, 255)  # You can replace this with an image too
    threshold = 0.45                 # Level of background to be erased
    img = bodySegmentation(img, backgroundColor, threshold)
    cv2.imshow('Image', img)
    cv2.waitKey(1)
</code></pre><h1 id="heading-developer">Developer</h1>
<p>This package is developed by <a target="_blank" href="https://www.lakshaykumar.tech/">Lakshay Kumar</a>, an enthusiastic AI researcher. It was developed keeping in mind the pain of writing lengthy code just to detect faces, enabling other developers to focus more on the implementation rather than spending time coding the face detection module from scratch.<br />
Feel free to share your feedback via <a target="_blank" href="mailto:contact@lakshaykumar.tech">mail</a>.</p>
]]></content:encoded></item></channel></rss>