Control a drone using finger gestures

MediaPipe and the drone

MediaPipe has a lot of amazing computer vision modules. Among them, my personal favourite is the Hand Tracking module. You can do a lot of customization with this module and play around with it; it's genuinely fun. So in this blog, let's see how we can control a drone using a finger count. Here are the gestures and the commands I am going to map to the drone -

Zero Fingers (Fist) : Land the drone
One Finger up : Move Forward
Two Fingers up : Move Backward
Three Fingers up : Move Left
Four Fingers up : Move Right
Five Fingers up : Take off
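For quick reference, that gesture table can be written as a small lookup. This is just a sketch; the dictionary name is my own, purely for illustration:

```python
# Hypothetical mapping of finger count to drone action (names are illustrative)
GESTURE_ACTIONS = {
    0: "land",           # fist
    1: "move forward",
    2: "move backward",
    3: "move left",
    4: "move right",
    5: "take off",
}

print(GESTURE_ACTIONS[0])  # land
```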

So let's start

First, we'll import all the required libraries: opencv-python, mediapipe, and djitellopy.

import cv2
import mediapipe as mp
from djitellopy import tello

Next, we create the constructors for the MediaPipe drawing and hand tracking modules. We track only one hand for the gestures, since multiple hands can affect the accuracy of the output.

mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_hands = mp.solutions.hands
hands = mp_hands.Hands(model_complexity=0,
                       min_detection_confidence=0.5,
                       min_tracking_confidence=0.5,
                       max_num_hands=1)

Then we open the webcam in the cap variable and set a suitable width and height for the frame.

cap = cv2.VideoCapture(0)
width = 720
height = 280
cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)

Now it's time to connect to our DJI Tello drone and use its camera. Make sure your drone is switched on and your system is connected to its Wi-Fi. Before this, kindly install the djitellopy library:

pip install djitellopy

me = tello.Tello()
me.connect()
me.streamoff()
me.streamon()
isDroneFlying = False  # Initialising the variable to check if the drone is flying or not

Next, we define a function that takes a frame as input and counts the number of fingers up in that frame, if a hand is detected. Here is an image of the hand landmark numbering:

(Image: MediaPipe hand landmark diagram, points 0-20)

For each point, we have x and y values on the hand. Now here's a question: how will you say that all the fingers are closed in a fist? What's the condition or criterion? A fingertip must be below the joint two landmarks beneath it (the middle joint of the finger). For example, for the index finger, the 8th point must be below the 6th point. Since image y-coordinates increase downwards, if the y coordinate of the 8th point is greater than the y coordinate of the 6th point, we can say that the index finger is closed. The thumb moves sideways rather than vertically, so for it we compare the x coordinates of points 4 and 3 instead.

Applying the same logic to all the fingers gives us the finger count, and we can define our actions accordingly. Here is the function; kindly drop your doubts in the comments if you think something isn't working.

def droneGestureController(image):
    global isDroneFlying
    image.flags.writeable = False
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = hands.process(image)
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    if results.multi_hand_landmarks:

        for hand_landmarks in results.multi_hand_landmarks:
            mp_drawing.draw_landmarks(image, hand_landmarks, mp_hands.HAND_CONNECTIONS,
                                      mp_drawing_styles.get_default_hand_landmarks_style(),
                                      mp_drawing_styles.get_default_hand_connections_style())
            # Convert the normalised landmarks to pixel coordinates
            handlms = []
            height, width, fc = image.shape
            for c, lm in enumerate(hand_landmarks.landmark):
                handlms.append([c, int(lm.x * width), int(lm.y * height)])

            totalFingers = 0

            if len(handlms) != 0:
                # Thumb: tip (4) to the right of the joint below it (3)
                if handlms[4][1] > handlms[3][1]:
                    totalFingers += 1

                # Other fingers: tip above the joint two points below it
                fingerTips = [8, 12, 16, 20]
                for i in fingerTips:
                    if handlms[i][2] < handlms[i - 2][2]:
                        totalFingers += 1

            droneAction = ""

            if totalFingers == 0:
                droneAction = "Land"
                if isDroneFlying:
                    me.land()
                    isDroneFlying = False

            elif totalFingers == 1:
                droneAction = "Move forward"
                me.send_rc_control(0, 30, 0, 0)

            elif totalFingers == 2:
                droneAction = "Move backward"
                me.send_rc_control(0, -30, 0, 0)

            elif totalFingers == 3:
                droneAction = "Left"
                me.send_rc_control(-30, 0, 0, 0)

            elif totalFingers == 4:
                droneAction = "Right"
                me.send_rc_control(30, 0, 0, 0)

            elif totalFingers == 5:
                droneAction = "Takeoff"
                if not isDroneFlying:
                    me.takeoff()
                    me.send_rc_control(0, 0, 50, 0)
                    isDroneFlying = True

            else:
                droneAction = "No Action"

            cv2.putText(image, droneAction + " " + str(totalFingers), (10, 25),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0), 2, cv2.LINE_AA)
            return [image, handlms]
    return [image, [0]]
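To make the counting rule concrete, here's a minimal, standalone sketch of the same logic on synthetic [id, x, y] landmark lists. The coordinate values below are made up purely for illustration; they are not real MediaPipe output:

```python
def count_fingers(lms):
    """Count raised fingers from a list of [id, x, y] landmarks
    (MediaPipe hand landmark numbering, 21 points)."""
    count = 0
    # Thumb: tip (4) to the right of the joint below it (3) -> open
    if lms[4][1] > lms[3][1]:
        count += 1
    # Other fingers: tip above (smaller y than) the joint two points below
    for tip in (8, 12, 16, 20):
        if lms[tip][2] < lms[tip - 2][2]:
            count += 1
    return count

# Synthetic landmarks: 21 points with made-up coordinates
fist = [[i, 0, 100] for i in range(21)]
open_hand = [[i, 0, 100] for i in range(21)]
for tip in (8, 12, 16, 20):
    open_hand[tip][2] = 50   # tips above their middle joints
open_hand[4][1] = 120        # thumb tip right of joint 3
open_hand[3][1] = 80

print(count_fingers(fist))       # 0
print(count_fingers(open_hand))  # 5
```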

About me.send_rc_control(0, 0, 0, 0): it is a built-in function from the djitellopy library that controls the drone's movement after takeoff. Here is how it accepts its input: send_rc_control(self, left_right_velocity: int, forward_backward_velocity: int, up_down_velocity: int, yaw_velocity: int). The term left_right_velocity defines the speed at which the drone moves to the right; a negative value moves it to the left. Each velocity is an integer from -100 to 100, interpreted as a fraction of the drone's maximum speed, so 30 means "move at 30% speed" rather than a distance in cm.
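Since the mapping from finger count to velocities is pure logic, it can be factored into a small helper that just returns the (left_right, forward_backward, up_down, yaw) tuple. This is only a sketch; the function name and the default speed of 30 are my own choices for illustration:

```python
def rc_for_gesture(total_fingers, speed=30):
    """Return (left_right, forward_backward, up_down, yaw) velocities
    for a given finger count; each value must stay within -100..100."""
    if total_fingers == 1:
        return (0, speed, 0, 0)    # move forward
    if total_fingers == 2:
        return (0, -speed, 0, 0)   # move backward
    if total_fingers == 3:
        return (-speed, 0, 0, 0)   # move left
    if total_fingers == 4:
        return (speed, 0, 0, 0)    # move right
    return (0, 0, 0, 0)            # hover (takeoff/land handled separately)

print(rc_for_gesture(1))  # (0, 30, 0, 0)
print(rc_for_gesture(3))  # (-30, 0, 0, 0)
```

The returned tuple could then be unpacked into the call, e.g. me.send_rc_control(*rc_for_gesture(totalFingers)).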

At the end, we print the action and the finger count on the frame and return the values. Here's the complete code for the same.

#YOUTUBE LINK : https://www.youtube.com/shorts/SuQzK4p_Mnw


import cv2
import mediapipe as mp
from djitellopy import tello


mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_hands = mp.solutions.hands
hands = mp_hands.Hands(model_complexity=0,
                       min_detection_confidence=0.5,
                       min_tracking_confidence=0.5,
                       max_num_hands=1)
cap = cv2.VideoCapture(0)
width = 720
height = 280
cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
me = tello.Tello()
me.connect()
me.streamoff()
me.streamon()
isDroneFlying = False  # Initialising the variable to check if the drone is flying or not

def droneGestureController(image):
    global isDroneFlying
    image.flags.writeable = False
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = hands.process(image)
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    if results.multi_hand_landmarks:

        for hand_landmarks in results.multi_hand_landmarks:
            mp_drawing.draw_landmarks(image, hand_landmarks, mp_hands.HAND_CONNECTIONS,
                                      mp_drawing_styles.get_default_hand_landmarks_style(),
                                      mp_drawing_styles.get_default_hand_connections_style())
            # Convert the normalised landmarks to pixel coordinates
            handlms = []
            height, width, fc = image.shape
            for c, lm in enumerate(hand_landmarks.landmark):
                handlms.append([c, int(lm.x * width), int(lm.y * height)])

            totalFingers = 0

            if len(handlms) != 0:
                # Thumb: tip (4) to the right of the joint below it (3)
                if handlms[4][1] > handlms[3][1]:
                    totalFingers += 1

                # Other fingers: tip above the joint two points below it
                fingerTips = [8, 12, 16, 20]
                for i in fingerTips:
                    if handlms[i][2] < handlms[i - 2][2]:
                        totalFingers += 1

            droneAction = ""

            if totalFingers == 0:
                droneAction = "Land"
                if isDroneFlying:
                    me.land()
                    isDroneFlying = False

            elif totalFingers == 1:
                droneAction = "Move forward"
                me.send_rc_control(0, 30, 0, 0)

            elif totalFingers == 2:
                droneAction = "Move backward"
                me.send_rc_control(0, -30, 0, 0)

            elif totalFingers == 3:
                droneAction = "Left"
                me.send_rc_control(-30, 0, 0, 0)

            elif totalFingers == 4:
                droneAction = "Right"
                me.send_rc_control(30, 0, 0, 0)

            elif totalFingers == 5:
                droneAction = "Takeoff"
                if not isDroneFlying:
                    me.takeoff()
                    me.send_rc_control(0, 0, 50, 0)
                    isDroneFlying = True

            else:
                droneAction = "No Action"

            cv2.putText(image, droneAction + " " + str(totalFingers), (10, 25),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0), 2, cv2.LINE_AA)
            return [image, handlms]
    return [image, [0]]

while True:
    try:
        success, image = cap.read()
        if not success:
            continue
        droneImage = me.get_frame_read().frame
        droneImage = cv2.resize(droneImage, (360, 240))
        image = droneGestureController(image)[0]
        cv2.imshow('YourPC', image)
        cv2.imshow('Drone', droneImage)
        k = cv2.waitKey(1) & 0xFF
        if k == 27:  # Esc key exits the loop
            cv2.destroyAllWindows()
            break
    except Exception:
        continue
cap.release()

For any queries, feel free to type in the comments or connect with me through mail.

Here's the video tutorial for the same. Video credits: Aryan Bakle