Annual: 2019

AS016 »
The Sliding Desktop Robot
📁Machine Learning
👤Jose Francisco Sanchez Rosales
 (Independent Consultant)
📅Jun 17, 2019
Regional Final





The Sliding Desktop Robot is a modified, PLA 3D-printed CNC platform able to perform complex tasks over a wide working area using deep neural networks deployed on an FPGA. To train the model, a customized environment is created to adjust the precision of the movements. The robot achieves autonomy through visual recognition of its environment with a camera. The Intel OpenVINO toolkit will be used to process the TensorFlow models resulting from training two different kinds of neural networks: classification for visual recognition and reinforcement learning for the robot's automatic movements.
Python is used for the final inference simulation, so all tasks can be simulated without any peripherals at the development stage, before production. For demonstration purposes, some tasks will be implemented, such as a bakery decorator and the classic pick and place, but with shape recognition added.

Project Proposal

1. High-level Project Description

Year by year, automatic electromechanical systems (robots) have evolved from basic structures to complex designs, and in recent years, since neural networks began to be used for AI, the robot industry has taken a big leap forward. The purpose of this work is to help understand how to apply this technique to a simple, non-conventional robot, using an FPGA for inference through the Intel OpenVINO starter kit. We will create our own neural network and apply deep learning to train the robot in the desired tasks. The current prototype is based on a modified version of the MPCNC from site, licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. It does not work as a CNC; instead it works as a 3-axis, 1-grip general-purpose desktop rail robot driven by an 8-bit microcontroller with a fixed program where the positions for movements are defined at compilation time, because the math involved in 3D space requires intensive computation. FPGAs fit this kind of scenario well because their parallelism accelerates tasks, and the Intel OpenVINO toolkit allows us to use AI for the new control approach, letting the robot make decisions in real time.

The X, Y and Z axes are driven by 5 stepper motors. One servo motor operates the hand grip, and an optical fiber detector installed below it indicates the presence of an object. The main structure of the robot is made of PLA 3D-printed parts.

Because the stepper motors in the prototype have no encoders, they are controlled by a fixed program in a microcontroller (Arduino) where the positions for movements are defined at compilation time. The classic 'pick and place' can be performed easily, while round movements are a considerable math challenge, and everything must be worked out before compilation. We use AI to improve this behavior by applying reinforcement learning in continuous action spaces. To complete the scenario, we will place a camera above the robot so we can identify the objects on the table.

When we talk about reinforcement learning and AI, the first place that comes to mind is gym. It is actually the best starting point for this kind of task, but we have a small inconvenience: there is no environment that matches our robot, so we have to create a new, customized environment.

There are many ways to do this, but we chose a fast and easy one, using Python and the pyglet library. We build the environment by emulating the movements of our robot and establishing an equivalence between pixels and steps, so that the trained model stays accurate on the real machine. If our motors had encoders, we could train directly on the real robot and derive the states from real actions, as big companies do, but this is not possible in our case.
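The pixel-to-step equivalence and the emulated movement described above could look roughly like the following minimal sketch. It is not the project's actual environment: the calibration constants, the class name SlidingRobotEnv, the goal position and the reward shape are all assumptions made for illustration, and the pyglet rendering is left out so the logic can run without a window.

```python
import math
from dataclasses import dataclass

# Hypothetical calibration constants; the real values depend on the
# motors, belts and window scale, none of which the proposal specifies.
STEPS_PER_MM = 80.0
PIXELS_PER_MM = 2.0
STEPS_PER_PIXEL = STEPS_PER_MM / PIXELS_PER_MM  # 40 steps per pixel


def pixels_to_steps(pixels):
    """Convert a simulated displacement in pixels to whole motor steps."""
    return round(pixels * STEPS_PER_PIXEL)


def steps_to_pixels(steps):
    """Convert motor steps back to simulated pixels."""
    return steps / STEPS_PER_PIXEL


@dataclass
class SlidingRobotEnv:
    """Gym-style 2D environment (reset/step) without any rendering."""
    width: float = 400.0
    height: float = 400.0
    max_move: float = 5.0            # largest move per step, in pixels
    goal: tuple = (300.0, 200.0)     # assumed target position, in pixels
    goal_radius: float = 3.0
    x: float = 0.0
    y: float = 0.0

    def reset(self):
        self.x, self.y = 0.0, 0.0
        return self._state()

    def _state(self):
        # Normalised position plus normalised offset to the goal.
        return (self.x / self.width, self.y / self.height,
                (self.goal[0] - self.x) / self.width,
                (self.goal[1] - self.y) / self.height)

    def step(self, action):
        # Continuous action in [-1, 1] per axis, scaled to pixels.
        dx = max(-1.0, min(1.0, action[0])) * self.max_move
        dy = max(-1.0, min(1.0, action[1])) * self.max_move
        # Keep the carriage inside the working area.
        self.x = min(max(self.x + dx, 0.0), self.width)
        self.y = min(max(self.y + dy, 0.0), self.height)
        distance = math.hypot(self.goal[0] - self.x, self.goal[1] - self.y)
        done = distance <= self.goal_radius
        reward = -distance / self.width + (1.0 if done else 0.0)
        return self._state(), reward, done, {}
```

Because every simulated move is expressed in pixels, the same trajectory can be replayed on the real motors by passing each displacement through pixels_to_steps.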

2. Block Diagram

Training Process:

To train the model we use the Python programming language. As mentioned previously, we need to build our own environment where the robot moves and acts. Then we choose the appropriate learning algorithm, in this case Deep Deterministic Policy Gradients (DDPG). For this demo, we build only one set of actions to perform: a set of movements around a circular surface, such as a cake, to be used as a decorator for a bakery.
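As an illustration of the circular task, the sketch below generates evenly spaced target points around a circle and scores how closely the tool follows one of them. The function names, the waypoint approach and the distance-based reward are assumptions for this example only; the proposal does not describe how its DDPG reward is defined.

```python
import math


def circle_waypoints(cx, cy, radius, count):
    """Return `count` evenly spaced (x, y) points on a circle, in pixels."""
    if count < 1:
        raise ValueError("count must be at least 1")
    points = []
    for i in range(count):
        angle = 2.0 * math.pi * i / count
        points.append((cx + radius * math.cos(angle),
                       cy + radius * math.sin(angle)))
    return points


def tracking_reward(position, target):
    """Negative Euclidean distance: closer to the target scores higher."""
    return -math.hypot(position[0] - target[0], position[1] - target[1])
```

A continuous-action agent such as DDPG could then be rewarded at each step for approaching the current waypoint, advancing to the next one once it is reached.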


In a Python environment with Keras and TensorFlow we build the learning process to obtain the model. We also build a simple object classifier for the camera and train that model as well. After the training process, run on a GPU, is finished and we have the model from TensorFlow, we need to convert it to the correct format for use with the Intel OpenVINO toolkit.


We use the Intel OpenVINO toolkit to process the model and use the result to build the app that drives the robot, so it does the job without human intervention, just detecting the object on the table and running the process we coded.

Then, we use the OpenVINO FPGA starter kit board to run inference on the images, but without displaying them. We don't need the images themselves, only the recognition of the shape viewed from the top, so we can avoid using OpenCV.
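Since only the recognized shape is needed, the classifier's raw output can be reduced to a label without touching the image again. The sketch below shows one possible post-processing step; the label list, the confidence threshold and the function names are hypothetical, and the call that actually runs the network on the FPGA is deliberately left out.

```python
import math

# Hypothetical class order; the real labels depend on the training set.
SHAPE_LABELS = ("circle", "square", "triangle")


def softmax(scores):
    """Turn raw network scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shifted for stability
    total = sum(exps)
    return [e / total for e in exps]


def detect_shape(scores, threshold=0.6):
    """Return the most likely shape label, or None if confidence is low."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None
    return SHAPE_LABELS[best]
```

Returning None for low-confidence results lets the control app wait for a clearer frame instead of starting a task on a doubtful detection.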

Finally, through a USB connection we send the commands to the IO board that controls the robot's motors.
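The proposal does not specify the command format used over USB, so the following is only a sketch under an assumed protocol: one ASCII line per move, of the form "M dx dy dz grip", with displacements in motor steps. Both the format and the function names are hypothetical.

```python
def encode_move(dx_steps, dy_steps, dz_steps, grip_closed):
    """Build one command line for the IO board (assumed text protocol)."""
    line = "M {} {} {} {}\n".format(
        int(dx_steps), int(dy_steps), int(dz_steps), 1 if grip_closed else 0)
    return line.encode("ascii")


def decode_move(raw):
    """Parse a command line back into (dx, dy, dz, grip_closed)."""
    parts = raw.decode("ascii").split()
    if len(parts) != 5 or parts[0] != "M":
        raise ValueError("not a move command")
    dx, dy, dz, grip = (int(p) for p in parts[1:])
    return dx, dy, dz, bool(grip)


# Sending the bytes is left out so the sketch runs without hardware.
# With the third-party pyserial package it could be done roughly as:
#   import serial
#   with serial.Serial("/dev/ttyUSB0", 115200) as port:  # assumed port/baud
#       port.write(encode_move(400, -120, 0, True))
```

Keeping the encoding separate from the serial port makes it possible to check commands in the Python simulation before any motor moves.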

3. Intel FPGA virtues in Your Project

Easy training process:

Through a customized interface, the robot can be trained for multipurpose tasks without using the physical robot; everything is done in a controlled software environment.

Complex Multitasking:

Thanks to the neural networks running on the FPGA, the robot can perform many complex tasks, such as object recognition, with precision.

Autonomous Behavior:

With shape classification through the FPGA we make sure the robot will do the required task without human intervention.


Many tasks can be uploaded to the main app, so we don't need to redo the math over and over; the training process alone can grow the capabilities of the robot.

OpenCV could be integrated in the resulting application for future visual impact.

Real time shape and object detection:

An FPGA is well suited to real-time object recognition because its parallelism accelerates inference.

Software simulation allowed:

The system is designed so that inference can also be run from a Python application, allowing us to perform the simulation entirely in a controlled Python software environment using our customized training environment.

4. Design Introduction



5. Function Description

6. Performance Parameters

7. Design Architecture


Mandy Lei
Looking forward to your final work!
🕒 Jun 27, 2019 08:52 PM
