Annual: 2019

EM018 »
Mushroom Recognition
📁Machine Learning
👤David Castells
 (Universitat Autonoma de Barcelona)
📅Oct 06, 2019
Regional Final





Description

Collecting mushrooms for human consumption is a very popular activity in Catalonia. It is so popular that the government often discusses methods to control access to the forests of the country, including tolls. There are hundreds of different mushroom species; some are highly appreciated, while others are poisonous and can even cause death. There are a number of fatalities every year because of mushroom intoxication.
The exact identification of mushroom species is an important challenge that can save lives. The aim of this project is to build a machine learning system to identify each mushroom species from photos taken with mobile phones.

Demo Video

  • URL: https://youtu.be/sa9eC2T6l8I

  • Project Proposal

    1. High-level Project Description

    Mushroom and fungi gathering is a popular activity around the world. There are thousands of fungi species and some of them closely resemble each other. It takes an expert to identify and classify species, and a neophyte usually runs some risks when identifying collected fungi.

    [Image: Amanita flavorubens]

    Traditionally, classifying objects from images required thorough work on the problem and building complicated hand-crafted algorithms. For complex classification tasks such as mushroom identification, with high color, shape and context diversity, it was almost impossible. With the advent of Convolutional Neural Networks, building applications for complex classification became possible.

    The problem in certain applications is the need to perform classification without a connection to a server. This makes it necessary to port heavy CNN structures to edge devices. FPGAs stand out as a well-suited and efficient hardware solution to this problem.

    Nevertheless, CNNs are complicated, heavy networks which require specific frameworks for building and deploying them. This, together with traditionally complicated FPGA programming and design, made porting such structures to FPGA hardware difficult. Recently, however, dedicated software for porting neural networks to FPGAs has appeared. Among these, the Intel OpenVINO platform stands out as a solution for this task.

    For this reason, the CNN implemented will vary depending on the memory available and on the accuracy attained. A priori, the neural network used will be MobileNetV2. However, as the capacity of this network might not be sufficient, further experiments with other networks, such as a quantized Inception, are considered.

    The main problem when training NNs is the quantity and quality of the data. Our fungi dataset consists of around 300k labelled images. With this dataset, previous experiments with MobileNetV2 have shown a top accuracy of 40%. However, we expect to improve these results by adding more data, adding preprocessing and choosing the right network.

    Our intention is to build a fungi classifier which is efficient, well-performing and portable. This will solve the problem mushroom pickers face when identifying dubious species in the field, and help spread knowledge of this field, which still requires expertise.

     

    2. Block Diagram

    The pipeline will consist of:

    • Fungi image from a camera
    • FPGA-mapped CNN feature extraction, processing and classification
    • Display of the top-5 species probabilities

    3. Intel FPGA Virtues in Your Project

    The two outstanding capabilities of the OpenVINO Starter Kit and FPGAs are:

    • Flexibility
    • Improved performance

    Flexibility

    Mapping CNNs, or any other kind of neural network, to hardware is not an easy task. For this reason, a variety of frameworks have emerged over the last years: CMSIS-NN, Arm NN or TensorFlow Lite for Microcontrollers, among others. In all cases, two important points stand out: the level of human interaction needed to port the NNs, and the hardware to which those frameworks can port them.

    In the case of FPGAs, OpenVINO offers a simple, flexible and effective framework which allows porting different CNNs without varying the API calls. This flexibility allows for fast prototyping in which different nets can be tested.

    Improved performance

    Traditionally, the target platforms for neural networks were GPUs. However, with the advent of the previously named frameworks, those networks can now be ported easily to resource-constrained platforms. But what happens if more power is needed? GPUs are not suitable in terms of energy efficiency, since they have high energy profiles, while FPGAs offer a good trade-off between fast, powerful computation and a lower energy profile.

     

     

    4. Design Introduction

    The purpose of the design is to be able to carry out fungi classification through CNNs in an efficient and fast manner. For this purpose, Intel FPGAs together with OpenVINO offer a perfect solution.

    The application is devoted to fungi classification, and the targeted users are expert or novice fungi gatherers who want an in-field double check or a first identification of a fungus.

    Usually, GPUs are used for highly complex classification tasks. However, this hardware target has a high energy profile. For this reason, if a server were to be used to carry out fungi classification for images arriving over the internet, more efficient hardware would be desired. In this case, FPGAs offer a good trade-off between energy profile and inference speed.

    5. Function Description

    Fungi gathering is a popular activity in many countries. However, it is not an easy task, since around 75,000 identified fungi species exist. For this reason, a mechanism that can identify mushrooms or fungi when the seeker is doubtful is helpful and can help avoid risky situations.

    The main process consists of the following steps:

    • The fungi seeker is unsure about a specific fungus and takes a photo.
    • The photo is run through a classifier CNN embedded in the FPGA, and the identification of the fungus is obtained.

    An improvement over this process would be to implement a web API on the computer to which the FPGA is connected. With this implementation, anyone with access could send a photo from their mobile phone and retrieve a result while standing in front of the mushroom.
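
    As a rough illustration of this idea, a minimal web endpoint could look like the sketch below. It assumes a Flask server running on the host PC and a hypothetical classify_image() helper that wraps the FPGA inference; neither is part of the current implementation.

    from flask import Flask, request, jsonify

    app = Flask(__name__)

    @app.route("/classify", methods=["POST"])
    def classify():
        # The phone uploads its photo as a multipart form field named "image" (assumed convention)
        image_bytes = request.files["image"].read()
        # classify_image() is a hypothetical wrapper around the FPGA-accelerated OpenVINO inference
        top5 = classify_image(image_bytes)
        return jsonify(top5)

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8080)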

    To implement the simplest case, several components are needed: the dataset, the neural networks and the ONNX converter, the OpenVINO converters, and the OpenVINO Starter Kit FPGA board.

    The dataset is the 2018 FGVCx Fungi Classification Challenge dataset (https://www.kaggle.com/c/fungi-challenge-fgvc-2018/data). It contains around 100,000 fungi images covering around 1,500 different species. It is a challenging dataset due to the variability among fungi and the surroundings in which they are found. Another difficulty is the low number of images per species, which prevents robust training for some of them. The dataset is divided into training, validation and test sets following the percentages 60%, 20% and 20% respectively.
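
    A minimal sketch of such a 60/20/20 split is shown below. It assumes the images have been arranged into a torchvision ImageFolder layout under a hypothetical fungi_images/ directory; the actual challenge data ships with its own annotation files.

    import torch
    from torch.utils.data import random_split
    from torchvision import datasets, transforms

    # Basic preprocessing: resize to the input size expected by the chosen CNNs
    preprocess = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ])

    dataset = datasets.ImageFolder("fungi_images", transform=preprocess)  # hypothetical path
    n = len(dataset)
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    train_set, val_set, test_set = random_split(
        dataset, [n_train, n_val, n - n_train - n_val],
        generator=torch.Generator().manual_seed(42))  # fixed seed for a reproducible split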

    For the neural networks, PyTorch is used as the framework. Four different networks with different numbers of parameters and components are used: ResNet18, DenseNet121, SqueezeNet and MobileNetV2. The networks are fine-tuned choosing which layers are frozen; usually 90% of the layers are frozen. Once the nets are trained, they are converted to the ONNX format.

    Once they are in the ONNX format, they can be converted to the OpenVINO .bin and .xml intermediate representations with the appropriate converters in the OpenVINO suite. The network can then be loaded for inference on the FPGA with the appropriate Inference Engine calls (e.g. the InferenceEngine::CNNNetReader class). For this, the classification demo is used, substituting the converted models.
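
    For reference, loading the converted IR and running one inference from Python could look roughly like the sketch below. The file names are assumptions, and the exact API names may differ between OpenVINO releases.

    import numpy as np
    from openvino.inference_engine import IENetwork, IECore

    ie = IECore()
    # Intermediate Representation produced by the Model Optimizer (file names are assumed)
    net = IENetwork(model="mobilenet_fungi.xml", weights="mobilenet_fungi.bin")
    exec_net = ie.load_network(network=net, device_name="HETERO:FPGA,CPU")

    input_blob = next(iter(net.inputs))
    output_blob = next(iter(net.outputs))

    # image: a preprocessed array shaped like the network input, e.g. (1, 3, 224, 224)
    image = np.zeros(net.inputs[input_blob].shape, dtype=np.float32)
    result = exec_net.infer(inputs={input_blob: image})

    probs = result[output_blob].squeeze()
    top5 = probs.argsort()[-5:][::-1]  # indices of the five most likely species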

     

     

    6. Performance Parameters

    There will be three main performance parameters:

    • Network size
    • Accuracy
    • Latency

    For the system to work well, the network has to be as small as possible, to give low latency and to be portable to memory-constrained devices, while also achieving an accuracy good enough not to mislead mushroom seekers.

    The results obtained are the following:

    Performance and inference speed results

    Metric                                  MobileNetV2   DenseNet121   ResNet18     SqueezeNet
    Test accuracy                           35.93%        38.53%        30.13%       31.83%
    Number of parameters                    2,581,732     8,040,356     11,720,292   1,279,204
    Inference speed (fps), GTX 1080         344           285           370          370
    Inference speed (fps), Hetero FPGA/CPU  45            9             25           75

    The results show the differences in efficiency among the networks. ResNet18 is the biggest, and it incorporates residual connections between consecutive convolutional blocks. Next is DenseNet121, which is a little smaller. Its main characteristic is the dense connectivity among layers: within a block, each layer is connected to all the others.

    In contrast to these two networks, MobileNet and SqueezeNet stand out for their efficient use of resources. The depthwise convolutions and bottleneck layers in the first case, and the fire modules and efficient use of convolutions in the second, make these two networks more efficient. This can be seen in the number of parameters and the accuracy attained on the test set.

    7. Design Architecture

    The main software flow is the following:

    • PyTorch training of the chosen CNN with the fungi dataset
    • Porting the network to a format suited for FPGA deployment with OpenVINO
    • Inference on the FPGA

    In more detail, the software flow is the following:

    The networks used are trained and fine-tuned in PyTorch. They come from the torchvision model zoo. Adapting them to the desired dataset only requires one architectural change: the output size of the classification layer. The code is simple; in the case of MobileNet it is:

    import torchvision
    import torch.nn as nn

    # Load an ImageNet-pretrained MobileNetV2 from the torchvision model zoo
    model_conv = torchvision.models.mobilenet_v2(pretrained=True)

    # Replace the final classification layer so its output matches the number of fungi classes
    num_ftrs = model_conv.classifier[1].in_features
    model_conv.classifier[1] = nn.Linear(num_ftrs, classes)

    The fraction of parameters that are frozen can be defined by disabling gradient tracking. For this, PyTorch offers a simple interface:

    # Freeze the first percentage_frozen fraction of the parameter tensors
    total_param = round(percentage_frozen * len(list(model_conv.parameters())))

    for param in list(model_conv.parameters())[:total_param]:
        param.requires_grad = False  # frozen parameters receive no gradient updates

    where percentage_frozen is the fraction of parameter tensors that are frozen.
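
    During fine-tuning, only the remaining trainable parameters are then passed to the optimizer. A minimal sketch of this setup, with hypothetical hyperparameter values, is:

    import torch.nn as nn
    import torch.optim as optim

    # Only parameters that still require gradients (the unfrozen ones) are updated
    trainable_params = [p for p in model_conv.parameters() if p.requires_grad]

    # Hypothetical fine-tuning hyperparameters; the actual values are not specified in the project
    optimizer = optim.SGD(trainable_params, lr=0.001, momentum=0.9)
    criterion = nn.CrossEntropyLoss()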

    When the network is trained, conversion to ONNX is mandatory in order to be able to pass the networks to the FPGA toolchain. The PyTorch networks can be easily exported to ONNX with the torch.onnx.export function. In order to convert the model, a sample batch or image needs to be run through the network, since the export traces the model.
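
    A minimal sketch of this export step, assuming a 224x224 input and an output file name chosen only for illustration, is:

    import torch

    model_conv.eval()  # export in inference mode

    # Dummy input used only to trace the network during export (size is an assumption)
    dummy_input = torch.randn(1, 3, 224, 224)
    torch.onnx.export(model_conv, dummy_input, "mobilenet_fungi.onnx")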

    Finally, to convert the ONNX model to the OpenVINO Intermediate Representation, the Model Optimizer script is run: python3 mo.py --input_model <model>.onnx. The resulting .bin and .xml files can then be used with the OpenVINO suite.

    To implement the demo, we use the classification demo and substitute the image and the IR:

    /my_classification_sample -i \
    /opt/intel/2019_r1/openvino/deployment_tools/terasic_demo/demo/pic_video/car.png \
    -m /opt/intel/2019_r1/openvino/deployment_tools/terasic_demo/\
    demo/my_ir/squeezenet1.1.xml -d "HETERO:FPGA,CPU"

    The demo then prints the top predicted classes with their probabilities.



    5 Comments

    Aleksandr Amerikanov
    I think you have too high hopes for Intel OpenVINO platform. Have you done preliminary experiments? How large is the marked dataset of mushroom images you prepared? What kind of neural network (number of layers of neurons, type of convolution, etc.) do you plan to use?
    🕒 Jul 03, 2019 05:12 PM
    EM018🗸
    Yes, we have done preliminary experiments, but only on a PC. The labelled dataset contains around 300k labelled images. The first neural network to be used will be MobileNetV2; however, more experiments with quantized versions of larger networks are expected.

    If you have more questions, please do not hesitate to ask.

    Thank you for your interest
    🕒 Jul 06, 2019 02:46 PM
    Aleksandr Amerikanov
    Thanks for the answer. I will be watching your project.

    By the way, MobileNet is also used in this project: http://www.innovatefpga.com/cgi-bin/innovate/teams.pl?Id=EM031. And, as far as I understand, they already have an implementation for the DE10-Nano.

    In such FPGA projects, where to store the coefficients of a trained neural network is always a problem. How are you going to solve it?
    🕒 Jul 06, 2019 03:20 PM
    EM018🗸
    First of all, thank you for your interest, and apologies for the late reply.

    Our intention, as a first step, is to use the OpenVINO Starter Kit, where we have enough space (1 GB DDR3 and 64 MB SDRAM) for a model such as MobileNet.

    Once we achieve this, we will gradually reduce the model requirements and port it to more restricted hardware.

    If you have more questions, please do not hesitate to ask.

    Thank you for your interest
    🕒 Jul 15, 2019 02:17 PM
    Bing Xia
    Hi team, please upload your project design as soon as possible; the deadline is approaching.
    🕒 Jun 28, 2019 08:14 AM
