DevDays Moscow 2022 

Tomas Neubauer

Position: Co-Founder & CTO

Company: Quix

Country: Czech Republic

Biography

Tomas Neubauer is co-founder and CTO at Quix, responsible for the technical direction of the company across the full technical stack, and working as a technical authority for the engineering team. He was previously technical lead at McLaren, where he led architecture uplift for Formula One racing real-time telemetry acquisition. He later led platform development outside motorsport, reusing the know-how he gained from racing.

Javier Blanco Cordero
Position: Senior Data Scientist

Company: Quix

Country: Spain

Biography

Javier Blanco Cordero is a Senior Data Scientist at Quix, where he helps customers get the most out of their data science projects. He was previously a Senior Data Scientist at Orange, developing churn prediction, marketing mix modelling, propensity-to-purchase models and more. Javier is a master's lecturer and speaker, specializing in pragmatic data science and causality.

Workshop

Sentiment Analysis: Building a Real-Time ML Application From Scratch

In this workshop we will build a Python service together. It reads the audience messages sent to a chat, admits them if they are not abusive, rejects abusive ones and sends the author a warning SMS, and then performs sentiment analysis on the chat messages, all in real time.

You will use two ML models: one for sentiment analysis and one for abusive language detection. We will see how to train both and how to use state-of-the-art model artifacts that are already trained. Just as crucial as getting the maths right is the MLOps that copes with this real-time use case: we will build a streaming infrastructure that reads from the chat topic, connects to the Twilio API and runs large deep learning models, all in real time.
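
The message flow described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the workshop's code: the word lists stand in for the two trained models, `ChatModerator` and `send_sms` are hypothetical names, and the SMS call is a stub where a real service would use an SMS provider such as Twilio.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical stand-ins for the two trained models; a real service would
# call the abusive-language and sentiment models here instead.
ABUSIVE_WORDS = {"idiot", "stupid"}
POSITIVE_WORDS = {"great", "love", "cool"}
NEGATIVE_WORDS = {"boring", "bad", "hate"}


def is_abusive(text: str) -> bool:
    """Placeholder abusive-language detector based on a tiny word list."""
    return any(word in ABUSIVE_WORDS for word in text.lower().split())


def sentiment_score(text: str) -> int:
    """Placeholder sentiment model: positive minus negative word count."""
    words = text.lower().split()
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    return positives - negatives


@dataclass
class ChatModerator:
    # send_sms is injected so that a real SMS client can be swapped in later.
    send_sms: Callable[[str, str], None]
    admitted: List[Tuple[str, str, int]] = field(default_factory=list)

    def handle(self, user: str, phone: str, text: str) -> bool:
        """Process one incoming message; return True if it was admitted."""
        if is_abusive(text):
            # Rejected: warn the author and keep the message out of the chat.
            self.send_sms(phone, f"{user}, your message was rejected as abusive.")
            return False
        # Admitted: store the message together with its sentiment score.
        self.admitted.append((user, text, sentiment_score(text)))
        return True


# Usage: collect "sent" SMS messages in a list instead of calling a provider.
sent_sms: List[Tuple[str, str]] = []
moderator = ChatModerator(send_sms=lambda phone, body: sent_sms.append((phone, body)))
for user, phone, text in [
    ("ana", "+10000000001", "I love this workshop"),
    ("bob", "+10000000002", "you are an idiot"),
]:
    moderator.handle(user, phone, text)
```

Passing the SMS sender in as a callable keeps the routing logic testable without network access; in a streaming deployment, `handle` would be invoked once per message arriving on the chat topic.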

For data scientists, data engineers and ML engineers, this is an opportunity to learn about streaming, ML deployment, real-time data processing, and libraries that offer state-of-the-art deep learning models. Do not be put off if some of these areas sound far outside your skill set: we have teaching experience, and two of us will be there to help with any questions or blockers. We assume no previous knowledge and will still go quite deep. All you need is Python programming skills, as Python is the language used; some data science experience is also recommended.

Join us for a fun and useful workshop!

Contents

Five one-hour sections with breaks in between:

  • Part 1: Introduction (1 hour):
    • Intro and detail of the problem we will solve
    • Streaming vs. batch: differences and advantages
    • About Quix: streaming data platform founded by McLaren F1 engineers
  • Part 2: Basic Streaming Infrastructure (1 hour):
    • Introduction to the problem we will solve
    • Designing the streaming architecture
    • Basic streaming infrastructure without Quix
    • Basic streaming infrastructure with Quix
    • Create topic to stream audience messages
  • Part 3: Basic NLP & sentiment use case (1 hour):
    • Understanding the data science problem
    • Finding useful public datasets
    • Basic NLP preprocessing:
      • Text tokenization
      • Padding
    • Neural network fundamentals:
      • Theoretical concepts
      • TensorFlow/Keras
    • Train first sentiment analysis model
    • Deploy model with Quix
  • Part 4: NLP & abusive language use case (1 hour):
    • Understanding the data science problem
    • Finding useful public datasets
    • Train first abusive language model
    • Check Hugging Face models and other available pre-trained models
    • Deploy model with Quix
  • Part 5: Deploying ML models with Quix and integrating SMS service (1 hour):
    • Create a Twilio account
    • Set up an alarm for abusive messages within Twilio
    • Deploy the complete infrastructure
    • Check how everything works in real time!
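
The text tokenization and padding steps listed under Part 3 can be illustrated with a small pure-Python sketch. It is a simplified stand-in for what a deep learning library would normally provide, and it assumes whitespace tokenization, ids 0 and 1 reserved for padding and unknown words, and padding at the end of each sequence; the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List

PAD_ID = 0  # id reserved for padding positions
OOV_ID = 1  # id reserved for out-of-vocabulary words


def build_vocab(texts: List[str], max_words: int) -> Dict[str, int]:
    """Map the most frequent words to integer ids, starting at 2."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Whitespace-tokenize a text and replace each word with its id."""
    return [vocab.get(word, OOV_ID) for word in text.lower().split()]


def pad(ids: List[int], length: int) -> List[int]:
    """Truncate or right-pad a sequence so it has exactly `length` ids."""
    ids = ids[:length]
    return ids + [PAD_ID] * (length - len(ids))


# Usage: build the vocabulary on training texts, then encode new messages
# into fixed-length id sequences that a neural network can consume.
train_texts = ["great talk", "great demo", "boring talk"]
vocab = build_vocab(train_texts, max_words=100)
batch = [pad(encode(text, vocab), length=4) for text in ["great talk today", "boring"]]
```

Fixed-length sequences matter because a neural network processes messages in batches of equal shape; the padding id lets short messages fill the remaining positions.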
Goals
  • Understanding streaming (as opposed to batch)
  • Learning about the Quix platform and SDK
  • Learning about streaming ML deployment
  • Learning about real-time processing
  • Learning about Hugging Face and other freely available state-of-the-art deep learning models
  • Learning NLP and neural network fundamentals
Target audience

Data Scientists, ML Engineers, Data Engineers, Data Architects, Developers who are comfortable with Python

Technical requirements
  • Installations:
    • We will use Colab and Quix, both web-based, so no local installation is needed.

  • Technical knowledge:
    • Basic programming in Python
    • Very basic data science and ML knowledge
