About the conference
Around 350 participants from all over Poland (and from abroad) gathered in Warsaw for our conference. They listened to excellent talks on different topics in Machine Learning given by 15 experts, among them Krzysztof Geras from New York University, Krzysztof Choromański from Google Brain and Columbia University, and Szymon Sidor from OpenAI. Attendees praised the great atmosphere during the event.
You can see the photos from the conference here and the videos from the lectures here.
Speakers
Krzysztof Choromański
Google Brain Robotics & Columbia University
Krzysztof Choromański has worked at Google since 2013. Currently he is a member of the Google Brain Robotics Team in New York. His area of research is robotics, in particular reinforcement learning, applications of neural networks in optimal control, and compact machine learning models based on structured random feature maps. As an adjunct assistant professor at Columbia University, he also gives lectures on machine learning and data mining. Before joining Google, he obtained his Ph.D. at Columbia University, working on structural graph theory.
Karol Kurach
Google Brain (Zurich)
Karol Kurach is a researcher on the Google Brain team in Zurich, currently focusing on learning multi-modal data representations and Generative Adversarial Networks. He is a co-author of intelligence systems inside Gmail, such as Smart Reply and Categories (a.k.a. Tabs).
During his PhD at the University of Warsaw, he worked on deep neural architectures with external memory, in collaboration with researchers from Google Brain and OpenAI. Before that, he represented Poland in team programming competitions and won a silver medal in the ACM ICPC World Finals.
Łukasz Bolikowski
BCG Gamma
Lead Data Scientist at BCG Gamma, where he builds advanced mathematical models working on big data sets for the largest global clients. Łukasz is also a scientific advisor and technical code reviewer for Gamma cases. Before joining BCG, he was the founder and leader of the Applied Data Analysis Lab at the Interdisciplinary Centre for Mathematical and Computational Modelling at the University of Warsaw. PhD in Computer Science from the Systems Research Institute, Polish Academy of Sciences; MSc in Computer Science from MIM, University of Warsaw. Independent expert of the OECD and the European Commission.
Stanisław Jastrzębski
Jagiellonian University
Stanisław is a 3rd year PhD student co-advised by Prof. Jacek Tabor (Jagiellonian University) and Prof. Amos Storkey (University of Edinburgh). His research interests include deep learning theory and deep representation learning. He has worked on applications of deep learning to drug design, fraud detection, and protein folding. Most recently, he interned with Prof. Yoshua Bengio, researching optimization in deep learning.
Website: kudkudak.github.io
Zbigniew Wojna
TensorFlight
Zbigniew Wojna is a deep learning researcher and the founder of TensorFlight Inc., a company focused on extracting actionable insight from aerial and satellite imagery through his research on convolutional networks. He is currently in the final stage of his Ph.D. at University College London under the supervision of Professor Iasonas Kokkinos and Professor John Shawe-Taylor.
His primary interest lies in finding research problems around machine vision products for real-world applications, usually at large scale. During the three years of his Ph.D., Zbigniew spent most of his time working in different industrial research groups. These include the DeepMind Health Team, the Deep Learning Team for Google Maps in collaboration with Google Brain, Machine Perception with Kevin Murphy, the Weak Localization Team with Vittorio Ferrari and, currently, Facebook AI Research in Paris. His company TensorFlight Inc. was featured as one of the top 2 AI startups among a few hundred by Capgemini InnovatorsRace50 and recently secured funding for further development.
Krzysztof Geras
New York University
Krzysztof is a postdoctoral researcher at New York University. He graduated from the University of Warsaw and received his PhD from the University of Edinburgh. He works on methods of diagnosing breast cancer from medical images using convolutional networks. His interests in the field of deep learning include unsupervised learning, model compression and transfer learning.
Jan Chorowski
University of Wrocław
Jan Chorowski is an Associate Professor at the Faculty of Mathematics and Computer Science at the University of Wrocław. He received his M.Sc. degree in electrical engineering from the Wrocław University of Technology, Poland, and his PhD in electrical engineering from the University of Louisville, Kentucky, in 2012. He has visited several research teams, including Google Brain in Mountain View, Microsoft Research in Redmond and Yoshua Bengio’s lab at the University of Montreal. His research interests are applications of neural networks to problems which are intuitive and easy for humans but difficult for machines, such as speech and natural language processing.
Błażej Osiński
Błażej Osiński is a data scientist at deepsense.ai. His professional experience includes working at Google, Microsoft and Facebook. He was also the first software engineer at the Berlin-based startup Segment of 1. Błażej holds a Master's degree in Computer Science and a Bachelor's degree in Mathematics, both from the University of Warsaw.
Piotr Miłoś
University of Warsaw
Piotr Miłoś is an Assistant Professor at the Faculty of Mathematics, Mechanics and Informatics at the University of Warsaw. His research interests lie mainly in probability theory, stochastic processes, limit theorems and models of random surfaces. Some time ago he started active research in machine learning, in particular in reinforcement learning.
Conference Agenda
Inauguration
Krzysztof Geras
New York University
Advances in breast cancer screening with deep neural networks
Recent advances in deep learning for natural images have prompted a surge of interest in applying similar techniques to medical images. Most of the initial attempts focused on replacing the input of a deep convolutional neural network with a medical image, which does not take into consideration the fundamental differences between these two types of images. Specifically, fine details are necessary for detection in medical images, unlike in natural images where coarse structures matter. This difference makes it inadequate to use the existing network architectures developed for natural images, because they work on a heavily downsampled image to reduce the memory requirements. This hides details necessary to make accurate predictions. Additionally, a single exam in medical imaging often comes with a set of views which must be fused in order to reach a correct conclusion. In our work, we propose to use a multi-view deep convolutional neural network that handles a set of high-resolution medical images. We evaluate it on large-scale mammography-based breast cancer screening (BI-RADS prediction) using 886 thousand images. We focus on investigating the impact of training set size and image size on the prediction accuracy. Our results highlight that performance increases with the size of the training set, and that the best performance can only be achieved using the original resolution. This suggests that medical imaging research using deep learning must utilize as much data as possible with the least amount of potentially harmful preprocessing.
Coffee break
Łukasz Bolikowski
BCG Gamma
How to Optimize Market Strategy using Game Theory and How to Recognize Vehicle Types using Deep Learning
BCG Gamma is an Advanced Analytics division of The Boston Consulting Group, consisting of researchers specializing in Mathematical Modelling, Machine Learning, Operations Research, Big Data Analytics and Software Engineering. In this presentation we will walk you through two of our recent projects: one is an optimization problem solved using a game-theoretic approach, the other is a classification problem solved using deep learning. In the first case, we optimized our client's market strategy, taking into account how other players would react to our client's moves. In the second case, we built a system that identifies vehicle types using CCTV feeds from a highway operator in order to charge the right toll and to flag anomalous behavior on the roads. In both cases, we were able to generate multi-million savings per year for our clients.
Krzysztof Choromański
Google Brain Robotics & Columbia University
Charming kernels, colorful Jacobians and Hadamard-minitaurs
Deep mathematical ideas are what drive innovation in machine learning, even though they are often underestimated in the era of massive computations. In this talk we show a few mathematical ideas that can be applied to many important machine learning problems that seem unrelated at first glance. We will talk about speeding up algorithms that approximate certain similarity measures used on a regular basis in machine learning via random walks in the space of orthogonal matrices. We show how these can also be used to improve the accuracy of several machine learning models, among them some recent RNN-based architectures that already beat state-of-the-art LSTMs. We explain how to "backpropagate through robots" with compressed sensing, Hadamard matrices and strongly-polynomial LP-programming. We will teach robots how to walk and show you that what you were taught in school might actually be wrong: there exist free-lunch theorems, and after this lecture you will apply them in practice.
Coffee break
Maciej Dziubiński
Nethone
User identification based on keystroke dynamics (workshop)
Behavioral data can be easily accessed and used for augmenting user verification. In this workshop we will introduce and focus on a particular type of behavioral data, and see how we can use machine learning techniques to verify users using such data. We will go through a standard pipeline of feature extraction, building a model for user verification, and model validation. Participants will be encouraged to play with this pipeline and to come up with better features and models while keeping the validation strategy fixed.
The purpose of this workshop is to familiarize the audience with several machine learning libraries, with the process of creating a standard pipeline, and with one of Nethone’s upcoming research projects. We will be using Python, and libraries like: scikit-learn, xgboost, and keras.
The workshop will be hands-on, and will require active participants to have an environment prepared beforehand (either a virtualenv, or a docker image). Instructions for preparing the environment will be available at https://github.com/nethone/keystroke-workshop-env a day before the workshop.
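As a rough illustration of the feature-extraction step in such a pipeline, here is a minimal sketch. It is not taken from the workshop materials: the event format (key, press time, release time) and the function name `keystroke_features` are assumptions made for this example. It computes two features commonly used in keystroke dynamics, dwell time (how long a key is held) and flight time (the gap between releasing one key and pressing the next).

```python
from statistics import mean


def keystroke_features(events):
    """Summarize typing rhythm from a list of keystroke events.

    `events` is a hypothetical format: (key, press_time, release_time)
    tuples, sorted by press_time, with times in seconds.
    """
    if len(events) < 2:
        raise ValueError("need at least two keystrokes")
    # Dwell time: how long each key was held down.
    dwell = [release - press for _, press, release in events]
    # Flight time: gap between releasing one key and pressing the next.
    flight = [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]
    return {"mean_dwell": mean(dwell), "mean_flight": mean(flight)}


sample = [("a", 0.00, 0.10), ("b", 0.25, 0.33), ("c", 0.50, 0.62)]
features = keystroke_features(sample)
print(features)
```

Feature vectors of this kind could then be passed to a classifier from one of the libraries mentioned above and scored with a fixed validation strategy.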
Stanisław Jastrzębski
Jagiellonian University
Understanding how Deep Networks learn
The reason why deep neural networks can be trained using simple gradient descent continues to defy our understanding. In this talk we will cover the basics as well as some of the most recent research on optimization of deep networks. We will especially focus on the connection of optimization to topics in physics and information theory. I will finish by discussing the most important open problems, as well as some speculation about how training deep networks might look in the near future.
Coffee break
Lunch break
Piotr Miłoś
University of Warsaw
Hierarchical Reinforcement Learning
Reinforcement learning is one of the most important parts of machine learning. It is inspired by behavioural biology and economics, which try to solve maximisation problems. While the field has witnessed spectacular successes (Atari games, AlphaGo, Dota), it still suffers from many problems, e.g. very poor sample efficiency. In my talk I will present the basics of RL and the results of our paper "Hierarchical Reinforcement Learning with Parameters" (CoRL 2017). We experimented in a robotic setup with a manager able to compose relatively simple skills (like moving to a point and grasping) to solve more complicated tasks.
Karol Kurach
Google Brain (Zurich)
Generative Adversarial Networks
Generative adversarial networks (GAN) are a powerful subclass of generative models, mostly known for being able to generate samples of photo-realistic images. In the first part of this talk I will present the main idea behind GAN and give an overview of several popular models.
In the second part I will discuss the problem of evaluating GANs and present a recent large scale study comparing as fairly as possible some of the most popular GAN algorithms (and VAE) on several datasets.
Lunch break
Coffee break
Julien Simon
Amazon
Deep Learning for Developers
In recent months, Deep Learning has become the hottest topic in the IT industry. However, its arcane jargon and its intimidating equations often discourage software developers, who wrongly think that they’re “not smart enough”. We’ll start with an explanation of how Deep Learning works. Then, through code-level demos based on Apache MXNet and TensorFlow, we’ll demonstrate how to build, train and use models based on different network architectures (MLP, CNN, LSTM, GAN). Finally, you will learn about Amazon SageMaker, a new service that lets you train and deploy models into a production-ready hosted environment.
Jan Chorowski
University of Wrocław
Deep neural networks for speech and natural language processing
Deep neural networks yield state-of-the-art performance in speech recognition and allow for end-to-end training in which all of a model's components collaborate to solve the task at hand. I will present end-to-end trainable attention-based recurrent neural networks that directly transcribe speech features into sequences of phonemes or characters. The networks learn the alignment between the speech and its transcription and are trained directly to optimize the probability of the correct transcription. I will show the advantages and challenges, such as language model integration, related to the successful application of this family of neural networks. I will conclude the talk with a review of other applications of attention-based recurrent networks in NLP, such as parsing, and of other uses of neural networks in speech processing, such as voice conversion and style transfer.
Coffee break
Rafał Pilarczyk
Samsung
Is Artificial Intelligence a threat to musicians? – Music generation techniques
Conference party
Conference party for all participants of the conference and invited guests
Zbigniew Wojna
TensorFlight
Architectures for big scale machine vision applications (remote lecture)
I will present research that I worked on during my Ph.D. My primary interest lies in the basics of architectures for large-scale applications. I will explain the idea behind Inception and what we had to change in Inception-v3 to make it the best single model on ImageNet 2015. I will present our winning submission to the MS COCO detection challenge and how we adapted the feature extractors for that use case. A few months ago, we announced our work on automatically updating Google Maps based on Google Street View imagery, where we used Inception-v3 for text transcription. I will also cover our latest work on dense prediction problems, i.e. instance segmentation through pix2vec pixel embeddings and the search for an optimal decoder architecture in dense prediction problems.
Błażej Osiński
Deep learning - basics and beyond (workshop)
Deep learning has succeeded at such difficult tasks as driving cars or winning a game of Go and Dota 2. It all sounds spectacular... but how do you create a state-of-the-art neural network for an even simpler task: image classification? In this workshop we will try to make every step approachable, from setting up the environment through building first models to tinkering with experiments.
During the hands-on session, you will experiment with an artificial neural network for image classification and learn practical hacks for tuning the network to your needs, using techniques such as transfer learning and data augmentation. By the end of the workshop you will be able to create and optimize a deep learning project from scratch.
In this workshop we’ll be using PyTorch, a deep learning framework in Python. You will have a chance to understand why it is a tool of choice for machine learning researchers and data scientists, and why Andrej Karpathy’s skin has improved since he started using it!
Coffee break
Tomasz Trzciński
Warsaw University of Technology
Siamese architecture
In this talk, I will present an overview of a Siamese neural network architecture. This architecture, as well as its extension to a triplet network, is eagerly used in several machine learning applications that require distance learning, such as image retrieval and face recognition. Although the fundamental idea behind these architectures is fairly straightforward, its successful application often requires some tricks. To discuss them, I will use a few case study examples including a feature descriptor learning method for simultaneous localisation and mapping (SLAM) developed as part of a Google Tango collaboration.
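To make the idea of distance learning concrete, here is a minimal sketch of the two loss functions typically associated with Siamese and triplet networks. It is an illustration only, not the speaker's method: the embeddings are assumed to be plain lists of floats already produced by a shared network, the function names and the default margin are hypothetical, and the contrastive loss is written in one common textbook form.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(e1, e2, same, margin=1.0):
    """Siamese-style loss: pull matching pairs together and push
    non-matching pairs at least `margin` apart (one common form)."""
    d = euclidean(e1, e2)
    return d ** 2 if same else max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss: the anchor should be closer to the positive than
    to the negative by at least `margin`; otherwise a penalty is paid."""
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)


# A well-separated triplet incurs no loss; a violated one is penalized.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
print(easy, hard)
```

In a real system these losses would be computed on batches of network outputs and minimized with gradient descent; choosing which pairs or triplets to train on is one place where the tricks mentioned in the abstract tend to matter.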
Szymon Sidor
OpenAI
Topics in Reinforcement Learning
I will talk about reinforcement learning with neural networks. As an introduction, I will discuss a spectrum of RL algorithms focusing on actor-critic methods. Then I will attempt to give an overview of questions currently pursued by the community, e.g. exploration, transfer, scaling up etc. Finally I will present two applications of multi-agent RL pursued at OpenAI - competitive robotics and Dota 2.
Lunch break
Discussion panel
"What is possible?"
Details to be announced
Paweł Gora
University of Warsaw
Applications of machine learning in traffic optimization
I will be talking about possible applications of machine learning in traffic optimization (and in optimizing some other complex processes). I will describe the process of building traffic metamodels by approximating outcomes of traffic simulations using machine learning algorithms (e.g., deep neural networks) and explain how they may be used in real-time traffic signal control and transport planning (e.g., to find optimal locations and capacities of parking facilities and charging stations for electric vehicles). I will also talk about possible applications of machine learning in the area of connected and autonomous vehicles, which are expected to revolutionize transportation in the near future.
Closing remarks
Registration
Registration is closed!
Organizers
Machine Learning Student Research Group at the University of Warsaw
The Machine Learning Student Research Group at the University of Warsaw was established in early 2016. We have been meeting every week at the MIM UW faculty since then.
The participants are not only students: the meetings are open, and they also attract machine learning enthusiasts, entrepreneurs and professionals. During the meetings, we hold lectures and discussions about modern methods of Machine Learning, with particular emphasis on Deep Learning. We want not only to widen our knowledge, but also to meet other people interested in this relatively young and fascinating field.