Thank you to the incredible PyTorch Community for making the first ever PyTorch Ecosystem Day a success! Ecosystem Day was hosted on Gather.Town utilizing an auditorium, exhibition hall, and breakout rooms for partners to reserve for talks, demos, or tutorials. In order to cater to the global community, the event held two sessions: a morning session from 8am PT - 1pm PT and an evening session from 3:00pm -7:00pm PT. The day was filled with discussions on new developments, trends and challenges showcased through 71 posters, 32 breakout sessions and 6 keynote speakers.

Special thanks to our 6 keynote speakers: Piotr Bialecki, Ritchie Ng, Miquel Farré, Joe Spisak, Geeta Chauhan, and Suraj Subramanian. You can find the opening talks here:


Posters

Bring quantum machine learning to PyTorch with PennyLane
Josh Izaac, Thomas Bromley

PennyLane allows you to train quantum circuits just like neural networks!, This poster showcases how PennyLane can be interfaced with PyTorch to enable training of quantum and hybrid machine learning models. The outputs of a quantum circuit are provided as a Torch tensor with a defined gradient. We highlight how this functionality can be used to explore new paradigms in machine learning, including the use of hybrid models for transfer learning.

http://www.pennylane.ai

Platform, Ops & Tools

PyTorch development in VS Code
Jeffrey Mew

Visual Studio Code, a free cross-platform lightweight code editor, has become the most popular among Python developers for both web and machine learning projects. We will be walking you through an end to end PyTorch project to showcase what VS Code has a lot to offer to PyTorch developers to boost their productivity. Firstly, get your PyTorch project quickly up and running with VS Code's environment/dependency management and built-in Jupyter Notebook support. Secondly, breeze through coding with help from our AI-powered IntelliSense. When it's time to run your code, use the built-in Tensorboard integration to monitor your training along with the integrated PyTorch profiler to analyze and debug your code. Once you're ready for the cloud, VS Code has Azure service integration to allow you to scale your model training and deployment, along with deployment. Combing the power of the code editor with easy access to the Azure services, VS Code can be the one-stop shop for any developers looking to build machine learning models with PyTorch.

https://pytorch.org/blog/introducing-pytorch-profiler-the-new-and-improved-performance-tool/

Compiler & Transform & Production

Upcoming features in TorchScript
Yanan Cao, Harry Kim, Jason Ansel

TorchScript is the bridge between PyTorch's flexible eager mode to more deterministic and performant graph mode suitable for production deployment. As part of PyTorch 1.9 release, TorchScript will launch a few features that we'd like to share with you earlier, including a) a new formal language specification that defines the exact subset of Python/PyTorch features supported in TorchScript; b) Profile-Directed Typing that reduces the burden of converting a loosely-typed eager model into a strictly-typed TorchScript model; c) A TorchScript profiler that can shed light on performance characteristics of TorchScript model. We are constantly making improvements to make TorchScript easier to use and more performant.

http://fb.me/torchscript

Compiler & Transform & Production

Quantization-Aware Training with Brevitas
Alessandro Pappalardo

Brevitas is an open-source PyTorch library for quantization-aware training. Thanks to its flexible design at multiple levels of abstraction, Brevitas generalizes the typical uniform affine quantization paradigm adopted in the deep learning community under a common set of unified APIs. Brevitas provides a platform to both ML practitioners and researchers to either apply built-in state-of-the-art techniques in training for reduced-precision inference, or to implement novel quantization-aware training algorithms. Users can target supported inference toolchains, such as onnxruntime, TVM, Vitis AI, FINN or PyTorch itself, or experiment with hypothetical target hardware platforms. In particular, when combined with the flexibility of Xilinx FPGAs through the FINN toolchain, Brevitas supports the co-design of novel hardware building blocks in a machine-learning driven fashion. Within Xilinx, Brevitas has been adopted by various research projects concerning quantized neural networks, as well as in large scale deployments targeting custom programmable logic accelerators.

https://github.com/Xilinx/brevitas/

Compiler & Transform & Production

PyTorch Quantization: FX Graph Mode Quantization
Jerry Zhang, Vasiliy Kuznetsov, Raghuraman Krishnamoorthi

Quantization is a common model optimization technique to speedup runtime of a model by upto 4x, with a possible slight loss of accuracy. Currently, PyTorch support Eager Mode Quantization. FX Graph Mode Quantization improves upon Eager Mode Quantization by adding support for functionals and automating the quantization process. To use FX Graph Mode Quantization, one might need to refactor the model to make the model compatible with FX Graph Mode Quantization (symbolically traceable with torch.fx).

https://pytorch.org/docs/master/quantization.html#prototype-fx-graph-mode-quantization

Compiler & Transform & Production

Accelerate deployment of deep learning models in production with Amazon EC2 Inf1 and TorchServe containers
Fabio Nonato

Deep learning models can have game-changing impact on machine learning applications. However, deploying and managing deep learning models in production is complex and requires considerable engineering effort - from building custom inferencing APIs and scaling prediction services, to securing applications, while still leveraging the latest ML frameworks and hardware technology. Amazon EC2 Inf1 instances powered by AWS Inferentia deliver the highest performance and lowest cost machine learning inference in the cloud. Developers can deploy their deep-learning models to Inf1 instances using the AWS Neuron SDK that is natively integrated with PyTorch. Attend this poster session to learn how you can optimize and accelerate the deployment of your deep learning models in production using Inf1 instances and TorchServe containers. You will learn how to deploy TorchScript models on Inf1 and optimize your models with minimal code changes with features such as NeuronCore Groups and NeuronCore Pipeline, to meet your throughput and latency requirements. You can directly integrate these model level optimizations into the inference endpoint using TorchServe. We will also deep dive into how we optimized performance of a natural language processing endpoint and showcase the workflow for deploying the optimized model using TorchServe containers on Amazon ECS.

https://bit.ly/3mQVowk

Compiler & Transform & Production

Torch.fx
James Reed, Zachary DeVito, Ansley Ussery, Horace He, Michael Suo

FX is a toolkit for writing Python-to-Python transforms over PyTorch code. FX consists of three parts: > Symbolic Tracing – a method to extract a representation of the program by running it with "proxy" values. > Graph-based Transformations – FX provides an easy-to-use Python-based Graph API for manipulating the code. > Python code generation – FX generates valid Python code from graphs and turns that code into executable Python `nn.Module` instances.

https://pytorch.org/docs/stable/fx.html

Compiler & Transform & Production

AI Model Efficiency Toolkit (AIMET)
Abhijit Khobare, Murali Akula, Tijmen Blankevoort, Harshita Mangal, Frank Mayer, Sangeetha Marshathalli Siddegowda, Chirag Patel, Vinay Garg, Markus Nagel

AI is revolutionizing industries, products, and core capabilities by delivering dramatically enhanced experiences. However, the deep neural networks of today use too much memory, compute, and energy. To make AI truly ubiquitous, it needs to run on the end device within a tight power and thermal budget. Quantization and compression help address these issues. In this tutorial, we'll discuss: The existing quantization and compression challenges Our research in novel quantization and compression techniques to overcome these challenges How developers and researchers can implement these techniques through the AI Model Efficiency Toolkit

Compiler & Transform & Production

Pytorch via SQL commands: A flexible, modular AutoML framework that democratizes ML for database users
Natasha Seelam, Patricio Cerda-Mardini, Cosmo Jenytin, Jorge Torres

Pytorch enables building models with complex inputs and outputs, including time-series data, text and audiovisual data. However, such models require expertise and time to build, often spent on tedious tasks like cleaning the data or transforming it into a format that is expected by the models. Thus, pre-trained models are often used as-is when a researcher wants to experiment only with a specific facet of a problem. See, as examples, FastAI's work into optimizers, schedulers, and gradual training through pre-trained residual models, or NLP projects with Hugging Face models as their backbone. We think that, for many of these problems, we can automatically generate a "good enough" model and data-processing pipeline from just the raw data and the endpoint. To address this situation, we are developing MindsDB, an open-source, PyTorch-based ML platform that works inside databases via SQL commands. It is built with a modular approach, and in this talk we are going to focus on Lightwood, the stand-alone core component that performs machine learning automation on top of the PyTorch framework. Lightwood automates model building into 5 stages: (1) classifying each feature into a "data type", (2) running statistical analyses on each column of a dataset, (3) fitting multiple models to normalize, tokenize, and generate embeddings for each feature, (4) deploying the embeddings to fit a final estimator, and (5) running an analysis on the final ensemble to evaluate it and generate a confidence model. It can generate quick "baseline" models to benchmark performance for any custom encoder representation of a data type and can also serve as scaffolding for investigating new hypotheses (architectures, optimizers, loss-functions, hyperparameters, etc). We aim to present our benchmarks covering wide swaths of problem types and illustrate how Lightwood can be useful for researchers and engineers through a hands-on demo.

https://mindsdb.com

Database & AI Accelerators

PyTorch on Supercomputers Simulations and AI at Scale with SmartSim
Sam Partee , Alessandro Rigazzi, Mathew Ellis, Benjamin Rob

SmartSim is an open source library dedicated to enabling online analysis and Machine Learning (ML) for traditional High Performance Computing (HPC) simulations. Clients are provided in common HPC simulation languages, C/C++/Fortran, that enable simulations to perform inference requests in parallel on large HPC systems. SmartSim utilizes the Redis ecosystem to host and serve PyTorch models alongside simulations. We present a use case of SmartSim where a global ocean simulation, used in climate modeling, is augmented with a PyTorch model to resolve quantities of eddy kinetic energy within the simulation.

https://github.com/CrayLabs/SmartSim

Database & AI Accelerators

Model agnostic confidence estimation with conformal predictors for AutoML
Patricio Cerda-Mardini, Natasha Seelam

Many domains leverage the extraordinary predictive performance of machine learning algorithms. However, there is an increasing need for transparency of these models in order to justify deploying them in applied settings. Developing trustworthy models is a great challenge, as they are usually optimized for accuracy, relegating the fit between the true and predicted distributions to the background [1]. This concept of obtaining predicted probability estimates that match the true likelihood is also known as calibration. Contemporary ML models generally exhibit poor calibration. There are several methods that aim at producing calibrated ML models [2, 3]. Inductive conformal prediction (ICP) is a simple yet powerful framework to achieve this, offering strong guarantees about the error rates of any machine learning model [4]. ICP provides confidence scores and turns any point prediction into a prediction region through nonconformity measures, which indicate the degree of inherent strangeness a data point presents when compared to a calibration data split. In this work, we discuss the integration of ICP with MindsDB --an open source AutoML framework-- successfully replacing its existing quantile loss approach for confidence estimation capabilities. Our contribution is threefold. First, we present a study on the effect of a "self-aware" neural network normalizer in the width of predicted region sizes (also known as efficiency) when compared to an unnormalized baseline. Our benchmarks consider results for over 30 datasets of varied domains with both categorical and numerical targets. Second, we propose an algorithm to dynamically determine the confidence level based on a target size for the predicted region, effectively prioritizing efficiency over a minimum error rate. Finally, we showcase the results of a nonconformity measure specifically tailored for small datasets. References: [1] Guo, C., Pleiss, G., Sun, Y., & Weinberger, K.Q. (2017). On Calibration of Modern Neural Networks. ArXiv, abs/1706.04599. [2] Naeini, M., Cooper, G., & Hauskrecht, M. (2015). Obtaining Well Calibrated Probabilities Using Bayesian Binning. Proceedings of the AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence, 2015, 2901-2907 . [3] Maddox, W., Garipov, T., Izmailov, P., Vetrov, D., & Wilson, A. (2019). A Simple Baseline for Bayesian Uncertainty in Deep Learning. NeurIPS. [4] Papadopoulos, H., Vovk, V., & Gammerman, A. (2007). Conformal Prediction with Neural Networks. 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007), 2, 388-395.

https://mindsdb.com

Database & AI Accelerators

Enabling PyTorch on AMD Instinct™ GPUs with the AMD ROCm™ Open Software Platform
Derek Bouius

AMD Instinct GPUs are enabled with the upstream PyTorch repository via the ROCm open software platform. Now users can also easily download the installable Python package, built from the upstream PyTorch repository and hosted on pytorch.org. Notably, it includes support for distributed training across multiple GPUs and supports accelerated mixed precision training. AMD also provides hardware support for the PyTorch community build to help develop and maintain new features. This poster will highlight some of the work that has gone into enabling PyTorch support.

www.amd.com/rocm

Database & AI Accelerators

DeepSpeed: Shattering barriers of deep learning speed & scale
DeepSpeed Team Microsoft Corporation

In the poster (and a talk during the breakout session), we will present three aspects of DeepSpeed (https://github.com/microsoft/DeepSpeed), a deep learning optimization library based on PyTorch framework: 1) How we overcome the GPU memory barrier by ZeRO-powered data parallelism. 2) How we overcome the network bandwidth barrier by 1-bit Adam and 1-bit Lamb compressed optimization algorithms. 3) How we overcome the usability barrier by integration with Azure ML, HuggingFace, and PyTorch Lightning.

Distributed Training

Dask PyTorch DDP: A new library bringing Dask parallelization to PyTorch training
Stephanie Kirmer, Hugo Shi

We have developed a library that helps simplify the task of multi-machine parallel training for PyTorch models, bringing together the power of PyTorch DDP with Dask for parallelism on GPUs. Our poster describes the library and its core function, and demonstrates how the multi-machine training process works in practice.

https://github.com/saturncloud/dask-pytorch-ddp

Distributed Training

Optimising Physics Informed Neural Networks.
Vignesh Gopakumar

Solving PDEs using Neural Networks are often ardently laborious as it requires training towards a well-defined solution, i.e. global minima for a network architecture - objective function combination. For a family of complex PDEs, Physics Informed neural networks won't offer much in comparison to traditional numerical methods as their global minima becomes more and more intractable. We propose a modified approach that hinges on continual and parametrised learning that can create more general PINNs that can solve for a variety of PDE scenarios rather than solving for a well-defined case. We believe that this brings Neural Network based PDE solvers in comparison to numerical solvers.

Distributed Training

FairScale-A general purpose modular PyTorch library for high performance and large scale training
Mandeep Baines, Shruti Bhosale, Vittorio Caggiano, Benjamin Lefaudeux, Vitaliy Liptchinsky, Naman Goyal, Siddhardth Goyal, Myle Ott, Sam Sheifer, Anjali Sridhar, Min Xu

FairScale is a library that extends basic PyTorch capabilities while adding new SOTA techniques for high performance and large scale training on one or multiple machines. FairScale makes available the latest distributed training techniques in the form of composable modules and easy to use APIs. Machine Learning (ML) training at scale traditionally means data parallelism to reduce training time by using multiple devices to train on larger batch size. Nevertheless, with the recent increase of ML models sizes data parallelism is no longer enough to satisfy all "scaling" needs. FairScale provides several options to overcome some of the limitations to scale. For scaling training that is bottlenecked by memory (optimizer state, intermediate activations, parameters), FairScale provides APIs that have implemented optimizer, gradient and parameter sharding. This will allow users to train large models using devices in a more memory efficient manner. To overcome the memory required for large models FairScale provides various flavors of pipeline and model parallelism, MOE (Mixture Of Experts) layer, and Offload models. Those methods allow to perform computation only of shards of the models across multiple devices with micro batches of data to maximize device efficiency. FairScale also provides modules to aid users to scale batch size effectively without changing their existing learning rate hyperparameter - AdaScale - and save memory with checkpoint activation of intermediate layers. FairScale has also been integrated into Pytorch Lightening, HuggingFace, FairSeq, VISSL, and MMF to enable users of those frameworks to take advantage of its features.

Distributed Training

AdaptDL: An Open-Source Resource-Adaptive Deep Learning Training/Scheduling Framework
Aurick Qiao, Sang Keun Choe, Suhas Jayaram Subramanya, Willie Neiswanger, Qirong Ho, Hao Zhang, Gregory R. Ganger, Eric P. Xing

AdaptDL is an open source framework and scheduling algorithm that directly optimizes cluster-wide training performance and resource utilization. By elastically re-scaling jobs, co-adapting batch sizes and learning rates, and avoiding network interference, AdaptDL improves shared-cluster training compared with alternative schedulers. AdaptDL can automatically determine the optimal number of resources given a job's need. It will efficiently add or remove resources dynamically to ensure the highest-level performance. The AdaptDL scheduler will automatically figure out the most efficient number of GPUs to allocate to your job, based on its scalability. When the cluster load is low, your job can dynamically expand to take advantage of more GPUs. AdaptDL offers an easy-to-use API to make existing PyTorch training code elastic with adaptive batch sizes and learning rates. Showcase: Distributed training and Data Loading

Distributed Training

Accelerate PyTorch large model training with ONNX Runtime: just add one line of code!
Natalie Kershaw

As deep learning models, especially transformer models get bigger and bigger, reducing training time becomes both a financial and environmental imperative. ONNX Runtime can accelerate large-scale distributed training of PyTorch transformer models with a one-line code change (in addition to import statements ;-)) Adding in the DeepSpeed library improves training speed even more. With the new ORTModule API, you wrap an existing torch.nn.Module, and have us automatically: export the model as an ONNX computation graph; compile and optimize it with ONNX Runtime; and integrate it into your existing training script. In this poster, we demonstrate how to fine-tune a popular HuggingFace model and show the performance improvement, on a multi-GPU cluster in the Azure Machine Learning cloud service.

https://aka.ms/pytorchort

Distributed Training

PyTorch/XLA with new Cloud TPU VMs and Profiler
Jack Cao, Daniel Sohn, Zak Stone, Shauheen Zahirazami

PyTorch / XLA enables users to train PyTorch models on XLA devices including Cloud TPUs. Cloud TPU VMs now provide direct access to TPU host machines and hence offer much greater flexibility in addition to making debugging easier and reducing data transfer overheads. PyTorch / XLA has now full support for this new architecture. A new profiling tool has also been developed to enable better profiling of PyTorch / XLA. These improvements not only make it much easier to develop models but also reduce the cost of large-scale PyTorch / XLA training runs on Cloud TPUs.

http://goo.gle/pt-xla-tpuvm-signup

Distributed Training

PyTorch Lightning: Deep Learning without the Boilerplate
Ari Bornstein

PyTorch Lightning reduces the engineering boilerplate and resources required to implement state-of-the-art AI. Organizing PyTorch code with Lightning enables seamless training on multiple-GPUs, TPUs, CPUs, and the use of difficult to implement best practices such as model sharding, 16-bit precision, and more, without any code changes. In this poster, we will use practical Lightning examples to demonstrate how to train Deep Learning models with less boilerplate.

https://www.pytorchlightning.ai/

Frontend & Experiment Manager

Accelerate PyTorch with IPEX and oneDNN using Intel BF16 Technology
Jiong Gong, Nikita Shustrov, Eikan Wang, Jianhui Li, Vitaly Fedyunin

Intel and Facebook collaborated to enable BF16, a first-class data type in PyTorch, and a data type that are accelerated natively with the 3rd Gen Intel® Xeon® scalable processors. This poster introduces the latest SW advancements added in Intel Extension for PyTorch (IPEX) on top of PyTorch and the oneAPI DNN library for ease-of-use and high-performance BF16 DL compute on CPU. With these SW advancements, we demonstrated ease-of-use IPEX user-facing API, and we also showcased 1.55X-2.42X speed-up with IPEX BF16 training over FP32 with the stock PyTorch and 1.40X-4.26X speed-up with IPEX BF16 inference over FP32 with the stock PyTorch.

https://github.com/intel/intel-extension-for-pytorch

Frontend & Experiment Manager

TorchStudio, a machine learning studio software based on PyTorch
Robin Lobel

TorchStudio is a standalone software based on PyTorch and LibTorch. It aims to simplify the creation, training and iterations of PyTorch models. It runs locally on Windows, Ubuntu and macOS. It can load, analyze and explore PyTorch datasets from the TorchVision or TorchAudio categories, or custom datasets with any number of inputs and outputs. PyTorch models can then be loaded and written from scratch, analyzed, and trained using local hardware. Trainings can be run simultaneously and compared to identify the best performing models, and export them as a trained TorchScript or ONNX model.

https://torchstudio.ai/

Frontend & Experiment Manager

Hydra Framework
Jieru Hu, Omry Yadan

Hydra is an open source framework for configuring and launching research Python applications. Key features: - Compose and override your config dynamically to get the perfect config for each run - Run on remote clusters like SLURM and AWS without code changes - Perform basic greed search and hyper parameter optimization without code changes - Command line tab completion for your dynamic config And more.

Frontend & Experiment Manager

PyTorch-Ignite: training common things easy and the hard things possible
Victor Fomin, Sylvain Desroziers, Taras Savchyn

This poster intends to give a brief but illustrative overview of what PyTorch-Ignite can offer for Deep Learning enthusiasts, professionals and researchers. Following the same philosophy as PyTorch, PyTorch-Ignite aims to keep it simple, flexible and extensible but performant and scalable. Throughout this poster, we will introduce the basic concepts of PyTorch-Ignite, its API and features it offers. We also assume that the reader is familiar with PyTorch.

Frontend & Experiment Manager

Farabio - Deep Learning Toolkit for Biomedical Imaging
Sanzhar Askaruly, Nurbolat Aimakov, Alisher Iskakov, Hyewon Cho

Deep learning has transformed many aspects of industrial pipelines recently. Scientists involved in biomedical imaging research are also benefiting from the power of AI to tackle complex challenges. Although the academic community has widely accepted image processing tools, such as scikit-image, ImageJ, there is still a need for a tool which integrates deep learning into biomedical image analysis. We propose a minimal, but convenient Python package based on PyTorch with common deep learning models, extended by flexible trainers and medical datasets.

https://github.com/tuttelikz/farabio

Medical & Healthcare

MONAI: A Domain Specialized Library for Healthcare Imaging
Michael Zephyr, Prerna Dogra Richard Brown, Wenqi Li, Eric Kerfoot

Healthcare image analysis for both radiology and pathology is increasingly being addressed with deep-learning-based solutions. These applications have specific requirements to support various imaging modalities like MR, CT, ultrasound, digital pathology, etc. It is a substantial effort for researchers in the field to develop custom functionalities to handle these requirements. Consequently, there has been duplication of effort, and as a result, researchers have incompatible tools, which makes it hard to collaborate. MONAI stands for Medical Open Network for AI. Its mission is to accelerate the development of healthcare imaging solutions by providing domain-specialized building blocks and a common foundation for the community to converge in a native PyTorch paradigm.

https://monai.io/

Medical & Healthcare

How theator Built a Continuous Training Framework to Scale Up Its Surgical Intelligence Platform
Shai Brown, Daniel Neimark, Maya Zohar, Omri Bar, Dotan Asselmann

Theator is re-imagining surgery with a Surgical Intelligence platform that leverages highly advanced AI, specifically machine learning and computer vision technology, to analyze every step, event, milestone, and critical junction of surgical procedures. Our platform analyzes lengthy surgical procedure videos and extracts meaningful information, providing surgeons with highlight reels of key moments in an operation, enhanced by annotations. As the team expanded, we realized that we were spending too much time manually running model training and focusing on DevOps tasks and not enough time dedicated to core research. To face this, we build an automation framework composed of multiple training pipelines using PyTorch and ClearML. Our framework automates and manages our entire process, from model development to deployment to continuous training for model improvement. New data is now immediately processed and fed directly into training pipelines – speeding up workflow, minimizing human error, and freeing up our research team for more important tasks. Thus, enabling us to scale our ML operation and deliver better models for our end users.

Medical & Healthcare

Q&Aid: A Conversation Agent Powered by PyTorch
Cebere Bogdan, Cebere Tudor, Manolache Andrei, Horia Paul-Ion

We present Q&Aid, a conversation agent that relies on a series of machine learning models to filter, label, and answer medical questions based on a provided image and text inputs. Q&Aid is simplifying the hospital logic backend by standardizing it to a Health Intel Provider (HIP). A HIP is a collection of models trained on local data that receives text and visual input, afterward filtering, labeling, and feeding the data to the right models and generating at the end output for the aggregator. Any hospital is identified as a HIP holding custom models and labeling based on its knowledge. The hospitals are training and fine-tuning their models, such as a Visual Question Answering (VQA) model, on private data (e.g. brain anomaly segmentation). We aggregate all of the tasks that the hospitals can provide into a single chat app, offering the results to the user. When the chat ends, the transcript is forwarded to each hospital, a doctor being in charge of the final decision.

https://qrgo.page.link/d1fQk

Medical & Healthcare

Sleepbot: Multi-signal Sleep Stage Classifier AI for hospital and home
Jaden Hong, Kevin Tran, Tyler Lee, Paul Lee, Freddie Cha, Louis Jung, Dr. Jung Kyung Hong, Dr. In-Young Yoon, David Lee

Sleep disorders and insomnia are now regarded as a worldwide problem. Roughly 62% of adults worldwide feel that they don't sleep well. However, sleep is difficult to track so it's not easy to get suitable treatment to improve your sleep quality. Currently, the PSG (Polysomnography) is the only way to evaluate the sleep quality accurately but it's expensive and often inaccurate due to the first night effect. We propose a multi-signal sleep stage classifier for contactless sleep tracking: Sleepbot. By automating the manual PSG reading and providing explainable analysis, Sleepbot opens a new possibility to apply sleep staging AI in both home and hospital. With sound recorded by a smartphone app and RF-sensed signal measured by Asleep's non-contact sleep tracker, Sleepbot provides a clinical level of sleep stage classification. Sleepbot achieved 85.5 % accuracy in 5-class (Wake, N1, N2, N3, Rem) using PSG signals measured from 3,700 subjects and 77 % accuracy in 3-class (Wake, Sleep, REM) classification using only sound data measured from 1,2000 subjects.

Medical & Healthcare

PyMDE: Minimum-Distortion Embedding
Akshay Agrawal, Alnur Ali, Stephen Boyd

We present a unifying framework for the vector embedding problem: given a set of items and some known relationships between them, we seek a representation of the items by vectors, possibly subject to some constraints (e.g., requiring the vectors to have zero mean and identity covariance). We want the vectors associated with similar items to be near each other, and vectors associated with dissimilar items to not be near, measured in Euclidean distance. We formalize this by introducing distortion functions, defined for some pairs of the items. Our goal is to choose an embedding that minimizes the total distortion, subject to the constraints. We call this the minimum-distortion embedding (MDE) problem. The MDE framework generalizes many well-known embedding methods, such as PCA, the Laplacian eigenmap, multidimensional scaling, UMAP, and others, and also includes new types of embeddings. Our accompanying software library, PyMDE, makes it easy for users to specify and approximately solve MDE problems, enabling experimentation with well-known and custom embeddings alike. By making use of automatic differentiation and hardware acceleration via PyTorch, we are able to scale to very large embedding problems. We will showcase examples of embedding real datasets, including an academic co-authorship network, single-cell mRNA transcriptomes, US census data, and population genetics.

Medical & Healthcare

TorchIO: Pre-Processing & Augmentation of Medical Images for Deep Learning Applications
Fernando Pérez-García, Rachel Sparks, Sébastien Ourselin

Processing of medical images such as MRI or CT presents unique challenges compared to RGB images typically used in computer vision. These include a lack of labels for large datasets, high computational costs, and metadata to describe the physical properties of voxels. Data augmentation is used to artificially increase the size of the training datasets. Training with image patches decreases the need for computational power. Spatial metadata needs to be carefully taken into account in order to ensure a correct alignment of volumes. We present TorchIO, an open-source Python library to enable efficient loading, preprocessing, augmentation and patch-based sampling of medical images for deep learning. TorchIO follows the style of PyTorch and integrates standard medical image processing libraries to efficiently process images during training of neural networks. TorchIO transforms can be composed, reproduced, traced and extended. We provide multiple generic preprocessing and augmentation operations as well as simulation of MRI-specific artifacts. TorchIO was developed to help researchers standardize medical image processing pipelines and allow them to focus on the deep learning experiments. It encourages open science, as it supports reproducibility and is version controlled so that the software can be cited precisely. Due to its modularity, the library is compatible with other frameworks for deep learning with medical images.

Medical & Healthcare

Deep Learning Based Model to Predict Covid19 Patients' Outcomes on Admission
Laila Rasmy, Ziqian Xie, Degui Zhi

With the extensive use of electronic records and the availability of historical patient information, predictive models that can help identify patients at risk based on their history at an early stage can be a valuable adjunct to clinician judgment. Deep learning models can better predict patients' outcomes by consuming their medical history regardless of the length and the complexity of such data. We used our Pytorch_EHR framework to train a model that can predict COVID-19 patient's health outcomes on admission. We used the Cerner Real-world COVID-19 (Q2) cohort which included information for 117,496 COVID patients from 62 health systems. We used a cohort of 55,068 patients and defined our outcomes including mortality, intubation, and hospitalization longer than 3 days as binary outcomes. We feed the model with all diagnoses, medication, laboratory results, and other clinical events information available before or on their first COVID-19 encounter admission date. We kept the data preprocessing at a minimum for convenience and practicality relying on the embedding layer that learns features representations from the large training set. Our model showed improved performance compared to other baseline machine learning models like logistic regression (LR). For in-hospital mortality, our model showed AUROC of 89.5%, 90.6%, and 84.3% for in-hospital mortality, intubation, and hospitalization for more than 3 days, respectively versus LR which showed 82.8%, 83.2%, and 76.8%

https://github.com/ZhiGroup/pytorch_ehr

Medical & Healthcare

Rolling out Transformers with TorchScript and Inferentia
Binghui Ouyang, Alexander O’Connor

While Transformers have brought unprecedented improvements in the accuracy and ease of developing NLP applications, their deployment remains challenging due to the large size of the models and their computational complexity. Indeed, until recently is has been a widespread misconception that hosting high-performance transformer-based models was prohibitively expensive, and technically challenging. Fortunately, recent advances in both the PyTorch ecosystem and in custom hardware for inference have created a world where models can be deployed in a cost-effective, scalable way, without the need for complex engineering. In this presentation, we will discuss the use of PyTorch and AWS Inferentia to deploy production-scale models in chatbot intent classification - a particularly relevant and demanding scenario. Autodesk deploys a number of transformer based models to solve customer support issues across our channels, and our ability to provide a flexible, high-quality machine learning solution is supported by leveraging cutting-edge technology such as transformer based classification. Our chatbot, AVA, responds to tens of thousands of customer interactions monthly, and we are evolving our architecture to be supported by customer inference. We will discuss our experience of piloting transformer-based intent models, and present a workflow for going from data to deployment for similar projects.

NLP & Multimodal, RL & Time Series

PyTorchTS: PyTorch Probabilistic Time Series Forecasting Framework
Kashif Rasul

PyTorchTS is a PyTorch based Probabilistic Time Series forecasting framework that comes with state of the art univariate and multivariate models.

https://github.com/zalandoresearch/pytorch-ts

NLP & Multimodal, RL & Time Series

MMF: A modular framework for multimodal research
Sasha Sheng, Amanpreet Singh

MMF is designed from ground up to let you focus on what matters -- your model -- by providing boilerplate code for distributed training, common datasets and state-of-the-art pretrained baselines out-of-the-box. MMF is built on top of PyTorch that brings all of its power in your hands. MMF is not strongly opinionated. So you can use all of your PyTorch knowledge here. MMF is created to be easily extensible and composable. Through our modular design, you can use specific components from MMF that you care about. Our configuration system allows MMF to easily adapt to your needs.

NLP & Multimodal, RL & Time Series

AllenNLP: An NLP research library for developing state-of-the-art models
Dirk Groeneveld, Akshita Bhagia, Pete Walsh, Michael Schmitz

An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks.

https://github.com/allenai/allennlp

NLP & Multimodal, RL & Time Series

Project Spock at Tubi: Understanding Content using Deep Learning for NLP
John Trenkle, Jaya Kawale & Tubi ML team

Tubi is one of the leading platforms providing free high-quality streaming movies and TV shows to a worldwide audience. We embrace a data-driven approach and leverage advanced machine learning techniques using PyTorch to enhance our platform and business in any way we can. The Three Pillars of AVOD are the guiding principle for our work. The Pillars are Content: all the titles we maintain in our library Audience: everyone who watches titles on Tubi Advertising: ads shown to viewers on behalf of brands In this poster, we'll focus on the Content aspect with more details for the various use cases especially Content Understanding. Content is an important pillar of Tubi since to be successful, we need to look at existing titles and beyond what we already have and attempt to understand all of the titles out in the wild and how they could benefit our platform in some fashion. Content Understanding revolves around digesting a rich collection of 1st- and 3rd-party data in structured (metadata) and unstructured (text) forms and developing representations that capture the essence of those Titles. With the analogy of linear algebra, we can say we are attempting to project Title vectors from the universe to our tubiverse with as much fidelity as possible in order to ascertain potential value for each target use case. We will describe several techniques to understand content better using Pytorch.

NLP & Multimodal, RL & Time Series

RL Based Performance Optimization of Deep Neural Networks
Benoit Steiner, Chris Cummins, Horace He, Hugh Leather

As the usage of machine learning techniques is becoming ubiquitous, the efficient execution of neural networks is crucial to many applications. Frameworks, such as Halide and TVM, separate the algorithmic representation of the deep learning model from the schedule that determines its implementation. Finding good schedules, however, remains extremely challenging. Auto-tuning methods, which search the space of valid schedules and execute each candidate on the hardware, identify some of the best performing schedules, but the search can take hours, hampering the productivity of deep learning practitioners. What is needed is a method that achieves a similar performance without extensive search, delivering the needed efficiency quickly. Using PyTorch, we model the scheduling process as a sequence of optimization choices, and implement a new technique to accurately predict the expected performance of a partial schedule using a LSTM over carefully engineered features that describe each DNN operator and their current scheduling choices. Leveraging these predictions we are able to make these optimization decisions greedily and, without any executions on the target hardware, rapidly identify an efficient schedule. This techniques enables to find schedules that improve the execution performance of deep neural networks by 2.6× over Halide and 1.5× over TVM. Moreover, our technique completes in seconds instead of hours, making it possible to include it as a new backend for PyTorch itself.

http://facebook.ai

NLP & Multimodal, RL & Time Series

A Data-Centric Framework for Composable NLP
Zhenghong Liu

Forte is an open-source toolkit for building Natural Language Processing workflows via assembling state-of-the-art NLP and ML technologies. This toolkit features composable pipeline, cross-task interaction, adaptable data-model interfaces. The highly composable design allows users to build complex NLP pipelines of a wide range of tasks including document retrieval, information extraction, and text generation by combining existing toolkits or customized PyTorch models. The cross-task interaction ability allows developers to utilize the results from individual tasks to make informed decisions. The data-model interface helps developers to focus on building reusable PyTorch models by abstracting out domain and preprocessing details. We show that Forte can be used to build complex pipelines, and the resulting pipeline can be easily adapted to different domains and tasks with small changes in the code.

https://github.com/asyml/forte

NLP & Multimodal, RL & Time Series

Environments and Baselines for Multitask Reinforcement Learning
Shagun Sodhani, Amy Zhang, Ludovic Denoyer, Pierre-Alexandre Kamienny, Olivier Delalleau

The two key components in a multi-task RL codebase are (i) Multi-task RL algorithms and (ii) Multi-task RL environments. We develop open-source libraries for both components. [MTRL](https://github.com/facebookresearch/mtrl) provides components to implement multi-task RL algorithms, and [MTEnv](https://github.com/facebookresearch/mtenv) is a library to interface with existing multi-task RL environments and create new ones. MTRL has two building blocks: (i) single task policy and (ii) components to augment the single-task policy for multi-task setup. The ideal workflow is to start with a base policy and add multi-task components as they seem fit. MTRL enables algorithms like GradNorm, Distral, HiPBMDP, PCGrad, Soft Modularization, etc. MTEnv is an effort to standardize multi-task RL environments and provide better benchmarks. We extend the Gym API to support multiple tasks, with two guiding principles: (i) Make minimal changes to the Gym Interface (which the community is very familiar with) and (ii) Make it easy to port existing environments to MTEnv. Additionally, we provide a collection of commonly used multi-task RL environments (Acrobot, Cartpole, Multitask variant of DeepMind Control Suite, Meta-World, Multi-armed Bandit, etc.). The RL practitioner can combine its own environments with the MTEnv wrappers to add multi-task support with a small code change. MTRL and MTEnv are used in several ongoing/published works at FAIR.

http://qr.w69b.com/g/tGZSFw33G

NLP & Multimodal, RL & Time Series

The Hugging Face Ecosystem
Lysandre Debut, Sylvain Gugger, Quentin Lhoest 

Transfer learning has become the norm to get state-of-the-art results in NLP. Hugging Face provides you with tools to help you on every step along the way: - A free git-based shared hub with more than 7,500 PyTorch checkpoints, and more than 800 NLP datasets. - The ? Datasets library, to easily download the dataset, manipulate it and prepare it. - The ? Tokenizers library, that provides ultra-fast tokenizers backed by Rust, and converts text in PyTorch tensors. - The ? Transformers library, providing more than 45 PyTorch implementations of Transformer architectures as simple nn.Module as well as a training API. - The ? Accelerate library, a non-intrusive API that allows you to run your raw training loop on any distributed setup. The pipeline is then simply a six-step process: select a pretrained model from the hub, handle the data with Datasets, tokenize the text with Tokenizers, load the model with Transformers, train it with the Trainer or your own loop powered by Accelerate, before sharing your results with the community on the hub.

https://huggingface.co/models

NLP & Multimodal, RL & Time Series

 Asteroid: the Pytorch-based Audio Source Separation Toolkit for Researchers
Manuel Pariente, Samuele Cornell, Jonas Haag, Joris Cosentino, Michel Olvera, Fabian-Robert Stöter, Efthymios Tzinis

Asteroid is an audio source separation toolkit built with PyTorch and PyTorch-Lightning. Inspired by the most successful neural source separation systems, it provides all neural building blocks required to build such a system. To improve reproducibility, recipes on common audio source separation datasets are provided, including all the steps from data download\preparation through training to evaluation as well as many current state-of-the-art DNN models. Asteroid exposes all levels of granularity to the user from simple layers to complete ready-to-use models. Our pretrained models are hosted on the asteroid-models community in Zenodo and on the Huggingface model Hub. Loading and using pretrained models is trivial and sharing them is also made easy with asteroid's CLI.","poster_showcase":"Audio Source Separation, Speech Processing, Deep Learning","email":"cornellsamuele@gmail.com"}

https://asteroid-team.github.io/

NLP & Multimodal, RL & Time Series

rlstructures: A Lightweight Python Library for Reinforcement Learning Research
Ludovic Denoyer, Danielle Rothermel, Xavier Martinet

RLStructures is a lightweight Python library that provides simple APIs as well as data structures that make as few assumptions as possible about the structure of your agent or your task, while allowing for transparently executing multiple policies on multiple environments in parallel (incl. multiple GPUs). It thus facilitates the implementation of RL algorithms while avoiding complex abstractions.

NLP & Multimodal, RL & Time Series

MBRL-Lib: a PyTorch toolbox for model-based reinforcement learning research
Luis Pineda, Brandon Amos, Amy Zhang, Nathan O. Lambert, Roberto Calandra

Model-based reinforcement learning (MBRL) is an active area of research with enormous potential. In contrast to model-free RL, MBRL algorithms solve tasks by learning a predictive model of the task dynamics, and use this model to predict the future and facilitate decision making. Many researchers have argued that MBRL can result in lower sample complexity, better generalization, as well as safer and more interpretable decisions. However, despite the surge in popularity and great potential of MBRL, there is currently no widely accepted library for facilitating research in this area. Since MBRL methods often involve the interplay of complex components such as probabilistic ensembles, latent variable models, planning algorithms, and even model-free methods, the lack of such a library raises the entry bar to the field and slows down research efforts. In this work we aim to solve this problem by introducing MBRL-Lib, a modular PyTorch toolbox specifically designed for facilitating research on model-based reinforcement learning. MBRL-Lib provides interchangeable options for training dynamics models and running planning algorithms, which can then be used in a mix and match fashion to create novel MBRL methods. The library also provides a set of utility functions to run common MBRL tasks, as well a set of diagnostics tools to identify potential issues while training dynamics models and control algorithms.

https://github.com/facebookresearch/mbrl-lib

NLP & Multimodal, RL & Time Series

Introducing New PyTorch Profiler
Geeta Chauhan, Gisle Dankel, Elena Neroslavaskaya

Analyzing and improving large-scale deep learning model performance is an ongoing challenge that continues to grow in importance as the model sizes increase. Microsoft and Facebook collaborated to create a native PyTorch performance debugging tool called PyTorch Profiler. The profiler builds on the PyTorch autograd profiler foundation, adds a new high fidelity GPU profiling engine, and out-of-the-box bottleneck analysis tool in Tensorboard. New Profiler delivers the simplest experience available to date where users can profile their models without installing any additional packages and see results immediately in Tensorboard. Until today, beginner users of PyTorch may not have attempted to profile their models due to the task complexity. With the new bottleneck analysis tool, they will find profiling easy and accessible. Experienced users will be delighted by the detailed trace views which illustrate GPU kernel execution events and their relationship to the PyTorch operations. Come learn how to profile your PyTorch models using this new delightfully simple tool.

https://pytorch.org/blog/introducing-pytorch-profiler-the-new-and-improved-performance-tool

Performance & Profiler

TRTorch: A Compiler for TorchScript Targeting NVIDIA GPUs with TensorRT
Naren Dasan

For experimentation and the development of machine learning models, few tools are as approachable as PyTorch. However, when moving from research to production, some of the features that make PyTorch great for development make it hard to deploy. With the introduction of TorchScript, PyTorch has solid tooling for addressing some of the problems of deploying PyTorch models. TorchScript removes the dependency on Python and produces portable, self contained, static representations of code and weights. But in addition to portability, users also look to optimize performance in deployment. When deploying on NVIDIA GPUs, TensorRT, NVIDIA's deep learning optimizer, provides the capability to maximize performance of workloads by tuning the execution of models for specific target hardware. TensorRT also provides tooling for conducting further optimization through mixed and reduced precision execution and post training quantization (PTQ). We present TRTorch, a compiler for PyTorch and TorchScript targeting NVIDIA GPUs, which combines the usability of PyTorch with the performance of TensorRT and allows users to fully optimize their inference workloads without leaving the PyTorch ecosystem. It also simplifies conducting complex optimizations like PTQ by leveraging common PyTorch tooling. TRTorch can be used directly from PyTorch as a TorchScript Backend, embedded in an application or used from the command line to easily increase the performance of inference applications.

https://nvidia.github.io/TRTorch/

Performance & Profiler

WeightWatcher: A Diagnostic Tool for DNNs
Charles H. Martin

WeightWatcher (WW) is an open-source, diagnostic tool for analyzing Deep Neural Networks (DNN), without needing access to training or even test data. It can be used to: analyze pre/trained pyTorch models; inspect models that are difficult to train; gauge improvements in model performance; predict test accuracies across different models; and detect potential problems when compressing or fine-tuning pretrained models. WeightWatcher is based on theoretical research (done in\-joint with UC Berkeley) into "Why Deep Learning Works", using ideas from Random Matrix Theory (RMT), Statistical Mechanics, and Strongly Correlated Systems.

Performance & Profiler

Constrained Optimization in PyTorch 1.9 Through Parametrizations
Mario Lezcano-Casado

"This poster presents the ""parametrizations"" feature that will be added to PyTorch in 1.9.0. This feature allows for a simple implementation of methods like pruning, weight_normalization or spectral_normalization. More generally, it implements a way to have ""computed parameters"". This means that we replace a parameter `weight` in a layer with `f(weight)`, where `f` is an arbitrary module. In other words, after putting a parametrization `f` on `layer.weight`, `layer.weight` will return `f(weight)`. They implement a caching system, so that the value `f(weight)` is computed just once during the forward pass. A module that implements a parametrisation may also have a `right_inverse` method. If this method is present, it is possible to assign to a parametrised tensor. This is useful when initialising a parametrised tensor. This feature can be seen as a first step towards invertible modules. In particular, it may also help making distributions first-class citizens in PyTorch. Parametrisations also allows for a simple implementation of constrained optimisation. From this perspective, parametrisation maps an unconstrained tensor to a constrained space such as the space of orthogonal matrices, SPD matrices, low-rank matrices... This approach is implemented in the library GeoTorch (https://github.com/Lezcano/geotorch/)."

Performance & Profiler

Distributed Pytorch with Ray
Richard Liaw, Kai Fricke, Amog Kamsetty, Michael Galarnyk

Ray is a popular framework for distributed Python that can be paired with PyTorch to rapidly scale machine learning applications. Ray contains a large ecosystem of applications and libraries that leverage and integrate with Pytorch. This includes Ray Tune, a Python library for experiment execution and hyperparameter tuning at any scale; RLlib, a state-of-the-art library for reinforcement learning; and Ray Serve, a library for scalable model serving. Together, Ray and Pytorch are becoming the core foundation for the next generation of production machine learning platforms.

https://ray.io/

Platforms & Ops & Tools

Avalanche: an End-to-End Library for Continual Learning based on PyTorch
Vincenzo Lomonaco, Lorenzo Pellegrini Andrea Cossu, Antonio Carta, Gabriele Graffieti

Learning continually from non-stationary data stream is a long sought goal of machine learning research. Recently, we have witnessed a renewed and fast-growing interest in Continual Learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standard benchmarks are hard to reproduce. In this work, we propose an open-source, end-to-end library for continual learning based on PyTorch that may provide a shared and collaborative code-base for fast prototyping, training and reproducible evaluation of continual learning algorithms.

https://avalanche.continualai.org

Platforms & Ops & Tools

PyTorch on IBM Z and LinuxONE (s390x)
Hong Xu

IBM Z is a hardware product line for mission-critical applications, such as finance and health applications. It employs its own CPU architecture, which PyTorch does not officially support. In this poster, we discuss why it is important to support PyTorch on Z. Then, we show our prebuilt minimal PyTorch package for IBM Z. Finally, we demonstrate our continuing commitment to make more PyTorch features available on IBM Z.

https://codait.github.io/pytorch-on-z

Platforms & Ops & Tools

The Fundamentals of MLOps for R&D: Orchestration, Automation, Reproducibility
Dr. Ariel Biller

Both from sanity considerations and the productivity perspective, Data Scientists, ML engineers, Graduate students, and other research-facing roles are all starting to adopt best-practices from production-grade MLOps. However, most toolchains come with a hefty price of extra code and maintenance, which reduces the actual time available for R&D. We will show an alternative approach using ClearML, the open-source MLOps solution. In this "best-practices" poster, we will overview the "must-haves" of R&D-MLOPs: Orchestration, Automation, and Reproducibility. These enable easy remote execution through magically reproducible setups and even custom, reusable, bottom-up pipelines. We will take a single example and schematically transform it from the "as downloaded from GitHub" stage to a fully-fledged, scalable, version-controlled, parameterizable R&D pipeline. We will measure the number of changes needed to the codebase and provide evidence of real low-cost integration. All code, logs, and metrics will be available as supporting information.

Platforms & Ops & Tools

FairTorch: Aspiring to Mitigate the Unfairness of Machine Learning Models
Masashi Sode, Akihiko Fukuchi, Yoki Yabe, Yasufumi Nakata

Is your machine learning model fair enough to be used in your system? What if a recruiting AI discriminates on gender and race? What if the accuracy of medical AI depends on a person's annual income or on the GDP of the country where it is used? Today's AI has the potential to cause such problems. In recent years, fairness in machine learning has received increasing attention. If current machine learning models used for decision making may cause unfair discrimination, developing a fair machine learning model is an important goal in many areas, such as medicine, employment, and politics. Despite the importance of this goal to society, as of 2020, there was no PyTorch¹ project incorporating fairness into a machine learning model. To solve this problem, we created FairTorch at the PyTorch Summer Hackathon 2020. FairTorch provides a tool to mitigate the unfairness of machine learning models. A unique feature of our tool is that it allows you to add a fairness constraint to your model by adding only a few lines of code, using the fairness criteria provided in the library.

https://github.com/wbawakate/fairtorch

Platforms & Ops & Tools

TorchDrift: Drift Detection for PyTorch
Thomas Viehmann, Luca Antiga

When machine learning models are deployed to solve a given task, a crucial question is whether they are actually able to perform as expected. TorchDrift addresses one aspect of the answer, namely drift detection, or whether the information flowing through our models - either probed at the input, output or somewhere in-between - is still consistent with the one it was trained and evaluated on. In a nutshell, TorchDrift is designed to be plugged into PyTorch models and check whether they are operating within spec. TorchDrift's principles apply PyTorch's motto _from research to production_ to drift detection: We provide a library of methods that canbe used as baselines or building blocks for drift detection research, as well as provide practitioners deploying PyTorch models in production with up-to-date methods and educational material for building the necessary statistical background. Here we introduce TorchDrift with an example illustrating the underlying two-sample tests. We show how TorchDrift can be integrated in high-performance runtimes such as TorchServe or RedisAI, to enable drift detection in real-world applications thanks to the PyTorch JIT.

https://torchdrift.org/

Platforms & Ops & Tools

Ouroboros: MLOps for Automated Driving
Quincy Chen, Arjun Bhargava, Sudeep Pillai, Marcus Pan, Chao Fang, Chris Ochoa, Adrien Gaidon, Kuan-Hui Lee, Wolfram Burgard

Modern machine learning for autonomous vehicles requires a fundamentally different infrastructure and production lifecycle from their standard software continuous-integration/continuous-deployment counterparts. At Toyota Research Institute (TRI), we have developed ​Ouroboros​ - a modern ML platform that supports the end-to-end lifecycle of all ML models delivered to TRI's autonomous vehicle fleets. We envision that all ML models delivered to our fleet undergo a systematic and rigorous treatment. Ouroboros delivers several essential features including: a. ML dataset governance and infrastructure-as-code​ that ensures the traceability, reproducibility, standardization, and fairness for all ML datasets and models procedurally generated and delivered to the TRI fleet. b. Unified ML dataset and model management:​ An unified and streamlined workflow for ML dataset curation, label management, and model development that supports several key ML models delivered to the TRI fleet today c. A Large-scale Multi-task, Multi-modal Dataset for Automated Driving​ that supports the development of various models today, including 3D object detection, 2D object detection, 2D BeVFlow, Panoptic Segmentation; d. Orchestrated ML workflows​ to stand up scalable ML applications such as push-button re-training solutions, ML CI/CDs pipelines, Dataset Curation workflows, Auto-labelling pipelines, leveraging the most up-to-date cloud tools available. along their lifecycles, ensuring strong governance on building reusable, reproducible, robust, traceable, and fair ML models for the production driving setting. By following the best MLOps practices, we expect our platform to lay the foundation for continuous life-long learning in our autonomous vehicle fleets and accelerate the transition from research to production.

https://github.com/TRI-ML

Platforms & Ops & Tools

carefree-learn: Tabular Datasets ❤️ PyTorch
Yujian He

carefree-learn makes PyTorch accessible to people who are familiar with machine learning but not necessarily PyTorch. By having already implemented all the pre-processing and post-processing under the hood, users can focus on implementing the core machine learning algorithms / models with PyTorch and test them on various datasets. By having designed the whole structure carefully, users can easily customize every block in the whole pipeline, and can also 'combine' the implemented blocks to 'construct' new models without efforts. By having carefully made abstractions users can adapt it to their specific down-stream tasks, such as quantitative trading (in fact I've already implemented one for my company and it works pretty well XD). carefree-learn handles distributed training carefully, so users can either run multiple tasks at the same time, or run a huge model with DDP in one line of code. carefree-learn also integrates with mlflow and supports exporting to ONNX, which means it is ready for production to some extend.

Platforms & Ops & Tools

OpenMMLab: An Open-Source Algorithm Platform for Computer Vision
Wenwei Zhang

OpenMMLab project builds open-source toolboxes for Artificial Intelligence (AI). It aims to 1) provide high-quality codebases to reduce the difficulties in algorithm reimplementation; 2) provide a complete research platform to accelerate the research production; and 3) shorten the gap between research production to the industrial applications. Based on PyTorch, OpenMMLab develops MMCV to provide unified abstract training APIs and common utils, which serves as a foundation of 15+ toolboxes and 40+ datasets. Since the initial release in October 2018, OpenMMLab has released 15+ toolboxes that cover 10+ directions, implement 100+ algorithms, and contain 1000+ pre-trained models. With a tighter collaboration with the community, OpenMMLab will release more toolboxes with more flexible and easy-to-use training frameworks in the future.

https://openmmlab.com/

Platforms & Ops & Tools

Catalyst – Accelerated deep learning R&D
Sergey Kolesnikov

For the last three years, Catalyst-Team and collaborators have been working on Catalyst  - a high-level PyTorch framework Deep Learning Research and Development. It focuses on reproducibility, rapid experimentation, and codebase reuse so you can create something new rather than write yet another train loop. You get metrics, model checkpointing, advanced logging, and distributed training support without the boilerplate and low-level bugs.

https://catalyst-team.com

Platforms & Ops & Tools

High-fidelity performance metrics for generative models in PyTorch
Anton Obukhov

Evaluation of generative models such as GANs is an important part of deep learning research. In 2D image generation, three approaches became widely spread: Inception Score, Fréchet Inception Distance, and Kernel Inception Distance. Despite having a clear mathematical and algorithmic description, these metrics were initially implemented in TensorFlow and inherited a few properties of the framework itself, such as a specific implementation of the interpolation function. These design decisions were effectively baked into the evaluation protocol and became an inherent part of the specification of the metrics. As a result, researchers wishing to compare against state of the art in generative modeling are forced to perform an evaluation using the original metric authors' codebases. Reimplementations of metrics in PyTorch and other frameworks exist, but they do not provide a proper level of fidelity, thus making them unsuitable for reporting results and comparing them to other methods. This software aims to provide epsilon-exact implementations of the said metrics in PyTorch and remove inconveniences associated with generative model evaluation and development. All the evaluation pipeline steps are correctly tested, with relative errors and sources of remaining non-determinism summarized in sections below. TLDR; fast and reliable GAN evaluation in PyTorch

https://github.com/toshas/torch-fidelity

Platforms & Ops & Tools

Using Satellite Imagery to Identify Oceanic Oil Pollution
Jona Raphael (jona@skytruth.org), Ben Eggleston, Ryan Covington, Tatianna Evanisko, John Amos

Operational oil discharges from ships, also known as "bilge dumping," have been identified as a major source of petroleum products entering our oceans, cumulatively exceeding the largest oil spills, such as the Exxon Valdez and Deepwater Horizon spills, even when considered over short time spans. However, we still don't have a good estimate of ● How much oil is being discharged; ● Where the discharge is happening; ● Who the responsible vessels are. This makes it difficult to prevent and effectively respond to oil pollution that can damage our marine and coastal environments and economies that depend on them. In this poster we will share SkyTruth's recent work to address these gaps using machine learning tools to detect oil pollution events and identify the responsible vessels when possible. We use a convolutional neural network (CNN) in a ResNet-34 architecture to perform pixel segmentation on all incoming Sentinel-1 synthetic aperture radar (SAR) imagery to classify slicks. Despite the satellites' incomplete oceanic coverage, we have been detecting an average of 135 vessel slicks per month, and have identified several geographic hotspots where oily discharges are occurring regularly. For the images that capture a vessel in the act of discharging oil, we rely on an Automatic Identification System (AIS) database to extract details about the ships, including vessel type and flag state. We will share our experience ● Making sufficient training data from inherently sparse satellite image datasets; ● Building a computer vision model using PyTorch and fastai; ● Fully automating the process in the Amazon Web Services (AWS) cloud. The application has been running continuously since August 2020, has processed over 380,000 Sentinel-1 images, and has populated a database with more than 1100 high-confidence slicks from vessels. We will be discussing preliminary results from this dataset and remaining challenges to be overcome. Learn more at https://skytruth.org/bilge-dumping/

Vision

UPIT: A fastai Package for Unpaired Image-to-Image Translation
Tanishq Abraham

Unpaired image-to-image translation algorithms have been used for various computer vision tasks like style transfer and domain adaption. Such algorithms are highly attractive because they alleviate the need for the collection of paired datasets. In this poster, we demonstrate UPIT, a novel fastai/PyTorch package (built with nbdev) for unpaired image-to-image translation. It implements various state-of-the-art unpaired image-to-image translation algorithms such as CycleGAN, DualGAN, UNIT, and more. It enables simple training and inference on unpaired datasets. It also comes with implementations of commonly used metrics like FID, KID, and LPIPS. It also comes with Weights-and-Biases integration for easy experiment tracking. Since it is built on top of fastai and PyTorch, it comes with support for mixed-precision and multi-GPU training. It is highly flexible, and custom dataset types, models, and metrics can be used as well. With UPIT, training and applying unpaired image-to-image translation only takes a few lines of code.

https://github.com/tmabraham/UPIT

Vision

PyTorchVideo: A Deep Learning Library for Video Understanding
Aaron Adcock, Bo Xiong, Christoph Feichtenhofer, Haoqi Fan, Heng Wang, Kalyan Vasudev Alwala, Matt Feiszli, Tullie Murrell, Wan-Yen Lo, Yanghao Li, Yilei Li, Zhicheng Yan

PyTorchVideo is the new Facebook AI deep learning library for video understanding research. It contains variety of state of the art pretrained video models, dataset, augmentation, tools for video understanding. PyTorchVideo provides efficient video components on accelerated inference on mobile device.

https://pytorchvideo.org/

Vision

Deep Learning Enables Fast and Dense Single-Molecule Localization with High Accuracy
A. Speiser, L-R. Müller, P. Hoess, U. Matti, C. J. Obara, J. H. Macke, J. Ries, S. C. Turaga

Single-molecule localization microscopy (SMLM) has had remarkable success in imaging cellular structures with nanometer resolution, but the need for activating only single isolated emitters limits imaging speed and labeling density. Here, we overcome this major limitation using deep learning. We developed DECODE, a computational tool that can localize single emitters at high density in 3D with the highest accuracy for a large range of imaging modalities and conditions. In a public software benchmark competition, it outperformed all other fitters on 12 out of 12 data-sets when comparing both detection accuracy and localization error, often by a substantial margin. DECODE allowed us to take live-cell SMLM data with reduced light exposure in just 3 seconds and to image microtubules at ultra-high labeling density. Packaged for simple installation and use, DECODE will enable many labs to reduce imaging times and increase localization density in SMLM.

http://github.com/turagalab/decode

Vision

A Robust PyTorch Trainable Entry Convnet Layer in Fourier Domain
Abraham Sánchez, Guillermo Mendoza, E. Ulises Moya-Sánchez

We draw inspiration from the cortical area V1. We try to mimic their main processing properties by means of: quaternion local phase/orientation to compute lines and edges detection in a specific direction. We analyze how this layer is robust by its greometry to large illumination and brightness changes.

https://gitlab.com/ab.sanchezperez/pytorch-monogenic

Vision

PyroNear: Embedded Deep Learning for Early Wildfire Detection
François-Guillaume Fernandez, Mateo Lostanlen, Sebastien Elmaleh, Bruno Lenzi, Felix Veith, and more than 15+ contributors

"PyroNear is non-profit organization composed solely of volunteers which was created in late 2019. Our core belief is that recent technological developments can support the cohabitation between mankind & its natural habitat. We strive towards high-performing, accessible & affordable tech-solutions for protection against natural hazards. More specifically, our first efforts are focused on wildfire protection by increasing the coverage of automatic detection systems. Our ongoing initiative has now gathered dozens of volunteers to put up the following main contributions: - Computer Vision: compiling open-source models and datasets (soon to be published) for vision tasks related to wildfire detection - Edge Computing: developing an affordable physical prototype running our PyTorch model on a Raspberry Pi - End-to-end detection workflow: building a responsible end-to-end system for large scale detection and alert management (API, front-end monitoring platform) - Deployment: working with French firefighter departments to gather field knowledge and conduct a test phase over the incoming European summer." PyTorch3D is a modular and optimized library for 3D Deep Learning with PyTorch. It includes support for: data structures for heterogeneous batching of 3D data (Meshes, Point clouds and Volumes), optimized 3D operators and loss functions (with custom CUDA kernels), a modular differentiable rendering API for Meshes, Point clouds and Implicit functions, as well as several other tools for 3D Deep Learning.

https://github.com/pyronear

Vision

PyTorch3D: Fast, Flexible, 3D Deep Learning
Nikhila Ravi, Jeremy Reizenstein, David Novotny, Justin Johnson, Georgia Gkioxari, Roman Shapovalov, Patrick Labatut, Wan-Yen Lo

PyTorch3D is a modular and optimized library for 3D Deep Learning with PyTorch. It includes support for: data structures for heterogeneous batching of 3D data (Meshes, Point clouds and Volumes), optimized 3D operators and loss functions (with custom CUDA kernels), a modular differentiable rendering API for Meshes, Point clouds and Implicit functions, as well as several other tools for 3D Deep Learning.

https://arxiv.org/abs/2007.08501

Vision

Kornia: an Open Source Differentiable Computer Vision Library for PyTorch
E. Riba, J. Shi, D. Mishkin, L. Ferraz, A. Nicolao

This work presents Kornia, an open source computer vision library built upon a set of differentiable routines and modules that aims to solve generic computer vision problems. The package uses PyTorch as its main backend, not only for efficiency but also to take advantage of the reverse auto-differentiation engine to define and compute the gradient of complex functions. Inspired by OpenCV, Kornia is composed of a set of modules containing operators that can be integrated into neural networks to train models to perform a wide range of operations including image transformations,camera calibration, epipolar geometry, and low level image processing techniques, such as filtering and edge detection that operate directly on high dimensional tensor representations on graphical processing units, generating faster systems. Examples of classical vision problems implemented using our framework are provided including a benchmark comparing to existing vision libraries.

http://www.kornia.org

Vision

NNGeometry: Easy and Fast Fisher Information Matrices and Neural Tangent Kernels in PyTorch
Thomas George

Fisher Information Matrices (FIM) and Neural Tangent Kernels (NTK) are useful tools in a number of diverse applications related to neural networks. Yet these theoretical tools are often difficult to implement using current libraries for practical size networks, given that they require per-example gradients, and a large amount of memory since they scale as the number of parameters (for the FIM) or the number of examples x cardinality of the output space (for the NTK). NNGeometry is a PyTorch library that offers a high level API for computing various linear algebra operations such as matrix-vector products, trace, frobenius norm, and so on, where the matrix is either the FIM or the NTK, leveraging recent advances in approximating these matrices.

https://github.com/tfjgeorge/nngeometry/

Vision

CompressAI: a research library and evaluation platform for end-to-end compression
Bégaint J., Racapé F., Feltman S., Pushparaja A.

CompressAI is a PyTorch library that provides custom operations, layers, modules and tools to research, develop and evaluate end-to-end image and video compression codecs. In particular, CompressAI includes pre-trained models and evaluation tools to compare learned methods with traditional codecs. State-of-the-art end-to-end compression models have been reimplemented in PyTorch and trained from scratch, reproducing published results and allowing further research in the domain.

Vision

pystiche: A Framework for Neural Style Transfer
Philip Meier, Volker Lohweg

The seminal work of Gatys, Ecker, and Bethge gave birth to the field of _Neural Style Transfer_ (NST) in 2016. An NST describes the merger between the content and artistic style of two arbitrary images. This idea is nothing new in the field of Non-photorealistic rendering (NPR). What distinguishes NST from traditional NPR approaches is its generality: an NST only needs a single arbitrary content and style image as input and thus "makes -- for the first time -- a generalized style transfer practicable". Besides peripheral tasks, an NST at its core is the definition of an optimization criterion called _perceptual loss_, which estimates the perceptual quality of the stylized image. Usually the perceptual loss comprises a deep neural network that needs to supply encodings of images from various depths. `pystiche` is a library for NST written in Python and built upon PyTorch. It provides modular and efficient implementations for commonly used perceptual losses as well as neural net architectures. This enables users to mix current state-of-the-art techniques with new ideas with ease. This poster will showcase the core concepts of `pystiche` that will enable other researchers as well as lay persons to got an NST running in minutes.

https://github.com/pmeier/pystiche

Vision

GaNDLF – A Generally Nuanced Deep Learning Framework for Clinical Imaging Workflows
Siddhish Thakur

Deep Learning (DL) has greatly highlighted the potential impact of optimized machine learning in both the scientific and clinical communities. The advent of open-source DL libraries from major industrial entities, such as TensorFlow (Google), PyTorch (Facebook), further contributes to DL promises on the democratization of computational analytics. However, increased technical and specialized background is required to develop DL algorithms, and the variability of implementation details hinders their reproducibility. Towards lowering the barrier and making the mechanism of DL development, training, and inference more stable, reproducible, and scalable, without requiring an extensive technical background, this manuscript proposes the Generally Nuanced Deep Learning Framework (GaNDLF). With built-in support for k-fold cross-validation, data augmentation, multiple modalities and output classes, and multi-GPU training, as well as the ability to work with both radiographic and histologic imaging, GaNDLF aims to provide an end-to-end solution for all DL-related tasks, to tackle problems in medical imaging and provide a robust application framework for deployment in clinical workflows. Keywords: Deep Learning, Framework, Segmentation, Regression, Classification, Cross-validation, Data augmentation, Deployment, Clinical, Workflows

Vision