High performance distributed framework for training deep learning recommendation models based on PyTorch.

Last update: Dec 30, 2022

Overview

PERSIA (Parallel rEcommendation tRaining System with hybrId Acceleration) is developed by AI platform@Kuaishou Technology, in collaboration with ETH. To the best of our knowledge, it is the first public PyTorch-based system for training large-scale deep learning recommendation models on commodity hardware. It can train recommendation models with up to 100 trillion parameters, which, to the best of our knowledge, is the largest model size reported for recommendation systems so far. Empirical studies on public datasets indicate that PERSIA has a significant advantage over several existing recommendation training systems [1]. Its efficiency and robustness have also been validated by multiple applications at Kuaishou, each with daily active users (DAU) on the order of 100 million.
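To give a rough sense of what 100 trillion parameters means in storage terms, here is a back-of-the-envelope sketch. It is not part of PERSIA. The float32 parameter size and the 512 GiB per-node memory are assumptions made only for illustration, and the helper names are hypothetical. Optimizer state and indexing overhead are ignored.

```python
def parameter_storage_bytes(num_params: int, bytes_per_param: int = 4) -> int:
    """Raw storage for the parameters alone (assumes float32 by default)."""
    return num_params * bytes_per_param


def min_nodes(total_bytes: int, bytes_per_node: int) -> int:
    """Smallest node count whose combined memory holds total_bytes (ceiling division)."""
    return -(-total_bytes // bytes_per_node)


NUM_PARAMS = 100 * 10**12   # 100 trillion, the figure quoted above
NODE_MEMORY = 512 * 2**30   # hypothetical: 512 GiB of RAM per node

total = parameter_storage_bytes(NUM_PARAMS)
print(f"{total / 10**12:.0f} TB of raw float32 parameters")
print(f"at least {min_nodes(total, NODE_MEMORY)} nodes at 512 GiB each")
```

Under these assumptions the parameters alone occupy about 400 TB, far more than any single machine holds, which is why a model of this size has to be spread across many nodes.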

Disclaimer: The program is usable and has served several important businesses. However, the official English documentation and tutorials are still under heavy construction (there are some materials on the tutorials website, but they are pretty raw). We encourage adventurers to try out PERSIA and contribute!

News

AI Engines in the "Short-video" Era: Eating 100 Trillion Parameters, Invited talk, Facebook, 2021.
单机训练速度提升 640 倍！独家解读快手商业广告模型 GPU 训练平台 PERSIA (In Chinese. Title: 640x Faster GPU Based Learning System for Ad Recommendation)
- [AI Front] [China Daily] [InfoQ] [CSDN] [Tencent Cloud News] [AcFun]
创新、平衡与大格局：快手商业化的慢与快 (In Chinese. Title: Innovation, Balance, and Big Picture: The Speed of Kwai Commercialization)
- [TechSir] [China Daily] [Sohu]

Discussion

Feel free to join our Telegram Group for discussion!

References

[1] Xiangru Lian, Binhang Yuan, Xuefeng Zhu, Yulong Wang, Yongjun He, Honghuan Wu, Lei Sun, Haodong Lyu, Chengjun Liu, Xing Dong, Yiqiao Liao, Mingnan Luo, Congfei Zhang, Jingru Xie, Haonan Li, Lei Chen, Renjie Huang, Jianying Lin, Chengchun Shu, Xuezhong Qiu, Zhishan Liu, Dongying Kong, Lei Yuan, Hai Yu, Sen Yang, Ce Zhang, & Ji Liu. (2021). Persia: A Hybrid System Scaling Deep Learning Based Recommenders up to 100 Trillion Parameters.
[2] Ji Liu & Ce Zhang. (2021). Distributed Learning Systems with First-order Methods.

License

This source code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree.
