AI Creation Camp: Metaverse Launcher, Reconstructing the Present World. Build your own chatbot with PaddlePaddle and Wechaty

Overview

paddle-wechaty-Zodiac


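If the 12 zodiac signs were dropped into a sci-fi series, what superpowers would they have? Come and claim your own!

  • Many young people today love sci-fi films and series such as the Avengers franchise, full of heroes and people with superpowers who are part of our youth and nostalgia. So if the 12 zodiac signs were transported into a sci-fi story, what superpower would each one have?
  • The bot is built by combining the text matching and chit-chat dialogue models provided by Paddle with Wechaty.
  • Besides getting your own superpower, you can check today's horoscope, and when you are lonely or bored you can chat with a "fortune-telling master".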

Demo

How this project was built

Cloud server

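  • See https://aistudio.baidu.com/aistudio/projectdetail/2279551 for reference.

  • I used an Alibaba Cloud server; any other cloud provider, or any server reachable from the public internet, would also work.

  • Open a terminal on the server and run the following commands (note: make sure the port you choose is open to external traffic, and set WECHATY_TOKEN to your own token).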

$ apt update

$ apt install docker.io

$ docker pull wechaty/wechaty:latest

$ export WECHATY_LOG="verbose"

$ export WECHATY_PUPPET="wechaty-puppet-wechat"

$ export WECHATY_PUPPET_SERVER_PORT="8080"

$ export WECHATY_TOKEN="puppet_padlocal_xxxxxx" # replace with your own token

$ docker run -ti --name wechaty_puppet_service_token_gateway --rm -e WECHATY_LOG -e WECHATY_PUPPET -e WECHATY_TOKEN -e WECHATY_PUPPET_SERVER_PORT -p "$WECHATY_PUPPET_SERVER_PORT:$WECHATY_PUPPET_SERVER_PORT" wechaty/wechaty:latest
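  • Open https://api.chatie.io/v0/hosties/xxxxxx in a browser (replace the trailing xxxxxx with your own token). If the response contains the server's IP address and port number, the service is running.

  • The container prints a lot of output. Find the address after Online QR Code:, open it to display a QR code, and scan it with WeChat to log in. Once your phone shows that desktop WeChat is logged in, you are done.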
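
The endpoint check above can also be scripted. The following is a minimal sketch, not part of the original project: the helper names are hypothetical, and it assumes the endpoint returns a JSON object with `ip` and `port` fields and that an unregistered token yields the placeholder address `0.0.0.0`. Verify both assumptions against the actual response.

```python
import json

# URL pattern taken from the step above; {token} is your own token.
HOSTIES_URL = "https://api.chatie.io/v0/hosties/{token}"


def hosties_url(token: str) -> str:
    """Build the lookup URL for a given token."""
    return HOSTIES_URL.format(token=token)


def parse_hostie(body: str):
    """Return (ip, port) if the response describes a reachable gateway, else None.

    Assumption: the body is JSON with "ip" and "port" keys, and "0.0.0.0"
    means that no gateway has registered under this token.
    """
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    ip, port = data.get("ip"), data.get("port")
    if not ip or ip == "0.0.0.0" or port is None:
        return None
    try:
        return ip, int(port)
    except (TypeError, ValueError):
        return None
```

To check a live token, fetch `hosties_url(token)` with `urllib.request.urlopen` and pass the decoded body to `parse_hostie`.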

Environment setup

!pip install -U paddlepaddle -i https://mirror.baidu.com/pypi/simple
!python -m pip install --upgrade paddlenlp -i https://pypi.org/simple
!pip install --upgrade pip
!pip install --upgrade sentencepiece 
!pip install wechaty
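
After installation, a quick import check can confirm that the environment is usable. This is a sketch using only the standard library; the helper name is hypothetical, and the import names listed (`paddle` for paddlepaddle, plus `paddlenlp`, `sentencepiece`, and `wechaty`) are assumed to be the module names these packages install.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names assumed for the packages installed above.
    required = ["paddle", "paddlenlp", "sentencepiece", "wechaty"]
    missing = missing_modules(required)
    print("missing:", missing if missing else "none")
```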

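Text matching

  • Semantic text matching is one of the most fundamental NLP tasks: put simply, it measures how semantically similar two pieces of text are. It has many applications, including search engines, question answering, knowledge retrieval, and feed recommendation.

  • Why use it here? If we inferred the user's intent from keyword matching alone, the bot could misread the request. For example, if the user types "我可讨厌星座了" ("I really hate zodiac signs"), a keyword-based bot would probably still show the zodiac superpower. Conversely, if we required an exact match on the keyword "星座" ("zodiac sign"), a single extra character or punctuation mark would break the feature, which is unfriendly. This project therefore uses text matching to judge whether the user actually wants the zodiac superpower feature.

  • The project is based on PaddleNLP and uses ERNIE 1.0, Baidu's open-source pretrained model, to build a semantic matching model that judges whether two texts have the same meaning.

  • Training a model from scratch involves data loading, data preprocessing, model construction, training, and evaluation; see https://aistudio.baidu.com/aistudio/projectdetail/1972174 for details. Here we simply use an already trained semantic matching model.

  • Download the trained semantic matching model and extract it: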

! wget https://paddlenlp.bj.bcebos.com/models/text_matching/pointwise_matching_model.tar
! tar -xvf pointwise_matching_model.tar
  • The code is in match.py.
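
The motivation for text matching can be made concrete with a small sketch. It is not the project's match.py: the function names, the reference query, and the 0.5 threshold are illustrative assumptions, and the scorer is passed in as a plain callable standing in for the semantic matching model (assumed here to return a similarity probability between 0 and 1). The two keyword baselines show the failure modes described in this section.

```python
from typing import Callable

KEYWORD = "星座"  # keyword from the example in this section ("zodiac sign")
REFERENCE_QUERY = "星座超能力"  # hypothetical reference sentence for the feature


def substring_match(text: str) -> bool:
    """Loose keyword rule: fires whenever the keyword appears anywhere."""
    return KEYWORD in text


def exact_match(text: str) -> bool:
    """Strict keyword rule: fires only when the input is exactly the keyword."""
    return text == KEYWORD


def semantic_match(text: str, score: Callable[[str, str], float],
                   threshold: float = 0.5) -> bool:
    """Fire when the scorer rates the input as similar to the reference query.

    `score(a, b)` stands in for the matching model; the threshold is an assumption.
    """
    return score(text, REFERENCE_QUERY) >= threshold
```

With these definitions, `substring_match("我可讨厌星座了")` fires even though the user dislikes the topic, and `exact_match("星座。")` fails because of one trailing punctuation mark. A model-backed `score` is meant to avoid both problems.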

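Chit-chat dialogue

  • Human-machine dialogue systems have attracted wide attention from both academia and industry in recent years. An open-domain dialogue system aims to let a machine interact with people fluently and naturally, handling everyday small talk as well as specific tasks.

  • As deep learning advances, chatbots keep getting smarter. We can hand routine question answering to a bot, or chat with one in our spare time; they make life more colorful.

  • This project includes the feature so that people have someone to talk to when they are lonely or bored. He may sometimes talk nonsense, but he will always be there waiting for you.

  • The code is in chat.py.

  • For generation tasks, PaddleNLP provides a generate() function that is built into all of its generative models. It supports the Greedy Search, Beam Search, and Sampling decoding strategies. You only need to specify the strategy and its parameters to run decoding, and you get back the token ids of the generated sequences together with their probability scores.

  • PaddleNLP ships a matching tokenizer for each pretrained model; specifying the model name loads the corresponding tokenizer.

  • PaddleNLP provides Chinese pretrained models such as GPT and UnifiedTransformer, which can be loaded in one step by model name. This project uses a small Chinese GPT pretrained model. For other pretrained models, see the model list.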
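
To illustrate how the decoding strategies differ, here is a toy sketch in plain Python. It does not use PaddleNLP and is not how generate() is implemented: the next-token table is made up and the function names are hypothetical. Greedy search always takes the most probable next token, while sampling draws from the distribution (here restricted to the top-k candidates). Beam search, which keeps several candidate sequences in parallel, is left out for brevity.

```python
import random

# Made-up next-token probabilities, keyed by the previous token.
NEXT = {
    "<s>": {"hello": 0.6, "hi": 0.4},
    "hello": {"there": 0.7, "</s>": 0.3},
    "hi": {"there": 0.5, "</s>": 0.5},
    "there": {"</s>": 1.0},
}


def greedy_decode(max_len: int = 10):
    """Pick the highest-probability token at every step."""
    tokens, prev = [], "<s>"
    for _ in range(max_len):
        prev = max(NEXT[prev], key=NEXT[prev].get)
        if prev == "</s>":
            break
        tokens.append(prev)
    return tokens


def sample_decode(top_k: int = 2, seed=None, max_len: int = 10):
    """Draw each token at random from the top_k most probable candidates."""
    rng = random.Random(seed)
    tokens, prev = [], "<s>"
    for _ in range(max_len):
        ranked = sorted(NEXT[prev].items(), key=lambda kv: kv[1], reverse=True)
        candidates = ranked[:top_k]
        words = [word for word, _ in candidates]
        weights = [prob for _, prob in candidates]
        prev = rng.choices(words, weights=weights, k=1)[0]
        if prev == "</s>":
            break
        tokens.append(prev)
    return tokens
```

Greedy decoding is deterministic, which is one reason a chit-chat bot can sound repetitive; sampling trades some coherence for variety.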

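Main program (main.py)

  • Today's horoscope requires an API key that you apply for yourself on the 星座运势 (horoscope) API page. Put the APIKEY you receive into values['key'].
  • When you run the main program, remember that the cloud server must be running too. Only then can your WeChat account turn into the fortune-telling master; otherwise you can only try it locally.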
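
How the three features fit together can be sketched as a small dispatcher. This is an illustrative outline, not the project's main.py: the function names, the trigger word for the horoscope, and the `consName` query field are hypothetical; only `values['key']` comes from the description above. The API address is deliberately left out, since it depends on the service you apply to.

```python
from urllib.parse import urlencode

API_KEY = "your-api-key"  # placeholder: the APIKEY you applied for


def horoscope_query(sign: str, api_key: str = API_KEY) -> str:
    """Build the query string for the horoscope request."""
    values = {"consName": sign}  # hypothetical field name for the zodiac sign
    values["key"] = api_key      # the key slot named in the text above
    return urlencode(values)


def route(text: str, wants_superpower) -> str:
    """Decide which feature should answer a message.

    `wants_superpower` is a callable standing in for the text matching check.
    """
    if wants_superpower(text):
        return "superpower"
    if "运势" in text:  # hypothetical trigger word ("fortune")
        return "horoscope"
    return "chat"
```

Falling back to the chit-chat model for everything else keeps the bot responsive even when no feature matches.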

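Afterword

Possible improvements

  • The chit-chat model is not strong enough: it sometimes talks nonsense or leans toward text continuation rather than dialogue. It can even produce inappropriate language, which is probably due to the training data, since the pretraining corpus was not cleaned. A future option is to fine-tune on a clean dialogue corpus that better fits this project.
  • The way the dialogue semantic matching is invoked is too simplistic and should be improved.
  • There are not enough features yet; the excellent resources Paddle provides could power many more fun ones.

Let's learn from each other

References

Finally, if you like this project, please remember to fork it and give it a like. Thank you!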
