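Springer Opens Licensed E-books to the Public: Links and a Bundled Download for 65 Books on Mathematics, Programming, Data Mining, Data Science, Data Analysis, Machine Learning, Deep Learning, and Artificial Intelligence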

AINLP

Springer is a world-renowned publisher of scientific and technical journals and books. During the current pandemic it has made a batch of licensed e-books in the social sciences, humanities, natural sciences, and other fields freely available to the public (reportedly more than 400 titles). An author on Towards Data Science compiled download links for 65 of the free textbooks related to machine learning, data science, and statistics. I tried it: no registration is needed and the PDFs can be downloaded directly, which is quite convenient:

The books on this list are excellent. They cover mathematics (multivariate calculus and geometry, computational geometry, partial differential equations, algebra, linear algebra, linear programming, probability and statistics, statistics, statistical learning, mathematical modeling, and more), programming (data structures and algorithms, Python programming, the R language, foundations of programming languages, object-oriented analysis and design, databases, and more), and data mining, data analysis, data science, machine learning, artificial intelligence, deep learning, computer vision, and robotics. There is even a book on learning LaTeX. The selection is far richer than I expected.

The first book on this list (see the list below) is a classic. On the book's page, click "Download PDF" to download that book's electronic edition on its own:

A Reddit user has shared a Google Drive link with a bundle of the PDFs for 64 of these books, which can be downloaded directly:

If that is still inconvenient, follow the AINLP WeChat official account and reply "sprg" to get a Baidu Netdisk link:

The book list and links follow, for anyone who is interested:

The Elements of Statistical Learning

Trevor Hastie, Robert Tibshirani, Jerome Friedman

Introductory Time Series with R

Paul S.P. Cowpertwait, Andrew V. Metcalfe

A Beginner’s Guide to R

Alain Zuur, Elena N. Ieno, Erik Meesters

Introduction to Evolutionary Computing

A.E. Eiben, J.E. Smith

Data Analysis

Siegmund Brandt

Linear and Nonlinear Programming

David G. Luenberger, Yinyu Ye

Introduction to Partial Differential Equations

David Borthwick

Fundamentals of Robotic Mechanical Systems

Jorge Angeles

Data Structures and Algorithms with Python

Kent D. Lee, Steve Hubbard

Introduction to Partial Differential Equations

Peter J. Olver

Methods of Mathematical Modelling

Thomas Witelski, Mark Bowen

LaTeX in 24 Hours

Dilip Datta

Introduction to Statistics and Data Analysis

Christian Heumann, Michael Schomaker, Shalabh

Principles of Data Mining

Max Bramer

Computer Vision

Richard Szeliski

Data Mining

Charu C. Aggarwal

Computational Geometry

Mark de Berg, Otfried Cheong, Marc van Kreveld, Mark Overmars

Robotics, Vision and Control

Peter Corke

Statistical Analysis and Data Display

Richard M. Heiberger, Burt Holland

Statistics and Data Analysis for Financial Engineering

David Ruppert, David S. Matteson

Stochastic Processes and Calculus

Uwe Hassler

Statistical Analysis of Clinical Data on a Pocket Calculator

Ton J. Cleophas, Aeilko H. Zwinderman

Clinical Data Analysis on a Pocket Calculator

Ton J. Cleophas, Aeilko H. Zwinderman

The Data Science Design Manual

Steven S. Skiena

An Introduction to Machine Learning

Miroslav Kubat

Guide to Discrete Mathematics

Gerard O’Regan

Introduction to Time Series and Forecasting

Peter J. Brockwell, Richard A. Davis

Multivariate Calculus and Geometry

Seán Dineen

Statistics and Analysis of Scientific Data

Massimiliano Bonamente

Modelling Computing Systems

Faron Moller, Georg Struth

Search Methodologies

Edmund K. Burke, Graham Kendall

Linear Algebra Done Right

Sheldon Axler

Linear Algebra

Jörg Liesen, Volker Mehrmann

Algebra

Serge Lang

Understanding Analysis

Stephen Abbott

Linear Programming

Robert J Vanderbei

Understanding Statistics Using R

Randall Schumacker, Sara Tomek

An Introduction to Statistical Learning

Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani

Statistical Learning from a Regression Perspective

Richard A. Berk

Applied Partial Differential Equations

J. David Logan

Robotics

Bruno Siciliano, Lorenzo Sciavicco, Luigi Villani, Giuseppe Oriolo

Regression Modeling Strategies

Frank E. Harrell, Jr.

A Modern Introduction to Probability and Statistics

F.M. Dekking, C. Kraaikamp, H.P. Lopuhaä, L.E. Meester

The Python Workbook

Ben Stephenson

Machine Learning in Medicine - a Complete Overview

Ton J. Cleophas, Aeilko H. Zwinderman

Object-Oriented Analysis, Design and Implementation

Brahma Dathan, Sarnath Ramnath

Introduction to Data Science

Laura Igual, Santi Seguí

Applied Predictive Modeling

Max Kuhn, Kjell Johnson

Python For ArcGIS

Laura Tateosian

Concise Guide to Databases

Peter Lake, Paul Crowther

Digital Image Processing

Wilhelm Burger, Mark J. Burge

Bayesian Essentials with R

Jean-Michel Marin, Christian P. Robert

Robotics, Vision and Control

Peter Corke

Foundations of Programming Languages

Kent D. Lee

Introduction to Artificial Intelligence

Wolfgang Ertel

Introduction to Deep Learning

Sandro Skansi

Linear Algebra and Analytic Geometry for Physical Sciences

Giovanni Landi, Alessandro Zampini

Applied Linear Algebra

Peter J. Olver, Chehrzad Shakiban

Neural Networks and Deep Learning

Charu C. Aggarwal

Data Science and Predictive Analytics

Ivo D. Dinov

Analysis for Computer Scientists

Michael Oberguggenberger, Alexander Ostermann

Excel Data Analysis

Hector Guerrero

A Beginners Guide to Python 3 Programming

John Hunt

Advanced Guide to Python 3 Programming

John Hunt

If you are interested, follow the official account below and reply "sprg" to get the network-drive link for the bundled download:

❤️Emotional First Aid Dataset: a Psychological Counseling Q&A Corpus

AINLP

A psychological counseling Q&A corpus, for research use only.

http://github.com/chatopera/efaqa-corpus-zh

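Why release this corpus

Applying artificial intelligence to psychological counseling is, we believe, a very meaningful exploration. We would like to work with more people to bring today's leading AI technology into different counseling scenarios. To those who knock, the door will be opened; may everyone have a counselor of their own.

The psychological counseling Q&A corpus (hereafter also "the dataset" or "the corpus") was built to apply AI techniques to psychological counseling. To our knowledge, it is the first open QA corpus in the counseling domain. It contains 20,000 counseling records and is also the largest publicly available Chinese counseling dialogue corpus. The dataset is rich: it includes multi-turn dialogues as well as classification information. Building it took considerable time and effort; for example, annotation was done over multi-turn dialogues and took an average of 1 minute per record.

Psychology professionals, including clinical psychologists from Stanford University, UCLA, and Fu Jen Catholic University in Taiwan, took part in building the dataset, which was completed by Chatopera together with many volunteers.

The data file is gzip-compressed and UTF-8 encoded, with one record per line; each record is a JSON string in the following format: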

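Field | Description | Type
md5 | Unique identifier | string
title | Title | string
description | Description | string
owner | Poster (de-identified) | string
label | Topic labels | Object
label.s1 | Type of trouble | string
label.s2 | Mental disorder | string
label.s3 | SOS | string
chats | Chat data | Array
chats[].sender | Sender | string
chats[].type | Message type | string
chats[].time | Time sent | string
chats[].value | Message content | string
chats[].label | Chat labels | Object
chats[].label.knowledge | Carries knowledge | boolean
chats[].label.question | Follow-up question | boolean
chats[].label.negative | Negative reply | boolean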

Data example

{
  "md5": "2f63d374c071043d9e1968aefa62ffb7",
  "owner": "匿名",
  "title": "女 听过别人最多的议论就是干啥啥不行不长心眼没有脑子",
  "label": {
    "s1": "1.13",
    "s2": "2.7",
    "s3": "3.4"
  },
  "chats": [
    {
      "time": "11:02:45",
      "value": "这样的议论是针对谁呢?",
      "sender": "audience",
      "type": "textMessage",
      "label": { "question": true, "knowledge": false, "negative": false }
    },
    {
      "time": "11:08:38",
      "sender": "audience",
      "type": "textMessage",
      "value": "欢迎你来找我玩❤",
      "label": { "question": false, "knowledge": false, "negative": false }
    },
    {
      "time": "11:15:17",
      "sender": "owner",
      "type": "textMessage",
      "value": "好惨"
    }
  ]
}
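
As a minimal sketch of how a record like the one above might be consumed, the following parses a JSON string with the same field names and groups the chat messages by sender. The abbreviated record and the `split_by_sender` helper are made up for illustration; they are not part of the dataset's own tooling.

```python
import json

# Abbreviated record using the same field names as the example above
# (values are shortened; this is illustrative input, not a full corpus entry).
RAW = '''
{
  "md5": "2f63d374c071043d9e1968aefa62ffb7",
  "owner": "匿名",
  "title": "...",
  "label": {"s1": "1.13", "s2": "2.7", "s3": "3.4"},
  "chats": [
    {"time": "11:02:45", "value": "这样的议论是针对谁呢?", "sender": "audience",
     "type": "textMessage",
     "label": {"question": true, "knowledge": false, "negative": false}},
    {"time": "11:15:17", "sender": "owner", "type": "textMessage", "value": "好惨"}
  ]
}
'''


def split_by_sender(record):
    """Group chat messages by their "sender" value ("owner" or "audience")."""
    grouped = {}
    for chat in record.get("chats", []):
        grouped.setdefault(chat["sender"], []).append(chat)
    return grouped


record = json.loads(RAW)
by_sender = split_by_sender(record)

# Owner messages carry no "label" object in the example, hence the .get() calls.
follow_ups = [c["value"] for c in by_sender.get("audience", [])
              if c.get("label", {}).get("question")]
print(record["label"]["s1"], len(by_sender["owner"]), len(follow_ups))
```

Using `.get()` for the per-message label keeps the loop from failing on messages that were not annotated.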

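Topic labels

In a record, title and description are the initial information provided by the person seeking help. The topic labels classify the problem based on these two fields, along three dimensions: S1, type of trouble; S2, mental disorder; S3, SOS. Here S stands for severity, and the three dimensions reflect increasing severity of the psychological problem. It should be stressed that some of these items require clinical diagnosis, and every concept used in the dataset means "suspected". For example, when we label a topic as depression, we mean suspected depression. This caveat does not mean our work was careless; it reflects how hard a rigorous diagnosis is and our wish to be precise.

label records the ID of the subcategory in each dimension. The IDs are designed as follows.

S1: type of trouble

ID | Category | Notes
1.1 | Academic Concerns (including confusion about future plans) | Learning disabilities, struggling with studies, poor grades, trouble concentrating, lack of interest in subjects, and the like.
1.2 | Career and Workplace Issues | Problems at work such as interpersonal conflict, communication problems, rumors, workplace harassment, discrimination, lack of motivation, low job satisfaction, and poor performance.
1.3 | Family Issues and Conflict | Domestic violence, disputes over money or inheritance, family discord, conflict between mother-in-law and daughter-in-law, care of elderly parents by their children, conflict between step-parents and step-children, and divorced parents' care of their children.
1.4 | Substance Abuse and Addiction | In adults: heavy drinking, smoking, medication misuse, drug use, gambling, and any addictive behavior that affects quality of life.
1.5 | Grief | Deep sorrow caused by the loss of a family member or friend.
1.6 | Insomnia | A sleep disorder in which being unable to fall asleep or stay asleep affects performance the next day.
1.7 | Stress | A feeling of emotional or physical tension. It can come from any event or thought that makes you feel frustrated, angry, or nervous.
1.8 | Interpersonal Relationship | Tension and conflict in relationships outside work, school, and family.
1.9 | Relationship Issues | Early romance, secret crushes, long-distance relationships, infidelity, arguments, getting back together, LGBT groups.
1.10 | Divorce | Emotional issues and issues concerning children after divorce.
1.11 | Break Up | Pain after a breakup.
1.12 | Self-Awareness | For example, zodiac signs, personality, interests.
1.13 | Low Self-Esteem | Signs of low self-esteem. Self-esteem is a person's subjective evaluation of their own worth; it includes beliefs about oneself and emotional states such as triumph, despair, pride, and shame.
1.14 | Adolescent Problem | Problems adolescents face in physical and mental development, such as rebelliousness, harming others, pregnancy, substance abuse, and juvenile delinquency.
1.15 | OCD | People with obsessive-compulsive disorder are caught in meaningless, distressing, repetitive thoughts and behaviors that they keep dwelling on but cannot shake off.
1.16 | Others | Other troubles that do not devastate daily life or study but still cause psychological discomfort.
1.17 | LGBT | Gay, lesbian, bisexual, and transgender.
1.18 | Sex | For adolescents, inadequate sex education causes various social problems; for adults, sexual anxiety and sex addiction can develop into physical illness.
1.19 | Parent-Child Relationship | The parent-child relationship influences many aspects of a child's development from infancy onward, such as personality, perseverance, and social skills.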

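S2: mental disorder

The psychological problem already affects work; the person needs rest and adjustment or medical attention.

ID | Category | Notes
2.1 | Depression | A depressed mood that persists for a long time and clearly exceeds what the situation warrants: lack of self-confidence, avoiding people, even feelings of guilt, a marked drop in physical energy, a slowed sense of time, and inability to enjoy any normally interesting activity.
2.2 | Anxiety | Anxiety that persists for a long time: tension and worry without a clear objective cause, and restlessness, with symptoms such as palpitations, trembling hands, sweating, frequent urination, and difficulty concentrating.
2.3 | Bipolar Disorder | Also called bipolar affective disorder. Manic phase: feeling vigorous and full of energy, with elevated mood or irritability; possibly overconfidence, extravagant behavior or dress, very little sleep, and increased talkativeness.
2.4 | PTSD | Requires first having experienced trauma, such as physical or psychological abuse in childhood. Contact with related things causes mental or physical distress and tension, and the traumatic scene replays in the mind again and again.
2.5 | Panic Disorder | Also called acute anxiety disorder: recurrent panic attacks. A panic attack is a sudden, short, intense fear (a sense of impending death), with palpitations, sweating, trembling hands, difficulty breathing, and numbness.
2.6 | Eating Disorder (anorexia and bulimia) | Anorexia: eating too little, leading to low body weight. Bulimia: eating large amounts and then finding ways to vomit. Both involve an extreme pursuit of thinness, dissatisfaction with one's body, and extreme perfectionism in life and study.
2.7 | Unrelated (has not reached S2) | Not yet serious enough to count as a mental disorder.
2.8 | Others | Life and work are already seriously affected, or even impossible, but it cannot be determined which disorder is involved.

Note: some clinically more severe mental disorders, such as multiple personality, are harder to judge because of their complexity, so the dataset does not annotate them for now.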

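S3: SOS

An emergency that requires immediate human intervention.

ID | Category | Notes
3.1 | Suicide Action (suicide in progress) | N/A
3.2 | Suicide Ideation (planned suicide) | N/A
3.3 | Self-harm | N/A
3.4 | Ongoing harm to others | Currently harming another person
3.5 | Planned harm to others | Planning to harm another person
3.6 | No tendency toward physical harm | N/A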
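
Because every ID starts with the number of its dimension, the dimension can be read off the ID itself. The sketch below shows one way to turn a record's label object into readable triples; the helper names and the small `NAMES` lookup (only a few IDs copied from the tables above) are illustrative assumptions, not part of the dataset package.

```python
# A few entries copied from the ID tables above; extend as needed.
NAMES = {
    "1.13": "Low self-esteem",
    "2.1": "Depression",
    "2.7": "Unrelated",
    "3.3": "Self-harm",
}


def dimension(label_id):
    """Return "S1", "S2" or "S3" for an ID such as "1.13"."""
    major, _, minor = label_id.partition(".")
    if major not in {"1", "2", "3"} or not minor.isdigit():
        raise ValueError(f"unexpected label ID: {label_id!r}")
    return "S" + major


def describe(label):
    """Turn a record's label object into (dimension, id, name) triples."""
    return [(dimension(value), value, NAMES.get(value, "unknown"))
            for _, value in sorted(label.items())]


for row in describe({"s1": "1.13", "s2": "2.7", "s3": "3.4"}):
    print(row)
```

IDs missing from the lookup are reported as "unknown" rather than raising, so a partial table is enough for a first look at the data.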

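Chat labels

Label | Meaning
question | Whether the reply is a follow-up question; follow-up questions encourage the person to open up more
knowledge | Whether the reply carries knowledge; knowledgeable content helps guide and reassure the person
negative | A negative reply, one that has a negative effect on the person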
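
One simple use of the chat labels is to count, for a conversation, how many replies are follow-up questions, carry knowledge, or are negative. A minimal sketch, assuming messages shaped like the "chats" array in the data example (the sample list below is invented):

```python
CHAT_LABELS = ("question", "knowledge", "negative")


def count_chat_labels(chats):
    """Count how many messages have each chat label set to true."""
    counts = {name: 0 for name in CHAT_LABELS}
    for chat in chats:
        label = chat.get("label") or {}  # some messages have no label object
        for name in CHAT_LABELS:
            if label.get(name):
                counts[name] += 1
    return counts


# Invented sample in the same shape as the "chats" array.
sample_chats = [
    {"sender": "audience",
     "label": {"question": True, "knowledge": False, "negative": False}},
    {"sender": "audience",
     "label": {"question": False, "knowledge": True, "negative": False}},
    {"sender": "owner"},
]

print(count_chat_labels(sample_chats))
```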

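Project background

To help people make better use of the dataset, we made a video that explains the project background, the annotation design, and the annotation process.

Psychological Counseling AI Assistant | 派特心理

Installation and usage

To make it easy to use, the dataset is published as a package that can be downloaded and installed with pip.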

pip install efaqa-corpus-zh

Demo code

import efaqa_corpus_zh
l = list(efaqa_corpus_zh.load())
print("size: %s" % len(l))
print(l[0]["title"])

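The first call to the load interface downloads the data, which is hosted on GitHub, so make sure your network can reach it. Download speed depends on network quality; the compressed dataset is currently about 8 MB.

If you use another programming language, download the data file directly, decompress it with a gzip tool to get a text file, and then read it line by line.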
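
The same read-line-by-line procedure can be sketched in Python without the package, which may also serve as a template when porting to another language. The file name below is a stand-in written by the sketch itself, since the real data file's location is not given here; `read_records` is a hypothetical helper.

```python
import gzip
import json
import os
import tempfile


def read_records(path):
    """Yield one dict per line from a gzip-compressed, UTF-8 JSON-lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines, if any
                yield json.loads(line)


# Build a tiny stand-in file so the sketch runs without the real data file.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.utf8.gz")
with gzip.open(sample_path, "wt", encoding="utf-8") as handle:
    handle.write(json.dumps({"title": "sample", "chats": []}) + "\n")
    handle.write("\n")  # a stray blank line, to show that it is skipped

for record in read_records(sample_path):
    print(record["title"], len(record["chats"]))
```

Opening the file in text mode with `gzip.open` avoids a separate decompression step; a generator keeps memory use flat even for the full corpus.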

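As a counseling platform or a developer of mental health services, how can you get an intelligent Q&A service? If you would rather not start from scratch, is there a mature solution? We call ours the Psychology Q&A API. As our annotated data accumulates, we are also improving the dialogue service built on it; with the SDK, a few lines of code are enough to connect to the Psychology Q&A API.

The Psychology Q&A API covers single-turn and multi-turn dialogue; see the reference for details.

Media coverage

Statement

This dataset was produced by cleaning, de-identifying, and annotating online psychological counseling data. The dataset's code is released under the GPL 3.0 license. The data is for research use only; any content published in media, journals, magazines, blogs, and so on that uses it must cite the dataset and give its address. Unauthorized commercial use will be pursued for copyright infringement.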

@online{efaqa-corpus-zh:petpsychology,
  author = {Hai Liang Wang and Zhi Zhi Wu and Jia Yuan Lang},
  title = {派特心理:心理咨询问答语料库},
  year = 2020,
  url = {http://github.com/chatopera/efaqa-corpus-zh},
  urldate = {2020-04-22}
}

The corpus is annotated subjectively. Given the seriousness and importance of psychological counseling, we did our best to ensure the data's accuracy when building the corpus, but we cannot guarantee that it is 100% accurate. The team accepts no legal liability for any consequences arising from inappropriate data content.

Emotional First Aid Dataset, Chatopera Inc., , Apr. 22nd, 2020

GPL 3.0 License

Emotional First Aid Dataset, only for research. Copyright (C) 2020 Beijing Huaxia Chunsong Technology Co., Ltd.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see .

 

 

 

Trying Out Stanza, the Stanford NLP Group's Python Deep Learning Toolkit for Natural Language Processing

AINLP

As is well known, the Stanford NLP Group has released a series of NLP toolkits, but most of them are written in Java, which is not very friendly to Python users. A few years ago I wrote a simple Chinese word segmentation interface on top of the Stanford Java toolkit and NLTK ("Python NLP in Practice: Using the Stanford Chinese Word Segmenter in NLTK"), but it was not very convenient to use either. In the era of deep learning NLP, the Stanford NLP Group has developed a pure-Python deep learning NLP toolkit, Stanza. Stanza v1.0.0 was officially released a short while ago, which counts as a milestone:

Stanza is an NLP toolkit implemented in pure Python, unlike CoreNLP and the other Java toolkits that the Stanford NLP Group has maintained until now, so it is easier for Python users to call. Stanza also provides a Python interface for calling CoreNLP; NLP functions not implemented in Stanza can be reached through this interface, with CoreNLP as a supplement. Stanza's deep learning modules are built on PyTorch, and users can train, evaluate, and use more accurate neural models built from their own annotated data; with a GPU, of course, everything runs faster. Stanza currently supports text analysis in 66 languages, including sentence segmentation, tokenization (or word segmentation), part-of-speech tagging and morphological feature analysis, dependency parsing, and named entity recognition.

To summarize, Stanza features:

Native Python implementation requiring minimal efforts to set up;
Full neural network pipeline for robust text analytics, including tokenization, multi-word token (MWT) expansion, lemmatization, part-of-speech (POS) and morphological features tagging, dependency parsing, and named entity recognition;
Pretrained neural models supporting 66 (human) languages;
A stable, officially maintained Python interface to CoreNLP.

I tried Stanza and found it quite convenient; the official documentation is clear and can be consulted directly. Below I briefly record installing and using the Chinese and English modules, on Ubuntu 16.04 with Python 3.6.8. Note that Stanza requires Python 3.6 or later; on an older version, the stanza that pip install stanza installs is not the Stanford NLP Group's Stanza.

There are several ways to install Stanza. Here I installed Stanza and its dependencies with pip install stanza inside a virtualenv environment; see Stanza's installation documentation for details:

Once installed, you can try it out. Before using the toolkit for a given language, though, you first need to download the corresponding packaged models. You will be prompted to do so on first use, and no further download is needed afterwards. Let's get started, taking English as an example:

In [1]: import stanza                                                                             
 
# The English model package has already been downloaded here, so it can be used directly; otherwise the first use would include a download step
In [2]: stanza.download('en')
Downloading http://raw.githubusercontent.com/stanfordnlp/stanza-resources/master/resources_1.0.0.json: 116kB [00:00, 154kB/s]
2020-04-11 23:13:14 INFO: Downloading default packages for language: en (English)...
2020-04-11 23:13:15 INFO: File exists: /home/textminer/stanza_resources/en/default.zip.
2020-04-11 23:13:19 INFO: Finished downloading models and saved to /home/textminer/stanza_resources.
 
# Pipeline is an important concept in Stanza
In [3]: en_nlp = stanza.Pipeline('en')                                                            
2020-04-11 23:14:27 INFO: Loading these models for language: en (English):
=========================
| Processor | Package   |
-------------------------
| tokenize  | ewt       |
| pos       | ewt       |
| lemma     | ewt       |
| depparse  | ewt       |
| ner       | ontonotes |
=========================
 
2020-04-11 23:14:28 INFO: Use device: gpu
2020-04-11 23:14:28 INFO: Loading: tokenize
2020-04-11 23:14:30 INFO: Loading: pos
2020-04-11 23:14:30 INFO: Loading: lemma
2020-04-11 23:14:30 INFO: Loading: depparse
2020-04-11 23:14:31 INFO: Loading: ner
2020-04-11 23:14:32 INFO: Done loading processors!
 
In [5]: doc = en_nlp("Barack Obama was born in Hawaii.")                                          
 
In [6]: print(doc)                                                                                
[
  [
    {
      "id": "1",
      "text": "Barack",
      "lemma": "Barack",
      "upos": "PROPN",
      "xpos": "NNP",
      "feats": "Number=Sing",
      "head": 4,
      "deprel": "nsubj:pass",
      "misc": "start_char=0|end_char=6"
    },
    {
      "id": "2",
      "text": "Obama",
      "lemma": "Obama",
      "upos": "PROPN",
      "xpos": "NNP",
      "feats": "Number=Sing",
      "head": 1,
      "deprel": "flat",
      "misc": "start_char=7|end_char=12"
    },
    {
      "id": "3",
      "text": "was",
      "lemma": "be",
      "upos": "AUX",
      "xpos": "VBD",
      "feats": "Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin",
      "head": 4,
      "deprel": "aux:pass",
      "misc": "start_char=13|end_char=16"
    },
    {
      "id": "4",
      "text": "born",
      "lemma": "bear",
      "upos": "VERB",
      "xpos": "VBN",
      "feats": "Tense=Past|VerbForm=Part|Voice=Pass",
      "head": 0,
      "deprel": "root",
      "misc": "start_char=17|end_char=21"
    },
    {
      "id": "5",
      "text": "in",
      "lemma": "in",
      "upos": "ADP",
      "xpos": "IN",
      "head": 6,
      "deprel": "case",
      "misc": "start_char=22|end_char=24"
    },
    {
      "id": "6",
      "text": "Hawaii",
      "lemma": "Hawaii",
      "upos": "PROPN",
      "xpos": "NNP",
      "feats": "Number=Sing",
      "head": 4,
      "deprel": "obl",
      "misc": "start_char=25|end_char=31"
    },
    {
      "id": "7",
      "text": ".",
      "lemma": ".",
      "upos": "PUNCT",
      "xpos": ".",
      "head": 4,
      "deprel": "punct",
      "misc": "start_char=31|end_char=32"
    }
  ]
]
 
In [7]: print(doc.entities)                                                                       
[{
  "text": "Barack Obama",
  "type": "PERSON",
  "start_char": 0,
  "end_char": 12
}, {
  "text": "Hawaii",
  "type": "GPE",
  "start_char": 25,
  "end_char": 31
}]
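
The printed document is a list of sentences, each a list of token dictionaries, so the dependency tree can be read directly from the "head" and "deprel" fields. The sketch below works on a literal copied (abridged) from the output above rather than calling Stanza, so it runs without any models; `dependency_arcs` is a hypothetical helper, and the assumption that head 0 marks the root follows the "root" entry shown for "born".

```python
# Token dicts copied (abridged) from the printed output above.
sentence = [
    {"id": "1", "text": "Barack", "head": 4, "deprel": "nsubj:pass"},
    {"id": "2", "text": "Obama", "head": 1, "deprel": "flat"},
    {"id": "3", "text": "was", "head": 4, "deprel": "aux:pass"},
    {"id": "4", "text": "born", "head": 0, "deprel": "root"},
    {"id": "5", "text": "in", "head": 6, "deprel": "case"},
    {"id": "6", "text": "Hawaii", "head": 4, "deprel": "obl"},
    {"id": "7", "text": ".", "head": 4, "deprel": "punct"},
]


def dependency_arcs(tokens):
    """Return (dependent, relation, head) triples; head index 0 means ROOT."""
    # Token ids are 1-based and printed as strings, so normalize them to int.
    by_id = {int(tok["id"]): tok["text"] for tok in tokens}
    arcs = []
    for tok in tokens:
        head_text = "ROOT" if tok["head"] == 0 else by_id[tok["head"]]
        arcs.append((tok["text"], tok["deprel"], head_text))
    return arcs


for dependent, relation, head in dependency_arcs(sentence):
    print(f"{dependent} --{relation}--> {head}")
```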

Pipeline is an important concept in Stanza:

A Pipeline can preload models for different languages, select which processors to run, and choose whether to use the GPU. Now let's try the Chinese model:

In [9]: import stanza                                                                             
 
# Try the Chinese model (it has already been downloaded here, so the download step is skipped)
In [10]: zh_nlp = stanza.Pipeline('zh')                                                           
2020-04-12 11:32:47 INFO: "zh" is an alias for "zh-hans"
2020-04-12 11:32:47 INFO: Loading these models for language: zh-hans (Simplified_Chinese):
=========================
| Processor | Package   |
-------------------------
| tokenize  | gsdsimp   |
| pos       | gsdsimp   |
| lemma     | gsdsimp   |
| depparse  | gsdsimp   |
| ner       | ontonotes |
=========================
 
2020-04-12 11:32:48 INFO: Use device: gpu
2020-04-12 11:32:48 INFO: Loading: tokenize
2020-04-12 11:32:49 INFO: Loading: pos
2020-04-12 11:32:51 INFO: Loading: lemma
2020-04-12 11:32:51 INFO: Loading: depparse
2020-04-12 11:32:53 INFO: Loading: ner
2020-04-12 11:32:54 INFO: Done loading processors!
 
In [11]: text = """英国首相约翰逊6日晚因病情恶化,被转入重症监护室治疗。英国首相府发言人说,目前约
    ...: 翰逊意识清晰,将他转移到重症监护室只是预防性措施。发言人说,约翰逊被转移到重症监护室前已
    ...: 安排英国外交大臣拉布代表他处理有关事务。"""                                              
 
In [12]: doc = zh_nlp(text)  
 
In [13]: for sent in doc.sentences:
    ...:     print("Sentence:" + sent.text) # sentence segmentation
    ...:     print("Tokenize:" + ' '.join(token.text for token in sent.tokens)) # Chinese word segmentation
    ...:     print("UPOS: " + ' '.join(f'{word.text}/{word.upos}' for word in sent.words)) # part-of-speech tagging (UPOS)
    ...:     print("XPOS: " + ' '.join(f'{word.text}/{word.xpos}' for word in sent.words)) # part-of-speech tagging (XPOS)
    ...:     print("NER: " + ' '.join(f'{ent.text}/{ent.type}' for ent in sent.ents)) # named entity recognition
    ...:                                                                                          
Sentence:英国首相约翰逊6日晚因病情恶化,被转入重症监护室治疗。
Tokenize:英国 首相 约翰逊 6 日 晚因 病情 恶化 , 被 转入 重症 监护 室 治疗 。
UPOS: 英国/PROPN 首相/NOUN 约翰逊/PROPN 6/NUM 日/NOUN 晚因/NOUN 病情/NOUN 恶化/VERB ,/PUNCT 被/VERB 转入/VERB 重症/NOUN 监护/VERB 室/PART 治疗/NOUN 。/PUNCT
XPOS: 英国/NNP 首相/NN 约翰逊/NNP 6/CD 日/NNB 晚因/NN 病情/NN 恶化/VV ,/, 被/BB 转入/VV 重症/NN 监护/VV 室/SFN 治疗/NN 。/.
NER: 英国/GPE 约翰逊/PERSON 6日/DATE
Sentence:英国首相府发言人说,目前约翰逊意识清晰,将他转移到重症监护室只是预防性措施。
Tokenize:英国 首相 府 发言 人 说 , 目前 约翰逊 意识 清晰 , 将 他 转移 到 重症 监护 室 只 是 预防 性 措施 。
UPOS: 英国/PROPN 首相/NOUN 府/PART 发言/VERB 人/PART 说/VERB ,/PUNCT 目前/NOUN 约翰逊/PROPN 意识/NOUN 清晰/ADJ ,/PUNCT 将/ADP 他/PRON 转移/VERB 到/VERB 重症/NOUN 监护/VERB 室/PART 只/ADV 是/AUX 预防/VERB 性/PART 措施/NOUN 。/PUNCT
XPOS: 英国/NNP 首相/NN 府/SFN 发言/VV 人/SFN 说/VV ,/, 目前/NN 约翰逊/NNP 意识/NN 清晰/JJ ,/, 将/BB 他/PRP 转移/VV 到/VV 重症/NN 监护/VV 室/SFN 只/RB 是/VC 预防/VV 性/SFN 措施/NN 。/.
NER: 英国/GPE 约翰逊/PERSON
Sentence:发言人说,约翰逊被转移到重症监护室前已安排英国外交大臣拉布代表他处理有关事务。
Tokenize:发言 人 说 , 约翰逊 被 转移 到 重症 监护 室 前 已 安排 英国 外交 大臣 拉布 代表 他 处理 有关 事务 。
UPOS: 发言/VERB 人/PART 说/VERB ,/PUNCT 约翰逊/PROPN 被/VERB 转移/VERB 到/VERB 重症/NOUN 监护/VERB 室/PART 前/ADP 已/ADV 安排/VERB 英国/PROPN 外交/NOUN 大臣/NOUN 拉布/PROPN 代表/VERB 他/PRON 处理/VERB 有关/ADJ 事务/NOUN 。/PUNCT
XPOS: 发言/VV 人/SFN 说/VV ,/, 约翰逊/NNP 被/BB 转移/VV 到/VV 重症/NN 监护/VV 室/SFN 前/IN 已/RB 安排/VV 英国/NNP 外交/NN 大臣/NN 拉布/NNP 代表/VV 他/PRP 处理/VV 有关/JJ 事务/NN 。/.
NER: 约翰逊/PERSON 英国/GPE 拉布/PERSON

If you do not need named entity recognition, dependency parsing, and so on, you can select just the processors you need when downloading models, when preloading them, or when building the Pipeline. For example, you can select only Chinese word segmentation and part-of-speech tagging, or word segmentation alone. Here "我爱自然语言处理" ("I love natural language processing") serves as the example:

# You can select only the processors you need, which makes the downloaded model package smaller and saves time; the full Chinese models were downloaded earlier, so nothing is downloaded here and this is only a demonstration
In [14]: stanza.download('zh', processors='tokenize,pos')
Downloading http://raw.githubusercontent.com/stanfordnlp/stanza-resources/master/resources_1.0.0.json: 116kB [00:00, 554kB/s]
2020-04-15 07:27:38 INFO: "zh" is an alias for "zh-hans"
2020-04-15 07:27:38 INFO: Downloading these customized packages for language: zh-hans (Simplified_Chinese)...
=======================
| Processor | Package |
-----------------------
| tokenize  | gsdsimp |
| pos       | gsdsimp |
| pretrain  | gsdsimp |
=======================
 
2020-04-15 07:27:38 INFO: File exists: /home/textminer/stanza_resources/zh-hans/tokenize/gsdsimp.pt.
2020-04-15 07:27:38 INFO: File exists: /home/textminer/stanza_resources/zh-hans/pos/gsdsimp.pt.
2020-04-15 07:27:39 INFO: File exists: /home/textminer/stanza_resources/zh-hans/pretrain/gsdsimp.pt.
2020-04-15 07:27:39 INFO: Finished downloading models and saved to /home/textminer/stanza_resources.
 
# Select Chinese word segmentation and part-of-speech tagging when building the Pipeline; other languages work the same way
In [15]: zh_nlp = stanza.Pipeline('zh', processors='tokenize,pos')                                
2020-04-15 07:28:12 INFO: "zh" is an alias for "zh-hans"
2020-04-15 07:28:12 INFO: Loading these models for language: zh-hans (Simplified_Chinese):
=======================
| Processor | Package |
-----------------------
| tokenize  | gsdsimp |
| pos       | gsdsimp |
=======================
 
2020-04-15 07:28:13 INFO: Use device: gpu
2020-04-15 07:28:13 INFO: Loading: tokenize
2020-04-15 07:28:15 INFO: Loading: pos
2020-04-15 07:28:17 INFO: Done loading processors!
 
In [16]: doc = zh_nlp("我爱自然语言处理")                                                         
 
In [17]: print(doc)                                                                               
[
  [
    {
      "id": "1",
      "text": "我",
      "upos": "PRON",
      "xpos": "PRP",
      "feats": "Person=1",
      "misc": "start_char=0|end_char=1"
    },
    {
      "id": "2",
      "text": "爱",
      "upos": "VERB",
      "xpos": "VV",
      "misc": "start_char=1|end_char=2"
    },
    {
      "id": "3",
      "text": "自然",
      "upos": "NOUN",
      "xpos": "NN",
      "misc": "start_char=2|end_char=4"
    },
    {
      "id": "4",
      "text": "语言",
      "upos": "NOUN",
      "xpos": "NN",
      "misc": "start_char=4|end_char=6"
    },
    {
      "id": "5",
      "text": "处理",
      "upos": "VERB",
      "xpos": "VV",
      "misc": "start_char=6|end_char=8"
    }
  ]
]
 
# Use only Stanza's Chinese word segmenter here
In [18]: zh_nlp = stanza.Pipeline('zh', processors='tokenize')                                    
2020-04-15 07:31:27 INFO: "zh" is an alias for "zh-hans"
2020-04-15 07:31:27 INFO: Loading these models for language: zh-hans (Simplified_Chinese):
=======================
| Processor | Package |
-----------------------
| tokenize  | gsdsimp |
=======================
 
2020-04-15 07:31:27 INFO: Use device: gpu
2020-04-15 07:31:27 INFO: Loading: tokenize
2020-04-15 07:31:27 INFO: Done loading processors!
 
In [19]: doc = zh_nlp("我爱自然语言处理")                                                         
 
In [20]: print(doc)                                                                               
[
  [
    {
      "id": "1",
      "text": "我",
      "misc": "start_char=0|end_char=1"
    },
    {
      "id": "2",
      "text": "爱",
      "misc": "start_char=1|end_char=2"
    },
    {
      "id": "3",
      "text": "自然",
      "misc": "start_char=2|end_char=4"
    },
    {
      "id": "4",
      "text": "语言",
      "misc": "start_char=4|end_char=6"
    },
    {
      "id": "5",
      "text": "处理",
      "misc": "start_char=6|end_char=8"
    }
  ]
]
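
With only the tokenizer loaded, each token carries just its text and a "misc" string holding character offsets. A small sketch, working on a literal copied from the output above, shows how those offsets might be parsed and checked against the input string; `char_span` is a hypothetical helper, not a Stanza function.

```python
text = "我爱自然语言处理"

# Token dicts copied from the tokenizer-only output above.
tokens = [
    {"id": "1", "text": "我", "misc": "start_char=0|end_char=1"},
    {"id": "2", "text": "爱", "misc": "start_char=1|end_char=2"},
    {"id": "3", "text": "自然", "misc": "start_char=2|end_char=4"},
    {"id": "4", "text": "语言", "misc": "start_char=4|end_char=6"},
    {"id": "5", "text": "处理", "misc": "start_char=6|end_char=8"},
]


def char_span(token):
    """Parse a misc string such as "start_char=2|end_char=4" into (2, 4)."""
    fields = dict(item.split("=", 1) for item in token["misc"].split("|"))
    return int(fields["start_char"]), int(fields["end_char"])


spans = [char_span(tok) for tok in tokens]

# Each span should slice the original string back to the token text.
for (start, end), tok in zip(spans, tokens):
    assert text[start:end] == tok["text"]

print(" / ".join(text[start:end] for start, end in spans))
```

Keeping offsets alongside the tokens makes it possible to map segmentation results back onto the original text, for example for highlighting.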

When building a Pipeline, besides selecting processors, you can specify which model to use for a processor that has several available, and you can also set the log level; see the official documentation for these. One more point: if you think the GPU is unnecessary, you can choose to run on the CPU:

In [21]: zh_doc = stanza.Pipeline('zh', use_gpu=False)                                            
2020-04-15 07:44:04 INFO: "zh" is an alias for "zh-hans"
2020-04-15 07:44:04 INFO: Loading these models for language: zh-hans (Simplified_Chinese):
=========================
| Processor | Package   |
-------------------------
| tokenize  | gsdsimp   |
| pos       | gsdsimp   |
| lemma     | gsdsimp   |
| depparse  | gsdsimp   |
| ner       | ontonotes |
=========================
 
2020-04-15 07:44:04 INFO: Use device: cpu
2020-04-15 07:44:04 INFO: Loading: tokenize
2020-04-15 07:44:04 INFO: Loading: pos
2020-04-15 07:44:06 INFO: Loading: lemma
2020-04-15 07:44:06 INFO: Loading: depparse
2020-04-15 07:44:08 INFO: Loading: ner
2020-04-15 07:44:09 INFO: Done loading processors!

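I have deployed Stanza's Chinese and English modules on the AINLP backend, running on CPU. If you are interested, follow the AINLP official account and test it in the chat: a message of the form "stanza" plus the text to analyze triggers it, and it automatically detects the language and selects the corresponding pipeline: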
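
How the backend detects the language is not described, so the following is only one plausible sketch: guess Chinese versus English from the share of CJK ideographs and dispatch to a pipeline registered under that language code. The threshold, the helper names, and the stand-in pipelines are all assumptions; in real use the dictionary values would be the stanza.Pipeline objects built earlier.

```python
def looks_chinese(text, threshold=0.3):
    """Guess whether text is Chinese from its share of CJK ideographs."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return False
    cjk = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    return cjk / len(chars) >= threshold


def pick_pipeline(text, pipelines):
    """Return the pipeline registered for the guessed language code."""
    return pipelines["zh" if looks_chinese(text) else "en"]


# Stand-ins for the pipeline objects; each just reports which one was chosen.
pipelines = {
    "en": lambda t: ("en", t),
    "zh": lambda t: ("zh", t),
}

for sample in ("我爱自然语言处理", "Barack Obama was born in Hawaii."):
    language, _ = pick_pipeline(sample, pipelines)(sample)
    print(language, sample)
```

A character-range heuristic is crude (it cannot separate Chinese from Japanese kanji, for instance), but it is enough to choose between exactly two deployed pipelines.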

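Columbia University's Classic Natural Language Processing Open Course, Taught by Professor Michael Collins, Who Is Highly Praised in The Beauty of Mathematics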

AINLP

When I was a student, I first learned about Professor Michael Collins from Wu Jun's "The Beauty of Mathematics" series on the Google China Blog. In "The Beauty of Mathematics, Part 15: Complexity and Simplicity, Several Elites of Natural Language Processing", he is described as follows:

Collins: the pursuit of perfection

Collins studied under Mitch Marcus, a master of natural language processing (we will mention Marcus many more times), and received his PhD from the University of Pennsylvania. He is now an associate professor at the Massachusetts Institute of Technology (MIT); do not be fooled by the title, because he is among the very best in the field today. During his PhD, Collins wrote a natural language sentence parser that was later named after him and that can accurately parse every sentence of written text. Parsing is the foundation of many natural language applications. Although Collins's senior colleagues Eric Brill and Ratnaparkhi and his junior colleague Eisner all built quite good parsers, Collins took his to the extreme, making it the best parser in the world for a long time. The key to his success was studying every detail of parsing with great care. The mathematical model he used is also elegant, and the whole work can be described as perfect. I once asked Collins for the source code of his parser for my research, and he readily gave it to me. I tried to modify his program to meet the needs of my particular application, but found that it contained so many details that it was hard to optimize any further. Collins's doctoral thesis is a model of writing in the field. Like a good novel, it lays out the background of everything so clearly that anyone with a little knowledge of computing and natural language processing can easily follow his complex methods.

After graduating, Collins spent three happy years at AT&T Labs. There he completed a great deal of world-class research, such as discriminative training methods for hidden Markov models and the application of convolution kernels to natural language processing. Three years later AT&T stopped its research in natural language processing, and Collins was fortunate to find a faculty position at MIT. In just a few years at MIT, Collins won best paper awards at international conferences several times, an achievement unmatched among his peers. What distinguishes Collins is that he takes things to the extreme. If anyone can be said to like the "philosophy of the intricate", Collins is one of them.

Professor Collins now teaches at Columbia University. Besides his technical skill, he is also handsome and widely admired. This open course, Natural Language Processing, was recorded around 2013; the course homepage includes slides and other related materials for those interested:

Recorded around 2013, this is a classic NLP course from before the deep learning era of NLP, and it is a good follow-up after completing Stanford's introductory NLP course. We have set up an introductory NLP study group; if you are interested, add AINLPer on WeChat (id: ainlper) with the note "NLP入门" ("NLP basics") to join the group and study together.

Some students left comments about this course early on; here are a few for reference:

"A very good course, not watered down like some others; it is a complete Columbia course. If you work through it seriously you will certainly learn a lot, and the time spent is absolutely worth it. Collins explains things very clearly, and the content covers language modeling, decoding algorithms, and learning algorithms.

Language and translation models: n-gram models, HMMs, log-linear models, GLMs, IBM Model 1, IBM Model 2, phrase-based translation models, PCFGs, lexicalized PCFGs

Decoding algorithms: the Viterbi algorithm, the CKY algorithm, the GLM Viterbi algorithm

Learning algorithms: the Brown clustering algorithm, the perceptron algorithm, the EM algorithm

Applications: part-of-speech tagging and entity recognition (HMM, GLM, log-linear), syntactic tree annotation (PCFG, dependency-based), machine translation"

=========================================================================================

"Professor Collins lectures very clearly, and the course broadly covers the more basic content of NLP. The programming assignments are very well targeted. Because I am not very familiar with Python, I found them a real struggle and spent more than 10 hours on almost every PA. The course is of upper-medium difficulty; I recommend it to students with some background in Python and machine learning."

=========================================================================================

"Compared with Stanford's NLP course, this one is more theoretical and a little drier to study, but the various models are explained simply and clearly. I recommend taking it after the Stanford NLP course."

I have organized the course by chapter and put it on Bilibili for those interested to follow. If you need the bundle of videos, slides, subtitles, and so on, follow our official account and reply "collins" to get the Baidu Netdisk link:

Columbia University NLP Open Course - Lecture 1: Course Introduction

Columbia University NLP Open Course - Lecture 2: Language Models

Columbia University NLP Open Course - Lecture 3: Parameter Estimation for Language Models; Lecture 4: Summary

Columbia University NLP Open Course - Lecture 5: Part-of-Speech Tagging and Hidden Markov Models

Columbia University NLP Open Course - Lecture 6: Parsing and Context-Free Grammars

Columbia University NLP Open Course - Lecture 7: Probabilistic Context-Free Grammars

Columbia University NLP Open Course - Lectures 8 and 9: Weaknesses of Probabilistic Context-Free Grammars, and Lexicalization

Columbia University NLP Open Course - Lecture 10: Introduction to Machine Translation

Columbia University NLP Open Course - Lecture 11: IBM Translation Models

Columbia University NLP Open Course - Lecture 12: Phrase-Based Machine Translation Models

Columbia University NLP Open Course - Lecture 13: Decoding Algorithms for Machine Translation

Note: the videos are still under review at Bilibili; links will be updated once they pass review, or you can get them via the network-drive link

Columbia University NLP Open Course - Lecture 14: Log-Linear Models

Columbia University NLP Open Course - Lecture 15: Part-of-Speech Tagging with Log-Linear Models

Columbia University NLP Open Course - Lecture 16: Parsing with Log-Linear Models

Columbia University NLP Open Course - Lecture 17: Unsupervised Learning

Columbia University NLP Open Course - Lecture 18: Generalized Linear Models

Columbia University NLP Open Course - Lecture 19: Part-of-Speech Tagging with Generalized Linear Models

Columbia University NLP Open Course - Lecture 20: Dependency Parsing with Generalized Linear Models

Finally, here is the introduction to Professor Collins from Baidu Baike:

NLP expert and professor at Columbia University, who developed the well-known syntactic parser Collins Parser.

Work experience:

January 1999 to November 2002: AT&T Labs, researcher;

January 2003 to December 2010: Massachusetts Institute of Technology (MIT), assistant professor / associate professor;

January 2011 to present: Columbia University, Vikram Pandit Professor.

Main achievements:

Best paper awards at EMNLP 2002, EMNLP 2004, UAI 2004, UAI 2005, CoNLL 2008, and EMNLP 2010.

Assessment:

Some scholars pursue a problem to its limit, persistently chasing completeness, one might even say to the point of perfection. Their work is of great reference value to their peers, which is why research needs scholars like this. Michael Collins, a top figure of the new generation in natural language processing, is such a person. (Wu Jun, "The Beauty of Mathematics")

Automatic Poetry Machine & Acrostic Poem Generator: Five-Character, Seven-Character, Jueju, and Lüshi All Covered

AINLP

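This is one of the most interesting tasks in natural language processing: natural language generation. In this post it means writing classical Chinese poems automatically, that is, an automatic poetry machine and an acrostic poem generator. The system currently supports automatic generation of five-character jueju, seven-character jueju, five-character lüshi, and seven-character lüshi (continuing from a given opening of at most 7 characters), as well as acrostic poems (composed from a given input of at most 8 characters). Let's look at the results first; this also serves as a short usage guide. If you are interested, follow the AINLP official account and trigger the tests directly with the keywords below:

Automatic poetry machine (automatic poem writing):
Enter “写诗 <opening text>” to trigger automatic poem generation (continuation). The input must not exceed 7 characters; depending on its length, several five-character jueju, seven-character jueju, five-character lüshi, or seven-character lüshi are generated at random:

Acrostic poem generator:
Enter “藏头诗 <acrostic characters>” to trigger acrostic generation. The input must not exceed 8 characters; depending on its length, a jueju or a lüshi is generated at random:

Five-character poem generator:
Enter “五言 <opening text>” to trigger five-character poem generation. The input must not exceed 5 characters; a five-character jueju or a five-character lüshi is generated at random.

Seven-character poem generator:
Enter “七言 <opening text>” to trigger seven-character poem generation. The input must not exceed 7 characters; a seven-character jueju or a seven-character lüshi is generated at random.

Jueju generator:
Enter “绝句 <opening text>” to trigger jueju generation. The input must not exceed 7 characters; depending on its length, a five-character or a seven-character jueju is generated at random.

Lüshi generator:
Enter “律诗 <opening text>” to trigger lüshi generation. The input must not exceed 7 characters; depending on its length, a five-character or a seven-character lüshi is generated at random.

Five-character jueju generator and five-character lüshi generator:
Enter “五言绝句 <opening text>” to trigger five-character jueju generation, or “五言律诗 <opening text>” to trigger five-character lüshi generation. The input must not exceed 5 characters: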

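Seven-character jueju generator and seven-character lüshi generator:
Enter “七言绝句 <opening text>” to trigger seven-character jueju generation, or “七言律诗 <opening text>” to trigger seven-character lüshi generation. The input must not exceed 7 characters: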
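The keyword rules above amount to a small dispatch table. The sketch below is not the account's actual backend; it only illustrates, under stated assumptions, how a message could be split into a trigger keyword and its content and checked against the length limits. The names `Command`, `MAX_INPUT_LEN`, and `parse_command` are hypothetical, and the 7-character limit for “七言绝句” and “七言律诗” is an assumption based on the limit given for the other seven-character keywords.

```python
from typing import NamedTuple, Optional


class Command(NamedTuple):
    keyword: str  # trigger keyword, e.g. "写诗"
    content: str  # opening text or acrostic characters


# Maximum content length per trigger keyword, following the limits listed above.
# The values for the two seven-character keywords are an assumption.
MAX_INPUT_LEN = {
    "五言绝句": 5, "五言律诗": 5,
    "七言绝句": 7, "七言律诗": 7,
    "藏头诗": 8,
    "写诗": 7, "五言": 5, "七言": 7, "绝句": 7, "律诗": 7,
}


def parse_command(message: str) -> Optional[Command]:
    """Return the matched command, or None if no keyword or limit matches."""
    text = message.strip()
    # Try longer keywords first so "五言绝句" is not mistaken for "五言".
    for keyword in sorted(MAX_INPUT_LEN, key=len, reverse=True):
        if text.startswith(keyword):
            content = text[len(keyword):].strip()
            if 0 < len(content) <= MAX_INPUT_LEN[keyword]:
                return Command(keyword, content)
            return None  # keyword matched but content is empty or too long
    return None


if __name__ == "__main__":
    print(parse_command("写诗 春"))
    print(parse_command("五言 一二三四五六"))
```

Matching the longest keyword first is the one design choice that matters here, because several keywords are prefixes of others.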

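Finally, let's look at acrostic poem generation again. It accepts any input of up to 8 characters; below are test runs for the inputs “自然语言”, “自然语言处理”, and “我爱自然语言处理”:

We have written about machine poem writing several times before; see:
Automatic poem writing goes live on the AINLP official account
Writing poems automatically with GPT-2, starting from five-character jueju
Year of the Rat Spring Festival: writing couplets and matching couplets automatically with GPT-2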

At present I use this tool to train on classical poem data and couplet data together. With a well-designed data format, a single model can support generation of several poem forms as well as couplets in one place, which is very convenient. I recommend it here.

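As for the poem forms themselves, the following introductions come from Baike:

Five-character jueju (wujue for short) is a form of traditional Chinese poetry: a short poem of four five-character lines that follows the rules of regulated verse, and belongs to the category of recent-style poetry. The form originated in the short yuefu poems of the Han dynasty, was deeply influenced by the folk songs of the Six Dynasties, and matured and took shape in the Tang dynasty. Each wujue has only twenty characters, yet it can present fresh pictures and convey vivid moods. Showing the large through the small and expressing much with little, packing rich content into a short piece, is its most distinctive feature. Wujue comes in two patterns, oblique-start and level-start. Representative works include Wang Wei's "Niao Ming Jian", Li Bai's "Jing Ye Si", Du Fu's "Ba Zhen Tu", Wang Zhihuan's "Deng Guanque Lou", and Liu Changqing's "Song Lingche Shangren".

Seven-character jueju (qijue for short) is a form of traditional Chinese poetry and belongs to the category of recent-style poetry. The whole poem has four lines of seven characters each, with strict prosodic requirements on rhyme, tonal adhesion, and antithesis. The form originated in the yuefu songs of the Southern Dynasties or the yuefu folk songs of the Northern Dynasties, and may be traced back to folk ballads of the Western Jin; it took shape and matured in the Tang dynasty. Representative works include Wang Changling's "Furong Lou Song Xin Jian" (two poems), Li Bai's "Zao Fa Baidi Cheng", Du Fu's "Jiangnan Feng Li Guinian", and Li Shengjiao's "Guan Chao You Gan".

Five-character lüshi (wulü for short) is a form of traditional Chinese poetry and belongs to the category of recent-style poetry. The form originated in the Yongming period of the Southern Qi; its initial shape was the new-style verse of Shen Yue and others, which stressed tonal rules and parallelism. It was basically fixed by Shen Quanqi and Song Zhiwen in the early Tang and matured in the High Tang. The whole poem has eight lines of five characters each, in two basic patterns, oblique-start and level-start, and the two middle couplets must be parallel. Representative works include Li Bai's "Song Youren", Du Fu's "Chun Wang", Wang Wei's "Shan Ju Qiu Ming", and Li Shengjiao's "Xinmao Jichun Ye Li Hang Er Gong Ci".

Seven-character lüshi (qilü for short) is a form of traditional Chinese poetry and belongs to the category of recent-style poetry. It originated in the new-style verse of Shen Yue and others in the Yongming period of the Southern Qi, which stressed tonal rules and parallelism; it was developed and fixed further by Shen Quanqi, Song Zhiwen, and others in the early Tang, and finally matured in the hands of Du Fu in the High Tang. Its prosody is strict and its lines must be uniform in length: eight lines of seven characters each, every two lines forming a couplet, for four couplets in total (the head, chin, neck, and tail couplets), with the two middle couplets required to be parallel. Representative works include Cui Hao's "Huanghe Lou", Du Fu's "Deng Gao", and Li Shangyin's "Anding Cheng Lou".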

Stanford University's Classic Introductory NLP Course, Taught by Professors Dan Jurafsky and Chris Manning

AINLP

This course was recorded on the eve of the deep learning boom. It is taught by Stanford professors Dan Jurafsky and Chris Manning, both leading figures in natural language processing. The former wrote "Speech and Language Processing" (Chinese title: 自然语言处理综论), whose third edition, SLP3, is still being updated; the latter wrote "Foundations of Statistical Natural Language Processing" (Chinese title: 统计自然语言处理) and "Introduction to Information Retrieval" (Chinese title: 信息检索导论). These books are practically required reading for NLPers. The course is well suited to NLP beginners: it covers the basic NLP tasks and the classic early methods for them, along with some methods related to information retrieval. I have organized the course by chapter and uploaded it to Bilibili; interested readers can follow it there.

Stanford Classic Introductory NLP Course - Lecture 1: Course Introduction, and Lecture 2: Regular Expressions

Stanford Classic Introductory NLP Course - Lecture 3: Edit Distance

Stanford Classic Introductory NLP Course - Lecture 4: Language Models

Stanford Classic Introductory NLP Course - Lecture 5: Spelling Correction

Stanford Classic Introductory NLP Course - Lecture 6: Text Classification

Stanford Classic Introductory NLP Course - Lecture 7: Sentiment Analysis

Stanford Classic Introductory NLP Course - Lecture 8: Generative Models, Discriminative Models, and Maximum Entropy Classifiers

Stanford Classic Introductory NLP Course - Lecture 9: Named Entity Recognition (NER)

Stanford Classic Introductory NLP Course - Lecture 10: Relation Extraction

Stanford Classic Introductory NLP Course - Lecture 11: Advanced Maximum Entropy Models

Stanford Classic Introductory NLP Course - Lecture 12: Part-of-Speech Tagging

Stanford Classic Introductory NLP Course - Lecture 13: Parsing

Stanford Classic Introductory NLP Course - Lectures 14 and 15: Probabilistic Parsing

Stanford Classic Introductory NLP Course - Lecture 16: Lexical Analysis

Stanford Classic Introductory NLP Course - Lecture 17: Dependency Parsing

Stanford Classic Introductory NLP Course - Lecture 18: Information Retrieval

Stanford Classic Introductory NLP Course - Lecture 19: Advanced Information Retrieval

Stanford Classic Introductory NLP Course - Lecture 20: Semantics

Stanford Classic Introductory NLP Course - Lecture 21: Question Answering Systems

Stanford Classic Introductory NLP Course - Lecture 22: Text Summarization, and Lecture 23: Conclusion

Stanford Classic Introductory NLP Course - Lecture 1: Course Introduction, and Lecture 2: Regular Expressions

AINLP

This course was recorded on the eve of the deep learning boom. It is taught by Stanford professors Dan Jurafsky and Christopher Manning, both leading figures in natural language processing: the former wrote "Speech and Language Processing" (Chinese title: 自然语言处理综论), and the latter wrote "Foundations of Statistical Natural Language Processing" (Chinese title: 统计自然语言处理基础). These two books are practically required reading for NLPers. The course is well suited to NLP beginners: it covers the basic NLP tasks and the classic early methods for them.

This post covers Lecture 1 (course introduction) and Lecture 2 (regular expressions). To be honest, regular expressions come up a great deal in day-to-day work.

Hung-yi Lee's New 2020 Course, Deep Learning for Human Language Processing, Is Now Online

AINLP

The 2020 edition of Hung-yi Lee's machine learning course went online just a couple of days ago, and without pausing he has now released another high-quality course: Deep Learning for Human Language Processing. It is well worth following for NLPers!

Course homepage, with the videos and links to other related materials (worth bookmarking):

http://speech.ee.ntu.edu.tw/~tlkagk/courses_dlhlp20.html

Having watched the first lecture video: the course is called Deep Learning for Human Language Processing rather than Deep Learning for Natural Language Processing mainly because text and speech each make up half of the content. It also concentrates on techniques from the last 3 years; for example, BERT and the pre-trained models that followed it will be covered in depth, which is something to look forward to. We have set up a study group for this course; if you are interested, add AINLPer on WeChat (id: ainlper) with the note “lihongyi” to join the group and study together.

Two lectures have been released so far, the course overview and the first part of speech recognition; interested readers can watch them directly:

If that is not enough, follow the AINLP official account and reply "DLHLP" to get the videos and slides of the first 2 lectures. The related materials will be updated as the course continues.

About AINLP

AINLP is a fun, AI-minded natural language processing community focused on sharing AI, NLP, machine learning, deep learning, recommendation algorithms, and related technologies. Topics include text summarization, intelligent question answering, chatbots, machine translation, automatic generation, knowledge graphs, pre-trained models, recommender systems, computational advertising, job postings, and job-hunting experience. You are welcome to follow! To join the technical discussion group, add AINLP君 on WeChat (id: ainlp2), noting your job or research area and your reason for joining.

Writing Poems Automatically with GPT-2, Starting from Five-Character Jueju

AINLP

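Before the Spring Festival I trained an automatic couplet system with GPT2 (see "Year of the Rat Spring Festival: Automatically Generating (Writing) Spring Couplets and Matching Couplets with GPT-2"). In principle this NLG approach can be applied to automatic text generation in any domain, although the more fixed the format, the better. That naturally made me think of writing poems automatically, since the format of classical poems is relatively fixed. We have touched on this before: the AINLP official account already has an acrostic (first-character) poem feature, though that one directly reuses an existing project. There is also a larger poetry data project that can serve as the "raw material" for automatic poem writing. Add the GPT2-Chinese project, and everything is in place; all that remains is to try it.

So this week we continue the natural language generation theme, starting from five-character jueju. Baidu Baike describes five-character jueju as follows:

Five-character jueju (wujue for short) is a form of traditional Chinese poetry: a short poem of four five-character lines that follows the rules of regulated verse, and belongs to the category of recent-style poetry. The form originated in the short yuefu poems of the Han dynasty, was deeply influenced by the folk songs of the Six Dynasties, and matured and took shape in the Tang dynasty. Each wujue has only twenty characters, yet it can present fresh pictures and convey vivid moods. Showing the large through the small and expressing much with little, packing rich content into a short piece, is its most distinctive feature. Wujue comes in two patterns, oblique-start and level-start. Representative works include Wang Wei's "Niao Ming Jian", Li Bai's "Jing Ye Si", Du Fu's "Ba Zhen Tu", Wang Zhihuan's "Deng Guanque Lou", and Liu Changqing's "Song Lingche Shangren".

I mainly used the "Quan Tang Shi" (Complete Tang Poems) and "Quan Song Shi" (Complete Song Poems) data from that project. First, a salute to the project's authors:

"Quan Tang Shi" was compiled and collated by imperial order in the 44th year of the Kangxi reign of the Qing dynasty (1705) by ten editors: Peng Dingqiu, Shen Sanceng, Yang Zhongne, Wang Shihong, Wang Yi, Yu Mei, Xu Shuben, Che Dingjin, Pan Conglü, and Zha Sili. It "gathers more than 48,900 poems by more than 2,200 poets", in 900 volumes plus a 12-volume table of contents. (From Baike.)

"Quan Song Shi": following the great flourishing of Tang poetry, Song poetry broke new ground in both ideas and artistic expression. Many outstanding poets and works appeared and many schools formed, with a profound influence on the poetry of the Yuan, Ming, and Qing dynasties.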

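Note
"Quan Tang Shi" and "Quan Song Shi" are stored in traditional Chinese. Convert them yourself if you need to, but be aware that some converted characters may not fit the context.

Here I first converted traditional to simplified Chinese with OpenCC, then extracted the five-character jueju, and finally converted them into the GPT2-Chinese training format. After that it is just training and testing. Interested readers can try it themselves; it is quite convenient, and the training tips from the earlier couplet post carry over:

1) Write a script that converts the training data into the format GPT2-Chinese expects. You can add some marker tokens, so that generation can later apply tricks based on those markers.
2) When training, set the min-length parameter to a small number. The default is 128; because couplet data is short, training with the default setting only produces garbage output. I set it to 1.
3) Adjust batch_size and the configuration parameters to fit your GPU memory. batch_size defaults to 8 here, which causes an OOM error on a 1080TI machine; setting it to 4 lets training run through. The other parameters can be left alone.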
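The extraction step mentioned above (keeping only the five-character jueju) can be sketched as a simple filter: a five-character jueju has exactly four lines of five characters each. The code below is a minimal illustration, not the script actually used for this post. It assumes each poem arrives as a list of strings with lines separated by Chinese punctuation, and the function names are hypothetical; the traditional-to-simplified conversion is left out so the sketch needs only the standard library.

```python
import re
from typing import Iterable, List

# Punctuation assumed to separate lines within a poem (an assumption about the data).
_LINE_SEPARATORS = re.compile(r"[,。!?;,.!?;]")


def split_lines(paragraphs: List[str]) -> List[str]:
    """Split a poem's paragraphs into individual lines, dropping punctuation."""
    text = "".join(paragraphs)
    return [part for part in _LINE_SEPARATORS.split(text) if part]


def is_wuyan_jueju(paragraphs: List[str]) -> bool:
    """True if the poem has exactly four lines of five characters each."""
    lines = split_lines(paragraphs)
    return len(lines) == 4 and all(len(line) == 5 for line in lines)


def extract_wuyan_jueju(poems: Iterable[List[str]]) -> List[str]:
    """Keep only five-character jueju, one normalized poem per output string."""
    result = []
    for paragraphs in poems:
        if is_wuyan_jueju(paragraphs):
            a, b, c, d = split_lines(paragraphs)
            result.append(f"{a},{b}。{c},{d}。")
    return result


if __name__ == "__main__":
    poems = [
        ["床前明月光,疑是地上霜。", "举头望明月,低头思故乡。"],
        ["朝辞白帝彩云间,千里江陵一日还。", "两岸猿声啼不住,轻舟已过万重山。"],
    ]
    for poem in extract_wuyan_jueju(poems):
        print(poem)
```

The same length check, with 7 in place of 5 or 8 lines in place of 4, would select the other three forms.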

Once the GPT2 poem model is trained, you can test it directly with the generate.py script in GPT2-Chinese, which is very convenient. I wrote a server version based on generate.py and flask-restful and connected it to the backend of the AINLP official account; interested readers can follow the account and test it directly:

The keywords “写诗” and “作诗” trigger automatic poem generation. For example, if you enter “写诗春”, the model continues automatically from “春” and returns poems that begin with “春”. Other characters work the same way. For now the input cannot exceed five characters, because only five-character jueju can be generated:

The keyword “藏头诗” triggers acrostic poem generation, for example “藏头诗春夏秋冬”. The poem is generated by the GPT2 model with an extra trick layered on top:
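The post does not say what the trick is. One plausible approach, shown here purely as an assumption, is to generate the poem line by line: each line is forced to start with the next acrostic character, and the model is asked to continue from there. In the sketch below `continue_text` stands in for a call to the trained model; it and `make_acrostic` are hypothetical names, not part of GPT2-Chinese.

```python
from typing import Callable, List


def make_acrostic(
    heads: str,
    continue_text: Callable[[str], str],
    line_len: int = 5,
) -> List[str]:
    """Build one line per head character, each line starting with that character.

    continue_text receives everything generated so far plus the forced first
    character, and returns the model's continuation (hypothetical interface).
    """
    lines: List[str] = []
    for head in heads:
        prefix = "".join(lines) + head
        continuation = continue_text(prefix)
        line = head + continuation[: line_len - 1]
        if len(line) != line_len:
            raise ValueError(f"continuation too short for head {head!r}")
        lines.append(line)
    return lines


if __name__ == "__main__":
    # Stand-in for the model: always returns the same filler characters.
    def fake_model(prefix: str) -> str:
        return "风吹花自开"

    for line in make_acrostic("春夏秋冬", fake_model):
        print(line)
```

A real system would also need to handle punctuation and rhyme, and would probably sample several candidates per line and keep the best one.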

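Finally, you are welcome to follow the AINLP official account and try out the automatic poem writing and acrostic poem generator features:

For the AINLP dialogue feature modules, interested readers can refer to:

Tencent word vectors and the similar-word, similarity, and word-game series

NLP tools and online testing (dialogue testing in the official account)

Automatic couplets and the poetry machine

The Kuakua (praise) chatbot and other skills

If you are interested in the AINLP official account, you are also welcome to look at our annual reading list: AINLP年度阅读收藏清单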

Year of the Rat Spring Festival: Automatically Generating (Writing) Spring Couplets and Matching Couplets with GPT-2

AINLP

With the Year of the Rat Spring Festival approaching, come and try the new automatic couplet system based on GPT2-Chinese: automatic couplet writing (enter an opening and a couplet is generated) and automatic couplet matching (enter the first line and the second line is written for you). The old couplet feature is a dialogue module that went online last year, based on a deep learning machine translation model: "风云三尺剑,花鸟一床书": a couplet dataset and an automatic couplet bot

Over the past year, pre-trained models led by BERT have kept arriving and have swept the whole field of natural language processing (NLP). Natural language generation (NLG), one of NLP's hard problems, has benefited greatly as well, especially from the release of OpenAI's GPT-2 in the first half of last year, which was stunning. However, the GPT-2 models are trained mainly on English corpora. Although the full model with 1.5 billion parameters has now been released, which is a great help for automatic text generation in English, it remains of limited use for NLG in Chinese.

Back to Chinese: we previously recommended GPT2-Chinese, an open-source project by Du Zeyao from the AINLP technical discussion group. The project trains GPT-2 models on Chinese data and can be used to write poems, news, or novels, or to train a general-purpose language model. So for automatic couplet generation, the obvious idea was to train a couplet-domain GPT2 model from GPT2-Chinese and couplet data, and use it for both writing couplets and matching couplets. Fortunately the couplet data already exists: it is the same dataset we used last year, and special thanks go to the person who provided it. The dataset contains more than 700,000 couplets. The only pity is that it has no horizontal scrolls; with those, we could build a more complete system for writing and matching couplets.

It should be noted that this is not domain-specific fine-tuning of a large Chinese GPT-2 model. A large pre-trained Chinese GPT-2 model does now exist, but it and GPT2-Chinese are two separate systems, and GPT2-Chinese does not yet support transferring from that large model. As for how to train a GPT2 model on couplet data with GPT2-Chinese, the project's code and documentation are very clear, so just follow them. If you run into problems, look through the issues; the problems I hit were basically all solved through the documentation and the issues. A few points to watch:

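1) Write a script that converts the training data into the format GPT2-Chinese expects. You can add some marker tokens, for example for the beginning, the end, and the separator between the first and second lines, so that generation can later apply tricks based on those markers.
2) When training, set the min-length parameter to a small number. The default is 128; because couplet data is short, training with the default setting only produces garbage output. I set it to 1.
3) Adjust batch_size and the configuration parameters to fit your GPU memory. batch_size defaults to 8 here, which causes an OOM error on a 1080TI machine; setting it to 4 lets training run through. The other parameters can be left alone.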
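Point 1 can be made concrete with a small sketch. The marker strings `[BOS]`, `[SEP]`, and `[EOS]` and all function names below are hypothetical choices for illustration, not the markers used in this post and not part of GPT2-Chinese. The idea is that the same markers used to format the training data also define the prompts for the two features and the way generated text is cut and validated.

```python
from typing import Optional, Tuple

# Hypothetical marker tokens; any strings absent from the data would do.
BEGIN, SEP, END = "[BOS]", "[SEP]", "[EOS]"


def format_couplet(upper: str, lower: str) -> str:
    """One training example: begin marker, first line, separator, second line, end."""
    return f"{BEGIN}{upper}{SEP}{lower}{END}"


def prompt_for_writing(opening: str) -> str:
    """Prompt for writing a couplet: the model continues the first line."""
    return f"{BEGIN}{opening}"


def prompt_for_matching(upper: str) -> str:
    """Prompt for matching a couplet: the model only has to write the second line."""
    return f"{BEGIN}{upper}{SEP}"


def parse_generated(text: str) -> Optional[Tuple[str, str]]:
    """Cut generated text at the end marker and split it into the two lines.

    Returns None if the markers are missing or the two lines differ in length,
    since the lines of a couplet must have the same number of characters.
    """
    if not text.startswith(BEGIN):
        return None
    body = text[len(BEGIN):].split(END, 1)[0]
    upper, sep, lower = body.partition(SEP)
    if not sep or not upper or len(upper) != len(lower):
        return None
    return upper, lower


if __name__ == "__main__":
    example = format_couplet("风云三尺剑", "花鸟一床书")
    print(example)
    print(parse_generated(example + "后面的多余文本"))
```

Filtering candidates with `parse_generated` is one simple way to discard malformed generations before returning a few couplets to the user.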

Once the couplet GPT-2 model is trained, you can test it directly with the generate.py script in GPT2-Chinese, which is very convenient. I wrote a server version based on generate.py and flask-restful and connected it to the backend of the AINLP official account; interested readers can follow the account and test it directly:

The keyword “写对联” triggers automatic couplet generation. For example, if you enter “写对联鼠年”, the couplet model continues automatically from “鼠年” and returns about 3 couplets that begin with “鼠年”:

The keyword “对对联” triggers matching a second line to a given first line. For example, entering “对对联 一帆风顺年年好” returns about 3 candidate couplets:

You can of course use “上联” to trigger the old couplet version for comparison:

As for how the two versions compare, feel free to try both. If you come across a really good machine-written couplet, please share it in the comments. Finally, you are welcome to follow the AINLP official account and try out automatic couplet generation and automatic couplet matching:
