fastNLP documentation

A Modularized and Extensible Toolkit for Natural Language Processing. Currently still in incubation.

Introduction

FastNLP is a modular Natural Language Processing system based on PyTorch, built for fast development of NLP models.

A deep learning NLP model is the composition of three types of modules:

module type | functionality                                      | example
------------|----------------------------------------------------|----------------------------------
encoder     | encode the input into some abstract representation | embedding, RNN, CNN, transformer
aggregator  | aggregate and reduce information                   | self-attention, max-pooling
decoder     | decode the representation into the output          | MLP, CRF

For example:

[figure: text_classification.png — a text classification model built from an encoder, an aggregator, and a decoder]
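As a code-level illustration, here is a minimal sketch in plain PyTorch (all names are illustrative, not part of fastNLP's API) of how an encoder, an aggregator, and a decoder compose into a text classifier:

import torch
import torch.nn as nn

class TinyTextClassifier(nn.Module):
    def __init__(self, vocab_size=100, emb_dim=50, num_classes=5):
        super(TinyTextClassifier, self).__init__()
        self.encoder = nn.Embedding(vocab_size, emb_dim)  # encoder: word ids -> vectors
        self.decoder = nn.Linear(emb_dim, num_classes)    # decoder: representation -> label scores

    def forward(self, word_seq):
        x = self.encoder(word_seq)        # [batch, seq_len, emb_dim]
        x, _ = x.max(dim=1)               # aggregator: max-pooling over the sequence
        return {"pred": self.decoder(x)}  # fastNLP models return a dict

model = TinyTextClassifier()
out = model(torch.randint(0, 100, (2, 7)))  # a batch of 2 sequences of length 7
print(out["pred"].shape)                    # torch.Size([2, 5])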

User’s Guide

Installation

Make sure your environment meets the requirements listed in https://github.com/fastnlp/fastNLP/blob/master/requirements.txt .

Run the following command to install the fastNLP package:

pip install fastNLP

Quickstart

FastNLP 1-Minute Tutorial

The original tutorial notebook is at https://github.com/fastnlp/fastNLP/blob/master/tutorials/fastnlp_1min_tutorial.ipynb

Step 1

Load the dataset

from fastNLP import DataSet
# linux_path = "../test/data_for_tests/tutorial_sample_dataset.csv"
win_path = "C:\\Users\\zyfeng\\Desktop\\FudanNLP\\fastNLP\\test\\data_for_tests\\tutorial_sample_dataset.csv"
ds = DataSet.read_csv(win_path, headers=('raw_sentence', 'label'), sep='\t')
Step 2

Preprocess the data: 1. convert field types; 2. split off a validation set; 3. build a vocabulary

# lowercase the raw sentences
ds.apply(lambda x: x['raw_sentence'].lower(), new_field_name='raw_sentence')
# convert label to int
ds.apply(lambda x: int(x['label']), new_field_name='label_seq', is_target=True)

def split_sent(ins):
    return ins['raw_sentence'].split()
ds.apply(split_sent, new_field_name='words', is_input=True)
# split into training and validation sets
train_data, dev_data = ds.split(0.3)
print("Train size: ", len(train_data))
print("Test size: ", len(dev_data))
Train size:  54
Test size:  23
from fastNLP import Vocabulary
vocab = Vocabulary(min_freq=2)
train_data.apply(lambda x: [vocab.add(word) for word in x['words']])

# index the sentences with Vocabulary.to_index(word)
train_data.apply(lambda x: [vocab.to_index(word) for word in x['words']], new_field_name='word_seq', is_input=True)
dev_data.apply(lambda x: [vocab.to_index(word) for word in x['words']], new_field_name='word_seq', is_input=True)
Step 3

Define the model

from fastNLP.models import CNNText
model = CNNText(embed_num=len(vocab), embed_dim=50, num_classes=5, padding=2, dropout=0.1)
Step 4

Start training

from fastNLP import Trainer, CrossEntropyLoss, AccuracyMetric
trainer = Trainer(model=model,
                  train_data=train_data,
                  dev_data=dev_data,
                  loss=CrossEntropyLoss(),
                  metrics=AccuracyMetric()
                  )
trainer.train()
print('Train finished!')
training epochs started 2018-12-07 14:03:41
Epoch 1/3. Step:2/6. AccuracyMetric: acc=0.26087
Epoch 2/3. Step:4/6. AccuracyMetric: acc=0.347826
Epoch 3/3. Step:6/6. AccuracyMetric: acc=0.608696
Train finished!
This concludes the tutorial. For more operations, see the advanced tutorials.

fastNLP 10-Minute Tutorial

The original tutorial notebook is at https://github.com/fastnlp/fastNLP/blob/master/tutorials/fastnlp_10min_tutorial.ipynb

fastNLP provides convenient facilities for data preprocessing and for training and testing models.

DataSet & Instance

fastNLP uses DataSet and Instance to store and process data. A DataSet represents a dataset and an Instance represents a single sample; a DataSet holds multiple Instances, and each Instance can store arbitrary user-defined fields.

There are several read_* methods for conveniently reading data from files into a DataSet.

from fastNLP import DataSet
from fastNLP import Instance

# read data from CSV into a DataSet
win_path = "C:\\Users\\zyfeng\\Desktop\\FudanNLP\\fastNLP\\test\\data_for_tests\\tutorial_sample_dataset.csv"
dataset = DataSet.read_csv(win_path, headers=('raw_sentence', 'label'), sep='\t')
print(dataset[0])
{'raw_sentence': A series of escapades demonstrating the adage that what is good for the goose is also good for the gander , some of which occasionally amuses but none of which amounts to much of a story .,
'label': 1}
# add new data with DataSet.append(Instance)

dataset.append(Instance(raw_sentence='fake data', label='0'))
dataset[-1]
{'raw_sentence': fake data,
'label': 0}
# preprocess data with DataSet.apply(func, new_field_name)

# lowercase the raw sentences
dataset.apply(lambda x: x['raw_sentence'].lower(), new_field_name='raw_sentence')
# convert label to int
dataset.apply(lambda x: int(x['label']), new_field_name='label_seq', is_target=True)
# drop instances whose raw_sentence is empty
dataset.drop(lambda x: len(x['raw_sentence'].split()) == 0)
# split sentences on whitespace
def split_sent(ins):
    return ins['raw_sentence'].split()
dataset.apply(split_sent, new_field_name='words', is_input=True)
# filter out instances with DataSet.drop(func)
# drop instances with 3 or fewer words
dataset.drop(lambda x: len(x['words']) <= 3)
# split into test and training sets
# note: DataSet.split(0.3) returns (train_set, dev_set) = (70%, 30%) of the data;
# the variable names below are swapped, so test_data receives the larger split

test_data, train_data = dataset.split(0.3)
print("Train size: ", len(test_data))
print("Test size: ", len(train_data))
Train size:  54
Test size:  23
Vocabulary

Vocabulary in fastNLP makes it easy to build a word list and map words to indices.

from fastNLP import Vocabulary

# build the vocabulary with Vocabulary.add(word)
vocab = Vocabulary(min_freq=2)
train_data.apply(lambda x: [vocab.add(word) for word in x['words']])
vocab.build_vocab()

# index the sentences with Vocabulary.to_index(word)
train_data.apply(lambda x: [vocab.to_index(word) for word in x['words']], new_field_name='word_seq', is_input=True)
test_data.apply(lambda x: [vocab.to_index(word) for word in x['words']], new_field_name='word_seq', is_input=True)


print(test_data[0])
{'raw_sentence': the plot is romantic comedy boilerplate from start to finish .,
'label': 2,
'label_seq': 2,
'words': ['the', 'plot', 'is', 'romantic', 'comedy', 'boilerplate', 'from', 'start', 'to', 'finish', '.'],
'word_seq': [2, 13, 9, 24, 25, 26, 15, 27, 11, 28, 3]}
# if you are working on projects such as reinforcement learning or GANs, you can also use the dataset directly through Batch
from fastNLP.core.batch import Batch
from fastNLP.core.sampler import RandomSampler

batch_iterator = Batch(dataset=train_data, batch_size=2, sampler=RandomSampler())
for batch_x, batch_y in batch_iterator:
    print("batch_x has: ", batch_x)
    print("batch_y has: ", batch_y)
    break
batch_x has:  {'words': array([list(['this', 'kind', 'of', 'hands-on', 'storytelling', 'is', 'ultimately', 'what', 'makes', 'shanghai', 'ghetto', 'move', 'beyond', 'a', 'good', ',', 'dry', ',', 'reliable', 'textbook', 'and', 'what', 'allows', 'it', 'to', 'rank', 'with', 'its', 'worthy', 'predecessors', '.']),
       list(['the', 'entire', 'movie', 'is', 'filled', 'with', 'deja', 'vu', 'moments', '.'])],
      dtype=object), 'word_seq': tensor([[  19,  184,    6,    1,  481,    9,  206,   50,   91, 1210, 1609, 1330,
          495,    5,   63,    4, 1269,    4,    1, 1184,    7,   50, 1050,   10,
            8, 1611,   16,   21, 1039,    1,    2],
        [   3,  711,   22,    9, 1282,   16, 2482, 2483,  200,    2,    0,    0,
            0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
            0,    0,    0,    0,    0,    0,    0]])}
batch_y has:  {'label_seq': tensor([3, 2])}
Model
# define a simple PyTorch model

from fastNLP.models import CNNText
model = CNNText(embed_num=len(vocab), embed_dim=50, num_classes=5, padding=2, dropout=0.1)
model
CNNText(
  (embed): Embedding(
    (embed): Embedding(77, 50, padding_idx=0)
    (dropout): Dropout(p=0.0)
  )
  (conv_pool): ConvMaxpool(
    (convs): ModuleList(
      (0): Conv1d(50, 3, kernel_size=(3,), stride=(1,), padding=(2,))
      (1): Conv1d(50, 4, kernel_size=(4,), stride=(1,), padding=(2,))
      (2): Conv1d(50, 5, kernel_size=(5,), stride=(1,), padding=(2,))
    )
  )
  (dropout): Dropout(p=0.1)
  (fc): Linear(
    (linear): Linear(in_features=12, out_features=5, bias=True)
  )
)
Trainer & Tester

Train the model with fastNLP's Trainer.

from fastNLP import Trainer
from copy import deepcopy
from fastNLP import CrossEntropyLoss
from fastNLP import AccuracyMetric
# run an overfitting sanity check
copy_model = deepcopy(model)
overfit_trainer = Trainer(model=copy_model,
                          train_data=test_data,
                          dev_data=test_data,
                          loss=CrossEntropyLoss(pred="output", target="label_seq"),
                          metrics=AccuracyMetric(),
                          n_epochs=10,
                          save_path=None)
overfit_trainer.train()
training epochs started 2018-12-07 14:07:20
Epoch 1/10. Step:2/20. AccuracyMetric: acc=0.037037
Epoch 2/10. Step:4/20. AccuracyMetric: acc=0.296296
Epoch 3/10. Step:6/20. AccuracyMetric: acc=0.333333
Epoch 4/10. Step:8/20. AccuracyMetric: acc=0.555556
Epoch 5/10. Step:10/20. AccuracyMetric: acc=0.611111
Epoch 6/10. Step:12/20. AccuracyMetric: acc=0.481481
Epoch 7/10. Step:14/20. AccuracyMetric: acc=0.62963
Epoch 8/10. Step:16/20. AccuracyMetric: acc=0.685185
Epoch 9/10. Step:18/20. AccuracyMetric: acc=0.722222
Epoch 10/10. Step:20/20. AccuracyMetric: acc=0.777778
# instantiate a Trainer with the model and data, then train
trainer = Trainer(model=model,
                  train_data=train_data,
                  dev_data=test_data,
                  loss=CrossEntropyLoss(pred="output", target="label_seq"),
                  metrics=AccuracyMetric(),
                  n_epochs=5)
trainer.train()
print('Train finished!')
training epochs started 2018-12-07 14:08:10
Epoch 1/5. Step:1/5. AccuracyMetric: acc=0.037037
Epoch 2/5. Step:2/5. AccuracyMetric: acc=0.037037
Epoch 3/5. Step:3/5. AccuracyMetric: acc=0.037037
Epoch 4/5. Step:4/5. AccuracyMetric: acc=0.185185
Epoch 5/5. Step:5/5. AccuracyMetric: acc=0.240741
Train finished!
from fastNLP import Tester

tester = Tester(data=test_data, model=model, metrics=AccuracyMetric())
acc = tester.test()
[tester]
AccuracyMetric: acc=0.240741
In summary
Pseudo-code logic of the fastNLP Trainer:
1. Prepare a DataSet. Suppose the DataSet contains the following fields:
['raw_sentence', 'word_seq1', 'word_seq2', 'raw_label', 'label']
Mark 'word_seq1' and 'word_seq2' as input via
    DataSet.set_input('word_seq1', 'word_seq2', flag=True)
Mark 'label' as target via
    DataSet.set_target('label', flag=True)
2. Initialize the model
class Model(nn.Module):
    def __init__(self):
        xxx
    def forward(self, word_seq1, word_seq2):
        # (1) The parameter names used here must match the names of the input fields
        #     in the DataSet, because arguments are bound by parameter name.
        # (2) There may be more input fields than parameters here, but not fewer.
        xxxx
        # the output must be a dict
3. The Trainer's training procedure
(1) Take a batch of batch_size instances from the DataSet and call Model.forward.
(2) Pass the result of Model.forward, together with the fields marked as target, into the loss.
       Since the keys of the dict returned by Model.forward may differ between users, e.g. {'pred': xxx} vs {'output': xxx},
       and the target field may also be named differently, e.g. 'label' vs 'target',
    the loss classes provide a key-mapping mechanism to resolve this.
       For example, CrossEntropyLoss needs (prediction, target) as input. If forward outputs {'output': xxx} and 'label' is the target field,
       initialize the loss as CrossEntropyLoss(pred='output', target='label').
(3) Metrics work the same way:
    a Metric also takes its values from the forward output and from the fields marked as target, found through the same key mapping.
Some questions
1. Why do fields in a DataSet need to be marked as input or target?
Only data marked as input or target is fetched during training.
(1.1) Only fields marked as input are searched for the arguments passed to Model.forward.
(1.2) The values passed to losses and metrics come from:
        (a) the output of Model.forward
        (b) the fields marked as target
2. Fields in the DataSet are bound to the parameters of forward by parameter name.
For example, if the fields x and seq_lens are input, forward should be
    def forward(self, x, seq_lens):
        pass
The fields are matched purely by parameter name.
Overall workflow:
1. Load data into a DataSet.
2. Preprocess the DataSet with apply operations.
(2.1) During preprocessing, mark some fields as input and some as target.
3. Build the model.
(3.1) The parameter names of the model's forward function must match the names of the fields marked as input in the DataSet, as in the example above.
(3.2) The output of the model's forward must be a dict.
    It is recommended to use {"pred": xx} as the output. A complete sketch follows below.
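Putting these conventions together, here is a minimal runnable sketch (the field names, keys, and model are illustrative only):

import torch.nn as nn
from fastNLP import DataSet, Instance, Trainer, CrossEntropyLoss, AccuracyMetric

ds = DataSet()
ds.append(Instance(word_seq=[1, 2, 3], label=0))
ds.append(Instance(word_seq=[4, 5, 6], label=1))
ds.set_input('word_seq')   # passed to forward by parameter name
ds.set_target('label')     # passed to the loss and metrics

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.embed = nn.Embedding(10, 8)
        self.fc = nn.Linear(8, 2)

    def forward(self, word_seq):              # the name matches the input field
        x = self.embed(word_seq).mean(dim=1)
        return {"pred": self.fc(x)}           # output must be a dict

trainer = Trainer(model=Model(), train_data=ds, dev_data=ds,
                  loss=CrossEntropyLoss(pred="pred", target="label"),
                  metrics=AccuracyMetric(pred="pred", target="label"))
trainer.train()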

API Reference

If you are looking for information on a specific function, class or method, this part of the documentation is for you.

fastNLP

fastNLP.api

fastNLP.api.api
class fastNLP.api.api.POS(model_path=None, device='cpu')[source]

FastNLP API for Part-Of-Speech tagging.

Parameters:
  • model_path (str) – the path to the model.
  • device (str) – device name such as “cpu” or “cuda:0”. Use the same notation as PyTorch.
predict(content)[source]
Parameters:content – list of list of str. Each string is a token (word).
Return answer:list of list of str. Each string is a tag.
test(file_path)[source]

Test performance over the given data set.

Parameters:file_path (str) –
Returns:a dictionary of metric values
fastNLP.api.converter
fastNLP.api.model_zoo
fastNLP.api.pipeline
class fastNLP.api.pipeline.Pipeline(processors=None)[source]

Pipeline takes a DataSet object as input, runs multiple processors sequentially, and outputs a DataSet object.
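A short sketch of building and applying a Pipeline (this assumes, as the docs above suggest, that processors can be passed as a list and that the pipeline is applied by calling it on a DataSet):

from fastNLP import DataSet, Instance, Vocabulary
from fastNLP.api.pipeline import Pipeline
from fastNLP.api.processor import IndexerProcessor

ds = DataSet()
ds.append(Instance(words=["this", "is", "a", "demo"]))

vocab = Vocabulary()
for w in ["this", "is", "a", "demo"]:
    vocab.add(w)
vocab.build_vocab()

# processors run sequentially; IndexerProcessor turns 'words' into indices
pipe = Pipeline(processors=[IndexerProcessor(vocab, "words", "word_seq")])
ds = pipe(ds)  # assumed calling convention: pipeline(DataSet) -> DataSet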

fastNLP.api.processor
class fastNLP.api.processor.FullSpaceToHalfSpaceProcessor(field_name, change_alpha=True, change_digit=True, change_punctuation=True, change_space=True)[source]

Convert full-width characters to half-width, processing one character at a time.

class fastNLP.api.processor.Index2WordProcessor(vocab, field_name, new_added_field_name)[source]

Convert a field of indices in a DataSet back to str according to the vocab.

class fastNLP.api.processor.IndexerProcessor(vocab, field_name, new_added_field_name, delete_old_field=False, is_input=True)[source]
Given a vocabulary, convert the specified field to index form. The field should be a one-dimensional list, e.g.
['我', '是', xxx]
class fastNLP.api.processor.Num2TagProcessor(tag, field_name, new_added_field_name=None)[source]

Convert the digits in a sentence to a given tag.

class fastNLP.api.processor.PreAppendProcessor(data, field_name, new_added_field_name=None)[source]
Prepend data (which should be a str) to the start of a field. The field must be of list type, i.e. the new field is
[data] + instance[field_name]
class fastNLP.api.processor.SeqLenProcessor(field_name, new_added_field_name='seq_lens', is_input=True)[source]

Add a sequence-length field derived from the given field, using the size of its first dimension.

class fastNLP.api.processor.SliceProcessor(start, end, step, field_name, new_added_field_name=None)[source]

Take part of a field. Equivalent to instance[field_name][start:end:step].

class fastNLP.api.processor.VocabIndexerProcessor(field_name, new_added_filed_name=None, min_freq=1, max_size=None, verbose=0, is_input=True)[source]
Build a Vocabulary from the DataSet and index the field with it. The newly generated index field is stored in new_added_filed_name;
if it is not provided, the original field_name is overwritten.
construct_vocab(*datasets)[source]

Build the vocabulary from the given DataSets.

Parameters:datasets – data of type DataSet, used to build the vocabulary
Returns:
process(*datasets, only_index_dataset=None)[source]
If the Vocabulary has not been built yet, build it from the DataSets passed in datasets; otherwise use the existing vocabulary. Once the
vocabulary is available, index datasets as well as only_index_dataset.
Parameters:
  • datasets – data of type DataSet
  • only_index_dataset – a DataSet, or a list of DataSet. Its contents are only indexed, not used to build the vocabulary.
Returns:

set_verbose(verbose)[source]

Set the processor's verbose status.

Parameters:verbose – int; 0: print nothing; 1: print vocab information.
Returns:
class fastNLP.api.processor.VocabProcessor(field_name, min_freq=1, max_size=None)[source]

Build a vocabulary from one or more DataSets.

fastNLP.core

fastNLP.core.batch
class fastNLP.core.batch.Batch(dataset, batch_size, sampler=<fastNLP.core.sampler.RandomSampler object>, as_numpy=False, prefetch=False)[source]

Batch is an iterable object which iterates over mini-batches.

Example:

for batch_x, batch_y in Batch(data_set, batch_size=16, sampler=SequentialSampler()):
    # ...
Parameters:
  • dataset (DataSet) – a DataSet object
  • batch_size (int) – the size of the batch
  • sampler (Sampler) – a Sampler object
  • as_numpy (bool) – If True, return Numpy array. Otherwise, return torch tensors.
  • prefetch (bool) – If True, use multiprocessing to fetch next batch when training.
  • device (str or torch.device) – the device of the batch; if as_numpy is True, device is ignored.
fastNLP.core.dataset
class fastNLP.core.dataset.DataSet(data=None)[source]

DataSet is the collection of examples. DataSet provides instance-level interface. You can append and access an instance of the DataSet. However, it stores data in a different way: Field-first, Instance-second.

add_field(name, fields, padder=<fastNLP.core.fieldarray.AutoPadder object>, is_input=False, is_target=False)[source]

Add a new field to the DataSet.

Parameters:
  • name (str) – the name of the field.
  • fields – a list of int, float, or other objects.
  • padder (PadderBase) – how this field is padded. The default is suitable in most cases.
  • is_input (bool) – whether this field is model input.
  • is_target (bool) – whether this field is label or target.
append(ins)[source]

Add an instance to the DataSet. If the DataSet is not empty, the instance must have the same field names as the other instances in the DataSet.

Parameters:ins – an Instance object
apply(func, new_field_name=None, **kwargs)[source]

Apply a function to every instance of the DataSet.

Parameters:
  • func – a function that takes an instance as input.
  • new_field_name (str) – If not None, results of the function will be stored as a new field.
  • **kwargs

Accepted keyword arguments are: (1) is_input: bool, ignored if new_field_name is None; if True, the new field will be used as input. (2) is_target: bool, ignored if new_field_name is None; if True, the new field will be used as target.

Return results:

if new_field_name is not passed, the values returned by the function over all instances.
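For example, as in the quickstart above:

# lowercase each raw sentence in place
dataset.apply(lambda x: x['raw_sentence'].lower(), new_field_name='raw_sentence')
# derive a new integer field and mark it as the training target
dataset.apply(lambda x: int(x['label']), new_field_name='label_seq', is_target=True)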

delete_field(name)[source]

Delete a field based on the field name.

Parameters:name – the name of the field to be deleted.
drop(func)[source]

Drop instances if a condition holds.

Parameters:func – a function that takes an Instance object as input, and returns bool. The instance will be dropped if the function returns True.
get_all_fields()[source]

Return all the fields with their names.

Return field_arrays:
 the internal data structure of DataSet.
get_input_name()[source]

Get all field names with is_input as True.

Return field_names:
 a list of str
get_length()[source]

Fetch the length of the dataset.

Return length:
get_target_name()[source]

Get all field names with is_target as True.

Return field_names:
 a list of str
static load(path)[source]

Load a DataSet object from pickle.

Parameters:path (str) – the path to the pickle
Return data_set:
 
classmethod read_csv(csv_path, headers=None, sep=', ', dropna=True)[source]

Load data from a CSV file and return a DataSet object.

Parameters:
  • csv_path (str) – path to the CSV file
  • headers (List[str] or Tuple[str]) – headers of the CSV file
  • sep (str) – delimiter in CSV file. Default: “,”
  • dropna (bool) – If True, drop rows that have fewer entries than headers.
Return dataset:

the read data set

rename_field(old_name, new_name)[source]

Rename a field.

Parameters:
  • old_name (str) –
  • new_name (str) –
save(path)[source]

Save the DataSet object as pickle.

Parameters:path (str) – the path to the pickle
set_input(*field_name, flag=True)[source]

Set the input flag of these fields.

Parameters:
  • field_name – a sequence of str, indicating field names.
  • flag (bool) – Set these fields as input if True. Unset them if False.
set_pad_val(field_name, pad_val)[source]

Set the padding value of a field.

Parameters:
  • field_name – str, the field whose pad_val is changed
  • pad_val – int, the padder of this field will use pad_val as the padding index
Returns:

set_padder(field_name, padder)[source]

Set the padder for field_name.

Parameters:
  • field_name – str, the field whose padding method is set to padder
  • padder – a PadderBase instance or None. None removes the padder, i.e. no padding is applied to this field.

set_target(*field_names, flag=True)[source]

Change the target flag of these fields.

Parameters:
  • field_names – a sequence of str, indicating field names
  • flag (bool) – Set these fields as target if True. Unset them if False.
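For example, using the field names from the tutorial above:

dataset.set_input('words', 'word_seq')   # these fields feed the model's forward
dataset.set_target('label_seq')          # this field feeds losses and metrics
dataset.set_input('words', flag=False)   # un-mark a field again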
split(dev_ratio)[source]

Split the dataset into training and development(validation) set.

Parameters:dev_ratio (float) – the ratio of the development set in all data.
Return (train_set, dev_set):
 train_set: the training set; dev_set: the development set
fastNLP.core.dataset.construct_dataset(sentences)[source]

Construct a data set from a list of sentences.

Parameters:sentences – list of list of str
Return dataset:a DataSet object
fastNLP.core.fieldarray
class fastNLP.core.fieldarray.AutoPadder(pad_val=0)[source]

Decide automatically, based on the contents, whether padding is needed.
(1) If the element type (i.e. the type of the innermost list elements of the field, which can be inspected via FieldArray.dtype; e.g. the element type of ['This', 'is', ...] is np.str, and that of [[1, 2], ...] is np.int64) is not np.int64 or np.float64, no padding is performed.
(2) If the element type is np.int64 or np.float64:
    (2.1) If each instance holds a single value in this field (e.g. a sequence_length), no padding is performed.
    (2.2) If the field content is a List, the lists within a batch are padded to the same length. If the list contains nested lists that also need padding, use another padder.

    For example, a field of [1, 2, 3] can be padded, while [[1, 2], [3, 4, ...]] cannot.
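The padding behaviour is easiest to see through Batch (a sketch; the default pad value is 0):

from fastNLP import DataSet, Instance
from fastNLP.core.batch import Batch
from fastNLP.core.sampler import SequentialSampler

ds = DataSet()
ds.append(Instance(word_seq=[1, 2, 3]))
ds.append(Instance(word_seq=[4, 5]))
ds.set_input('word_seq')

for batch_x, batch_y in Batch(ds, batch_size=2, sampler=SequentialSampler()):
    print(batch_x['word_seq'])
    # the shorter sequence is padded with 0 by the default AutoPadder:
    # tensor([[1, 2, 3],
    #         [4, 5, 0]])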

class fastNLP.core.fieldarray.EngChar2DPadder(pad_val=0, pad_length=0)[source]
Performs character-level 2D padding for English. The corresponding field content should look like [['T', 'h', 'i', 's'], ['a'], ['d', 'e', 'm', 'o']]
(written as str here for readability; in practice the entries should be character indices).
The padded batch has shape (batch_size, max_sentence_length, max_word_length), where max_sentence_length is the maximum sentence length
and max_word_length is the length of the longest word.
class fastNLP.core.fieldarray.FieldArray(name, content, is_target=None, is_input=None, padder=<fastNLP.core.fieldarray.AutoPadder object>)[source]

A FieldArray is the collection of values of one field across all Instances. It is the basic element of the DataSet class.

Parameters:
  • name (str) – the name of the FieldArray
  • content (list) – a list of int, float, str or np.ndarray, or a list of list of one, or a np.ndarray.
  • is_target (bool) – If True, this FieldArray is used to compute loss.
  • is_input (bool) – If True, this FieldArray is used to the model input.
  • padder – a PadderBase instance. In most cases this does not need to be set, unless padding over multiple dimensions is required (e.g. character-level padding for English).
append(val)[source]

Add a new item to the tail of FieldArray.

Parameters:val – int, float, str, or a list of one.
get(indices, pad=True)[source]

Fetch instances based on indices.

Parameters:
  • indices – an int, or a list of int.
  • pad – bool, whether to pad the returned result.
Returns:

set_pad_val(pad_val)[source]

Set the pad_val of the padder.

Parameters:pad_val – int, the new padding value.

set_padder(padder)[source]

Set the padding method.

Parameters:padder – a PadderBase instance or None. None removes the padder.
Returns:
class fastNLP.core.fieldarray.PadderBase(pad_val=0, **kwargs)[source]

All padders must inherit from this class and override the __call__() method. Padders pad a batch of data. The elements passed in are modified in place, so directly modifying an element may change the underlying data; it is recommended to deepcopy before modifying in place.

fastNLP.core.instance
class fastNLP.core.instance.Instance(**fields)[source]

An Instance is an example of data. Example:

ins = Instance(field_1=[1, 1, 1], field_2=[2, 2, 2])
ins["field_1"]
>>[1, 1, 1]
ins.add_field("field_3", [3, 3, 3])
Parameters:fields – a dict of (str: list).
add_field(field_name, field)[source]

Add a new field to the instance.

Parameters:field_name – str, the name of the field.
fastNLP.core.losses
class fastNLP.core.losses.BCELoss(pred=None, target=None)[source]
class fastNLP.core.losses.CrossEntropyLoss(pred=None, target=None, padding_idx=-100)[source]
class fastNLP.core.losses.L1Loss(pred=None, target=None)[source]
class fastNLP.core.losses.LossBase[source]

Base class for all losses.

class fastNLP.core.losses.LossFunc(func, key_map=None, **kwargs)[source]

A wrapper of user-provided loss function.

class fastNLP.core.losses.LossInForward(loss_key='loss')[source]
class fastNLP.core.losses.NLLLoss(pred=None, target=None)[source]
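The pred/target constructor arguments implement the key-mapping mechanism described in the tutorial above: pred names the key in the dict returned by the model's forward, and target names the target field of the DataSet. For example:

from fastNLP import CrossEntropyLoss

# forward returns {'output': ...} and the DataSet marks 'label_seq' as target
loss = CrossEntropyLoss(pred="output", target="label_seq")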
fastNLP.core.losses.make_mask(lens, tar_len)[source]

To generate a mask over a sequence.

Parameters:
  • lens – list or LongTensor, [batch_size]
  • tar_len – int
Return mask:

ByteTensor

fastNLP.core.losses.mask(predict, truth, **kwargs)[source]

To select specific elements from Tensor. This method calls squash().

Parameters:
  • predict – Tensor, [batch_size , max_len , tag_size]
  • truth – Tensor, [batch_size , max_len]
  • **kwargs

    extra arguments, kwargs[“mask”]: ByteTensor, [batch_size , max_len], the mask Tensor. The position that is 1 will be selected.

Return predict , truth:
 

predict & truth after processing

fastNLP.core.losses.squash(predict, truth, **kwargs)[source]

To reshape tensors in order to fit loss functions in PyTorch.

Parameters:
  • predict – Tensor, model output
  • truth – Tensor, truth from dataset
  • **kwargs

    extra arguments

Return predict , truth:
 

predict & truth after processing

fastNLP.core.losses.unpad(predict, truth, **kwargs)[source]

To process padded sequence output to get true loss.

Parameters:
  • predict – Tensor, [batch_size , max_len , tag_size]
  • truth – Tensor, [batch_size , max_len]
  • kwargs – kwargs[“lens”] is a list or LongTensor, with size [batch_size]. The i-th element is true lengths of i-th sequence.
Return predict , truth:
 

predict & truth after processing

fastNLP.core.losses.unpad_mask(predict, truth, **kwargs)[source]

To process padded sequence output to get true loss.

Parameters:
  • predict – Tensor, [batch_size , max_len , tag_size]
  • truth – Tensor, [batch_size , max_len]
  • kwargs – kwargs[“lens”] is a list or LongTensor, with size [batch_size]. The i-th element is true lengths of i-th sequence.
Return predict , truth:
 

predict & truth after processing

fastNLP.core.metrics
class fastNLP.core.metrics.AccuracyMetric(pred=None, target=None, seq_lens=None)[source]

Accuracy Metric

evaluate(pred, target, seq_lens=None)[source]
Parameters:
  • pred – List of (torch.Tensor, or numpy.ndarray). Element's shape can be: torch.Size([B,]), torch.Size([B, n_classes]), torch.Size([B, max_len]), torch.Size([B, max_len, n_classes])
  • target – List of (torch.Tensor, or numpy.ndarray). Element's shape can be: torch.Size([B,]), torch.Size([B,]), torch.Size([B, max_len]), torch.Size([B, max_len])
  • seq_lens – List of (torch.Tensor, or numpy.ndarray). Element's shape can be: None, None, torch.Size([B]), torch.Size([B]). Ignored if masks are provided.
get_metric(reset=True)[source]

Returns computed metric.

Parameters:reset (bool) – whether to recount next time.
Return evaluate_result:
 {“acc”: float}
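A small sketch of using the metric directly (assuming evaluate can be called with raw tensors, as its signature suggests):

import torch
from fastNLP import AccuracyMetric

metric = AccuracyMetric(pred="pred", target="target")
metric.evaluate(pred=torch.tensor([[0.9, 0.1], [0.2, 0.8]]),
                target=torch.tensor([0, 1]))
print(metric.get_metric())  # expected: {'acc': 1.0}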
class fastNLP.core.metrics.BMESF1PreRecMetric(b_idx=0, m_idx=1, e_idx=2, s_idx=3, pred=None, target=None, seq_lens=None)[source]
Compute f1, precision and recall under the BMES tagging scheme. Since illegal tag sequences such as "BS" may occur, the following table is used for conversion, where cur_B means the current tag is B and next_B means the next tag is B; cur_B=S means the current tag predicted as B is relabeled as S, and next_M=B means the next tag predicted as M is relabeled as B.

|       | next_B  | next_M   | next_E   | next_S  | end     |
|:-----:|:-------:|:--------:|:--------:|:-------:|:-------:|
| start | legal   | next_M=B | next_E=S | legal   | -       |
| cur_B | cur_B=S | legal    | legal    | cur_B=S | cur_B=S |
| cur_M | cur_M=E | legal    | legal    | cur_M=E | cur_M=E |
| cur_E | legal   | next_M=B | next_E=S | legal   | legal   |
| cur_S | legal   | next_M=B | next_E=S | legal   | legal   |

For example, the prediction BSEMS will be treated as SSSSS.
This metric does not check the validity of the target; make sure the target is valid.
pred should have shape (batch_size, max_len) or (batch_size, max_len, 4); target has shape (batch_size, max_len); seq_lens has shape (batch_size, ).
class fastNLP.core.metrics.MetricBase[source]

Base class for all metrics.

MetricBase handles validity check of its input dictionaries - pred_dict and target_dict. pred_dict is the output of forward() or prediction function of a model. target_dict is the ground truth from DataSet where is_target is set True. MetricBase will do the following type checks:

  1. whether self.evaluate has varargs, which is not supported.
  2. whether the params needed by self.evaluate are missing from pred_dict and target_dict.
  3. whether the params needed by self.evaluate are duplicated in pred_dict and target_dict.
  4. whether params in pred_dict and target_dict are unused by evaluate (might cause a warning).

Besides, before passing params into self.evaluate, this function will filter out params from output_dict and target_dict which are not used in self.evaluate (but if **kwargs is present in self.evaluate, no filtering will be conducted). However, in some cases where type checking is not necessary, _fast_param_map will be used.

class fastNLP.core.metrics.SpanFPreRecMetric(tag_vocab, pred=None, target=None, seq_lens=None, encoding_type='bio', ignore_labels=None, only_gross=True, f_type='micro', beta=1)[source]

In sequence labeling tasks, compute F, precision and recall at the span level. The resulting metric is

{'f': xxx, 'pre': xxx, 'rec': xxx}

('f' is used as the key so that f_beta values can be supported later.) If only_gross=False, per-label statistics are also returned:

{'f': xxx, 'pre': xxx, 'rec': xxx, 'f-label': xxx, 'pre-label': xxx, 'rec-label': xxx, ...}

evaluate(pred, target, seq_lens)[source]

Much of the design comes from allennlp's measures.

Parameters:
  • pred
  • target
  • seq_lens

fastNLP.core.metrics.accuracy_topk(y_true, y_prob, k=1)[source]

Compute accuracy of y_true matching top-k probable labels in y_prob.

Parameters:
  • y_true – ndarray, true label, [n_samples]
  • y_prob – ndarray, label probabilities, [n_samples, n_classes]
  • k – int, k in top-k
Returns acc:

accuracy of top-k
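A small usage sketch (expected values follow from the definition above):

import numpy as np
from fastNLP.core.metrics import accuracy_topk

y_true = np.array([0, 1, 2])
y_prob = np.array([[0.7, 0.2, 0.1],   # top-1 is 0: correct
                   [0.1, 0.3, 0.6],   # top-1 is 2; top-2 is {2, 1}: correct at k=2
                   [0.5, 0.3, 0.2]])  # true label 2 is not in the top-2
print(accuracy_topk(y_true, y_prob, k=1))  # expected: 1/3
print(accuracy_topk(y_true, y_prob, k=2))  # expected: 2/3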

fastNLP.core.metrics.bio_tag_to_spans(tags, ignore_labels=None)[source]
Parameters:
  • tags – List[str],
  • ignore_labels – List[str]; labels in this list are ignored
Returns:

List[Tuple[str, List[int, int]]]. [(label,[start, end])]

fastNLP.core.metrics.bmes_tag_to_spans(tags, ignore_labels=None)[source]
Parameters:
  • tags – List[str],
  • ignore_labels – List[str]; labels in this list are ignored
Returns:

List[Tuple[str, List[int, int]]]. [(label,[start, end])]

fastNLP.core.metrics.pred_topk(y_prob, k=1)[source]

Return top-k predicted labels and corresponding probabilities.

Parameters:
  • y_prob – ndarray, size [n_samples, n_classes], probabilities on labels
  • k – int, k of top-k
Returns (y_pred_topk, y_prob_topk):
 

y_pred_topk: ndarray, size [n_samples, k], predicted top-k labels y_prob_topk: ndarray, size [n_samples, k], probabilities for top-k labels

fastNLP.core.optimizer
class fastNLP.core.optimizer.Adam(lr=0.001, weight_decay=0, betas=(0.9, 0.999), eps=1e-08, amsgrad=False, model_params=None)[source]
Parameters:
  • lr (float) – learning rate
  • weight_decay (float) –
  • model_params – a generator. E.g. model.parameters() for PyTorch models.
class fastNLP.core.optimizer.Optimizer(model_params, **kwargs)[source]
Parameters:
  • model_params – a generator. E.g. model.parameters() for PyTorch models.
  • kwargs – additional parameters.
class fastNLP.core.optimizer.SGD(lr=0.001, momentum=0, model_params=None)[source]
Parameters:
  • lr (float) – learning rate. Default: 0.001
  • momentum (float) – momentum. Default: 0
  • model_params – a generator. E.g. model.parameters() for PyTorch models.
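Construction follows the signatures above; model here stands for any PyTorch model (whether the optimizer is passed to the Trainer through an optimizer argument depends on the fastNLP version, so the commented line is an assumption):

from fastNLP.core.optimizer import SGD

optimizer = SGD(lr=0.01, momentum=0.9, model_params=model.parameters())
# trainer = Trainer(..., optimizer=optimizer)  # assumed Trainer keyword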
fastNLP.core.predictor
class fastNLP.core.predictor.Predictor(network)[source]

An interface for predicting outputs based on trained models.

It does not care about evaluations of the model, which is different from Tester. This is a high-level model wrapper to be called by FastNLP. This class does not share any operations with Trainer and Tester. Currently, Predictor does not support GPU.

predict(data, seq_len_field_name=None)[source]

Perform inference using the trained model.

Parameters:
  • data – a DataSet object.
  • seq_len_field_name (str) – field name indicating sequence lengths
Returns:

list of batch outputs

fastNLP.core.sampler
class fastNLP.core.sampler.BaseSampler[source]

The base class of all samplers.

Sub-classes must implement the __call__ method. __call__ takes a DataSet object and returns a list of int - the sampling indices.

class fastNLP.core.sampler.BucketSampler(num_buckets=10, batch_size=32, seq_lens_field_name='seq_lens')[source]
Parameters:
  • num_buckets (int) – the number of buckets to use.
  • batch_size (int) – the size of each batch.
  • seq_lens_field_name (str) – the field name indicating the field about sequence length.
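Samplers plug into Batch; for example (a sketch, assuming train_data is a previously built DataSet with a 'seq_lens' field, as the default seq_lens_field_name expects):

from fastNLP.core.batch import Batch
from fastNLP.core.sampler import BucketSampler

sampler = BucketSampler(num_buckets=5, batch_size=16, seq_lens_field_name='seq_lens')
for batch_x, batch_y in Batch(train_data, batch_size=16, sampler=sampler):
    pass  # instances within a batch now have similar lengths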
class fastNLP.core.sampler.RandomSampler[source]

Sample data in random permutation order.

class fastNLP.core.sampler.SequentialSampler[source]

Sample data in the original order.

fastNLP.core.sampler.convert_to_torch_tensor(data_list, use_cuda)[source]

Convert lists into (cuda) Tensors.

Parameters:
  • data_list – 2-level lists
  • use_cuda – bool, whether to use GPU or not
Return data_list:
 

PyTorch Tensor of shape [batch_size, max_seq_len]

fastNLP.core.sampler.k_means_1d(x, k, max_iter=100)[source]

Perform k-means on 1-D data.

Parameters:
  • x – list of int, representing points in 1-D.
  • k – the number of clusters required.
  • max_iter – maximum iteration
Return (centroids, assignment):
 

centroids: numpy array, the centroids of the k clusters; assignment: numpy array, 1-D, the bucket id assigned to each example.

fastNLP.core.sampler.k_means_bucketing(lengths, buckets)[source]

Assign all instances into possible buckets using k-means, such that instances in the same bucket have similar lengths.

Parameters:
  • lengths – list of int, the length of all samples.
  • buckets – list of int. The length of the list is the number of buckets. Each integer is the maximum length threshold for each bucket (This is usually None.).
Return data:

2-level list

[
    [index_11, index_12, ...],  # bucket 1
    [index_21, index_22, ...],  # bucket 2
    ...
]
fastNLP.core.sampler.simple_sort_bucketing(lengths)[source]
Parameters:

lengths – list of int, the lengths of all examples.

Return data:

2-level list

[
    [index_11, index_12, ...],  # bucket 1
    [index_21, index_22, ...],  # bucket 2
    ...
]
fastNLP.core.tester
class fastNLP.core.tester.Tester(data, model, metrics, batch_size=16, use_cuda=False, verbose=1)[source]

A combination of model inference and performance evaluation, used on validation/development sets and test sets.

Parameters:
  • data (DataSet) – a validation/development set
  • model (torch.nn.modules.module) – a PyTorch model
  • metrics (MetricBase) – a metric object or a list of metrics (List[MetricBase])
  • batch_size (int) – batch size for validation
  • use_cuda (bool) – whether to use CUDA in validation.
  • verbose (int) – verbosity of the printed information.
test()[source]

Start test or validation.

Return eval_results:
 a dictionary whose keys are the class name of metrics to use, values are the evaluation results of these metrics.
fastNLP.core.trainer
fastNLP.core.utils
exception fastNLP.core.utils.CheckError(check_res: fastNLP.core.utils.CheckRes, func_signature: str)[source]

CheckError. Used in losses.LossBase, metrics.MetricBase.

class fastNLP.core.utils.CheckRes(missing, unused, duplicated, required, all_needed, varargs)
all_needed

Alias for field number 4

duplicated

Alias for field number 2

missing

Alias for field number 0

required

Alias for field number 3

unused

Alias for field number 1

varargs

Alias for field number 5

fastNLP.core.utils.get_func_signature(func)[source]

Given a function or method, return its signature. For example:

(1) function

    def func(a, b='a', *args):
        xxxx

    get_func_signature(func)  # 'func(a, b='a', *args)'

(2) method

    class Demo:
        def __init__(self):
            xxx

        def forward(self, a, b='a', **args):
            xxxx

    demo = Demo()
    get_func_signature(demo.forward)  # 'Demo.forward(self, a, b='a', **args)'

Parameters:func – a function or a method
Returns:str or None
fastNLP.core.utils.load_pickle(pickle_path, file_name)[source]

Load an object from a given pickle file.

Parameters:
  • pickle_path – str, the directory where the pickle file is.
  • file_name – str, the name of the pickle file.
Return obj:

an object stored in the pickle

fastNLP.core.utils.pickle_exist(pickle_path, pickle_name)[source]

Check if a given pickle file exists in the directory.

Parameters:
  • pickle_path – the directory of target pickle file
  • pickle_name – the filename of target pickle file
Returns:

True if file exists else False

class fastNLP.core.utils.pseudo_tqdm(**kwargs)[source]

Used to print progress information when tqdm cannot be imported, or when use_tqdm is set to False in the Trainer.

fastNLP.core.utils.save_pickle(obj, pickle_path, file_name)[source]

Save an object into a pickle file.

Parameters:
  • obj – an object
  • pickle_path – str, the directory where the pickle file is to be saved
  • file_name – str, the name of the pickle file. In general, it should end with ".pkl".
fastNLP.core.utils.seq_lens_to_masks(seq_lens, float=False)[source]

Convert seq_lens to masks.

Parameters:
  • seq_lens – list, np.ndarray, or torch.LongTensor; the shape should be (B,)
  • float – bool; if True, the returned masks are of float type, otherwise byte
Returns:list, np.ndarray or torch.Tensor; the shape will be (B, max_length)

fastNLP.core.utils.seq_mask(seq_len, max_len)[source]

Create sequence mask.

Parameters:
  • seq_len – list or torch.Tensor, the lengths of sequences in a batch.
  • max_len – int, the maximum sequence length in a batch.
Return mask:

torch.LongTensor, [batch_size, max_len]
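For example (a sketch; per the description, positions within each sequence length are 1 and the rest are 0):

import torch
from fastNLP.core.utils import seq_mask

mask = seq_mask(torch.LongTensor([2, 4]), max_len=4)
print(mask)
# expected:
# tensor([[1, 1, 0, 0],
#         [1, 1, 1, 1]])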

fastNLP.core.vocabulary
class fastNLP.core.vocabulary.Vocabulary(max_size=None, min_freq=None, unknown='<unk>', padding='<pad>')[source]

A one-to-one mapping between words and indices.

Example:

vocab = Vocabulary()
word_list = "this is a word list".split()
vocab.update(word_list)
vocab["word"]
vocab.to_word(5)
Parameters:
  • max_size (int) – set the max number of words in Vocabulary. Default: None
  • min_freq (int) – set the min occur frequency of words in Vocabulary. Default: None
build_reverse_vocab()[source]

Build “index to word” dict based on “word to index” dict.

build_vocab()[source]

Build a mapping from word to index, and filter the word using max_size and min_freq.

to_index(w)[source]

Turn a word to an index. If w is not in Vocabulary, return the unknown label.

Parameters:w (str) – a word
fastNLP.core.vocabulary.check_build_status(func)[source]

A decorator to check whether the vocabulary updates after the last build.

fastNLP.core.vocabulary.check_build_vocab(func)[source]

A decorator to make sure the indexing is built before used.

fastNLP.io

fastNLP.io.base_loader
class fastNLP.io.base_loader.BaseLoader[source]

Base loader for all loaders.

classmethod load(data_path)[source]

Read the file line by line, strip the whitespace on both sides of each line, and then extract the characters of each line. Returns a list of list of str.

static load_lines(data_path)[source]

Read line by line, stripping the whitespace on both sides of each line. Returns a list of str.

classmethod load_with_cache(data_path, cache_path)[source]

A cached version of load.

class fastNLP.io.base_loader.DataLoaderRegister[source]

A registry for all dataset loaders.

fastNLP.io.config_io
class fastNLP.io.config_io.ConfigLoader(data_path=None)[source]

Loader for configuration.

Parameters:data_path (str) – path to the config
static load_config(file_path, sections)[source]

Load section(s) of configuration into the sections provided. No returns.

Parameters:
  • file_path (str) – the path of config file
  • sections (dict) – the dict of {section_name(string): ConfigSection object}

Example:

test_args = ConfigSection()
ConfigLoader("config.cfg").load_config("./data_for_tests/config", {"POS_test": test_args})
class fastNLP.io.config_io.ConfigSaver(file_path)[source]

ConfigSaver is used to save config files and resolve related conflicts.

Parameters:file_path (str) – path to the config file
save_config_file(section_name, section)[source]

This is the function to be called to change the config file with a single section and its name.

Parameters:
  • section_name (str) – The name of the section that needs to be changed and saved.
  • section (ConfigSection) – The section, with its keys and values, that needs to be changed and saved.
class fastNLP.io.config_io.ConfigSection[source]

ConfigSection is the data structure storing all key-value pairs in one section in a config file.

fastNLP.io.dataset_loader
class fastNLP.io.dataset_loader.Conll2003Loader[source]

Loader for the CoNLL-2003 dataset

More information about the dataset can be found at https://sites.google.com/site/ermasoftware/getting-started/ne-tagging-conll2003-data

convert(parsed_data)[source]

Optional operation to build a DataSet.

Parameters:data – inner data structure (user-defined) to represent the data.
Returns:a DataSet object
load(dataset_path)[source]

Load data from a given file.

Parameters:path (str) – file path
Returns:a DataSet object
class fastNLP.io.dataset_loader.ConllLoader[source]

Loader for CoNLL-format files

convert(data)[source]

Optional operation to build a DataSet.

Parameters:data – inner data structure (user-defined) to represent the data.
Returns:a DataSet object
load(data_path)[source]

Load data from a given file.

Parameters:path (str) – file path
Returns:a DataSet object
static parse(lines)[source]
Parameters:lines (list) – a list containing all lines in a conll file.
Returns:a 3D list
class fastNLP.io.dataset_loader.ConllxDataLoader[source]

Returns word-level annotations, including the word, its POS tag, its (syntactic) head, and the (syntactic) dependency label. Completely different from ZhConllPOSReader.

class fastNLP.io.dataset_loader.DataSetLoader[source]

Interface for all DataSetLoaders.

convert(data)[source]

Optional operation to build a DataSet.

Parameters:data – inner data structure (user-defined) to represent the data.
Returns:a DataSet object
load(path)[source]

Load data from a given file.

Parameters:path (str) – file path
Returns:a DataSet object
class fastNLP.io.dataset_loader.DummyCWSReader[source]

Load pku dataset for Chinese word segmentation.

convert(data)[source]

Optional operation to build a DataSet.

Parameters:data – inner data structure (user-defined) to represent the data.
Returns:a DataSet object
load(data_path, max_seq_len=32)[source]

Load the pku dataset for Chinese word segmentation. The pku CWS (Chinese Word Segmentation) training dataset format is: 1. each line is a sentence; 2. words within a sentence are separated by spaces. This function converts the pku dataset into three-level lists with <BMES> labels, where B: beginning of a word, M: middle of a word, E: end of a word, S: single character.

Parameters:
  • data_path (str) – path to the data set.
  • max_seq_len – int, the maximum length of a sequence. If a sequence is longer than it, split it into several sequences.
Returns:

three-level lists

class fastNLP.io.dataset_loader.DummyClassificationReader[source]

Loader for a dummy classification data set

convert(data)[source]

Optional operation to build a DataSet.

Parameters:data – inner data structure (user-defined) to represent the data.
Returns:a DataSet object
load(data_path)[source]

Load data from a given file.

Parameters:path (str) – file path
Returns:a DataSet object
static parse(lines)[source]

The first token of each line is the label; the remaining tokens are characters/words, separated by spaces.

Parameters:lines – lines from dataset
Returns:list(list(list())): the three levels of lists are words, sentences, and the dataset
class fastNLP.io.dataset_loader.DummyLMReader[source]

A Dummy Language Model Dataset Reader

convert(data)[source]

Optional operation to build a DataSet.

Parameters:data – inner data structure (user-defined) to represent the data.
Returns:a DataSet object
load(data_path)[source]

Load data from a given file.

Parameters:path (str) – file path
Returns:a DataSet object
class fastNLP.io.dataset_loader.DummyPOSReader[source]

A simple reader for a dummy POS tagging dataset.

In these datasets, each line is divided by a space character. The first column is the word and the second column is the label. Different sentences are divided by an empty line.

E.g:

Tom label1
and label2
Jerry   label1
.   label3
(separated by an empty line)
Hello   label4
world   label5
!   label3

In this example, there are two sentences “Tom and Jerry .” and “Hello world !”. Each word has its own label.

convert(data)[source]

Convert lists of strings into Instances with Fields.

load(data_path)[source]
Return data:

three-level list Example:

[
    [ [word_11, word_12, ...], [label_1, label_1, ...] ],
    [ [word_21, word_22, ...], [label_2, label_1, ...] ],
    ...
]
class fastNLP.io.dataset_loader.NaiveCWSReader(in_word_splitter=None)[source]

This reader assumes the word segmentation dataset has the following form, i.e. the content has already been segmented by spaces, e.g.:

这是 fastNLP , 一个 非常 good   .

or, with a POS tag following each part, e.g.:

也/D  在/P  團員/Na  之中/Ng  ,/COMMACATEGORY
load(filepath, in_word_splitter=None, cut_long_sent=False)[source]
Allowed input formats (by default, tab or space is used as the segment separator):
这是 fastNLP , 一个 非常 good 的 包 .
也/D 在/P 團員/Na 之中/Ng ,/COMMACATEGORY

If in_word_splitter is not None, the second format is assumed, and each token like "也/D" is split by the splitter, keeping the first part, e.g. "也/D".split('/')[0].

Parameters:
  • filepath
  • in_word_splitter
  • cut_long_sent
Returns:

class fastNLP.io.dataset_loader.NativeDataSetLoader[source]

A simple example of DataSetLoader

load(path)[source]

Load data from a given file.

Parameters:path (str) – file path
Returns:a DataSet object
class fastNLP.io.dataset_loader.PeopleDailyCorpusLoader[source]

Loader for the People's Daily (人民日报) corpus

convert(data)[source]

Optional operation to build a DataSet.

Parameters:data – inner data structure (user-defined) to represent the data.
Returns:a DataSet object
load(data_path, pos=True, ner=True)[source]
Parameters:
  • data_path (str) – path to the data
  • pos (bool) – whether to include part-of-speech tags
  • ner (bool) – whether to include named-entity tags
Returns:

a DataSet object

class fastNLP.io.dataset_loader.RawDataSetLoader[source]

A simple example of raw data reader

convert(data)[source]

Optional operation to build a DataSet.

Parameters:data – inner data structure (user-defined) to represent the data.
Returns:a DataSet object
load(data_path, split=None)[source]

Load data from a given file.

Parameters:path (str) – file path
Returns:a DataSet object
class fastNLP.io.dataset_loader.SNLIDataSetReader[source]

A data set loader for SNLI data set.

convert(data)[source]

Convert a 3D list to a DataSet object.

Parameters:data

A 3D tensor. Example:

[
    [ [premise_word_11, premise_word_12, ...], [hypothesis_word_11, hypothesis_word_12, ...], [label_1] ],
    [ [premise_word_21, premise_word_22, ...], [hypothesis_word_21, hypothesis_word_22, ...], [label_2] ],
    ...
]
Returns:A DataSet object.
load(path_list)[source]
Parameters:path_list (list) – A list of file name, in the order of premise file, hypothesis file, and label file.
Returns:A DataSet object.
class fastNLP.io.dataset_loader.ZhConllPOSReader[source]

Reads Chinese CoNLL format. Returns character-level labels, extending the original word-level labels with the BMES scheme.

load(path)[source]
The returned DataSet contains the following fields:
words: list of str; tag: list of str with BMES tags added, e.g. the original sequence ['VP', 'NN', 'NN', ...] becomes ['S-VP', 'B-NN', 'M-NN', ...]

The input is assumed to be in CoNLL format, with sentences separated by an empty line and 7 columns per line, i.e.

1   编者按     编者按     NN      O       11      nmod:topic
2   :       :       PU      O       11      punct
3   7月      7月      NT      DATE    4       compound:nn
4   12日     12日     NT      DATE    11      nmod:tmod
5   ,       ,       PU      O       11      punct

1   这       这       DT      O       3       det
2   款       款       M       O       1       mark:clf
3   飞行      飞行      NN      O       8       nsubj
4   从       从       P       O       5       case
5   外型      外型      NN      O       8       nmod:prep
fastNLP.io.dataset_loader.add_seg_tag(data)[source]
Parameters:data – list of ([word], [pos], [heads], [head_tags])
Returns:list of ([word], [pos])
fastNLP.io.dataset_loader.convert_seq2seq_dataset(data)[source]

Convert list of data into DataSet.

Parameters:data

list of list of strings, [num_examples, *]. Example:

[
    [ [word_11, word_12, ...], [label_1, label_1, ...] ],
    [ [word_21, word_22, ...], [label_2, label_1, ...] ],
    ...
]
Returns:a DataSet.
fastNLP.io.dataset_loader.convert_seq2tag_dataset(data)[source]

Convert list of data into DataSet.

Parameters:data

list of list of strings, [num_examples, *]. Example:

[
    [ [word_11, word_12, ...], label_1 ],
    [ [word_21, word_22, ...], label_2 ],
    ...
]
Returns:a DataSet.
fastNLP.io.dataset_loader.convert_seq_dataset(data)[source]

Create a DataSet instance that contains no labels.

Parameters:data

list of list of strings, [num_examples, *]. Example:

[
    [word_11, word_12, ...],
    ...
]
Returns:a DataSet.
fastNLP.io.dataset_loader.cut_long_sentence(sent, max_sample_length=200)[source]

Split a sentence longer than max_sample_length into several segments. Splitting happens only at spaces, so the resulting segments may be longer or shorter than max_sample_length.

Parameters:
  • sent – str.
  • max_sample_length – int.
Returns:

list of str.

fastNLP.io.embed_loader
class fastNLP.io.embed_loader.EmbedLoader[source]

Loader for pre-trained embeddings.

static fast_load_embedding(emb_dim, emb_file, vocab)[source]

Fast load the pre-trained embedding and combine with the given dictionary. This loading method uses line-by-line operation.

Parameters:
  • emb_dim (int) – the dimension of the embedding. Should be the same as pre-trained embedding.
  • emb_file (str) – the pre-trained embedding file path.
  • vocab (Vocabulary) – a mapping from word to index, can be provided by user or built from pre-trained embedding
Return embedding_matrix:
 

numpy.ndarray

static load_embedding(emb_dim, emb_file, emb_type, vocab)[source]

Load the pre-trained embedding and combine with the given dictionary.

Parameters:
  • emb_dim (int) – the dimension of the embedding. Should be the same as pre-trained embedding.
  • emb_file (str) – the pre-trained embedding file path.
  • emb_type (str) – the pre-trained embedding format; currently only glove is supported
  • vocab (Vocabulary) – a mapping from word to index, can be provided by user or built from pre-trained embedding
Return (embedding_tensor, vocab):
 

embedding_tensor - Tensor of shape (len(word_dict), emb_dim); vocab - input vocab or vocab built by pre-train

fastNLP.io.logger
fastNLP.io.logger.create_logger(logger_name, log_path, log_format=None, log_level=20)[source]

Create a logger.

Parameters:
  • logger_name (str) –
  • log_path (str) –
  • log_format
  • log_level
Returns:

logger

To use a logger:

logger.debug("this is a debug message")
logger.info("this is a info message")
logger.warning("this is a warning message")
logger.error("this is an error message")
fastNLP.io.model_io
class fastNLP.io.model_io.ModelLoader[source]

Loader for models.

static load_pytorch(empty_model, model_path)[source]

Load model parameters from “.pkl” files into the empty PyTorch model.

Parameters:
  • empty_model – a PyTorch model with initialized parameters.
  • model_path (str) – the path to the saved model.
static load_pytorch_model(model_path)[source]

Load the entire model.

Parameters:model_path (str) – the path to the saved model.
class fastNLP.io.model_io.ModelSaver(save_path)[source]

Save a model

Parameters:save_path (str) – the path where the model will be saved.

Example:

saver = ModelSaver("./save/model_ckpt_100.pkl")
saver.save_pytorch(model)
save_pytorch(model, param_only=True)[source]

Save a pytorch model into “.pkl” file.

Parameters:
  • model – a PyTorch model
  • param_only (bool) – whether only to save the model parameters or the entire model.

fastNLP.models

fastNLP.models.base_model
class fastNLP.models.base_model.BaseModel[source]

Base PyTorch model for all models.

class fastNLP.models.base_model.NaiveClassifier(in_feature_dim, out_feature_dim)[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fastNLP.models.biaffine_parser
class fastNLP.models.biaffine_parser.ArcBiaffine(hidden_size, bias=True)[source]

Helper module for the Biaffine Dependency Parser, predicting arcs

forward(head, dep)[source]

Parameters:
  • head – arc-head tensor, [batch, length, emb_dim]
  • dep – arc-dependent tensor, [batch, length, emb_dim]

Return output:tensor, [batch, length, length]

class fastNLP.models.biaffine_parser.BiaffineParser(word_vocab_size, word_emb_dim, pos_vocab_size, pos_emb_dim, num_label, rnn_layers=1, rnn_hidden_size=200, arc_mlp_size=100, label_mlp_size=100, dropout=0.3, encoder='lstm', use_greedy_infer=False)[source]

Biaffine Dependency Parser implementation. Refer to "Deep Biaffine Attention for Neural Dependency Parsing" (Dozat and Manning, 2016), https://arxiv.org/abs/1611.01734 .

forward(word_seq, pos_seq, seq_lens, gold_heads=None)[source]
Parameters:
  • word_seq – [batch_size, seq_len] sequence of word’s indices
  • pos_seq – [batch_size, seq_len] sequence of POS tag indices
  • seq_lens – [batch_size, seq_len] sequence of length masks
  • gold_heads – [batch_size, seq_len] sequence of golden heads
Return dict:

parsing results: arc_pred: [batch_size, seq_len, seq_len]; label_pred: [batch_size, seq_len, seq_len]; mask: [batch_size, seq_len]; head_pred: [batch_size, seq_len], the predicted heads, returned only when gold_heads is not provided.

static loss(arc_pred, label_pred, arc_true, label_true, mask)[source]

Compute loss.

Parameters:
  • arc_pred – [batch_size, seq_len, seq_len]
  • label_pred – [batch_size, seq_len, n_tags]
  • arc_true – [batch_size, seq_len]
  • label_true – [batch_size, seq_len]
  • mask – [batch_size, seq_len]
Returns:

loss value

predict(word_seq, pos_seq, seq_lens)[source]
Parameters:
  • word_seq
  • pos_seq
  • seq_lens
Returns:

arc_pred: [B, L] label_pred: [B, L]

class fastNLP.models.biaffine_parser.GraphParser[source]

Graph-based parser helper class; supports greedy decoding and MST (Maximum Spanning Tree) decoding

class fastNLP.models.biaffine_parser.LabelBilinear(in1_features, in2_features, num_label, bias=True)[source]

Helper module for the Biaffine Dependency Parser, predicting labels

forward(x1, x2)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.models.biaffine_parser.ParserLoss(arc_pred=None, label_pred=None, arc_true=None, label_true=None)[source]
class fastNLP.models.biaffine_parser.ParserMetric(arc_pred=None, label_pred=None, arc_true=None, label_true=None, seq_lens=None)[source]
evaluate(arc_pred, label_pred, arc_true, label_true, seq_lens=None)[source]

Evaluate the performance of prediction.

fastNLP.models.biaffine_parser.mst(scores)[source]

MST decoding with some modifications to support parser output; adapted from https://github.com/tdozat/Parser/blob/0739216129cd39d69997d28cbc4133b360ea3934/lib/models/nn.py#L692

fastNLP.models.char_language_model
class fastNLP.models.char_language_model.CharLM(char_emb_dim, word_emb_dim, vocab_size, num_char)[source]

CNN + highway network + LSTM

Input: 4D tensor with shape [batch_size, in_channel, height, width]
Output: 2D tensor with shape [batch_size, vocab_size]
Arguments:
  char_emb_dim: the dimension of each character embedding
  word_emb_dim: the dimension of each word embedding
  vocab_size: number of unique words
  num_char: number of characters
  use_gpu: True or False
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.models.char_language_model.Highway(input_size)[source]

Highway network

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fastNLP.models.cnn_text_classification
class fastNLP.models.cnn_text_classification.CNNText(embed_num, embed_dim, num_classes, kernel_nums=(3, 4, 5), kernel_sizes=(3, 4, 5), padding=0, dropout=0.5)[source]

Text classification model based on CNN, an implementation of the paper 'Yoon Kim. 2014. Convolutional Neural Networks for Sentence Classification.'

forward(word_seq)[source]
Parameters:word_seq – torch.LongTensor, [batch_size, seq_len]
Return output:dict of torch.LongTensor, [batch_size, num_classes]
predict(word_seq)[source]
Parameters:word_seq – torch.LongTensor, [batch_size, seq_len]
Return predict:dict of torch.LongTensor, [batch_size, seq_len]
fastNLP.models.sequence_modeling
class fastNLP.models.sequence_modeling.AdvSeqLabel(args, emb=None, id2words=None)[source]

Advanced Sequence Labeling Model

forward(word_seq, word_seq_origin_len, truth=None)[source]
Parameters:
  • word_seq – LongTensor, [batch_size, max_len]
  • word_seq_origin_len – LongTensor, [batch_size, ]
  • truth – LongTensor, [batch_size, max_len]
Return y:

If truth is None, return list of [decode path(list)]. Used in testing and predicting. If truth is not None, return loss, a scalar. Used in training.

loss(**kwargs)[source]

Since the loss has been computed in forward(), this function simply returns x.

class fastNLP.models.sequence_modeling.SeqLabeling(args)[source]

PyTorch Network for sequence labeling

decode(x, pad=True)[source]
Parameters:
  • x – FloatTensor, [batch_size, max_len, tag_size]
  • pad – pad the output sequence to equal lengths
Return prediction:
 

list of [decode path(list)]

forward(word_seq, word_seq_origin_len, truth=None)[source]
Parameters:
  • word_seq – LongTensor, [batch_size, max_len]
  • word_seq_origin_len – LongTensor, [batch_size,], the origin lengths of the sequences.
  • truth – LongTensor, [batch_size, max_len]
Return y:

If truth is None, return list of [decode path(list)]. Used in testing and predicting. If truth is not None, return loss, a scalar. Used in training.

loss(x, y)[source]

Since the loss has been computed in forward(), this function simply returns x.

fastNLP.models.snli
class fastNLP.models.snli.ESIM(**kwargs)[source]

PyTorch Network for SNLI task using ESIM model.

forward(premise, hypothesis, premise_len, hypothesis_len)[source]

Forward function

Parameters:
  • premise – A Tensor represents premise: [batch size(B), premise seq len(PL)].
  • hypothesis – A Tensor represents hypothesis: [B, hypothesis seq len(HL)].
  • premise_len – A Tensor recording which positions are real words and which are padding in the premise: [B, PL].
  • hypothesis_len – A Tensor recording which positions are real words and which are padding in the hypothesis: [B, HL].
Returns:

prediction: A Dict with Tensor of classification result: [B, n_labels(N)].

fastNLP.modules

fastNLP.modules.aggregator
fastNLP.modules.aggregator.attention
class fastNLP.modules.aggregator.attention.Attention(normalize=False)[source]
forward(query, memory, mask)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.aggregator.attention.Bi_Attention[source]
forward(in_x1, in_x2, x1_len, x2_len)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.aggregator.attention.DotAtte(key_size, value_size, dropout=0.1)[source]
forward(Q, K, V, mask_out=None)[source]
Parameters:
  • Q – [batch, seq_len, key_size]
  • K – [batch, seq_len, key_size]
  • V – [batch, seq_len, value_size]
  • mask_out – [batch, seq_len]
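
A minimal sketch of scaled dot attention per the shapes above, assuming the import path shown in this section and that the output keeps the [batch, seq_len, value_size] layout:

import torch
from fastNLP.modules.aggregator.attention import DotAtte

atte = DotAtte(key_size=64, value_size=64)
Q = torch.rand(2, 10, 64)  # [batch, seq_len, key_size]
K = torch.rand(2, 10, 64)
V = torch.rand(2, 10, 64)
out = atte(Q, K, V)        # assumed shape: [2, 10, 64]
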
class fastNLP.modules.aggregator.attention.MultiHeadAtte(input_size, key_size, value_size, num_head, dropout=0.1)[source]
forward(Q, K, V, atte_mask_out=None)[source]
Parameters:
  • Q – [batch, seq_len, model_size]
  • K – [batch, seq_len, model_size]
  • V – [batch, seq_len, model_size]
  • atte_mask_out – [batch, seq_len]
fastNLP.modules.aggregator.avg_pool
class fastNLP.modules.aggregator.avg_pool.AvgPool(stride=None, padding=0)[source]

1-d average pooling module.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.aggregator.avg_pool.MeanPoolWithMask[source]
forward(tensor, mask, dim=0)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fastNLP.modules.aggregator.kmax_pool
class fastNLP.modules.aggregator.kmax_pool.KMaxPool(k=1)[source]

K max-pooling module.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fastNLP.modules.aggregator.max_pool
class fastNLP.modules.aggregator.max_pool.MaxPool(stride=None, padding=0, dilation=1)[source]

1-d max-pooling module.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.aggregator.max_pool.MaxPoolWithMask[source]
forward(tensor, mask, dim=0)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
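
A hedged sketch of masked max-pooling; that dim=1 pools over the sequence dimension and that masked positions are excluded are assumptions, not documented behavior:

import torch
from fastNLP.modules.aggregator.max_pool import MaxPoolWithMask

pool = MaxPoolWithMask()
x = torch.rand(4, 20, 50)                    # [batch, seq_len, hidden]
mask = torch.ones(4, 20, dtype=torch.uint8)
mask[:, 15:] = 0                             # mark the last 5 positions as padding
out = pool(x, mask, dim=1)                   # assumption: pools over seq_len -> [4, 50]
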

fastNLP.modules.aggregator.self_attention
class fastNLP.modules.aggregator.self_attention.SelfAttention(input_size, attention_unit=350, attention_hops=10, drop=0.5, initial_method=None, use_cuda=False)[source]

Self Attention Module.

Parameters:
  • input_size (int) –
  • attention_unit (int) –
  • attention_hops (int) –
  • drop (float) –
  • initial_method (str) –
  • use_cuda (bool) –
forward(input, input_origin)[source]
Parameters:
  • input – the matrix to apply attention to. [batch_size, seq_len, h_dim]
  • input_origin – the original token indices, including the pad token (0). [batch_size, seq_len]
Return output1: the input matrix after the attention operation. [batch_size, attention_hops, h_dim]
Return output2: the attention penalty term, a scalar [1]

penalization(attention)[source]

Compute the penalization term for the attention module.
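
A minimal sketch following the signatures above; the penalty returned as the second output would typically be added to the training loss:

import torch
from fastNLP.modules.aggregator.self_attention import SelfAttention

atte = SelfAttention(input_size=100, attention_unit=64, attention_hops=4)
x = torch.rand(2, 30, 100)               # [batch_size, seq_len, h_dim]
tokens = torch.randint(1, 500, (2, 30))  # token indices; 0 is the pad token
out, penalty = atte(x, tokens)           # out: [2, 4, 100], penalty: scalar
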

fastNLP.modules.decoder
fastNLP.modules.decoder.CRF
class fastNLP.modules.decoder.CRF.ConditionalRandomField(num_tags, include_start_end_trans=False, allowed_transitions=None, initial_method=None)[source]
Parameters:
  • num_tags (int) – the number of tags.
  • include_start_end_trans (bool) – whether to include the start and end transitions.
  • allowed_transitions (list) – List[Tuple[from_tag_id(int), to_tag_id(int)]]. The allowed transitions, which can be obtained via allowed_transitions(). If None, all transitions are legal.
  • initial_method (str) –
forward(feats, tags, mask)[source]

Calculate the negative log likelihood.

Parameters:
  • feats – FloatTensor, batch_size x max_len x num_tags
  • tags – LongTensor, batch_size x max_len
  • mask – ByteTensor, batch_size x max_len
Returns: FloatTensor, batch_size

viterbi_decode(data, mask, get_score=False, unpad=False)[source]

Given a feats matrix, return the best decode path and the best score.

Parameters:
  • data – FloatTensor, batch_size x max_len x num_tags
  • mask – ByteTensor, batch_size x max_len
  • get_score – bool, whether to output the decode score.
  • unpad – bool, whether to unpad the result. If False, return a batch_size x max_len tensor; if True, return List[List[int]], where each List[int] holds the labels of one sequence and has already been unpadded, i.e. its length is the valid length of that sample.
Returns: If get_score is False, the result varies with unpad as described above. If get_score is True, return (paths, List[float]): the first element is still the decoded paths (varying with unpad), and the second is the decode score of each sequence.
fastNLP.modules.decoder.CRF.allowed_transitions(id2label, encoding_type='bio')[source]
Parameters:
  • id2label (dict) – keys are label indices, values are str tags or tag-labels. A value may be a bare tag such as ”B” or ”M”, or a combined form such as ”B-NN” or ”M-NN”; the tag and label must be separated by ”-”. id2label can usually be obtained via Vocabulary.get_id2word().
  • encoding_type – str, supports ”bio” and ”bmes”.
Returns:

List[Tuple(int, int)], where each inner Tuple is (from_tag_id, to_tag_id). The result accounts for start and end: in ”BIO”, for example, B and O may appear at the beginning of a sequence while I may not, so the result contains (start_idx, B_idx) and (start_idx, O_idx) but not (start_idx, I_idx). start_idx=len(id2label), end_idx=len(id2label)+1.

fastNLP.modules.decoder.CRF.is_transition_allowed(encoding_type, from_tag, from_label, to_tag, to_label)[source]
Parameters:
  • encoding_type – str, supports ”BIO” and ”BMES”.
  • from_tag – str, a tag such as ”B” or ”M”, including the two special tags start and end.
  • from_label – str, a label such as ”PER” or ”LOC”.
  • to_tag – str, a tag such as ”B” or ”M”, including the two special tags start and end.
  • to_label – str, a label such as ”PER” or ”LOC”.
Returns:

bool, whether the transition is allowed.
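
A hedged end-to-end sketch combining the three APIs above: build the allowed transitions from an id2label dict, train with the negative log likelihood from forward, and decode with viterbi_decode. The BIO tag set is hypothetical:

import torch
from fastNLP.modules.decoder.CRF import ConditionalRandomField, allowed_transitions

id2label = {0: "B", 1: "I", 2: "O"}  # hypothetical BIO tag set
trans = allowed_transitions(id2label, encoding_type="bio")
crf = ConditionalRandomField(num_tags=3, include_start_end_trans=True,
                             allowed_transitions=trans)

feats = torch.randn(4, 20, 3)                # [batch_size, max_len, num_tags]
tags = torch.randint(0, 3, (4, 20))          # [batch_size, max_len]
mask = torch.ones(4, 20, dtype=torch.uint8)  # 1 = real token, 0 = padding

nll = crf(feats, tags, mask).mean()          # negative log likelihood, averaged
paths = crf.viterbi_decode(feats, mask, unpad=True)  # List[List[int]] of label ids
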

fastNLP.modules.decoder.MLP
class fastNLP.modules.decoder.MLP.MLP(size_layer, activation='relu', initial_method=None, dropout=0.0)[source]

Multilayer Perceptrons as a decoder

Parameters:
  • size_layer (list) – list of int, define the size of MLP layers.
  • activation (str) – str or function, the activation function for hidden layers.
  • initial_method (str) – the name of initialization method.
  • dropout (float) – the probability of dropout.

Note

No activation function is applied to the output layer.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
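
A minimal sketch: size_layer lists every layer width from input to output, and no activation is applied to the final layer:

import torch
from fastNLP.modules.decoder.MLP import MLP

mlp = MLP([100, 64, 5], activation="relu", dropout=0.1)  # 100 -> 64 -> 5
x = torch.rand(32, 100)
logits = mlp(x)  # [32, 5], raw scores with no output activation
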

fastNLP.modules.encoder
fastNLP.modules.encoder.char_embedding
class fastNLP.modules.encoder.char_embedding.ConvCharEmbedding(char_emb_size=50, feature_maps=(40, 30, 30), kernels=(3, 4, 5), initial_method=None)[source]

Character-level Embedding with CNN.

Parameters:
  • char_emb_size (int) – the size of the character-level embedding. Default: 50. For example, with 26 characters each embedded into a 50-dim vector, the input_size is 50.
  • feature_maps (tuple) – tuple of int. The length of the tuple is the number of convolution operations over characters. The i-th integer is the number of filters (dim of out channels) for the i-th convolution.
  • kernels (tuple) – tuple of int. The width of each kernel.
forward(x)[source]
Parameters: x – [batch_size * sent_length, word_length, char_emb_size]
Returns: feature map of shape [batch_size * sent_length, sum(feature_maps), 1]
class fastNLP.modules.encoder.char_embedding.LSTMCharEmbedding(char_emb_size=50, hidden_size=None, initial_method=None)[source]

Character-level Embedding with LSTM.

Parameters:
  • char_emb_size (int) – the size of the character-level embedding. Default: 50. For example, with 26 characters each embedded into a 50-dim vector, the input_size is 50.
  • hidden_size (int) – the number of hidden units. Default: equal to char_emb_size.
forward(x)[source]
Parameters: x – [n_batch*n_word, word_length, char_emb_size]
Returns: [n_batch*n_word, char_emb_size]
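
A hedged sketch of the two character encoders above; x packs every word of the batch along the first axis, as the documented shapes indicate:

import torch
from fastNLP.modules.encoder.char_embedding import ConvCharEmbedding, LSTMCharEmbedding

x = torch.rand(4 * 20, 7, 50)  # [batch_size * sent_length, word_length, char_emb_size]

conv_emb = ConvCharEmbedding(char_emb_size=50, feature_maps=(40, 30, 30), kernels=(3, 4, 5))
out = conv_emb(x)              # [4 * 20, 40 + 30 + 30, 1]

lstm_emb = LSTMCharEmbedding(char_emb_size=50)
out2 = lstm_emb(x)             # [4 * 20, 50]
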
fastNLP.modules.encoder.conv
class fastNLP.modules.encoder.conv.Conv(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]

Basic 1-d convolution module, initialized with xavier_uniform.

Parameters:
  • in_channels (int) –
  • out_channels (int) –
  • kernel_size (tuple) –
  • stride (int) –
  • padding (int) –
  • dilation (int) –
  • groups (int) –
  • bias (bool) –
  • activation (str) –
  • initial_method (str) –
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fastNLP.modules.encoder.conv_maxpool
class fastNLP.modules.encoder.conv_maxpool.ConvMaxpool(in_channels, out_channels, kernel_sizes, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]

Convolution and max-pooling module with multiple kernel sizes.

Parameters:
  • in_channels (int) –
  • out_channels (int) –
  • kernel_sizes (tuple) –
  • stride (int) –
  • padding (int) –
  • dilation (int) –
  • groups (int) –
  • bias (bool) –
  • activation (str) –
  • initial_method (str) –
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
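
A hedged sketch, assuming the module consumes [batch, seq_len, in_channels] embeddings and concatenates the max-pooled feature of each kernel size into one vector per sentence:

import torch
from fastNLP.modules.encoder.conv_maxpool import ConvMaxpool

cm = ConvMaxpool(in_channels=50, out_channels=100, kernel_sizes=(3, 4, 5))
x = torch.rand(4, 20, 50)  # assumption: [batch, seq_len, embed_dim] layout
out = cm(x)                # assumption: one pooled vector per sentence, with the
                           # outputs of the three kernel sizes concatenated
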

fastNLP.modules.encoder.embedding
class fastNLP.modules.encoder.embedding.Embedding(nums, dims, padding_idx=0, sparse=False, init_emb=None, dropout=0.0)[source]

A simple lookup table.

Parameters:
  • nums (int) – the size of the lookup table
  • dims (int) – the size of each vector
  • padding_idx (int) – pads the tensor with zeros whenever it encounters this index
  • sparse (bool) – If True, the gradient matrix will be a sparse tensor. In this case, only optim.SGD (CUDA and CPU) and optim.Adagrad (CPU) can be used.
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
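
A minimal sketch; indices equal to padding_idx are assumed to map to the zero vector, per the parameter description:

import torch
from fastNLP.modules.encoder.embedding import Embedding

emb = Embedding(nums=1000, dims=50, padding_idx=0, dropout=0.1)
word_seq = torch.randint(0, 1000, (4, 20))  # [batch_size, seq_len]
vectors = emb(word_seq)                     # [4, 20, 50]
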

fastNLP.modules.encoder.linear
class fastNLP.modules.encoder.linear.Linear(input_size, output_size, bias=True, initial_method=None)[source]
Parameters:
  • input_size (int) – input size
  • output_size (int) – output size
  • bias (bool) –
  • initial_method (str) –
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fastNLP.modules.encoder.lstm
class fastNLP.modules.encoder.lstm.LSTM(input_size, hidden_size=100, num_layers=1, dropout=0.0, batch_first=True, bidirectional=False, bias=True, initial_method=None, get_hidden=False)[source]

Long Short Term Memory

Parameters:
  • input_size (int) –
  • hidden_size (int) –
  • num_layers (int) –
  • dropout (float) –
  • batch_first (bool) –
  • bidirectional (bool) –
  • bias (bool) –
  • initial_method (str) –
  • get_hidden (bool) –
forward(x, h0=None, c0=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
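
A minimal sketch; with the default get_hidden=False the module is assumed to return only the output sequence:

import torch
from fastNLP.modules.encoder.lstm import LSTM

rnn = LSTM(input_size=50, hidden_size=100, bidirectional=True)
x = torch.rand(4, 20, 50)  # batch_first=True by default
out = rnn(x)               # assumption: [4, 20, 200], both directions concatenated
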

fastNLP.modules.encoder.masked_rnn
class fastNLP.modules.encoder.masked_rnn.MaskedGRU(*args, **kwargs)[source]

Applies a multi-layer gated recurrent unit (GRU) RNN to an input sequence. For each element in the input sequence, each layer computes the following function:

\[\begin{split}\begin{array}{ll} r_t = \mathrm{sigmoid}(W_{ir} x_t + b_{ir} + W_{hr} h_{(t-1)} + b_{hr}) \\ z_t = \mathrm{sigmoid}(W_{iz} x_t + b_{iz} + W_{hz} h_{(t-1)} + b_{hz}) \\ n_t = \tanh(W_{in} x_t + b_{in} + r_t * (W_{hn} h_{(t-1)}+ b_{hn})) \\ h_t = (1 - z_t) * n_t + z_t * h_{(t-1)} \\ \end{array}\end{split}\]

where \(h_t\) is the hidden state at time t, \(x_t\) is the hidden state of the previous layer at time t or \(input_t\) for the first layer, and \(r_t\), \(z_t\), \(n_t\) are the reset, input, and new gates, respectively.

Parameters:
  • input_size (int) – The number of expected features in the input x
  • hidden_size (int) – The number of features in the hidden state h
  • num_layers (int) – Number of recurrent layers.
  • nonlinearity (str) – The non-linearity to use [‘tanh’|’relu’]. Default: ‘tanh’
  • bias (bool) – If False, then the layer does not use bias weights b_ih and b_hh. Default: True
  • batch_first (bool) – If True, then the input and output tensors are provided as (batch, seq, feature)
  • dropout (bool) – If non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer
  • bidirectional (bool) – If True, becomes a bidirectional RNN. Default: False
Inputs: input, mask, h_0
  • input (seq_len, batch, input_size): tensor containing the features of the input sequence.
  • mask (seq_len, batch): 0-1 tensor containing the mask of the input sequence.
  • h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch.
Outputs: output, h_n
  • output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_k) from the last layer of the RNN, for each k. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
  • h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for k=seq_len.
class fastNLP.modules.encoder.masked_rnn.MaskedLSTM(*args, **kwargs)[source]

Applies a multi-layer long short-term memory (LSTM) RNN to an input sequence. For each element in the input sequence, each layer computes the following function.

\[\begin{split}\begin{array}{ll} i_t = \mathrm{sigmoid}(W_{ii} x_t + b_{ii} + W_{hi} h_{(t-1)} + b_{hi}) \\ f_t = \mathrm{sigmoid}(W_{if} x_t + b_{if} + W_{hf} h_{(t-1)} + b_{hf}) \\ g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hc} h_{(t-1)} + b_{hg}) \\ o_t = \mathrm{sigmoid}(W_{io} x_t + b_{io} + W_{ho} h_{(t-1)} + b_{ho}) \\ c_t = f_t * c_{(t-1)} + i_t * g_t \\ h_t = o_t * \tanh(c_t) \end{array}\end{split}\]

where \(h_t\) is the hidden state at time t, \(c_t\) is the cell state at time t, \(x_t\) is the hidden state of the previous layer at time t or \(input_t\) for the first layer, and \(i_t\), \(f_t\), \(g_t\), \(o_t\) are the input, forget, cell, and out gates, respectively.

Parameters:
  • input_size (int) – The number of expected features in the input x
  • hidden_size (int) – The number of features in the hidden state h
  • num_layers (int) – Number of recurrent layers.
  • bias (bool) – If False, then the layer does not use bias weights b_ih and b_hh. Default: True
  • batch_first (bool) – If True, then the input and output tensors are provided as (batch, seq, feature)
  • dropout (bool) – If non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer
  • bidirectional (bool) – If True, becomes a bidirectional RNN. Default: False
Inputs: input, mask, (h_0, c_0)
  • input (seq_len, batch, input_size): tensor containing the features of the input sequence.
  • mask (seq_len, batch): 0-1 tensor containing the mask of the input sequence.
  • h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch.
  • c_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial cell state for each element in the batch.
Outputs: output, (h_n, c_n)
  • output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_t) from the last layer of the RNN, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
  • h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t=seq_len
  • c_n (num_layers * num_directions, batch, hidden_size): tensor containing the cell state for t=seq_len
class fastNLP.modules.encoder.masked_rnn.MaskedRNN(*args, **kwargs)[source]

Applies a multi-layer Elman RNN with a customized non-linearity to an input sequence. For each element in the input sequence, each layer computes the following function: \(h_t = \tanh(w_{ih} * x_t + b_{ih} + w_{hh} * h_{(t-1)} + b_{hh})\)

where \(h_t\) is the hidden state at time t, and \(x_t\) is the hidden state of the previous layer at time t or \(input_t\) for the first layer. If nonlinearity=’relu’, then ReLU is used instead of tanh.

Parameters:
  • input_size (int) – The number of expected features in the input x
  • hidden_size (int) – The number of features in the hidden state h
  • num_layers (int) – Number of recurrent layers.
  • nonlinearity (str) – The non-linearity to use [‘tanh’|’relu’]. Default: ‘tanh’
  • bias (bool) – If False, then the layer does not use bias weights b_ih and b_hh. Default: True
  • batch_first (bool) – If True, then the input and output tensors are provided as (batch, seq, feature)
  • dropout (float) – If non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer
  • bidirectional (bool) – If True, becomes a bidirectional RNN. Default: False
Inputs: input, mask, h_0
  • input (seq_len, batch, input_size): tensor containing the features of the input sequence.
  • mask (seq_len, batch): 0-1 tensor containing the mask of the input sequence.
  • h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch.
Outputs: output, h_n
  • output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_k) from the last layer of the RNN, for each k. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
  • h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for k=seq_len.
class fastNLP.modules.encoder.masked_rnn.MaskedRNNBase(Cell, input_size, hidden_size, num_layers=1, bias=True, batch_first=False, layer_dropout=0, step_dropout=0, bidirectional=False, initial_method=None, **kwargs)[source]
forward(input, mask=None, hx=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

step(input, hx=None, mask=None)[source]

Execute one step forward (only for one-directional RNN).

Parameters:
  • input (Tensor) – input tensor of this step. (batch, input_size)
  • hx (Tensor) – the hidden state of last step. (num_layers, batch, hidden_size)
  • mask (Tensor) – the mask tensor of this step. (batch, )
Returns:

output (batch, hidden_size): tensor containing the output of this step from the last layer of the RNN.
hn (num_layers, batch, hidden_size): tensor containing the hidden state of this step.

fastNLP.modules.encoder.transformer
class fastNLP.modules.encoder.transformer.TransformerEncoder(num_layers, **kargs)[source]
class SubLayer(model_size, inner_size, key_size, value_size, num_head, dropout=0.1)[source]
forward(input, seq_mask=None, atte_mask_out=None)[source]
Parameters:
  • input – [batch, seq_len, model_size]
  • seq_mask – [batch, seq_len]
Returns:

[batch, seq_len, model_size]

forward(x, seq_mask=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
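
A hedged sketch: num_layers stacked SubLayers, with the remaining keyword arguments assumed to be forwarded to each SubLayer:

import torch
from fastNLP.modules.encoder.transformer import TransformerEncoder

enc = TransformerEncoder(num_layers=2, model_size=128, inner_size=256,
                         key_size=32, value_size=32, num_head=4)
x = torch.rand(4, 20, 128)   # [batch, seq_len, model_size]
mask = torch.ones(4, 20)     # 1 = real token
out = enc(x, seq_mask=mask)  # [4, 20, 128]
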

fastNLP.modules.encoder.variational_rnn
class fastNLP.modules.encoder.variational_rnn.VarGRU(*args, **kwargs)[source]

Variational Dropout GRU.

class fastNLP.modules.encoder.variational_rnn.VarLSTM(*args, **kwargs)[source]

Variational Dropout LSTM.

class fastNLP.modules.encoder.variational_rnn.VarRNN(*args, **kwargs)[source]

Variational Dropout RNN.

class fastNLP.modules.encoder.variational_rnn.VarRNNBase(mode, Cell, input_size, hidden_size, num_layers=1, bias=True, batch_first=False, input_dropout=0, hidden_dropout=0, bidirectional=False)[source]

Implementation of the Variational Dropout RNN. See ‘A Theoretically Grounded Application of Dropout in Recurrent Neural Networks’ (Yarin Gal and Zoubin Ghahramani, 2016), https://arxiv.org/abs/1512.05287.

forward(input, hx=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.encoder.variational_rnn.VarRnnCellWrapper(cell, hidden_size, input_p, hidden_p)[source]

Wrapper for normal RNN cells that adds support for variational dropout.

forward(input_x, hidden, mask_x, mask_h, is_reversed=False)[source]
Parameters:
  • input_x (PackedSequence) – [seq_len, batch_size, input_size]
  • hidden – for LSTM, tuple of (h_0, c_0), [batch_size, hidden_size] for other RNN, h_0, [batch_size, hidden_size]
  • mask_x – [batch_size, input_size] dropout mask for input
  • mask_h – [batch_size, hidden_size] dropout mask for hidden
Return output: PackedSequence, [seq_len, batch_size, hidden_size]
Return hidden: for LSTM, a tuple of (h_n, c_n), each [batch_size, hidden_size]; for other RNNs, h_n, [batch_size, hidden_size]
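
A hedged sketch of the variational-dropout LSTM; the constructor is assumed to mirror VarRNNBase's keyword arguments, and the return value is assumed to follow nn.LSTM's (output, (h_n, c_n)) convention:

import torch
from fastNLP.modules.encoder.variational_rnn import VarLSTM

rnn = VarLSTM(input_size=50, hidden_size=100, num_layers=2,
              batch_first=True, input_dropout=0.3, hidden_dropout=0.3)
x = torch.rand(4, 20, 50)
out, (h_n, c_n) = rnn(x)  # the same dropout masks are reused at every time step
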

fastNLP.modules.encoder.variational_rnn.flip(input, dims) → Tensor

Reverse the order of an n-D tensor along the given axes in dims.

Parameters:
  • input (Tensor) – the input tensor
  • dims (a list or tuple) – the axes to flip on

Example:

>>> x = torch.arange(8).view(2, 2, 2)
>>> x
tensor([[[ 0,  1],
         [ 2,  3]],

        [[ 4,  5],
         [ 6,  7]]])
>>> torch.flip(x, [0, 1])
tensor([[[ 6,  7],
         [ 4,  5]],

        [[ 2,  3],
         [ 0,  1]]])
class fastNLP.modules.encoder.LSTM(input_size, hidden_size=100, num_layers=1, dropout=0.0, batch_first=True, bidirectional=False, bias=True, initial_method=None, get_hidden=False)[source]

Long Short Term Memory

Parameters:
  • input_size (int) –
  • hidden_size (int) –
  • num_layers (int) –
  • dropout (float) –
  • batch_first (bool) –
  • bidirectional (bool) –
  • bias (bool) –
  • initial_method (str) –
  • get_hidden (bool) –
forward(x, h0=None, c0=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.encoder.Embedding(nums, dims, padding_idx=0, sparse=False, init_emb=None, dropout=0.0)[source]

A simple lookup table.

Parameters:
  • nums (int) – the size of the lookup table
  • dims (int) – the size of each vector
  • padding_idx (int) – pads the tensor with zeros whenever it encounters this index
  • sparse (bool) – If True, the gradient matrix will be a sparse tensor. In this case, only optim.SGD (CUDA and CPU) and optim.Adagrad (CPU) can be used.
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.encoder.Linear(input_size, output_size, bias=True, initial_method=None)[source]
Parameters:
  • input_size (int) – input size
  • output_size (int) – output size
  • bias (bool) –
  • initial_method (str) –
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.encoder.Conv(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]

Basic 1-d convolution module, initialized with xavier_uniform.

Parameters:
  • in_channels (int) –
  • out_channels (int) –
  • kernel_size (tuple) –
  • stride (int) –
  • padding (int) –
  • dilation (int) –
  • groups (int) –
  • bias (bool) –
  • activation (str) –
  • initial_method (str) –
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.encoder.ConvMaxpool(in_channels, out_channels, kernel_sizes, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]

Convolution and max-pooling module with multiple kernel sizes.

Parameters:
  • in_channels (int) –
  • out_channels (int) –
  • kernel_sizes (tuple) –
  • stride (int) –
  • padding (int) –
  • dilation (int) –
  • groups (int) –
  • bias (bool) –
  • activation (str) –
  • initial_method (str) –
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fastNLP.modules.dropout
class fastNLP.modules.dropout.TimestepDropout(p=0.5, inplace=False)[source]

This module accepts a tensor of shape [batch_size, num_timesteps, embedding_dim] and applies a single dropout mask of shape (batch_size, embedding_dim) to every time step.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
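
A minimal sketch; because one mask is shared across time steps, the zeroed embedding dimensions of a sample are identical at every position:

import torch
from fastNLP.modules.dropout import TimestepDropout

drop = TimestepDropout(p=0.5)  # nn.Module defaults to training mode
x = torch.ones(2, 10, 8)       # [batch_size, num_timesteps, embedding_dim]
y = drop(x)                    # each sample zeroes the same dims at all 10 steps
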

fastNLP.modules.other_modules
class fastNLP.modules.other_modules.BiAffine(n_enc, n_dec, n_labels, biaffine=True, **kwargs)[source]
forward(input_d, input_e, mask_d=None, mask_e=None)[source]
Parameters:
  • input_d (Tensor) – the decoder input tensor with shape = [batch, length_decoder, input_size]
  • input_e (Tensor) – the child input tensor with shape = [batch, length_encoder, input_size]
  • mask_d – Tensor or None, the mask tensor for decoder with shape = [batch, length_decoder]
  • mask_e – Tensor or None, the mask tensor for encoder with shape = [batch, length_encoder]
Returns:

Tensor, the energy tensor with shape = [batch, num_label, length, length]

class fastNLP.modules.other_modules.BiLinear(n_left, n_right, n_out, bias=True)[source]
forward(input_left, input_right)[source]
Parameters:
  • input_left (Tensor) – the left input tensor with shape = [batch1, batch2, …, left_features]
  • input_right (Tensor) – the right input tensor with shape = [batch1, batch2, …, right_features]
class fastNLP.modules.other_modules.GroupNorm(num_features, num_groups=20, eps=1e-05)[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.other_modules.LayerNormalization(layer_size, eps=0.001)[source]
Parameters:
  • layer_size (int) –
  • eps (float) – default=1e-3
forward(z)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fastNLP.modules.utils
fastNLP.modules.utils.initial_parameter(net, initial_method=None)[source]

A method used to initialize the weights of PyTorch models.

Parameters:
  • net – a PyTorch model
  • initial_method (str) –

    one of the following initializations.

    • xavier_uniform
    • xavier_normal (default)
    • kaiming_normal, or msra
    • kaiming_uniform
    • orthogonal
    • sparse
    • normal
    • uniform
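
A minimal sketch applying one of the listed schemes to a model:

import torch.nn as nn
from fastNLP.modules.utils import initial_parameter

net = nn.Sequential(nn.Linear(50, 100), nn.ReLU(), nn.Linear(100, 5))
initial_parameter(net, initial_method="xavier_uniform")  # re-initializes weights in place
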
fastNLP.modules.utils.seq_mask(seq_len, max_len)[source]

Create sequence mask.

Parameters:
  • seq_len – list or torch.Tensor, the lengths of sequences in a batch.
  • max_len – int, the maximum sequence length in a batch.
Returns:

mask, torch.LongTensor, [batch_size, max_len]
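
A minimal sketch, assuming the conventional layout of 1 for positions within each sequence length and 0 for padding:

import torch
from fastNLP.modules.utils import seq_mask

lens = torch.LongTensor([3, 5, 2])
mask = seq_mask(lens, max_len=5)  # [batch_size=3, max_len=5]
# expected:
# tensor([[1, 1, 1, 0, 0],
#         [1, 1, 1, 1, 1],
#         [1, 1, 0, 0, 0]])
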
