fastNLP documentation¶
A Modularized and Extensible Toolkit for Natural Language Processing. Currently still in incubation.
Introduction¶
FastNLP is a modular Natural Language Processing system based on PyTorch, built for fast development of NLP models.
A deep learning NLP model is the composition of three types of modules:
| module type | functionality | example |
| --- | --- | --- |
| encoder | encode the input into some abstract representation | embedding, RNN, CNN, transformer |
| aggregator | aggregate and reduce information | self-attention, max-pooling |
| decoder | decode the representation into the output | MLP, CRF |
For example:
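Here is a minimal sketch (not taken from fastNLP itself) of a classifier assembled from one module of each type; the constructors follow the signatures in the API Reference below, and the pooling axis is an assumption:

import torch.nn as nn
from fastNLP.modules.encoder.embedding import Embedding
from fastNLP.modules.encoder.lstm import LSTM
from fastNLP.modules.aggregator.max_pool import MaxPool
from fastNLP.modules.decoder.MLP import MLP

class ExampleClassifier(nn.Module):
    def __init__(self, vocab_size, num_classes):
        super(ExampleClassifier, self).__init__()
        self.embed = Embedding(vocab_size, 50)            # encoder: word ids -> vectors
        self.lstm = LSTM(input_size=50, hidden_size=100)  # encoder: contextualize the sequence
        self.pool = MaxPool()                             # aggregator: reduce over the sequence
        self.mlp = MLP([100, num_classes])                # decoder: representation -> label scores

    def forward(self, word_seq):
        x = self.embed(word_seq)       # [batch, seq_len, 50]
        x = self.lstm(x)               # [batch, seq_len, 100]
        x = self.pool(x)               # [batch, 100], assuming pooling over the time axis
        return {"pred": self.mlp(x)}   # fastNLP models return a dict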

User’s Guide¶
Quickstart¶
fastNLP 1-Minute Tutorial¶
step 1¶
Load the dataset
from fastNLP import DataSet
# linux_path = "../test/data_for_tests/tutorial_sample_dataset.csv"
win_path = "C:\\Users\\zyfeng\\Desktop\\FudanNLP\\fastNLP\\test\\data_for_tests\\tutorial_sample_dataset.csv"
ds = DataSet.read_csv(win_path, headers=('raw_sentence', 'label'), sep='\t')
step 2¶
Preprocess the data: 1. type conversion 2. split out a validation set 3. build a vocabulary
# lowercase the raw sentences
ds.apply(lambda x: x['raw_sentence'].lower(), new_field_name='raw_sentence')
# convert labels to int
ds.apply(lambda x: int(x['label']), new_field_name='label_seq', is_target=True)

def split_sent(ins):
    return ins['raw_sentence'].split()

ds.apply(split_sent, new_field_name='words', is_input=True)

# split into training and validation sets
train_data, dev_data = ds.split(0.3)
print("Train size: ", len(train_data))
print("Dev size: ", len(dev_data))
Train size: 54
Dev size: 23
from fastNLP import Vocabulary
vocab = Vocabulary(min_freq=2)
train_data.apply(lambda x: [vocab.add(word) for word in x['words']])
# index the sentences with Vocabulary.to_index(word)
train_data.apply(lambda x: [vocab.to_index(word) for word in x['words']], new_field_name='word_seq', is_input=True)
dev_data.apply(lambda x: [vocab.to_index(word) for word in x['words']], new_field_name='word_seq', is_input=True)
step 3¶
Define the model
from fastNLP.models import CNNText
model = CNNText(embed_num=len(vocab), embed_dim=50, num_classes=5, padding=2, dropout=0.1)
step 4¶
Start training
from fastNLP import Trainer, CrossEntropyLoss, AccuracyMetric
trainer = Trainer(model=model,
                  train_data=train_data,
                  dev_data=dev_data,
                  loss=CrossEntropyLoss(),
                  metrics=AccuracyMetric()
                  )
trainer.train()
print('Train finished!')
training epochs started 2018-12-07 14:03:41
Epoch 1/3. Step:2/6. AccuracyMetric: acc=0.26087
Epoch 2/3. Step:4/6. AccuracyMetric: acc=0.347826
Epoch 3/3. Step:6/6. AccuracyMetric: acc=0.608696
Train finished!
This concludes the tutorial. For more operations, see the advanced tutorials.¶
fastNLP Hands-on Tutorial¶
fastNLP provides convenient utilities for data preprocessing and for training and testing models.
DataSet & Instance¶
fastNLP uses DataSet and Instance to store and process data. Each DataSet represents a dataset and each Instance represents one sample. A DataSet holds multiple Instances, and each Instance can store arbitrary user-defined contents.
Several read_* methods let you load data from files into a DataSet with little effort.
from fastNLP import DataSet
from fastNLP import Instance
# read data from a csv file into a DataSet
win_path = "C:\\Users\\zyfeng\\Desktop\\FudanNLP\\fastNLP\\test\\data_for_tests\\tutorial_sample_dataset.csv"
dataset = DataSet.read_csv(win_path, headers=('raw_sentence', 'label'), sep='\t')
print(dataset[0])
{'raw_sentence': A series of escapades demonstrating the adage that what is good for the goose is also good for the gander , some of which occasionally amuses but none of which amounts to much of a story .,
'label': 1}
# add a new instance with DataSet.append(Instance)
dataset.append(Instance(raw_sentence='fake data', label='0'))
dataset[-1]
{'raw_sentence': fake data,
'label': 0}
# preprocess data with DataSet.apply(func, new_field_name)
# lowercase the raw sentences
dataset.apply(lambda x: x['raw_sentence'].lower(), new_field_name='raw_sentence')
# convert labels to int
dataset.apply(lambda x: int(x['label']), new_field_name='label_seq', is_target=True)

# drop empty sentences, then split each sentence on whitespace
dataset.drop(lambda x: len(x['raw_sentence'].split()) == 0)

def split_sent(ins):
    return ins['raw_sentence'].split()

dataset.apply(split_sent, new_field_name='words', is_input=True)

# filter data with DataSet.drop(func)
# drop instances with no more than 3 words
dataset.drop(lambda x: len(x['words']) <= 3)
# split into test and train sets
test_data, train_data = dataset.split(0.3)
print("Test size: ", len(test_data))
print("Train size: ", len(train_data))
Test size: 54
Train size:
Vocabulary¶
Vocabulary in fastNLP makes it easy to build a vocabulary and convert words to indices.
from fastNLP import Vocabulary
# build the vocabulary with Vocabulary.add(word)
vocab = Vocabulary(min_freq=2)
train_data.apply(lambda x: [vocab.add(word) for word in x['words']])
vocab.build_vocab()
# index the sentences with Vocabulary.to_index(word)
train_data.apply(lambda x: [vocab.to_index(word) for word in x['words']], new_field_name='word_seq', is_input=True)
test_data.apply(lambda x: [vocab.to_index(word) for word in x['words']], new_field_name='word_seq', is_input=True)
print(test_data[0])
{'raw_sentence': the plot is romantic comedy boilerplate from start to finish .,
'label': 2,
'label_seq': 2,
'words': ['the', 'plot', 'is', 'romantic', 'comedy', 'boilerplate', 'from', 'start', 'to', 'finish', '.'],
'word_seq': [2, 13, 9, 24, 25, 26, 15, 27, 11, 28, 3]}
# if you are doing something like reinforcement learning or GANs, you may want to iterate over the dataset directly with Batch
from fastNLP.core.batch import Batch
from fastNLP.core.sampler import RandomSampler
batch_iterator = Batch(dataset=train_data, batch_size=2, sampler=RandomSampler())
for batch_x, batch_y in batch_iterator:
    print("batch_x has: ", batch_x)
    print("batch_y has: ", batch_y)
    break
batch_x has: {'words': array([list(['this', 'kind', 'of', 'hands-on', 'storytelling', 'is', 'ultimately', 'what', 'makes', 'shanghai', 'ghetto', 'move', 'beyond', 'a', 'good', ',', 'dry', ',', 'reliable', 'textbook', 'and', 'what', 'allows', 'it', 'to', 'rank', 'with', 'its', 'worthy', 'predecessors', '.']),
list(['the', 'entire', 'movie', 'is', 'filled', 'with', 'deja', 'vu', 'moments', '.'])],
dtype=object), 'word_seq': tensor([[ 19, 184, 6, 1, 481, 9, 206, 50, 91, 1210, 1609, 1330,
495, 5, 63, 4, 1269, 4, 1, 1184, 7, 50, 1050, 10,
8, 1611, 16, 21, 1039, 1, 2],
[ 3, 711, 22, 9, 1282, 16, 2482, 2483, 200, 2, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0]])}
batch_y has: {'label_seq': tensor([3, 2])}
Model¶
# define a simple PyTorch model
from fastNLP.models import CNNText
model = CNNText(embed_num=len(vocab), embed_dim=50, num_classes=5, padding=2, dropout=0.1)
model
CNNText(
  (embed): Embedding(
    (embed): Embedding(77, 50, padding_idx=0)
    (dropout): Dropout(p=0.0)
  )
  (conv_pool): ConvMaxpool(
    (convs): ModuleList(
      (0): Conv1d(50, 3, kernel_size=(3,), stride=(1,), padding=(2,))
      (1): Conv1d(50, 4, kernel_size=(4,), stride=(1,), padding=(2,))
      (2): Conv1d(50, 5, kernel_size=(5,), stride=(1,), padding=(2,))
    )
  )
  (dropout): Dropout(p=0.1)
  (fc): Linear(
    (linear): Linear(in_features=12, out_features=5, bias=True)
  )
)
Trainer & Tester¶
Train the model with fastNLP's Trainer
from fastNLP import Trainer
from copy import deepcopy
from fastNLP import CrossEntropyLoss
from fastNLP import AccuracyMetric
# overfitting sanity check: train and evaluate on the same small set
copy_model = deepcopy(model)
overfit_trainer = Trainer(model=copy_model,
                          train_data=test_data,
                          dev_data=test_data,
                          loss=CrossEntropyLoss(pred="output", target="label_seq"),
                          metrics=AccuracyMetric(),
                          n_epochs=10,
                          save_path=None)
overfit_trainer.train()
training epochs started 2018-12-07 14:07:20
Epoch 1/10. Step:2/20. AccuracyMetric: acc=0.037037
Epoch 2/10. Step:4/20. AccuracyMetric: acc=0.296296
Epoch 3/10. Step:6/20. AccuracyMetric: acc=0.333333
Epoch 4/10. Step:8/20. AccuracyMetric: acc=0.555556
Epoch 5/10. Step:10/20. AccuracyMetric: acc=0.611111
Epoch 6/10. Step:12/20. AccuracyMetric: acc=0.481481
Epoch 7/10. Step:14/20. AccuracyMetric: acc=0.62963
Epoch 8/10. Step:16/20. AccuracyMetric: acc=0.685185
Epoch 9/10. Step:18/20. AccuracyMetric: acc=0.722222
Epoch 10/10. Step:20/20. AccuracyMetric: acc=0.777778
# instantiate a Trainer with the model and data, then train
trainer = Trainer(model=model,
                  train_data=train_data,
                  dev_data=test_data,
                  loss=CrossEntropyLoss(pred="output", target="label_seq"),
                  metrics=AccuracyMetric(),
                  n_epochs=5)
trainer.train()
print('Train finished!')
training epochs started 2018-12-07 14:08:10
Epoch 1/5. Step:1/5. AccuracyMetric: acc=0.037037
Epoch 2/5. Step:2/5. AccuracyMetric: acc=0.037037
Epoch 3/5. Step:3/5. AccuracyMetric: acc=0.037037
Epoch 4/5. Step:4/5. AccuracyMetric: acc=0.185185
Epoch 5/5. Step:5/5. AccuracyMetric: acc=0.240741
Train finished!
from fastNLP import Tester
tester = Tester(data=test_data, model=model, metrics=AccuracyMetric())
acc = tester.test()
[tester]
AccuracyMetric: acc=0.240741
In summary¶
Pseudocode logic of the fastNLP Trainer¶
1. Prepare a DataSet; suppose it contains the following fields¶
['raw_sentence', 'word_seq1', 'word_seq2', 'raw_label', 'label']
Call DataSet.set_input('word_seq1', 'word_seq2', flag=True) to mark 'word_seq1' and 'word_seq2' as input.
Call DataSet.set_target('label', flag=True) to mark 'label' as target.
2. Initialize the model¶
class Model(nn.Module):
    def __init__(self):
        xxx
    def forward(self, word_seq1, word_seq2):
        # (1) The parameter names here must match the names of the input fields in the DataSet,
        #     because arguments are bound by parameter name.
        # (2) The DataSet may have more input fields than forward has parameters, but never fewer.
        xxxx
        # the output must be a dict
3. The Trainer's training loop¶
(1) Take a batch of batch_size instances from the DataSet and call Model.forward.
(2) Pass the result of Model.forward, together with the fields marked as target, into the loss.
Different users name the keys of forward's output dict differently, e.g. {'pred': xxx} vs {'output': xxx};
likewise, the target field may be named 'label' by one user and 'target' by another.
To solve this, the loss provides a mapping mechanism.
For example, CrossEntropyLoss requires the inputs (prediction, target), but forward outputs {'output': xxx} and the target field is 'label';
in that case, initialize the loss as CrossEntropyLoss(pred='output', target='label').
(3) The same applies to Metric:
metric computation also takes values from forward's output and from the target fields, resolved through the same mapping.
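To make the data flow concrete, here is rough pseudocode for one training step (illustrative only, not the actual Trainer source; loss_fn and optimizer are assumed to exist):

for batch_x, batch_y in Batch(train_data, batch_size=32, sampler=RandomSampler()):
    output = model(**batch_x)        # input fields become forward's keyword arguments, bound by name
    loss = loss_fn(output, batch_y)  # keys are resolved through the pred/target mapping described above
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()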
Some questions¶
1. Why does a DataSet need input and target flags?¶
Only fields marked as input or target are fetched during training.
(1.1) Only fields marked as input are considered when building the arguments for Model.forward.
(1.2) Values passed to the loss or the metric come from:
(a) the output of Model.forward
(b) the fields marked as target
2. Fields in the DataSet are bound to forward's parameters by parameter name¶
For example, if x and seq_lens are input fields in the DataSet, then forward should be declared as
def forward(self, x, seq_lens):
    pass
Fields are matched to parameters by name.
1. Load data into a DataSet¶
2. Preprocess the DataSet with apply¶
(2.1) During preprocessing, mark some fields as input and some as target
3. Build the model¶
(3.1) When building the model, make sure the parameter names of forward match the names of the fields marked as input in the DataSet.
For example, if x and seq_lens are input fields in the DataSet, then forward should be declared as
def forward(self, x, seq_lens):
    pass
Fields are matched to parameters by name.
(3.2) The output of the model's forward must be a dict.
We recommend returning {"pred": xx}.
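Putting the three steps together, a minimal end-to-end sketch might look like this (the file name 'data.csv' and the hyper-parameters are placeholders; everything else follows the tutorials above):

from fastNLP import DataSet, Vocabulary, Trainer, CrossEntropyLoss, AccuracyMetric
from fastNLP.models import CNNText

# 1. load data into a DataSet and preprocess it
ds = DataSet.read_csv('data.csv', headers=('raw_sentence', 'label'), sep='\t')
ds.apply(lambda x: x['raw_sentence'].lower().split(), new_field_name='words')
ds.apply(lambda x: int(x['label']), new_field_name='label', is_target=True)

# 2. index the words and mark 'word_seq' as input
vocab = Vocabulary(min_freq=2)
ds.apply(lambda x: [vocab.add(w) for w in x['words']])
ds.apply(lambda x: [vocab.to_index(w) for w in x['words']],
         new_field_name='word_seq', is_input=True)
train_data, dev_data = ds.split(0.3)

# 3. build a model whose forward takes the input field 'word_seq', then train
model = CNNText(embed_num=len(vocab), embed_dim=50, num_classes=5)
trainer = Trainer(model=model, train_data=train_data, dev_data=dev_data,
                  loss=CrossEntropyLoss(pred='output', target='label'),
                  metrics=AccuracyMetric())
trainer.train()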
API Reference¶
If you are looking for information on a specific function, class or method, this part of the documentation is for you.
fastNLP¶
fastNLP.api¶
fastNLP.api.api¶
fastNLP.api.converter¶
fastNLP.api.model_zoo¶
fastNLP.api.model_zoo.load_url(url, model_dir=None, map_location=None, progress=True)[source]¶
Loads the Torch serialized object at the given URL.
If the object is already present in model_dir, it is deserialized and returned. The filename part of the URL should follow the naming convention filename-<sha256>.ext, where <sha256> is the first eight or more digits of the SHA256 hash of the contents of the file. The hash is used to ensure unique names and to verify the contents of the file.
The default value of model_dir is $TORCH_HOME/models, where $TORCH_HOME defaults to ~/.torch. The default directory can be overridden with the $TORCH_MODEL_ZOO environment variable.
Args:
- url (string): URL of the object to download
- model_dir (string, optional): directory in which to save the object
- map_location (optional): a function or a dict specifying how to remap storage locations (see torch.load)
- progress (bool, optional): whether or not to display a progress bar to stderr
Example:
>>> state_dict = model_zoo.load_url('https://s3.amazonaws.com/pytorch/models/resnet18-5c106cde.pth')
fastNLP.api.pipeline¶
fastNLP.api.processor¶
class fastNLP.api.processor.FullSpaceToHalfSpaceProcessor(field_name, change_alpha=True, change_digit=True, change_punctuation=True, change_space=True)[source]¶
Convert full-width characters to half-width, processing character by character.
class fastNLP.api.processor.Index2WordProcessor(vocab, field_name, new_added_field_name)[source]¶
Convert an index-valued field of a DataSet back to str according to the given vocab.
class fastNLP.api.processor.IndexerProcessor(vocab, field_name, new_added_field_name, delete_old_field=False, is_input=True)[source]¶
Given a vocabulary, convert the specified field to indices. The field should be a one-dimensional list, e.g. ['我', '是', xxx].
class fastNLP.api.processor.Num2TagProcessor(tag, field_name, new_added_field_name=None)[source]¶
Replace the numbers in a sentence with the given tag.
class fastNLP.api.processor.PreAppendProcessor(data, field_name, new_added_field_name=None)[source]¶
Prepend data (which should be a str) to the given field. The field must be a list; the new field is [data] + instance[field_name].
class fastNLP.api.processor.SeqLenProcessor(field_name, new_added_field_name='seq_lens', is_input=True)[source]¶
Add a sequence-length field derived from the given field, taking its first dimension.
fastNLP.core¶
fastNLP.core.batch¶
class fastNLP.core.batch.Batch(dataset, batch_size, sampler, as_numpy=False)[source]¶
Batch is an iterable object which iterates over mini-batches.
Example:
for batch_x, batch_y in Batch(data_set, batch_size=16, sampler=SequentialSampler()):
    # ...
Parameters: - dataset (DataSet) – a DataSet object
- batch_size (int) – the size of the batch
- sampler (Sampler) – a Sampler object
- as_numpy (bool) – If True, return Numpy arrays. Otherwise, return torch tensors.
fastNLP.core.dataset¶
class fastNLP.core.dataset.DataSet(data=None)[source]¶
DataSet is the collection of examples. DataSet provides instance-level interfaces. You can append and access an instance of the DataSet. However, it stores data in a different way: Field-first, Instance-second.
add_field(name, fields, padding_val=0, is_input=False, is_target=False)[source]¶
Add a new field to the DataSet.
Parameters: - name (str) – the name of the field.
- fields – a list of int, float, or other objects.
- padding_val (int) – integer for padding.
- is_input (bool) – whether this field is model input.
- is_target (bool) – whether this field is label or target.
append(ins)[source]¶
Add an instance to the DataSet. If the DataSet is not empty, the instance must have the same field names as the other instances in the DataSet.
Parameters: ins – an Instance object
apply(func, new_field_name=None, **kwargs)[source]¶
Apply a function to every instance of the DataSet.
Parameters: - func – a function that takes an instance as input.
- new_field_name (str) – If not None, results of the function will be stored as a new field.
- **kwargs – accepted keyword arguments are: (1) is_input: bool, ignored if new_field_name is None; if True, the new field is set as input. (2) is_target: bool, ignored if new_field_name is None; if True, the new field is set as target.
Return results: if new_field_name is not given, the values returned by the function over all instances.
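For instance, a derived field can be added and flagged in the same call (reusing the 'words' field from the tutorial above):

# add an integer sequence-length field and mark it as model input
dataset.apply(lambda ins: len(ins['words']), new_field_name='seq_lens', is_input=True)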
delete_field(name)[source]¶
Delete a field based on the field name.
Parameters: name – the name of the field to be deleted.
drop(func)[source]¶
Drop instances if a condition holds.
Parameters: func – a function that takes an Instance object as input and returns bool. The instance is dropped if the function returns True.
get_all_fields()[source]¶
Return all the fields with their names.
Return field_arrays: the internal data structure of DataSet.
get_input_name()[source]¶
Get all field names with is_input as True.
Return field_names: a list of str
get_target_name()[source]¶
Get all field names with is_target as True.
Return field_names: a list of str
static load(path)[source]¶
Load a DataSet object from pickle.
Parameters: path (str) – the path to the pickle
Return data_set: the loaded DataSet
classmethod read_csv(csv_path, headers=None, sep=',', dropna=True)[source]¶
Load data from a CSV file and return a DataSet object.
Parameters: - csv_path (str) – path to the CSV file
- headers (List[str] or Tuple[str]) – headers of the CSV file
- sep (str) – delimiter in the CSV file. Default: ","
- dropna (bool) – If True, drop rows that have fewer entries than headers.
Return dataset: the read data set
rename_field(old_name, new_name)[source]¶
Rename a field.
Parameters: - old_name (str) –
- new_name (str) –
save(path)[source]¶
Save the DataSet object as pickle.
Parameters: path (str) – the path to the pickle
set_input(*field_name, flag=True)[source]¶
Set the input flag of these fields.
Parameters: - field_name – a sequence of str, indicating field names.
- flag (bool) – Set these fields as input if True. Unset them if False.
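A one-line usage sketch, reusing field names from the tutorial above:

dataset.set_input('word_seq', 'seq_lens', flag=True)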
fastNLP.core.fieldarray¶
class fastNLP.core.fieldarray.FieldArray(name, content, padding_val=0, is_target=None, is_input=None)[source]¶
FieldArray is the collection of Instances of the same field. It is the basic element of a DataSet.
Parameters: - name (str) – the name of the FieldArray
- content (list) – a list of int, float, str or np.ndarray, or a list of list of one, or a np.ndarray.
- padding_val (int) – the integer for padding. Default: 0.
- is_target (bool) – If True, this FieldArray is used to compute loss.
- is_input (bool) – If True, this FieldArray is used as model input.
fastNLP.core.instance¶
fastNLP.core.losses¶
class fastNLP.core.losses.LossFunc(func, key_map=None, **kwargs)[source]¶
A wrapper for a user-provided loss function.
fastNLP.core.losses.make_mask(lens, tar_len)[source]¶
Generate a mask over a sequence.
Parameters: - lens – list or LongTensor, [batch_size]
- tar_len – int
Return mask: ByteTensor
fastNLP.core.losses.mask(predict, truth, **kwargs)[source]¶
Select specific elements from a Tensor. This method calls squash().
Parameters: - predict – Tensor, [batch_size, max_len, tag_size]
- truth – Tensor, [batch_size, max_len]
- **kwargs – extra arguments; kwargs["mask"]: ByteTensor, [batch_size, max_len], the mask Tensor. Positions that are 1 will be selected.
Return predict , truth: predict & truth after processing
fastNLP.core.losses.squash(predict, truth, **kwargs)[source]¶
Reshape tensors to fit the loss functions in PyTorch.
Parameters: - predict – Tensor, model output
- truth – Tensor, truth from dataset
- **kwargs – extra arguments
Return predict , truth: predict & truth after processing
fastNLP.core.losses.unpad(predict, truth, **kwargs)[source]¶
Process padded sequence output to get the true loss.
Parameters: - predict – Tensor, [batch_size, max_len, tag_size]
- truth – Tensor, [batch_size, max_len]
- kwargs – kwargs["lens"] is a list or LongTensor of size [batch_size]; the i-th element is the true length of the i-th sequence.
Return predict, truth: predict & truth after processing
fastNLP.core.losses.unpad_mask(predict, truth, **kwargs)[source]¶
Process padded sequence output to get the true loss.
Parameters: - predict – Tensor, [batch_size, max_len, tag_size]
- truth – Tensor, [batch_size, max_len]
- kwargs – kwargs["lens"] is a list or LongTensor of size [batch_size]; the i-th element is the true length of the i-th sequence.
Return predict, truth: predict & truth after processing
fastNLP.core.metrics¶
class fastNLP.core.metrics.AccuracyMetric(pred=None, target=None, seq_lens=None)[source]¶
Accuracy Metric
evaluate(pred, target, seq_lens=None)[source]¶
Parameters: - pred – List of (torch.Tensor, or numpy.ndarray). Element's shape can be: torch.Size([B,]), torch.Size([B, n_classes]), torch.Size([B, max_len]), torch.Size([B, max_len, n_classes])
- target – List of (torch.Tensor, or numpy.ndarray). Element's shape can be: torch.Size([B,]), torch.Size([B,]), torch.Size([B, max_len]), torch.Size([B, max_len])
- seq_lens – List of (torch.Tensor, or numpy.ndarray). Element's shape can be: None, None, torch.Size([B]), torch.Size([B]). Ignored if masks are provided.
class fastNLP.core.metrics.BMESF1PreRecMetric(b_idx=0, m_idx=1, e_idx=2, s_idx=3, pred=None, target=None, seq_lens=None)[source]¶
Compute f1, precision and recall under the BMES tagging scheme. Since illegal tag sequences can occur (e.g. "BS"), predictions are first repaired according to the table below, where cur_B means the current tag is B and next_B means the next tag is B. For example, cur_B=S means a tag currently predicted as B is relabeled as S; next_M=B means a following tag predicted as M is relabeled as B.

| | next_B | next_M | next_E | next_S | end |
| :---: | :---: | :---: | :---: | :---: | :---: |
| start | legal | next_M=B | next_E=S | legal | - |
| cur_B | cur_B=S | legal | legal | cur_B=S | cur_B=S |
| cur_M | cur_M=E | legal | legal | cur_M=E | cur_M=E |
| cur_E | legal | next_M=B | next_E=S | legal | legal |
| cur_S | legal | next_M=B | next_E=S | legal | legal |

For example, the prediction BSEMS is treated as SSSSS.
This metric does not check the validity of the target; make sure the target is legal.
pred should have shape (batch_size, max_len) or (batch_size, max_len, 4); target has shape (batch_size, max_len); seq_lens has shape (batch_size,).
class fastNLP.core.metrics.MetricBase[source]¶
Base class for all metrics.
MetricBase handles validity checks of its input dictionaries - pred_dict and target_dict. pred_dict is the output of forward() or the prediction function of a model. target_dict is the ground truth from the DataSet where is_target is set True.
MetricBase will do the following type checks:
- whether self.evaluate has varargs, which is not supported.
- whether params needed by self.evaluate are missing from pred_dict, target_dict.
- whether params needed by self.evaluate are duplicated in pred_dict, target_dict.
- whether params in pred_dict, target_dict are not used by evaluate (might cause a warning).
Besides, before passing params into self.evaluate, this function filters out params from output_dict and target_dict which are not used in self.evaluate (but if **kwargs is present in self.evaluate, no filtering is done). In some cases where the type check is not necessary, _fast_param_map is used instead.
class fastNLP.core.metrics.SpanFPreRecMetric(tag_vocab, pred=None, target=None, seq_lens=None, encoding_type='bio', ignore_labels=None, only_gross=True, f_type='micro', beta=1)[source]¶
In sequence labeling tasks, compute F, precision and recall over spans. The resulting metric dict is
{'f': xxx, 'pre': xxx, 'rec': xxx}  # 'f' is used so that f_beta values can be supported later
If only_gross=False, per-label statistics are returned as well:
{'f': xxx, 'pre': xxx, 'rec': xxx, 'f-label': xxx, 'pre-label': xxx, 'rec-label': xxx, ...}
fastNLP.core.metrics.accuracy_topk(y_true, y_prob, k=1)[source]¶
Compute the accuracy of y_true matching the top-k most probable labels in y_prob.
Parameters: - y_true – ndarray, true label, [n_samples]
- y_prob – ndarray, label probabilities, [n_samples, n_classes]
- k – int, k in top-k
Returns acc: accuracy of top-k
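A tiny worked example (values invented for illustration):

from fastNLP.core.metrics import accuracy_topk
import numpy as np
y_true = np.array([1, 0])
y_prob = np.array([[0.1, 0.9],   # top-1 label is 1 -> correct
                   [0.4, 0.6]])  # top-1 label is 1 -> wrong
accuracy_topk(y_true, y_prob, k=1)  # 0.5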
fastNLP.core.metrics.bio_tag_to_spans(tags, ignore_labels=None)[source]¶
Parameters: - tags – List[str]
- ignore_labels – List[str]; labels in this list are ignored
Returns: List[Tuple[str, List[int, int]]], i.e. [(label, [start, end])]
fastNLP.core.metrics.bmes_tag_to_spans(tags, ignore_labels=None)[source]¶
Parameters: - tags – List[str]
- ignore_labels – List[str]; labels in this list are ignored
Returns: List[Tuple[str, List[int, int]]], i.e. [(label, [start, end])]
fastNLP.core.metrics.pred_topk(y_prob, k=1)[source]¶
Return the top-k predicted labels and the corresponding probabilities.
Parameters: - y_prob – ndarray, size [n_samples, n_classes], probabilities of labels
- k – int, k of top-k
Returns (y_pred_topk, y_prob_topk): y_pred_topk – ndarray, size [n_samples, k], predicted top-k labels; y_prob_topk – ndarray, size [n_samples, k], probabilities of the top-k labels
fastNLP.core.optimizer¶
class fastNLP.core.optimizer.Adam(lr=0.001, weight_decay=0, betas=(0.9, 0.999), eps=1e-08, amsgrad=False, model_params=None)[source]¶
Parameters: - lr (float) – learning rate
- weight_decay (float) –
- model_params – a generator, e.g. model.parameters() for PyTorch models.
fastNLP.core.predictor¶
class fastNLP.core.predictor.Predictor[source]¶
An interface for predicting outputs based on trained models.
It does not care about evaluations of the model, which is different from Tester. This is a high-level model wrapper to be called by FastNLP. This class does not share any operations with Trainer and Tester. Currently, Predictor does not support GPU.
fastNLP.core.sampler¶
class fastNLP.core.sampler.BaseSampler[source]¶
The base class of all samplers.
Sub-classes must implement the __call__ method. __call__ takes a DataSet object and returns a list of int - the sampling indices.
class fastNLP.core.sampler.BucketSampler(num_buckets=10, batch_size=32, seq_lens_field_name='seq_lens')[source]¶
Parameters: - num_buckets (int) – the number of buckets to use.
- batch_size (int) – batch size per epoch.
- seq_lens_field_name (str) – the field name indicating the field about sequence length.
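A hedged usage sketch, combining BucketSampler with Batch from above (train_data is assumed to have a 'seq_lens' field):

from fastNLP.core.batch import Batch
from fastNLP.core.sampler import BucketSampler

# bucket instances of similar length together to reduce padding per batch
sampler = BucketSampler(num_buckets=10, batch_size=32, seq_lens_field_name='seq_lens')
batch_iterator = Batch(dataset=train_data, batch_size=32, sampler=sampler)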
fastNLP.core.sampler.convert_to_torch_tensor(data_list, use_cuda)[source]¶
Convert lists into (cuda) Tensors.
Parameters: - data_list – 2-level lists
- use_cuda – bool, whether to use GPU or not
Return data_list: PyTorch Tensor of shape [batch_size, max_seq_len]
fastNLP.core.sampler.k_means_1d(x, k, max_iter=100)[source]¶
Perform k-means on 1-D data.
Parameters: - x – list of int, representing points in 1-D.
- k – the number of clusters required.
- max_iter – maximum number of iterations
Return centroids: numpy array, centroids of the k clusters; assignment: numpy array, 1-D, the bucket id assigned to each example.
fastNLP.core.sampler.k_means_bucketing(lengths, buckets)[source]¶
Assign all instances to buckets using k-means, such that instances in the same bucket have similar lengths.
Parameters: - lengths – list of int, the lengths of all samples.
- buckets – list of int. The length of the list is the number of buckets. Each integer is the maximum length threshold for each bucket (this is usually None).
Return data: 2-level list
[
    [index_11, index_12, ...],  # bucket 1
    [index_21, index_22, ...],  # bucket 2
    ...
]
fastNLP.core.tester¶
class fastNLP.core.tester.Tester(data, model, metrics, batch_size=16, use_cuda=False, verbose=1)[source]¶
A collection of model inference and performance evaluation, used over validation/dev sets and test sets.
Parameters: - data (DataSet) – a validation/development set
- model (torch.nn.modules.module) – a PyTorch model
- metrics (MetricBase) – a metric object or a list of metrics (List[MetricBase])
- batch_size (int) – batch size for validation
- use_cuda (bool) – whether to use CUDA in validation.
- verbose (int) – the number of steps after which information is printed.
fastNLP.core.trainer¶
fastNLP.core.utils¶
exception fastNLP.core.utils.CheckError(check_res: fastNLP.core.utils.CheckRes, func_signature: str)[source]¶
CheckError. Used in losses.LossBase and metrics.MetricBase.
class fastNLP.core.utils.CheckRes(missing, unused, duplicated, required, all_needed, varargs)¶
A named tuple with the following fields:
- missing – alias for field number 0
- unused – alias for field number 1
- duplicated – alias for field number 2
- required – alias for field number 3
- all_needed – alias for field number 4
- varargs – alias for field number 5
fastNLP.core.utils.get_func_signature(func)[source]¶
Given a function or method, return its signature. For example, for a method:
class Demo:
    def __init__(self):
        xxx
    def forward(self, a, b='a', **args):
        xxx

demo = Demo()
get_func_signature(demo.forward)  # "Demo.forward(self, a, b='a', **args)"
Parameters: func – a function or a method
Returns: str or None
fastNLP.core.utils.load_pickle(pickle_path, file_name)[source]¶
Load an object from a given pickle file.
Parameters: - pickle_path – str, the directory where the pickle file is.
- file_name – str, the name of the pickle file.
Return obj: the object stored in the pickle
fastNLP.core.utils.pickle_exist(pickle_path, pickle_name)[source]¶
Check if a given pickle file exists in the directory.
Parameters: - pickle_path – the directory of the target pickle file
- pickle_name – the filename of the target pickle file
Returns: True if the file exists, else False
fastNLP.core.utils.save_pickle(obj, pickle_path, file_name)[source]¶
Save an object into a pickle file.
Parameters: - obj – an object
- pickle_path – str, the directory where the pickle file is to be saved
- file_name – str, the name of the pickle file. In general, it should end with "pkl".
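A small usage sketch (the directory and file name are illustrative):

from fastNLP.core.utils import save_pickle, load_pickle
save_pickle(vocab, './save/', 'vocab.pkl')
vocab = load_pickle('./save/', 'vocab.pkl')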
fastNLP.core.utils.seq_lens_to_masks(seq_lens, float=False)[source]¶
Convert seq_lens to masks.
:param seq_lens: list, np.ndarray, or torch.LongTensor; shape should be (B,)
:param float: if True, the returned masks are of float type, otherwise byte.
:return: list, np.ndarray or torch.Tensor; shape will be (B, max_length)
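A tiny worked example (values invented for illustration):

import torch
from fastNLP.core.utils import seq_lens_to_masks
seq_lens = torch.LongTensor([3, 1])
seq_lens_to_masks(seq_lens)
# shape (2, 3):
# [[1, 1, 1],
#  [1, 0, 0]]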
fastNLP.core.vocabulary¶
class fastNLP.core.vocabulary.Vocabulary(max_size=None, min_freq=None, unknown='<unk>', padding='<pad>')[source]¶
A one-to-one mapping between words and indices.
Example:
vocab = Vocabulary()
word_list = "this is a word list".split()
vocab.update(word_list)
vocab["word"]
vocab.to_word(5)
Parameters: - max_size (int) – set the max number of words in Vocabulary. Default: None
- min_freq (int) – set the min occur frequency of words in Vocabulary. Default: None
fastNLP.io¶
fastNLP.io.base_loader¶
fastNLP.io.config_io¶
class fastNLP.io.config_io.ConfigLoader(data_path=None)[source]¶
Loader for configuration.
Parameters: data_path (str) – path to the config
static load_config(file_path, sections)[source]¶
Load section(s) of the configuration into the sections provided. No return value.
Parameters: - file_path (str) – the path of the config file
- sections (dict) – a dict of {section_name (str): ConfigSection object}
Example:
test_args = ConfigSection()
ConfigLoader("config.cfg", "").load_config("./data_for_tests/config", {"POS_test": test_args})
class fastNLP.io.config_io.ConfigSaver(file_path)[source]¶
ConfigSaver is used to save config files and resolve related conflicts.
Parameters: file_path (str) – path to the config file
save_config_file(section_name, section)[source]¶
Change the config file with a single section and its name, then save it.
Parameters: - section_name (str) – the name of the section to be changed and saved.
- section (ConfigSection) – the section, with keys and values, to be changed and saved.
fastNLP.io.dataset_loader¶
class fastNLP.io.dataset_loader.ClassDataSetLoader[source]¶
Loader for classification datasets.
convert(data)[source]¶
Optional operation to build a DataSet.
Parameters: data – inner data structure (user-defined) to represent the data.
Returns: a DataSet object
class fastNLP.io.dataset_loader.Conll2003Loader[source]¶
Self-defined loader for the conll2003 dataset.
More information about the dataset can be found at https://sites.google.com/site/ermasoftware/getting-started/ne-tagging-conll2003-data
class fastNLP.io.dataset_loader.ConllLoader[source]¶
Loader for conll-format files.
convert(data)[source]¶
Optional operation to build a DataSet.
Parameters: data – inner data structure (user-defined) to represent the data.
Returns: a DataSet object
class fastNLP.io.dataset_loader.DataSetLoader[source]¶
Interface for all DataSetLoaders.
class fastNLP.io.dataset_loader.LMDataSetLoader[source]¶
Language Model Dataset Loader.
This loader produces data for language-model training in a supervised way, i.e. it has both X and Y.
class fastNLP.io.dataset_loader.POSDataSetLoader[source]¶
Dataset loader for POS-tag datasets.
In these datasets, each line is split by " ". The first column is a token and the second column is its label. Sentences are separated by an empty line.
E.g.:
Tom label1
and label2
Jerry label1
. label3

Hello label4
world label5
! label3
In this example, there are two sentences, "Tom and Jerry ." and "Hello world !". Each word has its own label.
class fastNLP.io.dataset_loader.PeopleDailyCorpusLoader[source]¶
People's Daily Corpus: Chinese word segmentation, POS tagging, NER.
class fastNLP.io.dataset_loader.RawDataSetLoader[source]¶
A simple example of a raw data reader.
class fastNLP.io.dataset_loader.SNLIDataSetLoader[source]¶
A dataset loader for the SNLI dataset.
convert(data)[source]¶
Convert a 3D list to a DataSet object.
Parameters: data – a 3D list. Example:
[
    [ [premise_word_11, premise_word_12, ...], [hypothesis_word_11, hypothesis_word_12, ...], [label_1] ],
    [ [premise_word_21, premise_word_22, ...], [hypothesis_word_21, hypothesis_word_22, ...], [label_2] ],
    ...
]
Returns: A DataSet object.
class fastNLP.io.dataset_loader.TokenizeDataSetLoader[source]¶
Dataset loader for tokenization datasets.
convert(data)[source]¶
Optional operation to build a DataSet.
Parameters: data – inner data structure (user-defined) to represent the data.
Returns: a DataSet object
load(data_path, max_seq_len=32)[source]¶
Load the pku dataset for Chinese word segmentation (CWS). The pku training dataset format is: 1. each line is a sentence; 2. words in a sentence are separated by spaces. This function converts the pku dataset into three-level lists with <BMES> labels: B - beginning of a word, M - middle of a word, E - end of a word, S - single character.
Parameters: - data_path (str) – path to the data set.
- max_seq_len – int, the maximum length of a sequence. If a sequence is longer than it, split it into several sequences.
Returns: three-level lists
fastNLP.io.dataset_loader.convert_seq2seq_dataset(data)[source]¶
Convert a list of data into a DataSet.
Parameters: data – list of list of strings, [num_examples, *]. Example:
[
    [ [word_11, word_12, ...], [label_1, label_1, ...] ],
    [ [word_21, word_22, ...], [label_2, label_1, ...] ],
    ...
]
Returns: a DataSet.
fastNLP.io.embed_loader¶
class fastNLP.io.embed_loader.EmbedLoader[source]¶
Loader for pre-trained embeddings.
static fast_load_embedding(emb_dim, emb_file, vocab)[source]¶
Fast-load a pre-trained embedding and combine it with the given vocabulary. This loading method works line by line.
Parameters: - emb_dim (int) – the dimension of the embedding. Should be the same as pre-trained embedding.
- emb_file (str) – the pre-trained embedding file path.
- vocab (Vocabulary) – a mapping from word to index, can be provided by user or built from pre-trained embedding
Return embedding_matrix: numpy.ndarray
static load_embedding(emb_dim, emb_file, emb_type, vocab)[source]¶
Load a pre-trained embedding and combine it with the given vocabulary.
Parameters: - emb_dim (int) – the dimension of the embedding. Should be the same as the pre-trained embedding.
- emb_file (str) – the pre-trained embedding file path.
- emb_type (str) – the pre-trained embedding format; only glove is supported for now
- vocab (Vocabulary) – a mapping from word to index; can be provided by the user or built from the pre-trained embedding
Return (embedding_tensor, vocab): embedding_tensor – Tensor of shape (len(word_dict), emb_dim); vocab – the input vocab, or a vocab built from the pre-trained embedding
fastNLP.io.logger¶
fastNLP.io.logger.create_logger(logger_name, log_path, log_format=None, log_level=20)[source]¶
Create a logger.
Parameters: - logger_name (str) –
- log_path (str) –
- log_format –
- log_level –
Returns: logger
To use a logger:
logger.debug("this is a debug message")
logger.info("this is an info message")
logger.warning("this is a warning message")
logger.error("this is an error message")
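For instance, such a logger might be created like this (the log path is illustrative):

from fastNLP.io.logger import create_logger
logger = create_logger('fastNLP', './fastnlp.log', log_level=20)  # 20 == logging.INFO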
fastNLP.io.model_io¶
class fastNLP.io.model_io.ModelLoader[source]¶
Loader for models.
fastNLP.models¶
fastNLP.models.base_model¶
class fastNLP.models.base_model.NaiveClassifier(in_feature_dim, out_feature_dim)[source]¶
forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
fastNLP.models.biaffine_parser¶
class fastNLP.models.biaffine_parser.ArcBiaffine(hidden_size, bias=True)[source]¶
Helper module for the Biaffine Dependency Parser that predicts arcs.
class fastNLP.models.biaffine_parser.BiaffineParser(word_vocab_size, word_emb_dim, pos_vocab_size, pos_emb_dim, word_hid_dim, pos_hid_dim, rnn_layers, rnn_hidden_size, arc_mlp_size, label_mlp_size, num_label, dropout, use_var_lstm=False, use_greedy_infer=False)[source]¶
Biaffine Dependency Parser implementation. Refer to `Deep Biaffine Attention for Neural Dependency Parsing (Dozat and Manning, 2016) <https://arxiv.org/abs/1611.01734>`_.
forward(word_seq, pos_seq, word_seq_origin_len, gold_heads=None, **_)[source]¶
Parameters: - word_seq – [batch_size, seq_len] sequence of word indices
- pos_seq – [batch_size, seq_len] sequence of POS-tag indices
- word_seq_origin_len – [batch_size, seq_len] sequence of length masks
- gold_heads – [batch_size, seq_len] sequence of gold heads
Return dict: parsing results
arc_pred: [batch_size, seq_len, seq_len]
label_pred: [batch_size, seq_len, seq_len]
mask: [batch_size, seq_len]
head_pred: [batch_size, seq_len], predicted heads (only when gold_heads is not provided)
loss(arc_pred, label_pred, head_indices, head_labels, mask, **_)[source]¶
Compute loss.
Parameters: - arc_pred – [batch_size, seq_len, seq_len]
- label_pred – [batch_size, seq_len, n_tags]
- head_indices – [batch_size, seq_len]
- head_labels – [batch_size, seq_len]
- mask – [batch_size, seq_len]
Returns: loss value
class fastNLP.models.biaffine_parser.GraphParser[source]¶
Graph-based parser helper class; supports greedy decoding and MST (Maximum Spanning Tree) decoding.
forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
class fastNLP.models.biaffine_parser.LabelBilinear(in1_features, in2_features, num_label, bias=True)[source]¶
Helper module for the Biaffine Dependency Parser that predicts labels.
forward(x1, x2)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
fastNLP.models.biaffine_parser.mst(scores)[source]¶
MST decoding, with some modifications to support parser output. See https://github.com/tdozat/Parser/blob/0739216129cd39d69997d28cbc4133b360ea3934/lib/models/nn.py#L692
fastNLP.models.char_language_model¶
class fastNLP.models.char_language_model.CharLM(char_emb_dim, word_emb_dim, vocab_size, num_char)[source]¶
CNN + highway network + LSTM
Input: 4D tensor with shape [batch_size, in_channel, height, width]
Output: 2D tensor with shape [batch_size, vocab_size]
Arguments:
- char_emb_dim: the size of each character embedding
- word_emb_dim: the size of each word embedding
- vocab_size: number of unique words
- num_char: number of characters
- use_gpu: True or False
forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
class fastNLP.models.char_language_model.Highway(input_size)[source]¶
Highway network
forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
fastNLP.models.cnn_text_classification¶
class fastNLP.models.cnn_text_classification.CNNText(embed_num, embed_dim, num_classes, kernel_nums=(3, 4, 5), kernel_sizes=(3, 4, 5), padding=0, dropout=0.5)[source]¶
Text classification model by character CNN; an implementation of the paper 'Yoon Kim. 2014. Convolutional Neural Networks for Sentence Classification.'
fastNLP.models.sequence_modeling¶
class fastNLP.models.sequence_modeling.AdvSeqLabel(args, emb=None, id2words=None)[source]¶
Advanced Sequence Labeling Model
forward(word_seq, word_seq_origin_len, truth=None)[source]¶
Parameters: - word_seq – LongTensor, [batch_size, max_len]
- word_seq_origin_len – LongTensor, [batch_size, ]
- truth – LongTensor, [batch_size, max_len]
Return y: If truth is None, return a list of [decode path (list)]; used in testing and prediction. If truth is not None, return the loss, a scalar; used in training.
class fastNLP.models.sequence_modeling.SeqLabeling(args)[source]¶
PyTorch network for sequence labeling.
decode(x, pad=True)[source]¶
Parameters: - x – FloatTensor, [batch_size, max_len, tag_size]
- pad – pad the output sequence to equal lengths
Return prediction: list of [decode path (list)]
forward(word_seq, word_seq_origin_len, truth=None)[source]¶
Parameters: - word_seq – LongTensor, [batch_size, max_len]
- word_seq_origin_len – LongTensor, [batch_size,], the original lengths of the sequences.
- truth – LongTensor, [batch_size, max_len]
Return y: If truth is None, return a list of [decode path (list)]; used in testing and prediction. If truth is not None, return the loss, a scalar; used in training.
fastNLP.models.snli¶
class fastNLP.models.snli.SNLI(args, init_embedding=None)[source]¶
PyTorch network for SNLI.
forward(premise, hypothesis, premise_len, hypothesis_len)[source]¶
Forward function
Parameters: - premise – a Tensor representing the premise: [batch size (B), premise seq len (PL), hidden size (H)].
- hypothesis – a Tensor representing the hypothesis: [B, hypothesis seq len (HL), H].
- premise_len – a Tensor recording which positions are real words and which are padding in the premise: [B, PL].
- hypothesis_len – a Tensor recording which positions are real words and which are padding in the hypothesis: [B, HL].
Returns: prediction: a Tensor with the classification result: [B, n_labels (N)].
fastNLP.modules¶
fastNLP.modules.aggregator¶
fastNLP.modules.aggregator.attention¶
class fastNLP.modules.aggregator.attention.Attention(normalize=False)[source]¶
forward(query, memory, mask)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
class fastNLP.modules.aggregator.attention.MultiHeadAtte(input_size, output_size, key_size, value_size, num_atte)[source]¶
forward(Q, K, V, seq_mask=None)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
fastNLP.modules.aggregator.avg_pool¶
class fastNLP.modules.aggregator.avg_pool.AvgPool(stride=None, padding=0)[source]¶
1-d average pooling module.
forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
fastNLP.modules.aggregator.kmax_pool¶
class fastNLP.modules.aggregator.kmax_pool.KMaxPool(k=1)[source]¶
K max-pooling module.
forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
fastNLP.modules.aggregator.max_pool¶
class fastNLP.modules.aggregator.max_pool.MaxPool(stride=None, padding=0, dilation=1)[source]¶
1-d max-pooling module.
forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
fastNLP.modules.aggregator.self_attention¶
class fastNLP.modules.aggregator.self_attention.SelfAttention(input_size, attention_unit=350, attention_hops=10, drop=0.5, initial_method=None, use_cuda=False)[source]¶
Self-attention module.
Args: input_size: int, the size of the input vector; dim: int, the width of the weight matrix; num_vec: int, the number of encoded vectors
forward(input, input_origin)[source]¶
Parameters: - input – the matrix to attend over, [batch, sen_len, h_dim]
- input_origin – the token indices, including the pad token (0), [batch, sen_len]
Return output1: the input matrix after the attention operation, [batch, multi-head, h_dim]
Return output2: the attention penalty term, a scalar [1]
fastNLP.modules.decoder¶
fastNLP.modules.decoder.CRF¶
class fastNLP.modules.decoder.CRF.ConditionalRandomField(num_tags, include_start_end_trans=False, allowed_transitions=None, initial_method=None)[source]¶
forward(feats, tags, mask)[source]¶
Calculate the negative log likelihood.
:param feats: FloatTensor, batch_size x max_len x num_tags
:param tags: LongTensor, batch_size x max_len
:param mask: ByteTensor, batch_size x max_len
:return: FloatTensor, batch_size
viterbi_decode(data, mask, get_score=False, unpad=False)[source]¶
Given a feats matrix, return the best decode path and the best score.
:param data: FloatTensor, batch_size x max_len x num_tags
:param mask: ByteTensor, batch_size x max_len
:param get_score: bool, whether to output the decode score.
:param unpad: bool, whether to unpad the result. If False, return a batch_size x max_len tensor; if True, return List[List[int]], where each List[int] holds the labels of one sequence, already unpadded to its true length.
Returns: if get_score is False, the result varies with unpad as described above; if get_score is True, a tuple (paths, List[float]), where the first element is the decoded paths (affected by unpad) and the second is the decode score of each sequence.
fastNLP.modules.decoder.CRF.allowed_transitions(id2label, encoding_type='bio')[source]¶
Parameters: - id2label – dict; keys are label indices, values are str tags or tag-labels. A value can be a bare tag, e.g. "B", "M", or a tag-label such as "B-NN", "M-NN", where tag and label must be separated by "-". id2label can usually be obtained through Vocabulary.get_id2word().
- encoding_type – str, "bio" and "bmes" are supported.
Returns: List[Tuple[int, int]]; each inner Tuple is (from_tag_id, to_tag_id). The result takes start and end into account: in "BIO", for example, B and O may appear at the start of a sequence but I may not, so the result contains (start_idx, B_idx) and (start_idx, O_idx) but not (start_idx, I_idx). start_idx=len(id2label), end_idx=len(id2label)+1.
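A small usage sketch (the tag vocabulary is invented for illustration):

from fastNLP.modules.decoder.CRF import allowed_transitions

id2label = {0: 'B-PER', 1: 'I-PER', 2: 'O'}
transitions = allowed_transitions(id2label, encoding_type='bio')
# 'transitions' lists the legal (from_tag_id, to_tag_id) pairs,
# including pairs involving start_idx (3) and end_idx (4)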
fastNLP.modules.decoder.CRF.is_transition_allowed(encoding_type, from_tag, from_label, to_tag, to_label)[source]¶
Parameters: - encoding_type – str, "BIO" and "BMES" are supported.
- from_tag – str, a tag such as "B" or "M", including the special start and end tags
- from_label – str, a label such as "PER" or "LOC"
- to_tag – str, a tag such as "B" or "M", including the special start and end tags
- to_label – str, a label such as "PER" or "LOC"
Returns: bool, whether the transition is allowed
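For instance (arguments invented for illustration):

from fastNLP.modules.decoder.CRF import is_transition_allowed

is_transition_allowed('BIO', from_tag='B', from_label='PER', to_tag='I', to_label='LOC')
# expected False under BIO: an I tag may only continue a span with the same label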
fastNLP.modules.decoder.MLP¶
class fastNLP.modules.decoder.MLP.MLP(size_layer, activation='relu', initial_method=None, dropout=0.0)[source]¶
forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
class fastNLP.modules.decoder.ConditionalRandomField(num_tags, include_start_end_trans=False, allowed_transitions=None, initial_method=None)[source]¶
forward(feats, tags, mask)[source]¶
Calculate the negative log likelihood.
:param feats: FloatTensor, batch_size x max_len x num_tags
:param tags: LongTensor, batch_size x max_len
:param mask: ByteTensor, batch_size x max_len
:return: FloatTensor, batch_size
viterbi_decode(data, mask, get_score=False, unpad=False)[source]¶
Given a feats matrix, return the best decode path and the best score.
:param data: FloatTensor, batch_size x max_len x num_tags
:param mask: ByteTensor, batch_size x max_len
:param get_score: bool, whether to output the decode score.
:param unpad: bool, whether to unpad the result. If False, return a batch_size x max_len tensor; if True, return List[List[int]], where each List[int] holds the labels of one sequence, already unpadded to its true length.
Returns: if get_score is False, the result varies with unpad as described above; if get_score is True, a tuple (paths, List[float]), where the first element is the decoded paths (affected by unpad) and the second is the decode score of each sequence.
class fastNLP.modules.decoder.MLP(size_layer, activation='relu', initial_method=None, dropout=0.0)[source]¶
forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
fastNLP.modules.encoder¶
fastNLP.modules.encoder.char_embedding¶
class fastNLP.modules.encoder.char_embedding.ConvCharEmbedding(char_emb_size=50, feature_maps=(40, 30, 30), kernels=(3, 4, 5), initial_method=None)[source]¶
class fastNLP.modules.encoder.char_embedding.LSTMCharEmbedding(char_emb_size=50, hidden_size=None, initial_method=None)[source]¶
Character-level word embedding with a single-layer LSTM.
:param char_emb_size: int, the size of the character-level embedding. Default: 50. E.g. with 26 characters, each embedded into a 50-dim vector, the input_size is 50.
Parameters: hidden_size – int, the number of hidden units. Default: equal to char_emb_size.
fastNLP.modules.encoder.conv¶
class fastNLP.modules.encoder.conv.Conv(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]¶
Basic 1-d convolution module, initialized with xavier_uniform.
forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
fastNLP.modules.encoder.conv_maxpool¶
class fastNLP.modules.encoder.conv_maxpool.ConvMaxpool(in_channels, out_channels, kernel_sizes, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]¶
Convolution and max-pooling module with multiple kernel sizes.
forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
fastNLP.modules.encoder.embedding¶
class fastNLP.modules.encoder.embedding.Embedding(nums, dims, padding_idx=0, sparse=False, init_emb=None, dropout=0.0)[source]¶
A simple lookup table.
Args: nums – the size of the lookup table; dims – the size of each vector; padding_idx – pads the tensor with zeros whenever it encounters this index; sparse – if True, the gradient matrix will be a sparse tensor; in this case, only optim.SGD (cuda and cpu) and optim.Adagrad (cpu) can be used
forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
fastNLP.modules.encoder.linear¶
class fastNLP.modules.encoder.linear.Linear(input_size, output_size, bias=True, initial_method=None)[source]¶
Linear module.
Args: input_size – input size; output_size – output size; bias – whether to add a learnable bias
forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
fastNLP.modules.encoder.lstm¶
class fastNLP.modules.encoder.lstm.LSTM(input_size, hidden_size=100, num_layers=1, dropout=0.0, batch_first=True, bidirectional=False, bias=True, initial_method=None, get_hidden=False)[source]¶
Long Short-Term Memory
Args: input_size – input size; hidden_size – hidden size. Default: 100; num_layers – number of hidden layers. Default: 1; dropout – dropout rate. Default: 0.0; bidirectional – if True, becomes a bidirectional RNN. Default: False
forward(x, h0=None, c0=None)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
fastNLP.modules.encoder.masked_rnn¶
class fastNLP.modules.encoder.masked_rnn.MaskedGRU(*args, **kwargs)[source]¶
Applies a multi-layer gated recurrent unit (GRU) RNN to an input sequence. For each element in the input sequence, each layer computes the following function:

\begin{array}{ll}
r_t = \mathrm{sigmoid}(W_{ir} x_t + b_{ir} + W_{hr} h_{(t-1)} + b_{hr}) \\
z_t = \mathrm{sigmoid}(W_{iz} x_t + b_{iz} + W_{hz} h_{(t-1)} + b_{hz}) \\
n_t = \tanh(W_{in} x_t + b_{in} + r_t * (W_{hn} h_{(t-1)} + b_{hn})) \\
h_t = (1 - z_t) * n_t + z_t * h_{(t-1)}
\end{array}

where \(h_t\) is the hidden state at time t, \(x_t\) is the hidden state of the previous layer at time t or \(input_t\) for the first layer, and \(r_t\), \(z_t\), \(n_t\) are the reset, input, and new gates, respectively.
Args:
- input_size: the number of expected features in the input x
- hidden_size: the number of features in the hidden state h
- num_layers: number of recurrent layers.
- nonlinearity: the non-linearity to use ['tanh' | 'relu']. Default: 'tanh'
- bias: if False, the layer does not use bias weights b_ih and b_hh. Default: True
- batch_first: if True, the input and output tensors are provided as (batch, seq, feature)
- dropout: if non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer
- bidirectional: if True, becomes a bidirectional RNN. Default: False
Inputs: input, mask, h_0
- input (seq_len, batch, input_size): tensor containing the features of the input sequence.
- mask (seq_len, batch): 0-1 tensor containing the mask of the input sequence.
- h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch.
Outputs: output, h_n
- output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_k) from the last layer of the RNN, for each k. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
- h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for k=seq_len.
-
class
fastNLP.modules.encoder.masked_rnn.
MaskedLSTM
(*args, **kwargs)[source]¶ Applies a multi-layer long short-term memory (LSTM) RNN to an input sequence. For each element in the input sequence, each layer computes the following function: .. math:
\begin{array}{ll} i_t = \mathrm{sigmoid}(W_{ii} x_t + b_{ii} + W_{hi} h_{(t-1)} + b_{hi}) \\ f_t = \mathrm{sigmoid}(W_{if} x_t + b_{if} + W_{hf} h_{(t-1)} + b_{hf}) \\ g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hc} h_{(t-1)} + b_{hg}) \\ o_t = \mathrm{sigmoid}(W_{io} x_t + b_{io} + W_{ho} h_{(t-1)} + b_{ho}) \\ c_t = f_t * c_{(t-1)} + i_t * g_t \\ h_t = o_t * \tanh(c_t) \end{array}
where \(h_t\) is the hidden state at time t, \(c_t\) is the cell state at time t, \(x_t\) is the hidden state of the previous layer at time t or \(input_t\) for the first layer, and \(i_t\), \(f_t\), \(g_t\), \(o_t\) are the input, forget, cell, and out gates, respectively. Args:
input_size: The number of expected features in the input x hidden_size: The number of features in the hidden state h num_layers: Number of recurrent layers. bias: If False, then the layer does not use bias weights b_ih and b_hh.
Default: True- batch_first: If True, then the input and output tensors are provided
- as (batch, seq, feature)
- dropout: If non-zero, introduces a dropout layer on the outputs of each
- RNN layer except the last layer
bidirectional: If True, becomes a bidirectional RNN. Default: False
Inputs: input, mask, (h_0, c_0)
- input (seq_len, batch, input_size): tensor containing the features of the input sequence.
- mask (seq_len, batch): 0-1 tensor containing the mask of the input sequence.
- h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch.
- c_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial cell state for each element in the batch.
Outputs: output, (h_n, c_n)
- output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_t) from the last layer of the RNN, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
- h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t=seq_len.
- c_n (num_layers * num_directions, batch, hidden_size): tensor containing the cell state for t=seq_len.
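Example (a sketch analogous to the MaskedGRU one above, again assuming the MaskedRNNBase constructor arguments; the LSTM variant takes the (h_0, c_0) state pair documented here):

import torch
from fastNLP.modules.encoder.masked_rnn import MaskedLSTM

lstm = MaskedLSTM(10, 20, num_layers=2)   # assumed (input_size, hidden_size, ...) constructor
x = torch.randn(5, 3, 10)                 # (seq_len, batch, input_size)
mask = torch.ones(5, 3)                   # 0-1 mask
h0 = torch.zeros(2, 3, 20)                # (num_layers * num_directions, batch, hidden_size)
c0 = torch.zeros(2, 3, 20)
output, (h_n, c_n) = lstm(x, mask=mask, hx=(h0, c0))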
-
class
fastNLP.modules.encoder.masked_rnn.
MaskedRNN
(*args, **kwargs)[source]¶ Applies a multi-layer Elman RNN with a customized non-linearity to an input sequence. For each element in the input sequence, each layer computes the following function:
h_t = \tanh(w_{ih} * x_t + b_{ih} + w_{hh} * h_{(t-1)} + b_{hh})
where \(h_t\) is the hidden state at time t, and \(x_t\) is the hidden state of the previous layer at time t or \(input_t\) for the first layer. If nonlinearity='relu', then ReLU is used instead of tanh.
Args:
- input_size: The number of expected features in the input x
- hidden_size: The number of features in the hidden state h
- num_layers: Number of recurrent layers.
- nonlinearity: The non-linearity to use ['tanh'|'relu']. Default: 'tanh'
- bias: If False, then the layer does not use the bias weights b_ih and b_hh. Default: True
- batch_first: If True, then the input and output tensors are provided as (batch, seq, feature)
- dropout: If non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer
- bidirectional: If True, becomes a bidirectional RNN. Default: False
Inputs: input, mask, h_0
- input (seq_len, batch, input_size): tensor containing the features of the input sequence.
- mask (seq_len, batch): 0-1 tensor containing the mask of the input sequence.
- h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch.
Outputs: output, h_n
- output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_k) from the last layer of the RNN, for each k. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
- h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for k=seq_len.
-
class
fastNLP.modules.encoder.masked_rnn.
MaskedRNNBase
(Cell, input_size, hidden_size, num_layers=1, bias=True, batch_first=False, layer_dropout=0, step_dropout=0, bidirectional=False, initial_method=None, **kwargs)[source]¶ -
forward
(input, mask=None, hx=None)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
-
step
(input, hx=None, mask=None)[source]¶ Executes one step forward (only for a one-directional RNN).
Args:
- input (batch, input_size): input tensor of this step.
- hx (num_layers, batch, hidden_size): the hidden state of the last step.
- mask (batch): the mask tensor of this step.
Returns:
- output (batch, hidden_size): tensor containing the output of this step from the last layer of the RNN.
- hn (num_layers, batch, hidden_size): tensor containing the hidden state of this step.
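Example (an illustrative sketch of step-by-step unrolling with step(); the MaskedGRU constructor arguments are assumed, as above):

import torch
from fastNLP.modules.encoder.masked_rnn import MaskedGRU

rnn = MaskedGRU(10, 20)                       # assumed (input_size, hidden_size)
inputs = torch.randn(5, 4, 10)                # (seq_len, batch, input_size)
hx = None
for t in range(inputs.size(0)):
    output, hx = rnn.step(inputs[t], hx=hx)   # output: (batch, hidden_size)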
-
fastNLP.modules.encoder.transformer¶
-
class
fastNLP.modules.encoder.transformer.
TransformerEncoder
(num_layers, **kargs)[source]¶ -
class
SubLayer
(input_size, output_size, key_size, value_size, num_atte)[source]¶ -
forward
(input, seq_mask)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
forward
(x, seq_mask=None)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
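Example (a hedged sketch; it assumes the **kargs of TransformerEncoder are forwarded to each SubLayer, whose signature is shown above, and that x is laid out as (batch, seq_len, input_size) - both assumptions, not documented behavior):

import torch
from fastNLP.modules.encoder.transformer import TransformerEncoder

encoder = TransformerEncoder(num_layers=2, input_size=64, output_size=64,
                             key_size=16, value_size=16, num_atte=4)
x = torch.randn(8, 20, 64)          # assumed (batch, seq_len, input_size) layout
seq_mask = torch.ones(8, 20)        # 0-1 mask over token positions
out = encoder(x, seq_mask=seq_mask)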
-
fastNLP.modules.encoder.variational_rnn¶
-
class
fastNLP.modules.encoder.variational_rnn.
VarGRU
(*args, **kwargs)[source]¶ Variational Dropout GRU.
-
class
fastNLP.modules.encoder.variational_rnn.
VarLSTM
(*args, **kwargs)[source]¶ Variational Dropout LSTM.
-
class
fastNLP.modules.encoder.variational_rnn.
VarRNN
(*args, **kwargs)[source]¶ Variational Dropout RNN.
-
class
fastNLP.modules.encoder.variational_rnn.
VarRNNBase
(mode, Cell, input_size, hidden_size, num_layers=1, bias=True, batch_first=False, input_dropout=0, hidden_dropout=0, bidirectional=False)[source]¶ Implementation of a Variational Dropout RNN. Refer to "A Theoretically Grounded Application of Dropout in Recurrent Neural Networks" (Yarin Gal and Zoubin Ghahramani, 2016), https://arxiv.org/abs/1512.05287.
-
forward
(input, hx=None)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
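Example (a sketch assuming the Var* subclasses above accept VarRNNBase's keyword arguments, minus mode and Cell; input_dropout and hidden_dropout masks are shared across time steps, following the Gal & Ghahramani scheme cited above):

import torch
from fastNLP.modules.encoder.variational_rnn import VarLSTM

lstm = VarLSTM(input_size=32, hidden_size=64, num_layers=2,
               input_dropout=0.3, hidden_dropout=0.3, bidirectional=True)
x = torch.randn(15, 4, 32)      # (seq_len, batch, input_size), batch_first=False
output, (h_n, c_n) = lstm(x)    # output expected: (15, 4, 128) when bidirectional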
-
-
class
fastNLP.modules.encoder.variational_rnn.
VarRnnCellWrapper
(cell, hidden_size, input_p, hidden_p)[source]¶ Wrapper for normal RNN cells that adds support for variational dropout.
-
forward
(input, hidden, mask_x=None, mask_h=None)[source]¶
Parameters:
- input – [seq_len, batch_size, input_size]
- hidden – for LSTM, a tuple of (h_0, c_0), each [batch_size, hidden_size]; for other RNNs, h_0, [batch_size, hidden_size]
- mask_x – [batch_size, input_size], dropout mask for the input
- mask_h – [batch_size, hidden_size], dropout mask for the hidden state
Returns:
- output – [seq_len, batch_size, hidden_size]
- hidden – for LSTM, a tuple of (h_n, c_n), each [batch_size, hidden_size]; for other RNNs, h_n, [batch_size, hidden_size]
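Example (an illustrative sketch only: it wraps a plain LSTMCell and supplies fixed Bernoulli dropout masks with inverted-dropout scaling, matching the parameter shapes listed above):

import torch
import torch.nn as nn
from fastNLP.modules.encoder.variational_rnn import VarRnnCellWrapper

cell = VarRnnCellWrapper(nn.LSTMCell(32, 64), hidden_size=64, input_p=0.3, hidden_p=0.3)
x = torch.randn(10, 4, 32)                                 # [seq_len, batch_size, input_size]
hidden = (torch.zeros(4, 64), torch.zeros(4, 64))          # (h_0, c_0) for LSTM
mask_x = torch.bernoulli(torch.full((4, 32), 0.7)) / 0.7   # [batch_size, input_size]
mask_h = torch.bernoulli(torch.full((4, 64), 0.7)) / 0.7   # [batch_size, hidden_size]
output, (h_n, c_n) = cell(x, hidden, mask_x=mask_x, mask_h=mask_h)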
-
-
fastNLP.modules.encoder.variational_rnn.
flip
(input, dims) → Tensor¶ Reverses the order of an n-D tensor along the given axes in dims.
Args:
- input (Tensor): the input tensor
- dims (a list or tuple): the axes to flip on
Example:
>>> x = torch.arange(8).view(2, 2, 2)
>>> x
tensor([[[ 0,  1],
         [ 2,  3]],
        [[ 4,  5],
         [ 6,  7]]])
>>> torch.flip(x, [0, 1])
tensor([[[ 6,  7],
         [ 4,  5]],
        [[ 2,  3],
         [ 0,  1]]])
-
class
fastNLP.modules.encoder.
LSTM
(input_size, hidden_size=100, num_layers=1, dropout=0.0, batch_first=True, bidirectional=False, bias=True, initial_method=None, get_hidden=False)[source]¶ Long Short-Term Memory (LSTM).
Args:
- input_size : input size
- hidden_size : hidden size
- num_layers : number of hidden layers. Default: 1
- dropout : dropout rate. Default: 0.0
- bidirectional : If True, becomes a bidirectional RNN. Default: False
-
forward
(x, h0=None, c0=None)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
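Example (a sketch following the Args above; batch_first=True by default, and with get_hidden=False it is assumed that only the output sequence is returned):

import torch
from fastNLP.modules.encoder import LSTM

lstm = LSTM(input_size=50, hidden_size=100, num_layers=1)
x = torch.randn(4, 12, 50)     # (batch, seq_len, input_size)
out = lstm(x)                  # expected shape: (4, 12, 100)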
-
-
class
fastNLP.modules.encoder.
Embedding
(nums, dims, padding_idx=0, sparse=False, init_emb=None, dropout=0.0)[source]¶ A simple lookup table.
Args:
- nums : the size of the lookup table
- dims : the size of each vector
- padding_idx : pads the tensor with zeros whenever it encounters this index
- sparse : If True, the gradient matrix will be a sparse tensor; in this case, only optim.SGD (CUDA and CPU) and optim.Adagrad (CPU) can be used
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
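Example (a sketch using the Args above: a 1000-word lookup table of 50-dimensional vectors):

import torch
from fastNLP.modules.encoder import Embedding

embed = Embedding(nums=1000, dims=50, padding_idx=0, dropout=0.1)
word_seq = torch.tensor([[4, 7, 0], [2, 9, 3]])   # (batch, seq_len) word indices
vectors = embed(word_seq)                          # expected shape: (2, 3, 50)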
-
-
class
fastNLP.modules.encoder.
Linear
(input_size, output_size, bias=True, initial_method=None)[source]¶ Linear module.
Args:
- input_size : input size
- output_size : output size
- bias : If False, the layer does not use a bias. Default: True
- initial_method : the weight initialization method
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
class
fastNLP.modules.encoder.
Conv
(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]¶ Basic 1-d convolution module, initialized with xavier_uniform.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
class
fastNLP.modules.encoder.
ConvMaxpool
(in_channels, out_channels, kernel_sizes, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]¶ Convolution and max-pooling module with multiple kernel sizes.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
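Example (a sketch of convolution with kernel sizes 3/4/5 followed by max-pooling over time, the building block of the CNNText model used in the quickstart; the (batch, seq_len, embed_dim) input layout and the concatenated output size are assumptions):

import torch
from fastNLP.modules.encoder import ConvMaxpool

conv_pool = ConvMaxpool(in_channels=50, out_channels=100, kernel_sizes=(3, 4, 5))
x = torch.randn(4, 20, 50)     # assumed (batch, seq_len, embed_dim) layout
out = conv_pool(x)             # expected shape: (4, 300), 100 channels per kernel size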
-
fastNLP.modules.dropout¶
-
class
fastNLP.modules.dropout.
TimestepDropout
(p=0.5, inplace=False)[source]¶ This module accepts a tensor of shape [batch_size, num_timesteps, embedding_dim] and uses a single dropout mask of shape [batch_size, embedding_dim], applied at every time step.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
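Example (a sketch of the behaviour described above: one dropout mask per sample, broadcast across all time steps):

import torch
from fastNLP.modules.dropout import TimestepDropout

drop = TimestepDropout(p=0.5)
drop.train()
x = torch.ones(2, 4, 8)        # [batch_size, num_timesteps, embedding_dim]
y = drop(x)
# every time step of a sample should share the same dropped dimensions:
assert torch.equal(y[:, 0, :] == 0, y[:, 1, :] == 0)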
-
fastNLP.modules.other_modules¶
-
class
fastNLP.modules.other_modules.
BiAffine
(n_enc, n_dec, n_labels, biaffine=True, **kwargs)[source]¶ -
forward
(input_d, input_e, mask_d=None, mask_e=None)[source]¶
Args:
- input_d: Tensor, the decoder input tensor with shape = [batch, length_decoder, input_size]
- input_e: Tensor, the child input tensor with shape = [batch, length_encoder, input_size]
- mask_d: Tensor or None, the mask tensor for the decoder with shape = [batch, length_decoder]
- mask_e: Tensor or None, the mask tensor for the encoder with shape = [batch, length_encoder]
Returns:
- Tensor, the energy tensor with shape = [batch, num_label, length, length]
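Example (a sketch based on the shapes documented above; the argument values are illustrative only):

import torch
from fastNLP.modules.other_modules import BiAffine

biaffine = BiAffine(n_enc=100, n_dec=100, n_labels=3)
input_d = torch.randn(2, 5, 100)     # [batch, length_decoder, input_size]
input_e = torch.randn(2, 7, 100)     # [batch, length_encoder, input_size]
energy = biaffine(input_d, input_e)  # expected: [batch, num_label, length, length]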
-
-
class
fastNLP.modules.other_modules.
GroupNorm
(num_features, num_groups=20, eps=1e-05)[source]¶ -
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
class
fastNLP.modules.other_modules.
LayerNormalization
(layer_size, eps=0.001)[source]¶ Layer normalization module
-
forward
(z)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
-
fastNLP.modules.utils¶
-
fastNLP.modules.utils.
initial_parameter
(net, initial_method=None)[source]¶ A method used to initialize the weights of PyTorch models.
Parameters:
- net – a PyTorch model
- initial_method – str, one of the following initializations:
- xavier_uniform
- xavier_normal (default)
- kaiming_normal, or msra
- kaiming_uniform
- orthogonal
- sparse
- normal
- uniform
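Example (re-initializing all weights of a model with one of the methods listed above):

import torch.nn as nn
from fastNLP.modules.utils import initial_parameter

model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 5))
initial_parameter(model, initial_method='xavier_uniform')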
-
fastNLP.modules.utils.
seq_mask
(seq_len, max_len)[source]¶ Creates a sequence mask.
Parameters:
- seq_len – list or torch.Tensor, the lengths of sequences in a batch.
- max_len – int, the maximum sequence length in a batch.
Returns: mask – torch.LongTensor, [batch_size, max_len]
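Example (a sketch; the exact 1/0 layout is the expected convention, with 1 marking real tokens and 0 marking padding):

from fastNLP.modules.utils import seq_mask

mask = seq_mask([3, 5, 2], max_len=5)
# expected: a [3, 5] LongTensor with 1 up to each length and 0 afterwards:
# [[1, 1, 1, 0, 0],
#  [1, 1, 1, 1, 1],
#  [1, 1, 0, 0, 0]]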
-
class
fastNLP.modules.
TimestepDropout
(p=0.5, inplace=False)[source]¶ This module accepts a tensor of shape [batch_size, num_timesteps, embedding_dim] and uses a single dropout mask of shape [batch_size, embedding_dim], applied at every time step.
-
forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
-