Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

Overview



Texar is a toolkit that aims to support a broad set of machine learning tasks, especially natural language processing and text generation. Texar provides a library of easy-to-use ML modules and functionalities for composing arbitrary models and algorithms. The tool is designed for both researchers and practitioners, for fast prototyping and experimentation.

Texar was originally developed by, and is actively contributed to by, Petuum and CMU in collaboration with other institutes. A mirror of this repository is maintained by Petuum Open Source.

Key Features

  • Two Versions, (Mostly) Same Interfaces. Texar-TensorFlow (this repo) and Texar-PyTorch have mostly the same interfaces. Both further combine the best design of TF and PyTorch:
    • Interfaces and variable sharing in PyTorch convention
    • Excellent factorization and rich functionalities in TF convention.
  • Rich Pre-trained Models, Rich Usage with Uniform Interfaces. BERT, GPT2, XLNet, etc., for encoding, classification, generation, and composing complex models with other Texar components!
  • Fully Customizable at multiple abstraction levels -- both novice-friendly and expert-friendly.
    • Free to plug in whatever external modules, since Texar is fully compatible with the native TF/PyTorch APIs.
  • Versatile to support broad tasks, models, algorithms, data processing, evaluation, etc.
    • encoder(s) to decoder(s), sequential- and self-attentions, memory, hierarchical models, classifiers...
    • maximum likelihood learning, reinforcement learning, adversarial learning, probabilistic modeling, ...
  • Modularized for maximal re-use and clean APIs, based on principled decomposition of Learning-Inference-Model Architecture.
  • Distributed model training with multiple GPUs.
  • Clean, detailed documentation and rich examples.


Library API Example

Builds an encoder-decoder model, with maximum likelihood learning:

import tensorflow as tf
import texar.tf as tx

# Data 
data = tx.data.PairedTextData(hparams=hparams_data) # a dict of hyperparameters 
iterator = tx.data.DataIterator(data)
batch = iterator.get_next()                         # get a data mini-batch

# Model architecture
embedder = tx.modules.WordEmbedder(data.target_vocab.size, hparams=hparams_emb)
encoder = tx.modules.TransformerEncoder(hparams=hparams_enc)
outputs_enc = encoder(inputs=embedder(batch['source_text_ids']),  # call as a function
                      sequence_length=batch['source_length'])
                      
decoder = tx.modules.TransformerDecoder(
    output_layer=tf.transpose(embedder.embedding), # tie input embedding w/ output layer
    hparams=hparams_decoder)
outputs, _, _ = decoder(memory=outputs_enc,         # encoder outputs defined above
                        memory_sequence_length=batch['source_length'],
                        inputs=embedder(batch['target_text_ids']),
                        sequence_length=batch['target_length']-1,
                        decoding_strategy='greedy_train')    # teacher-forcing decoding
                        
# Loss for maximum likelihood learning
loss = tx.losses.sequence_sparse_softmax_cross_entropy(
    labels=batch['target_text_ids'][:, 1:],
    logits=outputs.logits,
    sequence_length=batch['target_length']-1)  # automatic sequence masks

# Beam search decoding
outputs_bs, _, _ = tx.modules.beam_search_decode(
    decoder,
    embedding=embedder,
    start_tokens=[data.target_vocab.bos_token_id]*num_samples,
    end_token=data.target_vocab.eos_token_id)
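The snippet above assumes that `hparams_data` (and the other `hparams_*` dicts) are already defined. A minimal sketch of what such a dict might look like for paired text data follows. The key names follow the layout used by the example configs such as `config_iwslt14.py`, but every value and file path here is a hypothetical placeholder, and `referenced_files` is an illustrative helper that is not part of Texar.

```python
# Hypothetical hyperparameters for a paired (source/target) text dataset.
# All paths and sizes are placeholders chosen for illustration only.
hparams_data = {
    "num_epochs": 1,
    "batch_size": 32,
    "allow_smaller_final_batch": True,
    "source_dataset": {
        "files": "./data/iwslt14/train.de",
        "vocab_file": "./data/iwslt14/vocab.de",
    },
    "target_dataset": {
        "files": "./data/iwslt14/train.en",
        "vocab_file": "./data/iwslt14/vocab.en",
    },
}


def referenced_files(hparams):
    """Collect every file path the config points at (illustrative helper)."""
    paths = []
    for side in ("source_dataset", "target_dataset"):
        paths.append(hparams[side]["files"])
        paths.append(hparams[side]["vocab_file"])
    return paths


for path in referenced_files(hparams_data):
    print(path)
```

Listing the referenced files before building the graph is one way to catch a missing vocabulary or data file early.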

The same model, but with adversarial learning:

helper = tx.modules.GumbelSoftmaxTraingHelper( # Gumbel-softmax decoding
    start_tokens=[BOS]*batch_size, end_token=EOS, embedding=embedder)
outputs, _ = decoder(helper=helper)            # automatic re-use of the decoder variables

discriminator = tx.modules.BertClassifier(hparams=hparams_bert)        # pre-trained model

G_loss, D_loss = tx.losses.binary_adversarial_losses(
    real_data=batch['target_text_ids'][:, 1:],
    fake_data=outputs.sample_id,
    discriminator_fn=discriminator)

The same model, but with RL policy gradient learning:

agent = tx.agents.SeqPGAgent(samples=outputs.sample_id,
                             logits=outputs.logits,
                             sequence_length=batch['target_length']-1,
                             hparams=config_model.agent)

Many more examples are available in the examples directory of the repository.

Installation

(Note: Texar>0.2.3 requires Python 3.6 or 3.7. To use with older Python versions, please use Texar<=0.2.3)

Texar requires tensorflow and tensorflow_probability. After both are installed, install Texar from PyPI:

pip install texar

To use cutting-edge features or develop locally, install from source:

git clone https://github.com/asyml/texar.git
cd texar
pip install .

Getting Started

Reference

If you use Texar, please cite the tech report with the following BibTeX entry:

Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
Zhiting Hu, Haoran Shi, Bowen Tan, Wentao Wang, Zichao Yang, Tiancheng Zhao, Junxian He, Lianhui Qin, Di Wang, Xuezhe Ma, Zhengzhong Liu, Xiaodan Liang, Wanrong Zhu, Devendra Sachan and Eric Xing
ACL 2019

@inproceedings{hu2019texar,
  title={Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation},
  author={Hu, Zhiting and Shi, Haoran and Tan, Bowen and Wang, Wentao and Yang, Zichao and Zhao, Tiancheng and He, Junxian and Qin, Lianhui and Wang, Di and others},
  booktitle={ACL 2019, System Demonstrations},
  year={2019}
}

License

Apache License 2.0


Comments
  • Potential issue with mlp transform

    Hi,

    Currently, there is an issue in _mlp_transform (https://github.com/asyml/texar/blob/master/texar/modules/connectors/connectors.py#L73); thanks to @huzecong, who found it.

    For an input of shape [batch_size, max_time, dim], the current _mlp_transform reshapes it to [batch_size, max_time * dim] and applies a linear layer. The transform matrix has shape [max_time * dim, mlp_output_dim], which is equivalent to max_time smaller matrices of size [dim, mlp_output_dim], one per time step. However, if the vector at every time step should be transformed with the same matrix, the current _mlp_transform cannot do that. Should _mlp_transform be modified?
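The difference between the two behaviours can be illustrated with a small pure-Python sketch. It is not the Texar implementation. The sizes, the random weights, and the helper `matmul` are all assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply an [n, k] list-of-lists by a [k, m] list-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


batch_size, max_time, dim, out_dim = 2, 3, 4, 5   # hypothetical sizes
rng = random.Random(0)
x = [[[rng.random() for _ in range(dim)] for _ in range(max_time)]
     for _ in range(batch_size)]

# (1) Flattened transform: one matrix of shape [max_time * dim, out_dim].
w_flat = [[rng.random() for _ in range(out_dim)]
          for _ in range(max_time * dim)]
x_flat = [[v for step in example for v in step] for example in x]
y_flat = matmul(x_flat, w_flat)                    # [batch_size, out_dim]

# The same result, written as max_time separate [dim, out_dim] blocks
# whose per-step outputs are summed over time.
blocks = [w_flat[t * dim:(t + 1) * dim] for t in range(max_time)]
y_blocks = []
for example in x:
    per_step = [matmul([example[t]], blocks[t])[0] for t in range(max_time)]
    y_blocks.append([sum(col) for col in zip(*per_step)])

# (2) Shared transform: a single [dim, out_dim] matrix applied at every step.
w_shared = [[rng.random() for _ in range(out_dim)] for _ in range(dim)]
y_shared = [matmul(example, w_shared) for example in x]

same = all(math.isclose(a, b, rel_tol=1e-9)
           for ra, rb in zip(y_flat, y_blocks) for a, b in zip(ra, rb))
print(same)
print(len(y_shared), len(y_shared[0]), len(y_shared[0][0]))
```

Variant (1) mixes all time steps into one output vector per example and uses a different weight block for each step. Variant (2) keeps the time axis and reuses the same weights at every step.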

    enhancement 
    opened by TomNong 9
  • ValueError: Assignment map with scope only name bert/position_embeddings should map to scope only bert/embeddings/position_embeddings. Should be 'scope/': 'other_scope/'.

    When I try to call model_utils.init_bert_checkpoint(init_checkpoint), I get ValueError: Assignment map with scope only name bert/position_embeddings should map to scope only bert/embeddings/position_embeddings. Should be 'scope/': 'other_scope/'.

    This was working fine earlier. Could this be due to the new changes introduced in texar? I have not changed my code.

    opened by Vibha111094 9
  • Transformer decoder is giving the same out for every example

    I have been using the texar library to solve a summarization problem. I have replaced the encoder with BERT, and the decoder is a Transformer decoder with beam size 5. My loss decreased from 11 to around 3 after 50k iterations, but when I run inference on the test data, it gives me the same output for every example. Any thoughts on this? Link to the code: https://github.com/santhoshkolloju/bert_summ

    opened by santhoshkolloju 8
  • problem with 'allow_smaller_final_batch' and 'beam_search_decode'

    Hi there, I encountered an error when trying the seq2seq_attn example with iwslt14 dataset. The full error log is appended at the end of the post.

    The error occurs when operating this line: https://github.com/asyml/texar/blob/9c699e8143fd8ecb5d65a41ceef09c45832b9258/examples/seq2seq_attn/seq2seq_attn.py#L125

    It indicates that the training stage is ok, and the bug occurs in the validation stage.

    Here are two pieces of error logs:

    2018-09-11 16:04:10.567407: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at tensor_array_ops.cc:1122 : Invalid argument: TensorArray bidirectional_rnn_encoder_2/bidirectional_rnn/fw/fw/dynamic_rnn/input_0_32362: Could not write to TensorArray index 0 because the value shape is [23,256] which is incompatible with the TensorArray's inferred element shape: [32,256] (consider setting infer_shape=False).
    2018-09-11 16:04:10.567928: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at tensor_array_ops.cc:1122 : Invalid argument: TensorArray bidirectional_rnn_encoder_2/bidirectional_rnn/bw/bw/dynamic_rnn/input_0_32364: Could not write to TensorArray index 0 because the value shape is [23,256] which is incompatible with the TensorArray's inferred element shape: [32,256] (consider setting infer_shape=False).
    ......
    
    Caused by op 'attention_rnn_decoder_5/tile_batch_1/Reshape', defined at:
      File "seq2seq_attn.py", line 161, in <module>
        main()
      File "seq2seq_attn.py", line 93, in main
        train_op, infer_outputs = build_model(batch, train_data)
      File "seq2seq_attn.py", line 77, in build_model
        max_decoding_length=60)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/texar/modules/decoders/beam_search_decode.py", line 193, in beam_search_decode
        cell = decoder_or_cell._get_beam_search_cell(beam_width=beam_width)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/texar/modules/decoders/rnn_decoders.py", line 545, in _get_beam_search_cell
        memory_seq_length, beam_width)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/contrib/seq2seq/python/ops/beam_search_decoder.py", line 122, in tile_batch
        return nest.map_structure(lambda t_: _tile_batch(t_, multiplier), t)
    ......
    

    As there is no problem in training stage, I guess there might be something wrong in the implementation of beam_search_decode.

    The error can be described as: the tensor expects a dimension of 32 while we feed 23 instead. I found that the batch size is 32 and there are 887 validation examples in valid.de, where 887 % 32 == 23. Adding 'allow_smaller_final_batch': False to the val and test items of config_iwslt14.py gets rid of the error.

    But this "fix" is not what we really want. Theoretically, we are supposed to run validation and test on all the dev/test samples.
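The arithmetic behind the mismatch, and the cost of the workaround, can be checked with a short sketch. The function `batch_sizes` is a hypothetical helper, not a Texar API. The beam width of 10 is an inference from the 230 and 320 values in the error message, not something stated in the logs.

```python
def batch_sizes(num_examples, batch_size, allow_smaller_final_batch=True):
    """Return the size of each batch produced from num_examples examples."""
    full, remainder = divmod(num_examples, batch_size)
    sizes = [batch_size] * full
    if remainder and allow_smaller_final_batch:
        sizes.append(remainder)      # the short final batch
    return sizes


num_examples, batch_size = 887, 32
beam_width = 10                      # assumption: inferred from 230 / 23

with_final = batch_sizes(num_examples, batch_size, True)
without_final = batch_sizes(num_examples, batch_size, False)

# The last batch holds 23 examples, so tiling it by the beam width gives
# 230 values, while the graph expects 32 * 10 = 320.
print(with_final[-1] * beam_width, batch_size * beam_width)

# Dropping the short batch avoids the error but skips 23 examples.
print(num_examples - sum(without_final))
```

This matches the concern above: with the final batch dropped, only 864 of the 887 validation examples are evaluated.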

    I am using tensorflow 1.8. Please let me know if I need to provide any other environment information.

    The full logs:

    $ python seq2seq_attn.py --config_model config_model --config_data config_iwslt14
    2018-09-11 15:50:03.793617: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
    2018-09-11 15:50:04.038875: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
    name: Tesla P40 major: 6 minor: 1 memoryClockRate(GHz): 1.531
    pciBusID: 0000:02:00.0
    totalMemory: 22.38GiB freeMemory: 22.21GiB
    2018-09-11 15:50:04.038945: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
    2018-09-11 15:50:04.384688: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
    2018-09-11 15:50:04.384752: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0
    2018-09-11 15:50:04.385091: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N
    2018-09-11 15:50:04.385631: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21549 MB memory) -> physical GPU (device: 0, name: Tesla P40, pci bus id: 0000:02:00.0, compute capability: 6.1)
    step=0, loss=481.5847
    step=500, loss=101.2404
    step=1000, loss=75.9185
    step=1500, loss=102.7388
    step=2000, loss=81.9897
    step=2500, loss=64.7623
    step=3000, loss=76.1445
    step=3500, loss=81.1186
    step=4000, loss=48.0918
    step=4500, loss=54.7355
    step=5000, loss=74.8126
    2018-09-11 16:04:10.567407: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at tensor_array_ops.cc:1122 : Invalid argument: TensorArray bidirectional_rnn_encoder_2/bidirectional_rnn/fw/fw/dynamic_rnn/input_0_32362: Could not write to TensorArray index 0 because the value shape is [23,256] which is incompatible with the TensorArray's inferred element shape: [32,256] (consider setting infer_shape=False).
    2018-09-11 16:04:10.567928: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at tensor_array_ops.cc:1122 : Invalid argument: TensorArray bidirectional_rnn_encoder_2/bidirectional_rnn/bw/bw/dynamic_rnn/input_0_32364: Could not write to TensorArray index 0 because the value shape is [23,256] which is incompatible with the TensorArray's inferred element shape: [32,256] (consider setting infer_shape=False).
    Traceback (most recent call last):
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
        return fn(*args)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
        options, feed_dict, fetch_list, target_list, run_metadata)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
        run_metadata)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 230 values, but the requested shape has 320
    	 [[Node: attention_rnn_decoder_5/tile_batch_1/Reshape = Reshape[T=DT_INT32, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](attention_rnn_decoder_5/tile_batch_1/Tile/_4547, attention_rnn_decoder_5/tile_batch_1/concat)]]
    	 [[Node: attention_rnn_decoder_6/decoder/while/LoopCond/_4655 = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_790_attention_rnn_decoder_6/decoder/while/LoopCond", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^_cloopattention_rnn_decoder_6/decoder/while/BeamSearchDecoderStep/next_beam_word_ids/y/_4501)]]
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "seq2seq_attn.py", line 161, in <module>
        main()
      File "seq2seq_attn.py", line 149, in main
        val_bleu = _eval_epoch(sess, 'val')
      File "seq2seq_attn.py", line 125, in _eval_epoch
        sess.run(fetches, feed_dict=feed_dict)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
        run_metadata_ptr)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
        feed_dict_tensor, options, run_metadata)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
        run_metadata)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
        raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 230 values, but the requested shape has 320
    	 [[Node: attention_rnn_decoder_5/tile_batch_1/Reshape = Reshape[T=DT_INT32, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](attention_rnn_decoder_5/tile_batch_1/Tile/_4547, attention_rnn_decoder_5/tile_batch_1/concat)]]
    	 [[Node: attention_rnn_decoder_6/decoder/while/LoopCond/_4655 = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_790_attention_rnn_decoder_6/decoder/while/LoopCond", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^_cloopattention_rnn_decoder_6/decoder/while/BeamSearchDecoderStep/next_beam_word_ids/y/_4501)]]
    
    Caused by op 'attention_rnn_decoder_5/tile_batch_1/Reshape', defined at:
      File "seq2seq_attn.py", line 161, in <module>
        main()
      File "seq2seq_attn.py", line 93, in main
        train_op, infer_outputs = build_model(batch, train_data)
      File "seq2seq_attn.py", line 77, in build_model
        max_decoding_length=60)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/texar/modules/decoders/beam_search_decode.py", line 193, in beam_search_decode
        cell = decoder_or_cell._get_beam_search_cell(beam_width=beam_width)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/texar/modules/decoders/rnn_decoders.py", line 545, in _get_beam_search_cell
        memory_seq_length, beam_width)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/contrib/seq2seq/python/ops/beam_search_decoder.py", line 122, in tile_batch
        return nest.map_structure(lambda t_: _tile_batch(t_, multiplier), t)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 375, in map_structure
        structure[0], [func(*x) for x in entries])
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 375, in <listcomp>
        structure[0], [func(*x) for x in entries])
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/contrib/seq2seq/python/ops/beam_search_decoder.py", line 122, in <lambda>
        return nest.map_structure(lambda t_: _tile_batch(t_, multiplier), t)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/contrib/seq2seq/python/ops/beam_search_decoder.py", line 90, in _tile_batch
        ([shape_t[0] * multiplier], shape_t[1:]), 0))
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 6113, in reshape
        "Reshape", tensor=tensor, shape=shape, name=name)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
        op_def=op_def)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
        op_def=op_def)
      File "/home/luban/miniconda3/envs/tf18/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
        self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access
    
    InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 230 values, but the requested shape has 320
    	 [[Node: attention_rnn_decoder_5/tile_batch_1/Reshape = Reshape[T=DT_INT32, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](attention_rnn_decoder_5/tile_batch_1/Tile/_4547, attention_rnn_decoder_5/tile_batch_1/concat)]]
    	 [[Node: attention_rnn_decoder_6/decoder/while/LoopCond/_4655 = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_790_attention_rnn_decoder_6/decoder/while/LoopCond", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^_cloopattention_rnn_decoder_6/decoder/while/BeamSearchDecoderStep/next_beam_word_ids/y/_4501)]]
    
    opened by Luolc 8
  • import texar.tf as tx .No module named 'texar.tf' via pip install

    Hi guys

    I am very new to texar. I have installed it via pip. When trying to run one of the examples, namely seq2seq_exposure_bias, I get the following: File "<stdin>", line 1, in <module> ModuleNotFoundError: No module named 'texar.tf'. I have tried running it via Jupyter and bash, and neither works; running import texar.tf as tx directly in a Python session started from bash fails as well.

    Then I tried to run the example from the git-cloned texar repo. texar.tf is imported successfully in that case, but then I get

    tensorflow.python.framework.errors_impl.NotFoundError: ./data/iwslt14/vocab.de; No such file or directory

    when running

    python interpolation_main.py \
        --config_model configs.config_model \
        --config_data configs.config_iwslt14 \
        --lambdas_init [0.04,0.96,0.0] \
        --delta_lambda_self 0.06 \
        --delta_lambda_reward 0.06 \
        --lambda_reward_steps 4

    Tensorflow is installed and working.
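A quick way to see which layout the installed package has is sketched below. It relies only on the standard library. The interpretation in the printed messages is an assumption based on the PR listed on this page that moves Texar-TF under texar/tf/, which suggests that releases predating that change expose texar but not texar.tf.

```python
import importlib.util


def has_module(name):
    """Return True if the module can be located in this environment.

    Note: for a dotted name such as "texar.tf", find_spec imports the
    parent package first, so a missing parent raises ModuleNotFoundError.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


if has_module("texar.tf"):
    print("texar.tf is available: use `import texar.tf as tx`")
elif has_module("texar"):
    print("texar is installed, but without the texar.tf subpackage")
else:
    print("texar is not installed in this environment")
```

If the second message is printed, installing from source as described in the Installation section is one option to try.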

    question 
    opened by pol690 7
  • Transformer Training fails on custom dataset

    Hi, I'm training Transformer on a custom dataset on CPU. I've used spm encoding and have followed instructions to a T, but the training always fails with the below error trace. The same error occurs regardless of BPE, SPM or raw encodings. Kindly help!

    step: 400, loss: 5.6811
    step: 500, loss: 5.3085
    Traceback (most recent call last):
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
        return fn(*args)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
        options, feed_dict, fetch_list, target_list, run_metadata)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
        run_metadata)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[55,128] = 128 is not in [0, 128)
    	 [[{{node sinusoid_posisiton_embedder/embedding_lookup}}]]
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "transformer_main.py", line 308, in <module>
        main()
      File "transformer_main.py", line 293, in main
        step = _train_epoch(sess, epoch, step, smry_writer)
      File "transformer_main.py", line 273, in _train_epoch
        _eval_epoch(sess, epoch, mode='eval')
      File "transformer_main.py", line 193, in _eval_epoch
        fetches_ = sess.run(fetches, feed_dict=feed_dict)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
        run_metadata_ptr)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
        feed_dict_tensor, options, run_metadata)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
        run_metadata)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
        raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[55,128] = 128 is not in [0, 128)
    	 [[node sinusoid_posisiton_embedder/embedding_lookup (defined at /home/karthik/installs/texar/texar/modules/embedders/position_embedders.py:327) ]]
    
    Caused by op 'sinusoid_posisiton_embedder/embedding_lookup', defined at:
      File "transformer_main.py", line 308, in <module>
        main()
      File "transformer_main.py", line 96, in main
        src_pos_embeds = pos_embedder(sequence_length=src_seq_len)
      File "/home/karthik/installs/texar/texar/module_base.py", line 116, in __call__
        return self._template(*args, **kwargs)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 360, in __call__
        return self._call_func(args, kwargs)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 311, in _call_func
        result = self._func(*args, **kwargs)
      File "/home/karthik/installs/texar/texar/modules/embedders/position_embedders.py", line 327, in _build
        outputs = tf.nn.embedding_lookup(embedding, inputs)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/embedding_ops.py", line 316, in embedding_lookup
        transform_fn=None)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/embedding_ops.py", line 133, in _embedding_lookup_and_transform
        result = _clip(array_ops.gather(params[0], ids, name=name),
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
        return target(*args, **kwargs)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 3273, in gather
        return gen_array_ops.gather_v2(params, indices, axis, name=name)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3748, in gather_v2
        "GatherV2", params=params, indices=indices, axis=axis, name=name)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
        op_def=op_def)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
        return func(*args, **kwargs)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
        op_def=op_def)
      File "/home/karthik/installs/miniconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
        self._traceback = tf_stack.extract_stack()
    
    InvalidArgumentError (see above for traceback): indices[55,128] = 128 is not in [0, 128)
    	 [[node sinusoid_posisiton_embedder/embedding_lookup (defined at /home/karthik/installs/texar/texar/modules/embedders/position_embedders.py:327) ]]
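One possible reading of the message `indices[55,128] = 128 is not in [0, 128)`, offered as an assumption rather than a confirmed diagnosis: the position embedding table has 128 rows (positions 0 to 127), and some sequence in the batch is at least 129 tokens long, so position 128 is looked up. The pure-Python sketch below reproduces that situation. The table contents, the embedding size, and the helper names are hypothetical.

```python
position_size = 128                  # rows in the table, taken from the error
embedding_dim = 4                    # hypothetical, for illustration only
table = [[0.0] * embedding_dim for _ in range(position_size)]


def lookup_positions(seq_len):
    """Fetch one embedding per position 0 .. seq_len - 1."""
    return [table[pos] for pos in range(seq_len)]


def clip_length(seq_len, max_len=position_size):
    """One possible workaround: truncate sequences to the table size."""
    return min(seq_len, max_len)


print(len(lookup_positions(128)))    # positions 0..127 all exist

try:
    lookup_positions(129)            # needs position 128, which is missing
except IndexError as err:
    print("lookup failed:", err)

print(len(lookup_positions(clip_length(129))))
```

Under this assumption, either truncating long sequences during preprocessing or enlarging the position table would avoid the out-of-range index.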
    
    question 
    opened by karthikb23 7
  • Error running gpt2_generate_main.py

    When I try to run the gpt2_generate_main.py file, I face the following error,

    ValueError: The shape for transformer_decoder_1/transformer_decoder/while/Merge_27:0 is not an invariant for the loop. It enters the loop with shape (1, 768), but has shape (?, 768) after one iteration. Provide shape invariants using either the shape_invariants argument of tf.while_loop or set_shape() on the loop variables.

    Also, how can I use this model for conditioned text generation tasks? I am working on a reading comprehension task that takes a single-stream input (Passage + ": " + Question + "? " + Answer), and I am using a custom mask to compute the loss only between the answer start and sequence length indices. Is there a more elegant way to do this?
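The custom-mask approach described above can be sketched in plain Python as follows. This only illustrates the idea and is not Texar or TensorFlow code. The function names and the toy loss values are hypothetical.

```python
def answer_mask(answer_starts, seq_lengths, max_len):
    """1.0 for positions in [answer_start, seq_length), 0.0 elsewhere."""
    return [[1.0 if start <= t < length else 0.0 for t in range(max_len)]
            for start, length in zip(answer_starts, seq_lengths)]


def masked_mean(losses, mask):
    """Average the per-token losses over the unmasked positions only."""
    total = sum(l * m for row_l, row_m in zip(losses, mask)
                for l, m in zip(row_l, row_m))
    count = sum(m for row in mask for m in row)
    return total / count if count else 0.0


# Two toy sequences padded to length 5. The answers occupy positions
# 2..3 in the first sequence and 1..2 in the second.
mask = answer_mask(answer_starts=[2, 1], seq_lengths=[4, 3], max_len=5)
per_token_loss = [[1.0, 2.0, 3.0, 4.0, 5.0],
                  [1.0, 2.0, 3.0, 4.0, 5.0]]

print(mask)
print(masked_mean(per_token_loss, mask))
```

Passage and question tokens, as well as padding, contribute nothing to the averaged loss in this sketch.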

    Here is the full traceback:

    Traceback (most recent call last):
      File "gpt2_generate_main.py", line 210, in <module>
        tf.app.run()
      File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
        _sys.exit(main(argv))
      File "gpt2_generate_main.py", line 144, in main
        mode=tf.estimator.ModeKeys.PREDICT)
      File "/home1/deepak/RaviTej/texar/texar/module_base.py", line 116, in __call__
        return self._template(*args, **kwargs)
      File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 455, in __call__
        result = self._call_func(args, kwargs)
      File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 406, in _call_func
        result = self._func(*args, **kwargs)
      File "/home1/deepak/RaviTej/texar/texar/modules/decoders/transformer_decoders.py", line 569, in _build
        scope=self.variable_scope)
      File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/contrib/seq2seq/python/ops/decoder.py", line 309, in dynamic_decode
        swap_memory=swap_memory)
      File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3202, in while_loop
        result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants)
      File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2940, in BuildLoop
        pred, body, original_loop_vars, loop_vars, shape_invariants)
      File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2914, in _BuildLoop
        next_vars.append(_AddNextAndBackEdge(m, v))
      File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 688, in _AddNextAndBackEdge
        _EnforceShapeInvariant(m, v)
      File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 632, in _EnforceShapeInvariant
        (merge_var.name, m_shape, n_shape))
    ValueError: The shape for transformer_decoder_1/transformer_decoder/while/Merge_27:0 is not an invariant for the loop. It enters the loop with shape (1, 768), but has shape (?, 768) after one iteration. Provide shape invariants using either the shape_invariants argument of tf.while_loop or set_shape() on the loop variables.

    originally defined at:
      File "gpt2_generate_main.py", line 133, in main
        hparams=gpt2_config.decoder)
      File "/home1/deepak/RaviTej/texar/texar/modules/decoders/transformer_decoders.py", line 98, in __init__
        ModuleBase.__init__(self, hparams)
      File "/home1/deepak/RaviTej/texar/texar/module_base.py", line 73, in __init__
        create_scope_now_=True)
      File "/home1/deepak/anaconda/envs/nlp_proj/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 153, in make_template
        **kwargs)

    opened by Akella17 7
  • Making Input Different From Text

    Hello,

    I want to train an image captioning model with the transformer. How should I configure the input (i.e. VGG imagenet features) so that I can have an encoder-decoder model that is an image captioning model?

    Please share any links or examples to implement this.

    opened by ghost 7
  • Need some help to train gpt-2. Thank you very much!

    The figure below is from the README of gpt-2.

    image

    source_hidden_states and some other variables are missing if I write my code based on gpt2_generate_main.py.

    Could you please provide a more detailed example? A link to a related example for another model would also be fine.

    Thank you!

    opened by guotong1988 7
  • Move Texar-TF under `texar/tf/`

    This PR includes the following changes:

    • Move entire Texar-TF codebase under texar/tf/
    • Add DeprecationWarning when user imports from texar but not texar.tf in Python 3
    opened by huzecong 6
  • A Question About the Example -- sentence_classifier

    I am a new user of texar. When I tried one of the examples, "sentence_classifier" (https://github.com/asyml/texar/tree/master/examples/sentence_classifier), I ran into some problems that I find hard to resolve. In the example, the data is in English, but I want to try it with Chinese. I just replaced the SST data with my own data. Here is the form of the data I use:

    2 但是您的所作所为不合适。 (But what you have done is inappropriate.)
    1 我觉得这个苹果味道一般。 (I think this apple tastes average.)
    0 今天终于修正了一个错误。 (Today I finally fixed an error.)

    For the vocab, I use my own word segmenter for Chinese sentences. The program displays an error:

    image

    I guessed it was because of the batch size, so I changed the batch size from 50 to 64 in config_kim.py. Then another error appears:

    image

    I don't know how to explain it, so I hope someone can tell me whether I am using Texar incorrectly or whether there is some other problem.

    opened by mahaoxiang822 6
  • Update dataset_utils_test.py

    Hello, this issue explains why I am submitting this PR. I sincerely hope my PR will help you. If you find it useful, I hope you can merge it. Thank you, my friend.

    opened by DLPerf 1
  • Performance issues in the program

    Hello, I found a performance issue in the definition of test_make_chained_transformation in asyml/texar/blob/master/tests/data/data/dataset_utils_test.py: dataset.map is called without num_parallel_calls. I think adding this argument would make your program more efficient.

    Here is the TensorFlow documentation that supports this.

    opened by DLPerf 0
  • Question about the installation about the version 0.2.1

    Hello, I am trying to reproduce a project that says I need to install texar 0.2.1, because a copy of texar 0.2.1 is bundled there: https://github.com/fabrahman/Emo-Aware-Storytelling/tree/master/third_party/texar. I ran "pip install ." and the system showed "Successfully installed texar-0.2.1". But when I started Python and ran "import texar" to test whether it was installed successfully, it failed with:

    >>> import texar
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/texar/__init__.py", line 29, in <module>
        from texar import modules
      File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/texar/modules/__init__.py", line 24, in <module>
        from texar.modules.networks import *
      File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/texar/modules/networks/__init__.py", line 24, in <module>
        from texar.modules.networks.network_base import *
      File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/texar/modules/networks/network_base.py", line 26, in <module>
        from texar.core.layers import get_layer
      File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/texar/core/__init__.py", line 24, in <module>
        from texar.core.layers import *
      File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/texar/core/layers.py", line 25, in <module>
        import tensorflow.contrib.rnn as rnn
      File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/tensorflow/contrib/__init__.py", line 38, in <module>
        from tensorflow.contrib import cloud
      File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/tensorflow/contrib/cloud/__init__.py", line 24, in <module>
        from tensorflow.contrib.cloud.python.ops.bigquery_reader_ops import *
      File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/tensorflow/contrib/cloud/python/ops/bigquery_reader_ops.py", line 21, in <module>
        from tensorflow.contrib.cloud.python.ops import gen_bigquery_reader_ops
      File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/tensorflow/contrib/cloud/python/ops/gen_bigquery_reader_ops.py", line 307, in <module>
        _op_def_lib = _InitOpDefLibrary(b"\n\355\001\n\016BigQueryReader\032\024\n\rreader_handle\030\007\200\001\001"\027\n\tcontainer\022\006string\032\002\022\000"\031\n\013shared_name\022\006string\032\002\022\000"\024\n\nproject_id\022\006string"\024\n\ndataset_id\022\006string"\022\n\010table_id\022\006string"\027\n\007columns\022\014list(string)"\027\n\020timestamp_millis\022\003int"\034\n\016test_end_point\022\006string\032\002\022\000\210\001\001\n\331\001\n GenerateBigQueryReaderPartitions\032\016\n\npartitions\030\007"\024\n\nproject_id\022\006string"\024\n\ndataset_id\022\006string"\022\n\010table_id\022\006string"\027\n\007columns\022\014list(string)"\027\n\020timestamp_millis\022\003int"\025\n\016num_partitions\022\003int"\034\n\016test_end_point\022\006string\032\002\022\000")
      File "/home/lixin/enter/envs/lx03/lib/python3.6/site-packages/tensorflow/contrib/cloud/python/ops/gen_bigquery_reader_ops.py", line 215, in _InitOpDefLibrary
        _op_def_registry.register_op_list(op_list)
    AttributeError: module 'tensorflow.python.framework.op_def_registry' has no attribute 'register_op_list'

    Can you give me some useful advice? I am so sorry to bother you! @ZhitingHu

    opened by shaoniana1997 1
  • Output is not as expected (style is not getting transferred)

    I followed your tutorial in full to generate the output and did not change anything: I did exactly what your GitHub tutorial describes, with the same data, but I am not getting the expected output. How can you help me track down the issue? Please let me know if I should upload any files from my side. Thank you.

    Output Sample:

    you can find lots of gifts that are rare and the price is right !
    you can find lots of gifts that are rare and the price is right !
    do n't bother with this place .
    do n't bother with this place .
    the customer service is awful .
    the customer service is awful .
    it is n't expensive at all and the staff is so nice .
    it is n't expensive at all and the staff is so nice .
    jill was very patient with me and my receipts .
    haha was very patient with me and my quotes .
    went for breakfast this morning , was n't very busy and service was horrible .
    went for breakfast this morning , was n't very busy and service was horrible .
    price was more than right , less than he quoted me .
    price was more than right , less than he quoted me .
    got my drink correct and quick !
    got my drink correct and quick !
    so good !
    so good !

    opened by SOURO 0
  • ASYML project suggestions

    As maintainers of open-source projects, we love to hear from the community and learn what features you expect from machine learning libraries and NLP toolkits. To facilitate this, we will maintain a master list of community suggestions for all of our ASYML projects, which can help future contributors and us decide what to work on. We will maintain a high-level list in this issue:

    • If you have an idea/direction, you can raise it here. Use the issues if you have a concrete feature request or bug report.
    • If you are interested in working on any of these directions, you can open issues and comment here with the issue link.

    Texar (issue list)

    • Tensorflow-2.0 support.

    Texar (issue list)

    • Support distributed training with AdaptDL.

    Forte (issue list)

    • Support Automatic Data Augmentation functions.

    Stave (issue list)

    • Easily configurable active learning for annotation projects.
    • Allow embedding Stave visualizations in web pages.
    opened by hunterhector 0
Releases(v0.2.4)
  • v0.2.4(Nov 19, 2019)

    New features

    • Support only Python 3.6 and 3.7. Drop support of older Python versions. (#211)
    • Add Tokenizers including tokenizers for pretrained models (BERTTokenizer, XLNetTokenizer, etc). (#225)
    • Add GPT2 modules (GPT2Encoder, GPT2Decoder, GPT2Classifier, etc). (#228)

    Feature improvements

    • Update embedder modules dropout_strategy=='item' to support TensorFlow v1.15. (#231)
    • Update .gitignore and add .gitignore files to all examples. (#233)
    • Polish code style according to flake8. (#234)
    • Add GPT2 XL pretrained checkpoint. (#243)

    Fixes

    • Fix examples/transformer/scripts/wmt14_en_de.sh to create output dir automatically. (#238)
    • Fix variable scope issue in texar.tf.modules.decoders.dynamic_decode. (#246)
  • v0.2.3(Sep 26, 2019)

  • v0.2.2(Aug 5, 2019)

  • v0.2.1(Jul 28, 2019)

    New features

    • Add support for GPT-2 345M model in examples/gpt-2. (#156)
    • Add BERT modules, including texar.modules.BERTEncoder (doc) and texar.modules.BERTClassifier (doc). (#167)

    Feature improvements

    • Refactor TransformerEncoder and TransformerDecoder to separate position embeddings from the modules. (#126)
    • Allow passing a Tensor to output_layer of decoders' constructors -- used for weight tying between the output layer and the input embedding matrix. (#126)
    • Make the TransformerDecoder constructor interface exactly the same as the RNN decoders' constructor interfaces. (#126)
    • Refactor decoder Helpers to allow a two-argument embedding_fn (to support position embedding). (#126)
    • Refactor SinusoidsPositionEmbedder to allow arbitrarily large or negative position indexes. (#176)
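
To illustrate why sinusoidal position embeddings can accept arbitrarily large or negative indexes, here is a minimal pure-Python sketch. It is not Texar's SinusoidsPositionEmbedder: the function name `sinusoid_embedding`, its parameters, and the layout (all sines followed by all cosines) are assumptions. The point is that the embedding is computed directly from the position value rather than looked up in a fixed-size table.

```python
import math


def sinusoid_embedding(position, dim, min_timescale=1.0, max_timescale=1.0e4):
    """Compute a sinusoidal embedding for one integer position (hypothetical helper)."""
    if dim < 2 or dim % 2 != 0:
        raise ValueError("dim must be a positive even number")
    num_timescales = dim // 2
    # Timescales form a geometric progression from min_timescale to max_timescale.
    log_increment = math.log(max_timescale / min_timescale) / max(num_timescales - 1, 1)
    inv_timescales = [
        math.exp(-i * log_increment) / min_timescale for i in range(num_timescales)
    ]
    scaled = [position * inv for inv in inv_timescales]
    # First half: sines; second half: cosines.
    return [math.sin(s) for s in scaled] + [math.cos(s) for s in scaled]


# No lookup table is involved, so any integer position works.
for pos in (0, -7, 10**12):
    print(pos, [round(v, 3) for v in sinusoid_embedding(pos, dim=8)])
```

Because sine is odd and cosine is even, a negative position only flips the sign of the sine half, so negative indexes remain well defined.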

    Fixes

    • Fix texar.losses.reduce_batch_time when sequence has dtype other than tf.float32. (#143)
    • Fix texar.losses.reduce_dimensions when average_axes or sum_axes is int. (#141)
    • Fix GPT-2 tokenization loading path. (#165)
    • Fix examples/vae_text EOS bug. (#168)
    • Fix transformer bleu_tool.py when translation_length is 0. (#176)
    • Fix StochasticConnector and ReparameterizedStochasticConnector when transform=False. (#179)
  • v0.2.0(Apr 9, 2019)

    New features

    • TFRecordData: A new data module for reading and processing TFRecord data, with support for, e.g., image data, feature data, etc. (#107)
    • GPT-2: OpenAI pretrained language model. (#91, example)
    • TopKSampleEmbeddingHelper to perform top_k random sample decoding. (baa09ff)
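
As a rough illustration of top-k random sample decoding, the sketch below picks the next token from only the k highest-scoring candidates. It is a generic standard-library example, not the TopKSampleEmbeddingHelper implementation; the function name `top_k_sample` and its arguments are hypothetical.

```python
import math
import random


def top_k_sample(logits, k, rng=random):
    """Sample an index from the k largest logits, weighted by softmax (hypothetical helper)."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Indices of the k highest-scoring entries, best first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax restricted to the top-k entries; subtract the max for numerical stability.
    max_logit = logits[top[0]]
    weights = [math.exp(logits[i] - max_logit) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


logits = [0.1, 2.0, -1.0, 3.0, 0.5]
rng = random.Random(0)  # seeded only to make the demo repeatable
print([top_k_sample(logits, k=2, rng=rng) for _ in range(10)])
```

With k=1 this reduces to greedy decoding, while k equal to the vocabulary size gives plain softmax sampling.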

    Feature improvements

    • Refactor BERT example using TFRecordData data module.
    • TransformerDecoder supports helper arguments to specify decoding strategy. (#76)

    Fixes

    • Fix variable collection bug in examples/seqgan. (#110)
    • Fix error when beam_search_decode with output_layer=tf.identity (#77)
    • Fix readthedocs compilation error (#85)
  • v0.1.0(Feb 6, 2019)

Owner
ASYML
Machine Learning as Machine Assembly, part of the CASL project https://casl-project.ai/
Auto_code_complete is an auto word-completion program which allows you to customize it to your needs

auto_code_complete is an auto word-completion program which allows you to customize it to your needs. the model for this program is one of the deep-learning NLP(Natural Language Process) model struc

RUO 2 Feb 22, 2022
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

ALBERT ***************New March 28, 2020 *************** Add a colab tutorial to run fine-tuning for GLUE datasets. ***************New January 7, 2020

Google Research 3k Dec 26, 2022
**NSFW** A chatbot based on GPT2-chitchat

DangBot -- "So weird, say another one": a bot that produces weird chat-group remarks, powered by GPT2 for Chinese chitchat Training Example: python train.py --lr 5e-2 --epochs 30 --max_len 300 --batch_size 8

Tommy Yang 11 Jul 21, 2022
A CRM department in a local bank works on classifying its lost customers using their past data, so that it can predict the average loss balance and passive duration in the future.

Rule-Based-Classification-in-a-Banking-Case. A CRM department in a local bank works on classifying its lost customers using their past data. So they wa

ÖMER YILDIZ 4 Mar 20, 2022
[ICCV 2021] Instance-level Image Retrieval using Reranking Transformers

Instance-level Image Retrieval using Reranking Transformers Fuwen Tan, Jiangbo Yuan, Vicente Ordonez, ICCV 2021. Abstract Instance-level image retriev

UVA Computer Vision 86 Dec 28, 2022
Sequence-to-Sequence Framework in PyTorch

nmtpytorch allows training of various end-to-end neural architectures including but not limited to neural machine translation, image captioning and au

LIUM 395 Nov 21, 2022
PG-19 Language Modelling Benchmark

PG-19 Language Modelling Benchmark This repository contains the PG-19 language modeling benchmark. It includes a set of books extracted from the Proje

DeepMind 161 Oct 30, 2022
This repository details the steps in creating a Part of Speech tagger using Trigram Hidden Markov Models and the Viterbi Algorithm without using external libraries.

POS-Tagger This repository details the creation of a Part-of-Speech tagger using Trigram Hidden Markov Models to predict word tags in a word sequence.

Raihan Ahmed 1 Dec 09, 2021
An ML library for profanity detection in Turkish sentences

"Bad words belong to the one who says them." -Anonymous What is it? sinkaf is a Python library for finding inappropriate comments. What makes it different? Other algorithms

KaraGoz 4 Feb 18, 2022
📜 GPT-2 Rhyming Limerick and Haiku models using data augmentation

Well-formed Limericks and Haikus with GPT2 📜 GPT-2 Rhyming Limerick and Haiku models using data augmentation In collaboration with Matthew Korahais &

Bardia Shahrestani 2 May 26, 2022
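Chinese generative pre-trained model

T5 PEGASUS: a Chinese generative pre-trained model that uses mT5 as its base architecture and initial weights and is pre-trained in a PEGASUS-like manner. Details: https://kexue.fm/archives/8209 Tokenizer: we replaced the T5 PEGASUS tokenizer with the BERT tokenizer, which for Chinese is more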

410 Jan 03, 2023
A paper list of pre-trained language models (PLMs).

Large-scale pre-trained language models (PLMs) such as BERT and GPT have achieved great success and become a milestone in NLP.

RUCAIBox 124 Jan 02, 2023
Weird Sort-and-Compress Thing

Weird Sort-and-Compress Thing A weird integer sorting + compression algorithm inspired by a conversation with Luthingx (it probably already exists by

Douglas 1 Jan 03, 2022
Twewy-discord-chatbot - Build a Discord AI Chatbot that Speaks like Your Favorite Character

Build a Discord AI Chatbot that Speaks like Your Favorite Character! This is a Discord AI Chatbot that uses the Microsoft DialoGPT conversational mode

Lynn Zheng 231 Dec 30, 2022
Code to reproduce the results of the paper 'Towards Realistic Few-Shot Relation Extraction' (EMNLP 2021)

Realistic Few-Shot Relation Extraction This repository contains code to reproduce the results in the paper "Towards Realistic Few-Shot Relation Extrac

Bloomberg 8 Nov 09, 2022
Clone a voice in 5 seconds to generate arbitrary speech in real-time

This repository is forked from Real-Time-Voice-Cloning which only support English. English | 中文 Features 🌍 Chinese supported mandarin and tested with

Weijia Chen 25.6k Jan 06, 2023
Host your own GPT-3 Discord bot

GPT3 Discord Bot Host your own GPT-3 Discord bot i'd host and make the bot invitable myself, however GPT3 terms of service prohibit public use of GPT3

[something hillarious here] 8 Jan 07, 2023
Unsupervised Language Modeling at scale for robust sentiment classification

** DEPRECATED ** This repo has been deprecated. Please visit Megatron-LM for our up to date Large-scale unsupervised pretraining and finetuning code.

NVIDIA Corporation 1k Nov 17, 2022
Implementation of TTS with combination of Tacotron2 and HiFi-GAN

Tacotron2-HiFiGAN-master Implementation of TTS with combination of Tacotron2 and HiFi-GAN for Mandarin TTS. Inference In order to inference, we need t

SunLu Z 7 Nov 11, 2022
MEDIALpy: MEDIcal Abbreviations Lookup in Python

A small python package that allows the user to look up common medical abbreviations.

Aberystwyth Systems Biology 7 Nov 09, 2022