We’ve trained a large-scale unsupervised language model which generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization, all without task-specific training.

Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text. GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper.
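For readers who want to see what "predict the next word" means concretely, here is a minimal sketch that queries the smaller released model for its next-token distribution. It assumes the Hugging Face `transformers` library and its `"gpt2"` checkpoint name, which are not part of the original release but host the same publicly released weights.

```python
# A minimal sketch of the next-word-prediction objective, assuming the
# Hugging Face `transformers` library and its "gpt2" checkpoint (the
# small released model); neither is part of the original announcement.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "GPT-2 is a large transformer-based language model"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(input_ids).logits  # (batch, sequence_length, vocab_size)

# The training objective in miniature: a probability distribution over
# the single next token, given everything that came before it.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for p, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([int(token_id)])!r}: {float(p):.3f}")
```

Generating text is just this step applied repeatedly: sample a token from the distribution, append it to the input, and predict again.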