OpenAI's Generative Pre-trained Transformer-2 (GPT-2) is capable of generating text from short writing samples. How good is it? Maybe too good.
Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text. Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper.
GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text. The diversity of the dataset causes this simple goal to contain naturally occurring demonstrations of many tasks across diverse domains. GPT-2 is a direct scale-up of GPT, with more than 10X the parameters and trained on more than 10X the amount of data.
GPT-2 displays a broad set of capabilities, including the ability to generate conditional synthetic text samples of unprecedented quality, where we prime the model with an input and have it generate a lengthy continuation.
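To make "predict the next word" and "prime the model with an input" concrete, here is a minimal toy sketch in Python. It is not GPT-2 and not OpenAI's code: it swaps the 1.5-billion-parameter transformer for a simple table of word-pair counts, and the names `train_bigram` and `generate`, along with the tiny corpus, are hypothetical and chosen only for illustration. The loop is the same in spirit, though: learn from text which word tends to come next, then extend a prompt one word at a time.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    follows = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows


def generate(follows, prompt, max_words=10, seed=0):
    """Extend a non-empty prompt one word at a time by sampling a next word."""
    rng = random.Random(seed)
    out = prompt.split()
    for _ in range(max_words):
        candidates = follows.get(out[-1])
        if not candidates:  # the last word was never followed by anything
            break
        choices, counts = zip(*candidates.items())
        # Sample in proportion to how often each word followed the last one.
        out.append(rng.choices(choices, weights=counts, k=1)[0])
    return " ".join(out)


# Toy corpus (an assumption for illustration, not GPT-2's training data).
corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigram(corpus)
print(generate(model, "the cat", max_words=5))
```

This toy looks only at the single previous word, while GPT-2 conditions on all of the previous words in the text, which is a large part of why its continuations stay coherent over whole paragraphs.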
Better yet, let the always informative and entertaining Siraj Raval explain it to you:
Science fiction fans are not shocked to read that AIs write as well as we do. In "Studio 5, The Stars", a story by JG Ballard first published in 1961, a verse transcriber is used to write poetry on demand:
"Do you mean she wrote these herself?"
I nodded. "It has been done that way. In fact the method enjoyed quite a vogue for twenty or thirty centuries. Shakespeare tried it, Milton, Keats and Shelley - it worked reasonably well for them."
"But not now," Tony said. "Not since the VT set. How can you compete with an IBM heavy-duty logomatic analogue?"
"...Hold on," I told him. I was pasting down one of Xero's satirical pastiches of Rupert Brooke and was six lines short. I handed Tony the master tape and he played it into the IBM, set the meter, rhyme scheme, verbal pairs, and then switched on, waited for the tape to chunter out of the delivery head, tore off six lines and passed them back to me. I didn't even need to read them.
For the next two hours we worked hard, at dusk had completed over 1,000 lines and broke off for a well-earned drink.
(Story submitted 2/15/2019)