Access to the GPT-3 API has been gated by a waiting list, but those who got a chance to try it have shared interesting findings and impressive results from this powerful model. Here are a few things that were observed while experimenting with the API’s interface, called the Playground.
- Settings and Presets:
Clicking the settings icon lets you configure various parameters, such as the text length, the temperature (from low/boring through standard to chaotic/creative), and the start and stop text for the generated output. There are also multiple presets to choose from and play around with, such as Chat, Q&A, Parsing Unstructured Data, and Summarize for a 2nd Grader.
The Chat preset behaves like a chatbot whose character you can set. Set the AI to be friendly, creative, clever and helpful, and it gives informative answers in a very polite manner; set it to brutal, and it responds exactly as the character suggests!
The Q&A preset needs some priming with example questions and answers before it starts answering our questions, and people had no complaints about the kind of answers they received.
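To make the temperature setting more concrete, here is a minimal sketch of how a temperature value is commonly applied when sampling from a language model: the raw scores are divided by the temperature before being turned into probabilities. This is an illustration of the general technique, not the Playground's actual implementation; the function name and the example scores are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the largest value before exponentiating, for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]

low = softmax_with_temperature(logits, temperature=0.2)   # sharper distribution
high = softmax_with_temperature(logits, temperature=2.0)  # flatter distribution

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

With a low temperature almost all of the probability goes to the top-scoring token, so the output is predictable; with a high temperature the probabilities flatten and less likely tokens get picked more often, which reads as more creative or more chaotic.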
- Parsing Unstructured Data:
This is an interesting preset that can comprehend unstructured text and extract structured information from it.
- Summarize for 2nd Grader:
This preset shows another level of text compression: it rephrases difficult sentences and concepts into simpler words and sentences that a kid can easily understand.
- Multilingual text processing:
GPT-3 can handle languages other than English better than GPT-2. People have tried tasks in various languages, including German, Russian and Japanese; it performed well and appears ready for multilingual text processing.
- Text Generation:
It can generate poems on demand, in a particular style if required, and can write stories and essays with some fine-tuning, even in other languages.
- Code Generation:
People have claimed that this API can generate code from minimal prompts.
Here is an article which showcases all its capabilities, along with excerpts from social media.
And this is what the AI interface looks like (the image below shows the Q&A preset):
Fig-6: Preview of the AI Playground page for a Q&A preset
Unlike many language models, GPT-3 does not need transfer learning, in which the model is fine-tuned on task-specific datasets for specific tasks. The authors of a research paper on GPT-3 mention the following advantages of having a task-agnostic model:
- Collecting task-specific data is difficult
- Fine-tuning might yield poor out-of-distribution performance
- Need for an adaptable NLP system similar to humans, which can understand natural language (English) and perform tasks with few or no examples
GPT-3 is applied through in-context learning: the model is fed a task description, a prompt, or one or more examples (shots), and it responds using the skills and pattern-recognition abilities it learned during training to adapt to the specific task at hand.
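As a rough illustration of in-context learning, the sketch below assembles a few-shot prompt as plain text: a short instruction, a handful of worked examples, and the new question left open for the model to complete. The helper name, the `Q:`/`A:` layout and the sample questions are assumptions made for this example, not a format prescribed by the API.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a plain-text prompt from an instruction, worked examples, and a new question."""
    lines = [instruction, ""]
    for question, answer in examples:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
        lines.append("")  # blank line between examples
    # The final answer is left empty so the model continues from here.
    lines.append(f"Q: {query}")
    lines.append("A:")
    return "\n".join(lines)


# Hypothetical examples ("shots") that demonstrate the task to the model.
examples = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]

prompt = build_few_shot_prompt(
    "Answer the question.",
    examples,
    "What is the capital of Italy?",
)
print(prompt)
```

Passing an empty list of examples gives a zero-shot prompt, one example gives a one-shot prompt, and several give a few-shot prompt. No model weights change in any of these cases; only the text of the prompt does.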
Despite its tremendous usability, the huge model size is the biggest factor hindering its use for most people, except those with the resources available. However, there are discussions in the community that distillation might come to the rescue!
The OpenAI founder himself said that “GPT-3 has weaknesses and it makes silly mistakes”. It is weak at sentence comparison, where it has to judge how a word is used in two different sentences.
As per the researchers, it still faces some problems in the following tasks:
- Coherence loss
- Drawing real conclusions
- Multi-digit addition and subtraction
It is great to have an NLP system that doesn’t require large task-specific datasets or a custom model architecture to solve specific NLP tasks. The experiments conducted show its power, potential and impact on the future of NLP.
Though GPT-3 doesn’t do well on everything and its size makes it difficult for everyone to use, this is just the threshold of many new improvements to come in the field of NLP!
Bhaskar Ammu is a Senior Data Scientist at Sigmoid. He specializes in designing data science solutions for clients, building database architectures and managing projects and teams.