Access to the GPT-3 API is still limited to a waiting list, but those who have had a chance to try it have shared interesting findings and impressive results from this powerful model. Here are a few things observed while experimenting with the API's interface, called the Playground.
Settings and Presets:
Upon clicking the settings icon, one can configure various parameters such as the response length, the temperature (from low/boring and predictable to high/chaotic and creative), and the start and stop sequences for the generated text. There are also multiple presets to choose from and play around with, such as Chat, Q&A, Parsing Unstructured Data, and Summarize for a 2nd Grader.
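For those who prefer calling the API directly rather than clicking through the Playground, here is a minimal sketch of how those same settings map onto an API call. It assumes the openai Python package roughly as it existed at GPT-3's launch; the engine name and parameter values are illustrative:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # granted once you clear the waiting list

# Each Playground setting corresponds to a Completion parameter:
response = openai.Completion.create(
    engine="davinci",           # base GPT-3 engine
    prompt="Once upon a time,",
    max_tokens=64,              # response length
    temperature=0.7,            # low = boring/predictable, high = chaotic/creative
    stop=["\n\n"],              # stop sequence that ends generation
)

print(response.choices[0].text)
```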
The Chat preset works like a chatbot: you can set the AI's character to friendly, creative, clever, and helpful, in which case it provides informative answers in a very polite manner, whereas if you set the AI's character to brutal, it responds exactly as that character suggests!
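To make that concrete, the Chat preset is essentially a primed prompt whose opening line fixes the AI's persona; the snippet below paraphrases the idea (it is not the exact preset text):

```python
# The first line sets the AI's character; the model continues in that voice.
friendly_prompt = (
    "The following is a conversation with an AI assistant. "
    "The assistant is helpful, creative, clever, and very friendly.\n\n"
    "Human: My code won't compile and I'm frustrated.\n"
    "AI:"
)

# Changing a single adjective flips the persona from polite to harsh.
brutal_prompt = friendly_prompt.replace("very friendly", "brutally honest")
```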
The Q&A preset needs some priming with example questions and answers before it starts answering new questions, and users did not have any complaints about the kind of answers they received.
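That "training" amounts to placing a few primed question-answer pairs at the top of the prompt, along the lines of the made-up example below:

```python
# A few Q/A pairs prime the model; the trailing "A:" asks it to answer.
# Sent with stop=["\n\n"], generation ends after a single answer.
qa_prompt = (
    "Q: What is the capital of France?\n"
    "A: Paris.\n\n"
    "Q: Who wrote Hamlet?\n"
    "A: William Shakespeare.\n\n"
    "Q: How many continents are there?\n"
    "A:"
)
```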
- Parsing Unstructured Data:
This is an interesting preset in which the model comprehends unstructured text and extracts structured information from it.
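A hypothetical sketch of what such an extraction prompt might look like (the text and fields are invented for illustration):

```python
# Unstructured input followed by the structured fields we want filled in.
extraction_prompt = (
    "Extract the person's name, company, and role from the text below.\n\n"
    "Text: Priya Sharma joined Acme Analytics last month as a data "
    "engineer after five years at a fintech startup.\n\n"
    "Name:"
)
# Given this prompt, the model would be expected to continue with something
# like "Priya Sharma\nCompany: Acme Analytics\nRole: Data engineer".
```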
Here is an article that showcases its capabilities, along with excerpts from social media.
Above are a few of the GPT-3 examples that are moving AI language model research forward in strides.
This is what the AI model interface looks like (the image below shows the Q&A preset):
Fig-6: Preview of the AI Playground page for a Q&A preset
Unlike many language models, GPT-3 does not need transfer learning, in which the model is fine-tuned on task-specific datasets for specific tasks. The authors of the GPT-3 paper mention the following advantages of a task-agnostic model:
- Collecting task-specific data is difficult
- Fine-tuning might yield misleadingly good in-distribution performance that does not hold up out of distribution
- The need for an adaptable NLP system that, like humans, can understand natural language (e.g., English) and perform tasks from few or no examples
The applications of GPT-3 rely on in-context learning, where the model is fed a task description and/or a few examples ("shots") in the prompt, and it responds on the basis of the skills and pattern-recognition abilities learned during training, adapting them to the specific task at hand, as sketched below.
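Here is what zero-, one-, and few-shot prompts look like in practice, using the English-to-French toy task from the GPT-3 paper's own illustration (the "shot" count is simply the number of demonstrations in the prompt):

```python
task = "Translate English to French:"

zero_shot = f"{task}\ncheese =>"  # task description only, no demonstrations

one_shot = f"{task}\nsea otter => loutre de mer\ncheese =>"  # one demonstration

few_shot = (                      # several demonstrations
    f"{task}\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivrée\n"
    "cheese =>"
)

# No gradient updates take place: the model "adapts" purely by conditioning
# on whatever demonstrations appear in the prompt.
```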
Despite its tremendous usability, the huge model size is the biggest factor hindering its use for most people, except those with sufficient resources. However, there are discussions in the community that distillation might come to the rescue!
The OpenAI founder himself said that "GPT-3 has weaknesses and it makes silly mistakes". It is weak at sentence comparison, where it has to judge how a word is used across two different sentences.
As per the researchers, it still faces problems with tasks such as:
- Maintaining coherence over long passages
- Drawing real-world (common-sense) conclusions
- Multi-digit addition and subtraction
It is great to have an NLP system that doesn't require large task-specific datasets and custom model architectures to solve specific NLP tasks. The experiments conducted show its power, potential, and impact on the future of NLP advancement.
GPT-3 is a great example of how far AI model development has come. Even though GPT-3 doesn't yet do well on everything, and its size makes it difficult for everyone to use, this is just the beginning of many new improvements to come in the field of NLP!
Bhaskar Ammu is a Senior Data Scientist at Sigmoid. He specializes in designing data science solutions for clients, building database architectures and managing projects and teams.