The present and future of 'ChatGPT' according to AI search experts

2023/01/19 | 3 min read
 
  • Jaekyung Bae (Upstage AI Product Leader, former Kakao Search Engine Leader)

  • Those who are curious about AI and IT industry issues

    Those who are curious about the working principles and capabilities of ChatGPT

    Those who are curious about the impact AI will have on the future

  • The craze for 'ChatGPT', an interactive artificial intelligence chatbot that converses as naturally as a human, shows no sign of fading. How are AI search experts seeing this? Bae Jae-kyung, AI Product Leader at Upstage and former Kakao search engine leader, who oversaw A-to-Z development of a neural-network-based translation engine, shares his view on the present and future of 'ChatGPT'.

  • ✔️ What is a Language Model?

    ✔️ Differences between language models and human-generated sentences

    ✔️ ChatGPT's language inference

    ✔️ Strengths of ChatGPT

    ✔️ Advanced reasoning, still unique to humans

    ✔️ ChatGPT will change the search market landscape

The craze for 'ChatGPT', the large artificial intelligence (AI)-based chatbot unveiled last December, is not going away. Many people were surprised by ChatGPT, which speaks as naturally as if it were talking to a person, points out false premises, and shows excellent writing skills. Even Google, the dominant search provider, reportedly felt a sense of crisis and issued an internal 'code red', a sign of just how large the impact has been.

How do AI search experts view ChatGPT? Bae Jae-kyung, AI Product Leader at Upstage and former Kakao search engine leader, who led A-to-Z development of a neural-network-based translation engine, shares his opinion in this contributed article.

What is a Language Model?

ChatGPT is a large-scale language model trained by OpenAI. Because understanding natural language requires very high-level reasoning, machine learning models were long considered limited in this area, and the performance ChatGPT now shows was thought to be out of reach, at least in the near term (within a few years). Yet ChatGPT is already developing at a pace beyond imagination. At the same time, it is also true that the meaning and potential of these achievements are being discussed in inflated terms. To make clearer judgments and predictions about large AI models, including ChatGPT, we must first understand what a "Language Model" is.

A language model predicts the next word or character from contextual information (which may be only partial) given in the form of text (or speech). It uses vast amounts of text as training data, and the most commonly used objective is to erase a specific word or character and predict what should fill the blank.

e.g. I ______ rice. ➡️ Here, the word in the blank can be predicted to be "eat".

What would fill the blank in the example above? Anyone who understands the language can easily make this prediction, and that is exactly what a language model does. A model is trained by giving it one or more objectives and adjusting its parameters so that those objectives are met; the number of these parameters is the size of the model. Most documents consist of several sentences, and since those sentences are not randomly ordered but correlated, predicting whether two sentences are consecutive can also serve as a training objective.

Although "filling in the blank" may seem like a simple goal, it is simple yet very powerful, because the model learns to look at more and more of the surrounding words and sentences to improve its accuracy. Another advantage of this objective is that virtually all text in the world can be used for training; in a sense, this is why super-giant models could be born. The way humans learn language is similar: even without intending to, we are already learning this way without realizing it. Since it is such a natural and effective way of learning, there is no reason the human brain would not follow it.
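To make the fill-in-the-blank objective concrete, here is a minimal sketch (not from the article) using the Hugging Face transformers library; the model name bert-base-uncased and the example sentence are illustrative choices.

```python
# Minimal fill-in-the-blank demo with a masked language model.
# Any masked LM would do; bert-base-uncased is just an example.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# "[MASK]" plays the role of the erased word in "I ______ rice."
for prediction in fill_mask("I [MASK] rice."):
    print(f'{prediction["token_str"]:>10}  score={prediction["score"]:.3f}')
```

The model assigns a probability to each candidate word, and words that frequently appear in this context in the training text come out on top.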

The difference between sentences created by language models and by humans

So, what is the difference between sentences produced by language models and sentences produced by humans? The biggest feature of human-generated sentences is that they carry a richer and more complex "context" beyond the given text.

This context can be non-text information (visual or auditory), or meta-information drawn from long-held memories and the prejudices built on them. In particular, when we write a sentence, there is usually an important context such as an intention or purpose. Plain text, however, usually contains little or none of this information, which is why sentences created by machines and sentences created by people often feel different.

In exchange, a language model learns from a vast amount of the world's information, far beyond the range of knowledge any single individual can handle. Compared with the physical limits of the human brain, a machine's capacity can be increased almost without limit. Because of this difference, language models are better suited to generating sentences by combining existing content than to high-level reasoning or creating genuinely new content, and they perform especially well when dealing with knowledge.

ChatGPT's linguistic inference

How ChatGPT works (Source: OpenAI Blog)

So, what level of inference does a language model like ChatGPT reach? A definitive answer is impossible because the level and degree of inference can be defined in many different ways, but we can still judge roughly how high-level its inference is. For example, looking at a tree and saying "That's a tree" is a very different level of inference from saying "That's one expression of genes shaped by hundreds of millions of years of evolution...".

The inference level of a particular language model can be predicted to some extent from the data and objectives it was trained on. Through the blank-filling objective over vast amounts of text, ChatGPT predicts very well which word fits which context. The initial purpose was simply to guess blanks, but the model also discovers hidden objectives on its own, provided the training data contains the relevant patterns in sufficient quantity. For example, if the training data contains many pairs of original and translated text, surrounded by many phrases meaning "this is a translation from language A to language B", then the objective self-expands from 'fill in the blank' to 'translate'. When GPT-3 first came out, it had learned these extended objectives as well and drew attention by producing plausible output from just a few prompts.
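To make this "self-expanding objective" idea concrete, the sketch below builds a few-shot prompt in which the translation pattern repeats, so a generative model can infer the hidden task from context alone. The small gpt2 model is only a stand-in for a much larger model such as GPT-3 and will not translate well; the prompt and model choice are illustrative assumptions, not details from the article.

```python
# Few-shot prompting sketch: the repeated "English -> French" pattern lets
# a large generative model infer the hidden 'translate' objective.
from transformers import pipeline

few_shot_prompt = (
    "English: Good morning.\nFrench: Bonjour.\n\n"
    "English: Thank you very much.\nFrench: Merci beaucoup.\n\n"
    "English: See you tomorrow.\nFrench:"
)

# gpt2 is a small stand-in; a GPT-3-scale model completes this far better.
generator = pipeline("text-generation", model="gpt2")
print(generator(few_shot_prompt, max_new_tokens=10)[0]["generated_text"])
```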

ChatGPT maximized its ability to produce knowledge in conversational form through additional training called "fine-tuning" on top of the blank-guessing objective. In the fine-tuning stage, an extra objective is learned: making output 'more suitable to human eyes'. For this, in addition to the vast collected text, training data annotated directly by humans was used. The exact scale is not known, but it is presumed that a considerable amount of money was invested to build this annotation set extensively. This is the main reason for the leap ChatGPT made over the existing GPT models, and it can be seen as OpenAI's decisive move: a bold investment based on the conviction that this method would work.

However, there are limits to what annotation can achieve for the objective of being 'more suitable to human eyes'. In many cases an almost infinite amount of data would be needed to produce results that read naturally to humans, and since that is impossible, ChatGPT's fine-tuning data was built with a focus on producing good results for conversational questions and answers about knowledge.
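Below is a minimal supervised fine-tuning sketch on a few hand-written conversational Q&A pairs, to give a feel for the kind of human-annotated data described above. The model, data, and hyperparameters are illustrative assumptions; ChatGPT's actual fine-tuning procedure and data are far larger and not public.

```python
# Supervised fine-tuning sketch on tiny, hand-written Q&A pairs.
# Everything here (model, data, hyperparameters) is illustrative only.
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical human-annotated conversational examples.
pairs = [
    {"prompt": "Q: What is a language model?\nA:",
     "answer": " A model that predicts the next word from its context."},
    {"prompt": "Q: Why is fine-tuning needed?\nA:",
     "answer": " To make raw model output more suitable for human readers."},
]

class QADataset(torch.utils.data.Dataset):
    def __init__(self, pairs):
        self.enc = [tokenizer(p["prompt"] + p["answer"], truncation=True,
                              padding="max_length", max_length=64,
                              return_tensors="pt") for p in pairs]

    def __len__(self):
        return len(self.enc)

    def __getitem__(self, i):
        item = {k: v.squeeze(0) for k, v in self.enc[i].items()}
        labels = item["input_ids"].clone()
        labels[item["attention_mask"] == 0] = -100  # ignore padding in the loss
        item["labels"] = labels
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-demo", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=QADataset(pairs),
)
trainer.train()
```

This covers only the plain supervised step, but even it shows why high-quality annotation is expensive: every example has to be written or checked by a person.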

Example of using ChatGPT

Advantages of ChatGPT

ChatGPT's strengths can be broadly summarized in two points.

1. ChatGPT mimics human short-term memory.

Even over an exchange of many questions and answers, ChatGPT understands the context of the conversation from some time back and reflects it in its output, so one strength is that it exercises memory well; at the same time, it can be said to be still at the stage of mimicking human short-term memory. This is because ChatGPT's comprehension still seems to rest on recognizing low-level patterns rather than on logical understanding. For example, ChatGPT repeats the same phrase at the end of every answer, and no matter how often it is asked to omit it, it does not comply. Nonetheless, it is surprisingly good at picking up the context of previous turns and often responds well to what the user needs.
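One common way such conversational "memory" can be implemented is simply to feed the previous turns back into the prompt on every request, so that everything the model "remembers" has to fit inside its context window. The sketch below shows that idea with a small stand-in model; it is an assumption for illustration, not a description of how ChatGPT works internally.

```python
# Conversation-history sketch: past turns are concatenated into the prompt,
# which is what gives the model its (bounded) "short-term memory".
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # small stand-in model

history = []  # list of (speaker, text) turns

def chat(user_message, max_new_tokens=40):
    history.append(("User", user_message))
    # Everything the model can "remember" must fit into this prompt window.
    prompt = "\n".join(f"{who}: {text}" for who, text in history) + "\nAssistant:"
    full_text = generator(prompt, max_new_tokens=max_new_tokens)[0]["generated_text"]
    answer = full_text[len(prompt):].split("\n")[0].strip()
    history.append(("Assistant", answer))
    return answer

print(chat("My name is Jaekyung."))
print(chat("What is my name?"))  # the earlier turn is still in the prompt
```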

2. ChatGPT can do many different kinds of work.

ChatGPT's second strength is that, compared with traditional models, it does not do just one thing well; it handles a variety of tasks that require language understanding. It organizes and explains knowledge, translates and summarizes, and even writes code and helps you fix it in conversation.

Because of these characteristics, some people see GPT-3 or ChatGPT as early models of AGI (Artificial General Intelligence). Views may differ depending on how one interprets the G (General) in AGI, but AGI has meant reaching the high level of abstraction that humans possess, not simply being a term for a system that can do various things. For now, it seems difficult to say that the term AGI fits the current ChatGPT.

What lets ChatGPT do various things can be seen as one patterned ability: combining existing knowledge well. Of course, 'knowledge' here is a broad notion, so it is debatable whether it is fair to say the model is good at only one thing, but even so, it is still hard to compare it with human abilities.

Advanced reasoning, still a human domain

With the advent of ChatGPT, language models have advanced remarkably, but high-level reasoning is still unique to humans.

In machine learning models, the power of reasoning spreads mostly in a "horizontal" direction: a model can translate hundreds of languages and generate code in many different programming languages. Human reasoning, in comparison, stretches more "vertically", in depth. When translating, a human can translate more naturally by considering the situation or conditions; when writing code, a human can create new logic that did not exist before rather than combining existing code. It may be easier to see in how humans understand and react to jokes: as children we do not understand jokes very well, but as we grow up we gradually grasp the intent hidden behind the words. It takes considerable language skill to understand even a "serious joke."

If we could invest infinite resources in building training data, it would be a different story; machines could then surpass humans in every respect. But since infinite resources are impossible, the search for ever more efficient ways of learning will continue, and no one knows at what point AGI will emerge.

In conclusion, what ChatGPT does well at its current level is combining existing knowledge to generate more accurate and relevant results, especially while communicating with humans in natural language. One important implication, as many experts have predicted, is that this can change the game of search.

ChatGPT will change the search market landscape

Today, we rely on search engines to find information we don't know, but in the future we're very likely to use models like ChatGPT.

This is expected to play out much as it did when neural-network-based translation first appeared. When neural machine translation first displaced rule-based and statistical machine translation, many experts were quite shocked. In fact, anyone who had developed a translation engine in the traditional way was probably more impressed then than by GPT-3 or ChatGPT now, amazed that such a thing was even possible.

Today, however, we treat machine translation as a natural tool rather than a marvel, and use it freely in daily life, such as when traveling abroad or whenever a foreign language is needed. What we expect from machine translation services is not perfect performance, but tolerance for occasional mistakes and moderate help when we need it.

Since then, models that translate dozens of languages with a single model have appeared (this, too, is a kind of multi-task learning with multiple objectives), and performance has gradually improved through pre-training on massive monolingual data in addition to parallel corpora, followed by fine-tuning. Both ChatGPT and translation models are generative models, and from a technical point of view there is not much difference between them, except that ChatGPT is much larger and was trained on more data. (Of course, ChatGPT is a model that can even translate.)
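As a small illustration of the "one model, many languages" point, the sketch below calls a publicly available multilingual translation model; the specific model (facebook/m2m100_418M) and language pairs are illustrative choices, not the engine the author worked on.

```python
# One set of weights, many language pairs: a multilingual translation sketch.
# The model choice is illustrative; any many-to-many translation model works.
from transformers import pipeline

translator = pipeline("translation", model="facebook/m2m100_418M")

text = "Language models keep getting larger."
print(translator(text, src_lang="en", tgt_lang="fr")[0]["translation_text"])
print(translator(text, src_lang="en", tgt_lang="ko")[0]["translation_text"])
```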

ChatGPT is therefore expected to follow a path similar to that of translation models, because it lets you get more satisfying results with far less effort. It does not have to be perfect; if it helps in many situations, as translation did, that is good enough. If you need higher-quality results, or the risk of a wrong answer is too costly, you can still consult a real expert or use a search engine.

A conversation between an AI and a human, created with the image-generating artificial intelligence model 'DALL-E'

 

With the advent of ChatGPT, search engines are expected to develop in the following directions. First, the UX will change: for certain queries, results from a ChatGPT-like model will be shown alongside conventional results, or on their own. As personalization technology advances, there will be more cases where the model's results are shown alone.

The struggle to choose the right search terms will give way to typing natural language comfortably and at length. That does not mean traditional search engines will become obsolete: because the intents and types of queries are so diverse, search engines will continue to play an active role, especially where timeliness and accuracy are required.

Alternatively, a new type of service may appear, entirely separate from today's search engines, in which information is obtained purely through conversation with a model. Such services could emerge one by one in individual domains such as medicine and law. Ordinary users could use them, but there is also plenty of room for experts to do so: since no expert can know everything, there is no reason not to rely on a machine with near-infinite memory.

ChatGPT currently shows excellent performance on general knowledge of the kind found in Wikipedia, but if training data is built for a specific domain and the model is fine-tuned on it, it can quickly be extended to domains such as medicine and law. However, while ChatGPT's training data could be annotated by the general public, specialized domains cannot, so it may take a little longer. There remain issues of model speed, inference cost, and training cost, but these are largely a matter of time.

It remains to be seen how the large IT companies operating Korean search engines will respond, but language models deliver better results, and a natural-language, conversational UX seems irresistible, so they are unlikely to stand still. Naver, in particular, is already showing good results and is expected to move quickly into Korean-specific training and into domains GPT has not yet been fine-tuned for. Nor does this seem to be an issue only for large IT companies: OpenAI or Google will likely release pre-trained models in a form that can be fine-tuned, so many startups and IT companies are paying attention. We look forward to the new value that AI will create in the future.

 
 
 
  • Upstage, founded in October 2020, offers a no-code/low-code solution called "Upstage AI Pack" to help clients innovate in AI. This solution applies the latest AI technologies to various industries in a customized manner. Upstage AI Pack includes OCR technology that extracts desired information from images, recommendation technology that considers customer information and product/service features, and natural language processing search technology that enables meaning-based search. By using the Upstage AI Pack, companies can easily utilize data processing, AI modeling, and metric management. They can also receive support for continuous updates, allowing them to use the latest AI technologies conveniently. Additionally, Upstage offers practical, AI-experienced training and a strong foundation in AI through an education content business. This helps cultivate differentiated professionals who can immediately contribute to AI business.

    Led by top talents from global tech giants like Google, Apple, Amazon, Nvidia, Meta, and Naver, Upstage has established itself as a unique AI technology leader. The company has presented excellent papers at world-renowned AI conferences, such as NeurIPS, ICLR, CVPR, ECCV, WWW, CHI, and WSDM. In addition, Upstage is the only Korean company to have won double-digit gold medals in Kaggle competitions. CEO Sung Kim, an associate professor at Hong Kong University of Science and Technology, is a world-class AI guru who has received the ACM Sigsoft Distinguished Paper Award four times for his research on bug prediction and automatic source code generation. He is also well-known as a lecturer for "Deep Learning for Everyone," which has recorded over 7 million views on YouTube. Co-founders include CTO Hwal-suk Lee, who led Naver's Visual AI/OCR and achieved global success, and CSO Eun-jeong Park, who led the modelling of the world's best translation tool, Papago.

    Go to Upstage Homepage

 