Upstage Dominates Global AI OCR Competition, Beating Amazon and Nvidia in Four Categories


4/24/2023
  • Upstage takes first place in four categories at the prestigious ICDAR 2023 Competition, surpassing Amazon and Nvidia

  • Upstage's Hong Kong branch earns second place, cementing the company's leading position in AI innovation

  • With superior AI OCR technology, Upstage is committed to supporting innovation in all industries


 

(Seoul, Apr. 25, 2023 /Upstage) Upstage, Korea’s leading AI startup, has demonstrated its advanced artificial intelligence (AI) technology by winning first place in four categories at the prestigious International Conference on Document Analysis and Recognition (ICDAR) competition.


At the ICDAR 2023 Competition, Upstage secured first place in four categories (HierText-1/2, VQAonBD, and IHTR), highlighting its global dominance in OCR technology.


Organized by the International Association for Pattern Recognition (IAPR), ICDAR is the world's premier AI Optical Character Recognition (OCR) conference. Its competitions pose challenges in the field of Robust Reading, which involves detecting and recognizing text in digital images and videos. Since its inception in 1991, the conference has attracted top researchers, companies, and experts in document analysis and recognition.


OCR technology is divided into two main areas: "detection" and "recognition," which refer to locating and identifying characters in an image, respectively. Competing against major players in the global tech industry, such as Amazon, Nvidia, Alibaba, and Huawei, Upstage outperformed them in both character detection and character recognition, demonstrating its exceptional AI capabilities on a global scale.


Upstage's character recognition technology enabled the company to secure the top spot in the Indic Handwriting Text Recognition (IHTR) category, which addresses character recognition challenges in ten widely spoken Indian languages. Although the team had no prior experience with these languages, its character recognition technology delivered the highest-performing model.


Moreover, Upstage's Korea and Hong Kong branches performed exceptionally well in the "Hierarchical Text Detection and Recognition" category, claiming the first and second spots, respectively. This category tested a new task in OCR using "HierText," a real-image dataset that provides hierarchical text annotations at the word, line, and paragraph levels. This success highlighted Upstage's overwhelming technological advantage over its competitors.


In the Visual Question Answering on Business Document Images (VQAonBD) category, where simple OCR technology cannot score highly, Upstage showcased its potential by achieving the highest score by a significant margin over the second-place contestant. The VQAonBD category involved extracting answers from data within a document image and performing operations to obtain values such as ratios, averages, minimums, and maximums. Tasks included finding accurate answers to complex tax document questions such as "What is the tax total for 2019?"
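To make the kind of operation described above concrete, the following is a minimal sketch in Python. It assumes the table values have already been read from the document image; the `tax_by_year` figures and the `answer` helper are hypothetical, invented purely for illustration, and do not represent the competition data or Upstage's model.

```python
from statistics import mean

# Hypothetical values assumed to have been read from a document table
# by an earlier OCR step. The numbers are invented for illustration only.
tax_by_year = {2017: 120.0, 2018: 150.0, 2019: 180.0}


def answer(operation: str, table: dict) -> float:
    """Apply one simple arithmetic operation to the values in the table."""
    values = list(table.values())
    operations = {
        "average": lambda: mean(values),
        "minimum": lambda: min(values),
        "maximum": lambda: max(values),
        # Ratio of the latest year's value to the earliest year's value.
        "ratio": lambda: table[max(table)] / table[min(table)],
    }
    return operations[operation]()


if __name__ == "__main__":
    for name in ("average", "minimum", "maximum", "ratio"):
        print(name, answer(name, tax_by_year))
```

The point of the sketch is that the answer is not printed anywhere in the document: it has to be computed from several extracted cells, which is why plain text recognition alone is not enough for this category.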


The results were a joint effort between the Upstage Challenge team and the Upstage OCR team. Upstage is the only Korean company to have won a double-digit number of gold medals in Kaggle competitions. The Upstage Challenge team includes two Kaggle Grandmasters and one Master, notably Upstage engineer Yunsu Kim, Korea's youngest Kaggle Grandmaster, who took first and second place in Kaggle competitions in two consecutive years. They worked alongside Upstage's researchers, who perform various OCR tasks for companies in the industry.


Upstage credits its outstanding performance in the competition to new methodologies, intense research, and different approaches. Conventional detection technology shrinks the word area significantly during training to avoid overlap between adjacent word areas. The Upstage team instead shrank the word area only slightly and improved model performance by predicting word boxes and using the gaps between word areas.


This approach led to a significant improvement in the similarity (“tightness”) between the predicted box and the correct box, a newly established evaluation criterion for this competition.
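The competition's exact definition of tightness is not given here. As a rough illustration, one common way to measure how closely a predicted box matches the correct box is intersection over union (IoU). The sketch below assumes axis-aligned boxes; the `Box` type and the example coordinates are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box given by its top-left and bottom-right corners."""
    x1: float
    y1: float
    x2: float
    y2: float


def area(box: Box) -> float:
    """Area of a box; degenerate (inverted) boxes count as zero."""
    return max(0.0, box.x2 - box.x1) * max(0.0, box.y2 - box.y1)


def iou(pred: Box, truth: Box) -> float:
    """Intersection over union: 1.0 for identical boxes, 0.0 for disjoint ones."""
    overlap = Box(
        max(pred.x1, truth.x1), max(pred.y1, truth.y1),
        min(pred.x2, truth.x2), min(pred.y2, truth.y2),
    )
    inter = area(overlap)
    union = area(pred) + area(truth) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical coordinates, for illustration only.
truth = Box(10, 10, 110, 40)   # correct word box
loose = Box(0, 0, 120, 50)     # prediction with a wide margin around the word
tight = Box(12, 11, 109, 40)   # prediction that hugs the word closely

if __name__ == "__main__":
    print(f"loose: {iou(loose, truth):.3f}")
    print(f"tight: {iou(tight, truth):.3f}")
```

Under a measure like this, a box that finds the right word but leaves a wide margin scores much lower than one that hugs the text, so a detector is rewarded for precise box boundaries and not only for locating the word.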


AI OCR technology is crucial for digital transformation, and many companies use it to speed up data digitization. Upstage is at the forefront of AI innovation with companies such as Hanwha Life, Samsung SDS, and POSCO Group, developing an OCR Pack that leverages the best OCR technology available.


Upstage has introduced an AI Pack with a no-code/low-code solution and an API series that give customers access to tailored AI technology, including recommendation technology based on customer information and product and service characteristics. With Upstage AI Pack, customers can easily process data, create AI models, manage metrics, and access up-to-date AI technology with continuous updates and support.


Sung Kim, Upstage's CEO, expressed his excitement at winning first place in four categories of the prestigious ICDAR 2023 AI OCR competition. Kim stated, "We are thrilled to have won first place in four categories at the ICDAR 2023, the most prestigious AI OCR competition in the field." He added, "We are delighted that Upstage has once again been recognized for its globally renowned AI technology, which will assist in the digital transformation and global innovation of industries requiring document automation."


In addition to the AI Pack and API series, Upstage launched AskUp Biz, the business version of AskUp, Korea's most popular chat AI. AskUp's integration of Upstage's OCR technology has earned it the nickname "ChatGPT with eyes." AskUp Biz is optimized for the business environment and offers three services: AskUp Doc, which uses chat AI to read and extract information from various documents; AskUp Web, which provides website information to visitors; and AskUp Slack, which serves as a business tool on Slack. Since its launch, AskUp Biz has garnered significant attention, with hundreds of companies applying for the service.

 
 
 

※ Image Description: Leaderboard of the best result of all teams in the ICDAR 2023 Competition on Hierarchical Text Detection and Recognition, including average over ten Indian languages.

 
 
  • Upstage | Keunkyo Kim, PR Director | keunkyo@upstage.ai

  • Upstage | Sungbeom Bae, PR Manager | sungbae@upstage.ai


  • Upstage, founded in October 2020, offers a no-code/low-code solution called "Upstage AI Pack" to help clients innovate with AI. This solution applies the latest AI technologies to various industries in a customized manner. Upstage AI Pack includes OCR technology that extracts desired information from images, recommendation technology that considers customer information and product/service features, and natural language processing search technology that enables meaning-based search. With the Upstage AI Pack, companies can easily handle data processing, AI modeling, and metric management. They also receive continuous updates and support, allowing them to use the latest AI technologies conveniently. Additionally, through its education content business, Upstage offers practical, hands-on AI training that builds a strong foundation in AI. This helps cultivate differentiated professionals who can immediately contribute to AI business.

    Led by top talents from global tech giants like Google, Apple, Amazon, Nvidia, Meta, and Naver, Upstage has established itself as a unique AI technology leader. The company has presented excellent papers at world-renowned AI conferences, such as NeurIPS, ICLR, CVPR, ECCV, WWW, CHI, and WSDM. In addition, Upstage is the only Korean company to have won double-digit gold medals in Kaggle competitions. CEO Sung Kim, an associate professor at Hong Kong University of Science and Technology, is a world-class AI guru who has received the ACM Sigsoft Distinguished Paper Award four times for his research on bug prediction and automatic source code generation. He is also well-known as a lecturer for "Deep Learning for Everyone," which has recorded over 7 million views on YouTube. Co-founders include CTO Hwal-suk Lee, who led Naver's Visual AI/OCR and achieved global success, and CSO Eun-jeong Park, who led the modelling of the world's best translation tool, Papago.

 